Plain text wouldn't allow for annotations, TOC, references, etc. These get ignored more often than not, but they're still requirements and some people do use them properly. Something like LaTeX would be better, but apparently it's too much to expect these highly educated lawyers to learn some basic markup.
I'd bet a tool like LyX (https://www.lyx.org/Screenshots) would go a long way toward being a better solution. It's a great document writer (as opposed to a word processor), apparently has some VCS support (although I haven't tried this) and diff viewing, and is a lot more reliable for annotations and all that stuff than Word.
I've done some legal automation work before. It would have been so much easier (and gotten better results) to have a plaintext format we could manipulate with shell scripts than to work with docx and Word.
- But doesn't plain text suck for difs?
+ But doesn't plain text suck for diffs?
- Especially when you have a really long, multi-line
- paragraph that has bmore than one change? And if we
- relied on word wrap, then the scope of change would
- be the paragraph and all hope is lost. At least
- this way we might eventually regain some
+ Especially when you have an exceptionally long,
+ multi-line paragraph that has more than one change?
+ But if we relied on word wrap, then the scope of
+ change would be the paragraph and all hope is lost.
+ At least this way we might eventually regain some
equivalency by chance and it's clear that nothing
after that point has changed.
Git is great, but it's designed for code. Legal documents are probably structured more like code than any other prose, with clearly identified chunks and frequent breaks into lists, but it's not the same thing.
(For clarity, I think probably a sentence-by-sentence diff would work well. But then it's not plain text; you're relying on some processor to collect sentences into paragraphs.)
Don't use plaintext, use LaTeX. Since paragraphs are separated by an empty line, you can put each sentence of a paragraph on its own line and still have it formatted as a single paragraph in the final document. This allows for better diffs. Here is an example:
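To make the one-sentence-per-line idea concrete, here is a rough Python sketch that line-diffs two versions of a LaTeX paragraph. The contract wording and the changed_lines helper are made up for illustration; only difflib from the standard library is used. Because each sentence sits on its own line, the diff flags just the edited sentence rather than the whole paragraph.

```python
import difflib

# Hypothetical LaTeX source: one sentence per line, blank line ends the paragraph.
OLD = """\
The Tenant shall pay rent on the first day of each month.
Late payments incur a fee of five percent.
The Landlord shall maintain the premises in good repair.

This Agreement is governed by the laws of the State.
"""

# Same document with a single sentence edited.
NEW = OLD.replace("five percent", "ten percent")


def changed_lines(old: str, new: str) -> list[str]:
    """Return removed lines prefixed with '-' and added lines prefixed with '+'."""
    a, b = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged sentences are not reported
        out.extend("-" + line for line in a[i1:i2])
        out.extend("+" + line for line in b[j1:j2])
    return out


if __name__ == "__main__":
    for line in changed_lines(OLD, NEW):
        print(line)
    # -Late payments incur a fee of five percent.
    # +Late payments incur a fee of ten percent.
```

LaTeX still typesets the first three sentences as one paragraph, so the source layout costs nothing in the final document.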
The diff algorithm/display is orthogonal to the data format; for example, I use the --color-words option to the git tools to use a word-oriented diff everywhere. I've even seen plugins floating around online for visual image diffs in git!
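For anyone who hasn't seen a word-oriented diff, here is a small Python approximation of the idea. It is a sketch built on difflib, not how git implements --color-words; the word_diff function is hypothetical, and the [-...-] / {+...+} markers are borrowed from the plain-text style of git's --word-diff output in place of colors.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Mark removed words as [-...-] and added words as {+...+}.

    Whitespace (including line breaks) is collapsed, so re-wrapping a
    paragraph does not show up as a change.
    """
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        if i2 > i1:  # words only in the old text
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if j2 > j1:  # words only in the new text
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


if __name__ == "__main__":
    old = "a really long, multi-line\nparagraph that has bmore than one change?"
    new = "an exceptionally long,\nmulti-line paragraph that has more than one change?"
    print(word_diff(old, new))
    # [-a really-] {+an exceptionally+} long, multi-line paragraph that has [-bmore-] {+more+} than one change?
```

Note that the line break moved between the two versions and the output doesn't care, which is exactly the property you want for hard-wrapped prose.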
But again, all those tools are built for developers, and the ones git uses internally, such as for conflict resolution, make some strong assumptions that don't hold for prose.