Version Control for Structure Editing (alarmingdevelopment.org)
126 points by mepian on Oct 19, 2021 | 55 comments


Just a reminder that git stores files, not diffs, and you can replace the merging strategy (e.g. how it handles multiple heads), merge driver (e.g. word vs. line based merging), and interactive diffing tool with anything you want. In this sense git is purely concerned with version control (what instance do I have of this data and what is its provenance in regards to other instances), and doesn't really give a crap how those files got there.
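For concreteness, a custom merge driver is just a program git runs with the ancestor/ours/theirs versions of a conflicting file. A rough sketch (the driver name, file extension and merge logic here are made up placeholders):

  #!/usr/bin/env python3
  # Sketch of a custom merge driver ("modelmerge" and *.model are made-up names).
  # Wiring, roughly:
  #   .gitattributes:  *.model merge=modelmerge
  #   git config merge.modelmerge.driver "modelmerge.py %O %A %B"
  # Git invokes the driver with the ancestor, ours, and theirs files; the driver
  # must leave its result in the "ours" file and exit 0 (clean) or nonzero (conflict).
  import sys

  def merge(base: str, ours: str, theirs: str):
      # Whatever structure-aware merge you like goes here. This placeholder
      # just takes the side that actually changed, and punts on real conflicts.
      if ours == base or ours == theirs:
          return theirs, True
      if theirs == base:
          return ours, True
      return ours, False

  if __name__ == "__main__":
      base_path, ours_path, theirs_path = sys.argv[1:4]
      base, ours, theirs = (open(p).read() for p in (base_path, ours_path, theirs_path))
      result, clean = merge(base, ours, theirs)
      with open(ours_path, "w") as f:
          f.write(result)
      sys.exit(0 if clean else 1)

Storage, branching and distribution all stay git's problem; only the three-way merge of your file format is yours.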

I see a new structured editing project kicking off 3-4 times a year and for some reason all of them seem to start by replacing git. Thereby they immediately have to contend with storage, branching, naming, and distribution, rather than using git as an object store and focusing on their new editing algorithms.

(There are also very real workflow issues with the snapshot model! But these structure editing projects don't try to address those either.)


True; indeed, JetBrains MPS has its own git driver.


Successfully applying VCS to higher-dimensional spaces will demand more mathematically elegant intermediate representations. Most source code files are highly structured by default. Image files are mostly feasible to diff as-is. Typical 3D models, not so much; 3D models with animation, even less so.

To be clear - the problem isn't that we can't detect a difference, it's that we cannot produce a useful view of the difference such that a human can make an informed decision. With images/audio/code, you can still extract useful knowledge as long as you know the shape of the difference relative to the whole, even if the difference itself is a meaningless mesh of colors between two image files.

Writing a useful diff engine for 3d models represented using constructive solid geometry would probably be substantially easier than with other approaches. I don't know if CSG is actually constrained to 3 dimensions either... I feel like GitHub actually tried to do something like this but I don't know if it went very far.


Here is the GH blog post I'm thinking of from 2013:

https://github.blog/2013-09-17-3d-file-diffs/


I've been working in the collaborative editing space for a decade now (!) and I think this problem is simply a consequence of an age-old architectural mistake that we persist with because we don't know any better.

The problem is that the filesystem acts as a dumb blob store that doesn't understand changes. And it doesn't understand any of the semantics of what it's storing. It's a lovely abstraction - but it has bullied us into the age-old "load / edit / save" workflow. And that workflow just isn't very good:

- It's slower for the computer - because no matter how incremental the edit, the computer has to rewrite the entire file to disk

- It's buggy by default. A power loss or malfunction can and will result in a corrupt save file. You can fix this with careful use of fsync() and renames (a sketch of that dance follows this list), but almost nobody does it correctly because it's complicated and confusing. And it makes saving slower. You see this all the time in video games - "When you see this symbol, don't turn off your console".

- Because you only give the OS snapshots in time, you end up discarding information about the changes that have happened along the way.
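For reference, the fsync-and-rename dance from the second point looks roughly like this (a POSIX-ish sketch; details vary by platform and filesystem):

  import os

  def atomic_save(path: str, data: bytes) -> None:
      tmp = path + ".tmp"
      fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
      try:
          os.write(fd, data)
          os.fsync(fd)               # flush the new contents to disk
      finally:
          os.close(fd)
      os.rename(tmp, path)           # atomically replace the old file
      dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
      try:
          os.fsync(dirfd)            # persist the rename itself
      finally:
          os.close(dirfd)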

With code editing we work around this with diff-match-patch, but it's lossy, not realtime, and not good enough for collaborative editing.

With more complex data (like 3d models) this problem is much more egregious. There's no reason the modelling program couldn't output a fine grained sequence of changes as they're made by the user. Any collaborative editor already does this.

Having a rich set of changes gives us way more options for merging - eg, they could be merged in a conflict-free way if we want. Or with a custom UI, or whatever. It's strictly more information than you can generate with diff-match-patch.

The problem is there's no obvious place on the filesystem to put that information, or way to tell the OS about changes as they happen. And so long as the file system is the way that disparate programs talk to each other, and the file system can't handle this information, programs end up just discarding any detailed change sets. And we end up with ugly JSON merge conflicts in 3d files. Collaborative editing (if it happens at all) ends up done on a per-app basis rather than just working by default across the whole OS - like it should.

And the result of all that is that the desktop is being slowly sidelined in favour of proprietary web apps (like figma or google docs). Why? Because web apps aren't constrained to the paradigm of saving and loading files.

It won't be easy, but it seems to me like a no-brainer for us to fix.


> Having a rich set of changes gives us way more options for merging - eg, they could be merged in a conflict-free way if we want. Or with a custom UI, or whatever. It's strictly more information than you can generate with diff-match-patch.

this reminds me of an event store from event sourcing. the file itself is just the on-disk persisted format of the 'permanent undo log'.

Depending on how fancy the system is, one could view all of the changes, then mark a particular change in the past as 'deleted', or modify that change's parameters, or maybe even do a 'this sequence of changes replaces that change there' operation. on disk, there are just more events added to the event store, while onscreen, the file is re-rendered / re-projected anew to reflect what it would look like as if those changes were original. Surely i'm just poorly reinventing CRDTs or DAGs or similar, right?

granted, i don't know how easy it would be to not cascade subsequent edit events if one of their past dependencies was gone, so we'd need to make the events be 'pure functions' (the X-th through (X+N)-th words equal Y) rather than imperative statements (add text at offset X with content Y)
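a very rough sketch of the idea (the event shapes here are made up): an append-only log that gets re-projected into the visible document, where 'deleting' a past change just means skipping it on replay -

  import json

  def project(events):
      """re-render the document by replaying every non-deleted event."""
      words = []
      for e in events:
          if e.get("deleted"):
              continue
          if e["op"] == "set_words":          # 'pure' form: words i..j become Y
              i, j = e["range"]
              words[i:j] = e["words"]
      return " ".join(words)

  log = [
      {"op": "set_words", "range": [0, 0], "words": ["hello", "cruel", "world"]},
      {"op": "set_words", "range": [1, 2], "words": ["kind"]},
  ]
  print(project(log))        # -> hello kind world

  log[1]["deleted"] = True   # mark a past change as deleted...
  print(project(log))        # ...and re-project: hello cruel world

  with open("doc.events", "w") as f:   # on disk, the file *is* the event log
      for e in log:
          f.write(json.dumps(e) + "\n")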


> web apps aren't constrained to the paradigm of saving and loading files.

How is reading and writing bytes to FS that different from doing the same on a network stack? In fact I'm puzzled by your post which seems to imply that a 'database' -- a store for structured data -- can't be written over a file system.

> The problem is there's no obvious place on the filesystem to put that information, or way to tell the OS about changes as they happen.

A thin layer of abstraction over FS seems reasonable, possible, and frankly ancient tech: they are called databases.


Of course you can build things on top of files. But most people don't, they use files directly, and therefore get "bullied" (which I think is a very apt term) into load/edit/save cycles of file-shaped chunks.

Application programmers could be provided with a better storage API, either as a library or as a standard part of the platform. The result would be better software with little work on the application programmer's side. I'm not a Mac person anymore, but I believe they have something like this in the form of CoreData (no idea if it's good or not), and I imagine Windows has seven things that you can either use or not.

An analogy would be how it's possible to build something that resembles concurrent processing on top of any single-tasking system, but a platform that doesn't provide it as a built-in will be very different from a platform that does.

Non-UNIX operating systems had structured data as a core part of their libraries. This was probably a good idea that got thrown aside by what some people call "the minimalist elegance of UNIX", which other people, including I imagine my GP post, have less endearing terms for.


The GP I replied to starts with "I've been working in the collaborative editing space for a decade now (!)" and claims a network interface somehow magically does something that an FS cannot (and, speaking of Unix, entirely ignores that both mechanisms use file descriptors). I mean, what to make of statements such as "[t]he problem is there's no obvious place on the filesystem to put that [meta-data] information"?

As for the generic "application programmers", possibly you are right, and doing 'literal' mappings between their memory and persistent models is to be expected.


> [t]he problem is there's no obvious place on the filesystem to put that [meta-data] information

Well, there isn't, except the file itself. I.e. you build a protocol on top of it.


Yeah, just like "web based" systems use a protocol to shuffle bits around. So an FS will in fact support building higher semantics on top of files and folders, and a visit to your local .git directory will provide an example of a contemporary popular tool that does precisely that.

Now possibly the GP is upset that random app A can't just use the files spit out by app B, which is as reasonable an expectation as having random client C use a universal driver to talk to various servers.


The problem isn’t that doing this stuff is impossible - it obviously is possible; databases work well. The problem is that the point of interoperability between applications in Unix is the file. It’s not the database - applications don’t pass data to each other via Postgres. They pass data with files. And that means if you want multiple programs (eg, a 3D modelling program and git) to share information, the way they do so is by the modelling program saving flat files and git reading them (and diffing them). That just isn’t very good.

We can work around this on a per-application basis. My code editor could output all the changes to a custom SQLite database. But then it won’t work with all the other tools on the system. I can't use grep or awk. I can't use git properly. We end up with isolated bundles of bits that can't easily be used together.

Doing that goes against platform conventions. It’s not the Unix way. So people don’t do it. And as a result, for interoperability reasons applications don't save their data to databases. They use flat files, hope nothing gets corrupted and discard data about changes. The reason isn’t technical. It’s because conforming your software to the platform convention is what people want. I want to be able to just copy my data to a usb key without thinking about postgres's data format and extraction tools.

There's a parallel in programming. Languages define common types - like strings. But they don’t need to. Rust could leave the String implementation details to third-party crates. The problem is that if they did, libraries wouldn’t be able to share strings easily with each other, because they might each be using a different string data type. So standard APIs would probably avoid strings whenever they could. This is the problem we have on Unix. There’s no interoperable, platform-standard way to store changes on disk. So all the apps I use don’t. When git needs change feeds, it’s forced to use diff-match-patch instead. But it’s not as good: you don’t get real-time feeds. And it doesn’t work properly with non-text data. Conflicts are unavoidable. And patches are non-canonical.

You’re right - we can solve all this on a per-application basis. But I don’t want individual apps to have their own hand-rolled data silos and their own crappy git replacements. I want a platform where those tools work with any kind of data - even with apps that haven’t been written yet. That’s the Unix way. I love the filesystem. I just think that after decades of experience we can do it better.


Of course, what you want is understood.

Put on your OS designer hat and let's think this through. A generalized capability for persistent data with universal semantics -- the analog of the magical universal driver for networked IPC -- is a design brief for a generalized database. So your first decision is 'how is the data organized'? Given the generality of the brief, the optimal candidate is a graph of documents. So a graph database that would allow arbitrary, semantic, linkages between files.

So now we need to get a handle on the semantics. At the individual document level, the OS will need to provide some sort of type system for data, the equivalent of your example of types in languages.

Then we need a generalized mechanism to allow for creation of higher level semantics -- the bits that semantically relate doc A to doc B. These would be the arcs of the graph. So now we're looking at something like RDF.

https://www.w3.org/RDF/

https://www.w3.org/DesignIssues/Metadata
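As a toy illustration of those arcs (the vocabulary here is made up), the "higher level semantics" are just typed triples relating one document to another, e.g. via rdflib:

  from rdflib import Graph, Namespace, URIRef

  EX = Namespace("http://example.org/os/")      # hypothetical vocabulary
  a = URIRef("file:///home/me/report.model")
  b = URIRef("file:///home/me/report-v2.model")

  g = Graph()
  g.add((b, EX.derivedFrom, a))                 # provenance arc between documents
  g.add((b, EX.editedWith, EX.ModelerApp))      # typed metadata about the document
  print(g.serialize(format="turtle"))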

Will this be a freaking cool OS? Definitely. Will it be performant? TBD, but obviously not as performant as the current state of affairs. And that's only the mechanisms aspect of it. You will still need to address all the issues related to 'shared/universal semantics' that the Semantic Web people have to deal with.

At that point you the OS designer need to ask yourself if this is a reasonable task for the OS.



It’s easy to represent 3D data as JSON polygon soup, which can then be diffed and merged as text. A lot of game engines store geometry data in this way. It often works fine until there’s a merge conflict, at which point things become difficult.

Revisions A and B have, as part of their changes, both modified vertices originally on lines 437 and 7689, and A has changed 5678 while B has deleted it. It’s often easier to discard one revision and redo the work in-order, rather than try to resolve.


> To be clear - the problem isn't that we cant detect a difference, it's that we cannot produce a useful view of the difference such that a human can make an informed decision.

The comment is not talking about diffing JSONs but being able to understand what the difference means.


Yes, exactly. I was providing an example.


Most CADs are already history based, and in fact can't work without it.

A much bigger challenge is to make a historyless CAD AKA "direct modelling"

I honestly admit that I have failed to understand how the geometry solvers of direct-modelling CADs work.


Ooh this is exactly what I've been thinking about. Text is such a slow, clunky medium. It'd be interesting if you could think of versions as events modifying a tree. Renaming a variable and inserting a character would both be an event. Also I wonder if structural editing will take over. IDEs are already so powerful that if you could create good keybindings, you could do so much with just IDE commands (generate expr, rename var, swap args, etc.). Then if your editor knows that it will always keep a valid AST, what can you do with your tooling?


I really, really hope so.

Text is so clunky, especially in languages with superfluous syntax (semicolons, braces). My tree-based outliner allows me to easily rearrange arbitrarily large blocks while never creating invalid syntax; why the heck doesn’t my IDE? Code is just a damn tree. Why can’t I arbitrarily choose to comment code out/in without breaking basically all the IDE tooling (collapsed a block? Well, too bad!!)?

We should never have to think about syntax. Yet we (or certainly I) do a significant portion of the time.

The stuff I’m thinking of should be fairly possible to do as a Vscodium/VSCode plugin. Can somebody please tell me it’s already being done?


Those semicolons are redundant but not superfluous. Here are some good reasons why you might want to keep them around even if they aren’t strictly necessary for parsing your program.

https://digitalmars.com/articles/b05.html


Does it really make much of a difference whether you press an end-of-statement keyboard shortcut vs. typing a semicolon?

Having the latter as part of the source code is more explicit, similar to LaTeX vs. invisible formatting marks in a word processor.


I don’t think that always keeping a valid AST is important. Realtime highlighting of syntax errors already resumes parsing after invalid code, usually mapping to error nodes internally. That is, you still have an AST, just with additional node types. Having an interim state with error nodes isn’t really different from having intermediate states with temporary (possibly large) changes in valid code, e.g. where you move/cut/paste larger portions of code around, and then maybe decide to change it back (or just change back some parts). Creating a sensible history of AST operations doesn’t really depend on whether you have error nodes in your AST grammar or not.

On the other hand, allowing error nodes (i.e. invalid code) at least as an intermediate state arguably allows more freedom and creativity when editing code, and feels less coercive. It is also unavoidable in certain contexts, such as while typing an identifier, the identifier may be invalid in most intermediate states until you have finished typing it.

Therefore I’m unconvinced that restricting editing to valid ASTs is (a) critical to collaborative editing and versioning, and (b) strictly desirable from a usability perspective.


Agreed, structure editors must have interim/liminal representations that are only partially formed, or else the editing experience doesn't flow. I do think there's value in tracking the movement/mutation of AST nodes with the granularity and precision only achievable by a structure editor, so I hope someone cracks the formula.

One thing I think would be cool is if you could select a region of code and move it up/down the abstraction ladder e.g. convert an AST subtree into tokens, or all the way down to individual UTF8 characters. Then do some text manipulation on the plaintext, and finally "cook" them back together into AST nodes. The real cool trick would be to have the editor diligently track the flow of nodes/identifiers spanning the transmutation.


JetBrains MPS is such an IDE. For a while I was also fascinated by this approach (structured editing), but I now think that the freedom character-level editing gives is non-negotiable for me.

It is easier to ask for forgiveness than for permission, and the same is true for syntax errors: it is easier to report errors (and fix them) than to try to always maintain the document in a valid state.


Darcs (and Pijul?) can support more patch types than textual diffs, but I doubt much use has ever been made of that. I don't know about the more general case, but Darcs does now support an extra patch type for identifier replacement, at least as basically s/x/y/g. (One place where another type might be useful is changelogs, but I never looked at what that might take.)

The Toolpack tool set for Fortran from the '80s was based around parse trees and had a VCS, but I don't remember whether that actually operated on trees or just text.


Darcs could have some slightly crazy patches like replace-regexp, but to be honest I don’t think that works so well on a large codebase.

The problem I’ve had so far thinking about structural patches is that they need to capture several weird kinds of edit that imply that the patch cannot simply mirror the tree structure of the file. Some examples of tree patches (using sexps to show the tree structure):

  (A B C) -> (A B B2 C)
  (A B C) -> (let ((b B)) (A b C))
  ((A B) (C D)) -> ((A B C D))
  ((A B C D)) -> ((A B) X (C D))
  (F X (G Y)) -> (G Y)
Specifically it is operations like splitting, merging, raising, lowering, and reordering that can cause problems with computational and logical complexity.

If you look at the model for pijul, it goes:

1. A file is a totally ordered set of lines (or other atoms), some of which are marked as “deleted”

2. We consider a patch to be a map from one file to another, taking lines that don’t change to their counterparts, deleted lines to “deleted” lines, and changed lines to their new versions. The map must respect the order by being strictly increasing

3. We now generalise to allow for merges (and merge conflicts) by changing total order to partial order (it is a conflict where the ordering is not total) and patches are increasing maps.
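As a toy reading of points 1-2 (not pijul's actual code): a file is a sequence of atoms that may be marked deleted, and a patch deletes and inserts atoms while keeping the surviving ones in order, so the induced map is strictly increasing:

  from dataclasses import dataclass

  @dataclass
  class Atom:
      id: int              # stable identity across versions
      text: str
      deleted: bool = False

  def apply_patch(atoms, deletions, insertions):
      """deletions: atom ids to mark deleted; insertions: atom id -> new atoms
      to insert just after it (0 means 'at the start of the file')."""
      out = list(insertions.get(0, []))
      for a in atoms:
          out.append(Atom(a.id, a.text, a.deleted or a.id in deletions))
          out.extend(insertions.get(a.id, []))
      return out

  v0 = [Atom(1, "fn main() {"), Atom(2, "  old();"), Atom(3, "}")]
  v1 = apply_patch(v0, deletions={2}, insertions={1: [Atom(4, "  new();")]})
  print([a.text for a in v1 if not a.deleted])   # ['fn main() {', '  new();', '}']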

It is hard for me to see how such a thing would extend to tree-structured files. But other people online seem more confident, so maybe I’m wrong.


git can support different diff/merge tools. I just wish more of git's configuration could be added to the repo itself. As it is, if you need a custom merge tool (like UnityYamlMerge) you need each user to configure it separately.

The consequence is every contributor needs to know enough about every file type in the repo to know whether a custom merge tool should be added/updated. You might get surprised by a merge conflict in a filetype you never touched if you happen to be the one merging down feature branches.

Hopefully some of this stuff and default client githooks are fixed one day. Seems easy enough to add a "suggested project config" to git.


The problem is those merge tool configurations are all security holes. Also, I get annoyed whenever something (like husky) messes with my git configuration: like editor customizations, I have a certain git workflow and I dislike projects that disrupt it.


Sure, I don't think it should be more than a suggestion you can vet yourself but I would like the process to be automated once approved.


Defining the drivers themselves needs to be done outside the repo, as letting the repo run arbitrary commands is a security hole, but which merge/diff drivers apply to which files can be specified in the gitattributes file, and that does work in-repo.


>security hole

It's really not that hard to imagine how you could do this in a secure way. Git has identity and signatures built in. Maybe only updates from trusted users are considered, and you still must agree to the updates, etc etc.

Throwing up our hands and saying it's impossible seems like a terrible choice.


If you're going to make me manually agree to the updates, you may as well just have me run a script manually.

I'm not saying it's impossible! I'm saying it's pointless. Trust people to set up their own config given tools and instructions, and they might even surprise you by coming up with a better workflow.


Third party tooling could interface with something built in but not the one off scripts people use now. I'm not trying to be draconian here. I just want a built in way to share needed configuration information that we have to special case now.


Third-party tooling already interfaces with hooks/merge config without having to check them in. You put the script that does it in the repo, and you ask users to run the script on checkout.


I want exactly this but with an official format.


Since the beginning of computer time people have been working on structure editing, because academically it's very compelling, yet in practice text wins out over and over. That said, there's probably a lot of opportunity to have "structure under the hood", but that's kind of a moot point in general because that's what linters, compilers, etc., are.

But maybe his specific point about structural diffs is salient; that maybe there are huge wins in structural diffs that we haven't tapped into for some reason. Again, there are decades of research in structural diffing, so where's the impact?


It works well with lisps at the very least


Of the Lisp Machine vendors, only Xerox had a structure editor.


This is right, strictly-speaking, but paredit-style editing was one of the most compelling features in my decision to write lisp whenever I can


"Structure editor" can mean many things.

You have a text file that contains structures the editor knows about. You can edit and modify syntactically and semantically incorrect structures, leave them broken or incomplete, etc.

Then there is the strict structure editor idea, where the editor ensures you write correct structures, and modification happens from one correct structure to another correct structure. That has never caught on. It turns out that editing code is more than just writing down the algorithm. It's also sketching, doodling and playing with ideas, ...

Lisp environments have the best of both worlds. You have text, but also live structures that exist in the runtime image and can be saved. The close match between the text syntax and the internal data structures makes it a pleasure to work with.


Check out Dion Systems [0] for a recent, useful take on structured editing. The demo is rather inspiring.

[0] https://media.handmade-seattle.com/dion-systems


The challenge with implementing this is dealing with half a dozen types of operations, or maybe more. In typical string OT/CRDT we are dealing with a minimal set of operations (insert/delete), but when it comes to a structure (i.e. semantic trees) the ops are very tailored to that semantic structure and could span and evolve with the structure.

Even if we get the OT part right, it’d be a huge effort to port this to support other semantic structures with a different set of ops. Also, I can’t wrap my head around how transformations and conflict detection work in these cases. Will watch out for more from this project.


> Even if we get the OT part right, it’d be a huge effort to port this to support other semantic structures with a different set of ops

It's very doable, and it's been done. E.g., in json1 [1] we have support for lots of tree operations on a set of data stored as a JSON tree:

- Insert / delete in lists

- Move (reparent) any object in the tree

- Embedded edits (so if you have a custom type inside a field, you can define your own operations for it)

This is all very general purpose. You don't need to reinvent any of this per application.

[1] https://github.com/ottypes/json1
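(Not json1's actual operation format - just a rough sketch of the idea: path-addressed insert/remove/move on a JSON-like tree, which is enough to reparent whole subtrees generically:)

  def get(doc, path):
      for key in path:
          doc = doc[key]
      return doc

  def insert(doc, path, value):
      parent, key = get(doc, path[:-1]), path[-1]
      if isinstance(parent, list):
          parent.insert(key, value)
      else:
          parent[key] = value

  def remove(doc, path):
      parent, key = get(doc, path[:-1]), path[-1]
      value = parent[key]
      del parent[key]
      return value

  def move(doc, src, dst):
      insert(doc, dst, remove(doc, src))      # reparent a whole subtree

  doc = {"scene": {"nodes": [{"name": "cube"}, {"name": "light"}]}, "trash": []}
  move(doc, ["scene", "nodes", 1], ["trash", 0])
  print(doc)   # the light node now lives under "trash"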


Ah, we meet again! Good morning Joseph:)

Let me be specific. I agree trees (or JSON) in general can have a fixed set of general-purpose ops. In this particular paper, though, the author claims changing the type of a record struct is itself an operation. This allows treating it as a special kind of op, and applying it means it has to touch a whole other set of nodes. This is just one example; there could be more ops like this, tailored to the structure he is working with, which is what got me thinking :)


Also, what happens if the structure was edited in some other editor and you suddenly get two structures with no history to compare against?


Haven't read the paper thoroughly yet, looking forward to it. The idea here seems to be very type driven and I think there is something to it.

The general goal reminds me of Unison[0], which takes a different approach. It sees code as kind of a database where the functions are immutable entries. So it is less granular, but likely more semantic.

What I immediately thought of reading your comment is paredit. I know of the Emacs mode[1] and the Calva VSCode plugin[2]. One could work from there, see code evolution as collection of structured editing units.

And then, some languages are extremely terse like APL or Forth. Haven't yet found time to study them, but maybe their representation and semantics are more suitable for this type of thing?

But yeah, just text might just not be the right medium for code in the first place. Not when we start thinking about what code actually is.

We're manipulating structures indirectly by manipulating text. Something is not right here... I know there have been many attempts to move away from it, some are successful but only for specific use-cases and I don't think anything succeeded in the general purpose space. Maybe someone will succeed though. There is no reason to believe otherwise. I feel like it would have to be a very cross disciplinary collaboration. People who make games, databases, art, science. Different perspectives to break out of what we think programming is or should be.

I watched this talk[3] some months ago. One of the cool things is the discussion near the end of the video at around 1h11m: look what Sussman does, when he talks about stratification and code structure - he closes his eyes. What is he seeing there? He explains it sure, but he _sees_ something. That's what the program _is_, not the text, not the bits and bytes. It's a deeply connected, complex, flowing structure - I think they talk about forests in there.

When we program, we manipulate this structure and the text we write is kind of far away from the actual mental model we have. Yes, I see code in my inner eye too, but that is when I think about implementing it, or when I navigate actually written code from memory. But it's not _the thing_.

[0] https://www.unisonweb.org/

[1] https://www.emacswiki.org/emacs/ParEdit

[2] https://calva.io/paredit/

[3] "Stratified Design: A Lisp Tradition" https://www.youtube.com/watch?v=BoGb56k2txk


I spent a while working on a generalized version control system when I graduated two years ago. It was called Saga [1]. Saga - get it? The name was the best bit.

It allowed you to specify a “file representation format,” and then used some messy 2D-and-above longest-common-subsequence matching algo [2] I came up with to diff the files, and merge them if you wanted. It was a lovely learning experience I tried to pass off as a startup, and I got two of my friends involved as cofounders.

From there, we tried to focus (generalized version control is really hard… technically and otherwise), and pivoted to version control for Excel spreadsheets. At one point we had branching and merging working for XLSX files. But as we began to discover what version of Excel customers used, things got a lot less fun. That + lack of interest led to another pivot.

Anyways, for the past year (just passed!) we’ve been building Mito [3] with our learnings from all those spreadsheet folks we spent time with above. Mito is effectively a spreadsheet within your Python environment. It’s absolutely still getting off the ground, but we’re pretty proud of the value we’re delivering to users currently!

[1] https://github.com/saga-vcs/saga

[2] https://github.com/saga-vcs/saga/blob/master/saga/base_file/...

[3] https://trymito.io/hn


Of historical interest is Interlisp-D, a system that did structure editing and version management. It was at the beginning of time, so getting it to work again as a practical development environment is a lot of work.

https://github.com/Interlisp/medley/issues/533


> Perhaps version control is actually the weak point of the textual edifice, where we might be able to win a battle.

It would be interesting because, as the paper says, textual editing has great deployment and collaboration tooling. So if non-textual editing could get a foothold in that exact area - git - it could draw in a ton of people who just want to get things shipped.


Super interesting. Instead of going whole-hog, could we add some kind of hinting system to existing text-based systems that would make structural changes known to the VCS? Maybe also make it clear what's a comment or other insignificant change so that the important changes can be tracked separately?


At first glance I thought it was some kind of version control for designing tool, like figma.

In my experience, workflows between designers are highly variable, and the designs rarely reflect production fidelity. I am hoping for a tool to facilitate the collaboration between visual/UI design and engineering. Anyway, I'm getting tangential here.


The mention of schemas made me wonder if David Spivak's work on functional database migration might be relevant.

http://math.mit.edu/~dspivak/informatics/FunctorialDataMigra...

Note: requires big brain


Demo video linked in the paper: https://vimeo.com/631461226


Anyone working with Unity in Git knows this pain.



