
The benchmarks Matt Weidner has been working on are great and outside scrutiny is always welcome, but I should note that I find there's an element of artificiality to them. In particular, testing the performance of the sync system while simulating many users typing into the same document doesn't really measure behaviour we have observed "in the wild". In our research, we've found that editing is usually serial or asynchronous. (See https://inkandswitch.com/upwelling for further discussion of our collaboration research.)

The benchmark I care most about (and that I'm pleased with our progress on!) is that you can edit an entire Ink & Switch long-form essay with Automerge and that the end-to-end keypress-to-paint latency using CodeMirror is under 10 ms (next frame at 100 Hz).

While these kinds of benchmarks are greatly appreciated and absolutely drive us to optimize the problems they uncover, our first priority is to work backwards from problems experienced in real usage.



Ouch Peter. Massive offense taken.

> In our research, we've found that editing is usually serial or asynchronous.

Medium-to-large company with a town hall = many people editing a document at the same time. Workshop at a company or a university with a modest-size classroom = many people editing a document at the same time. I can't tell you how many times our web-based collaborative code editors would fall over during talks we gave to small audiences back in the days when I led the Scala Center.

Just because one of the benchmarks you have seen (out of a multitude of benchmarks) breaks Automerge by stressing it in what we believe is the most stressful scenario possible (multiple concurrent users, which is sort of the point of concurrency/collaboration frameworks) does not make it artificial or worth so flippantly discarding.

> long-form essay with Automerge and that the end-to-end keypress-to-paint latency using CodeMirror is under 10 ms (next frame at 100 Hz)

Not at all what we measured.

I'd just like to register here that Yjs is the framework most widely used "in real usage" (your words), not Automerge (for many reasons, not just performance).


Please accept my unreserved apologies, Heather! No offense is intended. I can speak for everyone working on Automerge when I say that we've very much appreciated Matthew's work and have indeed spent quite a lot of time studying and responding to it. We spoke about it in person last week, in fact.

As for the use-cases, I do not mean to exclude live collaboration from consideration, just to note that it hasn't been our focus or come up often in the use-cases we study. Live meeting notes are definitely a real use-case and I don't dispute the performance results you show.

As for Yjs, it's a wonderful piece of software with excellent performance and a vibrant community made by exceptional people like Kevin Jahns. We simply have slightly different goals in our work, which undoubtedly reflect where our engineering investments lie.

Indeed, your paper did not measure the same things we look at, and that's why it found new results. Hopefully in time we will join the other systems in performing well on your benchmarks as well.


> We simply have slightly different goals in our work, which undoubtedly reflect where our engineering investments lie.

I’d love to hear more about this. Do you elaborate anywhere?


I have been writing a video game using automerge-repo for networking & save files. I researched Yjs and Automerge and felt that Yjs is better suited to an ongoing session like a conference call, whereas automerge is better suited for network partitions etc. This fit my use-case best. My opinion might be out-of-date as this area is moving quickly, and there are quite a few options out there now.
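Roughly, my setup looks like the sketch below. This is a minimal, hand-wavy sketch in TypeScript against the automerge-repo API as I understand it; the adapter choices, the sync server URL, and the GameState shape are just illustrative placeholders, and names may differ between versions.

    import { Repo } from "@automerge/automerge-repo"
    import { BrowserWebSocketClientAdapter } from "@automerge/automerge-repo-network-websocket"
    import { IndexedDBStorageAdapter } from "@automerge/automerge-repo-storage-indexeddb"

    // Hypothetical shape of a save file / replicated game state.
    type GameState = { level: number; inventory: string[] }

    const repo = new Repo({
      // Sync with peers through a websocket relay when one is reachable...
      network: [new BrowserWebSocketClientAdapter("wss://sync.example.com")],
      // ...and persist locally so saves survive network partitions.
      storage: new IndexedDBStorageAdapter(),
    })

    // Create a document; its URL can be shared so other players join the same game.
    const handle = repo.create<GameState>()
    handle.change(state => {
      state.level = 1
      state.inventory = ["sword"]
    })

    // Later edits (possibly made offline) are written locally and merged when peers reconnect.
    handle.change(state => {
      state.inventory.push("potion")
    })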


> there's an element of artificiality to them. In particular, testing the performance of the sync system while simulating many users typing into the same document doesn't really measure behaviour we have observed "in the wild".

I've seen Matt's work and I think it's quite reasonable to benchmark a concurrent data structure under concurrent load. Placing systems under high load, even just as a limit study, is how we reveal scalability bottlenecks, optimize them, and avoid pathologies. It's part of good engineering.

If your research can produce more representative real-world workloads, contributing them as new benchmarks would add to the field's knowledge.
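For what it's worth, a limit study of that kind can be approximated in a few lines. Here is a rough sketch against the @automerge/automerge JS API; the user/edit counts and document shape are arbitrary, and it only times the merge step, whereas Matt's benchmarks also exercise the sync/network layer.

    import * as A from "@automerge/automerge"

    const USERS = 100
    const EDITS_PER_USER = 50

    // A shared base document that every simulated user starts from.
    const base = A.from<{ items: string[] }>({ items: [] })

    // Each simulated user forks the base and makes independent (concurrent) edits.
    const forks = Array.from({ length: USERS }, (_, i) => {
      let doc = A.clone(base)
      for (let j = 0; j < EDITS_PER_USER; j++) {
        doc = A.change(doc, d => { d.items.push(`user${i}-edit${j}`) })
      }
      return doc
    })

    // Time how long it takes to merge all the concurrent histories back together.
    const t0 = performance.now()
    let merged = base
    for (const fork of forks) merged = A.merge(merged, fork)
    console.log(`merged ${USERS * EDITS_PER_USER} concurrent edits in ${(performance.now() - t0).toFixed(1)} ms`)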


> testing the performance of the sync system while simulating many users typing into the same document doesn't really measure behaviour we have observed "in the wild"

We use co-editing far more commonly than serial editing.

Coming from a background of XP (extreme programming, pair programming) and a Pivotal Labs style approach to co-thinking, even for executive work we require everyone in a meeting (whether at the conference table or remote) to be in the shared document and, instead of giving feedback verbally, to comment or edit in place.

We care a LOT about how laggy this gets, how coherent it remains, and whether it blows up and has to be restarted or, worse, reverted.

If a firm's culture is to "whiteboard" by having one person at the board and everyone else surfing Hacker News, it might not exercise this. If a firm's culture treats whiteboarding as a shared activity, with everyone gathered around holding their own marker, or even grabbing it from each other, it might need to exercise CRDTs this way.

Put another way, if you "share" in a conference room with an HDMI cable to a TV, or share in Teams or Zoom by window sharing, you may not be a candidate.

If you "share" by dropping a link to the document in a chat, and see by the cursors and bubbles who is following along, you are a candidate.

. . .

In "Upwelling" you describe an introverted and solitary creative process, before revealing a sufficient quality update to others.

That is certainly a valid use case for unspooling thoughts from one brain, and if those are the wilds you are observing, it makes sense that that's what you'd observe in the wild.

It is not, however, the most productive way to invent solutions to logic puzzles with accuracy and correctness in fewer passes, nor for most any other "group" activity. So maybe your "not what we see in the wild" should be qualified with "but we're actually not looking for live collaboration, we're looking for post-drafting merge".

That said, the choice of the name "auto-merge" is now much clearer: it advertises your use case right on the tin, if one thinks about it.

So thanks for the upwelling link, repeated here for convenience:

https://inkandswitch.com/upwelling


Automerge does indeed work with live collaboration, though apparently not currently as efficiently as some other solutions. Everyone working in this space is exploring and looking for solutions that will work for users with slightly differing priorities. In addition to Automerge, consider checking out Yjs, ElectricSQL, Diamond Types, Replicache, vulcn, or any of the other projects. Hopefully one of them will be just right for you.



