I'm happy to see representatives from both the matrix and yjs communities interact more directly.
Can you expand on what you mean by this?
> an API that others can use to efficiently store shared data
What would you expect this API to look like in a bit more detail? Would it be able to abstract any of the underlying CRDT logic? Would it just be a raw stream of authenticated messages with partial ordering? Something in-between?
Instead of building another shared-editing solution specialized for Matrix, there could be an API that can be used to store and distribute real-time updates efficiently (probably in the Matrix DAG).
matrix-crdt works really well. To avoid overloading the Matrix server with many small messages (every single keystroke produces an update message), it stores merged updates in the DAG after a short debounce. The optional WebRTC extension allows you to distribute messages immediately "off the chain", so you don't notice the debounce.
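To make the debounce idea concrete, here is a minimal sketch. It is not matrix-crdt's actual code: `DebouncedBatcher`, `merge_updates`, and the explicit `now` argument are hypothetical names picked for illustration, and the merge is plain byte concatenation standing in for whatever merge function a real CRDT library provides.

```python
from typing import Callable, List, Optional


def merge_updates(updates: List[bytes]) -> bytes:
    # Placeholder (assumption): concatenation stands in for a real CRDT merge.
    return b"".join(updates)


class DebouncedBatcher:
    """Collects small updates and sends them as one merged message."""

    def __init__(self, send: Callable[[bytes], None], delay: float) -> None:
        self.send = send
        self.delay = delay
        self.pending: List[bytes] = []
        self.deadline: Optional[float] = None

    def push(self, update: bytes, now: float) -> None:
        # Every new update restarts the quiet period.
        self.pending.append(update)
        self.deadline = now + self.delay

    def tick(self, now: float) -> None:
        # Flush once no update has arrived for `delay` seconds.
        if self.deadline is not None and now >= self.deadline:
            self.send(merge_updates(self.pending))
            self.pending = []
            self.deadline = None


# Five keystrokes 0.1 s apart, then a pause: only one message is sent.
sent: List[bytes] = []
batcher = DebouncedBatcher(sent.append, delay=0.5)
for i, ch in enumerate(b"hello"):
    batcher.push(bytes([ch]), now=i * 0.1)
    batcher.tick(now=i * 0.1)
batcher.tick(now=1.0)
```

Passing the clock in explicitly keeps the sketch deterministic; a real client would drive `tick` from a timer.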
After a while the message log gets pretty huge. So in matrix-crdt, a random client will eventually store a "snapshot" of the current state in the DAG and remove old entries. This way, new clients don't need to download the huge message log.
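Roughly, the snapshot step could look like the sketch below. It makes strong simplifying assumptions: the shared state is a grow-only set, the log is a plain list, and `replay`, `compact`, and `max_entries` are made-up names. It also ignores updates that arrive concurrently with the snapshot, which a real implementation has to handle.

```python
from typing import FrozenSet, List, Tuple

# A log entry is ("update" | "snapshot", set of items).
# In this toy model both kinds are applied by set union.
Entry = Tuple[str, FrozenSet[str]]


def replay(log: List[Entry]) -> FrozenSet[str]:
    """Rebuild the current state by applying every entry in the log."""
    state: FrozenSet[str] = frozenset()
    for _kind, payload in log:
        state = state | payload
    return state


def compact(log: List[Entry], max_entries: int) -> List[Entry]:
    """Replace a long log with a single snapshot of its state."""
    if len(log) <= max_entries:
        return log
    return [("snapshot", replay(log))]


# Five small updates get folded into one snapshot entry.
log: List[Entry] = [("update", frozenset({c})) for c in "abcde"]
log = compact(log, max_entries=3)

# Later updates are appended after the snapshot as usual.
log.append(("update", frozenset({"f"})))
```

A client joining now replays two entries instead of six and still ends up with the same state.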
It would be nice if there was a possibility to create a server-component that does the merging.
(Btw, all credit to the above approach goes to Yousef)
Now, there might be a better solution to store CRDT data in the Matrix DAG - the developers probably know best and might be able to expose some hidden API that would make everything even more efficient.
I'm just asking that instead of creating yet another CRDT and integrating it into Matrix, open up this space, provide better APIs, and let others integrate their CRDTs.
> Would it be able to abstract any of the underlying CRDT logic?
Modern shared-editing frameworks don't require you to think about internal logic. They just set some requirements on the ordering of update messages. CRDTs in particular don't care in which order you transmit data, which makes them a very interesting choice in practice.
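A toy example of the "order doesn't matter" property, using a grow-only counter. This is not how Yjs works internally (it is a much more elaborate sequence CRDT); it only illustrates why a merge that is commutative, associative, and idempotent lets the transport deliver updates in any order. The names `merge` and `value` are picked for this sketch.

```python
from functools import reduce
from itertools import permutations
from typing import Dict

GCounter = Dict[str, int]  # replica id -> count seen from that replica


def merge(a: GCounter, b: GCounter) -> GCounter:
    # Per-replica maximum: commutative, associative, and idempotent.
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def value(counter: GCounter) -> int:
    return sum(counter.values())


updates = [{"alice": 2}, {"bob": 1}, {"alice": 3, "carol": 4}]

# Apply the same updates in every possible delivery order.
results = {
    tuple(sorted(reduce(merge, order, {}).items()))
    for order in permutations(updates)
}

# Delivering an update twice changes nothing either.
once = reduce(merge, updates, {})
twice = reduce(merge, updates + updates, {})
```

Every ordering collapses to the same state, so `results` contains a single element.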
It might be worth putting together a chat between matrix and a few of us! I have some thoughts on this too, having written two differently designed CRDTs with diamond types.
Replaying a series of changes from an operation log is quite doable (blog post incoming). But having a way to compress / annotate the operation stream will lead to far better performance in lots of ways. Especially as Kevin says - with CRDTs like Yjs and automerge which consider document order (not time order) as the canonical representation.
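As a rough illustration of what compressing the operation stream can mean, the sketch below folds consecutive single-character inserts into runs. It assumes a single linear log of position-based inserts; `Insert`, `compress`, and `replay` are hypothetical names, and this is not the format diamond types, Yjs, or automerge actually use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Insert:
    pos: int   # index in the document at the time the op is applied
    text: str


def compress(ops: List[Insert]) -> List[Insert]:
    """Fold an insert into the previous one when it continues that run."""
    out: List[Insert] = []
    for op in ops:
        last = out[-1] if out else None
        if last is not None and op.pos == last.pos + len(last.text):
            out[-1] = Insert(last.pos, last.text + op.text)
        else:
            out.append(Insert(op.pos, op.text))
    return out


def replay(ops: List[Insert]) -> str:
    """Apply the ops in log order to an empty document."""
    doc = ""
    for op in ops:
        doc = doc[:op.pos] + op.text + doc[op.pos:]
    return doc


# Typing "hello" one key at a time, then inserting "X" at the start.
ops = [Insert(i, ch) for i, ch in enumerate("hello")] + [Insert(0, "X")]
packed = compress(ops)
```

Six operations shrink to two, and replaying either log produces the same document.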
Have you guys taken a look at all of the various torrent-based[0] approaches that have been going around HN? Feels like if you combine the storage component of those with the real-time approach you've got here, it would feel like magic.