Braid: Synchronization for HTTP (braid.org)
233 points by walterbell | 88 comments



We're about to release a new braid-text library:

    https://github.com/braid-org/braid-text/
    https://www.npmjs.com/package/braid-text
This is the easiest way to add collaborative editing to a web app. You can add it to any (req, res) handler in your nodejs app. No websocket necessary, since it extends HTTP itself!
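
For example, hooking it into a bare node server can be as small as this (a minimal sketch -- I'm assuming a serve(req, res) entry point here; check the README for the exact API):

    const http = require('http')
    const braid_text = require('braid-text')

    http.createServer((req, res) => {
      // Hand the request off to braid-text, which speaks
      // Braid-HTTP: plain GETs, GETs with Subscribe, and
      // PUTs carrying collaborative-text patches.
      braid_text.serve(req, res)
    }).listen(8888)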

Plus, it features a new `simpleton` merge-type that requires zero history overhead on the client, and you can implement the client protocol from scratch in just 50 lines of code. There's no reason not to add collaborative editing to every user-editable string in your app now!

All the network traffic is just HTTP, with an open, easy-to-read protocol that supports any OT or CRDT (currently defaulting to josephg's Diamond-Types). You can even read and write directly using the Braid-Chrome extension, which adds a devtools panel to view the version history of any Braid-HTTP resource: https://github.com/braid-org/braid-chrome.

I'm really excited about this library. We were going to announce it tomorrow (at https://braid.org/meeting-86), but since this hacker news discussion is happening... I can't hold myself back from talking about it now!


> No websocket necessary, since it extends HTTP itself!

What does that mean? How do other clients get notified of remote changes?

edit: Braid includes its own SSE-like construct, completely incompatible with SSE, with a brand-new HTTP status code "209". Seems unnecessary... How does that work with existing servers/proxies/middleboxes?


It works just fine. Existing servers, proxies, and middleboxes just pass through the response if they don't understand status 209. The braid.org site itself is running on Braid-HTTP, through a reverse proxy, and everything is peachy keen.

SSE doesn't work because it (1) doesn't support binary data, (2) assumes that versioning is linear (which doesn't work in a distributed system), and (3) doesn't provide any way to add new headers to each response.

SSE also has an awkward encoding format, but that's neither here nor there.


I'm not saying you can reuse the entire SSE protocol and interfaces, but why not use 200 and Content-Type text/event-stream, like SSE does? Use mixed/subscription or mixed/braid if you really want, but this whole new thing, why?

I don't see what's too fundamentally different about Braid that it needs a new status code and protocol. Shouldn't SSE use 209 then?


It's possible that Braid could switch back to status code 200. I expect this choice to be revised in discussion within the IETF HTTP WG, but we haven't gotten to this level of detail yet. If I remember correctly, the switch to 209 in the current draft was to discourage middleboxes from caching braid responses, but it's possible that "Cache-Control: no-cache" does enough of this and that 209 is not necessary. I'll keep an eye out. Thanks for the thought.

As for text/event-stream -- braid responses are not text (they can contain binary updates to things like images) and they are not an "event" stream. Braid provides an "update" stream, as a stream of HTTP responses. Each response can specify an update using a status code, headers, and a body.

If we were to use SSE, we would be encoding an HTTP response, within a base64-encoding (to fit as text), within a sequence of `data: ` lines, in an "event", within an event stream, within a text/event-stream content-type, within an HTTP response. It's a lot simpler to just extend HTTP to say "instead of one response, a server can provide N responses" than to go through all this rigmarole to encode responses within an event stream within a response.
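
For concreteness, here's roughly what that "N responses" stream looks like on the wire (a simplified sketch with made-up version ids; see the draft spec for the exact framing):

    GET /chat HTTP/1.1
    Subscribe: true

    HTTP/1.1 209 Subscription
    Content-Type: application/json

    Version: "alice-1"
    Content-Length: 27

    [{"text": "Hi, everyone!"}]

    Version: "bob-1"
    Parents: "alice-1"
    Content-Length: 24

    [{"text": "Hi, alice!"}]
Each update in the stream is itself shaped like a miniature HTTP response: headers naming the version, then a body. No base64, no event framing.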

As for events vs. updates, an update may or may not describe the raw underlying events. Updates can also be summaries of many events. For instance, a normal HTTP response body provides a snapshot that summarizes all of the edits up to that point that created the resource. This is an update, but is not the raw sequence of events.


Ah, I've found the discussion on status code 209 here: https://github.com/braid-org/braid-spec/issues/16

Hope that helps.


Use mixed/braid then. My point is, reuse some of the technical decisions that were made when creating almost the exact same mechanism.

What is the point of standard decisions if they change every time?


Perhaps you mean multipart/braid? There is no mixed/* mime-type prefix that I know of.

How would your suggestion work? Can you give an example HTTP response to explain what it would look like?

The purpose of standards is interoperability. What software would this mixed/braid or multipart/braid mime-type help us be interoperable with?


> doesn't support binary data,

You could just as easily base64 encode it (or use your favorite encoding).

> assumes that versioning is linear (which doesn't work in a distributed system),

Whether you're sending SSE frames or subscription events, they arrive in the order the server sent them by virtue of being sent over TCP. Your versioning is an application-level concern, not a transport-level concern.

> doesn't provide any way to add new headers to each response.

It's trivial to add headers at the application layer (literally just encode your own "header" to the start of each event). The headers don't need to be a part of the protocol.
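
For instance (purely illustrative):

    event: update
    data: {"status": 200, "headers": {"version": "42"}, "body": "new contents"}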


Base64 encoding hurts performance, adding about 33% overhead. It's more efficient to send binary over HTTP directly.

> SSE frames or subscription events ... arrive in the order the server sent them by virtue of being sent over TCP.

We support situations more complex than that. Multiple clients can make simultaneous edits with multiple PUTs. They can arrive in different orders to a server that relays them to other clients. Each peer in the system needs to know which versions the edits were parented from, so they can reconstruct the version DAG and merge consistently.
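
For example (a sketch with made-up version ids), two clients both edit on top of version "g0":

    PUT /doc HTTP/1.1
    Version: "alice-1"
    Parents: "g0"
    ...

    PUT /doc HTTP/1.1
    Version: "bob-1"
    Parents: "g0"
    ...
Whichever order these reach a peer, the Parents: headers say that both edits branch from "g0", so every peer reconstructs the same DAG and merges identically.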

> The headers don't need to be a part of the protocol.

Yes, they do. Not only headers, but also status lines and bodies. We are streaming entire HTTP responses. An HTTP response consists of a status line, headers, and a body. We need all of these. For instance, if a resource is deleted, we need to send a new 404 status line in the update. If the version changes, we need to update the Version: header. These need to be a part of the protocol so that all peers can update themselves in the same way, otherwise you don't get consistency.


> Base64 encoding hurts performance, adding about 33% overhead. It's more efficient to send binary over HTTP directly.

Other encodings are more space efficient. But it's also the case that the purposes of Braid don't lend themselves to large binary blobs. Chances are the data still fits in a single packet.

> They can arrive in different orders to a server that relays them to other clients.

Again, whether you emit an update over SSE or an HTTP subscription doesn't change that. Both are literally just TCP connections. Whether you wrap the data as an SSE event or a piece of a 209, it'll arrive in the same way.

> Yes, they do. Not only headers, but also status lines and bodies. We are streaming entire HTTP responses. An HTTP response consists of a status line, headers, and a body. We need all of these.

You can package that data in some shape other than an HTTP response. You're making the choice to do it that way, but there's exactly nothing that's requiring you to do it.

The value of putting it into HTTP the protocol is letting the user agent see a snapshot of a remote resource at any given time. But that's not the interface that HTTP exposes. fetch() or curl gives you back a stream or a buffer. If I created a subscription, streams are right out. So you'd need to get back a full buffer of the resource every time it's updated. But then you don't have data about what changed.

So now you have to get the details about those changes, which means HTTP isn't adding any value. Which is to say, if my application needs to be aware of the details of the protocol, it's not a transport protocol anymore, it's an application protocol. If the abstraction of the protocol just gives you ~the data that's sent over the wire, it's not an abstraction anymore and you're really just piping lines of data from the wire to your application logic (which is exactly what SSE is).

The only benefit that I can really see is a user agent could facilitate caching better? But then you're dipping into the territory of H2 server push.


> But it's also the case that the purposes of Braid don't lend themselves to large binary blobs.

That's not true. We want to use Braid to distribute OS updates, sending patches to multi-GB binary blobs; and for filesystem synchronization (think Dropbox/SyncThing/Resilio) with large binary files.

Braiders have put an impressive amount of work into compressing data: https://josephg.com/blog/crdts-go-brrr/. Data compression is critical, because these mutation histories can grow very large. There would be very strong resistance to a 33% overhead without a good reason.

> whether you emit an update over SSE or a HTTP subscription doesn't change that. Both are literally just TCP connections.

You're failing to consider a P2P mesh network. TCP only connects two computers. Braid enables a P2P mesh network where any peer can come and go, work offline, or switch connections, and nonetheless the whole network must synchronize with strong eventual consistency. Think distributed. Watch the distributed network partition, repartition, and reconfigure in https://braid.org/antimatter#viz. This won't make sense until you flip your mindset to distributed.

> You can package that data not as in the shape of a HTTP response.

Then we wouldn't be using HTTP. We wouldn't be able to re-use HTTP parsers and generators. We would lose a huge amount of interoperability with existing code.

It's confusing that you're arguing against using HTTP in order to use SSE -- while insinuating that using SSE means using HTTP. SSE is built on, but is not itself, HTTP. It wasn't even standardized in the IETF HTTPWG. It's a way to tunnel a specific, limited use-case of centralized, text-only, server-to-client-only event streaming over an HTTP response, without bothering to specify anything about how those events might change actual HTTP resource state. It's time to do better, and to bake that into HTTP more generally. We are making something way more powerful than SSE.

As for the rest of your comment, I am trying, but cannot understand what you are trying to say. Perhaps you could rephrase it.


> doesn't support binary data

But HTTP is a text based protocol. Why do you need binary?

> assumes that versioning is linear

What does linear mean? Afaict SSE ids are opaque strings.

> doesn't provide any way to add new headers to each response

XY problem? Do you really need HTTP headers or do you need a way to pass X from server to client?

I’m not familiar enough with the specific problem, but it’s always good to reuse as much as possible of existing infrastructure.


> But HTTP is a text based protocol.

Why do you say that? It supports binary content just fine. If that were true images would be horribly inefficient to send, like they are with smtp (which is an actual text based protocol).


It’s text by default, but you’re right. I would say it’s idiomatic to use text until needed, at least. But if the CRDT payloads allow binary it makes sense to avoid SSE.


Yes, like for images. You can't send image updates over SSE.


Braid and Statebus, https://stateb.us/what

> Every piece of state has a state:// URL. You can re-use another site's state as easily as linking to a page with today's web. Websites can build on top of one another, and collaboratively outcompete today's centralized monopolies.

If multiple businesses are cooperating to process state owned by multiple websites, how does Statebus envision the encapsulation of state for {user, device, app, feature, business}? Could some state be opaque/encrypted?

What's the best place to learn more about the use cases below, https://stateb.us/why

  Built-in offline mode
  Collaborative editing by default
  Website internal state is opened
    Make a new user interface
  Implement a blockchain on top of the web
  Enables better Email protocol: decentralized, simpler, realtime, spam-free, encrypted
  Improves source control:
    Source code becomes state—replaces git with the web protocol itself
    Any user can edit source for any website, and have their own version
    Editable, branchable, forkable inline


That is wild! I'd also like to learn more about this.

The language is a bit, idk, superlative, but if it does what it says then it would be great.


Yes, when I wrote that stateb.us/what page, my intent was to express the aspirational vision of statebus. It soon became clear that, in order for the vision to succeed, we needed the network protocol to become a widely-used standard. Thus began my Braid work.

I started approaching the IETF to learn how to standardize State Synchronization in HTTP. I ran across this HN comment by JosephG [1] making a call to the same mission. I sent him an email, we decided to team up, and we went to dwebcamp in California and then IETF in Montreal together, and made a presentation to the HTTPWG proposing to extend HTTP into a state sync protocol. We got an enthusiastic reception, and I have been working on it ever since.

The HTTP extension is getting into pretty good shape, and we're now manifesting the promises on that /what page. They aren't all production-ready (e.g. no blockchains built on statebus/braid yet), but they are coming true at a steady clip!

[1] https://news.ycombinator.com/item?id=19816648


Neat!

Any plans to release a Firefox extension, too? It shouldn't be too hard, IIRC the API is largely compatible.


That is cool! Nobody has worked on a Firefox devtool yet, but we welcome contributions.


They seem to have made a choice that URLs do not include version numbers. Instead, the version is sent as a separate header. That would make it hard to link to a specific version. I suppose there's nothing stopping anyone from also having URLs for specific versions, but it's not standardized.

More generally, I'm unconvinced that synchronization should be so closely tied to HTTP, or that the metadata should be independent of the actual datatypes being synchronized. Any given application is going to need to implement not only a synchronization algorithm, but the datatypes being synced.

Contrast with git, where the data is standardized (file, directories, commits) and it supports many ways of communicating about the data.

(Though, at another level, git doesn't care what's in a file, and that's also true of HTTP.)


Yes, we will want to extend URLs to standardize an optional version identifier.

It's important for this to be optional. Consider that each keystroke that edits a text resource changes its version ID. You want to be able to link to the text without changing the ID with each keystroke.

It's also important to be able to pass versions within headers of a request or response. The server wants to update the version ID with each update it sends to the client, and that's most elegantly done in a header.
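
So, roughly (with a made-up version id), asking for a particular version can look like this in the current draft, assuming the server still retains that version's history:

    GET /path/to/resource HTTP/1.1
    Version: "dkn2v9j24"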

As for your skepticism about integrating synchronization into HTTP instead of embedding it within a datatype, I can empathize -- but you might be surprised at just how elegantly HTTP extends into a full-featured synchronization protocol. A key to this elegance is the Merge-Type: this is the abstraction that allows a single synchronization algorithm to merge across multiple data types.

As an application programmer, you will specify both the data types of your variables (e.g. int, string, bool) and also the merge-types (e.g. "this merges as a bank account balance, or a LWW unique ID, or a collaborative text field"). This is all the application programmer needs to specify. The rest of the synchronization algorithm gets automated by middleware libraries that the programmer can just use and rely upon, like his compiler and web browser.
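
On the wire, the merge-type is just another header for peers to agree on (a sketch; merge-type names vary by implementation -- simpleton is the one braid-text ships):

    GET /essay HTTP/1.1
    Subscribe: true
    Merge-Type: simpleton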

I'd encourage you to check out the Braid spec, and notice how much we can do with how little. This is because HTTP already has almost everything we need. Compare this with the WebDAV spec, for instance, which tries to define versioning on top of HTTP, and you'll see how monstrous the result becomes. Example here: https://news.ycombinator.com/item?id=40481003


Thanks. Yes, that's a much more fine-grained use of versioning than we use in git repositories. I suppose this sort of synchronization is more like a streaming protocol than a version-control system. (A stream of updates.)


I'm not keen that this extends HTTP instead of building on existing standards. Subscriptions, for instance, could simply be server sent events. Baking braid into the protocol layer itself means it's harder to support it with existing libraries and code.

I'm also a little unsure of why "partial PUTs" are needed. "PUT an update as a patch" seems like exactly the use case for PATCH. Not that it super matters either way as far as compatibility is concerned, but it feels like it's contorting PUT to do what PATCH already accomplishes


See this comment on SSE: https://news.ycombinator.com/item?id=40482389

PATCH is supported in Braid, but only works for client->server. It does not support updates from server->client. So we need a way to express general updates, in both directions. Braid lets you use existing PATCH specs (as Patch-Type) if you want.

The PATCH specification also confounds multiple concerns (versioning, conflicts, validation, data formatting, applicable data types) into a single spec. It's nicer for these aspects to be modular and independent. Partial PUT happens to be already a modular, independent way to specify just the patch format, and can be extended to new data types by "Range Units." But use PATCH if you prefer!
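
For instance, a partial PUT that overwrites one field of a JSON resource might look like this (a sketch of the idea; the "json" range unit syntax is still being worked out in the proposals):

    PUT /profile HTTP/1.1
    Content-Range: json .name
    Content-Length: 7

    "Alice"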


I don't feel especially strongly about the PUT vs PATCH distinction. But I left a comment on the thread you linked: I'm just not convinced there's a need for 209. Braid is trying to be both application logic and a transport protocol, but it doesn't have to be. SSE isn't perfect but all of the concerns around it could be addressed with relatively simple solutions.


I'm not convinced on 209, either. There might be a better response code. See the discussion at https://github.com/braid-org/braid-spec/issues/16

Braid is not trying to be "application logic" or "transport protocol" -- it is a State Synchronization Protocol. If you can think of a more elegant method to support all of state synchronization within SSE, we're all ears!

But AFAICS you end up having to embed base64-encoded HTTP responses within SSE events within HTTP responses, which is a gross pile of Russian dolls, where both the innermost and outermost dolls are the same type of doll -- an HTTP response! So why all the intermediate dolls?

It's more elegant and simpler to extend HTTP at the root level, and say: "Instead of returning 1 HTTP response, a Braid server can return N HTTP responses, concatenated together with optional newline separators."


A handful of iOS apps (2Do, Omnifocus, DevonThink, PhotoSync, GoodReader) support user-hosted WebDAV/CalDAV storage for state sync between devices. If Braid can lower the cost for apps to synchronize state across devices without "the cloud", that would be a win for decentralized infrastructure.


I can't help but think that this feels like WebDAV on steroids, and WebDAV never really caught on.

I kind of think sometimes it makes more sense to layer on top of HTTP instead of extending it.


It kind of is layered on top, no?

WebDAV didn't really catch on, but the general product space of remote drives did. The problem with WebDAV is that like most Web* tech (and maybe Braid) it's design by committee in the abstract. To make remote drives work well for the end user requires a fairly complicated protocol with tons of ugly edge cases. The interoperable standards from committees approach tends to fail in those situations, whereas hard driving startups that use proprietary protocols they can iterate quickly tend to win.

Iteration speed > openness, when there's nothing to copy from. Standardization tends to be about bigger companies trying to commodify their competitors.


Many proprietary protocols iterate themselves out of existence; just because some of them survive does not mean it's the better model. Look at the video codec space: proprietary codecs have long since lost and gone the way of Real Media, Windows Media Video, and many others, while H.26x, the result of design by committee, is used everywhere.


It might be a different scenario with (modern) codecs where to get good use out of them it helps a lot to ship dedicated hardware. That's not true of file sync protocols.


> Standardization tends to be about bigger companies trying to commodify their competitors.

Didn't you mean to commoditize their complements? Like when Microsoft (a software company) commoditized the hardware their OS ran on top of.


No, I meant commodify their competitors. Big company has moved up the value chain and doesn't want others following them, so they standardize the tech at the level below. Now competitors are all implementing the standard and thus can't get an edge over each other, so they end up driving margins to zero via price competition and have nothing left over for innovation, whilst simultaneously reducing the bigger firm's cost by turning the lower levels they still rely on into a commodity.


WebDAV was really complex, and required (ugh) locking for concurrent edits, which really sucks.

Let's compare WebDAV with Braid for something simple like looking up the current version of the resource.

In WebDAV, you have to:

- Send a PROPFIND request to the URL of the resource you want to check. In the request body, specify the DAV:version-controlled-binding property you want to retrieve, using a custom XML syntax. Here's an example:

    PROPFIND /path/to/resource HTTP/1.1
    Host: example.com
    Depth: 0
    Content-Type: application/xml
    
    <?xml version="1.0" encoding="utf-8"?>
    <propfind xmlns="DAV:">
      <prop>
        <version-controlled-binding />
      </prop>
    </propfind>
- The server responds with a 207 Multi-Status response, with XML to parse:

    HTTP/1.1 207 Multi-Status
    Content-Type: application/xml
    
    <?xml version="1.0" encoding="utf-8"?>
    <multistatus xmlns="DAV:">
      <response>
        <href>/path/to/resource</href>
        <propstat>
          <prop>
            <version-controlled-binding>
              <href>/path/to/version/history</href>
            </version-controlled-binding>
          </prop>
          <status>HTTP/1.1 200 OK</status>
        </propstat>
      </response>
    </multistatus>
This does not yet give you the version -- it tells you a separate URL that contains the version history. So now you have to query that for the version:

- Send a REPORT request to that url, and in the request body, specify the DAV:version-tree report you want to retrieve:

    REPORT /path/to/version/history HTTP/1.1
    Host: example.com
    Content-Type: application/xml
    
    <?xml version="1.0" encoding="utf-8"?>
    <version-tree xmlns="DAV:">
      <prop>
        <version-name />
        <creator-displayname />
        <creation-date />
        <comment />
      </prop>
    </version-tree>
- The server responds with a 207 Multi-Status, with more custom XML for you to parse:

    HTTP/1.1 207 Multi-Status
    Content-Type: application/xml
    
    <?xml version="1.0" encoding="utf-8"?>
    <multistatus xmlns="DAV:">
      <response>
        <href>/path/to/version/1</href>
        <propstat>
          <prop>
            <version-name>1.0</version-name>
            <creator-displayname>John Doe</creator-displayname>
            <creation-date>2023-05-26T10:00:00Z</creation-date>
            <comment>Initial version</comment>
          </prop>
          <status>HTTP/1.1 200 OK</status>
        </propstat>
      </response>
      <response>
        <href>/path/to/version/2</href>
        <propstat>
          <prop>
            <version-name>2.0</version-name>
            <creator-displayname>Jane Smith</creator-displayname>
            <creation-date>2023-05-27T14:30:00Z</creation-date>
            <comment>Updated version</comment>
          </prop>
          <status>HTTP/1.1 200 OK</status>
        </propstat>
      </response>
    </multistatus>
You can find the versions in there, and if you then parse and filter to the most recent version, you can see that the current version is 2.0.

That was a LOT of work!

Now let's see it in Braid.

To get the current version, just do a normal GET or a HEAD, and look at the Version: header in the response:

Request:

    GET /path/to/resource HTTP/1.1
Response:

    HTTP/1.1 200 OK
    Version: "2.0"

    <body of resource>
That's it! It's that simple!

Braid is this simple because it is baked into HTTP. It turns out that HTTP itself is already very close to a State Synchronization protocol -- it just needs a few headers, and the ability to stream updated responses when a resource changes. I encourage you to check out the spec, and see how easy it is to do things that were complicated or impossible in WebDAV:

https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-b...


https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-b...

Looks really stupid when transfer-encoding chunked already exists for exactly this use case.


Transfer-Encoding: chunked is for something completely different. It is for streaming a single body that you do not know the length of. It does not define a subscription format. It does not specify versioning. It does not define a patch format.

Some people have suggested using chunked boundaries as a way to encode multiple responses to a subscription. Perhaps this is what you are thinking of. However, chunk boundaries cannot be relied upon to remain intact; they can be changed by intermediaries. The chunks are only an encoding of the transfer. They are not supposed to contain any semantic information.

Chunked transfer is also disallowed in http/2.


I don't get it. What would I use this for? (Honest question. I'm probably not smart enough to get it.)


I didn't get this either until I looked at the linked doc, which breaks it down more obviously imo (section 6.1 for examples).

Looks interesting to me, seems nice to be able to handle refreshing content with pure http as described here rather than with web sockets.

https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-b...


From what I understood, it's a protocol for exposing a tree of files for synchronization and concurrent editing, similar to WebDAV.


same reason you'd use HTTP, it's a superset of HTTP


I still don't get it. Is this to sync two servers between each other?


From the article:

> Together, these features enable a web resource to synchronize automatically across multiple clients, servers and proxies, and support arbitrary simultaneous edits by multiple writers, under arbitrary network delays and partitions, while guaranteeing consistency using an OT, CRDT, or other algorithm.

So, you'd use it if you want to "support arbitrary simultaneous edits by multiple writers, under arbitrary network delays and partitions, while guaranteeing consistency", for example for apps that want to support collaborative editing.

Braid wants to turn state transfer (like in RESTful API designs - Representational State Transfer) into a kind of synchronization transfer for state. Currently, state synchronization is handled on the state management layer of each individual application, and API calls just transfer that state, but Braid wants to have it solved as an extension to HTTP libraries on the API layer of an application AFAICS.


Yes, or even more simply:

Let's say that your client runs GET /some-json, and that JSON gets updated, and the client wants to get the updates. Right now, your options are:

1. Re-run the GET /some-json again from the client (polling)

2. Start a websocket, and invent some custom protocol over the websocket to let the client subscribe to /some-json.

With Braid, you just do:

    GET /some-json
    Subscribe: true
And the client will automatically get the updates, within HTTP. The Braid-HTTP ponyfill library abstracts the details behind a normal fetch() for you, until browsers implement it.
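
Client-side, that can look something like this (a sketch -- I'm assuming the subscribe option and the callback shape here; see the braid-http README for the real API):

    const { fetch } = require('braid-http')

    fetch('/some-json', { subscribe: true }).then(res =>
      // Fires once per update streamed down the still-open response:
      res.subscribe(update => console.log('got update:', update))
    )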

Today, we tend to use HTTP for static assets, and then switch to a websocket or SSE or something whenever things get dynamic. That sucks! We lose the standard of HTTP!

Braid lets you keep using HTTP, even for your dynamic assets. It also solves issues in caching. Instead of using max-age heuristics, we have actual real versioning!

And yes— the protocol generalizes all the way to full-peer-to-peer multi-writer CRDT/OT algorithms. But start simple!


Frankly, your explanation isn't any simpler, quite the contrary.

> That sucks! We lose the standard of HTTP!

Why does this suck? WebSockets or SSEs are also standardized.

> the protocol generalizes all the way to full-peer-to-peer multi-writer

Why "peer-to-peer"? While Braid can be used in a local-first setting and probably do peer-to-peer, that isn't the focus here.

The focus is to help application developers to not get into the shenanigans of CRDTs and distributed algorithms etc., but solve those things already via polyfills.


> Why "peer-to-peer"? While Braid can be used in a local-first setting and probably do peer-to-peer, that isn't the focus here.

> The focus is to help application developers to not get into the shenanigans of CRDTs and distributed algorithms etc., but solve those things already via polyfills.

Yes, this ^^ is the initial focus, and is enough to provide value now.

However, ultimately we will generalize HTTP into a fully peer-to-peer system. Each extension we add provides a new dimension of p2p, and at some point we won't need servers at all. The big blocker to that, right now, is TLS+DNS, which are baked into the definition of https:// URIs. More on this at https://braid.org/meeting-2 in the "HTTP2P" talk.

Updates (as requested by d-z-m):

- Link to talk in Meeting 2: https://braid.org/video/https://invisiblecollege.s3.us-west-...

- Slides: https://braid.org/files/http2p2p.pdf

- The specific section on TLS+DNS: https://braid.org/video/https://invisiblecollege.s3.us-west-...


In the linked video, is there a timestamp for when the TLS+DNS blocker is discussed?


WebSockets and SSE are standards in the same way that TCP is a standard -- if you use them, you are still defining an ad-hoc protocol on top to subscribe to state and publish new state.

HTTP is a standard on top of TCP. It provides a higher-level abstraction than a socket -- the abstraction of State Transfer. When you use a WebSocket, you're back to a low-level socket, and have to redefine "state", and the methods to get it, put it, and subscribe to it.

Since each web programmer defines those methods in a different way, his state gets hidden behind his own non-standard protocol. There is no way for website A to re-use the state on website B, for instance, without learning website B's custom WebSocket protocol and re-implementing it perfectly.

CDNs and other caches cannot handle WebSocket traffic. But if you use a standard like Braid-HTTP, they can cache your dynamic assets along with your static assets.


Thanks for getting back to me, didn't notice you're the M. Toomim from the article.

I always wanted to tackle CRDTs etc. for state synchronization, but didn't get yet so far. So without much experience in that space, let me ask some really stupid questions...

> HTTP is a standard on top of TCP. It provides a higher-level abstraction than a socket -- the abstraction of State Transfer. When you use a WebSocket, you're back to a low-level socket, and have to redefine "state", and the methods to get it, put it, and subscribe to it.

For an application developer, HTTP and WebSocket are both just application traffic protocols. I've seen people misuse the extensibility of HTTP more often than anything WebSocket APIs have to offer. No wonder: state and methods (open, close, send) are much more refined in the WebSocket API standard compared to HTTP - the expectations are low and the responsibility high, and libraries handle the basics, don't you think? Why would I go back to the complexity of HTTP again and think about headers and the idempotency of my methods when all I want is to pass payloads to topics in a bidirectional manner? ...I can imagine that CDNs to replay state come into play here, but would need some more inspiration.

> Since each web programmer defines those methods in a different way, their state gets hidden behind their own non-standard protocol. There is no way for website A to re-use the state on website B, for instance, without learning website B's custom WebSocket protocol and re-implementing it perfectly.

What is the use case for an application here? Where there's OpenAPI to document and share REST API implementations, there's AsyncAPI for WebSocket implementations, and of course there's GraphQL and client libraries otherwise... isn't this better approachable with a CollabAPI specification (I just made that up)?

The polyfill design suggests that browsers and standard libraries should implement Braid. What incentive do they have that can't be done with a library?


Yes, the big secret here is that CRDTs + Braid will enable a higher level of abstraction for programmers: an Abstraction of Distributed State.

As a programmer, you'll no longer have to write any networking code. You won't be touching HTTP headers and methods. That will all be handled by libraries for you, which will let you read and write any state, anywhere on the network, as if it's a local variable on your own computer—already downloaded, and always up-to-date.

The CRDT handles network delays, and race conditions, under multiple writers, so that everything merges automatically, and you don't have to think about them.

The Braid Protocol ensures that all servers, everywhere on the network, speak "state sync" in a standard way, even if they have different implementations behind the scenes, even with different algorithms.

This means that our libraries will be able to detect which algorithms need to be employed to synchronize with any service you are connecting to, and implement all that for you. You, as the application programmer, will just read and write variables like:

    state['https://foo.com/news-feed'].push({
       author: 'me',
       post: 'hey guys!!!',
       inreplyto: state['https://bar.net/post/2423h'].id
    })
The reason why HTTP gets abused today is that it doesn't provide full synchronization support, which means that programmers have to abuse it in order to actually write web apps, which becomes a pain in the ass, and then everything feels a lot easier when you drop down into a WebSocket and get rid of the cruft. But now you're writing a custom WebSocket protocol, that only you will fully understand...

...unless you document it, and try to follow a standard like AsyncAPI, which gets you part of the way there...

...but what you really need is a State Synchronization protocol, and libraries that abstract away all the networking for you, and guarantee interoperability, creating a system of shared state. We're getting closer to this glorious world. The magical statebus in the sky that connects all our state together. When the abstraction is complete, it's going to blow everything else away.

We don't need Web Browsers to implement this natively -- that'll just be a performance improvement, like when JSON.stringify() and parse() became native in browsers. The important part is defining a standard that allows us to invest in robust State Sync algorithms and libraries, and lets developers invest in the applications that share this beautifully interoperable state, and make this new abstraction succeed across the globe.


You could sync two servers. Or you could sync a server with clients.

Braid is transitioning HTTP into a peer-to-peer world, where the distinction between clients and servers doesn't matter.


Could you call it websockets without websockets?


In some sense, you could!

The reason is that almost every use of WebSockets is actually for synchronizing state. What you really want to do is to synchronize state. Braid is a protocol for doing just that. So you don't need to turn to WebSockets anymore!


I wonder if Braid could be a transport layer for Phoenix LiveView. It uses WebSockets but falls back to long-polling if WebSockets aren't available.


Yes! Phoenix LiveView is very similar to the Statebus project (https://stateb.us) that inspired Braid. Braid was the protocol that Statebus needed, and Phoenix LiveView is one of the most exciting projects in the Statebus space that I know!

I think we are working from a common inspiration!


Yes, and to elaborate — for any HTTP usage with dynamic state.

HTTP was invented for static pages that were written by hand and rarely changed. But today's web has dynamic pages, driven by javascript and databases, and users expect them to update in realtime. Braid-HTTP adds support for dynamic state to HTTP.


For an HTTP company they sure do hate hypertext. Their entire write-up is just a blank white page unless one successfully executes the javascript from 5 separate domains.


Point taken.

But to be fair, we're not a company, but a working group. And we don't hate hypertext, we just haven't gotten around to implementing server-side rendering, which would be a nice thing to do.


Glad to hear it.

What on that page of just text and images made you feel an entire application was appropriate instead of just a hypertext document? I hope this application-centric approach isn't also being applied to the HTTP extension. Hypertext documents should at least get an equal share of the consideration for an HTTP protocol.

I get that for commerce HTTP is just a transport to deliver the javascript/json/etc application. That's the way things are. But HTTP in general has to handle more use cases than just commerce.

I worry that adding this dynamic checking of state to HTTP itself will make websites that adopt it inaccessible and unusable for almost every browser that currently exists. Only new software will be able to use them. It's a substantial break. Maybe don't call it HTTP.


This allows subscriptions of changes to individual resources... but is more general event streaming a goal? Could it become so? (as in, "give me updates to all resources", not just one).

We have been using our own protocol "FeedAPI" for broker-less, multi-service event sourcing over HTTP. But it is a bit home-grown; it would be great if event sourcing over HTTP were a more widespread thing with a common standard. There are so many scenarios where event streaming is a great model, but deploying Kafka is overkill, or would couple the infra of publisher and consumer together too tightly (e.g. when they belong to different organizations).

https://github.com/vippsas/feedapi-spec


Cool! We have been building feeds on braid, too! It would be very interesting to support FeedAPI with Braid -- you would get much faster pushed updates than you currently do with client-pulled updates!

See our current work on the "feed" spec in Braidmail: https://braid.org/apps/braidmail/spec2.

To support event-sourcing, you could replace those {link: ...} objects with event objects, and just treat each event as data. Then you'd have a url like /event-stream that peers would subscribe to.
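
A sketch of that (all names and ids here made up): each new event gets appended to a growing JSON array with a range patch, and subscribers receive each append as an update:

    PUT /event-stream HTTP/1.1
    Version: "producer-42"
    Parents: "producer-41"
    Content-Range: json .events[41:41]
    Content-Length: 42

    [{"type": "order-created", "order": 1234}]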

Or, you could go deeper, and come up with a custom `Patch-Type:` for your events, which could make your events first-class HTTP citizens that could also be applied to a server using the PATCH method. Then your url might be /system-state, and represent the state for the whole system.

Braid supports such custom events, but be aware that this model is limited in a few ways:

- It's less interoperable, because all peers must know and implement the custom event types (aka patch-type) in order to read or write state.

- You don't get to re-use the beautiful and complicated merge algorithms that support eventual consistency with multiple writers (the OT/CRDT algorithms).

- It doesn't allow summarization of history. It's hard to prune history when each peer needs to know all events in order to do anything.

You might also be interested in PREP, which is a pure event-update protocol being worked on by my colleague Rahul Gupta: https://github.com/CxRes/prep We're trying to merge our work together, if we can.

You might also be interested in our deliberation on the pros/cons of "events" vs. "state updates" here: https://github.com/braid-org/braid-spec/issues/102#issuecomm...


Related. Others?

Braid: Synchronization for HTTP - https://news.ycombinator.com/item?id=26385480 - March 2021 (1 comment)

Braid: Synchronization for HTTP - https://news.ycombinator.com/item?id=21626261 - Nov 2019 (47 comments)


HTTP is a request-response protocol. It doesn't deal with state transfer nor does it define it. I find it weird that the first thing they mention is this:

> Braid-HTTP is an extension to HTTP that generalizes it from a state transfer to a state synchronization protocol.

They could simply say it's a synchronization protocol over HTTP.


> HTTP ... doesn't deal with state transfer nor does it define it.

The HTTP acronym stands for HyperText Transfer Protocol. ReST stands for Representational State Transfer.

HTTP began as an HTML Transfer Protocol, but then generalized to content-types beyond HTML, such as images, scripts -- and general state. So today HTTP is known as a State Transfer Protocol. This architecture was canonicalized as ReST in Roy Fielding's dissertation.

In order to generalize HTTP from State Transfer to State Synchronization, we have to augment some parts of it, such as request/response. Instead of just getting a single response to a request, Braid allows a client to subscribe to a resource with a single request, which will then be given multiple responses -- one response for each update to the resource.

It is quite elegant to extend HTTP at this level! And we can do so without requiring any changes to web browsers.


Is it only for realtime?

Can it prune history?

What if I write a todo application in it, and one of the clients will connect only once a month? What about a client that will drop and never return? (in some systems it means past updates will pile up indefinitely, and state on other clients will keep growing)


You can use any CRDT with Braid. Many of them prune history. In fact, we have developed the first pruning p2p text sync algorithm: braid.org/antimatter

Indeed, text sync requires old history to merge with old edits. However, Braid does not force you to hold everything. Peers can implement their own policies for deciding who to sync with. They can also ask each other for modules of old history they've forgotten if they realize they need it to merge something. This is made possible with the new Time Machine architecture that we are writing up.

This architecture also allows you to sync at multiple time resolutions -- realtime and fine-grained, or slower and coarse-grained -- within the same system. Different peers can hold and share at different resolutions. And they can all guarantee consistency with each other in the end. See the recent simpleton algorithm work for an example.


I wonder how well that scales and how it handles network partitions or erratic peers.


You can use CRDT algorithms with Braid that heal from network partitions transparently, and guarantee strong eventual consistency. Here's an example CRDT algorithm for collaborative text that can handle arbitrary network partitions, and guarantee full eventual consistency, and prune all unnecessary history:

https://braid.org/antimatter

This algorithm is the first to do so, and happened to be developed in the Braid group.

As for scaling, check out JosephG's https://josephg.com/blog/crdts-go-brrr/ CRDT, which scales great, and is thoroughly tested. The Braid protocol is also architected with a new type of OT/CRDT architecture called a Time Machine that lets you do advanced things like apply backpressure through a network to decrease the frequency of updates, which we presented in https://braid.org/meeting-81, and are releasing in the https://github.com/braid-org/braid-text library that you can try right now.


Can someone ELI5? I can synchronize clients around some state using http. What is an http extension? Does it add new http methods? If so, will anyone else be able to respond to these methods? Isn't http stateless? What do you mean, versioning? And etcetera!

I'm sorry to criticize but who is this article written for? It's either so dense that you already need to be an expert to unpack all the information, or the information is not there.

I would recommend starting from the assumption that the Hacker News layman knows the http protocol. Make sure what you show us is possible to follow from some reasonable starting point.


HTTP (HyperText Transfer Protocol) is a State Transfer protocol. ReST (Representational State Transfer) is a State Transfer architecture. It was built for transferring a page from server to client. But if the page changed afterward, the protocol threw up its hands, and left it to the user to click "reload."

Braid proposes new HTTP headers that give HTTP new features, so that instead of just being a State Transfer protocol, it gains the functionality to become a full State Synchronization protocol.

To "extend" a protocol means to add new features to it, e.g. with new headers, which doesn't break existing uses of the protocol, but allows new implementations to opt into new extended functionality.

HTTP was originally stateless, but then added cookies. Over time, web apps evolved to store tons of state in databases on the server. Then they started storing tons of state in javascript variables and DOM on the client. Today, most of the code we write in a web app is there to synchronize the changing state on a server with the changing DOM on the client. The web is no longer static pages. It is dynamic state. But since we haven't extended HTTP to support dynamic state, programmers have to write all this synchronization code by hand. This has been a pain in the ass, and so we've evolved monstrous stacks of Javascript frameworks (react, redux, etc. etc.) to try to help manage this state synchronization from server to client. If we build this functionality directly into HTTP, tons of code goes away, and the web itself gains great new features. Then, since we will have a standard for state synchronization, our state can interoperate, and walled siloes of websites break down. We can have a P2P web.

When this work is complete, the browser will no longer need a reload button. The browser/server will guarantee that every page is always up-to-date, and every peer that synchronizes with that state will have an equal copy.


> the browser will no longer need a reload button

I think this is skipping over some tricky UI issues regarding updates. For a text-heavy page, maybe you don't actually want a web page to jump around while you're reading it? Maybe there should be a live indicator to indicate that there are new updates, along with a reload button that lets you decide when you want to see them?

Browsers themselves do similar things - they let you know there's a new browser version, but you don't have to click "update" right away. You can finish what you're doing.

Contrast with video games where things are expected to move around. The visual design is different. There are animations so that you can track objects moving around.

Another example: recently I had been playing around with automatic syncing of a web form so that the user wouldn't be editing stale data, but I realized it was confusing so I took it out. For a form, a good time to inform the user that there is an edit conflict is when they press the "submit" button. People expect to see form errors at that point. But live edit indicators and a reload button might be good too?

Or what if you're reading a web forum, switch to another tab, then come back again. Should the page automatically refresh when it becomes active? Maybe, maybe not. It could be confusing. Maybe the new messages that appeared while you were away should be indicated somehow?

For these reasons, I'm skeptical of automatic browser syncing. I think the web app designer needs control over how to apply live updates.

Much like mobile apps were not just desktop apps scaled down, live web pages require different UI conventions. Maybe new HTML widgets?


submitted by walterbell

Nearly all criticisms responded to by a (the?) Toomim brother (who styles himself an independent psychology researcher, interesting), plus an appearance by mike_hearn.

Why is this proposal to significantly extend the scope and basic functionality of http so popular with bitcoin promoters?


> submitted by walterbell ... bitcoin promoters

Thanks to your left field comment, I've learned from Algolia that my HN submissions and comments include the terms "bitcoin" ~0.3% and "cash" ~1.3%.

> nearly all criticisms responded to by a (the?) toomim brother

Co-author of the technical proposal? It's usually good to see the author of a technical article providing substantive responses to HN comments. It's also good to have more feedback on the substance of the article. Any comments on the technical proposal?

> Why is this proposal to significantly extend the scope and basic functionality of http so popular with bitcoin promoters?

This draft proposal could benefit from contributors from more than one organization (Invisible College), but at least there appear to be multiple implementations of the proposed protocol, as is typical of IETF drafts.


Not a front-end developer, so this is not something I'd be directly interacting with, although it sounds interesting.

But I was struck by the preponderance of questions from other HNers in this comment section, all of which seem to fall under the rubric, "what's the point of this?" It was only then that I noticed the connection between the few people making most of the answers to those questions, as well as the submitter.

So I ask again, why are Bitcoiners so clustered around this project?


> noticed the connection between the few people making most of the answers to those questions, as well as the submitter.

As the submitter, what connection are you referencing? Your comment thread is the first I've answered on this article. mike_hearn has one (1) answer. That leaves responses from the author of the article, i.e. one person. Where are the "few people" to whom you are alluding?

> questions .. seem to fall under the rubric, "what's the point of this?"

Given the effort involved in any IETF draft proposal with at least two competing implementations of the protocol, this is a good question. Since learning about this project 10 hours ago, when the story was submitted, I'm still trying to understand use cases beyond "less complex" than alternatives. Based on the HN responses, it's not a well known project, so it would benefit from wider evaluation and feedback.

Work on substantive protocols for P2P decentralized publishing is always welcome, especially in 2024 when LLMs and "AI" are consuming available oxygen. If there is some (unstated?) connection between the proposed protocol and digital payments/assets, then protocol security and access control deserve close attention.


The few people are the same as listed in my original reply. There may be others, but these are names I recognize as proponents to varying degrees. Yes of the three you are comparatively a very minor influence, just as they are likely even less relevant to the promotion of D.

If you just learned of this project 10 hours ago, found it compelling enough to merit a submission, and are also pro-bitcoin, then this suggests there is an answer to "what's the point" in the set of motivations that attract the sort of people who think crypto-currency is a worthwhile endeavor.

So a first order impression is that braid would help foster decentralization, or at least counter latent centralizing tendency(ies) baked into http in its current form.


> and are also pro-bitcoin

For the third time, what's the basis of the claim that walterbell/submitter is pro-bitcoin? You're making a claim about a "group of three" where there's no data to include walterbell in the group, so that leaves an alleged "group" of two, i.e. a pair. Of the alleged pair, one person has posted once in this thread, the other posted ~20X. So you're referencing one (1) person, author of the article, whose affiliation is stated at the top of the IETF draft, no questioning needed.

To salvage some value from this sub-thread, I found a crypto reference in this 2019 Braid post by toomim, https://news.ycombinator.com/item?id=21642051

> if you want to build a peer-to-peer network, then you will replace the server with a validation function running on each peer, and authentication with a crypto scheme. But we aren't at the point of trying to standardize that stuff yet.

And a follow-up question by sagichmal:

  How do you break a GET request of some state blob into "granular patches" if the state is encrypted?
Question on encrypted state and possible "crypto schemes" for authentication in P2P networks built on Braid-HTTP, https://news.ycombinator.com/item?id=40484475


both bitcoin and braid are popular with some distributed systems nerds, not sure why the overlap strikes you as nefarious


Interesting idea, I would certainly have use for it. But the current server-side library support is limited to Lua and Haskell - quite the sign that the project is squarely in the hands of academics and has never seen real-world deployment on a heavily used system. Even if my stack were based on a supported language, I would take pause at that.


There are a couple nodejs implementations as well. I would know - I wrote one of them.


That's great to know. The fine article mentions only Haskell and Lua unfortunately - I wonder if it is out of date.


This looks somewhat similar to websockets and SignalR. What am I missing here?

Is it like GRPC as a protocol, built on top of HTTP/2?

I’d be interested in working on a port for .NET.


It’s like how http and tcp are different. Http adds a semantic layer on top of tcp which describes fetching documents and updating them. Of course you could make your own nonstandard http on top of tcp, but having a consistent protocol that everyone uses means we can have content-agnostic tooling and middleware for caching and fan out. When you use http, there’s a lot of existing tools that will just work and be compatible with your software.

That’s what we want for braid. Braid is an extension of http which adds semantics for documents and data that change over time.

(I’m no longer directly involved, but I really hope the project succeeds!)


Not sure where you're seeing "limited to Lua and Haskell."

Most of the implementations are in Javascript. Check out braid-text, it's super practical: https://www.npmjs.com/package/braid-text

There are also implementations in Rust and Go.


luajit is stupid fast, and cooperating with a lua program from other languages is trivial. Might be worth a look?


It is certainly worth a look, and of course if the project takes off then Python and other library support would be expected eventually. That's how open source communities usually develop. My point was that the current library support is a good proxy for the state of development and deployment, and currently this state is very early.


Given there's both client and server code available for JS that seems like a perfectly reasonable place to start experimenting with it.

The python community tends to be less neophilic than JS (often to its advantage) so I wouldn't expect that to show up until later.



