Cap'n Proto 0.6 released – 2.5 years of improvements (capnproto.org)
336 points by 0x1997 on May 2, 2017 | 110 comments


As usual, nobody has said "capability" yet, which is unfortunate, because one of Capn's biggest strengths is that it embodies the object-capability model and is cap-safe as a result.

Edit: Why does this matter? Well, first, it matters because so little software is capability-safe. Capn's RPC subsystem is based directly upon E's CapTP protocol. (E is the classic capability-safe language.) As a result, the security guarantees afforded by cap-safe construction are extended across the wire to cover the entire distributed system. This security guarantee holds even if not every component of individual nodes is cap-safe, and that's how Sandstorm works.

There's also some history here. HN loves JSON; JSON is based on E's DataL mini-language for data serialization. HN loves ECMAScript; ES's technical committee is steered by ex-E language designers who have been porting features from E into ES.


The trouble is, we capability people have come up with our own language full of jargon that no one else knows, and I think it confuses people. When talking to people new to the idea, I try to use the term "object reference" or maybe "endpoint reference" rather than "capability", to be more approachable.


Is this the E language you are referring to? Had never heard of it before.

https://en.wikipedia.org/wiki/E_(programming_language)


Yes. It's not widely known because it was somewhat of a research language, never really production ready. But, it presented clean solutions to a huge number of hard problems, and as a result has been pretty influential behind the scenes. E.g. Javascript promises are directly inspired by E.


Could you elaborate?


I think the built-in security of "you can't call if you don't know the address" coupled with addresses being first-class (so you can send the address of a function to a client, thereby granting the capability): https://news.ycombinator.com/item?id=14244540

this is very similar to what the new bus1 ipc author wants to do: https://www.youtube.com/watch?v=6zN0b6BfgLY


This might be a contrarian view, maybe I'm misunderstanding it. To me, much of message passing is not performance critical, so it would be well served by JSON/YAML/XML for easy implementation/testing/debugging. If one needs performance, he can just send bytes over, which can be (de)serialised in a (couple) dozen SLOC. Sure, when you're talking about very complex structures, RPC, dynamic schemas etc., then you might opt for something like this, but let's be honest - that's quite a minority of current users, isn't it?

I never minded these frameworks, but then I wanted to write a few parsers for some file formats and they used Thrift/Flatbuffers to encode a few ints, which seemed like a major overkill. There was no RPC, no packet loss, no nothing.


Your view isn't contrarian at all -- in fact it's widely-held by smart people. But I'll present the counter-argument.

True, in many use cases, performance is not critical. However, when you're doing big data processing or serving at scale, serialization performance does in fact start to matter a lot. If Google Search passed all backend messages as JSON rather than Protobuf, it would probably require 10x the hardware (we may be talking about millions of machines here) and wouldn't be able to respond as quickly. Similarly, Cloudflare couldn't handle its logging load without Cap'n Proto.

Many times, companies who started small and thought performance didn't matter later found themselves needing to do massive, expensive rewrites in order to keep up with scale (e.g., famously, Twitter).

Hand-written binary serializers may be easy to write initially, but they are very hard to maintain. Over time, you'll need to add and remove fields, while maintaining compatibility with old binaries already running in the wild, or compatibility across languages. It's very easy to screw this up with a hand-written encoder, but very easy to get right with Protobuf or Cap'n Proto.

Finally, even if performance isn't an issue, you may find that type safety is useful. If you are writing code in a type-safe language, correctly dealing with JSON actually tends to be a huge pain involving lots of branching. By defining a schema upfront and generating code, you can get a much nicer interface. And if you're doing that anyway, then you might as well take advantage of the more-efficient encoding while you're at it -- it's easy to dump human-readable text for debugging when needed. (As of this release, Cap'n Proto ships with a library for converting to/from JSON, and there are lots of third-party libraries to do the same with Protobuf. Both libraries also have one-liner "dump a debug string" functions.)


A more approachable example: logging to a logging process in compute-heavy and/or time-sensitive applications (simulators, games, rasterizers, ...). In such situations the transport layer is usually bandwidth bound, so the use of a well-designed serialization layer (compression) is highly desirable. That being said, due to Zipf's law and the fact that whitespace delimiting is a pseudo-RLE compressor, I've found most binary serialization formats to be a wash compared to text.


Or, in slightly fewer words: there's a reason ethernet, IP, and UDP packets aren't encoded in JSON. It's that when you have to encode a lot of a thing, and the encoding is relatively expensive compared to the rest of the processing that needs to be done on that thing, having inexpensive encoding is important.


> there are lots of third-party libraries to do the same with Protobuf

Recent releases have it built in already, just FYI.

(Also, hi Kenton! I spent lots of time looking at your code when hacking on protobuf :))


It isn't just the serialization, but the fact that being careless about these things tends to co-occur with other kinds of sloppiness.

Here's one example that has repeated itself a few times in a few companies. I kinda like having logging that works without strangling what you are monitoring and/or the server doing the monitoring. I also like having extensible log servers that you can write plugins for (although I'm not that fond of the idea anymore as people invariably will write _slow_ plugins, and then you are screwed. Do put in the APIs, but don't tell anyone it is easy to add processing chains). I like it to the point where people made fun of me for this "obsession" even 15 years ago.

So it goes like this: I use some fast serialization that can be parsed using next to no CPU per message -- or in later years, using Protobuf or similar. Because it is more convenient and I can throw in binary payloads in exceptional cases. I benchmark and I usually get ridiculously high throughput without even making much of an effort. Great, this saves me the trouble of having lots of log server horsepower and I can get on with life. (Stuff will need to be sharded eventually, but since we keep everything dead simple and we actually have a plan for how to do this when needed, I don't worry.)

Then at some point someone who has never had to write fast code comes along. It used to be "let's use XML". Now it is invariably "let's use JSON". Because "it is fast enough". This is where it starts. And they'll use Ruby, or something else that is godawfully useless for anything that is supposed to be high throughput, because hey, it is what they understand.

And for a while, it is fast enough. Sure, you now have only a fraction of the original throughput, but the systems are lightly loaded and you are not bumping into any CPU limits. "See!? I told you it would work!".

Then people start to do serious logging and the stuff can't keep up anymore. So they start coming up with lots of schemes for dealing with the load. For every scheme implemented, stuff gets a bit more brittle and complex because now the code has all of these assumptions and mechanisms to preserve. And before you know it, you have a slow, complex logging system, that everyone depends on and nobody wants to touch. Well, sometimes someone will reimplement it, but they'll use whatever is hip at the moment. Like JavaScript and then ooh'ing and aah'ing when they double the throughput -- even though they kind of originally lost 2-3 orders of magnitude of performance.

And it isn't like making a performant version is a big task. With Cap'n Proto, Protobuffers and whatnot, someone has done the heavy lifting for you. Someone a lot smarter than you. And it is probably going to be faster to implement, more efficient, and easier to maintain in the long run.

Performance matters where it matters. And it _is_ stupid to forego it when it costs you little or no effort. And yes, being an old fart means I don't give a shit about people's egos anymore and I'll call them stupid to their faces when they are being stupid.

If you log using JSON (or XML, yech!) as your envelope format, you _are_ stupid.


You don't need a schema for type safety. Just an encoding that preserves type information like all the schemaless binary serializations (BSON, CBOR, etc.) do.


You're talking about a different kind of "type safety".

I mean that when I write code to consume a message, I would like to detect (at compile time) if I typo a field name, and I also don't want to worry about what happens if a (possibly-malicious) sender sends me a different type than I expected.
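
To make that concrete, here's a minimal C++ sketch of the difference; the types and field names are made up for illustration and aren't any particular library's API:

    #include <cstdint>
    #include <map>
    #include <string>

    // "Parsed JSON" stand-in: every field is looked up by string at runtime.
    using JsonObject = std::map<std::string, std::string>;

    // Generated-code style: fields are members with fixed types, so a typo
    // fails to compile and a sender can't hand us a number where we expect text.
    struct UserMessage {
      std::string userName;
      std::uint64_t createdAt;
    };

    std::string greetJson(const JsonObject& msg) {
      // A typo like "user_nmae" compiles fine and only fails at runtime,
      // and only on inputs that actually exercise this code path.
      return "Hello, " + msg.at("user_name");
    }

    std::string greetTyped(const UserMessage& msg) {
      // A typo like msg.userNmae is caught by the compiler.
      return "Hello, " + msg.userName;
    }

    int main() {
      JsonObject j{{"user_name", "alice"}};
      UserMessage m{"alice", 1493740800};
      return greetJson(j) == greetTyped(m) ? 0 : 1;
    }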


I like to use the term "binding" to differentiate between the compile-time type checks and static typing of messages in a programming language vs. the runtime type equivalence checks performed between sender and receiver of a message.


Please don't use the term "binding". It has many uses and it mostly tends to confuse people.


What kinds of uses of the term "binding" cause confusion in the context of message data types?

I think binding is precisely what is going on when a compiler generates code for marshaling and unmarshaling messages based on a schema.


Binding, as a term, is used for anything from assigning values to variables to what is essentially dependency injection of implementations of more or less abstract interfaces.

I'm merely pointing out that the term is vague and can cause confusion. That you make a strong association with one use of the term doesn't really govern what others associate with it.


IMHO performance is only one reason for using a system like Cap'n'Proto/grpc/Thrift vs some handwritten json/yaml/xml/binary protocol.

Others reasons are:

- Starting from an IDL you get a nice service contract, which means both sides of the service (client & server implementors) get a strong description of how the service should behave and what they need to implement. The more roles you have (architects, client-implementors, service-implementors, testers, mock-service implementors), the more you profit from a good specification.

- If you generate all the serialization/deserialization code, you avoid tediously handwriting it and the errors that creep in along the way. E.g. misspellings of field names or functions, or copy&paste mistakes, can always occur, and some of these errors are only detected late in the development process.

These advantages might not be visible if you have a small service and development team - you might even experience an overhead compared to just "changing the client and server code", which are both under your control. But once you are working on a system with thousands of APIs you are quite happy if you don't have to care about handwriting serialization and RPC code and instead can focus on business logic. I personally was responsible for deploying a similar solution for an in-vehicle infotainment system with hundreds of services and around 4000 functions, which worked out really well.


Another advantage: using an IDL defines your contract in a way that is both language agnostic and portable, which is great for client support and for long-term maintenance.


Thanks for making the point about large teams -- I was coming here to make it myself. Even if you are using JSON, having a contract between client and server is super helpful.

Out of curiosity, what format did your infotainment system use for serialization?


It's a proprietary binary serialization. Nothing revolutionary, though; just comparable to Thrift and Protobuf.


I once worked on a rather convoluted bunch of microservices where 40% of CPU time was spent inside the cJSON library. Cap'n Proto does exactly what you propose - sends bytes over, except in a documented, organized, forward- and backward-compatible manner, instead of a haphazard implementation burdened by technological debt.


When you start having polyglot microservices this all will make sense. For example, protobuf gives you versioning and compiles to a dozen languages... just because it is not useful to you, don't dismiss the idea. And since I'm part of the minority, I'm happy to have libs like this around. Just a sidenote: this behavior of neglecting performance altogether leads to systems which take ages to build and collapse under pressure.


> When you start having polyglot microservices this all will make sense.

I'll be surprised if you use a language that has a good protobuf implementation that doesn't also have a good json/yaml/xml implementation.

Incidentally, as far as I can tell, protobuf also neglects performance. For high performance you want as much of your message to be predictably positioned in the bytestream as possible, but protobuf uses a key-value style system which means you always have to look at each key-value pair to determine what it is and how big it is. This makes good sense for compatibility, but it's a definite trade off that limits how fast deserialization can be.
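
To make that concrete, here is a simplified sketch of what scanning a protobuf-style tag/value stream involves (illustrative only, not real protobuf library code; only the varint and length-delimited wire types are handled). Even if you only care about one field, you still have to decode and skip every tag/value pair in front of it:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Decode a base-128 varint starting at pos; advances pos past it.
    static uint64_t readVarint(const std::vector<uint8_t>& buf, size_t& pos) {
      uint64_t value = 0;
      int shift = 0;
      while (pos < buf.size()) {
        uint8_t b = buf[pos++];
        value |= static_cast<uint64_t>(b & 0x7f) << shift;
        if ((b & 0x80) == 0) break;
        shift += 7;
      }
      return value;
    }

    // Each field is a tag varint (field_number << 3 | wire_type) followed by a
    // payload whose length depends on the wire type, so locating any one field
    // means walking all of the preceding ones.
    static uint64_t findVarintField(const std::vector<uint8_t>& buf, uint32_t wanted) {
      size_t pos = 0;
      while (pos < buf.size()) {
        uint64_t tag = readVarint(buf, pos);
        uint32_t fieldNumber = static_cast<uint32_t>(tag >> 3);
        uint32_t wireType = static_cast<uint32_t>(tag & 7);
        if (wireType == 0) {                        // varint
          uint64_t v = readVarint(buf, pos);
          if (fieldNumber == wanted) return v;
        } else if (wireType == 2) {                 // length-delimited
          uint64_t len = readVarint(buf, pos);
          pos += static_cast<size_t>(len);          // skip the payload
        } else {
          break;                                    // other wire types omitted
        }
      }
      return 0;
    }

    int main() {
      // field 1 = varint 150, field 2 = 3-byte string "abc", field 3 = varint 7
      std::vector<uint8_t> msg = {0x08, 0x96, 0x01, 0x12, 0x03, 'a', 'b', 'c', 0x18, 0x07};
      return findVarintField(msg, 3) == 7 ? 0 : 1;
    }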


If you use a type safe language, Protobuf gives you type safety. This helps a lot as a system grows complex. You don't get that with JSON/XML/etc.


Huh? All (?) of the XML parsers/unmarshallers I've ever used have been type safe - certainly the built in ones in C#, Java, and Go.


I think there are two distinct notions of type safety being discussed here: most deserializers in typed languages provide type safety once successfully deserialized, but the IDL/shared-schema approach helps guarantee that you'll be able to successfully deserialize the data into a type-safe representation in the first place.

I can't speak to XML, but can say that even with "type safe" json serialization, I've seen several bugs due to some json libraries treating all numbers as doubles internally, meaning bad things happen when you're actually serializing integers larger than 53 bits (say, a nano timestamp). Sure, maybe it's a bug in the json library because the json spec doesn't establish any max precision for numbers, but it's not hard to run into it when different components/languages with independently implemented json libraries try to talk to each other.
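
A tiny C++ illustration of that failure mode; the cast to double stands in for what a JSON library that keeps all numbers as doubles does internally:

    #include <cstdint>
    #include <cstdio>

    int main() {
      // A nanosecond timestamp: well above 2^53, so not exactly representable
      // as an IEEE-754 double.
      const std::int64_t nanos = 1493740800123456789LL;

      const double asDouble = static_cast<double>(nanos);
      const std::int64_t roundTripped = static_cast<std::int64_t>(asDouble);

      std::printf("original:      %lld\n", static_cast<long long>(nanos));
      std::printf("round-tripped: %lld\n", static_cast<long long>(roundTripped));

      // The low bits are silently lost: no parse error is ever reported,
      // the value just comes back different.
      return nanos == roundTripped ? 1 : 0;
    }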

Shared schema approaches like proto also make it really hard to accidentally typo the name of the field, or accidentally try to read a field as a string rather than a list of strings.

Just like I wouldn't consider code to be type safe if it passed data around as raw Objects and each usage cast it to the expected type, I wouldn't consider services that communicate with each other using independently (manually) implemented parsers/extractors as being type safe, even if the "meat" of the code is dealing with strongly typed data on either end.


Another numeric oddity; the spec says that ints should have from 1 to 9 digits (in theory limiting it to 35 bit numbers or so)! Few libraries actually honour this, but it's really pot luck what they'll do with anything greater than a 32 bit integer.


Is that actually the case? The reference I see for 1-9 in the spec is not talking about length of the number, but is referring to digits 1,2,3...,9 as a single token in the grammar. digit1-9 is a token that's used to disallow leading zeros: a number must start (ignoring the possibility of "-" prefix) with a `digit1-9` which may then be followed by any number of `digit` tokens (inclusive of the digit "0" this time), or else it must start with a "0" which may only be followed by "." etc.

I don't see any reference to range/precision in the spec, other than the suggestion that numbers that can be represented as an IEEE double are likely to be interoperable, and integers within 53 bits of range will generally be interoperable due to exact agreement on their value.


In Golang you can have type safety with simple JSON marshaling between structures.


It's been a while since I looked at it but back when I was trying to map a JSON wire format onto a Go structure, this was a PITA, as it typically is in strongly-typed languages. You say it's "simple". What's the best way to go about it?


The other commenter posted a link to the official docs which explain it pretty well. I'm new to Go so I don't know what was there before but basically all I have to do is define a Go struct with the types that I want and then using struct field tags I can name the fields (for example if I want to use snakecase or something for JSON). Then when I encode to a string it automatically takes care of formatting it as JSON and I can decode a string and get back the typed struct (it will throw an error if it can't be done).


I've found it very easy just using the Marshal and Unmarshal functions from Go's encoding/json package:

https://golang.org/pkg/encoding/json/#Unmarshal

They have some reasonable examples there.


The idea is that if you're inventing a new data format, you can write the Go types first, and the JSON schema is based on that. This works ok if you're starting in Go.

Mapping arbitrary JSON to Go types may or may not be simple, depending on the data format.


json/yaml/xml have much less support for a schema-first workflow. XML offers it but only in a terrible schema language; JSON/YAML don't offer it at all as far as I know, and have difficulty expressing sum types, never mind anything more complicated.


The serialization format is an implementation detail. That's why there are similar schema/definition formats that ARE natively represented in JSON, like JSON-Schema.

Personally I think it's easier to maintain a format defined in an IDL like Protobuf or Thrift than a JSON format. Everyone is always on about how JSON is a "human-readable" format, which is nice and probably 90% true. What we want is a human-writable format, which neither JSON nor YAML really are. It's easier to define a structure in Thrift than in JSON.

I still gotta review this release of Cap'n Proto, but after struggling with some buggy language implementations from the upstream Thrift project and evaluating Protobuf 3.0 + grpc, I think that Protobuf 3.0 is going to be the way forward for now.


I work on such a system (multiple "microservices" written in a variety of languages) and we use JSON on top of zeromq. It's super easy to debug without any fancy tool. And as long as I'm consistent with my API design it's trivial to extend and keep everybody up to date.

If performance ever becomes an issue (unlikely in my case) I could probably switch to some binary serialization protocol eventually.


Clearly there are needs for very efficient message parsing. That's why these serialization frameworks exist. I've got servers that spend too many cycles in serialization, so I pay careful attention to it. I put a lot of effort into using JSON despite the performance problems, and have pretty low-level code to read/write it without too many allocations etc. Nothing that's easy on the eyes.

But I'd hazard a guess that you're spot on regarding the normal needs of normal programs. The vast majority of programs utilizing message-passing are not serialization-bound. For these programs I'd recommend JSON because it is easy for humans to inspect and debug, and there are plenty of libraries to choose from if you are not performance-sensitive.

(YAML is hopeless with regard to security, and XML is just horrid to look at when you're debugging.)


Being able to inspect messages is down to the proper tooling.

But I can certainly understand where you are coming from. I've designed several text based protocols in my time for that exact reason. And I still do for stuff that I positively know will never require any sort of performance (mostly on embedded systems, somewhat ironically since those are sometimes constrained down to just a few hundred bytes of memory :-))


The "you can just send bytes over" bit is kind of the whole point. While I use Google's flatbuffers instead of Capn'Proto due to easier use w/ in-memory buffers they more or less have the same design idea: the underlying data is little-endian and stored in machine native types.

For my use case in electronic trading, a format like this is great b/c you don't waste many CPU cycles in decoding/encoding while at the same time you can take complex message structures and easily generate code stubs to read/write them in various languages. You change something? Just regenerate the code stub. You can even guarantee backwards compatibility if needed.


I'd be curious what you mean by "easier use w/ in-memory buffers".


One advantage of things like Protobufs/Thrift/Avro/etc, beyond the performance (and the performance difference can be _huge_), is that they have a (generally upgradeable) schema; this is particularly handy when using strongly-typed languages. For JSON and YAML you, at best, can use a bolt-on schema checker thing.

I'd also disagree that performance on message passing doesn't matter; depends how many messages you're passing :) I've seen CPU-bound systems which spend most of their CPU time decoding JSON.


There's MessagePack for a substantial improvement in parsing speed and size over JSON while still being straightforward to parse and debug. These guys (Cap'n Proto) are for cases when that's not enough either. As I understood, Cap'n Proto removes the need for parsing itself, so you don't have to parse a data structure into your language's structure if you want to access a field; you just keep your data as a binary blob in memory without copying or parsing the whole thing.


I've found myself in all three situations: where performance didn't matter much and something like JSON or XML was fine, where performance mattered and a binary format was an obvious win, and one situation where serialized data needed to be as compact as possible and I could only achieve that by hand coding.

To me, formats like JSON or XML have one clear advantage: debugging is easier. I can fire up a decoder for a binary format, but that slows down the investigation.

That said, most binary formats have a schema file due to the necessity of not serializing field names. That schema file makes it easy to have tools that spit out language bindings. And that reduces the tedium of actually writing code that uses (and validates) the data, so I generally prefer those formats for that reason.


In addition to the problem of premature binarification, I have the sneaking suspicion that a lot of the remaining use cases could well be handled by ASN.1.


Experientially, nothing is handled well by ASN.1. Google for "asn.1 cve" and see how many security vulnerabilities are due to it being approximately impossible to parse correctly.


Some time ago I wanted to use Cap'n Proto in the browser but then I found that the only existing implementation written in JavaScript hadn't been updated in two years and the author himself recommended against its use somewhere in a thread on Stackoverflow. I would love to use Cap'n Proto but for me a robust JS implementation is a sine qua non. Does anyone here happen to know if there's been any progress in this regard or have I missed something?


There has been some work on a new Javascript implementation, as described in this thread: https://groups.google.com/d/msg/capnproto/lESKRE_pix8/jX59zE...

But it's been slow.

On the other hand, now that 0.6 includes a JSON library, it's relatively easy to do browser<->server in JSON and then use Cap'n Proto on the back-end. But, obviously, it would be nicer to use Cap'n Proto through the whole stack.

Contributions are welcome!


I've got about 70% of a capnproto implementation written in pure javascript that works in the browser.

Part of the problem is that there is a bit of an ideological difference, where I prefer dynamic code to adding build steps but all the existing tools and code assume you want to generate source. I also found it quite irritating to bootstrap, because the tooling itself uses capnproto. It sounds neat, but then it means that you have to be able to read capnproto in order to be able to read it.

The documentation was also incredibly patchy at the time, which made it quite hard to develop for, although I'm told that kentonv & gang are very helpful.

In the end, I was just starting to struggle with capnproto generics when my requirement went away as the server I was connecting to added a msgpack option.


FWIW, the Python implementation of Cap'n Proto loads schemas dynamically -- there is no code generator.

However, this is done through a C extension that calls into the C++ implementation. For browser-side Javascript, that's obviously problematic. (Emscripten isn't really the answer -- would be way too large a download.)


> Emscripten isn't really the answer -- would be way too large a download.

Could the emscripten result be slimmed-down?

(I fear the answer is probably “in theory, yes, in practice, it's >100KB”)


Do you happen to have a repository for this on Github?


It's still very much in a sketch/exploration state (looking at it again, 70% is pretty optimistic :-) so it's unlikely to be much use to anyone who hasn't spent a bunch of time trying to implement it themselves, but if you think there's a chance it might help you then send me your contact details (my email is in my profile) and I'll send it to you.


Not to take away from this library's announcement, but on this note I have had a great experience with protobufjs (for vanilla Google protocol buffers)[1]. I use it to both read and write protobuf messages in the browser.

Going to protobuf from JSON saved us about 50% on bandwidth for the high volume real time data service that we develop. Love it.

[1] https://www.npmjs.com/package/protobufjs


Check out Google FlatBuffers [1]. I haven't used it in a while, but it had a working JS implementation and broader language support than Cap'n Proto last time I needed something and compared the two. It's just a serialization framework and doesn't bundle RPC/distributed objects like Cap'n Proto - no event loop library or the complexity that comes with this - so if you just need serialization I think it's more approachable.

[1] https://github.com/google/flatbuffers


I think the target audience of Cap'n Proto and JSON are almost non-overlapping sets. If you use JS, anything other than JSON is a damn inconvenience. And if you cared about what people who are the target audience of Cap'n Proto care about, you won't be using JS :-)


> if you cared about what people who are the target audience of Cap'n Proto care about, you won't be using JS

What exactly would you be using? Flash?

In all seriousness, it's nice to have a format that works for all your clients (browser, iOS, Android, desktop, server).


I don't understand your question.


There is some strange geeky enjoyment from browsing all the serialization libraries out there. For my taste, I have settled on Cereal. When working with C++ end to end, I have found it to be the easiest and fastest way to throw data around.


Serialization can easily be a bottleneck, especially for write-heavy systems (as we have, with Event Sourcing). So I think it's quite natural that people try and squeeze the last few milliseconds out of it. It can easily make a huge difference when serializing/deserializing billions of entities.


Microservice architecture CPU usage can be dominated by serialization. It is one of the reasons that the JVM was so much faster than Ruby at Twitter for the frontend. The business logic just didn't matter as much as deserializing thrift and serializing json or html.


Serialization is on the order of microseconds, not milliseconds.


That depends entirely on how much (in both quantity and size) you are serializing, doesn't it?


I've seen single messages that took milliseconds... or even seconds.


There's a Wikipedia article that provides a nice overview:

https://en.wikipedia.org/wiki/Comparison_of_data_serializati...


Woohoo! Lack of first class Windows support always held me back. Looking forward to playing with this and seeing hopefully more regular future updates.


Great stuff! Since this release comes after the release of GRPC and (slightly less related) Graph API and ships with a web server, how does it compare?


Cap'n Proto RPC was originally released before gRPC. That said, gRPC is heavily based on Google's internal RPC system which has been around for a very long time, albeit not publicly.

If you read the Cap'n Proto RPC docs, everywhere where it mentions "traditional RPC", I specifically had Google's internal RPC in mind (having previously been the maintainer of Protobufs at Google). So, you can more-or-less substitute gRPC in there for a direct comparison. https://capnproto.org/rpc.html

There are two key differences:

1. Cap'n Proto treats references to RPC endpoints as a first-class type. So, you can introduce a new endpoint dynamically, and you can send someone a message containing a reference to that endpoint. Only the recipient of the message will be able to access the new endpoint, and when that recipient drops their reference or disconnects, you'll get notified so that you can clean it up. This is incredibly useful for modeling stateful interactions, where a client opens an object, performs a series of operations on it, then finally commits it. Put another way, this allows object-oriented programming over RPC. Also note that you can easily pass off object references from machine to machine -- currently this will set up transparent proxying, but in the future we plan to optimize it so that machines automatically form direct connections as needed, which will be really powerful for distributed computing scenarios.

2. Relatedly, Cap'n Proto supports "promise pipelining", which allows you to use the result of one RPC as an input to the next without waiting for a round-trip to the client. This makes it possible to use object-oriented interaction patterns with deep call sequences without introducing excessive round-trip latency. This is described in detail at the RPC link above.
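
As a rough sketch of what promise pipelining looks like from the C++ client side (the schema and names below are hypothetical; the call pattern follows the generated request/send API, in the style of the calculator example from the docs):

    // Assumed hypothetical schema, compiled by capnp to "database.capnp.h":
    //
    //   interface Database { openUser @0 (id :UInt64) -> (user :User); }
    //   interface User     { getName @0 () -> (name :Text); }

    #include <capnp/ez-rpc.h>
    #include <iostream>
    #include "database.capnp.h"

    int main() {
      capnp::EzRpcClient client("localhost:5923");
      auto& waitScope = client.getWaitScope();
      Database::Client db = client.getMain<Database>();

      // First call: returns a promise for a result containing a User capability.
      auto openReq = db.openUserRequest();
      openReq.setId(123);
      auto openPromise = openReq.send();

      // Promise pipelining: use the not-yet-arrived User capability right away.
      // The second call goes out without waiting for the first round trip.
      auto nameReq = openPromise.getUser().getNameRequest();
      auto namePromise = nameReq.send();

      // Only now do we block; both calls complete in roughly one round trip.
      auto response = namePromise.wait(waitScope);
      std::cout << response.getName().cStr() << std::endl;
      return 0;
    }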


For me another key difference (and really a killer feature if you need it) is that grpc natively supports streaming use cases besides plain RPC, in server->client, client->server and bidirectional ways.

Server->Client streaming is e.g. a powerful way to get realtime updates about some state on the server to the client without needing to poll (not performant) or needing to define callback services (ugly to maintain, service lifecycle questions, and if you need an extra connection from the service to the callback service (client) there are also challenges around routing).

Native support for streaming also removes the need to model explicit flow control (backpressure) behavior in the user defined APIs (like functions for requesting some items in addition to the callback functions for delivering items).

Another difference is that grpc utilizes HTTP/2 as the underlying transport protocol. However I'm thinking that's mainly an implementation detail. If Google had designed grpc slightly differently (not relying on barely implemented features like HTTP trailers) it could have been a bigger differentiator, e.g. by allowing browsers to directly make grpc calls without proxies.


gRPC-style streaming can be implemented in terms of Cap'n Proto object references: When starting a streaming call, the client creates a callback object to be called by the server every time a new message is available, and then sends a reference to that object as part of the request. Or for client->server streaming, the server returns such an object reference.

The drawbacks of callbacks that you mention don't apply in this scenario: The same network connection is utilized for the callback, solving the routing question. The callback object receives a notification when the caller is done with it (or disconnects), so it can clean up, solving the lifecycle question. Flow control / backpressure can be achieved by pausing calls to the callback if too many previous calls have not returned (I actually plan to bake this pattern into the library in the future to make it dead simple).
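
A rough sketch of that callback pattern in C++ (the Feed/Receiver schema is hypothetical, not something shipped with Cap'n Proto; the Server/Client classes are what capnp would generate from it):

    // Assumed hypothetical schema, compiled to "feed.capnp.h":
    //
    //   interface Receiver { push @0 (item :Text) -> (); }
    //   interface Feed     { subscribe @0 (receiver :Receiver) -> (); }

    #include <kj/async.h>
    #include <iostream>
    #include "feed.capnp.h"

    // Client-side implementation of the callback object. A reference to it is
    // sent as part of the subscribe() request; the server then calls push() on
    // it for each new item, over the same connection.
    class ReceiverImpl final : public Receiver::Server {
    public:
      kj::Promise<void> push(PushContext context) override {
        std::cout << "got: " << context.getParams().getItem().cStr() << std::endl;
        // Returning a promise gives natural backpressure: the sender can count
        // calls whose promises haven't resolved yet and pause when too many
        // (or too many bytes) are outstanding.
        return kj::READY_NOW;
      }
    };

    kj::Promise<void> subscribe(Feed::Client feed) {
      auto req = feed.subscribeRequest();
      req.setReceiver(kj::heap<ReceiverImpl>());  // pass the capability itself
      return req.send().ignoreResult();           // keep this alive; dropping it cancels
    }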

So, it seems to me streaming in gRPC is actually a narrow subset of what Cap'n Proto can express.

For web stack compatibility purposes, it's straightforward to implement Cap'n-Proto-over-WebSocket. I'm not sure that HTTP/2 buys much on top of that.


Thanks for the explanation. If it works like you describe it seems to be a reasonable way to obtain streaming behavior. Will take a look at it if I find some time.

Regarding backpressure and HTTP/2: you get a different kind of backpressure behavior with your approach (application-level flow control) than with the grpc approach (transport-level flow control). Let's say you have 2 functions which are called in parallel. For one the arguments are very big (let's say 100kB), for the other one they are small (some bytes). With HTTP/2 and transport-level flow control the small function could get the same bandwidth as the big one, which means more requests/s for the small function. With pure application-level flow control the small function needs to wait until the big one is fully sent before it can be put on the wire.

Which means if I have 2 tasks that execute concurrently and look like

    A: while (true) { await callBigFunction(); }
    B: while (true) { await callSmallFunction(); }
then with grpc I get more calls for B and with Cap'n Proto I get the same for both (correct me if I'm wrong).

I don't think it's a huge disadvantage in practice, because if you have large data chunks you should probably design your API different - and if you want realtime behavior then probably both protocols are not optimal. But it's still something to keep in mind.


You can achieve what you describe under my scheme by counting the total size of the parameters of all calls in flight rather than just counting the calls. This is what sophisticated systems are doing in practice and is what I plan to do when incorporating the pattern into the library.
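
A bare-bones illustration of that idea in plain C++ (not Cap'n Proto API; a real version would park blocked calls on the event loop instead of returning false):

    #include <cstddef>

    // Count the parameter bytes of calls currently in flight and refuse to
    // start new ones while the window is full.
    class ByteWindow {
    public:
      explicit ByteWindow(std::size_t limit) : limit_(limit) {}

      // Called before issuing a call carrying `size` bytes of parameters.
      bool tryAcquire(std::size_t size) {
        // Always let at least one call through, even if it alone exceeds the
        // window, so oversized calls can't deadlock.
        if (inFlight_ != 0 && inFlight_ + size > limit_) return false;
        inFlight_ += size;
        return true;
      }

      // Called when a call's promise resolves (or is rejected).
      void release(std::size_t size) { inFlight_ -= size; }

    private:
      std::size_t limit_;
      std::size_t inFlight_ = 0;
    };

    int main() {
      ByteWindow window(64 * 1024);                  // 64 KiB of params in flight
      bool big = window.tryAcquire(100 * 1024);      // first call always allowed
      bool small = window.tryAcquire(64);            // must wait for the big one
      window.release(100 * 1024);
      bool smallRetry = window.tryAcquire(64);       // now it fits
      return (big && !small && smallRetry) ? 0 : 1;
    }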


The described approach sounds very reasonable!

Besides building more support into the library, it would maybe also be very worthwhile to have these patterns described together with examples on the website. E.g. how to achieve callback interfaces which don't use another connection, server->client streaming, etc.

For me as a potential user, information about these kinds of features would make Cap'n Proto immediately look more interesting. When I look at the website I only see Promise Pipelining described as a main feature - which is novel for sure, but not something that attracts my interest if I am looking for the other features we discussed.


Yes, gRPC's stream support is a big positive in its book, and combined with the newer iterations of Protobuf that support lists and maps, it gives Thrift a run for its money. I was still initially attracted to Thrift because of its wide language support, until I gave these a test run and found that many of them are buggy.


> Put another way, this allows object-oriented programming over RPC.

So... CORBA?


No. CORBA tried to make remote objects look and behave like local objects, which was a deadly mistake. Cap'n Proto does not attempt to hide the network, but rather offers tools to work with it (e.g. the aforementioned promise pipelining).


More like E http://erights.org


What always trips me up about capnproto is that it's billed as a serialization library, but what it is is an in-memory storage layout, and "serialization" is mostly just dumping memory into a file, right? (which is cool)

What confuses me is, then what are the costs of migrating to this system? Am I essentially dumping my programming language's object model for my capnproto implementation's? When can this be annoying? Or does it vary from implementation to implementation?

In a similar tangent - how similar is this to apache arrow, not because of the columnar analytics part, but could I expect to just dump a bunch of data in shared memory and read it from another process to eliminate IPC serialization/copy costs?


Generally I'd recommend using Cap'n Proto in much the same way as you'd use Protobuf. It's not intended that your in-memory state be in Cap'n Proto objects, only the messages you intend to transmit, or data stored on disk.


Wait, but then aren't you merely shifting the serialization cost instead into building capnproto objects? (and perhaps that's more efficient somehow?) It seemed to make more sense when you already have your data as capnproto objects, versus creating objects only to send and discard them, which is similar to regular old serialization again.


With Protobuf, you still have the cost of constructing the objects, and the cost of then serializing them. Cap'n Proto removes the latter cost. Serializing is generally the much more expensive step. It also turns out Cap'n Proto reduces the building cost by a fair amount, because the arena-style allocation needed to support zero-copy output also happens to be a lot cheaper and more cache-friendly, but that's somewhat of an accident.

For message-passing scenarios, Cap'n Proto is an incremental improvement over Protobufs -- faster, but still O(n), since you have to build the messages. For loading large data files from disk, though, Cap'n Proto is a paradigm shift, allowing O(1) random access.
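
For a feel of what that looks like in the C++ API, here is a rough sketch; the Point schema is hypothetical, while MallocMessageBuilder, messageToFlatArray and FlatArrayMessageReader are the usual builder/reader entry points:

    // Assumed hypothetical schema, compiled to "point.capnp.h":
    //
    //   struct Point { x @0 :Float64; y @1 :Float64; label @2 :Text; }

    #include <capnp/message.h>
    #include <capnp/serialize.h>
    #include "point.capnp.h"

    kj::Array<capnp::word> build() {
      // Arena-allocated builder: setting fields writes directly into the final
      // wire layout, so there is no separate "serialize" pass afterwards.
      capnp::MallocMessageBuilder message;
      Point::Builder point = message.initRoot<Point>();
      point.setX(1.5);
      point.setY(-2.0);
      point.setLabel("origin-ish");
      return capnp::messageToFlatArray(message);  // just concatenates segments
    }

    void read(kj::ArrayPtr<const capnp::word> words) {
      // No parse step: the reader points into `words` and fields are decoded
      // lazily on access, which is what makes random access into a large
      // (possibly memory-mapped) file O(1).
      capnp::FlatArrayMessageReader reader(words);
      Point::Reader point = reader.getRoot<Point>();
      double x = point.getX();
      (void)x;
    }

    int main() {
      auto words = build();
      read(words.asPtr());
      return 0;
    }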


I understand much better now - thank you very much! And congratulations on the release!


Why is it not intended that in-memory state be in Cap'n Proto / Protobuf objects? What are the down-sides?


The classes generated by Cap'n Proto and Protobuf are 100% public and are limited to the exact structures supported by the respective languages. That means that if you decide one day that your state needs to include, say, a queue, or if you want to encapsulate some of your state to give a cleaner API to callers, you can't, unless you go all the way and wrap everything. Inevitably if you've been building up your internal APIs in terms of protobuf/capnp types all along then you're going to be resistant to rewriting it and will instead probably come up with some ugly hack instead, and over time these hacks will pile up.

With that said, using protobufs for internal state is not an uncommon practice and if you don't care about cleanliness and just want to pound out some code quickly, sometimes it can work well.

Cap'n Proto has an additional disadvantage here in that its zero-copy nature requires arena allocation, in order to make sure all the objects are allocated contiguously so that they can be written out all at once. This actually makes allocating memory for Cap'n Proto objects much faster than for native objects -- but you can't delete anything except by deleting the entire message. So if you have a data structure that is gradually gaining and losing sub-objects over time, in Cap'n Proto you'll see a memory leak, as the old objects aren't freed up. You can work around this by occasionally copying the entire data structure into a new message and deleting the old one -- essentially "garbage collecting". But it's rather inconvenient.
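
That copy-to-compact workaround might look roughly like this (sketch only; `State` stands in for whatever your own root struct type is):

    #include <capnp/message.h>
    #include <kj/memory.h>
    #include "state.capnp.h"  // hypothetical schema with a root struct State

    // Orphaned space accumulates in `old` as sub-objects are replaced or
    // discarded over time. Copying the live tree into a fresh arena and then
    // dropping the old builder reclaims it (a manual "garbage collection").
    kj::Own<capnp::MallocMessageBuilder> compact(capnp::MallocMessageBuilder& old) {
      auto fresh = kj::heap<capnp::MallocMessageBuilder>();
      fresh->setRoot(old.getRoot<State>().asReader());  // deep copy of live data
      return fresh;  // caller swaps this in for `old`, freeing the garbage
    }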

This is actually one reason I want to extend the Cap'n Proto C++ API to generate POCS (Plain Old C Structs) for each type, in addition to the current zero-copy readers/builders. You could use the POCS for in-memory state that you mutate over time, then you could dump it into a message when needed (requiring one copy, but it should still be faster than protobuf encoding).

https://capnproto.org/roadmap.html#c-capn-proto-api-features


After integrating protobufs in my application for messaging I decided to use a separate schema for storing the current state of the program. I.e. when state changes, the protobuf is updated and written to disk. When the program restarts, the state file is loaded into memory. I have not run into any problems doing this.

Edit: Your question is addressed here: https://news.ycombinator.com/item?id=14249367


Thanks, but I don't follow how that comment addresses my question. Is it that cost of constructing Cap'n Proto / Protobuf is quite a bit higher than constructing objects defined natively?


> Is it that cost of constructing Cap'n Proto / Protobuf is quite a bit higher than constructing objects defined natively?

I discussed in more detail in reply to your first post, but just to be really clear on this:

No. In fact, for deeply-nested object trees, constructing a Cap'n Proto object can often be cheaper than a typical native object since it does less memory allocation. However, there are some limitations -- see my other reply.

(Constructing Protobuf objects, meanwhile, will usually be pretty much identical to POCS, since that's essentially what Protobuf objects are.)

There is a common myth that Cap'n Proto "just moves the serialization work to object-build time", but ultimately does the same amount of work. This is not true: Although you could describe Cap'n Proto as "doing the serialization at object build time", the work involved is not significantly different from building a regular in-memory object.


Does it do protocol negotiation? i.e. can a client ask the server what interfaces it implements?


There are a few answers to that:

1. If you have a remote object reference, you can (explicitly) cast it to any interface type, and then attempt to call it. If it doesn't implement that interface (or that method), an "unimplemented" exception will be thrown back. It's relatively common to do feature detection this way (a rough sketch follows at the end of this list).

2. It's easy for the application to define a Cap'n Proto interface to support fancier negotiation. For example, you could have all your RPC interfaces extend a common base interface which has a method getSchema() which returns the full interface schema, or a list of interface IDs, or whatever it is you want.

3. We actually plan to bake in schema queries in a future version, such that all objects will support some sort of getSchema() call implicitly. This would especially be useful for clients in dynamic languages that could potentially connect to a server without having a schema at all, and load everything dynamically.
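
A rough sketch of option 1, assuming hypothetical `Basic`/`Fancy` interfaces from your own schema, where `Fancy` adds a method that older servers don't implement:

    #include <kj/async.h>
    #include "myapi.capnp.h"  // hypothetical schema defining Basic and Fancy

    kj::Promise<bool> supportsFancy(Basic::Client cap) {
      // The cast always "succeeds" locally; whether the remote object actually
      // implements Fancy is only discovered when we try to call it.
      Fancy::Client maybeFancy = cap.castAs<Fancy>();
      auto req = maybeFancy.shinyNewMethodRequest();
      return req.send().then(
          [](auto&&) { return true; },  // the server implements it
          [](kj::Exception&& e) {
            // Older servers answer with an "unimplemented" exception. (A real
            // implementation would rethrow other error types rather than
            // lumping them in here.)
            return e.getType() != kj::Exception::Type::UNIMPLEMENTED;
          });
    }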


I think this[1] API among others should be renamed and "future" be used instead of "promise" to better reflect C++ naming conventions.

[1] https://github.com/sandstorm-io/capnproto/blob/247e7f568b166...


Cap'n Proto Promises are highly analogous to promises in Javascript. Both are derived directly from the E language, which has been around for decades. It makes sense to stick with the terminology used in similar designs, so that people don't have to re-learn the same concept when switching languages.

C++'s standard library for some reason decided -- relatively recently -- to introduce the term "promise" to mean something different (what most people call a "resolver" or a "fulfiller" for a promise), which is unfortunate. It's C++ that is being inconsistent here.

There are also subtle historical differences between the meaning of "promise" and "future". Historically, futures have usually existed in multi-threaded designs rather than event-loop/callback designs; you wait() on a future, blocking the calling thread, whereas with promises you call promise.then() to register a callback to call when a promise completes. The C++ committee again seems to have gotten this confused with their definition of "future", which looks more like a traditional promise.


I have no idea why the C++ designers chose those particular names, but at least Scala uses the same Promise/Future dichotomy, where you resolve (or reject) a Promise into a Future. Perhaps they drew from the same well?


How do Cap'n Proto Promises and Rust futures relate?


They have a lot in common!

For a while, capnp-rpc-rust used `gj::Promise`, which is based directly on the C++ Cap'n Proto implementation of promises (i.e. `kj::Promise`). Back in January, capnp-rpc-rust was updated to use `futures::Future` instead, and it was a fairly straightforward transition, as described in this blog post: https://dwrensha.github.io/capnproto-rust/2017/01/04/rpc-fut...

The trickiest part of the transition was dealing with scheduling. The implementation of `kj::Promise` has a built-in scheduling queue that guarantees a certain form of deterministic FIFO semantics, and those semantics are heavily depended upon in the Cap'n Proto RPC implementation. Rust's `futures::Future` is less batteries-included, requiring capnp-rpc-rust to explicitly create queues where deterministic scheduling is needed.

Confusing the terminology perhaps even more, in capnproto-rust there is a type `capnp::capability::Promise` that implements `futures::Future`.


Windows support is great but why is this a showstopper? This is a common problem in many libraries and the common solution is to mark those platforms as unsupported until someone steps up. Also, platform support is a dynamic list - platforms are kept alive by the presence of contributors/maintainers.


I think it would set a pretty bad precedent to have Windows support in one release and then drop it in the next release, then bring it back, etc. Windows users would likely be left very confused. Saying "it's up to contributors to step up" is nice in theory but in practice I don't think a well-maintained library can operate that way. Few people are eager to volunteer to run tests over and over fixing tiny issues for a coordinated release...

That said, it probably would have been better to drop Windows temporarily than to go two years without a release. But it always seemed like I'd find time in the next month.

Also note that Windows wasn't the only thing. I have a huge test matrix that I run for every release and, again, to avoid confusion, I want the whole thing to pass for any release. Things like building with -fno-exceptions or 32-bit builds or Android or ancient GCC versions tend to break frequently as the code evolves, but we really ought to have all these working for a release.

But maybe I'm just too OCD about this... :)

We now have AppVeyor (and Travis-CI) set up to build a chunk of the test matrix on every commit, which should help a lot going forward.


> But maybe I'm just too OCD about this... :)

It's not OCD, it's good practice and I wish more open-source projects would follow that! Good job!


Kenton, the Windows support looks amazing and I'm grateful for it. There are so many of us for whom lack of up to date Windows support is a showstopper, so, thanks! (Thanks to Harris and others, as well)


Would it kill developers of open source to provide a simple, two-sentence explanation of what their app does?


How's this?

> Cap’n Proto is an insanely fast data interchange format and capability-based RPC system. Think JSON, except binary. Or think Protocol Buffers, except faster.

https://capnproto.org/


I guess I should make the top banner be a link to the home page, so that people on mobile (who can't see the sidebar very easily) can just mash the screen.

EDIT: done


Very nice and technically better than gRPC. However, I'd rather bet my company's success on a standard that the big giant Google is betting on, and other companies are embracing. Also, client-side libraries for all languages are important. gRPC is probably better here too?


FWIW, Cloudflare is betting on Cap'n Proto, and Cloudflare is pretty big too.


Though to be fair, Cloudflare is still 2-3 orders of magnitude smaller than Google by most measures. :)


A bit off topic, but that title banner is a massive waste of space. It takes up more than half the space on my screen (15" laptop).


I know, totally agree. It's actually one of those things that have made me shy away from recommending its use. I continually end up back at Protocol Buffers, simply because the Cap'n Proto website doesn't seem professional to me. Instead, it feels like the back of a breakfast cereal box.


Working as intended. :)


Ha ha, indeed! :)


The name "Cap'n" was forever tainted for me, from my traumatic experience with "Cap'n Software Forth".

http://www.art.net/~hopkins/Don/lang/forth.html

"The first Forth system I used was Cap'n Software Forth, on the Apple ][, by John Draper. The first time I met John Draper was when Mike Grant brought him over to my house, because Mike's mother was fed up with Draper, and didn't want him staying over any longer. So Mike brought him over to stay at my house, instead. He had been attending some science fiction convention, was about to go to the Galopagos Islands, always insisted on doing back exercises with everyone, got very rude in an elevator when someone lit up a cigarette, and bragged he could smoke Mike's brother Greg under the table. In case you're ever at a party, and you have some pot that he wants to smoke and you just can't get rid of him, try filling up a bowl with some tobacco and offering it to him. It's a good idea to keep some "emergency tobacco" on your person at all times whenever attending raves in the bay area. My mom got fed up too, and ended up driving him all the way to the airport to get rid of him. On the way, he offered to sell us his extra can of peanuts, but my mom suggested that he might get hungry later, and that he had better hold onto them. What tact!"



