
I've always liked Cap'n Proto because it is (quite literally) the ideas behind Protobuf taken to an extreme, or, depending on your point of view, reduced to their most basic components: data structures already have to sit in memory looking a certain way, so why can't we just squirt that on the wire instead of some fancy bespoke type-length-value struct?

Of course, the hardest part is convincing everyone that it's not your bespoke type-length-value struct, but that you have good reasons for what you're doing. I think the humorous, not-so-self-serious presentation has worked in its favor (but that's just a subjective opinion and I can't back it up with data).



I thought that the main reason we didn't do this was that it was hard: platform inconsistencies like 32/64 bit, endianness, not to mention differences between how languages store things, etc.

The thing that irks me about these methods is that if you're using a capnproto Int rather than a regular Int, doesn't that mean that you're basically forgoing a lot of functionality that was built around and works with the regular old data types?

For example, we also do that with NumPy data types in Python, but there the performance benefit is super clear: numerical operations dominate. I guess it really depends on your use case. If most of your time is spent on serde, then perhaps it's worth it.


> platform inconsistencies like 32/64 bit,

In terms of data layout, 32-bit vs. 64-bit architecture only really affects pointer size. But Cap'n Proto does not encode native pointers (that obviously wouldn't work), so this turns out not to matter.

> endianness,

It turns out almost everything is little-endian now. Also, big-endian architectures almost always have efficient instructions for loading little-endian data. So Cap'n Proto just has to make sure to use those instructions in the getters/setters for integer fields.
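To make the getter/setter idea concrete, here is a minimal Python sketch. It is not Cap'n Proto's actual code; the helper names `get_uint32_le` and `set_uint32_le` are made up for illustration. The point is only that the accessor pins the byte order, so the bytes in the buffer come out the same on any host.

```python
import struct
import sys


def set_uint32_le(buf, offset, value):
    """Store a 32-bit unsigned integer little-endian at `offset` (hypothetical helper)."""
    # "<" forces little-endian and standard sizes, whatever the host byte order is.
    struct.pack_into("<I", buf, offset, value)


def get_uint32_le(buf, offset):
    """Load a 32-bit unsigned integer stored little-endian at `offset` (hypothetical helper)."""
    return struct.unpack_from("<I", buf, offset)[0]


wire = bytearray(8)
set_uint32_le(wire, 4, 0x11223344)

# The buffer contents do not depend on sys.byteorder.
print(sys.byteorder, wire.hex(), hex(get_uint32_le(wire, 4)))
```

On a little-endian host the conversion is effectively free; on a big-endian host it is a byte swap hidden inside the accessor.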

> not to mention differences between how languages store things, etc.

Cap'n Proto actually doesn't attempt to match how any language stores things. Instead, it defines its own layout that is appropriate for modern CPUs. It ends up being very similar to the way many languages store things (especially C), but isn't intended to exactly match.

The C++ implementation of Cap'n Proto generates inline getter/setter methods that do pointer arithmetic that is equivalent to what the compiler would generate when accessing a struct.

For Java, Cap'n Proto data is stored in a ByteBuffer, which effectively allows something like pointer arithmetic. Again, getters/setters are generated which use the right offsets.

Most other languages end up looking like either C++ or Java.
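A rough Python sketch of what such generated accessors might look like, assuming a made-up struct with fields `x: Int32`, `y: Int32`, `id: UInt64`. The class name, offsets, and layout here are hypothetical and much simpler than the real Cap'n Proto encoding; the sketch only shows getters that read at fixed offsets from a base position in a byte buffer.

```python
import struct


class PointReader:
    """Hypothetical generated reader: x: Int32 @ 0, y: Int32 @ 4, id: UInt64 @ 8."""

    X_OFFSET = 0
    Y_OFFSET = 4
    ID_OFFSET = 8
    SIZE = 16

    def __init__(self, buf, base=0):
        # No parsing or copying happens here; we only remember where the struct starts.
        self._buf = memoryview(buf)
        self._base = base

    @property
    def x(self):
        return struct.unpack_from("<i", self._buf, self._base + self.X_OFFSET)[0]

    @property
    def y(self):
        return struct.unpack_from("<i", self._buf, self._base + self.Y_OFFSET)[0]

    @property
    def id(self):
        return struct.unpack_from("<Q", self._buf, self._base + self.ID_OFFSET)[0]


# Two structs laid out back to back in one buffer.
message = struct.pack("<iiQ", -3, 7, 42) + struct.pack("<iiQ", 10, 20, 99)

first = PointReader(message, base=0)
second = PointReader(message, base=PointReader.SIZE)
print(first.x, first.y, first.id)
print(second.x, second.y, second.id)
```

Each field is decoded only when it is read, which is the main difference from a parse-everything-up-front format.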


> In terms of data layout, 32-bit vs. 64-bit architecture only really affects pointer size. But Cap'n Proto does not encode native pointers (that obviously wouldn't work), so this turns out not to matter.

Unless you're programming in C or C++ (or using a library from them), where size of int, long, etc. may change depending on architecture and compiler.


The integer types are all specifically sized, e.g. int8, int32, etc. Same thing for unsigned integers.

https://capnproto.org/language.html#built-in-types
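The same distinction shows up in Python's `struct` module, which makes for a quick illustration (this is not Cap'n Proto code, and the `fixed_sizes` helper is made up): explicitly sized formats have the same width everywhere, while "native" formats follow the local C compiler.

```python
import struct


def fixed_sizes():
    """Byte widths of explicitly sized integer formats (hypothetical helper)."""
    # With the "<" prefix, struct uses standard sizes that never vary by platform.
    formats = {"int8": "<b", "int16": "<h", "int32": "<i", "int64": "<q"}
    return {name: struct.calcsize(fmt) for name, fmt in formats.items()}


print(fixed_sizes())

# With the "@" prefix, struct uses the platform's C types, so `long` may be 4 or 8 bytes.
print("native C long:", struct.calcsize("@l"))
```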


> data structures already have to sit in memory looking a certain way, why can't we just squirt that on the wire instead of some fancy bespoke type-length-value struct?

In C/C++ ya can! When making games in college that is exactly what we did. Take the struct, dump it into the socket. I was rather shocked when trying to recreate the same system in C#. "I can't? I CAN'T?"
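For anyone who hasn't seen the trick, here is a rough Python equivalent using `ctypes`, which mirrors a C struct's in-memory layout. The `PlayerState` packet and its fields are invented for illustration; `payload` is what you would hand to `send()`.

```python
import ctypes


class PlayerState(ctypes.Structure):
    # Hypothetical game packet; three 4-byte fields, so there is no padding.
    _fields_ = [
        ("player_id", ctypes.c_uint32),
        ("x", ctypes.c_float),
        ("y", ctypes.c_float),
    ]


state = PlayerState(player_id=7, x=1.5, y=-2.0)

# "Serialize": copy the struct's raw memory, in host byte order and host layout.
payload = bytes(state)

# "Deserialize": reinterpret the received bytes as the same struct.
received = PlayerState.from_buffer_copy(payload)
print(len(payload), received.player_id, received.x, received.y)
```

This only round-trips cleanly when both ends agree on byte order, padding, and the exact struct definition.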


Well...that's kind of the point of C#. Enforced memory safety.

Use C if that's not what you want. Don't hammer with a screwdriver.


That's an absolutely terrible idea if you want any kind of forward / backward compatibility. Just about any system is better than dumping structs over the network, once you're dealing with a system that can evolve.


Yeah, I once had to write an Android / Desktop C++ implementation of such an API that was squirting iOS data structures over a socket. The API was built by a programmer who didn't know what endianness (or the size of the structs) was.

That was a special kind of hell :/


You might be surprised just how many successful applications simply dump a packed struct to disk as a serialization format (or even transmit it over the wire). If the operating environment (OS/hardware) is known to be of a certain type, endianness is not a concern.


It's not just endianness; it's about maintainability. If you ever think you might change the struct, EVER, you need to worry about these things.
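A small Python sketch of how this goes wrong, assuming a made-up packet that gains a `team` field in its second version. Nothing fails loudly; the old reader just returns the wrong value.

```python
import struct

# Hypothetical wire layouts for two versions of the same packet.
V1 = struct.Struct("<IH")   # player_id: uint32, hp: uint16
V2 = struct.Struct("<IIH")  # player_id: uint32, team: uint32 (new), hp: uint16


def old_reader(payload):
    """Reader compiled against V1; it knows nothing about the `team` field."""
    player_id, hp = V1.unpack_from(payload)
    return player_id, hp


old_payload = V1.pack(7, 100)
new_payload = V2.pack(7, 3, 100)

print(old_reader(old_payload))  # hp read correctly
print(old_reader(new_payload))  # hp now comes from the low bytes of `team`
```

Formats with field tags or fixed, append-only offsets exist largely to make this kind of change safe.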



