I've always liked Cap'n Proto because it was (quite literally) the ideas behind Protobuf taken to an extreme, or, depending on your point-of-view, reduced to its most basic components: data structures already have to sit in memory looking a certain way, why can't we just squirt that on the wire instead of some fancy bespoke type-length-value struct?
Of course, the hardest part is convincing everyone that it's not your bespoke type-length-value struct, but that you have good reasons for what you're doing. I think the humorous, not-so-self-serious presentation has worked in its favor (but that's just a subjective opinion and I can't back it up with data).
I thought that the main reason we didn't do this was that it was hard: platform inconsistencies like 32/64 bit, endianness, not to mention differences between how languages store things, etc.
The thing that irks me about these methods is that if you're using a capnproto Int rather than a regular Int, doesn't that mean that you're basically forgoing a lot of functionality that was built around and works with the regular old data types?
For example, we also do that with NumPy data types in Python, but there the performance benefit is super clear - numerical operations dominate. I guess it really depends on your use case. If most of your time is spent on serde, then perhaps it's worth it.
In terms of data layout, 32-bit vs. 64-bit architecture only really affects pointer size. But Cap'n Proto does not encode native pointers (that obviously wouldn't work), so this turns out not to matter.
> endianness,
It turns out almost everything is little-endian now. Also, big-endian architectures almost always have efficient instructions for loading little-endian data. So Cap'n Proto just has to make sure to use those instructions in the getters/setters for integer fields.
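To make the getter/setter point concrete, here is a minimal sketch in Python rather than generated C++. The helper names `get_u32_le` and `set_u32_le` are made up for illustration and are not part of any Cap'n Proto API. The idea is only that the accessor pins the byte order, so the host CPU's endianness never leaks into the wire format.

```python
import struct

def get_u32_le(buf, offset):
    """Read an unsigned 32-bit little-endian integer at a byte offset."""
    return struct.unpack_from("<I", buf, offset)[0]

def set_u32_le(buf, offset, value):
    """Write an unsigned 32-bit little-endian integer at a byte offset."""
    struct.pack_into("<I", buf, offset, value)

# Hypothetical 8-byte message with one uint32 field stored at byte 4.
buf = bytearray(8)
set_u32_le(buf, 4, 0x01020304)

# The least significant byte is stored first, whatever the host CPU is.
wire_bytes = bytes(buf[4:8])
value = get_u32_le(buf, 4)
```

On a little-endian machine the `"<I"` conversion is effectively a plain load; on a big-endian machine it becomes a byte-swapping load, which is the cheap instruction mentioned above.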
> not to mention differences between how languages store things, etc.
Cap'n Proto actually doesn't attempt to match how any language stores things. Instead, it defines its own layout that is appropriate for modern CPUs. It ends up being very similar to the way many languages store things (especially C), but isn't intended to exactly match.
The C++ implementation of Cap'n Proto generates inline getter/setter methods that do pointer arithmetic that is equivalent to what the compiler would generate when accessing a struct.
For Java, Cap'n Proto data is stored in a ByteBuffer, which effectively allows something like pointer arithmetic. Again, getters/setters are generated which use the right offsets.
Most other languages end up looking like either C++ or Java.
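A rough sketch of what the ByteBuffer-style approach looks like, written by hand in Python. `PointReader` and its field layout are assumptions for illustration only; the real Cap'n Proto encoding has more to it (pointer sections, default values, and so on), and real accessors are generated from a schema rather than typed out.

```python
import struct

class PointReader:
    """Hypothetical stand-in for a generated accessor class.

    Assumed layout (illustrative only, not Cap'n Proto's real encoding):
      x:  int32  at byte 0
      y:  int32  at byte 4
      id: uint64 at byte 8
    """
    SIZE = 16

    def __init__(self, buf, base=0):
        self._buf = buf      # the message bytes, never copied or parsed up front
        self._base = base    # where this struct starts inside the buffer

    @property
    def x(self):
        return struct.unpack_from("<i", self._buf, self._base)[0]

    @property
    def y(self):
        return struct.unpack_from("<i", self._buf, self._base + 4)[0]

    @property
    def id(self):
        return struct.unpack_from("<Q", self._buf, self._base + 8)[0]

# Two structs laid out back to back, as they might arrive off the wire.
message = struct.pack("<iiQ", -3, 7, 42) + struct.pack("<iiQ", 10, 20, 2**40)

first = PointReader(message)
second = PointReader(message, base=PointReader.SIZE)
```

Nothing is decoded until a field is actually read, and reading a field is just "base plus a constant offset", which is the same arithmetic a C compiler emits for a struct member access.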
> In terms of data layout, 32-bit vs. 64-bit architecture only really affects pointer size. But Cap'n Proto does not encode native pointers (that obviously wouldn't work), so this turns out not to matter.
Unless you're programming in C or C++ (or using a library from them), where size of int, long, etc. may change depending on architecture and compiler.
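As a rough analogy (in Python, not C), the `struct` module exposes both behaviours: native mode follows whatever the platform's C compiler does, while standard mode fixes every size. This sketch just compares the two; the variable names are mine.

```python
import struct

# Native mode ("@") uses the C compiler's sizes and alignment for this platform,
# so the size of a C long can differ between machines.
native_long_size = struct.calcsize("@l")

# Standard mode ("<") uses fixed sizes and no padding on every platform.
fixed_sizes = {code: struct.calcsize("<" + code) for code in "hilq"}

# A one-byte field followed by a four-byte int.
packed_size = struct.calcsize("<bi")   # 1 + 4, no alignment padding
native_size = struct.calcsize("@bi")   # may be larger because of padding
```

A wire format that only ever talks about fixed-width types sidesteps the problem, which is presumably why schema languages spell out widths instead of saying "int" or "long".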
> data structures already have to sit in memory looking a certain way, why can't we just squirt that on the wire instead of some fancy bespoke type-length-value struct?
In C/C++ ya can!
When making games in college that is exactly what we did. Take the struct, dump it into the socket. I was rather shocked when trying to recreate the same system in C#. "I can't? I CAN'T?"
That's an absolutely terrible idea if you want any kind of forward / backward compatibility. Just about any system is better than dumping structs over the network, once you're dealing with a system that can evolve.
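A small sketch of how a raw struct dump breaks once the struct evolves. The layouts `V1`, `V2_INSERTED` and `V2_APPENDED` are hypothetical. An old reader handed a newer struct with a field inserted in the middle silently reads garbage, whereas appending the new field at the end at least keeps the old offsets valid.

```python
import struct

V1 = struct.Struct("<If")            # hypothetical v1: id (uint32), x (float32)
V2_INSERTED = struct.Struct("<IHf")  # v2 with a uint16 inserted before x
V2_APPENDED = struct.Struct("<IfH")  # v2 with the uint16 appended after x

def read_v1(payload):
    """An old reader that only knows the v1 layout and ignores extra bytes."""
    return V1.unpack_from(payload, 0)

# The inserted field shifts x, so the old reader decodes the wrong bytes as x.
broken = read_v1(V2_INSERTED.pack(7, 3, 1.5))

# The appended field leaves id and x where the old reader expects them.
compatible = read_v1(V2_APPENDED.pack(7, 1.5, 3))
```

Append-only is just a convention here, and nothing in a raw dump enforces it or tells the reader how long the struct is. Schema-driven formats make that kind of rule part of the format instead of leaving it to discipline.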
Yeah, I once had to write an Android / Desktop C++ implementation of such an API that was squirting iOS data structures over a socket. The API was built by a programmer who didn't know what endianness (or the size of the structs) was.
You might be surprised just how many successful applications simply dump a packed struct to disk as a serialization format (or even transmit it over the wire). If the operating environment (OS/hardware) is known to be of a certain type, endianness is not a concern.