I've always liked Cap'n Proto because it was (quite literally) the ideas behind ...

makmanalp · on Sept 11, 2016

I thought that the main reason we didn't do this was because it was hard - platform inconsistencies like 32/64 bit, endianness, not to mention differences between how languages store things, etc.

The thing that irks me about these methods is that if you're using a capnproto Int rather than a regular Int, doesn't that mean that you're basically forgoing a lot of functionality that was built around and works with the regular old data types?

For example, we also do that with numpy data types in python, but there the performance benefit is super clear - numerical operations dominate. I guess it really depends on your use case. If most of your time is spent on serde, then perhaps it's worth it.

kentonv · on Sept 11, 2016

> platform inconsistencies like 32/64 bit,

In terms of data layout, 32-bit vs. 64-bit architecture only really affects pointer size. But Cap'n Proto does not encode native pointers (that obviously wouldn't work), so this turns out not to matter.

> endianness,

It turns out almost everything is little-endian now. Also, big-endian architectures almost always have efficient instructions for loading little-endian data. So Cap'n Proto just has to make sure to use those instructions in the getters/setters for integer fields.
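
As a rough illustration of decoding with a fixed byte order (a Python sketch, not Cap'n Proto's actual code; the byte values are made up):

```python
import struct
import sys

# Four bytes as they would appear in a little-endian encoding of 0x12345678.
wire = bytes([0x78, 0x56, 0x34, 0x12])

# "<I" forces little-endian decoding, whatever the host byte order is.
value = struct.unpack("<I", wire)[0]

# int.from_bytes with an explicit byte order gives the same result.
same_value = int.from_bytes(wire, "little")

# "=I" decodes in host byte order, so it only agrees on little-endian hosts.
native = struct.unpack("=I", wire)[0]

print(sys.byteorder, hex(value), hex(native))
```

The getter always names the byte order explicitly, so the caller never has to care what the host is.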

> not to mention differences between how languages store things, etc.

Cap'n Proto actually doesn't attempt to match how any language stores things. Instead, it defines its own layout that is appropriate for modern CPUs. It ends up being very similar to the way many languages store things (especially C), but isn't intended to exactly match.

The C++ implementation of Cap'n Proto generates inline getter/setter methods that do pointer arithmetic that is equivalent to what the compiler would generate when accessing a struct.

For Java, Cap'n Proto data is stored in a ByteBuffer, which effectively allows something like pointer arithmetic. Again, getters/setters are generated which use the right offsets.
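
To give a feel for what such generated accessors do, here is a minimal Python sketch. The `PointAccessor` class, its field names, and its offsets are all hypothetical, and this is not Cap'n Proto's real wire layout or generated code; it only shows the "fixed offset into a byte buffer" idea.

```python
import struct


class PointAccessor:
    """Hypothetical accessor over a buffer laid out as:
    x: int32 at offset 0, y: int32 at offset 4, ident: uint64 at offset 8."""

    SIZE = 16

    def __init__(self, buf, base=0):
        self._buf = buf    # any writable buffer, e.g. a bytearray
        self._base = base  # where this struct starts inside the buffer

    @property
    def x(self):
        return struct.unpack_from("<i", self._buf, self._base + 0)[0]

    @x.setter
    def x(self, v):
        struct.pack_into("<i", self._buf, self._base + 0, v)

    @property
    def y(self):
        return struct.unpack_from("<i", self._buf, self._base + 4)[0]

    @y.setter
    def y(self, v):
        struct.pack_into("<i", self._buf, self._base + 4, v)

    @property
    def ident(self):
        return struct.unpack_from("<Q", self._buf, self._base + 8)[0]

    @ident.setter
    def ident(self, v):
        struct.pack_into("<Q", self._buf, self._base + 8, v)


buf = bytearray(PointAccessor.SIZE)
p = PointAccessor(buf)
p.x, p.y, p.ident = -2, 7, 2**40
```

Nothing is parsed or copied here: setting a field writes straight into `buf`, and `buf` is already the serialized form.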

Most other languages end up looking like either C++ or Java.

izacus · on Sept 11, 2016

> In terms of data layout, 32-bit vs. 64-bit architecture only really affects pointer size. But Cap'n Proto does not encode native pointers (that obviously wouldn't work), so this turns out not to matter.

Unless you're programming in C or C++ (or using a library from them), where size of int, long, etc. may change depending on architecture and compiler.

jasonfsmitty · on Sept 11, 2016

The integer types are all specifically sized, e.g. int8, int32, etc. Same thing for unsigned integers.

https://capnproto.org/language.html#built-in-types
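
A quick way to see the difference between fixed-size and compiler-dependent integers, sketched with Python's `struct` module (the dictionary keys are just labels for this example):

```python
import struct

# With the "<" prefix, struct uses standard sizes that do not depend on
# the platform or the C compiler.
FIXED = {"int8": "<b", "int16": "<h", "int32": "<i", "int64": "<q"}
sizes = {name: struct.calcsize(fmt) for name, fmt in FIXED.items()}

# Native mode ("@") follows the platform's C types instead. A C long is
# commonly 8 bytes on 64-bit Linux and 4 bytes on 64-bit Windows.
native_long = struct.calcsize("@l")

print(sizes, native_long)
```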

jackmott · on Sept 11, 2016

> data structures already have to sit in memory looking a certain way, why can't we just squirt that on the wire instead of some fancy bespoke type-length-value struct?

In C/C++ ya can! When making games in college that is exactly what we did. Take the struct, dump it into the socket. I was rather shocked when trying to recreate the same system in C#. "I can't? I CAN'T?"
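
For anyone who hasn't seen the trick, here is roughly what it looks like, sketched in Python with `ctypes` standing in for a C struct. `PlayerState` and its fields are hypothetical, and the socket itself is left out; `payload` is what would be handed to `send()`.

```python
import ctypes


class PlayerState(ctypes.Structure):
    # Hypothetical game packet with fixed-width fields.
    _fields_ = [
        ("ident", ctypes.c_uint32),
        ("x", ctypes.c_float),
        ("y", ctypes.c_float),
        ("health", ctypes.c_int16),
    ]


state = PlayerState(ident=7, x=1.5, y=-2.0, health=100)

# "Dump it into the socket": the raw in-memory bytes, padding included.
payload = bytes(state)

# Receiver side: reinterpret the bytes as the same struct. This only works
# if both ends agree on field layout, padding, and byte order.
received = PlayerState.from_buffer_copy(payload)

print(len(payload), received.ident, received.x, received.y, received.health)
```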

paulddraper · on Sept 11, 2016

Well...that's kind of the point of C#. Enforced memory safety.

Use C if that's not what you want. Don't hammer with a screwdriver.

QuercusMax · on Sept 11, 2016

That's an absolutely terrible idea if you want any kind of forward / backward compatibility. Just about any system is better than dumping structs over the network, once you're dealing with a system that can evolve.

izacus · on Sept 11, 2016

Yeah, I once had to write an Android / Desktop C++ implementation of such an API that was squirting iOS data structures over a socket. The API was built by a programmer who didn't know what endianness (or size of the structs) was.

That was a special kind of hell :/

heywire · on Sept 12, 2016

You might be surprised just how many successful applications simply dump a packed struct to disk as a serialization format (or even transmit it over the wire). If the operating environment (OS/hardware) is known to be of a certain type, endianness is not a concern.
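
A small sketch of that pattern, using Python's `struct` module and an in-memory buffer as a stand-in for the file. The record layout (a one-byte tag plus a 32-bit value) is made up for illustration.

```python
import io
import struct

# Packed layout: 1-byte tag + 4-byte value, no padding, little-endian.
PACKED = struct.Struct("<BI")

# The same fields with native alignment; padding commonly makes this 8 bytes.
NATIVE = struct.Struct("@BI")

disk = io.BytesIO()  # stand-in for a file opened in binary mode
for tag, value in [(1, 100), (2, 200)]:
    disk.write(PACKED.pack(tag, value))

# Reading back is just slicing the byte stream into fixed-size records.
records = list(PACKED.iter_unpack(disk.getvalue()))

print(PACKED.size, NATIVE.size, records)
```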

QuercusMax · on Sept 12, 2016

It's not just endianness; it's about maintainability. If you ever think you might change the struct, EVER, you need to worry about these things.
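
A minimal sketch of how this goes wrong, with a hypothetical record that gains a field between versions. An old reader that still assumes the old layout doesn't fail; it silently returns garbage.

```python
import struct

# Version 1 of a hypothetical record: ident (uint32), score (uint32).
V1 = struct.Struct("<II")

# Version 2 inserts a flags field (uint16) in the middle.
V2_INSERTED = struct.Struct("<IHI")  # ident, flags, score

# Alternative version 2 that only appends the new field at the end.
V2_APPENDED = struct.Struct("<IIH")  # ident, score, flags

inserted_bytes = V2_INSERTED.pack(42, 1, 1000)
appended_bytes = V2_APPENDED.pack(42, 1000, 1)

# The old reader reads the first 8 bytes using the V1 layout.
bad_ident, bad_score = V1.unpack_from(inserted_bytes)
ok_ident, ok_score = V1.unpack_from(appended_bytes)

print(bad_ident, bad_score)  # score is wrong: flags bytes were read as score
print(ok_ident, ok_score)    # the prefix still matches the V1 layout
```

Append-only changes survive in this toy case, but nothing in a raw struct dump enforces that discipline or tells the reader which version it is looking at.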