
Another team at my company developed a gRPC microservice that my team depends on, and it's been an absolute nightmare. Protobufs are just too limited in what they can do to be useful except in very specific, narrow cases.



That's surprising, since Google uses protobufs very (very!) extensively internally, and it'd have to be a very exotic use case for protobufs not to be useful. Which isn't to say they're the be-all-end-all, but I'd be curious to hear your use case.


I haven't looked into gRPC at all, but I know for a fact that Protobufs can do whatever is needed when it comes to serializing data.

Being limited to types that Java supports is a huge constraint for some use cases. Thankfully, there are extensions to the spec to get around these limitations (but then you have to use libraries that all support the same extensions, and that limits your choices, so yes, this can become an issue!)

On the flip side, compared to JSON with its anemic type support, Protobufs look great!

Of course at the end of the day, a huge % of apps have to talk to the web browser, so everything gets dumbed down to strings and doubles. :(


> compared to JSON with its anemic type support

Am I the only one who wants to keep serialization as far from my type system as possible? I want to have my internal data model for my software, and when I want to serialize it, I should be able to do it in any number of different ways (redacting values, using HATEOAS links for REST routes, using a lossless format for storage) depending on the situation.

Protobuf couples data modeling and data serialization into a single inseparable concern, and it becomes difficult to do anything else once you start using it.


It is a performance trade-off.

Sure you can use JSON and make everything strings and do something like

    { "type": "int8", "value": "52" }
 
but then parsing out the data ends up being a huge overhead.
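
For contrast, the protobuf equivalent is just a typed field; here's a minimal sketch (message and field names made up):

  message Sample {
    // encoded as a compact varint on the wire; no string parsing needed
    int32 value = 1;
  }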

I once saw an XML variant of this, where someone decided to make the ultimate Distributed Computing System and serialize every single function call; it took up over half a CPU core for de/serialization for just the one app running on that machine.

While I could admire the purity of the design, it was utterly insane as an actual thing to bring into the world.

Protobuf is a good mixture of "strict typing so everyone is on the same page", "freedom to do stuff", and "perf isn't horrible."

There are other encoding systems out there that offer even more freedom or better performance, e.g. Cap'n Proto, but with other trade-offs.

On the opposite side of things, my team had been serializing straight C structs out over the wire, but every field addition was a breaking change, and communicating the changes to our structures across teams was a nightmare of meetings and "has your team merged the changes Bob made so we can roll out our new format yet?" We needed 3 teams to roll out changes at the same time!

With protobufs, we were able to make changes to our wire format incredibly rapidly; we had a nice source-controlled asset that defined our format; there wasn't any confusion about how data was laid out; and clients using older versions of our definitions just missed out on newer features, nothing actually broke.
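
For illustration (a made-up message, not our actual format), that kind of evolution is just adding a field under a fresh tag number:

  message Telemetry {
    uint32 device_id = 1;
    uint32 reading   = 2;
    // added later under a fresh tag number; clients built against the old
    // definition still parse new payloads and simply skip this field
    uint64 timestamp = 3;
  }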

It was an insanely large improvement, and honestly for the managed platforms, the performance wasn't appreciably worse than trying to convince Java to read in uint8s and write out uint8s.

My team was working on an embedded platform with RAM measured in kilobytes, and it was worth us eating the overhead just to get rid of the countless meetings we had to hold whenever we made a change to any of our structures.


Your internal data model should be separate from protobufs. Protobufs are for modelling the contract between the parties exchanging serialized content, so that the producer knows what to produce and the consumer knows what data to expect. Only when you need to serialize your internal data model into protobufs would you touch protobufs, just as you would with any other serialization format.

The structures that are generated by the protobuf tool are simply there to help with transforming your internal state into the format needed to satisfy the shared contract, allowing your language's compiler to tell you if you have violated that contract. Theoretically you could produce/consume protobufs without them, but they are provided as a convenience so that you can deal with serialization transforms directly in your language of choice instead of banging bytes.


I think most people don't take the time to discriminate between their models and protobuf's. They see automagically created data models and think WELP LET'S USE THIS EVERYWHERE.


>Protobuf couples data modeling and data serialization into a single inseparable concern, and it becomes difficult to do anything else once you start using it.

This is only true if you decide that your storage proto and wire proto are the same. That's not at all necessary (and generally not recommended). More to the point, FieldMask exists in proto, so redaction is fully supported, though I rarely see field masks used.
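
For reference, the usual shape of that in a .proto (message and field names here are just illustrative):

  import "google/protobuf/field_mask.proto";

  message GetUserRequest {
    string user_id = 1;
    // the caller lists the field paths it wants back (e.g. "name", "email");
    // a server can likewise apply a mask to strip fields before responding
    google.protobuf.FieldMask read_mask = 2;
  }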


Not true - protobuf has a JSON representation: https://developers.google.com/protocol-buffers/docs/proto3#j...


Really? I mean, besides the fact that nearly every message across Google is protobuf-encoded (with a fairly rich schema), protobufs are general enough to encode any message within them (you could use them to just pass bytes around), so it's hard to see why this would be true. Can you be more specific?


What kind of limits are you seeing?


I'm also curious


It’s the repeated/oneof issue that’s causing me problems right now.


You mean like:

  repeated oneof actions {
    ActionTypeOne type_one = 1;
    ActionTypeTwo type_two = 2;
  }

or like:

  oneof thingy {
    repeated string first_option = 1;
    string second_option = 2;
  }

I can see why you'd want to do either of those things, but it doesn't seem like a huge deal to me to wrap those in another message.
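
Something along these lines, say (names made up for the sketch):

  message StringList {
    repeated string values = 1;
  }

  message Thingy {
    oneof thingy {
      StringList first_option = 1;
      string second_option = 2;
    }
  }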


  oneof thingy {
    repeated string first_option = 1;
    string second_option = 2;
  }

Not the most beautiful solution, but I've seen it in production and it works fine.


And why can't you wrap it in another message?


Can you go into detail on this?

Perhaps you would prefer something like Ion: https://amzn.github.io/ion-docs/guides/why.html



