In truth, the vast majority of protobuf users do not construct messages in a streaming way anyway; they have the whole message tree in memory upfront and then serialize it all at once.
"Streaming" in gRPC (or Cap'n Proto) involves sending multiple messages. I think this is probably the right approach. Otherwise it seems very hard to use a message as it's streaming in, as you can never know which fields are not present vs. just haven't arrived yet. But representing a stream as multiple messages leaves it up to the application to decide what it wants in each chunk, which makes sense.
Yeah, that's another good point I forgot to mention; it's often hard to know the difference between "I haven't received this info yet but it might still come" and "I know this info definitely isn't coming at all". I've dealt with this more often in the context of a specific schema than in the encoding format itself (e.g. sending back a simple "success" response when relaying a message whose response will be delivered asynchronously, so that the other side can tell the difference between the message or response getting lost and the message being sent successfully with the response just not having arrived yet). But it's definitely possible for an encoding format to be designed in a way where it's not clear where the message should end, and having a length prefix is an effective way to deal with that as well.
I also fully agree with the "streaming can be done as multiple messages" approach. From the discussion here, it sounds like there may be some niche use cases where having a length prefix would be prohibitive (e.g. compressed output being generated on the fly), but these don't sound like typical use cases for encoding formats intended to be used generally. If anything, I'd expect something like a gzip response to be sent back as the entirety of a response (e.g. to an HTTP GET request for a specific file) rather than as part of a message in some custom protocol using protobuf or something similar.
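To make the "stream as multiple length-prefixed messages" idea concrete, here's a rough sketch in Python. This is not how gRPC or protobuf actually frame things on the wire; the 4-byte big-endian prefix and the names `write_frame` and `read_frames` are just assumptions for illustration. It also assumes a blocking file-like object whose `read(n)` only returns fewer than `n` bytes at end of stream (true for `io.BytesIO` and buffered files, not for raw sockets).

```python
import io
import struct

# Hypothetical framing: each message is a 4-byte big-endian length
# followed by exactly that many payload bytes.
PREFIX = struct.Struct(">I")


def write_frame(stream, payload: bytes) -> None:
    """Write one length-prefixed message to the stream."""
    stream.write(PREFIX.pack(len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield complete messages; raise EOFError if the stream is cut mid-message."""
    while True:
        header = stream.read(PREFIX.size)
        if not header:
            return  # clean end: the stream stopped on a message boundary
        if len(header) < PREFIX.size:
            raise EOFError("stream ended inside a length prefix")
        (length,) = PREFIX.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("stream ended inside a message body")
        yield payload


buf = io.BytesIO()
for chunk in (b"first", b"", b"third"):
    write_frame(buf, chunk)
buf.seek(0)
frames = list(read_frames(buf))
```

The point is that the reader only ever hands the application whole messages, so "field not present" and "field hasn't arrived yet" can't be confused within a message, and a truncated stream shows up as an error instead of a silently short message. An empty message is also distinguishable from no message at all.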