Scuttlebutt is a neat concept, burdened by a bad protocol. Signing a message involves serializing a JSON object, signing that serialization, adding the signature as a field on the object, and then serializing it again. To verify, you deserialize the message into an object, remove the signature field, reserialize it, and verify the signature against that new serialization. This means every client has to produce a JSON serialization that matches, byte for byte, the node.js serialization that the majority (totality?) of current clients use. Any other implementation (I wrote one in Go) becomes a nightmare of chasing rarer and rarer differences that prevent verification, and if you miss one before you serialize and sign your own message, suddenly you've poisoned your own journal...
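To make that concrete, here's a rough sketch of the round trip in TypeScript, using Node's built-in Ed25519 support. The 2-space JSON.stringify indentation and the field names are my assumptions about the legacy format, so treat this as an illustration of the shape of the problem, not the exact SSB encoding:

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "crypto";

type Message = Record<string, unknown>;

// Sign: serialize the object, sign those bytes, attach the signature as a
// field, then serialize again for storage/transmission.
function signMessage(msg: Message, privateKey: KeyObject): string {
  const unsigned = JSON.stringify(msg, null, 2);              // serialization #1
  const sig = sign(null, Buffer.from(unsigned, "utf8"), privateKey);
  return JSON.stringify({ ...msg, signature: sig.toString("base64") }, null, 2); // serialization #2
}

// Verify: parse, strip the signature field, re-serialize, and check the
// signature against that re-serialization. Any byte-level difference between
// your serializer and the original one breaks verification.
function verifyMessage(wire: string, publicKey: KeyObject): boolean {
  const { signature, ...rest } = JSON.parse(wire) as Message & { signature: string };
  const reserialized = JSON.stringify(rest, null, 2);
  return verify(null, Buffer.from(reserialized, "utf8"), publicKey, Buffer.from(signature, "base64"));
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const wire = signMessage({ author: "@example", content: { type: "post", text: "hi" } }, privateKey);
console.log(verifyMessage(wire, publicKey)); // true, but only if both sides serialize identically
```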
All in all, it's something to study and learn from, but I strongly recommend not becoming involved unless you are 100% happy with being tied into node.js.
After reading about the protocol I came to a similar conclusion as you. It should be noted, though, that the JSON serialization is defined as JSON.stringify as specified in ECMA-262 6th Edition, plus some more rules.
To me the worse part is that key order must be preserved, which, as I understand it, that standard does not specify. The extra rules (see the sketch after this list):
- Dictionary entries and list elements each on their own line.
- etc...
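For illustration, here's roughly what those rules produce; the 2-space indent and insertion-ordered keys are my reading of the format, not a quote from the spec:

```typescript
// Key order follows insertion order, and each entry lands on its own line.
const msg = { previous: null, author: "@example", sequence: 1, content: { type: "post", text: "hi" } };

console.log(JSON.stringify(msg, null, 2));
// {
//   "previous": null,
//   "author": "@example",
//   "sequence": 1,
//   "content": {
//     "type": "post",
//     "text": "hi"
//   }
// }
// Reordering the keys, which plain JSON allows, produces different bytes and
// therefore a different hash and an invalid signature.
```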
This is so weird to me. Why does a protocol need a strict, opinionated JSON format? And if they really need a very specific format, why bother with JSON at all? There are better options, like protobuf.
This seems like the worst example of "use JSON for everything".
As @cel pointed out further down the tree [1], that's not entirely correct any longer.
In particular, the "100% tied to node.js" part hasn't been true for a while. There are now alternative implementations in Rust and Go.
Now, the feed format is indeed a practical problem if you try to build a client from scratch, and it is annoying. Luckily, using one of the already existing libraries will handle this for you.
Changing that feed format for everyone is not possible, simply because there's already an existing social network built on the old ones that we very much want to preserve since we actually... well... hang out there.
Changing the feed format thus means adding a new feed format and making sure clients can handle, and link together, both. The benefit is abstracting away the feed format and being able to iterate on it.
Indeed, JSON maps are not supposed to be ordered, so doing anything that depends on the order is bound to fail.
This is the exact reason bencode (https://en.wikipedia.org/wiki/Bencode) was invented, and I still believe we could replace all uses of JSON with bencode and be better off for it, because it solves these all-too-common issues (see the sketch after the list):
- bencoded maps keep their keys in lexicographical order, so there is no confusion possible when hashing/signing (a torrent id is the hash of a bencoded map)
- bencode is binary friendly; in fact it must be, because it stores the piece hashes of the torrent
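A minimal bencode encoder sketch in TypeScript, just to make the "canonical by construction" point concrete (not a complete implementation; integer and key-encoding edge cases are glossed over):

```typescript
import { createHash } from "crypto";

type Bencodable = number | string | Buffer | Bencodable[] | { [key: string]: Bencodable };

// Integers are i<n>e, byte strings are <len>:<bytes>, lists are l...e, and
// dictionaries are d...e with keys in sorted order, so two independent
// encoders always produce the same bytes to hash or sign.
function bencode(value: Bencodable): Buffer {
  if (typeof value === "number") {
    return Buffer.from(`i${Math.trunc(value)}e`);
  }
  if (typeof value === "string" || Buffer.isBuffer(value)) {
    const bytes = Buffer.isBuffer(value) ? value : Buffer.from(value, "utf8");
    return Buffer.concat([Buffer.from(`${bytes.length}:`), bytes]);
  }
  if (Array.isArray(value)) {
    return Buffer.concat([Buffer.from("l"), ...value.map(bencode), Buffer.from("e")]);
  }
  // Sort keys by code unit, which matches raw byte order for ASCII keys.
  const entries = Object.keys(value).sort()
    .map((k) => Buffer.concat([bencode(k), bencode(value[k])]));
  return Buffer.concat([Buffer.from("d"), ...entries, Buffer.from("e")]);
}

// e.g. a torrent's infohash is the SHA-1 of the bencoded "info" dictionary
// (illustrative fields only, not a valid info dict):
console.log(createHash("sha1").update(bencode({ name: "file.txt", length: 42 })).digest("hex"));
```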
Interesting that your experience with Bencode was this positive. I've implemented a Bencode serializer/deserializer in Rust and here are a few things I've noticed:
* It is very easy to parse/produce Bencode
* It's probably fast
* The specification is really bad
* No float type
* No string type, just byte sequences. This is especially bad because most/all dictionary keys will be utf-8 strings in practice, but you can't rely on it
* Integers are arbitrary length; most implementations just ignore this
I think Bencode is an ok format for its use case, but I don't think it should be used instead of json.
When I was faced with this (signing a structure), I serialized the JSON, base64-encoded it, and then put that base64 string as a value (along with the MAC) into a new JSON document. It of course increases deserialization overhead (parse the outer JSON, verify, un-base64, parse the inner JSON), but it sidesteps this issue.
I thought about sorting keys and other tricks like that, but the dozen edge cases and potential malleability issues dissuaded me, on top of the compatibility issues mentioned above.
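A minimal sketch of that wrap-and-sign approach, assuming HMAC-SHA256 for the MAC and Node's crypto module (the actual project may have used different primitives):

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// The outer document only carries an opaque base64 payload plus a MAC, so no
// canonical JSON form is ever needed: the MAC is computed over the exact
// base64 string that travels on the wire.
function wrap(payload: object, key: Buffer): string {
  const inner = Buffer.from(JSON.stringify(payload), "utf8").toString("base64");
  const mac = createHmac("sha256", key).update(inner).digest("base64");
  return JSON.stringify({ payload: inner, mac });
}

function unwrap(wire: string, key: Buffer): object {
  const { payload, mac } = JSON.parse(wire) as { payload: string; mac: string };
  const expected = createHmac("sha256", key).update(payload).digest();
  const given = Buffer.from(mac, "base64");
  if (given.length !== expected.length || !timingSafeEqual(expected, given)) {
    throw new Error("bad MAC");
  }
  // Only after the MAC checks out do we parse the inner JSON.
  return JSON.parse(Buffer.from(payload, "base64").toString("utf8"));
}
```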
The signing schemes I've seen used in binary protocols fall into two categories:
1. Canonicalize and sign: the format has a defined canonical form. Convert to that before doing cryptographic operations. If the format is well designed around it, this is doable, whereas JSON doesn't really have this and with many libraries it's hard to control the output to the degree that you'd need.
2. Serialize and sign: serialize the data as flat bytes, and then your signed message just has a "bytes" field that is the signed object (sketched below). This is conceptually not far off from the base64 solution above, except that there's no extra overhead, since with a binary protocol you'll have a length prefix instead of having to escape stuff.
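A rough sketch of option 2 with an assumed framing (4-byte length prefix, then payload bytes, then the signature); this isn't any particular protocol's wire format:

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "crypto";

// Envelope layout: [4-byte big-endian payload length][payload][signature].
// The payload can be protobuf, JSON, or anything else; it is signed as raw bytes.
function seal(payload: Buffer, privateKey: KeyObject): Buffer {
  const len = Buffer.alloc(4);
  len.writeUInt32BE(payload.length);
  const sig = sign(null, payload, privateKey); // Ed25519 signatures are 64 bytes
  return Buffer.concat([len, payload, sig]);
}

function open(envelope: Buffer, publicKey: KeyObject): Buffer {
  const len = envelope.readUInt32BE(0);
  const payload = envelope.subarray(4, 4 + len);
  const sig = envelope.subarray(4 + len);
  if (!verify(null, payload, publicKey, sig)) throw new Error("bad signature");
  return payload; // deserialize only after the signature checks out
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const envelope = seal(Buffer.from(JSON.stringify({ hello: "world" })), privateKey);
console.log(open(envelope, publicKey).toString("utf8"));
```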
If you mean this as a point against the second part of my post: it's of course only in some limited circumstances that you can simply dump the values in a defined order and be done with it. To make it general, you have to have delimiters for at least arrays, objects and numbers, canonicalize number representations, and probably output the keys as well, at which point you've invented your own complete serialization protocol.
This is how Cosmos (https://cosmos.network/) deals with signing transactions; it caused _major_ headaches for me while trying to figure out why my signatures were invalid when sending them from Node.js.
I did something similar for a project that had clients in Go and Node. I solved it by flattening the object key paths to strings and sorting the keys, basically. You need to build that into the client serialization/deserialization, and it feels clunky, but I had zero issues and it has been working smoothly for a long while now.
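A sketch of that flatten-and-sort idea. The "." path separator and JSON.stringify for leaf values are my guesses at the details, not necessarily what that project did:

```typescript
// Flatten nested objects/arrays into "path=value" strings, then sort, so the
// signed byte string is identical no matter which client produced it.
function canonicalFlatten(value: unknown, path = ""): string[] {
  if (Array.isArray(value)) {
    return value.flatMap((v, i) => canonicalFlatten(v, `${path}[${i}]`));
  }
  if (value !== null && typeof value === "object") {
    return Object.entries(value as Record<string, unknown>)
      .flatMap(([k, v]) => canonicalFlatten(v, path ? `${path}.${k}` : k));
  }
  return [`${path}=${JSON.stringify(value)}`];
}

const toSign = canonicalFlatten({ b: 1, a: { c: [true, "x"] } }).sort().join("\n");
// a.c[0]=true
// a.c[1]="x"
// b=1
```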
It would be simpler to work out a canonical reduction for JSON. This should be reasonably easy since there are so few elements to it.
A simple proposal for actual security guys to rip to shreds (sketched in code below):
Strings are represented as utf-8 blobs, and hashed.
Numbers should probably be represented using a proper decimal format and then hashed. If you're reading the JSON in and converting it to floats, you could get slight disagreement in some cases.
Arrays are a list of hashes, which itself is hashed.
For objects, convert each key to utf-8 and append the value's hash (the value is always represented by its hash), sort these entries bitwise, and then hash all of that.
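A sketch of that scheme in TypeScript. SHA-256 and the one-byte type tags are my additions (the tags keep, say, the string "1" and the number 1 from hashing identically); booleans and null weren't covered above, so they get a fixed tag too:

```typescript
import { createHash } from "crypto";

const sha256 = (buf: Buffer): Buffer => createHash("sha256").update(buf).digest();

function jsonHash(value: unknown): Buffer {
  if (typeof value === "string") {
    // Strings: hash the utf-8 bytes.
    return sha256(Buffer.concat([Buffer.from("s"), Buffer.from(value, "utf8")]));
  }
  if (typeof value === "number") {
    // Numbers: a real scheme would pin down one decimal representation;
    // toString() is only a stand-in here.
    return sha256(Buffer.concat([Buffer.from("n"), Buffer.from(value.toString(), "utf8")]));
  }
  if (Array.isArray(value)) {
    // Arrays: a list of element hashes, itself hashed.
    return sha256(Buffer.concat([Buffer.from("a"), ...value.map(jsonHash)]));
  }
  if (value !== null && typeof value === "object") {
    // Objects: each entry is key bytes + hash of the value; sort entries
    // bitwise, then hash the concatenation.
    const entries = Object.entries(value as Record<string, unknown>)
      .map(([k, v]) => Buffer.concat([Buffer.from(k, "utf8"), jsonHash(v)]))
      .sort(Buffer.compare);
    return sha256(Buffer.concat([Buffer.from("o"), ...entries]));
  }
  // true / false / null: hash a fixed tag.
  return sha256(Buffer.from(`p${String(value)}`, "utf8"));
}

console.log(jsonHash({ b: [1, "two"], a: null }).toString("hex"));
```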
Or, better, it'd be great to have an order-independent hash function that's also secure. I doubt XORing all the pairs would be good enough. Update: a possible technique[1]
You can even have a JSON object containing base64 fields for message and signature if you feel it really needs to be JSON all the way down for whatever reason.
This is exactly the kind of case that would benefit from a function compiled to WebAssembly. "Write once, run everywhere."
The protocol helper functions would be compiled into a WebAssembly library, and you could reuse them from Go, Python, the browser, etc.
Of course, that's no justification for their protocol (rewriting a protocol in another language is a good test of the protocol specification), but it would be a use case for WebAssembly.
Once someone has written a library that does this correctly in Go/Rust/whatever, isn't the problem solved? Everyone building Scuttlebutt apps in that language can use that library. It doesn't seem like this protocol is changing.
I tried, and tried, and tried to make that library for Go. I failed, because of serialization issues too involved for it to be worth it to me (it got to the point where I'd have had to write my own JSON implementation).
Just giving people the heads-up that JS is seemingly the only blessed language for Scuttlebutt.
There is work on C, Go, and Rust versions. It runs on iOS and Android. So, yeah, not so much "tied into node.js". Yes, that's where the majority of the work is, but...
I spent quite a bit of time with XML digital signatures, which is a similar situation, but with even more surface area to get wrong. Months to implement, over a year to harden.