
JSON is underspecified, leading to various incompatibilities between implementations.

Because cURL is so ubiquitous, whatever Daniel implements may become the de facto standard.



JSON isn't under-specified: you can tell whether something is valid JSON based solely on the rules here: http://www.json.org/json-en.html. What's ambiguous is the mapping of the JSON data model to the data models found in various languages, and that isn't affected by curl supporting JSON in the slightest:

- The `--json` option only adds a content-type header, it doesn't alter the transmitted data at all.

- The `--jp` option has a bespoke format that's not part of the JSON spec, and which doesn't actually depend on a specific data model, it's just string manipulation.


See a detailed reply to a parallel comment.

Also, --jp actually generates JSON.


Numbers are the main thing I assumed you were talking about. The JSON spec is clear, though: numbers can be arbitrarily long.

The problem is not with the JSON spec; the problem arises when you convert from one data model to another. Any program that claims to perfectly round-trip the JSON data model should support arbitrarily long numbers; there's no ambiguity in the spec about that.

If you are only parsing JSON as a means to encode your own data-model, then there's no obligation to support arbitrary precision, but users should not expect to be able to round-trip arbitrary JSON data.
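The distinction can be sketched with Python's stdlib `json` module, which happens to decode integers with arbitrary precision (the literal below is just an arbitrary value above 2^53):

```python
import json

doc = "90071992547409931"              # an integer beyond 2**53

# An arbitrary-precision decoder round-trips it exactly:
assert json.loads(doc) == 90071992547409931
assert json.dumps(json.loads(doc)) == doc

# A decoder that forces every number into a float64 cannot,
# because adjacent doubles in this range are 16 apart:
assert int(float(doc)) != int(doc)
```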

AFAICT, `--jp` doesn't do anything which would affect the length of supported numbers, even though it's generating JSON.


I think that's the most interesting implication here. I really hope he lands on something suitable.


How is JSON underspecified? What kinds of incompatibilities?


JSON lets you write numbers. They can have a sign, decimal part, and an exponent. The standard euphemistically describes this as:

> JSON is agnostic about the semantics of numbers. […] JSON instead offers only the representation of numbers that humans use: a sequence of digits. […] That is enough to allow interchange.

But can you encode/decode an arbitrary integer or a float? Probably not!

* Float values like Infinity or NaN cannot be represented.

* JSON doesn't have separate representations for ints and floats. If an implementation decodes an integer value as a float, it may lose precision.

* JSON doesn't impose any size limits. A JSON number could validly describe a 1000-bit integer, which most fixed-width implementations can't decode without loss.
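Python's stdlib `json` illustrates the first point from both directions: by default the encoder emits non-standard tokens for infinities and NaN, and only rejects them when asked:

```python
import json

# Infinity/NaN have no JSON representation; Python's encoder emits
# non-standard tokens unless told not to:
assert json.dumps(float("inf")) == "Infinity"   # not valid JSON!

# With allow_nan=False the encoder refuses instead:
try:
    json.dumps(float("nan"), allow_nan=False)
    raised = False
except ValueError:
    raised = True
assert raised
```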

The result is that sane programs – that don't want to be at the mercy of whatever JSON implementation processes the document – might encode large integers as strings. In particular, integers beyond JavaScript's Number.MAX_SAFE_INTEGER (2^53 - 1) should be considered unsafe in a JSON document.
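For example (a sketch using Python's stdlib `json`; the id value is arbitrary):

```python
import json

MAX_SAFE_INTEGER = 2**53 - 1   # JavaScript's Number.MAX_SAFE_INTEGER

# An id just past the safe range is silently corrupted by a float64
# consumer (2**53 + 1 is odd, so it rounds to a neighbouring even double):
user_id = MAX_SAFE_INTEGER + 2
assert int(float(user_id)) != user_id

# Encoding it as a string survives any conforming JSON implementation:
doc = json.dumps({"id": str(user_id)})
assert json.loads(doc)["id"] == str(user_id)
```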

Another result is that no real-world JSON representation can round-trip “correctly”: instead of treating numbers as “a sequence of digits” they might convert them to a float64, in which case a JSON → data model → JSON roundtrip might result in a different document. I would consider that to be a problem due to underspecification.
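Python's stdlib `json` shows the effect: it parses numbers as float64 by default, so a decode/encode cycle can rewrite the document:

```python
import json

src = '{"n": 1.0000000000000000001}'   # valid JSON, 19 decimal places
out = json.dumps(json.loads(src))

# The float64 nearest to the original value is exactly 1.0,
# so the round-tripped document differs from the source:
assert out == '{"n": 1.0}'
assert out != src
```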


The numbers were what I was mainly thinking of, so thanks for your exhaustive enumeration of those problems.

json.org requires whitespace for empty arrays and objects while RFC 8259 does not (and I often see [] and {} in the wild).

A lot of packages de facto break the spec in other ways, such as people blatting Python dicts out directly rather than converting them to JSON, so that the keys are quoted as 'foo' rather than "foo". I've complained about this when trying to parse the stuff, only to receive the response "it works for me, so you must have a bug" from the pythonistas. This has happened in multiple projects.
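A minimal reproduction of that failure mode (names are illustrative):

```python
import json

d = {"foo": 1}

# str()/print() on a dict produces Python repr, not JSON:
assert str(d) == "{'foo': 1}"          # single-quoted keys: invalid JSON

# json.dumps produces the real thing:
assert json.dumps(d) == '{"foo": 1}'

# and a strict parser rejects the repr form:
try:
    json.loads(str(d))
    parsed = True
except json.JSONDecodeError:
    parsed = False
assert parsed is False
```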





