
It's worth pointing out that any valid JSON value is a valid JSON document. There is no requirement or guarantee that an array or an object is the top-level value in a JSON document.

"I am a valid JSON document. So is the Number below, and in fact every line below this line."

4

null
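
For anyone who wants to check this against one implementation, here is a minimal sketch using Python's standard `json` module, which follows the any-value reading. It only shows how this one parser behaves; other parsers may well disagree.

```python
import json

# Each string below is a complete JSON text whose top-level value is a scalar,
# not an object or an array.
texts = ['"I am a valid JSON document."', '4', 'null', 'true', 'false']

# json.loads accepts every one of them without complaint.
parsed = [json.loads(t) for t in texts]
print(parsed)  # ['I am a valid JSON document.', 4, None, True, False]
```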



Actually actually... the JSON spec doesn't define the concept of a JSON document. Neither http://www.json.org/ nor http://www.ecma-international.org/publications/files/ECMA-ST... actually specifies that a JSON 'document' is synonymous with a JSON 'value'.

Now it's also true that JSON doesn't specify an entity that can be either an object or an array but not be a string or a bool or a number or null. So it's kind of true that JSON doesn't say that an object or array are valid root elements.

But JSON also says "JSON is built on two structures" - arrays and objects. It defines those two structures in terms of 'JSON values'. But it's a reasonable way to read the JSON spec to say that it defines a concept of a 'JSON structure' as an array or object - but not a plain value. And then to assume that a .json file contains a JSON 'structure'.

Basically... JSON's just not as well defined a standard as you might hope.

edit: And now I'm going to well actually myself: Turns out https://tools.ietf.org/html/rfc4627 defines a thing called a 'JSON text' which is an array or an object, and says that a 'JSON text' is what a JSON parser should be expected to parse.

So - pick a standard.


JSON is in fact defined in (at least) six different places, as described in the piece 'Parsing JSON is a Minefield' [1] (HN: [2]).

The problem is perhaps not as egregious as with "CSV" -- which is more of a "technique" than a format, even though someone retroactively wrote a spec after 30 years of customary usage -- but it does manifest in edge cases like the one we're debating.

[1] http://seriot.ch/parsing_json.php [2] https://news.ycombinator.com/item?id=12796556


Why are you referencing the obsolete RFC? There is no restriction to object/array for the JSON text in the current RFC: https://tools.ietf.org/html/rfc7159


The current RFC recommends the use of an object or array for interoperability with the previous specification. JSON being a bit of a clusterf* of variants, they tried to make the RFC broad then place interoperability limitations on it. (lenient in what you accept, etc etc)


Because I just discovered that there was, at least once, a specification that actually defined JSON that way, where previously I had thought it had only been ambiguously described, and I thought that was interesting.


> There is no requirement or guarantee that an array or an object are the top-level value in a JSON document.

Alas, if only that were true.

RFC 4627:

> A JSON text is a serialized object or array. The MIME media type for JSON text is application/json.

RFC 7159:

> A JSON text is a serialized value. Note that certain previous specifications of JSON constrained a JSON text to be an object or an array. Implementations that generate only objects or arrays where a JSON text is called for will be interoperable in the sense that all implementations will accept these as conforming JSON texts.

IIRC, Ruby's JSON parser was written to be strictly RFC 4627 compliant, and yields a parser error for non-array non-object texts.

Since JSON isn't versioned, no one has any idea what "JSON" really means, or which "standard" is being followed.
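
To make the difference between the two RFCs concrete, here is a rough sketch in Python. The function names `parse_rfc4627` and `parse_rfc7159` are made up for illustration; this is not how Ruby's parser (or any particular library) is implemented, just the two readings of "JSON text" layered on top of Python's `json` module.

```python
import json

def parse_rfc7159(text):
    """RFC 7159 reading: a JSON text is any serialized value."""
    return json.loads(text)

def parse_rfc4627(text):
    """RFC 4627 reading: a JSON text must be an object or an array."""
    value = json.loads(text)
    # Python maps JSON objects to dict and JSON arrays to list.
    if not isinstance(value, (dict, list)):
        raise ValueError("RFC 4627 JSON text must be an object or array")
    return value
```

With these, `parse_rfc7159('4')` returns `4`, while `parse_rfc4627('4')` raises `ValueError`. Same input, same "JSON", different answer depending on which document the parser author read.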


You're right, thanks for the correction! It also kind of reinforces my point, I feel: any JSON document is just that, a JSON document; it doesn't carry more semantics just because you say so. My JSON parser will still just see simple JSON values, no matter how much I tell it that a certain key should really be a URL, not just a string.


True, but that's also true of any XML, RSS, Atom, HTML, etc. Websites abuse HTML all the time, and there's nothing saying that just because something is transferred with application/atom+xml that it will be valid or follow the spec.

It's more of a social agreement. If you get a JSON object from a place you expect a JSON Feed and it has a title and items, then it'll probably work, even if it omits other things.


So we can ditch media types altogether then? What's the point of having actual contracts if all we need is a hand shake and a wink? We're not talking about malformed data here, that's something different entirely and yes – it happens all the time. We're talking about calling a spade a spade.

If it's JSON your program expects then I should be able to throw any valid JSON at your program and it should work. Granted, it probably won't be a very interesting program precisely because JSON is just generic data without any meaningful semantics.

This spec is entirely about attaching semantics to JSON documents, but all that gets lost when you forget to let people know the document carries semantics and just call it generic JSON. Maybe that doesn't matter to a JSON-feed specific app that thinks any JSON is JSON-feed (an equally egregious error) but if there's an expectation that I should be able to point my catch-all program (i.e. web browser) at a URL and it should magically (more like heuristically I guess, potato/tomato) determine that the document retrieved isn't in fact just any JSON then things are about to get real muddy. Web browsers aren't particularly social, so I suspect a social agreement probably won't work that well.

Media types aren't just something that someone thought was a nifty idea back in the dizzy, they are pretty important to how the web functions.


> If it's JSON your program expects then I should be able to throw any valid JSON at your program and it should work.

That's not a valid argument, because JSON is just a serialization format for an arbitrary data structure. You can't throw any arbitrary data structure at any program that accepts data and expect it to be able to accept it. Every program that accepts input requires that input to be in a specific format, which is nearly always more specific than the general syntax of the format. And aside from programs that make strict use of XML schemas, they pretty much all use the handshake-and-wink method for enforcing the contract. (Or to put it another way: documentation and input validation.)

My take on the author's approach is that the content-type is specifying the syntax of the expected input, and the documentation specifies the semantics and details of the data structure. In that respect, the program works like most other programs out there.


Aww that's not fair – if you're going to quote then don't cherry pick and remove the relevant bits.

> If it's JSON your program expects then I should be able to throw any valid JSON at your program and it should work. Granted, it probably won't be a very interesting program precisely because JSON is just generic data without any meaningful semantics.

(Emphasis mine.)

By doing this you're just reinforcing my argument that just parsing any ol' plain JSON won't make for very interesting programs. JSON is just plain dumb data, it doesn't tell you anything interesting. There may be additional semantics you can glean from a document than just its format (HTML is pretty good for this, but oddly enough not a very popular API format) if there are mechanisms to describe data in richer terms – but JSON has none of these. Yet this spec says you should serve this as just plain ol' boring JSON.

> And aside from programs that make strict use of XML schemas, they pretty much all use the handshake-and-wink method for enforcing the contract.

This is just not true. Case in point: web browsers – arguably one of the most successful kind of program there ever was, with daily active users measuring in the billions – make heavy use of meta data including media types to determine how to interpret the input. Not just by way of format (i.e. media type) but also by way of supplemental semantics (e.g. markup, micro formats, links.)

> My take on the author's approach is that the content-type is specifying the syntax of the expected input, and the documentation specifies the semantics and details of the data structure.

Which could and should be described in a spec, with a corresponding IANA consideration to include a new and specific media type in the appropriate registry – not by overloading an existing one.


I'm not sure what you're arguing. JSONFeed is JSON, unless I'm missing something, just JSON that matches a specific schema.

If I'm pulling JSON from any API, I expect it to match a certain schema. If I expect { "time": 10121 } from a web API and they send me "4", then sure, that's valid JSON, but it doesn't match the schema they promised me in the API.

Something that's JSON should be marked JSON, even if we're expecting it to follow a schema.
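
As a sketch of what I mean by "valid JSON but wrong schema", here is roughly what a client for that time API might do in Python. The function name `parse_time_response` and the rule that "time" must be an integer are my assumptions for the example, not part of any real API.

```python
import json

def parse_time_response(text):
    """Hypothetical client for an API documented to return {"time": <integer>}."""
    value = json.loads(text)  # step 1: is it JSON at all?
    # step 2: does it match the promised shape?
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    time_value = value.get("time")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(time_value, bool) or not isinstance(time_value, int):
        raise ValueError('expected an integer "time" member')
    return time_value
```

`parse_time_response('{"time": 10121}')` returns `10121`, while `parse_time_response('"4"')` gets past the JSON parser and then fails the shape check.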


> JSONFeed is JSON, unless I'm missing something, just JSON that matches a specific schema.

Yes, and everything is application/octet-stream, so why have MIME types? Because it helps with tooling, discovery, and content negotiation. It is a hint for the poor soul who inherits some undocumented Ruby soup calling your endpoint.

Being as specific as possible with MIME types is a convention for a reason. Please don't break it unless you have an explicit reason to.


This is exactly one of the things that media-types solve. Simply using application/json doesn't tell me (consumer) anything about the semantics of what I'm reading. It only tells me what "parser" to use. If we have a proper media-type, like application/hal+json, I know exactly how to create a client for that type: I need to use a JSON parser _and_ use the vocabulary defined by HAL…


> Something that's JSON should be marked JSON, even if we're expecting it to follow a schema.

That's what the +json type suffix is for. I wonder how many people in this thread actually have read the mediatype RFCs, because they definitely don't encourage using mediatypes in the way you're describing.

The whole point of mediatypes is to make it possible to distinguish schemas while also potentially describing the format that the schema is in.
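
A small Python sketch of how a consumer might use the suffix, assuming it only cares about `application/*` types. The helper name `classify_media_type` and its (parser, vocabulary) return shape are invented for this example; the `+json` structured syntax suffix itself is registered in RFC 6839.

```python
def classify_media_type(content_type):
    """Split a Content-Type value into (parser, vocabulary) hints."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    type_, _, subtype = media_type.partition("/")
    if type_ != "application":
        return (None, None)
    if subtype == "json":
        # Syntax is known, but nothing is said about the schema.
        return ("json", None)
    if subtype.endswith("+json"):
        # The suffix names the syntax; the rest names the vocabulary.
        return ("json", subtype[: -len("+json")])
    return (None, None)
```

So `classify_media_type("application/hal+json")` gives `("json", "hal")`, whereas plain `application/json` gives `("json", None)`: the client knows which parser to use but learns nothing about the schema.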


This tool may help to validate and format JSON data: https://jsonformatter.org


Beware that many JSON parsers don't agree with this, although your interpretation is the correct interpretation of the spec. Some parsers will only accept either an array or object. If you're building a JSON endpoint you'll be safest returning either an array or object.
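
One way to stay on the safe side when building an endpoint, sketched in Python: wrap bare scalars in an object before serializing. The helper name `to_interoperable_json` and the "value" key are just a convention I made up for this example.

```python
import json

def to_interoperable_json(value):
    """Serialize so that the top-level JSON value is always an object or array."""
    # dicts serialize to objects; lists and tuples serialize to arrays.
    if isinstance(value, (dict, list, tuple)):
        return json.dumps(value)
    # Scalars (str, numbers, bool, None) get wrapped in a one-member object.
    return json.dumps({"value": value})
```

Here `to_interoperable_json(4)` produces `{"value": 4}` instead of a bare `4`, which even a strict object-or-array parser will accept.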


true


false



