I'd argue that for data storage purposes, you'd like a low metadata-to-information ratio. In the examples the author gives, this seems to be the main problem: far more characters are spent on markup than on content.
Compare that to JSON or TOML, which are more human-friendly and waste fewer bytes on structure to convey the same information. When used for data storage, two XML files of the same schema describing two completely different objects are likely to share a large amount of content, which is wasteful and gets in the way.
For storage of structured data (and probably even for loosely coupled RPC) you want a format that is efficient and schema-oblivious. The bad choice 15 years ago was XML; the bad choice today is JSON (the parsing overhead is not negligible) or ProtoBufs (not schema-oblivious). Various binary formats with a JSON-like object model seem like the way to go (my choice is CBOR).
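To make the trade-off concrete, here is a minimal sketch in Python, assuming the third-party cbor2 package; the record is made up for illustration:

    import json

    import cbor2  # third-party: pip install cbor2

    # A made-up record; any JSON-compatible value works.
    record = {"id": 12345, "name": "widget", "tags": ["new", "sale"], "price": 9.99}

    as_json = json.dumps(record).encode("utf-8")
    as_cbor = cbor2.dumps(record)

    # CBOR keeps the same object model (maps, arrays, strings, numbers)
    # but encodes types and lengths in binary, so it is smaller and
    # cheaper to parse than the equivalent JSON text.
    print(len(as_json), len(as_cbor))

The point is not the exact byte counts but that you keep the schema-oblivious JSON data model while dropping the cost of parsing text.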
And then there is the EU-wide absurdity of WhateverAdES, which invariably leads to onion-like layers of XML in ASN.1, encoded as base64 in XML, wrapped in a DER-encoded CMS message...
I beg to differ. For a start, XML compresses well, and besides, storage is dirt cheap these days. XML is a better storage format because it documents what the data is (a title, a reference, etc.) as well as the data itself.
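As a toy illustration (made-up data, Python's standard gzip module), repetitive markup is exactly what general-purpose compressors eat for breakfast:

    import gzip

    # Made-up document: highly repetitive markup, as schema-driven XML tends to be.
    xml_doc = b"<book><title>Dune</title><author>Frank Herbert</author></book>" * 1000

    # The tag overhead all but disappears once compressed.
    print(len(xml_doc), len(gzip.compress(xml_doc)))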
XML does compress well as text or over the wire, but the parse trees can consume a lot of memory and processing time. At least in Perl, I've had enough scripts crash due to this overhead when implementing the common/naive solution with off-the-shelf modules. You can get around this by choosing between DOM and SAX, but I consider that a symptom of the problem: you chose XML to solve a problem, and now you have another problem to solve.
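In Python terms the workaround looks something like this sketch: stream with iterparse instead of building a full DOM, and throw subtrees away as you go. The file name, tag name, and handler are hypothetical:

    import xml.etree.ElementTree as ET

    def process(elem):
        # Hypothetical per-record handler.
        print(elem.findtext("title"))

    # Stream the document instead of loading a full DOM tree:
    # only one <record> subtree is held in memory at a time.
    for event, elem in ET.iterparse("huge.xml", events=("end",)):
        if elem.tag == "record":
            process(elem)
            elem.clear()  # drop the subtree so memory stays bounded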
I had the same problem with npm and JSON, I think: npm could not simply load a huge JSON file into memory. A huge anything can crash a naively written tool built to handle smaller instances.
That's true, but I think XML has the edge there. It has so many features, like defining new types, that you wouldn't normally see in JSON. One parser we used had a ten-to-one ratio: 50 MB of XML meant 500 MB of RAM usage with a DOM parser. And that's despite the textual representation of XML already being >50% bloat with the closing tags etc.