I program in C, and many of the reasons they mention here are things I like about C programming. They use C89 (and sometimes C99), although I do use some of the GNU extensions (which, as far as I know, both GCC and Clang support).
I would want to use separate programs for displaying videos, whether or not the operating system includes them. Being able to play, pause, seek, set caption styles (including size, colour, and opacity), record it on a DVD, etc, would be helpful, but that can be whatever program you decide to use that has the features you wanted; whoever made the video or wants to send it to you should not need to care which of these features you are using (although they will need to include captions if you want them).
I wrote a program to record videos from HLS, including the option to avoid downloading commercial breaks. However, some things are still missing and/or might not work properly. (Some things, such as converting it to the DVD video format and then recording it onto a DVD, are done with separate programs and are out of scope of this one.)
I don't like AI images or stock images. If a picture specific to that article is needed to describe something (or is helpful; sometimes it is not needed but can be helpful), add it, but stock images and AI images are not helpful, and can sometimes be deceptive in some contexts (although so can pictures specific to the article, if the pictures are badly made).
Often, the text will be good enough, or better (since then you do not need to download the picture, it does not take up space on the screen (or on paper if printed out), etc.).
There are also some things in C that do not work or work differently in C++, such as (void*), empty structures (which in C++ are not really empty), etc; and there is also C++ stuff such as name mangling, the C++ standard library, etc, which come along even if those things are not a part of your program, which is another reason why you might prefer C.
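As a small illustration of the (void*) difference (only a sketch; the function name make_int_array is made up for this example): in C, the result of calloc converts to int* with no cast, while a C++ compiler rejects the same line unless a cast is added.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n zero-initialized ints.
   In C, void * converts implicitly to int *, so no cast is needed.
   A C++ compiler would reject the initialization below without a cast. */
int *make_int_array(size_t n)
{
    int *p;
    assert(n > 0);
    p = calloc(n, sizeof *p);
    return p; /* NULL if the allocation failed */
}
```

The caller is expected to check for NULL and to free() the result.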
I do not agree that restricting it to UTF-8 (or to Unicode in general) is a fair and reasonable design decision (although UTF-8 may be reasonable if Unicode is somehow required anyways (you should avoid requiring Unicode if you can though), especially if the program is also expected to deal with ASCII in addition to requiring Unicode). Regardless of that, the number of code points is not usually relevant (and substring operations indexed by code points are not usually necessary either), and the number of bytes will be more important. Some programs should not need to know about the character encoding at all (or should only have a limited consideration of what they do with it).
(One reason you might care about the number of code points is because you are converting UTF-8 to UTF-32 (or Shift-JIS to TRON-32 or whatever else) and you want to allocate the memory ahead of time. The number of characters (which is not the same as the number of code points in the case of Unicode, although for other character sets it might be) is probably not important; if you want to display it, you will care about the display width according to the font, and if you are doing editing then where one character starts and ends is going to be more significant than how many characters there are. If you are using and indexing by the number of code points a lot (even though as I say that should not usually be necessary), then you might use UTF-32 instead of UTF-8.)
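For the allocation case, counting code points in UTF-8 is simple. Here is a minimal sketch, assuming the input is already valid UTF-8 (it does no validation); the function name is made up for this example. It counts every byte that is not a continuation byte (10xxxxxx).

```c
#include <assert.h>
#include <stddef.h>

/* Count the code points in a UTF-8 buffer of len bytes.
   Every code point has exactly one byte that is not a continuation
   byte (10xxxxxx), so counting those bytes gives the answer.
   Assumption: the input is valid UTF-8; nothing is validated here. */
size_t utf8_count_code_points(const unsigned char *s, size_t len)
{
    size_t i, n = 0;
    assert(s != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) n++;
    }
    return n;
}
```

The result multiplied by 4 is the number of bytes to allocate for the UTF-32 output; nothing else in the conversion needs the count.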
(It is also my opinion that Unicode is not a good character set.)
> data types won't matter (hence XML doesn't have them, but after that JSON got them back)
JSON does not have very many or very good data types either, but (unlike XML) at least JSON has data types. ASN.1 has more data types (although standard ASN.1 lacks one data type that JSON has (key/value list), ASN.1X includes it), and if DER or another BER-related format is used then all types use the same framing, unlike JSON. One thing JSON lacks is an octet string type, so instead you must use hex or base64, which must be converted after it has been read rather than during reading, because it is not a proper binary data type.
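To show what the extra conversion step looks like, here is a rough sketch of decoding a hex string after the JSON parser has already handed it over as text. The names hex_digit and hex_decode are made up for this example, and it assumes an ASCII-compatible character set.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit.
   Assumption: ASCII-compatible character set. */
static int hex_digit(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode hex_len hex digits into out (capacity out_cap bytes).
   Returns the number of bytes written, or -1 if the length is odd,
   a digit is invalid, or the output buffer is too small. */
long hex_decode(const char *hex, size_t hex_len,
                unsigned char *out, size_t out_cap)
{
    size_t i;
    assert(hex != NULL && out != NULL);
    if (hex_len % 2 != 0 || hex_len / 2 > out_cap) return -1;
    for (i = 0; i < hex_len; i += 2) {
        int hi = hex_digit((unsigned char)hex[i]);
        int lo = hex_digit((unsigned char)hex[i + 1]);
        if (hi < 0 || lo < 0) return -1;
        out[i / 2] = (unsigned char)(hi * 16 + lo);
    }
    return (long)(hex_len / 2);
}
```

With a real octet string type (as in DER), the bytes would be available directly once the framing has been read, and this pass would not exist.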
> The funny thing about namespaces is that the prefix, by the XML docs, should be meaningless -- instead you should look at the URL of the namespace. It's like if we read a doc with snake:front-left-paw, and ask how come does a snake have paws? -- Because it's actually a bear -- see the definition of snake in the URL!
This is true of any format that you can import with your own names though, and since the names might otherwise conflict, it can also be necessary. This issue is not specific to XML (and JSON does not have namespaces at all, although some application formats that use it try to add them in some ways).
The <article> command in HTML can be useful, even if most implementations do not do much with it. For example, a browser could offer the possibility to print or display only the contents of a single <article> block, or to display marks in the scrollbar for which positions in the scrollbar correspond to the contents of the <article> block. It would also be true of <time>; although many implementations do not do much with it, they could do stuff with it. And, also of <h1>, <h2>, etc; although browsers have built-in styles for them, allowing the end user to customize them is helpful, and so is the possibility of using them to automatically display the table of contents in a separate menu. None of these behaviours should need to be standardized; they can be by the implementation and by the end user configuration etc; only the meaning of the commands will be standardized, not their behaviour.
"Meaning" has a rather vague meaning, but behavior is exact. If I know the behavior, it becomes a tool I can employ. If I only know supposed behavior, I cannot really use that. E.g. why we have so much SEO slop and so little "semantic" HTML? Because the behavior of search engines is real and thus usable, even when it is not documented much.
Different formats are good for different purposes. XML does have some benefits (as described in there), as well as some problems; the same is true of JSON. They do not mention ASN.1, although it also has many benefits. Also, the different formats have different data types, different kinds of structures, etc, as well.
XML only has text data (although other kinds can be represented, it isn't very good at doing so), and the structure is named blocks which can have named attributes and plain text inside; and is limited to a single character set (and many uses require this character set to be Unicode).
XML does not require a schema, although it can use one, which is a benefit, and as they say, it works better than JSON schema. Some ASN.1 formats (such as DER) can also be used without a schema, although they can also use a schema.
My own nonstandard TER format (for ASN.1 data) does have comments, although the comments are discarded when being converted to DER.
Namespaces are another benefit in XML, that JSON does not have. ASN.1 has OIDs, which have some of this capability, although not as much as XML (although some of my enhancements to ASN.1 improve this a bit). However, there is a problem with using URIs as namespaces which is that the domain name might later be assigned to someone else (ASN.1 uses OIDs which avoids this problem).
My nonstandard ASN1_IDENTIFIED_DATA type allows an ASN.1X data file to declare its own schema, and also has other benefits in some circumstances. (Unlike XML and unlike standard ASN.1, you can declare that it conforms with multiple formats at once, you can declare conformance with something that requires parameters for this declaration, and you can add key/value pairs (identified by OIDs) which are independent of the data according to the format it is declared as.)
(I have other nonstandard types as well, such as a key/value list type (called ASN1_KEY_VALUE_LIST in my implementation in C).)
XSLT is a benefit with XML as well, although it would also be possible to make a similar thing with other formats (for databases, there is SQL (and Tutorial D); there is not one for ASN.1 as far as I know but I had wanted such a thing, and I have some ideas about it).
The format XML is also messy and complicated (and so is YAML), compared with JSON or DER (although there are many types in DER (and I added several more), the framing is consistent for all of them, and you do not have to use all of the types, and DER is a canonical form which avoids much of the messiness of BER; these things make it simpler than what it might seem to some people).
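To show what "consistent framing" means, here is a rough sketch of reading the header of one DER item. This is not my actual implementation; the names are made up for this example, and it is deliberately limited: single-octet identifiers only (tag numbers below 31), at most 4 length octets, and it does not check that the length uses the minimal encoding that DER requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure for this sketch. */
struct der_header {
    unsigned char tag;  /* identifier octet: class, constructed bit, tag number */
    size_t length;      /* number of content octets */
    size_t header_len;  /* octets used by the identifier and length */
};

/* Read the header of the item at the start of buf (len bytes available).
   Returns 0 on success, -1 on error. The same code works for every
   type, because every type is framed as identifier, length, contents. */
int der_read_header(const unsigned char *buf, size_t len, struct der_header *h)
{
    size_t i, n;
    assert(buf != NULL && h != NULL);
    if (len < 2) return -1;
    if ((buf[0] & 0x1F) == 0x1F) return -1; /* multi-octet tag: not handled */
    h->tag = buf[0];
    if (buf[1] < 0x80) {
        /* Short form: the octet is the length itself. */
        h->length = buf[1];
        h->header_len = 2;
    } else {
        /* Long form: low 7 bits give the number of length octets. */
        n = buf[1] & 0x7F;
        if (n == 0 || n > 4 || len < 2 + n) return -1; /* 0 is indefinite: not DER */
        h->length = 0;
        for (i = 0; i < n; i++) h->length = (h->length << 8) | buf[2 + i];
        h->header_len = 2 + n;
    }
    if (h->length > len - h->header_len) return -1; /* contents truncated */
    return 0;
}
```

A reader that does not understand some type can still skip it by advancing header_len + length bytes, without needing a schema.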
Any text format (XML, JSON, TER, YAML, etc) will need escaping to properly represent text; binary formats don't, although they have their own advantages and disadvantages as well. As mentioned in the article, there are some binary XML formats as well; it seems to say that EXI requires a schema (which is helpful if you have a schema, although there are sometimes reasons to use the format without a schema; this is also possible with ASN.1, e.g. PER requires a schema but DER does not).
Data of any format is not necessarily fully self-descriptive, because although some parts may be self-described, it cannot describe everything without the documentation. The schema also cannot describe everything (although different schema formats might have different capabilities, they never describe everything).
> When we discarded XML, we lost: ...
As I had mentioned, other formats are capable of this too.
> What we gained: Native parsing in JavaScript
If they mean JSON, then JSON was made from the syntax of JavaScript, although before JSON.parse was added into standard JavaScript they might have used eval and caused many kinds of problems with that. Also, if you are using JavaScript then the data model is what JavaScript does, although that is a bit messy. Although JavaScript now has an integer type, it did not have one at the time that JSON was made up, so JSON cannot use the integer type.
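One practical consequence: a JavaScript number is an IEEE 754 double, so a reader that stores JSON numbers that way cannot keep every 64-bit integer exactly. A small sketch to check this, assuming that double is IEEE 754 binary64 on your platform and using C99 long long (the function name is made up for this example):

```c
#include <assert.h>

/* Returns 1 if v is unchanged by a round trip through double, else 0.
   Assumptions: double is IEEE 754 binary64, and |v| < 2^62 so that
   converting back to long long cannot overflow. */
int fits_in_double(long long v)
{
    volatile double d; /* volatile: force rounding to double precision */
    assert(v > -(1LL << 62) && v < (1LL << 62));
    d = (double)v;
    return (long long)d == v;
}
```

Under those assumptions, 9007199254740992 (2^53) survives the round trip but 9007199254740993 does not, which is why large identifiers are often sent in JSON as strings instead.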
> I am tired of lobotomized formats like JSON being treated as the default, as the modern choice, as the obviously correct solution. They are none of these things.
I agree and I do not like JSON either, but usually XML is not good either. I would use ASN.1 (although some things do not need structured data at all, in which case ASN.1 is not necessary either).
(Also, XML, JSON, and ASN.1 are all often badly used; a better format does not mean that the schema for the specific application will be good; it can also be badly designed, and in my experience it often is.)