upvote
I've said it before, but I maintain that XML has only two real problems:

1. Attributes should not exist. They make the document suddenly have two dimensions instead of one, which significantly increases complexity. Anything that could be an attribute should actually be a child element.

2. There should be one close tag: `</>` which closes the last element, which burns a significant amount of space with useless syntax. Other than that and the self-closing `<tag />` (which itself is less useful without attributes) there isn't much that you need. Maybe a document close tag like `<///>`

You'll notice that, yes, JSON solves both of those things. That's a part of why it's so popular. The other is just that a lot more effort was put into maximizing the performance of JavaScript than shredding XML, and XSLT, the intended solution to this problem, is infamous at this point.

The problem of comments is kind of a non-issue in practice, IMO. You can just add a `"_COMMENT"` element or similar. Sure, yes, it will get parsed. But you shouldn't have that many comments that it will cause a genuine performance issue.

However, JSON still has two problems:

1. Schema support. You can't validate that a file before de-serializing it in your application. JSON Schema does exist, but it's support is still thin, IMX.

2. Many serializers are pretty bad with tabular data, and nearly all of them are bad with tabular data by default. So sometimes it's a data serialization format that's bad at serializing bulk data. Yeah, XML is worse at this. Yeah, you can use the `"colNames": ["id", ...], "rows": [ [1,...],[2,...] ]` method or go columnar with `"id": [1,2,...], "name": [...], "createDate": [...]`, but you had better be sure both ends can support that format.

In both cases, it seems like there is an attempt to resolve both of those issues. OpenAPI 3.1 has JSON schema included in it. The most popular JSON parsers seem to be adding tabular data support. I guess we'll see.

reply
A lot of people dislike that decision not to include comments in JSON, but I think while shocking it was and is totally correct.

In a programming language it's usually free to have comments because the comment is erased before the program runs; we usually render comments in grey text because they can't change the meaning of the program.

In a data language you have no such luxury. In a data language there's no comment erasure happening between the producer and the consumer, so comments are just dangerous as they would without doubt evolve into a system of annotations -- an additional layer of communication which would then not be standardized at all and which then would grow into a wild west of nonstandard features and compatibility workarounds.

reply
I don't dislike the decision at all, FWIW! For data interchange it's totally reasonable. But it does make JSON ill-suited for a bunch of applications that JSON has been forcefully and unfortunately applied to.
reply
> In a programming language it's usually free to have comments because the comment is erased before the program runs

That's inherent to the language specification, but it isn't inherent to the document. You have to have a system with rules that require that erasure.

Nothing prevents one from mandating a system that strips those comments out of JSON. You could even "compile" JSON to, I don't know, BSON or msgpack or something.

Just as nothing prevents one from creating tooling to, say, extract type annotations from comments in a dynamically typed language.

reply
> while shocking it was and is totally correct

Agreed —— consider how comments have been abused in HTML, XML, and RSS.

Any solution or technology that can be abused will be abused if there are no constraints.

reply
Could you imagine hitting a rest api and like 25% of the bytes are comments? lol
reply
Worse than that - people will start tagging "this value is a Date" via comments, and you'll need to parse ad-hoc tags in the comments to decode the data. People already do tagging in-band, but at least it's in-band and you don't have to write a custom parser.
reply
HTML and JS both have comments, I don't see the problem
reply
And both are poor interchange formats. When things stay in their lane, there is no "problem." When you try to make an interchange format using a language with too many features, or comments that people abuse to add parsable information (e.g. "type information") then there is a BIG problem.
reply
> so comments are just dangerous as they would without doubt evolve into a system of annotations -- an additional layer of communication which would then not be standardized at all and which then would grow into a wild west of nonstandard features and compatibility workarounds

IIRC Douglas Crockford explicitly stated that he saw people initially using comments for a purpose like ad hoc preprocessor directives.

reply
No, it was obviously and flagrantly incorrect, as evidenced by the success of interchange formats that do allow for comments, including many real world systems that pragmatically allow comments even when JSON says they shouldn't. This is Stockholm Syndrome.

But what can we expect from a spec that somehow deems comments bad but can't define what a number is?

reply