I have huge respect for Doug Crockford, and I never imagined I would disagree with him.

However, I think by now we've seen that a lot of that "unnecessary" XML complexity was not, in fact, entirely unnecessary. These days we use JSON for everything, but now we've got JSON Schema, Swagger/OpenAPI, Zod, etc. It's not really simpler and there's a lot of manual work - we might as well be using XML, XSD & SOAP/WSDL.

With XML, the complexity is the baseline, and it only goes up from there. With JSON, the complexity is just an option; the baseline is pretty simple. Also, good XML tools are rare or expensive.

Baseline for XML would be a document that doesn't use schemas, namespaces, attributes, or any of the SGML legacy stuff like DTDs and PCDATA.

Such a document is essentially as simple as the equivalent JSON.

Even that is more complicated than JSON.

Care to elaborate?

Every key is written twice, for opening and closing. Keys can be duplicated, and in fact that's what you have to do if you want a simple list. There aren't numeric types, so you have to parse strings. It also looks horrible.

  
    Led Zeppelin IILed Zeppelin999
    La BriseArax999
  
or

  
    
      Led Zeppelin II
      Led Zeppelin
      999
    
    
      La Brise
      Arax
      999
    
  
vs something like

  [
    {"title": "Led Zeppelin II", "artist": "Led Zeppelin", "price": 999},
    {"title": "La Brise", "artist": "Arax", "price": 999}
  ]
You can probably do better using XML attributes. But then you're using more features.
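To make the numeric-type point concrete, here is a minimal sketch. The XML side is stood in for by a plain string literal (an assumption, to avoid pulling in an XML parser), since element text always arrives as a string.

```javascript
// JSON has a number type, so the price arrives as a number.
const fromJson = JSON.parse('{"title": "La Brise", "artist": "Arax", "price": 999}');
const jsonPriceType = typeof fromJson.price;

// Text content of an XML element is always a string. This literal stands in
// for the text of a hypothetical <price>999</price> element.
const xmlPriceText = "999";
const xmlPriceType = typeof xmlPriceText;

// Converting it is the caller's job, and so is deciding what counts as valid.
const xmlPrice = Number(xmlPriceText);
```

A schema language such as XSD can declare the element as an integer, but that is one of the extra features the baseline document was supposed to avoid.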

If we are complaining about the closing tags, might as well add that embedding newlines or quotes into JSON is less than pleasant.
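As a small illustration of the newline and quote point, assuming nothing beyond the built-in JSON object: both characters have to be backslash-escaped inside a JSON string, which is what makes them unpleasant to write by hand.

```javascript
// A value containing double quotes and a line break.
const text = 'She said "hi"\nand left';

// The serializer escapes both, so the encoded form stays on a single line.
const encoded = JSON.stringify({ note: text });

// Parsing restores the original characters.
const roundTripped = JSON.parse(encoded).note;

console.log(encoded);
```

The machine does the escaping for you here; the complaint only really bites when the document is typed by hand.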

Which is to say, this feels like a bit of a non-issue. Yes, writing it by hand can get tedious, but that is true of any and every format. That is why you will almost certainly reach for other formats if you are writing a long list of data. And each and every one of them will fail for some form of input in ways that are frustrating.

you can't ignore ux stuff like this in a protocol that's meant for general use

something like duplicating info in closing tags in XML (which applies to every element) isn't really comparable to stuff like having to escape certain characters in JSON strings (which applies only to the values that use those things)

perfect is the enemy of the good, and the good is the metric

Don't you also have to escape stuff in XML? Like &gt;, which is even worse.

Yes, though many languages have lenient parsers. Most browser parsers, for example, will probably only be lenient if parsing "HTML."

    new XMLSerializer().serializeToString(new DOMParser().parseFromString("hello < ", "text/html")) 
In my console, the above does what you would expect. And again, entities are a very dangerous part of XML and friends.
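For the escaping itself, a rough sketch of what writing XML text by hand implies. The helper name `escapeXml` is made up for illustration; it covers only the five predefined XML entities and is not a substitute for a real serializer.

```javascript
// Hypothetical helper: replace the five characters that have predefined
// XML entities. The ampersand must go first so that the entities produced
// by the later replacements are not escaped a second time.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

const escaped = escapeXml('AC/DC & "Friends" <live>');
console.log(escaped);
```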

You are correct that if you tell it that the input is XML, the browser will throw it back at you. Just as the JSON parser will barf on JSON.parse("{'test':'value'}").

per specifications, json parsing is not lenient, html parsing is lenient

Right, and amusingly, more than a few json parsers are very lenient in this. That or folks abandon ship fairly quickly and go for another spec that is far more friendly.

well json definitely does not accept `{'test':'value'}` as valid input

any parser that behaves otherwise is pretty clearly buggy

json has many problems but parsing ambiguity is not really one of them
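a quick way to check this against the built-in parser, as a sketch (the `tryParse` wrapper is just a made-up helper so the failures can be inspected instead of thrown):

```javascript
// Hypothetical wrapper around JSON.parse that reports failure as a value.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// Single-quoted strings are not part of the JSON grammar.
const single = tryParse("{'test':'value'}");

// The double-quoted equivalent parses fine.
const double = tryParse('{"test":"value"}');

// Trailing commas are rejected as well.
const trailing = tryParse("[1, 2,]");

console.log(single, double, trailing);
```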

Methinks you have never looked at the field. I'd as soon declare that csv is an error-free format. That is only true if you ignore the proliferation of applications that get it wrong. In subtle ways, often. Still wrong.

csv is wildly ambiguous, to the frustration of ~every data science engineer in industry

json is not

show me an application that parses `{'a':'b'}` as valid JSON, i'm actually interested, probably there are some which exist, but there is no ambiguity about those applications being wrong

fun doc! it lists many of the undefined behaviors of the spec, and many of the problems in common parsers

afaict none of them permit keys or value strings to be expressed with single quotes

Apologies for the, in retrospect, somewhat lazy posting of an article with no comment. I thought that article had a section about how many of them allow single quotes if you don't "enable strict." I am not seeing it on review, though; so either I made that up in my mind, or I'm remembering another article. Either way, apologies.

I did find https://github.com/json5/json5 on a quick search, which basically says what I asserted about people just jumping to another standard for things that you write by hand. I was probably also thinking heavily about Python's dict syntax. (And I confess, I still don't know when to use single versus double quotes in Python...)