While this is true of JSON, it's also true of any other non-trivial serialization and/or encoding format. The main lessons to learn here are that:
1) implementation matters
2) "simple" specs never really are
It's definitely important to have documents like this one that explore the edge cases and the differences between implementations, but you can replace "JSON" in the introductory paragraph with any other serialization format, encoding standard, or IPC protocol and it would remain true:
" is not the easy, idealised format as many do believe. [There are not] two libraries that exhibit the very same behaviour. Moreover, [...] edge cases and maliciously crafted payloads can cause bugs, crashes and denial of services, mainly because libraries rely on specifications that have evolved over time and that left many details loosely specified or not specified at all."
No, this is not true of many reasonable formats. You don't have to make an obtusely nontrivial format to encode the data JSON does.
JSON is fairly trivial. The post is a nonsensical rant about parsers accepting documents that aren't strictly JSON-compliant (which the JSON spec explicitly says parsers may do), such as ones with trailing commas.
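For what it's worth, here is a small illustrative sketch using Python's standard-library parser, which happens to be strict about this particular extension; another parser could accept the same document and still be conforming:

    import json

    # CPython's bundled parser rejects the trailing comma outright...
    try:
        json.loads('{"a": 1,}')
    except json.JSONDecodeError as exc:
        print("rejected:", exc)

    # ...yet a parser that accepts it can still claim conformance, since the
    # spec permits implementations to accept non-JSON forms and extensions.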
In the large colored matrix, the following colors mean everything is fine: green, yellow, light blue and deep blue.
Red marks crashes (things like 10,000 nested arrays causing a stack overflow, a parser bug that isn't specific to JSON), and dark brown marks constructs that should have been supported but weren't (things like UTF-8 handling, again parser bugs that aren't specific to JSON).
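To make the "crash" category concrete, here is a hedged sketch of the nesting-depth issue using Python's standard-library parser; the exact failure mode depends on the interpreter and its recursion limit, but the guarded behaviour shown here is typical of CPython:

    import json

    # A syntactically valid document that is nothing but deeply nested arrays.
    deep = "[" * 100_000 + "]" * 100_000

    try:
        json.loads(deep)
    except RecursionError as exc:
        # CPython's json module guards its own recursion, so this surfaces as
        # an exception; a parser without such a guard can overflow the stack
        # and take the whole process down.
        print("refused:", exc)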
Writing parsers can be tricky, but JSON is certainly not a hard format to parse.
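To put some code behind that claim, here is a minimal, hedged sketch of a recursive-descent parser for a JSON subset in Python (all names are made up for illustration). It deliberately leaves out \uXXXX escapes, surrogate pairs, strict number validation and graceful errors for truncated input, which are exactly the corners where real implementations diverge, so read it as an outline of the structure rather than a conforming parser:

    # Minimal recursive-descent parser for a JSON subset (illustration only).
    # Omitted on purpose: \uXXXX escapes, surrogate pairs, strict number
    # validation, and graceful errors for truncated input.

    def parse(text):
        value, pos = _value(text, _ws(text, 0))
        if _ws(text, pos) != len(text):
            raise ValueError("trailing characters after the top-level value")
        return value

    def _ws(s, i):                      # skip insignificant whitespace
        while i < len(s) and s[i] in " \t\n\r":
            i += 1
        return i

    def _value(s, i):
        c = s[i]
        if c == "{":
            return _object(s, i)
        if c == "[":
            return _array(s, i)
        if c == '"':
            return _string(s, i)
        for lit, val in (("true", True), ("false", False), ("null", None)):
            if s.startswith(lit, i):
                return val, i + len(lit)
        return _number(s, i)

    def _object(s, i):
        out, i = {}, _ws(s, i + 1)
        if s[i] == "}":
            return out, i + 1
        while True:
            key, i = _string(s, i)
            i = _ws(s, i)
            if s[i] != ":":
                raise ValueError("expected ':' at offset %d" % i)
            val, i = _value(s, _ws(s, i + 1))
            out[key] = val
            i = _ws(s, i)
            if s[i] == ",":
                i = _ws(s, i + 1)
            elif s[i] == "}":
                return out, i + 1
            else:
                raise ValueError("expected ',' or '}' at offset %d" % i)

    def _array(s, i):
        out, i = [], _ws(s, i + 1)
        if s[i] == "]":
            return out, i + 1
        while True:
            val, i = _value(s, i)
            out.append(val)
            i = _ws(s, i)
            if s[i] == ",":
                i = _ws(s, i + 1)
            elif s[i] == "]":
                return out, i + 1
            else:
                raise ValueError("expected ',' or ']' at offset %d" % i)

    def _string(s, i):
        if s[i] != '"':
            raise ValueError("expected a string at offset %d" % i)
        escapes = {'"': '"', "\\": "\\", "/": "/", "b": "\b", "f": "\f",
                   "n": "\n", "r": "\r", "t": "\t"}
        parts, i = [], i + 1
        while s[i] != '"':
            if s[i] == "\\":
                parts.append(escapes[s[i + 1]])   # \uXXXX left out on purpose
                i += 2
            else:
                parts.append(s[i])
                i += 1
        return "".join(parts), i + 1

    def _number(s, i):
        j = i
        while j < len(s) and s[j] in "-+.eE0123456789":
            j += 1
        token = s[i:j]
        try:
            return int(token), j
        except ValueError:
            return float(token), j    # float() raises on real garbage

Calling parse('{"a": [1, 2.5, null]}') returns the expected dict; the core grammar fits in a screenful, and the genuinely fiddly parts are exactly the pieces left out above.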
Actually, unless one is doing JavaScript, JSON is extremely difficult to parse correctly. I challenge you to write a simple, understandable JSON parser in Bourne shell or in AWK.
> JSON is extremely difficult to parse correctly ... in Bourne shell or in AWK.
Sorry for the misquote, but does it get to the heart of your objection?
I'm torn here. On the one hand I want to say "Those are not languages one typically writes parsers in," but that's a really muddled argument:
1. People "parse" things often in bash/awk because they have to -- because bash etc deal in unstructured data.
2. Maybe "reasonable" languages should be trivially parseable so we can do it in Bash (etc).
So I'm still kinda torn: on the one hand bash is unreasonably effective, on the other I want data types to be opaque so people don't even try to parse them... I'd love to hear arguments, though.
Shell and AWK programs have the fewest dependencies, are extremely light on resource consumption and are extremely fast. When someone’s program emits JSON because the author assumed that the programs ingesting that data will also be JavaScript programs, that’s a really bad assumption: it would force the consumer of that data to replicate the environment. This goes against core UNIX principles, as discussed at length in the book “The Art of UNIX Programming”. It’s a rookie mistake to make.
It won't, because JSON is a standard. Imperfect, like all standards, but practically good enough. And "plain text" just means "an undefined syntax that I have to mostly guess". And nobody "programs" in bash or awk anymore. The "standard scripting languages" for all sane devs are Python or Ruby (plus some legacy Perl), and parsing JSON in them is trivial.
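As a sketch of how little code that takes (assuming, for illustration, that the JSON arrives on stdin):

    import json
    import sys

    # One call turns the incoming document into plain Python objects,
    # or raises json.JSONDecodeError if it isn't valid JSON.
    data = json.load(sys.stdin)
    print(type(data).__name__, data)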
The "UNIX philosophy" was a cancerous bad idea anyway and now it's thankfully being eaten alive by its own children, so time to... rejoice?!
EDIT+: Also, if I feel truly lazy/evil (like in "something I'll only use in this JS project"), I would use something much, much less standard than JSON, like JSON5 (https://github.com/json5/json5), which will in practice force all consumers to use JS :P