Lots of comments here about XML vs. JSON... but there are areas where the two don't compete. I'm thinking of text/document encoding (real annotated text, such as books).

Even though XML is still king here (see TEI and other standards), some of its limitations are a problem. Consider the following text:

    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Now say you want to qualify a part of it:

    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Now say you want to qualify another part, but it overlaps the previous one:

    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Of course, this is not well-formed XML... so we have to resort to dirty hacks like this:

    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Which means rather inefficient queries afterwards :-/
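To make the overlap problem concrete, here is a minimal TypeScript sketch. Everything in it is made up for illustration: the `Span` type, the character offsets, and the helper names `canNest` and `fragment`. It only shows why two overlapping ranges cannot live in one XML tree, and what a typical workaround (cutting one range into fragments that share a name) looks like.

```typescript
// Hypothetical character ranges over the sample sentence; `end` is exclusive.
interface Span { name: string; start: number; end: number }

// Two spans fit in one XML tree only if they are disjoint
// or one fully contains the other.
function canNest(a: Span, b: Span): boolean {
  const disjoint = a.end <= b.start || b.end <= a.start;
  const aInB = b.start <= a.start && a.end <= b.end;
  const bInA = a.start <= b.start && b.end <= a.end;
  return disjoint || aInB || bInA;
}

// One possible workaround: cut the overlapping span at the boundaries of the
// other span. Every fragment then nests cleanly, but they share a name and
// have to be rejoined whenever the annotation is queried as a whole.
function fragment(outer: Span, overlapping: Span): Span[] {
  const cuts = [outer.start, outer.end].filter(
    (c) => c > overlapping.start && c < overlapping.end
  );
  const points = [overlapping.start, ...cuts, overlapping.end];
  const parts: Span[] = [];
  for (let i = 0; i + 1 < points.length; i++) {
    parts.push({ name: overlapping.name, start: points[i], end: points[i + 1] });
  }
  return parts;
}

const first: Span = { name: "a", start: 6, end: 26 };
const second: Span = { name: "b", start: 18, end: 40 };
const pieces = fragment(first, second);
```

Here `second` starts inside `first` and ends outside it, so `canNest(first, second)` is false. After fragmenting, each piece nests, but a query for "everything annotated b" now has to collect and stitch the pieces back together, which is where the inefficiency comes from.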

You are absolutely right that XML is better for document structures.

My current theory is that Yjs [0] is the new JSON+XML. It gives you both JSON and XML types in one nested structure, all with conflict-free merging via incremental updates.

Also, you note the issue with XML and overlapping inline markup. Yjs has an answer for that with its text type: you can apply attributes (for styling or anything else) to arbitrary ranges, and those ranges can overlap.
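A rough sketch of the range-attribute idea, in plain TypeScript with no dependencies. To be clear, this is not the Yjs implementation or its API: `RangedText`, `toRuns`, and the attribute names are hypothetical, it assumes ranges stay within the text, and it ignores the CRDT/merging side entirely. In Yjs itself the corresponding call is, as far as I know, `ytext.format(index, length, attributes)`.

```typescript
type Attrs = Record<string, string | boolean>;
interface Run { text: string; attrs: Attrs }

// Toy model: text plus a flat list of attribute ranges (no tree, no nesting).
class RangedText {
  private text: string;
  private marks: { start: number; end: number; attrs: Attrs }[] = [];

  constructor(text: string) {
    this.text = text;
  }

  // Attach attributes to [index, index + length); ranges may overlap freely.
  format(index: number, length: number, attrs: Attrs): void {
    this.marks.push({ start: index, end: index + length, attrs });
  }

  // Flatten into consecutive runs, each carrying the merged attributes
  // of every range that covers it.
  toRuns(): Run[] {
    const points = [0, this.text.length];
    for (const m of this.marks) points.push(m.start, m.end);
    points.sort((a, b) => a - b);
    const bounds = points.filter((p, i) => i === 0 || p !== points[i - 1]);

    const runs: Run[] = [];
    for (let i = 0; i + 1 < bounds.length; i++) {
      const s = bounds[i];
      const e = bounds[i + 1];
      const attrs: Attrs = {};
      for (const m of this.marks) {
        if (m.start <= s && e <= m.end) {
          for (const k of Object.keys(m.attrs)) attrs[k] = m.attrs[k];
        }
      }
      runs.push({ text: this.text.slice(s, e), attrs });
    }
    return runs;
  }
}

const doc = new RangedText("Lorem ipsum dolor sit amet");
doc.format(0, 11, { bold: true });    // covers "Lorem ipsum"
doc.format(6, 11, { comment: "c1" }); // covers "ipsum dolor", overlapping the bold range
const runs = doc.toRuns();
```

The overlap on "ipsum" simply ends up as a run carrying both attributes. Nothing has to be split or rejoined in the stored data, because the ranges are kept alongside the text rather than nested inside it.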

Obviously I'm being a little hyperbolic in suggesting it will replace JSON; the beauty of JSON is its simplicity. But for many systems, building on Yjs or a similar CRDT-based serialisation system is the future.

Maybe what we need is a YjsSchema...

[0] https://github.com/yjs/yjs/