I don't have much experience in this area, but I'd be interested in an overview of how different pieces of software handle the concurrent / multiplayer editing problem, like:

- Etherpad

- Google docs

- Apache / Google Wave (open sourced: http://incubator.apache.org/projects/wave.html)

- repl.it https://repl.it/site/blog/multi

- figma https://www.figma.com/blog/multiplayer-editing-in-figma/ (image editing rather than text editing)

So is the gist that OT relies on central servers, and that all of these use OT rather than CRDTs? That wasn't entirely clear to me.

Looking at the xi analysis, it sounds like "IME" is specific to a desktop application using X (?), so it doesn't apply to any of this web software.

The rest of them seem like they do apply?

Is the problem that you have to pay a "CRDT tax" for every piece of state in the application? I thought the same was true of OT. Don't you have to express every piece of state within those constraints too?

repl.it doesn't seem to have a problem with syntax highlighting or brace matching (try it, it's pretty slick). So is it just that they paid that tax with a lot of code, or is there something fundamentally different about xi vs. repl.it? Or maybe xi is going for much lower latency than web-based editors?

recent thread about OT vs. CRDT that might be interesting: https://news.ycombinator.com/item?id=18191867

TL;DR: CRDTs are completely irrelevant to any of the highlighting/etc. stuff.

Most highlighters are lexers. Advanced highlighters/folders are parsers.

The lexing/parsing that is required for highlighting is easy to make incremental for all sane programming languages.

For LL(*) grammars, adding incrementality is completely trivial (I sent patches to ANTLR4 to do this).

For LR(k) grammars, it's more annoying but possible (tree-sitter does this).
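
To make that concrete, here is a rough sketch of incremental re-parsing with tree-sitter's web bindings (web-tree-sitter). The grammar .wasm path and the specific edit are illustrative, not taken from any particular project:

    import Parser from "web-tree-sitter";

    // Rough sketch: re-parse after a small edit, reusing the old tree.
    async function demo() {
      await Parser.init();
      const parser = new Parser();
      // Illustrative grammar; any compiled tree-sitter grammar works here.
      parser.setLanguage(await Parser.Language.load("tree-sitter-javascript.wasm"));

      const oldSource = "let x = 1;";
      const oldTree = parser.parse(oldSource);

      // The edit: replace "1" (offsets 8..9) with "42" (offsets 8..10).
      const newSource = "let x = 42;";
      oldTree.edit({
        startIndex: 8,
        oldEndIndex: 9,
        newEndIndex: 10,
        startPosition: { row: 0, column: 8 },
        oldEndPosition: { row: 0, column: 9 },
        newEndPosition: { row: 0, column: 10 },
      });

      // Passing the edited old tree lets tree-sitter reuse unchanged subtrees
      // instead of re-parsing the whole file.
      const newTree = parser.parse(newSource, oldTree);
      console.log(newTree.rootNode.toString());
    }

    demo();

The editor only has to describe what changed (old and new character ranges); the parser takes care of reusing everything outside the damaged region.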

For lexing, doing it optimally is annoying; doing it near-optimally is very easy.

Optimal incremental lexing requires tracking, on a per-token basis, how far ahead in the character stream the recognizer looked (easy), and computing the affected sets (annoying).

Most real programming languages have a lookahead of 1 or 2. Near-optimal incremental lexing requires tracking only the maximum lookahead the lexer ever uses and assuming every token needed that maximum. In a world where the minimum token length is 1, that means you only need to re-lex an additional (max lookahead) tokens before the changed range. In a world where the minimum token length is 0, it's all the zero-length tokens plus (max lookahead) tokens before the changed range. This does not require any explicit per-token computation.

Again for basically all programming languages, this ends up re-lexing 1 or 2 more tokens total than strictly necessary.
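
A minimal sketch of that restart rule (token shape and names are mine, assuming a minimum token length of 1):

    // Given the old token stream and the character offset where an edit
    // starts, find the first token to re-lex: back up maxLookahead tokens
    // before the first token that reaches the edit, since any of those may
    // have peeked into the changed text when it was originally recognized.
    interface Token {
      start: number; // offset of the token's first character
      end: number;   // offset one past its last character
    }

    function firstTokenToRelex(tokens: Token[], editStart: number, maxLookahead: number): number {
      let i = tokens.findIndex(t => t.end > editStart);
      if (i === -1) i = tokens.length; // edit is past the last token
      return Math.max(0, i - maxLookahead);
    }

With a max lookahead of 1 or 2, that's the one or two extra tokens mentioned above.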

Tree-sitter does context-aware on-demand lexing. I have patches on the way for ANTLR to do the same.

The only thing CRDT helps with in this equation at all is knowing what changed and producing the sequence of tree-edits for the lexer/parser.

The lexer only cares about knowing which character ranges changed, which does not require a CRDT. The typical model for this kind of edit is what vscode's document text changes provide (i.e., for a given text edit: old start, old end, new start, new end).
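
For example, here is roughly what that looks like (field names are illustrative, not vscode's exact types):

    // A vscode-style content change, assumed here to carry the edit's start
    // offset, the length of the replaced text, and the inserted text.
    interface ContentChange {
      rangeOffset: number;
      rangeLength: number;
      text: string;
    }

    // The four numbers the incremental lexer actually needs.
    interface EditRange {
      oldStart: number;
      oldEnd: number;
      newStart: number;
      newEnd: number;
    }

    function toEditRange(change: ContentChange): EditRange {
      return {
        oldStart: change.rangeOffset,
        oldEnd: change.rangeOffset + change.rangeLength,
        newStart: change.rangeOffset,
        newEnd: change.rangeOffset + change.text.length,
      };
    }

Nothing in that pipeline needs a CRDT; it's just character ranges.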

The parser only cares about what token ranges changed, which does not require CRDT.

Very interesting and informative. But I suppose what the user really wants is the next step: not just incremental parsing, but also incremental compilation (or at least the error-checking part of the compilation process).

The Skip language (which I mentioned in another comment) is a language based entirely on incremental computation. Every piece of code you write in it becomes incremental, as far as I know.

AND the compiler is self-hosted!!! So that means they have a fully incremental compiler.

There is an LLVM back end, though, so maybe only the front end of the compiler is incremental? That's still important. It comes from the same people as the Hack language, which I believe has a fast "online" typechecker -- which I think is incremental (?)

I believe you could write a C compiler in Skip the same way?

If you peek at the source code, it has a 6000-line recursive descent parser and a 6000-line type checker, much in the style of OCaml (but in Skip).

I think this deserves a blog post because it's not clear to me that the message of Skip got through (7 months ago on HN). Their website has some examples, but not too many.

Of course, I haven't tried it yet, and it's a super ambitious project, so there could be some big problem with it.

But if the goal is to make an entire suite of compiler and devtools that are incremental, then Skip seems like it's ahead of the state of the art and something to borrow from.

Salsa was also mentioned, and I believe it's based on some of the same academic work, like Adapton. But I don't know if they have an entire self-hosted compiler as a demo like Skip does. It seems to be "query-oriented" rather than supporting arbitrary code, which is what I would expect from most incremental computing frameworks. Skip seems to be more ambitious.
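
To illustrate what "query-oriented" means (a toy sketch of the general style, not Salsa's actual API): each derived value is a memoized query over its inputs, and a query is recomputed only when the inputs it read have changed.

    // Toy query database: setSource updates an input; parse() is a derived
    // query that is only recomputed when the file's text actually changed.
    class QueryDb {
      private sources = new Map<string, string>();
      private parsed = new Map<string, { text: string; ast: string }>();

      setSource(file: string, text: string): void {
        this.sources.set(file, text);
      }

      parse(file: string): string {
        const text = this.sources.get(file) ?? "";
        const cached = this.parsed.get(file);
        if (cached && cached.text === text) return cached.ast; // reuse
        const ast = "ast(" + text + ")"; // stand-in for real parsing
        this.parsed.set(file, { text, ast });
        return ast;
      }
    }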

https://github.com/salsa-rs/salsa

I believe that the Skip language does this, because its compiler is self-hosted: http://skiplang.com/blog/2017/01/04/how-memoization-works.ht...

https://github.com/skiplang/skip/tree/master/src/frontend

Skip – A programming language to skip the things you have already computed https://news.ycombinator.com/item?id=18077612

https://github.com/skiplang/skip/blob/master/src/frontend/sk...