Cross-language interactions suck. We need WebAssembly components and good code generators for a critical mass of languages before people actually start to use Wasm across different languages.

Unpopular opinion: Users will eventually realize that the lowest common denominator between languages is ... a BYTE STREAM (e.g. JSON/CSV/HTML), or what I think of as shell / Unix / Web -style composition.

IDLs and code generators are useful in many limited domains (e.g. when you control both sides of the wire), but they bake in a lot of assumptions that people don't realize are language-specific.

e.g. Protobufs are very good for C++ <-> C++ communication, but Java and Python users seem to dislike them equally, and even Go users do too.

COM is probably better than what most people are proposing now -- it recognizes that the problem is dynamic, rather than trying to create the leaky abstraction of a "fake" static system. It's true that IDLs are reinvented every 5 / 10 / 20 years. (Related recent story: https://news.ycombinator.com/item?id=30128048)

I expect WASM will get some kind of component system (if it doesn't already exist), but many apps will still need to fall back to something more general.

-----

I'm writing about byte streams as a narrow waist of interoperability now, and this is the "lead in" review post: http://www.oilshell.org/blog/2021/12/review-arch.html

Even though disparate WebAssembly components can run in the same process (with memory potentially shared by the host), for something as wide as the "Web", the lowest common denominator is still the common text-based interchange formats we already have.

Related comment on WebAssembly: https://news.ycombinator.com/item?id=28581634

Programmers underestimate the degree to which languages and VMs are coupled. I question whether WebAssembly is truly polyglot: GC requires a richer type system in the VM, and those types create languages that are winners and losers. The losers are the language implementations that experience 2x-10x slowdowns.

> e.g. Protobufs are very good for C++ <-> C++ communication, but Java and Python users seem to dislike them equally, and even Go users do too.

Well, hilariously, when I looked at using it on a project, JSON outperformed protobuf in Python. Python's json module is implemented in C, while the protobuf library was pure Python, and the C decoder for a less efficient format won out.

(Now, technically, Protobuf has a C implementation, but at the time I was testing, it segfaulted reliably. Which is the problem with C…)
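A minimal sketch of that kind of measurement, using only the stdlib json module (the protobuf side is omitted here since it needs protoc-generated classes; the record and timings are illustrative, not from the original test):

```python
import json
import timeit

# A record roughly the shape of a small protobuf message (made up for illustration).
record = {"id": 12345, "name": "example", "tags": ["a", "b", "c"], "score": 0.98}
blob = json.dumps(record)

# json.loads dispatches to a C scanner in CPython, which is why it can
# beat a pure-Python decoder for a more compact binary wire format.
n = 10_000
secs = timeit.timeit(lambda: json.loads(blob), number=n)
print(f"{n} decodes in {secs:.3f}s")

assert json.loads(blob) == record  # round-trip is lossless for this data
```

The general lesson: the constant factor of the implementation language can matter more than the efficiency of the wire format itself.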

I also recall there were some severe problems with protobuf's ability to represent some types, but I don't remember what they are at this point. I thought it was something to do with sum types, but I'm looking at it now and it does have oneof, so IDK.
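For reference, proto3's oneof does give you a tagged-union shape (a hypothetical message, not from any real schema):

```proto
syntax = "proto3";

// Hypothetical sum type: at most one of these fields is set at a time.
message Shape {
  oneof kind {
    Circle circle = 1;
    Rect rect = 2;
  }
}

message Circle { double radius = 1; }
message Rect { double w = 1; double h = 2; }
```

In the generated Python API you test which arm is set with WhichOneof("kind"), though oneof has its own restrictions (e.g. repeated fields aren't allowed inside one), so it's not a full sum type in the ML sense.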

Yup, it is crazy how much optimization these "narrow waist" formats get, and it goes even further with simdjson and so forth [1].

The Python protobuf implementation has a somewhat checkered history (I used protobuf v1 and v2 for a long time, and reviewed v3 a tiny bit).

The type system issue is that protobufs to a large extent "replace" your language's types. It's essentially a language-independent type system. So that means you are limited to a lowest common denominator, and you have the same issue of "winners" and "losers"... I would call Python somewhat of a "loser" in the protobuf world, i.e. it feels more second class and is more of a compromise.

This doesn't mean that anybody did a bad job; it's just a fundamental issue with such IDLs. In contrast, JSON/XML/CSV are "data-first" and there are multiple ways of using and parsing them. You can parse all of them lazily, with DOM- or SAX-style APIs, push and pull parsers, etc. Protobufs have grown some of that, but it wasn't the primary usage, and many people don't know about it.
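The DOM vs. SAX contrast can be shown with just the Python stdlib (a sketch; the XML document here is made up):

```python
import io
import xml.etree.ElementTree as ET  # DOM-style: whole tree in memory
import xml.sax                      # SAX-style: push parser, event callbacks

doc = "<langs><lang name='C++'/><lang name='Python'/></langs>"

# DOM: parse everything up front, then navigate the tree.
root = ET.fromstring(doc)
dom_names = [e.get("name") for e in root.iter("lang")]

# SAX: handle events as the bytes stream in; nothing is retained
# except what the handler chooses to keep.
class LangHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, tag, attrs):
        if tag == "lang":
            self.names.append(attrs["name"])

handler = LangHandler()
xml.sax.parse(io.StringIO(doc), handler)

assert dom_names == handler.names == ["C++", "Python"]
```

Both styles read the same bytes; the choice between them is a property of the consumer, not of the format. That's the "data-first" property an IDL-generated accessor API doesn't give you.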

[1] https://github.com/simdjson/simdjson