> we are standing on the shoulders of giants, those who have built and battle-tested it, and brought it to its current mature state
I would maybe rewrite this as:
> we are making Google's internal problems into everyone's problems
There are benefits to an IDL in the abstract, but an IDL for everyone should be built with the benefit of hindsight, looking at the lessons of Protobuf, Ion, Thrift, etc., rather than just baking Google's internal backwards-compatibility obligations into a formal spec everyone should follow.
I think any time Google takes an internal tool and flips the "open source" bit on it, it turns out to be a bad match for the rest of the world. When they instead take the time to build a new system that learns from the internal tool, like Kubernetes learned from Borg, I think the end result is significantly more valuable.
>> we are making Google's internal problems into everyone's problems
I think you're right. One example that I thought about the last time Buf was on the front page, but didn't post at the time, was the lesson that Google (and @kentonv with Cap'n Proto) took from Google's specific challenges with required fields in proto2. The Cap'n Proto FAQ [1] includes this:
> Imagine a production environment in which two servers, Alice and Bob, exchange messages through a message bus infrastructure running on a big corporate network. The message bus parses each message just to examine the envelope and decide how to route it, without paying attention to any other content. Often, messages from various applications are batched together and then split up again downstream.
> Now, at some point, Alice’s developers decide that one of the fields in a deeply-nested message commonly sent to Bob has become obsolete. To clean things up, they decide to remove it, so they change the field from “required” to “optional”. The developers aren’t idiots, so they realize that Bob needs to be updated as well. They make the changes to Bob, and just to be thorough they run an integration test with Alice and Bob running in a test environment. The test environment is always running the latest build of the message bus, but that’s irrelevant anyway because the message bus doesn’t actually care about message contents; it only does routing. Protocols are modified all the time without updating the message bus.
> Satisfied with their testing, the devs push a new version of Alice to prod. Immediately, everything breaks. And by “everything” I don’t just mean Alice and Bob. Completely unrelated servers are getting strange errors or failing to receive messages. The whole data center has ground to a halt and the sysadmins are running around with their hair on fire.
> What happened? Well, the message bus running in prod was still an older build from before the protocol change. And even though the message bus doesn’t care about message content, it does need to parse every message just to read the envelope. And the protobuf parser checks the entire message for missing required fields. So when Alice stopped sending that newly-optional field, the whole message failed to parse, envelope and all. And to make matters worse, any other messages that happened to be in the same batch also failed to parse, causing errors in seemingly-unrelated systems that share the bus.
How common is this type of message bus outside of a very big, very custom infrastructure like Google's? Also, I think the right lesson to take from this would be that the message bodies passing through the bus should be opaque; they should probably be defined as byte arrays in the schema for the message envelope. The message bus shouldn't parse or validate those bodies any more than an IP router is supposed to parse or validate the bodies of IP packets (leaving aside, for the moment, stateful packet inspection).
And so the power of Protobuf and Cap'n Proto schemas, particularly the power to validate messages, which is supposed to be a primary advantage of strong static typing, is limited because of a specific issue with Google's infrastructure, from which I think they took the wrong lesson.
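To make the opaque-body idea concrete, here is a minimal Python sketch. It is not Protobuf or Cap'n Proto: the wire format (a length-prefixed destination followed by raw bytes), the JSON bodies, and all names (`pack_envelope`, `route`, `bob_receive`, `legacy_field`) are made up for illustration. The point is only that the bus reads the envelope and treats the body as bytes, so required-field validation happens at the receiver.

```python
import json
import struct

# Hypothetical wire format: 2-byte big-endian destination length,
# the destination as UTF-8, then the body as opaque bytes.
def pack_envelope(destination: str, body: bytes) -> bytes:
    dest = destination.encode("utf-8")
    return struct.pack(">H", len(dest)) + dest + body

def unpack_envelope(frame: bytes):
    (dest_len,) = struct.unpack_from(">H", frame, 0)
    dest = frame[2:2 + dest_len].decode("utf-8")
    return dest, frame[2 + dest_len:]

def route(frames):
    """The bus: reads only the envelope and never parses a body."""
    queues = {}
    for frame in frames:
        dest, body = unpack_envelope(frame)
        queues.setdefault(dest, []).append(body)
    return queues

# Assumption for the example: an older Bob still requires legacy_field.
REQUIRED_BY_BOB = {"id", "legacy_field"}

def bob_receive(body: bytes) -> dict:
    """The receiver: the only place where body contents are validated."""
    message = json.loads(body)
    missing = REQUIRED_BY_BOB - set(message)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return message

batch = [
    # A newer Alice has stopped sending legacy_field.
    pack_envelope("bob", json.dumps({"id": 1}).encode("utf-8")),
    # An unrelated message that happens to share the batch.
    pack_envelope("carol", b"\x00\x01 unrelated payload"),
]
queues = route(batch)
```

In this sketch the bus routes both messages even though Alice's body no longer satisfies the old Bob. Calling `bob_receive(queues["bob"][0])` raises `ValueError`, so the failure stays with the one receiver that cares about the field, and Carol's message is unaffected.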
The latest version of protobuf does support passing opaque messages [1] and disallows required fields [2]. So I guess lessons were learned.
[1] https://developers.google.com/protocol-buffers/docs/referenc...
[2] https://stackoverflow.com/questions/31801257/why-required-an...
Yes, and I'm disagreeing with the second one.