> but it’s the first language I’ve experimented with that has made them a first-class feature

They are a feature of ALGOL 68, Pascal, Ada, and quite a few newer languages:

https://en.wikipedia.org/wiki/Tagged_union

They look a lot nicer in ML and Haskell, IMO.
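
For a rough idea of what that looks like, here is a minimal Haskell sketch. `Shape` and `area` are names I made up for illustration, not taken from anything linked here:

```haskell
-- A tagged union (algebraic data type): each constructor is a tag
-- carrying its own payload. Shape is a hypothetical example type.
data Shape
  = Circle Double        -- radius
  | Rect Double Double   -- width, height
  deriving (Show, Eq)

-- Pattern matching dispatches on the tag and binds the payload in one step.
area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h
```

The declaration and the case analysis are both short, and GHC can warn about a missing case if you turn on `-Wincomplete-patterns`.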

Of course I implemented them in Virgil, too. (shameless plug: https://github.com/titzer/virgil)