This is very useful.

I wish it had some information about supported languages. Most of the processing systems are JVM-based and require that you write your program in a JVM language. Some have Python support. But I have yet to encounter one that allows you write your pipelines in Go, Rust or JavaScript, for example. One notable exception is Storm, which supports pluggable runners, including one that talks to an external program over standard I/O. My impression that aside from Python, today's pipelines require a large amount of JVM buy-in, something I'm personally not interested in.

I'd also love some kind of metric for "aliveness". For example, my impression is that Storm was hot for about a week, and then Spark and Flink happened, and now nobody is talking about it, and Twitter itself has apparently replaced it with Heron.

If you're looking for something that doesn't constrain you to a particular language take a look at Pachyderm. It's built around containers so you can run any code you want. I designed it with JVM-phobes like you (and me) in mind.

https://github.com/pachyderm/pachyderm