For those who know, what is that interrogator thing?

It's image -> words, the inverse of Stable Diffusion: you give it an image and it gives you back a text prompt describing it.

See: https://github.com/pharmapsychotic/clip-interrogator
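
To give a rough feel for the "image -> words" idea, here's a toy sketch of ranking candidate phrases by how close their embedding is to an image embedding, then joining the best ones into a prompt. As I understand it, the real tool does something along these lines with CLIP embeddings plus a generated caption, but everything below is made up for illustration: the 3-d vectors, the phrase list, and the `rank_phrases` helper are hypothetical and not the project's actual API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_phrases(image_vec, phrase_vecs, top_k=2):
    """Return the top_k phrases whose vectors are most similar to image_vec."""
    ranked = sorted(
        phrase_vecs,
        key=lambda phrase: cosine(image_vec, phrase_vecs[phrase]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy stand-ins for real embeddings (assumption: a real system would get
# these from an image/text encoder such as CLIP, with hundreds of dimensions).
image_vec = [0.9, 0.1, 0.3]
phrase_vecs = {
    "oil painting": [0.8, 0.2, 0.4],
    "photograph": [0.1, 0.9, 0.2],
    "trending on artstation": [0.7, 0.0, 0.5],
    "pencil sketch": [0.0, 0.3, 0.9],
}

# Join the best-matching phrases into a prompt-like string.
prompt = ", ".join(rank_phrases(image_vec, phrase_vecs))
print(prompt)
```

The output of a real interrogator is the same kind of thing: a comma-separated prompt you could feed back into Stable Diffusion to get something similar to the original image.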

P.S. It's strange to me that this is a big deal. There are plenty of libraries for this stuff AFAIK, though I don't pay attention to licenses personally.