For those who know, what is that interrogator thing?
It's image -> words: you give it an image and it produces a text prompt describing it, roughly the inverse of Stable Diffusion.
See: https://github.com/pharmapsychotic/clip-interrogator
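For a rough idea of what using it looks like, here is a minimal sketch based on my reading of that repo's README. Treat it as an assumption rather than the definitive API: the helper name image_to_prompt is made up for illustration, and the default model name "ViT-L-14/openai" may not match the version you install.

```python
def image_to_prompt(image_path, clip_model_name="ViT-L-14/openai"):
    """Return a text prompt describing the image at image_path.

    Hypothetical helper: it wraps the calls shown in the clip-interrogator
    README, which may differ between package versions.
    """
    # Imported lazily because both packages are heavy and the interrogator
    # typically downloads model weights the first time it is created.
    from PIL import Image
    from clip_interrogator import Config, Interrogator

    # The interrogator expects an RGB image.
    image = Image.open(image_path).convert("RGB")

    interrogator = Interrogator(Config(clip_model_name=clip_model_name))
    return interrogator.interrogate(image)
```

Assuming the package is installed (pip install clip-interrogator), something like print(image_to_prompt("photo.jpg")) should print a prompt-style description, where "photo.jpg" stands in for any local image. You can then paste that text into Stable Diffusion to get something in the same vein as the original picture.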
P.S. It's strange to me that this is a big deal. There are plenty of libraries for this kind of thing AFAIK, though I personally don't pay attention to licenses.