mrfusion

Offtopic sort of, but does anyone know if folks are working on combining vision and natural language in one model? I think that could wield some interesting results.

sillysaurusx

https://github.com/openai/CLIP

The results are quite interesting:

https://www.reddit.com/r/Art/comments/p866wv/deep_dive_meai_...