Is there a Stable-Diffusion-esque open-source GPT yet? Given the incredible pace of advances in the image space this year, and my (perhaps naive) assumption that text generation is less complex and less resource-intensive than image generation, I'm hoping we'll get something similar and am surprised that we haven't yet.
There is: GPT-NeoX (or its TPU-based predecessor, GPT-Neo). However, even running inference on these models is much, much harder than with Stable Diffusion -- the GPT-NeoX-20B weights require a minimum of two GPUs with 24 GB of VRAM each just to run inference, never mind training or fine-tuning.
I believe there are some tricks for cutting the VRAM requirements down a bit by dropping precision at various points (loading the weights in 16-bit, or even 8-bit, instead of 32-bit), but the gist is that these big text models are actually quite a bit more resource-intensive than the image models.
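For example, here's a rough sketch (untested, and assuming the Hugging Face transformers port of the weights at EleutherAI/gpt-neox-20b) of what loading at reduced precision looks like -- float16 roughly halves the footprint versus float32, and the bitsandbytes 8-bit integration cuts it down further:

```python
# Sketch: loading GPT-NeoX-20B at reduced precision with Hugging Face transformers.
# Assumes the EleutherAI/gpt-neox-20b checkpoint and enough combined GPU memory;
# exact flags can vary between library versions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# torch_dtype=torch.float16 halves memory vs. float32; device_map="auto"
# (via accelerate) spreads the layers across the available GPUs.
# An 8-bit load via bitsandbytes (load_in_8bit=True) can shrink it further.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("Open-source text models", return_tensors="pt").to("cuda:0")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Even with those tricks you're still looking at tens of gigabytes of GPU memory for the 20B model, which is why it hasn't spread the way Stable Diffusion has.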
Where can I find more information about running GPT-NeoX? Is it covered in the paper? Or is there a forum, HOWTO, or wiki somewhere?