What does HackerNews think of imagen-pytorch?
Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
Language: Python
#23 in Deep learning
That's already been done: https://github.com/lucidrains/imagen-pytorch
Thanks to the amazing @lucidrains there's already an open-source implementation of DALL-E 2: https://github.com/lucidrains/DALLE2-pytorch and a pretrained model for it should be released this year.
The same person is also working on an open-source implementation of Google's Imagen, which should be even better (and faster) than DALL-E 2: https://github.com/lucidrains/imagen-pytorch.
This is possible because the original research papers behind DALL-E 2 and Imagen were both publicly released.
I'm honestly surprised that they trained a StyleGAN. Recently, the Imagen architecture has been shown to be simpler in structure, easier to train, and even faster to produce good results. Combined with the "Elucidating" paper by NVIDIA's Tero Karras, you can train a 256px Imagen* to tolerable quality within an hour on an RTX 3090.
Here's a PyTorch implementation by the LAION people:
https://github.com/lucidrains/imagen-pytorch
And here are 2 images I sampled after training it for some hours, roughly 2 hours for the base model + 4 hours for the upscaler:
* = Only the unconditional Imagen variant, meaning what they show off here. The variant with a T5 text embedding takes longer to train.
This implementation popped up on Hacker News not too long ago. I got it working on Colab first, and then on my own GPU at home. But just barely. Need more memory :)