What does HackerNews think of latent-diffusion?

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook

There are a lot, but the one implemented as LDSR in most Stable Diffusion GUIs is this one: https://github.com/CompVis/latent-diffusion

Upscale Wiki is really the place to explore everything about image scaling:

https://upscale.wiki/

The latent-diffusion[1] I've got running at home frequently generates stock image watermarks (e.g. "The London Skyline at night in the style of Carboni"[2], images 1, 2, and 6).

[1] https://github.com/CompVis/latent-diffusion [2] https://imgur.com/a/8tOI9QU

FYI if you only have a limited amount of compute, latent diffusion models will converge much faster.

https://github.com/CompVis/latent-diffusion
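A rough way to see why this can help on limited compute: the diffusion model runs on a compressed latent grid instead of the full pixel grid. The sketch below is only back-of-the-envelope arithmetic. The downsampling factor of 8 and the 4 latent channels are illustrative assumptions, not values taken from the comment above, and the helper names are hypothetical.

```python
def latent_shape(height, width, factor, latent_channels):
    """Latent grid shape, assuming the autoencoder shrinks each side by `factor`."""
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by the downsampling factor")
    return (height // factor, width // factor, latent_channels)


def compression_ratio(height, width, factor, latent_channels, image_channels=3):
    """How many times fewer values the diffusion model sees per image."""
    h, w, c = latent_shape(height, width, factor, latent_channels)
    return (height * width * image_channels) / (h * w * c)


# Assumed example: 256x256 RGB image, factor 8, 4 latent channels.
shape = latent_shape(256, 256, factor=8, latent_channels=4)
ratio = compression_ratio(256, 256, factor=8, latent_channels=4)
print(shape, ratio)
```

Under these assumed numbers each denoising step touches far fewer values than it would in pixel space. Fewer values per step does not by itself prove faster convergence, but it gives a sense of where the savings come from.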

I don't have access to any of the DALL-Es, but I do have a couple of models from GitHub [1], [2], which gave these outputs [3].

[1] https://github.com/nerdyrodent/VQGAN-CLIP [2] https://github.com/CompVis/latent-diffusion [3] https://imgur.com/a/DjQYLUz

I don't have access to DALL-E 2 or Imagen, but I do have [1] and [2] locally, and they produced [3] with that prompt.

[1] https://github.com/nerdyrodent/VQGAN-CLIP.git [2] https://github.com/CompVis/latent-diffusion.git [3] https://imgur.com/a/dCPt35K

The latent-diffusion[1] one I've been playing with is not terrible at drawing legible text, but it is generally awful at drawing the text you actually want (cf. [2]), and it also draws text when you don't want any.

[1] https://github.com/CompVis/latent-diffusion.git [2] https://imgur.com/a/Sl8YVD5

An older but similar and still impressive alternative is available here: https://github.com/CompVis/latent-diffusion

If you have a decent amount of VRAM, you can use it to start generating images with their pre-trained models. They're nowhere near as impressive as DALL-E 2, but they're still pretty damn cool. I don't know what the exact memory requirements are, but I've gotten it to run on a 1080 Ti with 11 GB.

EDIT: I also tried a 980 with 4 GB of VRAM a while back, but that failed, so you probably need more than that.

Hi, I'm the author of this post. I hope you all enjoy it! I researched and wrote this back in January, and although the main ideas are still relevant, the landscape of AI art generation has changed quite a bit in just three months. Here are some important new developments:

- DALL-E 2: https://openai.com/dall-e-2/

- Midjourney: https://twitter.com/midjourney

- LAION-5B dataset: https://laion.ai/laion-5b-a-new-era-of-open-large-scale-mult...

- CompVis latent diffusion: https://github.com/CompVis/latent-diffusion

Since the field is moving so quickly, this newsletter is a good way to try to stay on top of things: https://multimodal.art/news

Also, I went on Yannic Kilcher's podcast to talk about this! https://www.youtube.com/watch?v=DdkenV-ZdJU&ab_channel=Yanni...

This isn’t true. The quality of images generated by DALL-E is really good, but they are an incremental improvement based on a long chain of prior work. See e.g. https://github.com/CompVis/latent-diffusion