What does HackerNews think of InvokeAI?

This version of CompVis/stable-diffusion features an interactive command-line script that combines text2img and img2img functionality in a "dream bot"-style interface, a WebGUI, and multiple other features and enhancements.

Language: Jupyter Notebook

> Whenever I ask for something like ‘seamless tiling xxxxxx’ it kinda sorta gets the idea, but the resulting texture doesn’t quite tile right.

Getting seamless tiling requires more than just having "seamless tiling" in the prompt. It also depends on whether the fork you're using has that feature at all.

https://github.com/lstein/stable-diffusion has the feature, but you need to pass it outside the prompt. So if you use the `dream.py` prompt CLI, you can pass it `"Hats on the ground" --seamless` and it should be perfectly tileable.
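For a sense of how a `--seamless`-style option is typically implemented (a general sketch under that assumption, not necessarily this fork's exact code, and `make_seamless` is a name made up here), the usual trick is to switch the model's convolutions to circular padding so the output wraps at the edges:

```python
import torch.nn as nn

def make_seamless(model: nn.Module) -> nn.Module:
    """Hypothetical helper: switch every Conv2d in the diffusion/VAE model
    to circular padding so generated textures wrap around at the edges."""
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            module.padding_mode = "circular"
    return model
```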

Stable Diffusion is wild - the space has been developing quickly, and watching the pace of development makes me reconsider what I consider "staggering". I've been blown away. The accessibility of this technology is even more incredible - there's even a fork that works on M1 Macs (https://github.com/lstein/stable-diffusion)

We are in for some interesting times. Whatever the next iteration of Textual Inversion is will be extremely disruptive, especially if the concepts continue to be developed collectively.

SD on an Intel Mac with Vega graphics runs pretty well though - I think it ran at something like ~3-5 iterations/s for me, which is decent. I ran either https://github.com/magnusviri/stable-diffusion or https://github.com/lstein/stable-diffusion, which have MPS support.

I generate one image in about ~3 seconds with the DDIM sampler, 20 steps, on an RTX 2080 Ti (~8 it/s). The video on the Patreon page is sped up, as it's not very interesting to sit and watch renders haha.

That said, some of the users who started using my UI weren't using the fork my app connects to, and were surprised it was a bit faster than what they were using before, so maybe you can give it a try. The repository is https://github.com/lstein/stable-diffusion

> 55 seconds on a M1 Pro MacBook with 16GB RAM to generate a picture

I've been running webui [1] on an M1 MacBook Air with 16GB RAM: 512x512, 50 steps takes almost 300 seconds. I suspect that it is running on the CPU, because the script says "Max VRAM used for this generation: 0.00G" and Activity Monitor says that it's using lots of CPU % and no GPU % at all. When M1 users are running Stable Diffusion, does Activity Monitor show the GPU usage correctly?

[1] https://github.com/lstein/stable-diffusion
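One quick way to check whether PyTorch can actually see the GPU on Apple Silicon (assuming PyTorch 1.12+ with MPS support, independent of any particular fork) is:

```python
import torch

# If MPS isn't available, or the fork never requests it, PyTorch silently
# falls back to the CPU, which would match the "0.00G VRAM" report and the
# long render times described above.
print("MPS built:    ", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print("Would run on: ", device)
```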

I recently switched from a CPU-only version to this repo release 1.13: https://github.com/lstein/stable-diffusion

The original txt2img and img2img scripts are a bit wonky and not all of the samplers work, but as long as you stick to dream.py and use a working sampler (I've had good luck with k_lms), it works great and runs way faster than the CPU version.

Works great on 32GB RAM, but I'm honestly tempted to sell this one and get a 64GB model once the M2 Pros come around. This is capable of eating up all the RAM you can throw at it to generate multiple pictures simultaneously.

The lstein fork [1] of the CompVis main repo is working on Apple Silicon-based machines (and may work on Intel-based ones too). It's not very fast though, ~3.5 minutes for 50 steps on my 16GB M1 Mini, whereas I understand that a 3080 can spit them out in the 30-second range. I would suppose M-series machines with higher GPU core counts are faster.

[1] https://github.com/lstein/stable-diffusion

Magnusviri[0], the original author of the SD M1 repo credited in this article, has merged his fork into the Lstein Stable Diffusion fork.

You can now run the Lstein fork[1] on M1 as of a few hours ago.

This adds a ton of functionality: a GUI, upscaling and facial improvements, weighted subprompts, etc.

This has been a big undertaking over the last few days, and I highly recommend checking it out. See the Mac M1 README [2]

[0] https://github.com/magnusviri/stable-diffusion

[1] https://github.com/lstein/stable-diffusion

[2] https://github.com/lstein/stable-diffusion/blob/main/README-...

Depends on what fork you're running... Some seem to use CPU-based generation; others use the MPS device backend correctly, which is MUCH faster. I have another comment floating around about lstein's fork, but it takes some massaging to get it to run happily. https://github.com/lstein/stable-diffusion/

I've used [0], [1] and [2] so far. I only use open-source ones and quickly skim the source code for anything suspicious. I also only use ones with some degree of popularity, meaning that others have probably taken a look at the code as well.

[0]: https://github.com/lstein/stable-diffusion

[1]: https://github.com/hlky/stable-diffusion

[2]: https://github.com/basujindal/stable-diffusion

RTX 3080 (10GB) here

Keep in mind to keep the batch size low (probably equal to 1); that was my main issue when I first installed this.

Then, there are lots of great forks already which add an interactive REPL or web UI [0][1]. They also run with half precision, which roughly halves memory use. Additionally, they optionally integrate with upscaling neural networks, which means you can generate 512x512 images with Stable Diffusion and then scale them up to 1024x1024 easily. Moreover, they optionally integrate with face-fixing neural networks, which can also drastically improve the quality of images.
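To make the half-precision point concrete, here is a minimal, runnable sketch (not any fork's actual code; the single layer just stands in for the full model) showing that casting weights from float32 to float16 halves the memory they occupy:

```python
import torch.nn as nn

# A single UNet-sized convolution layer as a stand-in for the full model.
layer = nn.Conv2d(320, 320, kernel_size=3, padding=1)

fp32_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())
layer = layer.half()  # cast weights to float16
fp16_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())

print(f"fp32: {fp32_bytes / 1e6:.1f} MB, fp16: {fp16_bytes / 1e6:.1f} MB")
```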

There's also this ultra-optimized repo, but it's a fair bit slower [2].

[0]: https://github.com/lstein/stable-diffusion

[1]: https://github.com/hlky/stable-diffusion

[2]: https://github.com/basujindal/stable-diffusion