Even if you have just 4 GB, Stable Diffusion will run fine if you go for 448x448 instead (basically the same quality).

I feel like I'm going insane. Everyone says 512x512 should work with 8 GB, but when I try it I get:

    CUDA out of memory. Tried to allocate 3.00 GiB (GPU 0; 8.00 GiB total capacity; 5.62 GiB already allocated; 0 bytes free; 5.74 GiB reserved in total by PyTorch)
Any ideas? I have a 3060 Ti with 8 GB of VRAM...

with 448x448 I get:

    CUDA out of memory. Tried to allocate 902.00 MiB (GPU 0; 8.00 GiB total capacity; 6.73 GiB already allocated; 0 bytes free; 6.86 GiB reserved in total by PyTorch)
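Notice that both errors report nearly all of the 8 GiB already reserved by PyTorch with 0 bytes free, which can point at allocator fragmentation as much as raw usage. One thing sometimes worth trying (a sketch, not a guaranteed fix; `PYTORCH_CUDA_ALLOC_CONF` and `max_split_size_mb` are real PyTorch caching-allocator settings, but the 128 MiB value here is just a common starting point, not a tuned number):

```python
import os

# Must be set before torch initializes CUDA. Asks PyTorch's caching
# allocator to avoid holding onto large split blocks, which can reduce
# fragmentation-related OOMs on cards that are close to the limit.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import torch only AFTER setting the variable
```

If that alone doesn't help, it at least makes the "reserved vs. allocated" gap in the error message smaller and easier to reason about.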

Others have likely reduced the batch size (n_samples) to lower the memory load. A smaller batch size helps significantly with memory consumption.

This comment: https://news.ycombinator.com/item?id=32710550 talks about running SD with 8 GiB of VRAM and mentions needing to reduce this parameter to 1 to get it to produce output correctly.
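The reason n_samples matters so much is that activation memory scales roughly linearly with batch size. A back-of-the-envelope sketch (the tensor shape below is an illustrative assumption, not measured from the actual model):

```python
def activation_bytes(batch, channels, height, width, bytes_per_elem=2):
    """Rough size of one activation tensor (fp16 = 2 bytes per element)."""
    return batch * channels * height * width * bytes_per_elem

# Illustrative feature map for a 512x512 image (SD works on 64x64 latents):
one_sample = activation_bytes(1, 320, 64, 64)
three_samples = activation_bytes(3, 320, 64, 64)
print(three_samples / one_sample)  # batch 3 needs 3x the activation memory
```

So dropping from the default batch of 3 to 1 cuts the per-step activation footprint to a third, which is often the difference between fitting in 8 GB and not.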

This helped and I finally generated something larger than 256x256 :D thanks

If you're okay waiting a while longer and have plenty of RAM, https://github.com/bes-dev/stable_diffusion.openvino has a somewhat CPU-optimized version as well that relies on system memory rather than VRAM.

My laptop takes about 6 seconds per iteration so it's significantly slower, but if you're willing to wait I bet you'll have a much easier time plugging more RAM into your system than adding VRAM.
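To put 6 seconds per iteration in perspective, here's the arithmetic for one image, assuming a ~50-step sampler run (the step count is an assumption; defaults vary between scripts):

```python
seconds_per_iteration = 6
steps = 50  # assumed sampler step count; adjust for your settings
total_seconds = seconds_per_iteration * steps
print(total_seconds / 60)  # → 5.0 minutes per image on CPU
```

A few minutes per image instead of seconds, but the model's several GB of weights fit comfortably in ordinary system RAM, which is far cheaper to upgrade than a GPU.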