On a 16GB MacBook Air M1 (8-core CPU, 8-core GPU), the PyTorch implementation takes about 3.6 s/step, which works out to about 3 minutes per image with the default parameters. I wonder how much faster other setups are. If anyone out there has a similar system and wants to compare, could you please share your findings?
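As a back-of-envelope check, the quoted ~3 minutes per image follows from the per-step time, assuming the common Stable Diffusion default of 50 sampling steps (the step count here is my assumption, not stated in the comment):

```python
# Sanity-check the per-image time from the per-step time.
seconds_per_step = 3.6
steps = 50  # assumption: typical Stable Diffusion default step count

total_seconds = seconds_per_step * steps
print(f"{total_seconds:.0f} s ≈ {total_seconds / 60:.1f} min per image")  # 180 s ≈ 3.0 min
```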

Not M1 comparable, but I'm working on testing various GPU vs M1 comparisons with a few accessible cloud providers. My impression is that times should be about the same, but it's nice to hear other real-world stats for M1 with SD. Makes me really want to rent the Hetzner M1 now.
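For apples-to-apples comparisons across machines, the useful number is mean seconds per step rather than total wall time. A minimal, hedged sketch of how one might collect that, where `step_fn` is a hypothetical stand-in for one denoising step (swap in the real sampler call from whichever SD build you're benchmarking):

```python
import time

def mean_seconds_per_step(step_fn, n_steps=50):
    """Call step_fn n_steps times and return the mean wall time per call.

    step_fn is a placeholder for a single denoising step; n_steps=50
    mirrors a typical default Stable Diffusion step count.
    """
    durations = []
    for _ in range(n_steps):
        t0 = time.perf_counter()
        step_fn()
        durations.append(time.perf_counter() - t0)
    return sum(durations) / len(durations)

# Demo with a dummy workload standing in for a real sampler step:
mean = mean_seconds_per_step(lambda: sum(range(10_000)), n_steps=5)
print(f"~{mean:.6f} s/step")
```

Using perf_counter (a monotonic, high-resolution clock) avoids skew from system clock adjustments during longer runs.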

Which repo or build are you using, BTW? Is it the one related to this readme?

https://github.com/magnusviri/stable-diffusion/blob/main/REA...

I would love to see it, but this file is not accessible.

Sorry about that; web link rot sure is real, eh?

This is an example of the original file: https://github.com/magnusviri/stable-diffusion/blob/79ac0f34...

Which seems to have been renamed, and cleaned up a bit here: https://github.com/magnusviri/stable-diffusion/blob/main/doc...

However, per the note on the magnusviri repo, the following repo should be used for a stable version of this SD toolkit: https://github.com/invoke-ai/InvokeAI

with installation instructions here: https://github.com/invoke-ai/InvokeAI/blob/main/docs/install...