What does HackerNews think of web-stable-diffusion?

Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.

Language: Jupyter Notebook

#23 in Deep learning
Llama.cpp/ggml is uniquely suited to LLMs. Their memory requirements are huge, quantization is effective, and token generation is surprisingly serial and bandwidth bound, making it good for CPUs and an even better fit for ggml's unique pipelined CPU/GPU inference.
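To make "bandwidth bound" concrete: each decoded token has to stream roughly the whole weight file through memory once, so the memory bus, not the ALUs, sets the ceiling on tokens per second. A back-of-the-envelope sketch (the sizes and bandwidths below are illustrative assumptions, not measurements):

```python
# Back-of-the-envelope: why token generation is bandwidth bound.
# Each decoded token reads (roughly) every weight once, so throughput
# is capped near bandwidth / model size. Numbers are illustrative.

model_size_gb = 3.8          # e.g. a ~7B model at 4-bit quantization
cpu_bandwidth_gbs = 50.0     # typical dual-channel desktop DRAM
gpu_bandwidth_gbs = 450.0    # typical midrange discrete GPU VRAM

for name, bw in [("CPU DRAM", cpu_bandwidth_gbs), ("GPU VRAM", gpu_bandwidth_gbs)]:
    tokens_per_sec = bw / model_size_gb  # upper bound; ignores KV cache, etc.
    print(f"{name}: ~{tokens_per_sec:.0f} tokens/s upper bound")
```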

...But Stable Diffusion is not the same. It doesn't quantize as well, the UNet is very compute-intensive, and batched image generation is effective and useful even for single users. It's a better fit for GPUs/IGPs. Additionally, it massively benefits from the hackability of the Python implementations.
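For a sense of why batching helps even a single user, here's a minimal sketch with Hugging Face diffusers (assumes diffusers, transformers, and a CUDA-capable torch install; the model ID is just the common example checkpoint):

```python
# Minimal sketch of batched generation with Hugging Face diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# One prompt, four candidates: the batched UNet keeps the GPU saturated,
# so a single user gets several options for little extra wall-clock time.
images = pipe(
    "a watercolor fox in a forest",
    num_images_per_prompt=4,
    num_inference_steps=30,
).images

for i, img in enumerate(images):
    img.save(f"fox_{i}.png")
```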

I think ML compilation to executables is the way forward for SD. AITemplate is already blazing fast [1], and TVM Vulkan is very promising if anyone actually fleshes out the demo implementation [2]. And they preserve most of the hackability of the pure PyTorch implementations (a rough TVM compile sketch follows the links below).

1: https://github.com/VoltaML/voltaML-fast-stable-diffusion

2: https://github.com/mlc-ai/web-stable-diffusion
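Here's roughly what "compilation to executables" looks like with TVM's relay frontend. This is a hedged sketch, not the repo's actual build script: it traces a small torchvision stand-in rather than the SD UNet, and the exact target flags can vary by TVM version:

```python
# Hedged sketch: tracing a PyTorch module and compiling it to a Vulkan
# shared library with Apache TVM. Names and shapes are illustrative.
import torch
import tvm
from tvm import relay
from torchvision.models import resnet18

# Small stand-in model, traced the same way an SD submodule would be.
model = resnet18(weights=None).eval()
example = torch.randn(1, 3, 224, 224)
scripted = torch.jit.trace(model, example)

mod, params = relay.frontend.from_pytorch(scripted, [("input0", (1, 3, 224, 224))])
target = tvm.target.Target("vulkan", host="llvm")
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
lib.export_library("model_vulkan.so")  # a self-contained compiled artifact
```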

Yup, here's their web stable diffusion repo: https://github.com/mlc-ai/web-stable-diffusion

The input is a model (weights + runtime lib) compiled via the mlc-llm project: https://mlc.ai/mlc-llm/docs/compilation/compile_models.html
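On the native side, consuming such an artifact looks roughly like this. A hedged sketch against TVM Unity's runtime API: the file name and device are illustrative, and mlc-llm's real loader also handles sharded quantized weights separately from the lib:

```python
# Hedged sketch of the "weights + runtime lib" split (assumes a TVM build
# with relax, i.e. TVM Unity; file name and device are illustrative).
import tvm
from tvm import relax

lib = tvm.runtime.load_module("stable_diffusion_vulkan.so")  # compiled runtime lib
dev = tvm.vulkan(0)
vm = relax.VirtualMachine(lib, dev)  # exposes the compiled model functions
# The quantized weight shards ship separately and are fed to vm functions
# at runtime; in the browser, tvmjs plays the role of this Python runtime.
```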

The MLC team got that working back in March: https://github.com/mlc-ai/web-stable-diffusion

Even more impressively, they followed up with support for several large language models: https://webllm.mlc.ai/

The Apache TVM machine learning compiler has a WASM and WebGPU backend, and can import from most DNN frameworks. Here's a project running Stable Diffusion with WebGPU and TVM [1].
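As a hedged sketch of that import-and-build path (the ONNX file, input shapes, and target strings are illustrative; the webgpu-plus-wasm target pairing follows the pattern used in TVM's web builds, but exact flags vary by version):

```python
# Hedged sketch: importing an ONNX model into TVM and building for the
# browser (WebGPU kernels + a wasm host module). Paths are illustrative.
import onnx
import tvm
from tvm import relay
from tvm.contrib import emcc

onnx_model = onnx.load("unet.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, {"sample": (1, 4, 64, 64)})

target = tvm.target.Target("webgpu", host="llvm -mtriple=wasm32-unknown-unknown-wasm")
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
lib.export_library("unet.wasm", fcompile=emcc.create_tvmjs_wasm)
```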

Open questions remain around the pre- and post-processing code in folks' Python stacks, e.g. NumPy and OpenCV. There are some NumPy-to-JS transpilers out there, but they aren't feature-complete or fully integrated (the sketch at the end shows the kind of code in question).

[1] https://github.com/mlc-ai/web-stable-diffusion
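To illustrate that pre/post-processing point: this is the sort of NumPy glue that sits around the model in Python pipelines and would need a JS/WASM equivalent in the browser (the function and value ranges are typical of SD pipelines, not taken from any particular repo):

```python
# Typical NumPy post-processing around an SD decoder: the model emits
# float tensors, and this glue turns them into displayable images.
import numpy as np

def decoded_to_uint8(decoded: np.ndarray) -> np.ndarray:
    """Map VAE-decoded output in [-1, 1] (NCHW) to HWC uint8 images."""
    imgs = (decoded / 2 + 0.5).clip(0, 1)          # [-1, 1] -> [0, 1]
    imgs = (imgs * 255).round().astype(np.uint8)   # [0, 1] -> [0, 255]
    return imgs.transpose(0, 2, 3, 1)              # NCHW -> NHWC for display

demo = np.random.uniform(-1, 1, size=(1, 3, 512, 512)).astype(np.float32)
print(decoded_to_uint8(demo).shape)  # (1, 512, 512, 3)
```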