This is a huge deal, congrats! We've had a ton of users asking how to run their own LLMs on Linux, and the unfortunate answer was always that the existing options were slightly complicated. Having a single-click download option is going to open this up for so many more people! If anyone is looking for a way to use Ollama inside VS Code, one option (which I've been working on) is https://continue.dev

Also curious: do you plan to support speculative sampling if/when the feature is merged into llama.cpp? I'm excited about the possibility of running a 34B model at high speeds on a standard laptop.
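For anyone unfamiliar with the idea, here's a minimal sketch of what speculative sampling does, in case it helps explain why it would speed things up. The "models" below are toy stand-ins (deterministic distributions over a tiny vocabulary), not real LLMs or anything from llama.cpp; the names, vocabulary size, and draft length are all made up for illustration. The point is just the propose-then-verify loop: a cheap draft model guesses several tokens, the big target model checks them in one pass, and the accept/reject rule keeps the output distribution identical to sampling from the target model alone.

```python
"""Toy sketch of speculative sampling: a cheap draft model proposes a few
tokens, the expensive target model verifies them, and rejected proposals are
resampled from the residual so the target distribution is preserved exactly."""
import numpy as np

VOCAB = 16      # toy vocabulary size (made up for illustration)
DRAFT_LEN = 4   # tokens the draft model proposes per round

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def toy_model(context, seed):
    # Deterministic toy distribution over the next token given the context.
    h = hash((tuple(int(t) for t in context), seed)) % (2**32)
    return softmax(np.random.default_rng(h).standard_normal(VOCAB))

def draft_probs(context):   # stands in for the small, fast model
    return toy_model(context, seed=1)

def target_probs(context):  # stands in for the large, slow model
    return toy_model(context, seed=2)

def speculative_step(context, rng):
    """Propose DRAFT_LEN tokens with the draft model, verify them against the
    target model, and return the tokens that actually get emitted."""
    proposed, draft_dists = [], []
    ctx = list(context)
    for _ in range(DRAFT_LEN):
        p_draft = draft_probs(ctx)
        tok = rng.choice(VOCAB, p=p_draft)
        proposed.append(tok)
        draft_dists.append(p_draft)
        ctx.append(tok)

    accepted = []
    ctx = list(context)
    for tok, p_draft in zip(proposed, draft_dists):
        p_target = target_probs(ctx)
        # Accept the proposed token with probability min(1, p_target/p_draft).
        if rng.random() < min(1.0, p_target[tok] / p_draft[tok]):
            accepted.append(tok)
            ctx.append(tok)
        else:
            # On rejection, resample from the normalized residual so the
            # overall output distribution still matches the target model.
            residual = np.maximum(p_target - p_draft, 0.0)
            residual /= residual.sum()
            accepted.append(rng.choice(VOCAB, p=residual))
            return accepted
    # All proposals accepted: emit one bonus token from the target model.
    accepted.append(rng.choice(VOCAB, p=target_probs(ctx)))
    return accepted

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    context = [3, 1, 4]
    for _ in range(3):
        new_tokens = speculative_step(context, rng)
        context.extend(new_tokens)
        print("accepted this round:", [int(t) for t in new_tokens])
```

The speedup comes from the verify loop: when the draft model's guesses are good, the target model only needs one forward pass to emit several tokens instead of one, which is why a 34B model could feel much faster on a laptop.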

> run their own LLMs on Linux and the unfortunate answer was always that the existing options were slightly complicated

What about https://github.com/ggerganov/llama.cpp ?

It compiles and runs easily on Linux.