Same boat.

I’d love to host something local, but I’ve been so overwhelmed by the rapid progress. Every time I start looking, I find a guide that inevitably has a “then plug in your OpenAI API key…” step, which is a hard NOPE for me.

I have a few decent GPUs, but I’ve got no idea where to start…

Path of least resistance:

- Download koboldcpp: https://github.com/LostRuins/koboldcpp

- Download your 70B ggml model of choice, for instance airoboros 70B Q3_K_L: https://huggingface.co/models?sort=modified&search=70b+ggml

- Run koboldcpp with OpenCL (or ROCm), offloading as many layers to the GPU as you can manage. If you use ROCm, install the ROCm package from your Linux distro (or get it directly from AMD on Windows). A couple of scripted versions of these steps are sketched after this list.

- Access the UI over HTTP. Switch to instruct mode and copy in the correct prompt formatting from the model download page.

- If you are feeling extra nice, get an AI Horde API key, contribute your idle GPU time to the network, and try out other models from other hosts: https://lite.koboldai.net/#
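
If you'd rather script the download and launch steps, here's a minimal sketch. The repo ID, filename, and layer count are assumptions (pick whichever quant you actually want from the Hugging Face search above), and it assumes you've cloned the koboldcpp repo so `koboldcpp.py` is in the working directory; check `python koboldcpp.py --help` for the flags your build supports.

```python
# Sketch: fetch a 70B GGML quant and launch koboldcpp with GPU offload.
import subprocess
from huggingface_hub import hf_hub_download

# Assumed repo/filename for an airoboros 70B Q3_K_L quant; substitute your pick.
model_path = hf_hub_download(
    repo_id="TheBloke/airoboros-l2-70B-gpt4-1.4.1-GGML",          # assumption
    filename="airoboros-l2-70b-gpt4-1.4.1.ggmlv3.q3_K_L.bin",     # assumption
)

# Launch koboldcpp with OpenCL (CLBlast), offloading as many layers as fit.
# Swap the OpenCL flag for the ROCm/hipBLAS option if you run the ROCm build.
subprocess.Popen([
    "python", "koboldcpp.py",
    "--model", model_path,
    "--useclblast", "0", "0",   # platform/device IDs; adjust for your GPU
    "--gpulayers", "40",        # raise/lower until you stop running out of VRAM
    "--contextsize", "4096",
    "--port", "5001",
])
```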
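
Once the server is up you can use the browser UI, or hit the local API directly. Rough sketch below, assuming the default port and an Alpaca-style instruct template; the real prompt format should come from the model card, and the request fields may differ slightly on your version.

```python
# Sketch: query a locally running koboldcpp instance with an instruct prompt.
import requests

prompt = (
    "### Instruction:\n"
    "Explain what GPU layer offloading does in one paragraph.\n\n"
    "### Response:\n"
)

resp = requests.post(
    "http://localhost:5001/api/v1/generate",   # default koboldcpp port; assumption
    json={"prompt": prompt, "max_length": 200, "temperature": 0.7},
    timeout=300,
)
print(resp.json()["results"][0]["text"])
```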