What does HackerNews think of alpaca.cpp?

Locally run an Instruction-Tuned Chat-Style LLM

Language: C

Also check out Alpaca; you can self-host this one, and the 7B and 13B variants produce surprisingly good results while being fast enough to run on CPU alone: https://github.com/antimatter15/alpaca.cpp
Be aware this file is a single ~8GB 4-bit model (ggml-alpaca-13b-q4.bin) instead of the two ~4GB files (ggml-model-q4_0.bin, ggml-model-q4_0.bin.1) that most llama.cpp-style inference programs expect. You'll probably have to edit the line

    n_parts = LLAMA_N_PARTS.at(hparams.n_embd);
in chat.cpp (or main.cpp) to hard-code it so the one-file model is handled properly, like

    n_parts = 1;
Or rewrite the parameter-config subroutine to recognize and handle non-standard weights files.

magnet: magnet:?xt=urn:btih:053b3d54d2e77ff020ebddf51dad681f2a651071&dn=ggml-alpaca-13b-q4.bin&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=udp%3A%2F%2Fopentracker.i2p.rocks%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A6969%2Fannounce&tr=udp%3A%2F%2F9.rarbg.com%3A2810%2Fannounce

torrent: https://btcache.me/torrent/053B3D54D2E77FF020EBDDF51DAD681F2...

torrent: https://torrage.info/torrent.php?h=053b3d54d2e77ff020ebddf51...

via: https://github.com/antimatter15/alpaca.cpp

Alpaca [1], perhaps. It's based on Facebook's model (LLaMA), and it's been trained on a conversational style, same as ChatGPT. I don't know if it can produce code, though.

[1] https://github.com/antimatter15/alpaca.cpp