What does HackerNews think of dalai?
The simplest way to run LLaMA on your local machine
I guess it comes down to needing a very high-end GPU (or several), which makes it impractical for most people compared with just running it in Colab or something.
Though there are some efforts:
It works for 7B/13B/30B/65B LLaMA and Alpaca (a fine-tuned LLaMA, which definitely works better). The smaller models, at least, should run on pretty much any computer.
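For a rough sense of which sizes fit on a given machine, here is a back-of-the-envelope sketch in Python. It only counts the bytes needed to hold the weights, uses the nominal parameter counts from the model names (7B, 13B, 30B, 65B), and assumes a flat bits-per-weight figure. Real memory use will be higher (context cache, runtime overhead, per-block quantization metadata), so treat the numbers as a lower bound. The function name is made up for illustration.

```python
def approx_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Assumption: every weight takes exactly `bits_per_weight` bits.
    Ignores context cache, activations, and quantization metadata.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Nominal sizes taken from the model names, not exact parameter counts.
    for size in (7, 13, 30, 65):
        q4 = approx_weight_gib(size, 4)     # 4-bit quantized
        fp16 = approx_weight_gib(size, 16)  # unquantized half precision
        print(f"{size}B: ~{q4:.1f} GiB at 4-bit, ~{fp16:.1f} GiB at 16-bit")
```

By this estimate a 4-bit 7B model needs only a few GiB for its weights, while 65B needs roughly 30 GiB even at 4-bit, which is why the larger sizes are out of reach for typical desktops.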
My plan was to use https://github.com/cocktailpeanut/dalai with the Alpaca model, then somehow use LlamaIndex to feed in my dataset - a Slack export. But it's not too clear how to train the Alpaca model.
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction: {instruction}
### Response:
or
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction: {instruction}
### Input: {input}
### Response:
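As a minimal sketch, the two templates above can be filled in with a small Python helper. The exact whitespace is an assumption here: the text above shows each header and its placeholder on one line, while this sketch puts the placeholder on the line after the header and separates sections with a blank line. Adjust it to match whatever the model was actually fine-tuned on. The names `build_prompt`, `PROMPT_NO_INPUT` and `PROMPT_WITH_INPUT` are made up for illustration.

```python
# Template for instructions that need no extra context.
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

# Template for instructions paired with an input that gives further context.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Pick the template based on whether any input text was supplied."""
    if input_text.strip():
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return PROMPT_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    print(build_prompt("Summarize the thread.", "alice: hi\nbob: hello"))
```

The model's completion is whatever it generates after the final `### Response:` header, so the prompt deliberately ends there.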
I got Alpaca 7B Q4 running almost instantly because they provided curl commands to download it. Super simple. But it seems most projects aren't doing that because it's likely to attract Facebook's attention. So... what's recommended?
I happened to find this[2], but I think those are the raw, non-quantized models? Not sure yet.
[1]: Won't bother with 65B; it can't fit in memory, I believe?
[2]: https://github.com/shawwn/llama-dl/blob/main/llama.sh
Edit: I forgot about https://github.com/cocktailpeanut/dalai - I suspect this is best in breed at the moment? Though a Docker container would be nice to wrangle all the dependencies.