What does HackerNews think of qlora?
QLoRA: Efficient Finetuning of Quantized LLMs
Language: Jupyter Notebook
Take a look at the QLoRA repo https://github.com/artidoro/qlora/ which includes an example of finetuning Llama. It was made by the authors of the QLoRA paper.
I used https://github.com/artidoro/qlora but there are quite a few others that likely work better. It was literally my first attempt at doing anything like this; it took the better part of an evening to work through CUDA/Python issues to get it training, plus ~20 hours of training.
How much GPU memory do you have access to? If you can run it, Guanaco-65B (https://github.com/artidoro/qlora) is probably the best publicly available option. But as other comments mention, it's still noticeably worse in my experience.
Currently the SOTA approach for specializing LLMs is QLoRA: https://github.com/artidoro/qlora
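The comments above all point at the same core idea: QLoRA freezes the base model's weights in a 4-bit quantized form and trains only a small low-rank adapter on top. The toy sketch below illustrates that idea in plain Python; the function names (`quantize`, `effective_weight`), the per-tensor scale, and the tiny 2x2 matrices are all hypothetical simplifications for illustration, not the qlora repo's actual API (which builds on bitsandbytes NF4 quantization and Hugging Face PEFT).

```python
# Toy illustration of the QLoRA idea (hypothetical code, NOT the qlora
# repo's API): the base weight is frozen in a low-bit quantized form,
# and only a small low-rank delta (B @ A) is trainable.

def quantize(w, scale=0.1):
    """Round each weight to a 4-bit signed grid (-8..7) with one shared scale."""
    q = [[max(-8, min(7, round(x / scale))) for x in row] for row in w]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized grid."""
    return [[x * scale for x in row] for row in q]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def effective_weight(q, scale, A, B, alpha=1.0):
    """Frozen dequantized base plus the trainable low-rank update alpha * (B @ A)."""
    base = dequantize(q, scale)
    delta = matmul(B, A)  # (out x r) @ (r x in) -> full-size update
    return [[base[i][j] + alpha * delta[i][j]
             for j in range(len(base[0]))] for i in range(len(base))]

# Toy 2x2 weight with a rank-1 adapter.
W = [[0.30, -0.12], [0.06, 0.21]]
q, s = quantize(W)
A = [[0.1, 0.2]]        # r x in_features, randomly initialized in practice
B = [[0.0], [0.0]]      # out_features x r, starts at zero as in LoRA
W_eff = effective_weight(q, s, A, B)
# With B all-zero, the effective weight is just the dequantized base,
# so training starts from (a quantized approximation of) the original model.
```

The memory win comes from the asymmetry: the full-size matrix is stored in 4 bits and never updated, while gradients flow only through the tiny `A` and `B` factors.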