What does HackerNews think of LoRA?
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
W = W0 + B A
Where W0 is the trained model's weights, which are kept frozen, and A and B are two much smaller matrices whose product BA is a low-rank correction (rank r = 4, say).
It has been shown (as mentioned in the LoRA paper) that fine-tuning for specific tasks results in low-rank corrections, which is what this is all about. I think LoRA training can be done locally.
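To make the equation concrete, here is a minimal PyTorch sketch of the idea (my own illustration, not the actual loralib code): the pretrained weight W0 stays frozen inside an ordinary nn.Linear, and only the two small factors A and B are trained.

    # Minimal sketch of W = W0 + BA: freeze W0, train only the low-rank factors.
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 1.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():   # freeze the pretrained W0 (and bias)
                p.requires_grad = False
            # Low-rank factors: A is (r x in_features), B is (out_features x r).
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # BA starts at zero
            self.scale = alpha / r

        def forward(self, x):
            # Frozen path plus the low-rank correction (BA)x.
            return self.base(x) + (x @ self.A.t() @ self.B.t()) * self.scale

    # Only A and B receive gradients; W0 stays untouched.
    layer = LoRALinear(nn.Linear(4096, 4096), r=4)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(trainable)  # 2 * 4096 * 4 = 32768, vs ~16.8M frozen weights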
That's not what's happening in the parent comment. They're talking about projects like
https://github.com/ZrrSkywalker/LLaMA-Adapter
https://github.com/microsoft/LoRA
https://github.com/tloen/alpaca-lora
and specifically the paper: https://arxiv.org/pdf/2106.09685.pdf
LoRA is just a way to re-train a network with less effort. Before, we had to fiddle with all the weights; with LoRA we're only touching roughly 1 in every 10,000 weights.
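That 1-in-10,000 figure is roughly the reduction the LoRA paper reports for GPT-3. A back-of-the-envelope check, assuming the paper's setup of rank r = 4 on the query and value projections and GPT-3's published dimensions (96 layers, hidden size 12288):

    # Rough trainable-parameter count for LoRA on GPT-3 (r = 4 on q and v projections).
    d_model = 12288        # GPT-3 hidden size
    n_layers = 96          # GPT-3 transformer layers
    r = 4                  # LoRA rank
    total_params = 175e9   # full GPT-3 parameter count

    # Each adapted projection adds two factors: A (r x d_model) and B (d_model x r).
    lora_params = n_layers * 2 * (2 * d_model * r)   # ~18.9M trainable parameters
    print(total_params / lora_params)                # ~9,000x fewer, i.e. roughly 1 in 10,000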
The parent comment says GPT4All doesn't give us a way to train the full-size LLaMA model using the new LoRA technique; we'll have to build that ourselves. But it does give us a huge and very clean dataset to work with, which will aid the quest to create an open-source ChatGPT killer.
With the model's weights openly available, people can do interesting generative stuff. However, it's still hard to train the model to do new things: training large language models is famously expensive because of both their raw size and their structure. Enter...
LoRA is a "low rank adaptation" technique for training large language models, fairly recently published by Microsoft (https://github.com/microsoft/LoRA). In brief, the technique assumes that fine-tuning a model really just involves tweaks to the model parameters that are "small" in some sense, and through math this algorithm confines the fine-tuning to just the small adjustment weights. Rather than asking an ordinary person to re-train 7 billion or 11 billion or 65 billion parameters, LoRA lets users fine-tune a model with about three orders of magnitude fewer adjustment parameters.
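In practice that looks like swapping selected layers for their loralib counterparts and freezing everything else. A rough usage sketch along the lines of the microsoft/LoRA README (the layer sizes below are placeholders, not any particular model's dimensions):

    # Rough loralib usage sketch; sizes are placeholders.
    import torch
    import torch.nn as nn
    import loralib as lora

    model = nn.Sequential(
        lora.Linear(768, 768, r=8),   # LoRA-augmented layer: W0 frozen, BA trained
        nn.ReLU(),
        nn.Linear(768, 10),           # ordinary layer, frozen by the call below
    )

    # Freeze everything except the LoRA factors before training.
    lora.mark_only_lora_as_trainable(model)

    # ... fine-tune as usual ...

    # Only the small LoRA weights need to be saved and shared.
    torch.save(lora.lora_state_dict(model), "lora_checkpoint.pt")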
Combine these two – publicly available language model weights and a way to fine-tune them – and you get work like the story here, where the language model is turned into something a lot like ChatGPT that can run on a consumer-grade laptop.