His estimate is that you could train a LLaMA-7B-scale model for around $82,432 and then fine-tune it, for a total of less than $85K. But the fine-tuned LLaMA-like models I've seen were, in my opinion, worse even than GPT-3; more like a GPT-2.5. Not nearly as good as GPT-3.5, and certainly not ChatGPT-beating. Of course, far enough into the future you could certainly run one in the browser for $85K or much less, maybe even $1 if you go far enough out.
Yeah, the constant barrage of "THIS IS AS GOOD AS CHATGPT AND IS PRIVATE" screeds from LLaMA-based marketing projects is getting ridiculous. They're not even remotely close to the same quality. And why would they be?
I want the best LLMs to be open source too, but I'm not delusional enough to make the kind of insane claims the hundreds of GitHub forks out there are making.
> I want the best LLMs to be open source too
How do you do this without being incredibly wealthy?
Pooling resources à la SETI@home would be an interesting option that I'd love to see.
https://github.com/bigscience-workshop/petals is a project that does this kind of thing for running inference. I tried it out in Google Colab and it seemed to work pretty well.
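For anyone curious, here's a minimal sketch of what using Petals looks like, based on the project's README at the time. The exact model name and API may have changed since, so treat it as illustrative rather than definitive:

```python
# Distributed inference over the public Petals swarm.
# The model's layers are served by volunteer machines; the client
# runs only the embeddings locally and sends activations over the network.
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"  # public swarm checkpoint per the README

tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))
```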
Model training is much harder, though, because it requires a huge amount of high-bandwidth data exchange between the machines doing the training, way more than is feasible to send over anything other than a local network connection.
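A rough back-of-envelope shows why. All the numbers below are illustrative assumptions (naive data-parallel training with fp16 gradients), not measurements:

```python
# Why gradient exchange over home internet links is infeasible:
# every optimizer step has to move a gradient for every parameter.
params = 7e9                 # LLaMA-7B parameter count
grad_bytes = params * 2      # fp16 gradients: ~14 GB per step (naive all-reduce)

home_bw = 1e9 / 8            # 1 Gbps home connection, in bytes/s
cluster_bw = 400e9 / 8       # 400 Gbps InfiniBand in a datacenter, in bytes/s

print(f"gradient payload per step: {grad_bytes / 1e9:.0f} GB")
print(f"over 1 Gbps internet:      {grad_bytes / home_bw:.0f} s/step")
print(f"over 400 Gbps InfiniBand:  {grad_bytes / cluster_bw:.2f} s/step")
```

That works out to roughly two minutes per step just for communication over a home link, versus a fraction of a second on a datacenter interconnect, and a training run needs hundreds of thousands of steps. Gradient compression schemes can shrink the payload, but the gap to a local network is enormous.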