His estimate is that you could train a LLaMA-7B-scale model for around $82,432 and then fine-tune it, for a total of less than $85K. But the fine-tuned LLaMA-like models I've seen were, in my opinion, worse than even GPT-3; more like GPT-2.5. Not nearly as good as ChatGPT 3.5, and certainly not ChatGPT-beating. Of course, far enough into the future you could certainly run one in the browser for $85K or much less, maybe even $1.
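For what it's worth, a number in that ballpark does fall out of a rough back-of-envelope. The token count, utilization, and GPU price below are my own assumptions, not his actual math:

```python
# Rough back-of-envelope for pretraining a 7B-parameter model.
# Every input here is an assumption, not the linked author's figures.
params = 7e9                       # LLaMA-7B-scale
tokens = 1e12                      # ~1T training tokens
train_flops = 6 * params * tokens  # standard 6*N*D estimate of training FLOPs

a100_peak = 312e12                 # A100 bf16 peak FLOP/s
utilization = 0.5                  # assume ~50% of peak in practice
gpu_hours = train_flops / (a100_peak * utilization) / 3600

price = 1.10                       # assumed $/GPU-hour for a discounted cloud A100
print(f"{gpu_hours:,.0f} GPU-hours, ~${gpu_hours * price:,.0f}")
# -> ~75,000 GPU-hours, ~$82,000: same ballpark as the estimate above
```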

Yeah, the constant barrage of "THIS IS AS GOOD AS CHATGPT AND IS PRIVATE" screeds from LLaMA-based marketing projects is getting ridiculous. They're not even remotely close to the same quality. And why would they be?

I want the best LLMs to be open source too, but I'm not delusional enough to make the kind of insane claims that hundreds of GitHub forks out there are making.

> I want the best LLMs to be open source too

How do you do this without being incredibly wealthy?

Pooling resources a la SETI@home would be an interesting option I would love to see.

My understanding is that this can work for model inference, but not for model training.

https://github.com/bigscience-workshop/petals is a project that does this kind of thing for running inference. I tried it out in Google Colab and it seemed to work pretty well.
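For anyone curious, the client side is only a few lines. This is roughly what I ran in Colab, from memory of the Petals README, so treat the class and model names as approximate and check the repo for the current API:

```python
# Minimal Petals client sketch. Class and model names are approximate;
# see the repo README for the current API and supported models.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "bigscience/bloom-petals"   # example; supported models change over time
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The transformer blocks run on volunteer machines in the swarm;
# only the embeddings and LM head run locally.
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Distributed inference over the internet", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```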

Model training is much harder though, because it requires a HUGE amount of high-bandwidth data exchange between the machines doing the training - way more than is feasible to send over anything other than a local network connection.
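To put rough numbers on that: naive data-parallel training has to exchange something on the order of the full gradient every optimizer step. This is a crude sketch with assumed figures (real systems compress and overlap communication, but the orders of magnitude are the point):

```python
# Crude estimate of per-step gradient traffic for naive data-parallel
# training of a 7B-parameter model. Assumed numbers, for scale only.
params = 7e9
grad_bytes = params * 2                 # fp16 gradients: ~14 GB exchanged per step

home_uplink = 100e6 / 8                 # 100 Mbit/s home connection, in bytes/s
cluster_link = 100e9 / 8                # 100 Gbit/s datacenter NIC, in bytes/s

print(f"gradient exchange per step: {grad_bytes / 1e9:.0f} GB")
print(f"over home broadband: {grad_bytes / home_uplink / 60:.0f} minutes/step")
print(f"over a datacenter link: {grad_bytes / cluster_link:.1f} seconds/step")
# Hundreds of thousands of steps at ~19 minutes each is a non-starter,
# which is why this only works inside a fast local network.
```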