This class of startup, "build domain-specific LLMs using your own data", is extremely crowded right now, but I am not optimistic about their future. For large companies, the actual modeling work is already easy for any ML team, thanks to existing FOSS work on things like PEFT and LoRA. The hard part is figuring out what data goes into the fine-tuning process and how to get that data into a usable form, but this is very business-specific and can't be automated as a SaaS offering.

For SMBs, the value would be in using the LLM to generate responses to customer Q&A/search queries. But these companies aren't going to integrate some external third party service, they'll only use it if it's already baked into their CMS - Wordpress/Shopify/Wix/etc. I just don't see who the final consumer for this product would be.

> thanks to existing FOSS work on stuff like PEFT and LoRA

YMMV. Sometimes a LoRA is fine, but sometimes a full fine-tune is necessary for higher-quality output.

That being said, backward-pass-free training keeps making more and more progress. It seems like only a matter of time before it becomes practical.

Look at QLoRA. Its adapters can be attached to all linear layers, not just the attention projections the original LoRA recipes typically targeted, letting you alter behavior with much less data. It seems to "stick" better.
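The "all layers vs. attention-only" difference can be made concrete with a back-of-the-envelope count of trainable adapter parameters. A minimal sketch, assuming LLaMA-30B-class dimensions (hidden size 6656, FFN size 17920, 60 layers) and rank 64 — these specific numbers are illustrative assumptions, not from the thread:

```python
# Back-of-the-envelope LoRA trainable-parameter counts.
# Assumed LLaMA-30B-class dimensions (illustrative, not authoritative).
HIDDEN = 6656    # model width
FFN = 17920      # feed-forward intermediate size
LAYERS = 60
RANK = 64        # LoRA rank r

def lora_params(d_in, d_out, r=RANK):
    """A LoRA adapter adds two low-rank matrices: (d_in x r) and (r x d_out)."""
    return r * (d_in + d_out)

# Attention-only (common original-LoRA practice): q, k, v, o projections, each d x d.
attn_only = LAYERS * 4 * lora_params(HIDDEN, HIDDEN)

# All linear layers (QLoRA-style): attention plus gate/up/down MLP projections.
mlp = 2 * lora_params(HIDDEN, FFN) + lora_params(FFN, HIDDEN)
all_linear = attn_only + LAYERS * mlp

print(f"attention-only adapters: {attn_only / 1e6:.1f}M trainable params")
print(f"all-linear adapters:     {all_linear / 1e6:.1f}M trainable params")
```

Either way the trainable fraction is well under 2% of a 30B model, which is why the adapters can be trained on commodity GPUs; attaching them everywhere roughly doubles the adapter capacity for the same rank.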

I just fine-tuned a ~30B-parameter model on my 2x 3090s to check it out. It worked fantastically. I should be able to fine-tune up to 65B-parameter models locally, but I wanted to get my dataset right on a smaller model before trying.
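A rough VRAM estimate shows why this fits. A sketch assuming 4-bit NF4 quantization of the base weights (~0.5 bytes per parameter, as in QLoRA) and ignoring the few extra GB for adapters, optimizer states, and activations:

```python
# Rough QLoRA VRAM estimate (assumed ~0.5 bytes/param for 4-bit NF4 weights).
BYTES_PER_PARAM_4BIT = 0.5

def base_weight_gb(n_params):
    """Memory for the frozen, quantized base model weights alone."""
    return n_params * BYTES_PER_PARAM_4BIT / 1e9

gb_30b = base_weight_gb(30e9)   # ~15 GB: splits across 2x 24 GB 3090s
gb_65b = base_weight_gb(65e9)   # ~32.5 GB: still under 48 GB total

print(f"30B base weights in 4-bit: {gb_30b:.1f} GB")
print(f"65B base weights in 4-bit: {gb_65b:.1f} GB")
```

The adapters themselves and their optimizer states are small (hundreds of millions of parameters at most), so the quantized base weights dominate, which is consistent with a 65B model being within reach on 48 GB of combined VRAM.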

Are there any repos or step-by-step guides you can point me to for this? I'd love to try to do exactly what you describe. I have been trying to do the same and have run into a lot of repos with broken dependencies.

I used: https://github.com/artidoro/qlora but there are quite a few others that likely work better. It was literally my first attempt at doing anything like this, and took the better part of an evening to work through CUDA/Python issues to get it training, and ~20 hours of training.