In terms of building something usable (considering cost, speed, scale, etc.), if I compare an OpenAI API call to these models, it's difficult for me to see a current path where they have any viable application outside of niche scenarios.
From what I understand, even to run these locally you or your team needs to be able to afford a machine with a 4090, and those are extremely expensive in some countries.
I played around with the smaller Llama/Alpaca models, and they weren't really viable to build anything with.
I'm not really seeing a use case for fine-tuning either, compared to just few-shot prompting (rough sketch of what I mean below).
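(By few-shot prompting I mean something like this: a couple of inline examples in the prompt standing in for any fine-tuning. The model name and the example data here are just placeholders, not anything specific.)

    # Rough sketch of few-shot prompting via the OpenAI Python client:
    # inline examples in the message list instead of a fine-tuned model.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model
        messages=[
            {"role": "system",
             "content": "Classify each review as positive or negative."},
            # Few-shot examples, in place of fine-tuning:
            {"role": "user", "content": "The battery died after two days."},
            {"role": "assistant", "content": "negative"},
            {"role": "user", "content": "Setup took thirty seconds and it just works."},
            {"role": "assistant", "content": "positive"},
            # The actual input:
            {"role": "user", "content": "Shipping was slow but the product is great."},
        ],
    )
    print(response.choices[0].message.content)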
Can someone fill me in on what I'm missing? It feels like I'm out of the loop.
So... not exactly a serious use case. But it's what I'm using, and now I'm saving tens of dollars in inference costs per month!
[0] https://github.com/go-skynet/LocalAI
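For anyone wondering how the swap works: LocalAI exposes an OpenAI-compatible API, so you mostly just repoint the client at your local server. A rough sketch of my setup; the port and model name depend entirely on your LocalAI config:

    # Pointing the official OpenAI Python client at a LocalAI server
    # instead of api.openai.com. Base URL and model name are assumptions
    # from my own config -- adjust to match yours.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8080/v1",  # LocalAI's default port is 8080
        api_key="sk-local",  # LocalAI doesn't require a real key by default
    )

    resp = client.chat.completions.create(
        model="ggml-gpt4all-j",  # placeholder: whatever your LocalAI config exposes
        messages=[{"role": "user", "content": "Write a one-line summary of this log: ..."}],
    )
    print(resp.choices[0].message.content)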
I'm also using this for acceleration: https://cloudmarketplace.oracle.com/marketplace/en_US/adf.ta...