Is this an indication that the biggest impact from LLMs will be on the edge?

It's almost a certainty that a model as good as (or better than) Alpaca's fine-tuned LLaMA 7B will be made public within the next month or two.

And it's been shown that a model of that size can run on a Raspberry Pi with decent performance and accuracy.

With all that being the case, you could either use a service (with restrictions, censorship, etc.) or run your own model locally (under a license that essentially says "pretty please be good, we're not liable if you're bad").

For most use cases the service may provide better results. But if self-hosting is only ~8 months behind on average (a guesstimate), why not just always self-host?

You could say "most users are not evil, and will be happy with a service." Makes sense. But what about users who are privacy-conscious, and don't want every query sent to a service?

I just saw a project that lets you input an entire repo into GPT. Coincidentally, my place of employment just told us not to input any proprietary code into any generator with a retention policy.

Even then, I feel like the play will be an enterprise service instead of licensing.

If it's the product I think it is (I don't recall the exact name), it's not putting the repo into GPT. It's calculating embeddings for the code in the repo, storing those in a vector DB, and pulling context from that store when answering questions about the repo. Effectively, "how does foo work?" becomes: 1. look up code items related to foo, retrieving 1-N snippets of code; 2. ask GPT "here is code related to foo. Now answer the following question: how does foo work?"
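
Roughly, the flow looks like the sketch below. This is not that product's actual code, just a minimal illustration: embed() is a toy hashing-trick stand-in for a real embedding model, and the chunking and prompt format are assumptions.

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in embedding: hash character trigrams into a unit vector.
    A real system would call an embedding model here instead."""
    v = np.zeros(dim)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

# Index the repo: embed each code chunk, keep (vector, chunk) pairs.
repo_chunks = [
    "def foo(x):\n    return bar(x) + 1",
    "def bar(x):\n    return x * 2",
    "def unrelated():\n    pass",
]
index = [(embed(chunk), chunk) for chunk in repo_chunks]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Step 1: look up the k chunks most similar to the question
    (dot product of unit vectors = cosine similarity)."""
    q = embed(question)
    scored = sorted(index, key=lambda pair: -float(q @ pair[0]))
    return [chunk for _, chunk in scored[:k]]

# Step 2: stuff the retrieved chunks into the prompt sent to the LLM.
question = "how does foo work"
context = "\n\n".join(retrieve(question))
prompt = f"Here is code related to foo:\n{context}\n\nNow answer the following question: {question}"
print(prompt)
```

The point is that only the handful of retrieved snippets (plus the question) ever reach the model, not the whole repo.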

I think we’re talking about different projects. This one just gives you a text output of an entire repo.

https://github.com/mpoon/gpt-repository-loader