What does HackerNews think of LocalAI?

🤖 Self-hosted, community-driven, local OpenAI-compatible API. Drop-in replacement for OpenAI running LLMs on consumer-grade hardware. Free, open-source OpenAI alternative. No GPU required. LocalAI is an API to run ggml-compatible models: llama, gpt4all, rwkv, whisper, vicuna, koala, gpt4all-j, cerebras, falcon, dolly, starcoder, and many others.

Language: Go

#4 in API
#7 in Kubernetes
At the moment, only those that support the OpenAI Chat API, with function calling for structured outputs. For example, you can use LocalAI[0][1] to run models locally.

[0] https://github.com/go-skynet/LocalAI

[1] https://localai.io/features/openai-functions/
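
For reference, a minimal sketch of what such a function-calling request looks like against LocalAI's OpenAI-compatible chat endpoint. It assumes a LocalAI instance on localhost:8080 exposing a model named "gpt-3.5-turbo"; the model name, the weather function, and the port are all illustrative, while the request schema is the OpenAI Chat API's:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Request body follows the OpenAI Chat API function-calling schema.
	body, err := json.Marshal(map[string]any{
		"model": "gpt-3.5-turbo", // whatever model name your LocalAI exposes
		"messages": []map[string]string{
			{"role": "user", "content": "What's the weather in Boston?"},
		},
		"functions": []map[string]any{{
			"name":        "get_weather",
			"description": "Get the current weather for a city",
			"parameters": map[string]any{
				"type": "object",
				"properties": map[string]any{
					"city": map[string]string{"type": "string"},
				},
				"required": []string{"city"},
			},
		}},
	})
	if err != nil {
		panic(err)
	}

	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The model's structured output arrives as a function_call in the response.
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```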

There's also LocalAI[0], which allows the use of local LLMs with an OpenAI-compatible API.

[0] https://github.com/go-skynet/LocalAI

ChainForge has similar functionality for comparing models: https://github.com/ianarawjo/ChainForge

LocalAI creates a GPT-compatible HTTP API for local LLMs: https://github.com/go-skynet/LocalAI

Is it necessary to have an HTTP API for each model in a comparative study?

Enterprises can bring up https://github.com/go-skynet/LocalAI, run Llama or other models, and connect to them from their Promptly LLM apps.

So they spin up GPU instances, host whatever model in their VPC, and it connects to your SaaS stack? What are they paying you for in this scenario?

LocalAI seems like an interesting project that allows self-hosted models to be easily called over the network, with an OpenAI-compatible interface: https://github.com/go-skynet/LocalAI

Did you mean user interface or API?

Anyway, for a UI you could look at Chainlit; for an API, some of the models are already getting wrapped up in an OpenAI-compatible REST interface.

See https://github.com/go-skynet/LocalAI
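
As a quick illustration of that REST interface, here's a sketch that lists the installed models through the OpenAI-compatible /v1/models endpoint (assuming a LocalAI instance on localhost:8080):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Same endpoint shape as OpenAI's model listing, served locally.
	resp, err := http.Get("http://localhost:8080/v1/models")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // JSON list of the model names LocalAI exposes
}
```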

You can't trust any API to not close down. Your only safe bet is relying on local models or relying on nothing at all.

My bot calls LocalAI, which uses the same API as OpenAI does: https://github.com/go-skynet/LocalAI

Porting an app over to its API only requires you to change the endpoint and the name of the model you're calling. It uses the same API schema as OpenAI and works like a charm. No need to mope about a lack of alternatives; vote with your feet and move.
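
To make that concrete, here's a sketch of such a port using the community github.com/sashabaranov/go-openai client. The base URL and model name shown are assumptions for a local LocalAI instance, and they're the only two things that change relative to calling OpenAI:

```go
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	// The only OpenAI-specific parts to change: the base URL and the model.
	cfg := openai.DefaultConfig("sk-ignored") // LocalAI doesn't check the key by default
	cfg.BaseURL = "http://localhost:8080/v1"  // was https://api.openai.com/v1
	client := openai.NewClientWithConfig(cfg)

	resp, err := client.CreateChatCompletion(context.Background(),
		openai.ChatCompletionRequest{
			Model: "ggml-gpt4all-j", // was e.g. openai.GPT3Dot5Turbo
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "Hello!"},
			},
		})
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```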

> It's a sad reality because I so desperately want there to be competition

AI doesn't move fast. The hype moves fast; training models and raising capital don't.

I'm running Vicuna on a free 4-core Oracle VPS, and it's perfectly usable for a Discord bot. Responses rarely take more than 15 seconds with a <256 max-token limit, and the responses are much more entertaining than GPT-3.5's. I'm not using the streaming API my server software[0] offers, but if I did, it would probably land somewhere between the speeds of GPT-3.5 and GPT-4. It's more or less the same time a human would take to compose the same message.

So... not exactly a serious use case. But it's what I'm using, and now I'm saving tens of dollars on inference costs per month!

[0] https://github.com/go-skynet/LocalAI

I'm also using this to improve acceleration - https://cloudmarketplace.oracle.com/marketplace/en_US/adf.ta...
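
For anyone curious about the streaming API mentioned above, here's a rough sketch of consuming it. It assumes a LocalAI instance on localhost:8080 with a model named "vicuna", and follows the OpenAI-style server-sent-events format (one `data:` chunk per token):

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	body := []byte(`{
		"model": "vicuna",
		"messages": [{"role": "user", "content": "Tell me a joke"}],
		"max_tokens": 256,
		"stream": true
	}`)

	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Read the response as server-sent events, line by line.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "data: ") {
			continue // skip blank separator lines between events
		}
		payload := strings.TrimPrefix(line, "data: ")
		if payload == "[DONE]" {
			break // stream finished
		}
		fmt.Println(payload) // each chunk carries a delta with the next token(s)
	}
}
```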

In our experimentation, we've found that it really depends on what you're looking for. That is, you really need to break down evaluation by task. Local models don't yet have the power to just "do it all well" like GPT-4.

There are open-source models that are fine-tuned for different tasks, and if you're able to pick a specific model for a specific use case, you'll get better results.

---

For example, there are models like `mpt-7b-chat`, `GPT4All-13B-snoozy`, or `vicuna` that do okay for chat but are not great at reasoning or code.

Other models, like `mpt-7b-instruct`, are designed for direct instruction following but are worse at chat.

Meanwhile, there are models designed for code completion, like those from Replit and HuggingFace (`starcoder`), that do decently for programming but not other tasks.

---

For a UI, the easiest way to get a feel for the quality of each of the models (or the chat models, at least) is probably https://gpt4all.io/.

And as others have mentioned, for providing an API that's compatible with OpenAI, https://github.com/go-skynet/LocalAI seems to be the frontrunner at the moment.

---

For the project I'm working on (in bio), we're currently struggling with this problem too, since we want a nice UI, good performance, and the ability for people to keep their data local.

So at least for the moment, there's no single drop-in replacement for all tasks. But things are changing every week and every day, and I believe that open-source and local can be competitive in the end.

The answer to this question changes every week.

For compatibility with the OpenAI API one project to consider is https://github.com/go-skynet/LocalAI

None of the open models are close to GPT-4 yet, but some of the LLaMA derivatives feel similar to GPT-3.5.

Licenses are a big question though: if you want something you can use for commercial purposes your options are much more limited.

I'm grateful that computer interfaces are not copyrightable[0], so I can use projects like LocalAI[1] to drop-in replace ChatGPT API calls wherever I want. OpenAI and their current business model can rot, for all I care.

[0] https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_Inc.

[1] https://github.com/go-skynet/LocalAI

LocalAI is the OpenAI-compatible API that lets you run AI models locally on your own CPU! Data never leaves your machine! No need for expensive cloud services or GPUs: LocalAI uses llama.cpp and ggml to power your AI projects!

LocalAI supports multiple model backends (such as Alpaca, Cerebras, GPT4All-J, and StableLM) and works seamlessly with the OpenAI API. Join the LocalAI community today and unleash your creativity!
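
For context on how those backends are wired up: models in LocalAI are declared with small YAML files in the models directory. The sketch below is illustrative and written from memory of the docs, so field names and values should be checked against the documentation for your version:

```yaml
# models/gpt-3.5-turbo.yaml (illustrative example)
# "name" is what clients pass as the "model" field of API requests;
# "parameters.model" points at the ggml weights file on disk.
name: gpt-3.5-turbo
backend: llama
context_size: 1024
parameters:
  model: ggml-vicuna-7b.bin
  temperature: 0.2
```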

GitHub: https://github.com/go-skynet/LocalAI

We are also on discord! Feel free to join our growing community!

https://discord.gg/uJAeKSAGDy