I added a Python library API to my LLM CLI tool recently which offers a very lightweight way to call models: https://llm.datasette.io/en/stable/python-api.html
import llm
model = llm.get_model("gpt-3.5-turbo")
model.key = 'YOUR_API_KEY_HERE'
response = model.prompt(
"Five surprising names for a pet pelican"
)
print(response.text())
Or you can stream the responses like this:

response = model.prompt(
    "Five diabolical names for a pet goat"
)
for chunk in response:
    print(chunk, end="")
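The response object collects the chunks as it streams them, so you can still call response.text() at the end to get the full string - a small sketch of the same pattern:

for chunk in response:
    print(chunk, end="")
# Chunks were accumulated during iteration, so this returns the full text
print(response.text())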
It works with other models too, installed via plugins - including models that can run directly on your machine:

pip install llm-gpt4all
Then:

model = llm.get_model("ggml-vicuna-7b-1")
print(model.prompt(
    "What is the capital of France?"
).text())
It also handles conversations, where each prompt needs to include the previous context of the conversation:

c = model.conversation()
print(c.prompt("Capital of France?").text())
print(c.prompt("what language do they speak?").text())
I wrote more about the new plugin system for adding extra models here: https://simonwillison.net/2023/Jul/12/llm/

Nice. Now all we need is a vector database atop SQLite.
I had a go at one of those a few months ago: https://datasette.io/plugins/datasette-faiss
Alex Garcia built a better one, a SQLite Rust extension: https://github.com/asg017/sqlite-vss
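Here's a rough sketch of what using it from Python looks like - assuming you install the bundled loadable extension with pip install sqlite-vss (the table name and vectors here are made up for illustration):

import sqlite3
import sqlite_vss

db = sqlite3.connect(":memory:")
db.enable_load_extension(True)
sqlite_vss.load(db)
db.enable_load_extension(False)

# A virtual table holding 2-dimensional vectors (toy example)
db.execute("create virtual table demo using vss0(embedding(2))")
db.execute("insert into demo(rowid, embedding) values (1, '[1.0, 2.0]')")

# Nearest-neighbour search: find the 1 closest stored vector
row = db.execute(
    "select rowid, distance from demo "
    "where vss_search(embedding, vss_search_params('[1.1, 2.1]', 1))"
).fetchone()
print(row)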