What does HackerNews think of postgresml?
PostgresML is an AI application database. Download open source models from Huggingface, or train your own, to create and index LLM embeddings, generate text, or make online predictions using only SQL.
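For a feel of the interface, here is a minimal sketch of that SQL-only workflow; the model names are illustrative, and the exact pgml.transform signature may vary by PostgresML version:

    -- Generate an embedding with an open source model from Huggingface
    SELECT pgml.embed('intfloat/e5-small', 'PostgresML is an AI application database');

    -- Generate text with an LLM, still inside Postgres
    SELECT pgml.transform(
      task   => '{"task": "text-generation", "model": "tiiuae/falcon-7b-instruct"}'::jsonb,
      inputs => ARRAY['What is PostgresML?']
    );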
I use it a lot for long-running queries when doing data science and machine learning work, often when executing queries from a Jupyter notebook or the CLI. That way, if my Jupyter kernel dies, or the network or my environment has an issue, the query keeps executing. I've started using it a bit more with https://github.com/postgresml/postgresml for model training tasks too, since those can be quite long-running depending on the situation.
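For context, training in PostgresML is itself a single SQL call; a minimal sketch, assuming a hypothetical my_table with a label column:

    -- Train a regression model server-side on an existing table
    SELECT * FROM pgml.train(
      'my_project',        -- project name
      'regression',        -- task
      'my_table',          -- training data relation
      'label',             -- target column
      algorithm => 'xgboost'
    );

    -- Make online predictions with the deployed model
    SELECT pgml.predict('my_project', ARRAY[1.0, 2.0, 3.0]);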
We've extended Postgres w/ open source models from Huggingface, as well as vector search and classical ML algorithms, so that everything can happen in the same process. It's significantly faster and cheaper, which leaves a large latency budget available to expand model and algorithm complexity. In addition, open source models have already surpassed OpenAI's text-embedding-ada-002 in quality, not just speed. [1]
Here is a series of posts explaining how to collapse the complexity of a typical ML-powered application into a single SQL query that runs in a single process, with memory shared between models and feature indexes, including learned embeddings and reranking models:
- Generating LLM embeddings with open source models in the database[2]
- Tuning vector recall [3]
- Personalize embedding results with application data [4]
This allows a single SQL query to accomplish what would normally be an entire application w/ several model services and databases. E.g., a modern chatbot built across various services and databases looks like this (a single-query sketch follows the flow):
-> application sends user input data to embedding service
<- embedding model generates a vector to send back to application
-> application sends vector to vector database
<- vector database returns associated metadata found via ANN
-> application sends metadata for reranking
<- reranking model prunes less helpful context
-> application sends finished prompt w/ context to generative model
<- model produces final output
-> application streams response to user
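With everything in-process, that whole round trip collapses into roughly one query. A minimal sketch, assuming a hypothetical documents table with a pgvector embedding column, illustrative model names, and a hardcoded user input:

    WITH query AS (
      -- embedding model runs in-process: no hop to an embedding service
      SELECT pgml.embed('intfloat/e5-small', 'user input')::vector AS embedding
    ), context AS (
      -- ANN search over the same tables that hold the application data
      SELECT d.body
      FROM documents d, query q
      ORDER BY d.embedding <=> q.embedding
      LIMIT 5
    )
    -- generative model consumes the retrieved context, still in the same process
    SELECT pgml.transform(
      task   => '{"task": "text-generation", "model": "tiiuae/falcon-7b-instruct"}'::jsonb,
      inputs => ARRAY[
        (SELECT string_agg(body, E'\n') FROM context) || E'\n\nUser: user input'
      ]
    );

Reranking could slot in as another CTE between the ANN search and the generation step, per the posts above.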
[1]: https://huggingface.co/spaces/mteb/leaderboard
[2]: https://postgresml.org/blog/generating-llm-embeddings-with-o...
[3]: https://postgresml.org/blog/tuning-vector-recall-while-gener...
[4]: https://postgresml.org/blog/personalize-embedding-vector-sea...
There is a deeper explanation in the README: https://github.com/postgresml/postgresml