Which open-source libraries is Pinecone wrapping?

I’m not sure where the other commenter gets their confidence, but Pinecone is not wrapping any open-source vector-search library. We offer three index types (in-memory, in-memory graph-based, and hybrid memory + disk), and all are proprietary.

We do have articles about Faiss and HNSW and all sorts of other vector-search and NLP topics, so it’s possible that’s where the confusion comes from.

So, how does your proprietary solution compare against Faiss, e.g. with 10M dense vectors of 1024 dimensions?

They used to publish some benchmarks on their site, but seem to have removed them. You can find them on archive.org[1]. I guess it is understandable, since vector-search performance is hard to predict and depends on many factors (dataset, dimensionality, index parameters, target recall). If their target market is people who want vector search without needing to read a bunch of papers first, benchmarks might be more confusing than helpful.

Edit: While I do think removing the benchmarks is understandable, it's not great for transparency. Even if they don't want to open-source their index, I would admire it if they were willing to give ann-benchmarks[2] an API key to publish some independent results.
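
For what it's worth, the core measurement in that kind of benchmark is recall against an exact brute-force search, traded off against query throughput. A minimal sketch in pure Python, where the toy data, the sizes, and the "search only half the data" stand-in for an approximate index are all assumptions for illustration, not how Pinecone or ann-benchmarks actually does it:

```python
import random


def exact_top_k(query, vectors, k):
    """Brute-force nearest neighbours by squared Euclidean distance."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(query, v))
    return sorted(range(len(vectors)), key=lambda i: dist(vectors[i]))[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
dim, n, k = 8, 200, 10  # toy sizes, nowhere near 10M x 1024
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
query = [random.gauss(0, 1) for _ in range(dim)]

# Ground truth: exact search over every vector.
truth = exact_top_k(query, vectors, k)

# Hypothetical stand-in for an approximate index: only look at every
# second vector, then map the local positions back to global ids.
subset = list(range(0, n, 2))
local_hits = exact_top_k(query, [vectors[j] for j in subset], k)
approx = [subset[i] for i in local_hits]

print(f"recall@{k} = {recall_at_k(approx, truth):.2f}")
```

A real comparison would repeat this over many queries and report recall alongside queries per second, since either number alone is easy to game.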

Disclaimer: I work on vector search at a different company.

[1] https://web.archive.org/web/20210227105542/https://www.pinec...
[2] https://github.com/erikbern/ann-benchmarks