What does HackerNews think of hnswlib?

Faiss: A library for efficient similarity search | Mar 2023

hnswlib (https://github.com/nmslib/hnswlib) is a strong alternative to faiss that I have enjoyed using for multiple projects. It is simple and has great performance on CPU.

After working through several projects that utilized local hnswlib and different databases for text and vector persistence, I integrated hnswlib with sqlite to create an embedded vector search engine that can easily scale up to millions of embeddings. For self-hosted situations of under 10M embeddings and less than insane throughput I think this combo is hard to beat.

https://github.com/jiggy-ai/hnsqlite

Storing OpenAI embeddings in Postgres with pgvector | Feb 2023

https://github.com/nmslib/hnswlib

Used it to index 40M text snippets in the legal domain. Allows incremental adding.

I love how it just works. You know, it doesn't ANNOY me or make a FAISS. ;-)

Building a semantic search engine in Rust | Nov 2022

hnswlib is written in C++ and has Python bindings (you should be able to make your own for other languages). Faiss and Annoy (by Spotify) should also provide similar functionality.

https://github.com/nmslib/hnswlib

Find anything fast with Google's vector search technology | Dec 2021

hnswlib[1] allows for incremental updates. And I believe in terms of accuracy it stacks up fairly well against alternatives like FAISS or ScaNN.

[1]: https://github.com/nmslib/hnswlib/

Find anything fast with Google's vector search technology | Dec 2021

There's also hnswlib[1], which supposedly has lower memory requirements and allows adding new vectors to an existing index.

[1]: https://github.com/nmslib/hnswlib/

Introduction to Locality-Sensitive Hashing | Jun 2021

great post and my favorite CS topic. LSH is particularly relevant to machine learning because of its use in indexing embeddings, which are now omnipresent in ML (from word2vec to transformers to image and graph embeddings, etc.)

it is now supported in ElasticSearch KNN index (they use HNSWLIB but you can call it a descendant of original LSH in a way)

check out ANN benchmarks [0] for comparison of LSH performance to other state of the art methods like proximity graphs/HNSWLIB [1] and quantization/SCANN [2]

As an introduction, LSH (with MinHash) is also described in detail in the book "Mining of Massive Datasets", ch. 3, "Finding Similar Items", highly recommended [3]

if you want to play with LSH, python "annoy" library is the best place to start [4]

[0] https://github.com/erikbern/ann-benchmarks

[1] https://github.com/nmslib/hnswlib

[2] https://github.com/google-research/google-research/tree/mast...

[3] http://infolab.stanford.edu/~ullman/mmds

[4] https://github.com/spotify/annoy
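
To make the MinHash flavour of LSH a bit more concrete, here is a small standard-library sketch. It is only an illustration under stated assumptions, not code from the book or from any of the libraries above: the function names, the word-shingle size, the 64 hash functions, and the 16-band split are all arbitrary choices. Each document becomes a set of shingles, the signature keeps the minimum hash per seeded hash function, and the bands of the signature act as bucket keys so that similar documents are likely to collide in at least one bucket.

```python
import hashlib


def shingles(text, k=3):
    """Set of overlapping k-word shingles from the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash_signature(items, num_hashes=64):
    """For each seeded hash function, keep the minimum hash over the set."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode(), digest_size=8, salt=salt).digest(), "big"
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def lsh_buckets(signature, bands=16):
    """Split the signature into bands; each (band, values) pair is a bucket key."""
    rows = len(signature) // bands
    return {(b, tuple(signature[b * rows:(b + 1) * rows])) for b in range(bands)}


a = shingles("the quick brown fox jumps over the lazy dog")
b = shingles("the quick brown fox jumps over the lazy cat")
c = shingles("vector search with proximity graphs is fast")

sig_a, sig_b, sig_c = (minhash_signature(s) for s in (a, b, c))

print("a vs b:", estimated_jaccard(sig_a, sig_b))  # true Jaccard is 6/8
print("a vs c:", estimated_jaccard(sig_a, sig_c))  # no shared shingles
print("a and c share a bucket:", bool(lsh_buckets(sig_a) & lsh_buckets(sig_c)))
```

More bands with fewer rows each makes collisions more likely (higher recall, more false candidates); fewer bands with more rows does the opposite.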

EuclidesDB: a multi-model machine learning feature database | Nov 2018

https://github.com/nmslib/hnswlib for anybody else googling this library