Interesting DB for feature storage, and LSH is a good choice I believe. I'm wondering why the tight link to PyTorch C++ tensors (under refactoring, actually), but I haven't looked at the EuclidesDB code yet. Thanks for sharing!

Those interested can also find an open-source integration of LMDB + Annoy here: https://github.com/jolibrain/deepdetect/blob/master/src/sims...

This is the underlying support for similarity search based on embeddings, including image and object similarity search, see https://github.com/jolibrain/deepdetect/tree/master/demo/obj...

This is running for apps such as a Shazam for art, faster annotation tooling and text similarity search.

Annoy only supports indexing once (no items can be added after the index is built), while hnswlib supports incremental indexing, something I'm looking at.
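For anyone wondering what "incremental" buys you in practice, here is a rough sketch in plain Python. It is a toy random-hyperplane LSH index, not Annoy's or hnswlib's API and not how EuclidesDB does it; the class name `HyperplaneLSH` and its parameters (`n_bits`, `n_tables`, `seed`) are made up for illustration. The point is only that an insert touches one bucket per hash table, so new items can arrive at any time without a rebuild.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Toy random-hyperplane LSH index (hypothetical, for illustration only)."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # One independent set of random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.vectors = {}

    def _key(self, planes, vec):
        # The sign of the dot product with each hyperplane gives one bit.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, item_id, vec):
        # Incremental insert: append to one bucket per table, no rebuild.
        self.vectors[item_id] = vec
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(item_id)

    def query(self, vec, k=1):
        # Gather candidates from matching buckets, then rank by exact
        # squared Euclidean distance.
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vec), []))

        def dist(item_id):
            return sum((a - b) ** 2 for a, b in zip(self.vectors[item_id], vec))

        return sorted(candidates, key=dist)[:k]


rng = random.Random(1)
index = HyperplaneLSH(dim=16)
for i in range(100):
    index.add(i, [rng.gauss(0.0, 1.0) for _ in range(16)])

# A vector that shows up after the initial load is searchable right away.
late = [rng.gauss(0.0, 1.0) for _ in range(16)]
index.add(100, late)
print(index.query(late, k=1))  # [100]
```

Recall here is whatever the buckets happen to give you, so treat it as a sketch of the insert path rather than a usable index.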

https://github.com/nmslib/hnswlib for anybody else googling this library