Interesting DB for feature storage, and LSH is a good choice, I believe. I'm wondering why the tight link to PyTorch C++ tensors (under refactoring, actually), but I haven't looked at the EuclidesDB code yet. Thanks for sharing!
Those interested can also find an open source integration of lmdb + annoy here: https://github.com/jolibrain/deepdetect/blob/master/src/sims...
This is the underlying support for similarity search based on embeddings, including image and object similarity search, see https://github.com/jolibrain/deepdetect/tree/master/demo/obj...
This is running in apps such as a Shazam for art, faster annotation tooling, and text similarity search.
Annoy only supports indexing once (no items can be added after the index is built), while hnswlib supports incremental indexing, which is something I'm looking at.
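To make the difference concrete, here is a toy sketch of the two contracts. It is not the Annoy or hnswlib API: `BuildOnceIndex` and `IncrementalIndex` are hypothetical brute-force stand-ins I made up for illustration, and a real index would be approximate and much faster.

```python
import math


class BuildOnceIndex:
    """Toy build-once index: no new items are accepted after build()."""

    def __init__(self):
        self._items = {}
        self._built = False

    def add_item(self, item_id, vector):
        if self._built:
            raise RuntimeError("index is built; create a new one to add items")
        self._items[item_id] = list(vector)

    def build(self):
        self._built = True

    def nearest(self, query):
        if not self._built:
            raise RuntimeError("call build() before querying")
        # Exact brute-force search, standing in for an approximate one.
        return min(self._items, key=lambda i: math.dist(self._items[i], query))


class IncrementalIndex:
    """Toy incremental index: items can be added and queried at any time."""

    def __init__(self):
        self._items = {}

    def add_item(self, item_id, vector):
        self._items[item_id] = list(vector)

    def nearest(self, query):
        return min(self._items, key=lambda i: math.dist(self._items[i], query))


stored = {0: [0.0, 0.0], 1: [1.0, 1.0]}  # embeddings already in the store
new_id, new_vec = 2, [5.0, 5.0]          # embedding that arrives later

# Incremental: add the new vector to the live index.
inc = IncrementalIndex()
for i, v in stored.items():
    inc.add_item(i, v)
inc.add_item(new_id, new_vec)

# Build-once: the built index rejects the new vector...
once = BuildOnceIndex()
for i, v in stored.items():
    once.add_item(i, v)
once.build()
try:
    once.add_item(new_id, new_vec)
    rejected = False
except RuntimeError:
    rejected = True

# ...so a new index has to be rebuilt from every stored vector.
rebuilt = BuildOnceIndex()
for i, v in {**stored, new_id: new_vec}.items():
    rebuilt.add_item(i, v)
rebuilt.build()
```

With the build-once contract, each new embedding costs a full rebuild from whatever is in the store, which is why incremental indexing is attractive when embeddings keep arriving.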