Curious to see if it can take my entire site content: https://taoofmac.com/static/graph

Might be a fun weekend experiment.

Woah, that's a huge site!

Should be fine, though. As it iterates over the content, it creates embeddings and stores them in the FAISS store (https://github.com/facebookresearch/faiss), which was built to handle large numbers of embeddings.

For the actual queries, it narrows things down to the most relevant documents, i.e. the ones closest to the query in the embedding space, so this should work.
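
If it helps to picture it, here is a rough, stdlib-only sketch of the idea: embed every document once, keep the vectors around, then embed the query and return the nearest documents. To be clear, `toy_embed`, `build_store` and `top_k` are names I made up for illustration, the "embedding" is just normalized word counts rather than a real model, and the brute-force scan stands in for what FAISS does far more efficiently. It is not the tool's actual code.

```python
import math
from collections import Counter


def toy_embed(text, vocab):
    """Hypothetical stand-in for an embedding model: unit-length word counts."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero
    return [x / norm for x in vec]


def build_store(docs, vocab):
    """Indexing pass: embed each document once and keep (doc, vector) pairs."""
    return [(doc, toy_embed(doc, vocab)) for doc in docs]


def top_k(store, query, vocab, k=2):
    """Query pass: return the k documents whose vectors are closest to the query."""
    q = toy_embed(query, vocab)
    ranked = sorted(store, key=lambda item: math.dist(q, item[1]))
    return [doc for doc, _ in ranked[:k]]


# Made-up example data, just to show the flow.
docs = [
    "notes on faiss and vector search",
    "sourdough bread baking log",
    "building a search index with embeddings",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})
store = build_store(docs, vocab)
print(top_k(store, "vector search with faiss", vocab, k=2))
```

The point is that only the query gets embedded at question time, and only the few closest documents are passed along, so the overall size of the site mostly affects the one-off indexing pass.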

Let me know how it goes!