(disclaimer: I cofounded Chroma)
If you are building locally and don't want to send your data anywhere, try Chroma, an open-source alternative: https://github.com/chroma-core/chroma
Chroma can help here: https://github.com/chroma-core/chroma
Yeah, it's completely separate. The LLM just gets some extra text in the prompt, that is all. The text you want to insert is "encoded" into the database, which is not particularly compute-expensive. You can read about one such implementation here: https://github.com/chroma-core/chroma
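To make the "extra text in the prompt" idea concrete, here is a rough toy sketch in plain Python. It is not Chroma's code or API: `TinyStore`, `embed`, and `build_prompt` are made-up names, and the "encoding" is just a word-count vector standing in for a real embedding model. The shape is the same though: encode the text once when you store it, look up the closest match at question time, and paste that into the prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'encoding': a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyStore:
    """Hypothetical in-memory store: keeps each text next to its encoded vector."""

    def __init__(self):
        self.rows = []

    def add(self, text):
        # Encoding happens once, at insert time; the LLM is not involved.
        self.rows.append((text, embed(text)))

    def query(self, question, k=1):
        # Rank stored texts by similarity to the question and keep the top k.
        q = embed(question)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(store, question, k=1):
    """The LLM only ever sees this string: retrieved text plus the question."""
    context = "\n".join(store.query(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


store = TinyStore()
store.add("The office wifi password is rotated every Monday.")
store.add("Lunch is served in the cafeteria at noon.")
prompt = build_prompt(store, "When is the wifi password rotated?")
print(prompt)
```

The model itself is never retrained or touched here; the only thing that changes is the string you send it.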
Chroma runs on Windows, since I believe it's just a Python package: https://github.com/chroma-core/chroma