I've been playing with embeddings lately for document search based on queries/questions. This makes it seem like they work great, but it hasn't been very smooth for me.

I kept running into people recommending against using them for long documents. Are OpenAI's embeddings better than the models sentence-transformers uses? I found some recommendations to average together the embeddings of the parts. I guess it's still cutting-edge-ish; a lot of it feels like you can have a cool demo quickly, but building something reliable and accurate is a lot of work.
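For what it's worth, the "average the embeddings of the parts" idea can be sketched in a few lines. This is just an illustration, not a recommended recipe: the chunk size and overlap here are arbitrary choices, and with sentence-transformers you'd get the per-chunk vectors from `model.encode(chunks)`.

```python
import numpy as np

def chunk_text(text, max_words=100, overlap=20):
    """Split text into overlapping word-window chunks."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for i in range(0, len(words), step):
        chunks.append(" ".join(words[i:i + max_words]))
        if i + max_words >= len(words):
            break
    return chunks

def mean_pool(chunk_embeddings):
    """Average chunk embeddings into one document vector, L2-normalized
    so cosine similarity reduces to a dot product."""
    doc_vec = np.mean(chunk_embeddings, axis=0)
    return doc_vec / np.linalg.norm(doc_vec)
```

With sentence-transformers this would be roughly `mean_pool(model.encode(chunk_text(doc)))`. The caveat people raise is that averaging tends to wash out the details of any one chunk in a long document, which is why retrieving per-chunk (and keeping a pointer back to the parent doc) is often suggested instead.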

Try https://github.com/marqo-ai/marqo, which handles all the chunking for you (and is configurable). It also chunks images in an analogous way, which enables highlighting within longer docs, and within images, in a single retrieval step.
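A minimal sketch of what that looks like with the Python client, assuming a Marqo server running locally on the default port; the index name and document are placeholders, and parameter names like `tensor_fields` may differ between versions:

```python
def index_and_search(url="http://localhost:8882"):
    """Index a document and run a semantic search against a local Marqo
    instance. Marqo chunks long fields internally, so each hit can carry
    a highlight pointing at the chunk that matched."""
    import marqo  # requires `pip install marqo` and a running Marqo server

    mq = marqo.Client(url=url)
    mq.create_index("my-docs")
    mq.index("my-docs").add_documents(
        [{"_id": "doc1", "text": "a long document ..."}],
        tensor_fields=["text"],  # which fields to embed; may vary by version
    )
    # hits include a `_highlights` entry identifying the matching chunk
    return mq.index("my-docs").search("what is this document about?")
```

So chunking, embedding, and highlight resolution all happen server-side rather than in your own glue code.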