I'm not sure what advantage a fairly comprehensive framework like LangChain gives you for this use case.
It's starting to feel as if AI tech is slowly turning into web tech, with a million tools and frameworks, so I'm just wondering whether all of these are needed and whether it isn't easier to code your own than to learn a foreign framework...
Not off-topic at all. After struggling with LangChain's hyper-opinionated implementation of classes I agree.
In fact, this would be better off using LlamaIndex. This is a proof-of-concept, and ultimately a library / framework gives you the following:
- easy implementation of chunking strategies when you're unsure
- OpenAI helper functions
- embeddings and vector store management
Again, even with the above I struggled and had to implement PGVector myself. Once I have my document retrieval strategy and prompt-tuning optimized, I would never use LangChain in production, simply because of the bloat and the inflexible implementation of things like the PGVector class. The footprint is also massive; the LLM part can be done in Golang with 5% of the footprint and 5% of the cloud costs.
So I actually agree with you :)
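To illustrate how small the chunking piece can be when written by hand, here is a minimal sketch of one common strategy: fixed-size character windows with overlap. It is not LangChain's or LlamaIndex's implementation; the function name `chunk_text` and its defaults are made up for this example.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so we do not emit a trailing chunk that is pure overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

A real pipeline would probably split on tokens or sentence boundaries instead of raw characters, but the overall shape stays about this small.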
Someone needs to create a “Langchain, but less complicated” framework
On the main LangChain post (in January) that got traction on Hacker News, I left this comment: https://news.ycombinator.com/item?id=34422917 . It still holds true as a description of a "simpler LangChain":
> To offer this code-style interface on top of LLMs, I made something similar to LangChain, but scoped what i made to only focus on the bare functional interface and the concept of a "prompt function", and leave the power of the "execution flow" up to the language interpreter itself (in this case python) so the user can make anything with it.
I made a really lightweight wrapper over requests and called it lambdaprompt: https://github.com/approximatelabs/lambdaprompt . It has served all of my personal use cases since I made it, including powering `sketch` (copilot for pandas): https://github.com/approximatelabs/sketch
Core things it does: it uses Jinja templates, supports sync and async, and most importantly treats LLM completion endpoints as "function calls", which you can compose and build structures around with plain Python. I also combined it with FastAPI so you can serve any templates you want directly as REST endpoints. It also offers callback hooks so you can log and trace execution graphs.
All together it's only ~600 lines of Python.
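To make the "prompt function" idea concrete, here is a rough sketch of the concept only. This is not lambdaprompt's actual API: `prompt_function` and `fake_complete` are hypothetical names, `str.format` stands in for Jinja, and the completion call is a stub so the example runs without a network or API key.

```python
from typing import Callable


def fake_complete(prompt: str) -> str:
    """Stub for an LLM completion call (assumption: a real version
    would send the prompt to a completion endpoint over HTTP)."""
    return f"<completion for: {prompt}>"


def prompt_function(
    template: str,
    complete: Callable[[str], str] = fake_complete,
) -> Callable[..., str]:
    """Turn a prompt template into an ordinary Python function."""
    def call(**kwargs: str) -> str:
        # Fill the template, then hand the rendered prompt to the model.
        return complete(template.format(**kwargs))
    return call


# Each prompt is now just a function...
summarize = prompt_function("Summarize in one sentence: {text}")
translate = prompt_function("Translate to {language}: {text}")


# ...so composition and control flow are plain Python.
def summarize_then_translate(text: str, language: str) -> str:
    return translate(text=summarize(text=text), language=language)
```

The point of the design is that chaining, branching, and looping need no framework-specific "chain" abstraction; the language interpreter handles the execution flow.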
I haven't had a chance to really push all the different examples out there, so I think it hasn't seen much adoption outside of those that give it a try.
I hope to get back to it sometime in the next week to introduce a local mode (e.g. now that all the smaller open-source models are available, I want to make those first-class).