LangChain is awesome. For people not sure what it does: large language models (LLMs) are very powerful, but they're also very general. As a common example of this limitation, imagine you want your LLM to answer questions over a large corpus.

You can't pass the entire corpus into the prompt. So you might:

- preprocess the corpus by iterating over documents, splitting them into chunks, and summarizing them
- embed those chunks/summaries in some vector space
- when you get a question, search your vector space for similar chunks
- pass those chunks to the LLM in the prompt, along with your question
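Here's a minimal sketch of that flow using LangChain's classic API. Module paths and class names have moved around between versions, so treat the exact imports as assumptions, and `corpus.txt` is a stand-in for your actual document collection:

```python
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI

corpus = open("corpus.txt").read()  # placeholder for your real corpus

# 1. Preprocess: split the corpus into overlapping chunks.
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(corpus)

# 2. Embed: index the chunks in a vector store.
index = FAISS.from_texts(chunks, OpenAIEmbeddings())

# 3. Retrieve: find the chunks most similar to the question.
question = "What does the corpus say about X?"
docs = index.similarity_search(question, k=4)

# 4. Prompt: pass the retrieved chunks plus the question to the LLM.
context = "\n\n".join(d.page_content for d in docs)
llm = OpenAI()
print(llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}"))
```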

This ends up being a very common pattern: you preprocess some information, collect the relevant pieces in real time, and then interact with the LLM (in some cases going back and forth with it). For instance, code and semantic search follow the same shape (preprocess -> embed -> nearest-neighbors at query time -> LLM).

LangChain provides a great abstraction for composing these pieces. IMO, this sort of "prompt plumbing" is far more important than all the slick (but somewhat gimmicky) "prompt engineering" examples we see.
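Concretely, the composition is things like prompt templates wired to LLMs via chains. A toy instance of the "plumbing" (classic API, with a made-up template):

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# A template with slots for whatever your pipeline collected at query time.
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template="Context:\n{context}\n\nQuestion: {question}\nAnswer:",
)

# Wire the template to an LLM; the chain handles formatting and the call.
chain = LLMChain(llm=OpenAI(), prompt=prompt)
answer = chain.run(context="...chunks from your vector search...", question="...")
```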

I suspect this will only get more important as LLMs become more powerful and more integrated, requiring more data to be provided at prompt time.

Another library to check out is GPT Index (https://github.com/jerryjliu/gpt_index), which takes a more "data structure" approach (and actually uses LangChain for some things under the hood).
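For a sense of the contrast, GPT Index builds an index data structure over your documents and lets you query it directly, handling chunking, embedding, and retrieval internally. A sketch based on its early README (the exact class names may have changed since):

```python
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load documents from a local directory and build a vector index over them.
documents = SimpleDirectoryReader("data").load_data()
index = GPTSimpleVectorIndex(documents)

# One call: the index retrieves relevant chunks and queries the LLM.
print(index.query("What does the corpus say about X?"))
```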