What does Hacker News think of gpt_index?
GPT Index is a project consisting of a set of data structures designed to make it easier to use large external knowledge bases with LLMs.
A set of data structures to augment LLMs with your data: https://github.com/jerryjliu/gpt_index
Funny that we had just rebranded our tool from GPT Index to LlamaIndex about a week ago to avoid potential trademark issues with OpenAI, and turns out Meta has similar ideas around LLM+llama puns :). Must mean the name is good though!
Also very excited to try plugging the LLaMA model into LlamaIndex; will report the results.
A local alternative to GPT -> https://github.com/debanjum/khoj
I did exactly that with Asimov's Let's Get Together using https://github.com/jerryjliu/gpt_index. It's a short story that's only 8,846 words, so it's not quite a novel, much less the whole of the Harry Potter series, but it was able to answer questions that required information from different parts of the text all at the same time.
It requires multiple passes of incremental summarization, so it is of course much slower than making a single call to the model, but I stand by my assertion that these things just aren't much of a problem in practice. They are only a problem if you're trying to paste them into ChatGPT or the GPT-3 playground window or something like that.
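The multi-pass pattern described above can be sketched without the library: chunk the text, summarize each chunk, then merge and repeat until one summary remains. This is a minimal illustration, not GPT Index's actual implementation; `summarize()` is a stand-in for a real LLM call and here just truncates, to keep the sketch self-contained.

```python
def summarize(text: str, max_chars: int = 200) -> str:
    """Placeholder for an LLM summarization call (e.g. to the OpenAI API)."""
    return text[:max_chars]

def chunk(text: str, size: int = 1000) -> list[str]:
    """Split text into fixed-size pieces that fit in a prompt window."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def incremental_summary(text: str, size: int = 1000) -> str:
    """Summarize chunks, merge the summaries, and repeat until one remains."""
    pieces = chunk(text, size)
    while len(pieces) > 1:
        summaries = [summarize(p) for p in pieces]  # one LLM call per chunk
        pieces = chunk(" ".join(summaries), size)   # merge, then go again
    return pieces[0] if pieces else ""
```

Each pass costs one model call per chunk, which is why this is slower than a single prompt but still works on texts far larger than the context window.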
People are solving the problems with building these systems in the real world almost as fast as the problems arise in the first place.
AFAIK there's no GPT-3-like LLM that's easy to run at home, because the number of parameters is so large. Your gaming PC's GPU won't have enough RAM to hold the model. For example, gpt-neox-20b needs about 40GB of RAM: https://huggingface.co/EleutherAI/gpt-neox-20b/discussions/1...
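The 40GB figure follows from simple arithmetic: weights alone take parameter count times bytes per parameter, before counting activations or the KV cache. A quick sketch (the GPU sizes named in the comment are illustrative):

```python
def weights_gb(n_params_billion: float, bytes_per_param: int) -> float:
    """GB of memory needed just to hold the model weights."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9

# gpt-neox-20b in fp16: 20B params * 2 bytes per param = 40 GB,
# well beyond a 24 GB consumer GPU like an RTX 3090.
print(weights_gb(20, 2))
```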
Here's yesterday's thread on this prompt context pattern: https://news.ycombinator.com/item?id=34477543
I've been experimenting with the 'gpt index' project <https://github.com/jerryjliu/gpt_index> and it doesn't seem like "oh just put summaries of stuff in the prompt" works for everything -- like I added all the Seinfeld scripts and was asking questions like "list every event related to a coat or jacket" and the insights were not great -- so you have to find the situations in which this makes sense. I found one example output that was pretty good, by asking it to list inflation related news by date given a couple thousand snippets: https://twitter.com/firasd/status/1617405987710988288
https://github.com/jerryjliu/gpt_index is a particularly interesting implementation under very active development at the moment.
GPT Index is a project consisting of a set of data structures that are created using LLMs and can be traversed using LLMs in order to answer queries.
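The build-with-an-LLM, traverse-with-an-LLM idea can be sketched as a summary tree. This mirrors the shape of GPT Index's tree index, not its actual API: both `llm_summarize` and `llm_pick_child` are stand-ins for real LLM calls (the picker here uses keyword overlap so the sketch runs offline).

```python
import re
from dataclasses import dataclass, field

@dataclass
class Node:
    summary: str
    text: str = ""                       # leaves keep the source chunk
    children: list["Node"] = field(default_factory=list)

def llm_summarize(summaries: list[str]) -> str:
    """Placeholder: a real implementation asks an LLM to summarize its children."""
    return " / ".join(summaries)

def build_tree(chunks: list[str], fanout: int = 2) -> Node:
    """Bottom-up build: group nodes and summarize each group (one LLM call each)."""
    nodes = [Node(summary=c, text=c) for c in chunks]
    while len(nodes) > 1:
        groups = [nodes[i:i + fanout] for i in range(0, len(nodes), fanout)]
        nodes = [Node(summary=llm_summarize([n.summary for n in g]), children=g)
                 for g in groups]
    return nodes[0]

def llm_pick_child(query: str, children: list[Node]) -> Node:
    """Placeholder: a real implementation asks the LLM which child summary
    is most relevant; here we count shared keywords instead."""
    words = set(re.findall(r"\w+", query.lower()))
    return max(children,
               key=lambda n: len(words & set(re.findall(r"\w+", n.summary.lower()))))

def answer(root: Node, query: str) -> str:
    node = root
    while node.children:                 # one LLM decision per tree level
        node = llm_pick_child(query, node.children)
    return node.text                     # the leaf's source text grounds the answer
```

For example, `answer(build_tree(["cats purr when happy", "dogs bark at strangers"]), "why do dogs bark?")` descends to and returns the second chunk.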
https://github.com/jerryjliu/gpt_index https://github.com/jerryjliu/gpt_index/blob/main/examples/pa...
GPT isn't multi-modal yet (so no images), but that's coming.
Could you share anything (e.g. how many rows of data and tokens in each row) around how much it cost you to use GPT Index? It looks interesting, but it seems it'd be expensive.
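A rough upper bound on indexing cost is easy to compute yourself, since every row passes through the model at least once: rows times tokens per row times the per-token price. A sketch with illustrative numbers (per-1K-token pricing was the OpenAI billing model at the time; check current rates):

```python
def index_cost(rows: int, tokens_per_row: int, usd_per_1k_tokens: float) -> float:
    """USD to run every row through the model once."""
    return rows * tokens_per_row / 1000 * usd_per_1k_tokens

# e.g. 10,000 rows of ~500 tokens each at $0.02 per 1K tokens:
print(index_cost(10_000, 500, 0.02))  # one full pass over the data
```

Multi-pass structures like the tree index multiply this by the number of summarization passes, which is where the expense concern comes from.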