I wonder why the books are all non-fiction. I could imagine it would be entertaining to chat with large works of fiction.

Books written in a narrative, temporal style are hard to handle, because their statements of fact mutate as the story progresses.

Consider the story:

"Justin is hungry. Justin eats dinner. Justin is not hungry."

You ask the chatbot, "Is Justin hungry?" The question has a temporal aspect that is hard to reconcile for simple systems that just embed facts into a vector DB (or use similar techniques).
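To make the failure mode concrete, here's a toy sketch. Bag-of-words cosine similarity stands in for a real embedding model, which is enough to show that order-free similarity search ranks the two contradictory facts side by side:

    from collections import Counter
    import math

    # Toy stand-in for an embedding model: bag-of-words vectors are
    # enough to show the failure mode.
    def embed(text):
        return Counter(text.lower().strip(".?").split())

    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
        return dot / (norm(a) * norm(b))

    facts = ["Justin is hungry.", "Justin eats dinner.", "Justin is not hungry."]
    query = embed("Is Justin hungry?")
    for fact in sorted(facts, key=lambda f: -cosine(embed(f), query)):
        print(f"{cosine(embed(fact), query):.2f}  {fact}")
    # 1.00  Justin is hungry.
    # 0.87  Justin is not hungry.
    # 0.33  Justin eats dinner.
    # The two contradictory facts both rank near the top; nothing in the
    # index says which one holds *after* dinner.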

I asked ChatGPT:

  Me: 
      Consider the story: "Justin is hungry. Justin eats dinner. Justin is not hungry."

      Is Justin hungry?

  ChatGPT:
      No, Justin is not hungry after eating dinner.

I'm not sure that it's that big of a problem.

The example was just to illustrate the general problem. Think of ingesting a whole novel that takes place over a few years. The whole novel doesn't fit into GPT's context window (only a few pages of text). So you have to extract individual statements of fact and index over them (e.g. with semantic indexing, or many other techniques), roughly as sketched below.
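The shape of that pipeline might look something like the following, where extract_facts is a hypothetical stand-in for an LLM prompt, not a real library function:

    # Hypothetical pipeline: chunk the novel, extract atomic facts from
    # each chunk, then index the facts.
    def extract_facts(passage: str) -> list[str]:
        raise NotImplementedError("prompt the model for atomic statements of fact")

    def index_novel(novel: str, chunk_size: int = 3000) -> list[str]:
        facts = []
        for i in range(0, len(novel), chunk_size):
            facts.extend(extract_facts(novel[i:i + chunk_size]))
        return facts  # embed each fact and load it into the vector DB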

It's tricky to deal with cases where the state of something changes many times over the course of the years in the novel.

Imagine you ingest the whole Harry Potter series. You ask the chatbot, "How old is Harry Potter?" The answer depends on which part of the story you are talking about. "Does Harry know the foobaricus spell?" Again, it depends on where you are in the story.
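One way such questions even become answerable is to store each fact together with the point in the story where it holds. A minimal sketch, with made-up facts and chapter numbers:

    # Facts stored with where in the story they hold. The facts and
    # chapter numbers here are invented for illustration.
    facts = [
        (1, "Harry is 11 years old."),
        (17, "Harry is 12 years old."),
    ]

    def age_as_of(chapter):
        """Latest age statement seen up to the given chapter."""
        seen = [s for c, s in facts if c <= chapter]
        return seen[-1] if seen else "unknown at this point"

    print(age_as_of(5))   # Harry is 11 years old.
    print(age_as_of(20))  # Harry is 12 years old.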

A non-fiction book, by contrast, typically does not contain these temporally changing aspects. In a book about astronomy, Mars is the 4th planet from the sun in chapter 1, and it still is in chapter 10.

> Think of ingesting a whole novel that takes place over a few years.

I did exactly that with Asimov's "Let's Get Together" using https://github.com/jerryjliu/gpt_index. At 8,846 words it's a short story, not quite a novel, much less the whole Harry Potter series, but it was able to answer questions that required combining information from different parts of the text at once.

It requires multiple passes of incremental summarization, so it is of course much slower than making a single call to the model, but I stand by my assertion that these things just aren't much of a problem in practice. They are only a problem if you're trying to paste the text into ChatGPT or the GPT-3 playground window or something like that.
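The general shape of that incremental pass, as I understand it, is something like this; summarize_with_llm is a hypothetical stand-in for a model call, not gpt_index's actual API:

    # Sketch of a "refine"-style incremental pass over a long text.
    def summarize_with_llm(prompt: str) -> str:
        raise NotImplementedError("call your LLM of choice here")

    def answer_over_long_text(text: str, question: str, chunk_size: int = 3000) -> str:
        answer = "(nothing yet)"
        for i in range(0, len(text), chunk_size):
            # Each pass folds one more chunk into the running answer, so
            # facts from different parts of the story get reconciled in
            # story order, at the cost of one model call per chunk.
            answer = summarize_with_llm(
                f"Question: {question}\n"
                f"Answer so far: {answer}\n"
                f"New passage: {text[i:i + chunk_size]}\n"
                "Revise the answer using the new passage."
            )
        return answer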

People building these systems in the real world are solving the problems almost as fast as they arise.