This is personal correspondence typeset in LaTeX; it is not an academic paper, and it was not peer-reviewed. (The document does not claim otherwise, but I think it is common for people to assume that documents typeset this way are more rigorous than this one is.)

Leaving that aside, I really take issue with the style used by the author. For example, section 3 begins:

> There is increasingly substantial evidence that LLMs develop internal representations of the world to some extent, and that these representations allow them to reason at a level of abstraction that is not sensitive to the precise linguistic form of the text that they are reasoning about.

LLMs do not "reason"; they do not "learn" or "develop" anything of their own volition. They are (advanced) statistical models. Anthropomorphizing them is not just technically incorrect, but morally disingenuous. Writing about LLMs in this way causes people with less domain-specific knowledge to trust the models the way they might trust people, and this has the potential for serious harm.

Because of this choice of phrasing, I wanted to look into the author's background. Among their recent activities, they list:

> I'm organizing a new AI safety research group at NYU, and I wrote up a blog post explaining what we're up to and why.

"AI Safety" is a distinct (and actually opposing) area from "AI Ethics". The people who prefer the word "safety" tend also to engage in discussions that touch on aspects of longtermism. Longtermism is not scientifically well-grounded; it seeks to divert attention from present and real issues to fanciful projections of far-future concerns. I do not know for certain that the author is in fact a longtermist, but their consistent anthropomorphization of a pile of statistical formulae certainly suggests they wouldn't feel out of place among a crowd of such people.

In contrast, the people who prefer the term "ethics" in their work are grounded in real and present issues. They concern themselves with reasonable regulation. They worry about the current-day environmental impacts of training large models. In short, they are concerned with actual issues, rather than the alleged potential for a statistical model to "develop" sentience or exhibit properties of "emergent" intelligence (subjects from the annals of the science-fiction writing of the last century).

I hope the author can clarify their choice of phrasing in their work, though I worry they have chosen their words carefully already. Readers should exercise caution in taking the claims of a soothsayer without a sufficient quantity of salt.

What does “reason” mean? An LLM seems to do everything I would expect from something that reasons.

People routinely make up their own vague, ill-defined meanings of "understanding" and "reasoning" to disqualify LLMs. This is necessary because LLMs obviously reason and understand by any evaluation that can be carried out.

Seriously just watch. He's not actually going to be able to coherently define his "reasoning" in a way that can be tested.

> Seriously just watch. He's not actually going to be able to coherently define his "reasoning" in a way that can be tested.

Google gives the following definition of the verb "reason":

> think, understand, and form judgments by a process of logic.

LLMs do not think, they do not understand, and they do not form judgments. They do not come to their own conclusions. They do not have the physical capability. They are statistical models, nothing more.

> LLMs obviously reason and understand by any evaluation that can be carried out.

Uh-huh. Sure, Jan.

This is the problem with non-operational definitions: now we need to know how you define "think", "understand", and "form judgments" before we can move on.

Instead, could you operationally define "reason" as a test that a human is, say, 90% likely to pass and GPT is only 10% likely to pass?

Yes: François Chollet released the ARC (Abstraction and Reasoning Corpus) benchmark for exactly this in 2019, and it can be scored automatically. Humans solve 100% of the tasks, GPTs solve 0%, and GPTs made exactly zero progress from 2019 to 2022.

https://twitter.com/fchollet/status/1631699463524986880

https://github.com/fchollet/ARC
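
For concreteness, here is a minimal sketch of what "scored automatically" means, assuming the JSON task format used in the fchollet/ARC repository linked above (each task file holds "train" demonstration pairs and "test" pairs of input/output grids). The `copy_input_solver` and the specific filename are illustrative placeholders, not anyone's actual method:

```python
import json


def load_task(path):
    """Load one ARC task file: a JSON object with 'train' demonstration
    pairs and 'test' pairs, each pair holding an 'input' and 'output' grid."""
    with open(path) as f:
        return json.load(f)


def score_task(task, solver):
    """Count how many test outputs the solver reproduces exactly.

    A prediction counts only if the predicted grid equals the expected
    grid cell for cell; there is no partial credit.
    """
    correct = 0
    for pair in task["test"]:
        prediction = solver(task["train"], pair["input"])
        if prediction == pair["output"]:
            correct += 1
    return correct, len(task["test"])


def copy_input_solver(train_pairs, test_input):
    """Hypothetical baseline 'solver' that just echoes the input grid."""
    return [row[:] for row in test_input]


if __name__ == "__main__":
    # Any task JSON from the repository's data/training/ directory will do.
    task = load_task("data/training/0a938d79.json")
    correct, total = score_task(task, copy_input_solver)
    print(f"{correct}/{total} test grids solved exactly")
```

The scoring rule is exact grid match with no partial credit, which is what makes the benchmark evaluable without any human judgment.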