I tend to disregard articles that default to the "Stochastic Parrot" argument. These tools are useful now; I don't personally care about achieving actual intelligence. I want additional utility for myself and other humans, which these tools provide now, at scale.
By a lot of measures many humans perform at just about the same level, including confidently making up bullshit.
This post reads like one of those "Goodbye X online video game" posts. I'll cut them some slack because it's their own blog they're venting on, and it was likely posted here by someone else rather than them seeking attention, but meh.
It’s pretty ironic that the argument that LLMs are stochastic parrots itself sounds like actual parroting.
I think we’re way past that, with LLMs now quickly taking on the role of a general reasoning engine.
> now quickly taking on the role of a general reasoning engine
And this right here is why it's important to emphasize the "stochastic parrot" fact: people think this is true and are making decisions based on that misunderstanding.
Or maybe they just disagree with you?
Who? See https://dl.acm.org/doi/pdf/10.1145/3442188.3445922, a 2021 research paper warning (among other things) precisely about this confusion.
> The ersatz fluency and coherence of LMs raises several risks, precisely because humans are prepared to interpret strings belonging to languages they speak as meaningful and corresponding to the communicative intent of some individual or group of individuals who have accountability for what is said
Is there any researcher who maintains that LLMs contain reasoning and intent?
Those who are working on these models are not confused; they know what they are. It's the public that is confused.
How is that different from "This evidence for X raises the risk that people falsely believe that X"? That's an argument for X, not against. And nothing in that paper, even if I discard the dross (i.e. everything except one section on page 7), seems to actually make an argument against X of any strength beyond "it is wrong because it is wrong".
My point is this: I disagree with you. This is not because I have "misunderstood" something; it is because I understand the stochastic-parrot argument and think it is erroneous. And the more you talk about "the risk that people will come to falsely believe" rather than actual arguments, the less convincing you sound. This paternalistic tendency is a curse on science and debate in general.
> it is because I understand the stochastic-parrot argument and think it is erroneous.
Okay then, what exactly about it is erroneous? Because stochastically sorting the set M of known tokens by likelihood of being the next one is literally what LLMs do.
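To make the precise claim concrete, here is a minimal, purely illustrative sketch of the decoding step being described, with a toy vocabulary and made-up scores standing in for a real model's output layer. It ranks every candidate token by likelihood and then samples the next one:

```python
import math
import random

# Toy vocabulary and made-up "logits" (raw scores) for the next token,
# standing in for what a real model's output layer would produce.
vocab = ["the", "cat", "sat", "on", "mat"]
logits = [2.0, 0.5, 1.2, 0.8, -1.0]

def softmax(scores, temperature=1.0):
    # Turn raw scores into a probability distribution over the vocabulary.
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)

# "Sorting by likelihood": rank every known token by its probability.
ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
print(ranked)

# "Stochastic": sample the next token from the distribution rather than
# always taking the single most likely one.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print(next_token)
```

Whether this description is trivially true or deeply revealing is exactly what's in dispute below.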
There's a class of statements that can be either interpreted precisely, at which point the claim they make is clearly true but trivial, or interpreted expansively, at which point the claim is significant but no longer clearly true.
This is one of those: yes, technically LLMs are token predictors, but technically any nondeterministic Turing machine is a token predictor. The human brain could be viewed as a token predictor [1]. The interesting question is how it comes up with its predictions, and on this the phrase offers no insight at all.
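To illustrate why the precise reading is trivial (a hypothetical sketch with made-up class names, not anyone's actual architecture): "token predictor" is just an interface, and satisfying it says nothing about the machinery behind it.

```python
from typing import Protocol, Sequence

class TokenPredictor(Protocol):
    """The 'precise' reading: anything that maps a context to a next token."""
    def next_token(self, context: Sequence[str]) -> str: ...

class FrequencyParrot:
    """Satisfies the interface by parroting the most common token it has seen."""
    def __init__(self, counts: dict[str, int]):
        self.counts = counts

    def next_token(self, context: Sequence[str]) -> str:
        return max(self.counts, key=self.counts.get)

class DeliberatingPredictor:
    """Also satisfies the interface, but may run arbitrary internal computation
    (world models, search, planning) before emitting a token. The interface
    alone cannot distinguish it from the parrot."""
    def next_token(self, context: Sequence[str]) -> str:
        candidates = self._deliberate(context)
        return candidates[0]

    def _deliberate(self, context: Sequence[str]) -> list[str]:
        # Placeholder for arbitrarily sophisticated internal reasoning.
        return ["hello"]
```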
> The human brain could be viewed as a token predictor
No it really couldn't, because "generating and updating a 'mental model' of the environment" is as different from predicting the next token in a sequence as a bee's dance is from structured human language.
The mental model we build and update is not just based on a linear stream, but on many parallel and even contradictory sensory inputs that we make sense of not as abstract data points, but as experiences in a world of which we are part. We also have a pre-existing model summarizing our experiences in the world, including their degradation, our agency in that world, and our intentionality in that world.
The simple fact that we don't just complete streams, but do so with goals, both immediate and long term, and fit our actions into these goals, in itself already shows how far a human's mental modeling is from the linear action of a language model.
> The mental model we build and update is not just based on a linear stream, but on many parallel and even contradictory sensory inputs
So just like multimodal language models, for instance GPT-4?
> as experiences in a world of which we are part.
> The simple fact that we don't just complete streams, but do so with goals, both immediate and long term, and fit our actions into these goals
Unfalsifiable! GPT-4 can talk about its experiences all day long. What's more, GPT-4 can act agentically if prompted correctly. [2] How do you qualify a "real goal"?
[1] https://www.neelnanda.io/mechanistic-interpretability/othell...