This logic only applies to generative pre-training, behavior cloning, and other training methods which rely on learning to mimic well-structured content from the real world.

It does not apply to intelligence acquired through methods like reinforcement learning (RL).

How does the author think about the intelligence of AlphaGo, for instance, which was trained largely through self-play (and, in the AlphaGo Zero variant, entirely so)?

Good point. This calls to mind LeCun's recent argument that we are missing models that can learn from raw experience or "self-play". When we have a ChatGPT that understands language strictly from audio/video inputs, then we can start to talk about human-like intelligence.

As for AlphaGo, I would put it in the same category of intelligence as a calculator. It does one thing well -- approximate a Monte Carlo Tree Search.
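
For readers unfamiliar with the algorithm being "approximated": below is a minimal MCTS sketch in Python, run on a toy Nim game rather than Go. The game, node layout, and exploration constant are illustrative assumptions, not anything from AlphaGo itself; AlphaGo's real search additionally uses learned policy/value networks to guide selection and to replace the random rollouts shown here.

```python
# Minimal Monte Carlo Tree Search on Nim (take 1-3 stones; last stone wins).
# Toy example only -- the game and constants are illustrative assumptions.
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones remaining in the pile
        self.parent = parent
        self.move = move          # stones taken to reach this node
        self.children = []
        self.visits = 0
        self.wins = 0.0           # from the perspective of the player who moved here

    def untried_moves(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2, 3) if m <= self.stones and m not in tried]

def ucb1(child, parent_visits, c=1.4):
    # Upper Confidence Bound: trade off exploitation against exploration.
    return child.wins / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(stones):
    # Random playout; 1.0 if the player to move (with `stones` left) takes the last stone.
    turn, win = 0, 0.0
    while stones > 0:
        stones -= random.randint(1, min(3, stones))
        if stones == 0 and turn == 0:
            win = 1.0
        turn ^= 1
    return win

def mcts(root_stones, iterations=5000):
    root = Node(root_stones)
    for _ in range(iterations):
        # 1. Selection: walk down via UCB1 until a node with untried moves.
        node = root
        while not node.untried_moves() and node.children:
            node = max(node.children, key=lambda c: ucb1(c, node.visits))
        # 2. Expansion: add one unexplored child (skipped at terminal states).
        moves = node.untried_moves()
        if moves:
            m = random.choice(moves)
            child = Node(node.stones - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: the mover into `node` wins iff the next player loses.
        result = 1.0 - rollout(node.stones)
        # 4. Backpropagation: flip the perspective at each level going up.
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    return max(root.children, key=lambda c: c.visits).move

if __name__ == "__main__":
    # With 10 stones, the game-theoretically winning move is to take 2.
    print("MCTS recommends taking", mcts(10), "stones")
```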

Aren't HuggingGPT and JARVIS able to do some of that?

https://github.com/microsoft/JARVIS
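
For context, HuggingGPT (the system in the JARVIS repo) uses an LLM as a controller that plans sub-tasks, picks specialist models, runs them, and fuses the results. A rough sketch of that four-stage loop is below; `call_llm`, `pick_model`, and `run_model` are hypothetical placeholders, not the repo's actual API.

```python
# Hypothetical sketch of a HuggingGPT-style control loop (plan -> select
# models -> execute -> summarize). None of these functions are JARVIS's real API.

def call_llm(prompt: str) -> str:
    """Placeholder for a chat-completion call to the controller LLM."""
    raise NotImplementedError

def pick_model(task: str) -> str:
    """Placeholder: choose a specialist model id suited to the sub-task."""
    raise NotImplementedError

def run_model(model_id: str, task: str, user_input: str) -> str:
    """Placeholder: run the chosen model on the sub-task."""
    raise NotImplementedError

def hugginggpt_style(user_request: str) -> str:
    # 1. Task planning: ask the LLM to break the request into sub-tasks.
    plan = call_llm(f"Split this request into sub-tasks, one per line:\n{user_request}")
    results = []
    for task in filter(None, (line.strip() for line in plan.splitlines())):
        # 2. Model selection and 3. task execution for each sub-task.
        results.append(run_model(pick_model(task), task, user_request))
    # 4. Response generation: have the LLM fuse the partial results.
    return call_llm(f"Request: {user_request}\nResults: {results}\nAnswer concisely.")
```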