I think this is a classic case of us overestimating the immediate impact of LLMs and underestimating the long-term impact.

Right now, they are definitely useful time savers, but they need a lot of handholding. Eventually, someone will figure out how to get hundreds of LLMs supervising teams of millions of LLMs to do some really wild stuff that is currently completely impossible.

You could spin up a giant staff the way we spin up servers now. There has to be a world-changing application of that.
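
To make that concrete, here is a toy sketch of the supervisor/worker pattern being described. Everything in it is hypothetical: `call_llm` is a stub standing in for whatever inference API you would actually use.

```python
# Toy sketch: one supervisor LLM fans a task out to worker LLMs and
# merges their answers. call_llm is a placeholder stub, not a real API.
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Stub: in reality this would call an inference endpoint.
    return f"[model output for: {prompt[:40]}]"

def supervise(task: str, n_workers: int = 8) -> str:
    # Supervisor asks a model to split the task into subtasks...
    plan = call_llm(f"Split this into {n_workers} subtasks: {task}")
    subtasks = plan.splitlines()
    # ...workers run in parallel on the pieces...
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(call_llm, subtasks))
    # ...and the supervisor merges the results.
    return call_llm("Merge these worker results:\n" + "\n".join(results))

print(supervise("write a literature survey"))
```

Nest a few layers of `supervise` and you get the "hundreds supervising millions" shape; whether the output is coherent at that scale is exactly the open question.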

>Eventually, someone will figure out how to get hundreds of LLMs supervising teams of millions of LLMs to do some really wild stuff that is currently completely impossible.

This is an intuitive direction. In fact, it’s so intuitive that it’s a little bit odd that nobody seems to have made proper progress with LLM swarm computation.

In particular, it's odd that the greatest software developer in the world (ChatGPT) hasn't made progress with LLM swarm computation.

How is "LLM swarm computation" different from a single bigger LLM?

A swarm is easy to distribute across many computers that communicate with high latency.
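
Back-of-envelope, with assumed numbers (the constants below are illustrative, not measurements):

```python
# Why agent-level swarms tolerate internet latency while splitting a
# single model across the internet does not. Assumed, illustrative numbers.
RTT = 0.10  # assumed round-trip time between two internet hosts, seconds

# Splitting ONE model: pipeline-parallel inference crosses the network
# once per pipeline stage per generated token.
stages, tokens = 8, 500
print(f"split model: ~{stages * tokens * RTT:,.0f} s spent waiting on latency")

# A swarm of AGENTS: each exchanges a handful of whole text messages per
# task, with seconds of local compute in between.
messages = 20
print(f"agent swarm: ~{messages * RTT:,.0f} s spent waiting on latency")
```

Agents trade whole messages occasionally; a split-up model trades activations constantly, which is why the latter only really works inside a datacenter.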

LLMs are already running distributed on swarms of computers. A swarm of swarms is just a bigger swarm.

So again, what is the actual difference you are imagining?

Or is it just that distributed X is fashionable?

Significantly higher latency than you have within a single datacenter. Think "my GPU working with your GPU".

There are already LLMs hosted across the internet (Folding@Home style) instead of in a single data center.

Just because the swarm infrastructure hosting an LLM has higher latency across certain paths does not make it a swarm of LLMs.

> There are already LLMs hosted across the internet (Folding@Home style)

Interesting, I haven't heard of that. Can you name examples?

I read about Petals (1) some time ago here on HN. There are surely others too, but I don't remember the names.

1. https://github.com/bigscience-workshop/petals
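
For anyone curious, the Petals client looks roughly like this (going from memory of its README, so details may have drifted):

```python
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # one of the models it supported
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Embeddings run locally; the transformer blocks are served by
# volunteer machines across the internet.
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))
```

Note this is still one LLM hosted across many machines, which is exactly the distinction the parent comments were arguing about.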