What does HackerNews think of hivemind?

Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

Language: Python

I'm not entirely sure how the approach they're using works [0], but I study federated learning, and one of the highly-cited survey papers has several chapters (5 and 6 in particular) addressing potential attacks, failure modes, and bias [1].

0: https://github.com/learning-at-home/hivemind

1: https://arxiv.org/abs/1912.04977
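
To make "potential attacks" concrete: one classic failure mode discussed in that literature is model poisoning, where a single participant boosts its update so that it dominates a naive average. A toy sketch of the problem (my own illustration, not code from the survey or from hivemind):

    # Why unweighted federated averaging is fragile against one malicious peer.
    import torch

    def federated_average(updates):
        """Average parameter updates from all participants equally."""
        return torch.stack(updates).mean(dim=0)

    honest = [torch.randn(4) * 0.01 for _ in range(9)]   # small honest updates
    malicious = torch.ones(4) * 100.0                    # one boosted, poisoned update

    aggregated = federated_average(honest + [malicious])
    print(aggregated)  # ~10 in every coordinate: the attacker dominates the round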

The library they're using is literally called Hivemind [0]. I'm interested to see how the approach they're using differs from what we use in federated learning or gossip learning.

> Hivemind is a PyTorch library for decentralized deep learning across the Internet.

0: https://github.com/learning-at-home/hivemind
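
For readers unfamiliar with the distinction raised here: federated learning typically averages client models through a central coordinator, while gossip learning has peers average pairwise with no coordinator at all. A rough conceptual sketch of both (my own illustration, not hivemind's actual implementation):

    # Contrast: one coordinator-driven averaging round vs. repeated gossip rounds.
    import random
    import torch

    def server_round(client_params):
        """Federated averaging: a central server averages everyone's parameters."""
        return torch.stack(client_params).mean(dim=0)

    def gossip_round(peer_params):
        """Gossip averaging: random peer pairs exchange and average their parameters."""
        order = list(range(len(peer_params)))
        random.shuffle(order)
        for a, b in zip(order[::2], order[1::2]):
            mean = (peer_params[a] + peer_params[b]) / 2
            peer_params[a], peer_params[b] = mean.clone(), mean
        return peer_params

    params = [torch.randn(3) for _ in range(8)]
    print(server_round(params))   # one global average in a single round
    for _ in range(20):           # gossip converges to the same average over many rounds
        params = gossip_round(params)
    print(params[0])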

You can finetune with it. If you want a more generic framework, you can use hivemind [1], which is what Petals uses, but you'll have to create your own community for whatever model you're trying to train.

1: https://github.com/learning-at-home/hivemind
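
Roughly what "create your own community" looks like: the first peer starts a DHT, and everyone training the same model joins under the same run_id. This sketch follows my reading of the hivemind quickstart; exact argument names and defaults may differ between library versions.

    import torch
    import hivemind

    model = torch.nn.Linear(784, 10)                    # stand-in for your actual model
    base_opt = torch.optim.SGD(model.parameters(), lr=0.01)

    # The first peer starts the DHT; later peers pass initial_peers=[...] to join it.
    dht = hivemind.DHT(start=True)
    print("initial_peers for others:", [str(a) for a in dht.get_visible_maddrs()])

    opt = hivemind.Optimizer(
        dht=dht,
        run_id="my_model_run",      # every peer training this model shares the same ID
        optimizer=base_opt,
        batch_size_per_step=32,     # samples this peer contributes per local step
        target_batch_size=10_000,   # collaboration-wide batch before a global update
        use_local_updates=True,     # keep making local progress while waiting for peers
        matchmaking_time=3.0,
        averaging_timeout=10.0,
        verbose=True,
    )
    # Training then looks like ordinary PyTorch: loss.backward(); opt.step(); opt.zero_grad()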

Yes, Hivemind trained a GPT 6B model like this.

General model training: https://github.com/learning-at-home/hivemind

Stable Diffusion specific: https://github.com/chavinlo/distributed-diffusion

Inference-only Stable Diffusion: https://stablehorde.net/

Probably the biggest recent result: https://arxiv.org/abs/2209.04836 (author thread: https://twitter.com/SamuelAinsworth/status/15697194946455265...)

See also: https://github.com/learning-at-home/hivemind

and more to OP's incentive structure: https://docs.bittensor.com/

The latter two intend to beat latency with Mixture-of-Experts models (MoEs). If the results of the former hold, they show that with a simple algorithmic transformation you can merge two independently trained models in weight space and get performance functionally equivalent to a model trained monolithically.
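
Mechanically, the merge in that paper comes down to permuting one network's hidden units so they line up with the other's, then interpolating the weights. The sketch below shows only the interpolation half; align_to is a hypothetical placeholder for the paper's permutation-matching step.

    import torch

    def merge_state_dicts(sd_a, sd_b, alpha=0.5):
        """Linearly interpolate two aligned state dicts: alpha*A + (1-alpha)*B."""
        return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

    model_a = torch.nn.Linear(16, 4)
    model_b = torch.nn.Linear(16, 4)

    # sd_b = align_to(model_b.state_dict(), model_a.state_dict())  # the paper's contribution
    merged = merge_state_dicts(model_a.state_dict(), model_b.state_dict())

    merged_model = torch.nn.Linear(16, 4)
    merged_model.load_state_dict(merged)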

There absolutely are! Check out hivemind (https://github.com/learning-at-home/hivemind), a general library for deep learning over the Internet, or Petals (https://petals.ml/), a system that leverages Hivemind and allows you to run BLOOM-176B (or other large language models) distributed over many volunteer PCs. You can join it and host some layers of the model by running literally one command on a Linux machine with Docker and a recent enough GPU.

Disclaimer: I work on these projects; both are based on our research over the past three years.
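
For intuition about how Petals splits the work, here is a conceptual toy (not the Petals API): each volunteer serves a contiguous slice of the model's transformer blocks, and a client pipelines its hidden states through those slices in order.

    import torch

    class VolunteerServer:
        """Stands in for a remote peer serving a slice of the model's layers."""
        def __init__(self, layers):
            self.layers = layers

        def forward(self, hidden_states):
            for layer in self.layers:
                hidden_states = layer(hidden_states)
            return hidden_states

    # Pretend the "large model" is 12 transformer blocks split across 3 volunteers.
    blocks = [torch.nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
              for _ in range(12)]
    volunteers = [VolunteerServer(blocks[i:i + 4]) for i in range(0, 12, 4)]

    hidden = torch.randn(1, 8, 64)      # client-side embeddings for 8 tokens
    for server in volunteers:           # in Petals, these hops go over the Internet
        hidden = server.forward(hidden)
    print(hidden.shape)                 # torch.Size([1, 8, 64])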

The problem is that, currently, large ML models need to be trained on clusters of tightly-connected GPUs/accelerators. So it's kinda useless having a bunch of GPUs spread all over the world with huge latency and low bandwidth between them. That may change though - there are people working on it: https://github.com/learning-at-home/hivemind
Here is a recent paper (disclaimer: I am the first author) named "Learning@home", which proposes something along these lines. Basically, we develop a system that allows you to train a network with thousands of "experts" distributed across hundreds or more consumer-grade PCs. You don't have to fit 700GB of parameters on a single machine, and there is significantly less network delay than with synchronous model-parallel training. The only thing you sacrifice is the guarantee that all the batches will be processed by all required experts.

You can read it on ArXiv https://arxiv.org/abs/2002.04013v1 or browse the code here: https://github.com/learning-at-home/hivemind. It's not ready for widespread use yet, but the core functionality is stable and you can see what features we are working on now.
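
A toy picture of that trade-off (my own illustration, not the paper's code): each "expert" lives on a different, unreliable peer, so a forward pass just averages whichever of the selected experts respond in time.

    import random
    import torch

    class UnreliableRemoteExpert:
        def __init__(self, expert, availability=0.8):
            self.expert = expert
            self.availability = availability   # chance this peer answers before the deadline

        def try_forward(self, x):
            return self.expert(x) if random.random() < self.availability else None

    experts = [UnreliableRemoteExpert(torch.nn.Linear(32, 32)) for _ in range(64)]

    def decentralized_moe_forward(x, k=4):
        """Query k randomly chosen experts; average only the responses that arrive."""
        outputs = [y for e in random.sample(experts, k) if (y := e.try_forward(x)) is not None]
        if not outputs:                 # every selected expert timed out
            return x                    # fall back to a skip connection
        return torch.stack(outputs).mean(dim=0)

    out = decentralized_moe_forward(torch.randn(2, 32))
    print(out.shape)  # torch.Size([2, 32])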