What does HackerNews think of kraken?

P2P Docker registry capable of distributing TBs of data in seconds

Language: Go

#11 in Docker
#5 in P2P

Uber did it too. Kraken is a Docker registry that gossips containers via BitTorrent.

https://github.com/uber/kraken

Uber Engineering open-sourced Kraken [1], their peer-to-peer docker registry. I remember it originally using the BitTorrent protocol but in their readme they now say it is "based on BitTorrent" due to different tradeoffs they needed to make.

As far as I know there aren't any projects doing peer-to-peer distribution of container images to servers, probably because it's useful to be able to use a stock docker daemon on your server. The Kraken page references Dragonfly [2], but I haven't grokked it yet; it might be that.

It has seemed strange to me that the docker daemon I run on a host is not also a registry, if I want to enable that feature.

It's also possible that in practice you'd want your CI nodes optimized for compute because they're doing a lot of work, your registry hosts optimized for bandwidth, and your servers again for compute. Having one daemon to rule them all seems elegant, but it is actually overgeneralized, and specialization is better.

1 https://github.com/uber/kraken

2 https://d7y.io/

If you're pulling big images you could try kube-fledged (it's the simplest option, a CRD that works like a pre-puller for your images), or if you have a big cluster you can try a p2p distributor, like kraken or dragonfly2.

Also there's that project called Nydus that allows starting up big containers way faster. IIRC, it starts the container before pulling the whole image and then pulls data from the registry as needed.

https://github.com/senthilrch/kube-fledged

https://github.com/dragonflyoss/Dragonfly2

https://github.com/uber/kraken

https://nydus.dev/

Docker images already need special handling since you download the layers separately and reassemble them. Going from that to full BitTorrent should be transparent to the users.

In fact, there already exist several implementations of it for Docker! [0, 1, 2]

[0]: https://coreos.com/blog/torrent-pulls

[1]: https://d7y.io/en-us/

[2]: https://github.com/uber/kraken
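
To make the "download layers separately and reassemble them" point concrete, here is a minimal sketch of why the source of a layer doesn't have to matter: layers are addressed by the SHA-256 digest of their content, so a blob from any peer can be checked before it is accepted. The `sources` dicts and the `fetch_layers` helper are hypothetical stand-ins for illustration, not the API of Docker, Kraken, or Dragonfly.

```python
import hashlib


def digest_of(blob: bytes) -> str:
    """Return a registry-style content digest for a blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def fetch_layers(layer_digests, sources):
    """Collect each layer from whichever source has a valid copy.

    layer_digests: ordered digests, as an image manifest would list them.
    sources: hypothetical peers, each modeled as a dict of digest -> bytes.
    """
    layers = []
    for digest in layer_digests:
        for source in sources:
            blob = source.get(digest)
            # Accept the blob only if its content hashes to the digest we asked for.
            if blob is not None and digest_of(blob) == digest:
                layers.append(blob)
                break
        else:
            # No source offered a copy that passed verification.
            raise LookupError(f"no source had a valid copy of {digest}")
    return layers
```

Because every blob is verified against its digest, a corrupted or malicious copy from one peer is skipped and the next peer is tried, which is the property that makes swapping the transport out feasible.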

Apologies for the hand-waving, but is there a well-known community sponsored public peer-to-peer registry service, based on https://github.com/uber/kraken perhaps?

+1 for this -- requiring the docker image to be built/managed on the machine it's being deployed on is the simpler architectural choice (easier to debug, etc), but it doesn't necessarily make sense for production.

I wonder if there's a ticket about this on dokku already

[EDIT] - Couldn't find anything... Some tickets about how the containers are built and changing the base image, but not much about this.

I wonder if you could jury-rig something like kraken[0] and make sure wherever you're building images is a peer or something... Of course the simpler solution might be to add a CI step that just pushes the image (via the working `docker save` method) to the deployment machine(s)? Maybe if you have a staging environment, let CI push there, then if that machine is peered (via something like kraken) with production, production will get the image (though it may never run the image).

[0]: https://github.com/uber/kraken

"the container runtime can be modified to retrieve layers identified by their CIDs"

How do you do this? Exercise for the reader? :)

For the case of distributing containers in a datacenter with P2P, there's also this work:

https://github.com/uber/kraken

1) You can run your own pull-through cache[0]

2) You can use a different registry

3) Run something like kraken[1] so machines can share already-downloaded images with each other

4) If you need an emergency response, you can docker save[2] an image on a box that has it cached and manually distribute it/load it into other boxes

0: https://docs.docker.com/registry/recipes/mirror/

1: https://github.com/uber/kraken

2: https://docs.docker.com/engine/reference/commandline/save/
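
For option 4, a rough sketch of the manual route: `docker save` writes a tar archive to stdout and `docker load` reads one from stdin, so the two can be piped together over ssh. The helper below only builds the command strings so you can review them before running anything; the image name and hosts in the usage note are made up for illustration.

```python
import shlex


def save_and_load_command(image: str, host: str) -> str:
    """Build a shell pipeline that streams a cached image to another host.

    Relies on `docker save` writing to stdout and `docker load` reading
    from stdin; arguments are shell-quoted as a precaution.
    """
    return f"docker save {shlex.quote(image)} | ssh {shlex.quote(host)} docker load"


def fan_out(image: str, hosts):
    """Return one pipeline per target host (hypothetical helper)."""
    return [save_and_load_command(image, host) for host in hosts]
```

For example, `fan_out("myapp:1.2", ["deploy@box1", "deploy@box2"])` gives two pipelines to run from the box that still has the image cached. This sends the full image once per host, so it is an emergency measure rather than a replacement for a cache or a P2P distributor.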

> Crosby explained that a registry would still be needed to handle the naming of images, but the content address blobs could be transferred from one machine to another without the need to directly interact with the registry. In the P2P model for image delivery, a registry could send a container image to one node, and then users could share and distribute images using something like BitTorrent sync.

I believe this is the basic concept of Uber's Kraken project: https://github.com/uber/kraken

I think it's a really clever idea, but I can also predict that a lot of enterprise companies will hear "bittorrent" and nope their way out of it.
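
The split in the quote (registry for naming, peers for content) can be sketched as a toy model. This is an assumption-laden illustration, not how Kraken is implemented: `Registry`, `Node`, and their methods are hypothetical names, and everything lives in memory.

```python
import hashlib


def digest_of(blob: bytes) -> str:
    """Content address for a blob, in the usual sha256:<hex> form."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


class Registry:
    """Hypothetical registry that only maps image names to digests."""

    def __init__(self):
        self.names = {}

    def publish(self, name: str, blob: bytes) -> str:
        digest = digest_of(blob)
        self.names[name] = digest
        return digest

    def resolve(self, name: str) -> str:
        return self.names[name]


class Node:
    """Hypothetical machine that stores blobs and can serve them to peers."""

    def __init__(self):
        self.blobs = {}

    def store(self, blob: bytes) -> None:
        self.blobs[digest_of(blob)] = blob

    def pull(self, name: str, registry: Registry, peers) -> bytes:
        # The only registry interaction: turn a name into a digest.
        digest = registry.resolve(name)
        for peer in peers:
            blob = peer.blobs.get(digest)
            # The content comes from a peer and is verified locally.
            if blob is not None and digest_of(blob) == digest:
                self.blobs[digest] = blob
                return blob
        raise LookupError(f"no peer has {digest}")
```

After a pull, the node holds the blob itself and can act as a peer for the next machine, so the registry never has to serve the bytes more than once in this model.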