What does HackerNews think of go-containerregistry?

Go library and CLIs for working with container registries

Language: Go

This is one of my absolute favorite topics. Pardon me while I rant and self-promote :D

Dockerfiles are great for flexibility, and have been a critical contributor to the adoption of Docker containers. It's very easy to take a base image, add a thing to it, and publish your version.

Unfortunately, Dockerfiles are also full of gotchas, and of opaque, cargo-culted best practices for avoiding them. Because a Dockerfile is an open-ended execution environment, it's basically impossible to tell, even during the build, what's being added to the image. That has downstream implications for anybody trying to get an SBOM from the image, for example.

Instead, I contribute to a number of tools to build and manage images without Dockerfiles. Each of them is less featureful than Dockerfiles, but because they're more constrained in what they can do, you get a lot more visibility into what they're doing: they're not able to do "whatever the user wants".

1. https://github.com/google/go-containerregistry is a Go module to interact with images in the registry, in tarballs and layouts, and in the local Docker daemon. You can append layers, squash layers, modify metadata, etc. This library is used by all kinds of stuff, including buildpacks, Bazel's rules_docker, and all of the below, to build images without Docker.

2. crane is a CLI that uses the above (in the same repo) to make many of the same modifications from the command line. `crane append`, for instance, adds a layer containing some contents to an image, entirely in the registry, without even pulling the base image.

3. ko (https://ko.build) is a tool to build Go applications into images without Dockerfiles or Docker at all. It runs `go build`, appends that binary on top of a base image, and pushes it directly to the registry. It generates an SBOM declaring what Go modules went into the app it put into the image, since that's all it can do.

4. apko (https://apko.dev) is a tool to assemble an image from pre-built apks, without Docker. It's capable of producing "distroless" images easily with config in YAML. It generates an SBOM declaring exactly what apks it put in the image, since that's all it can do.
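To make item 2 a bit more concrete: appending "entirely in the registry" is possible because an image manifest refers to its layers by digest, so the base layers never need to be downloaded. The following is a minimal sketch of that idea using only the Go standard library, not the go-containerregistry API. The `Descriptor` and `Manifest` types and the `appendLayer` helper are simplified, hypothetical stand-ins; a real append also has to update the image config (for example its diff IDs) and upload the new layer blob, which are omitted here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Descriptor and Manifest are simplified, hypothetical stand-ins for the
// image manifest schema. Real manifests carry more fields.
type Descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

type Manifest struct {
	Layers []Descriptor `json:"layers"`
}

// appendLayer returns a copy of base with one extra layer descriptor.
// Only the new layer's bytes are read; base layers are referenced by digest.
func appendLayer(base Manifest, layer []byte) Manifest {
	sum := sha256.Sum256(layer)
	out := Manifest{Layers: append([]Descriptor(nil), base.Layers...)}
	out.Layers = append(out.Layers, Descriptor{
		MediaType: "application/vnd.oci.image.layer.v1.tar+gzip",
		Digest:    "sha256:" + hex.EncodeToString(sum[:]),
		Size:      int64(len(layer)),
	})
	return out
}

func main() {
	// The base manifest is all we know about the base image; its layer
	// contents are never fetched. The digest below is a placeholder.
	base := Manifest{Layers: []Descriptor{{
		MediaType: "application/vnd.oci.image.layer.v1.tar+gzip",
		Digest:    "sha256:placeholder-for-base-layer",
		Size:      1234,
	}}}

	// Placeholder bytes standing in for a real compressed layer tarball.
	updated := appendLayer(base, []byte("new layer contents"))

	pretty, err := json.MarshalIndent(updated, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pretty))
}
```

The base manifest is left untouched and the new manifest simply lists one more descriptor, which is why the operation stays cheap regardless of how large the base image is.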

Bazel's rules_docker is another contender in the space, and GCP's distroless images use it to place Debian .debs into an image. Apko is its spiritual successor, and uses YAML instead of Bazel's own config language, which makes it a lot easier to adopt and use (IMO), with all of the same benefits.

I'm excited to see more folks realizing that Dockerfiles aren't always necessary, and can sometimes make your life harder. I'm extra excited to see more tools and tutorials digging into the details of how container images work, and preaching the gospel that they can be built and modified using existing tooling and relatively simple libraries. Excellent article!

> Comparing image digests won’t work; they will never match.

This is a strong assertion with no further explanation. It reads like a generic truth about container images, but it's certainly possible to achieve this, as referenced later:

> Sidebar: Various folks in the container ecosystem are looking at enabling deterministic images. We welcome that. See Building deterministic Docker images with Bazel and DETERMINISTIC DOCKER IMAGES WITH GO MICROSERVICES.

I'll agree that docker makes it _really_ difficult to build and consume reproducible images (for a variety of reasons, see https://github.com/google/go-containerregistry/issues/895#is... and https://twitter.com/lorenc_dan/status/1343921451792003073 for a sampling of interesting ones), but there is more to the container ecosystem than docker or Dockerfiles.

Shameless plug: I help maintain ko (https://github.com/google/ko), which can achieve reproducible builds for Go projects without much fuss. It also leans heavily on Go's excellent support for cross-compilation to produce multi-platform images, trivially.

> There are two cases where the container-diff tool will report that the registry and local images that you are comparing are the same (in terms of Docker history lines), but will be misleading because the images are actually different.

While container-diff is great, it can obscure what's really going on a bit. If you're interested in uncovering exactly why the digest of the image you built is different from what was published, please forgive another shameless plug for crane (https://github.com/google/go-containerregistry/blob/main/cmd...), a tool I wrote to expose most of the functionality of go-containerregistry (https://github.com/google/go-containerregistry), which is the library both container-diff and ko use under the hood.

Forgive the sparse documentation, but it should be relatively straightforward for anyone familiar with the registry API and data structures, as the commands map pretty directly to registry functionality. Using crane, you can easily inspect the image in the registry directly to compare the manifests and blobs that make up an image.

For example, one reason that the digests might never match is that these images are somewhat strangely wrapped as singleton manifest lists: https://gist.github.com/jonjohnsonjr/ffba104ca504b5bb4a1f227...

It makes some sense to me that they might want to do this to prevent folks from pulling this on Windows, but usually you would only encounter manifest lists for multi-platform images. Even if these builds were reproducible, you would have to compare the digest of what you built with sha256:9a210bb9cbbdba5ae2199b659551959cd01e0299419f4118d111f8443971491a -- not the sha256:fb1a43b50c7047e5f28e309268a8f5425abc9cb852124f6828dcb0e4f859a4a1 that docker outputs, as shown in the article.
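The reason the two digests differ is that a digest is just the SHA-256 of the raw bytes of a document, and a manifest list is a different document from the image manifest it points to. A minimal sketch, using hypothetical and heavily abbreviated JSON rather than the real manifests from the article:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestOf returns the registry-style digest ("sha256:<hex>") of raw bytes.
func digestOf(raw []byte) string {
	sum := sha256.Sum256(raw)
	return "sha256:" + hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical, abbreviated image manifest; real ones list a config and layers.
	imageManifest := []byte(`{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json"}`)
	imageDigest := digestOf(imageManifest)

	// A singleton manifest list that references the image manifest by digest.
	manifestList := []byte(fmt.Sprintf(
		`{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.list.v2+json","manifests":[{"digest":%q}]}`,
		imageDigest))

	fmt.Println("image manifest:", imageDigest)
	fmt.Println("manifest list: ", digestOf(manifestList))
}
```

The two printed digests are different even though the list contains exactly one image, so comparing a locally built image's digest against the list's digest can never succeed.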

The tag used for this example (mcr.microsoft.com/dotnet/sdk:5.0-alpine) has since been updated. Comparing this with the original using container-diff just tells us that the size changed: https://gist.github.com/jonjohnsonjr/90c2def551833c8cacf3264...

But looking at the actual manifests, config blobs, and layers using crane is often faster and more interesting: https://gist.github.com/jonjohnsonjr/283eab27d996b2f4cc04553...

My intention with crane is to be easily composable so that you can use familiar tools like tar, sort, diff, jq, etc.

(To be fair to container-diff, you can use the -t flag to show similar things.)

I realize this is not really the point of the article, but it's a huge pet peeve of mine that everyone has just given up on understanding what's going on with their images because the tooling UX makes everything so opaque. If the digest of something doesn't match, you should know why! It's as if `git push --force` was on by default and everyone has just accepted that reality.

Now to read the rest of the article :)

Most CI systems used GET requests to fetch image manifests, in order to see what the registry's most recent image is. These requests are counted towards the limits in Docker's new rules.

Systems built on top of the GGCR library[0] are switching to using HEAD requests instead[1]. These don't fetch the entire manifest, instead relying on just headers to detect that a change has occurred.

[0] https://github.com/google/go-containerregistry

[1] https://github.com/concourse/concourse/releases/tag/v6.7.0
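As a rough sketch of the HEAD-based check (standard library only, not the GGCR implementation): registries report a manifest's digest in the `Docker-Content-Digest` response header, so a client can compare that value with the last digest it saw without downloading the manifest body. The `headDigest` and `newFakeRegistry` helpers and the repository path are hypothetical, and a real registry would also require an auth token, which is omitted here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// headDigest asks for a manifest's digest with a HEAD request, so no
// manifest body is transferred.
func headDigest(client *http.Client, manifestURL string) (string, error) {
	req, err := http.NewRequest(http.MethodHead, manifestURL, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Accept", "application/vnd.docker.distribution.manifest.v2+json")
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	return resp.Header.Get("Docker-Content-Digest"), nil
}

// newFakeRegistry is a local stand-in for a registry (an assumption for this
// sketch). It answers HEAD requests with a fixed digest and rejects other methods.
func newFakeRegistry(digest string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodHead {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Docker-Content-Digest", digest)
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newFakeRegistry("sha256:placeholder-digest")
	defer srv.Close()

	lastSeen := "sha256:older-placeholder-digest"
	current, err := headDigest(srv.Client(), srv.URL+"/v2/example/app/manifests/latest")
	if err != nil {
		panic(err)
	}
	// Only fetch the full manifest (a GET) when the digest has changed.
	fmt.Println("changed:", current != lastSeen)
}
```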

For folks coming to this later, a correction. The buildpacks team are using the "crane" tool in go-containerregistry[0], rather than Kaniko, as the basis for their daemonless container-building containers.

There is so much container-related work coming out of Google right now that I am struggling to keep up.

[0] https://github.com/google/go-containerregistry

- We've published libraries to interact with the registry without docker or the docker CLI, which we use in these projects

https://github.com/google/go-containerregistry

https://github.com/google/containerregistry

- Our team has built something exactly like you're describing https://github.com/GoogleCloudPlatform/distroless

Dockerfiles without RUN commands are technically more correct: reproducible, much easier to inspect. However, it's quite limiting for the existing corpus of Dockerfiles.

I like to think of kaniko as the (pull) + build + push decoupling of the docker monolith. Other tools, like cri-o, have implemented the complement (pull + run).

Disclaimer: I work on kaniko and some of these other tools at Google