I do a lot of Python and a lot of Docker. Mostly Python in Docker. I've used both Alpine and Ubuntu. There's a fair amount right about this article, and a lot wrong.
First: "the Dockerfiles in this article are not examples of best practices."
Well, that's a big mistake. Of course if you don't follow best practices you won't get the best results. In these examples the author doesn't even follow the basic recommendations from the Docker Alpine image page, e.g. use `apk add --no-cache PACKAGE`. When you leave the apt and apk package caches sitting in the image, of course the image is going to be a ton larger. On the flip side, he does basically exactly that cleanup for Ubuntu's apt cache.
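For what it's worth, a minimal sketch of the difference (base image tags and package names here are illustrative, not from the article):

```dockerfile
# Alpine: --no-cache avoids storing the apk package index in the layer
FROM alpine:3.18
RUN apk add --no-cache gcc musl-dev

# Ubuntu: the equivalent cleanup, which the article does do
FROM ubuntu:22.04
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc \
    && rm -rf /var/lib/apt/lists/*
```

Skipping the `--no-cache` flag on the Alpine side while cleaning up apt on the Ubuntu side makes the size comparison apples-to-oranges.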
The real article should have been "should you use Alpine for every Python/Docker project?" and the answer is "No." If you're doing something complicated that requires a lot of system libs, say machine learning or image manipulation, don't use Alpine. It's a pain. On the flip side, if all you need is a small Flask app, Alpine is a great solution.
Also, build times and sizes don't matter too much in the grand scheme of things. Unless you're changing the Dockerfile regularly, it won't matter. Why? Because Docker caches each layer of the build. So if all you do is add your app code (which changes, and is added at the end of the Dockerfile), sure, the initial build might be 10 minutes, but after that it'll be a few seconds. Docker pull caches just the same, so the initial pull might be large, but after that it's just the new layers.
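The layer ordering that makes this work looks roughly like this (filenames and base image are assumptions, not from the article):

```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Dependencies change rarely, so these layers stay cached across builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# App code changes often; only the layers from here down get rebuilt
COPY . .
CMD ["python", "app.py"]
```

If you instead `COPY . .` before installing dependencies, every code change invalidates the pip install layer and you pay the full build time every time.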
> Also, build times and sizes don't matter too much in the grand scheme of things. Unless you're changing the Dockerfile regularly, it won't matter. Why? Because Docker caches each layer of the build.
It does if you practice continuous deployment, or even if you use Docker in your local dev setup and want a sane workflow (like `docker-compose build && docker-compose up` or something). Unfortunately, the standard Docker tools are really poorly thought out, beginning with the Dockerfile build system (it assumes a linear dependency tree, offers no abstraction whatsoever, and the standard tools have no idea how to build the base images they depend on). It's absolute madness. Never mind that Docker for Mac, or whatever it's called these days, will grind your $1500 MacBook Pro to a halt if you have a container idling in the background (with a volume mount?). Hopefully you don't also need to run Slack or any other Electron app at the same time.
As for the build cache, it often fails in surprising ways. This is probably something on our end (and for our CI issues, on CircleCI's end [as far as anyone can tell, their build cache is completely broken for us and their support engineers couldn't figure it out and eventually gave up]), but when this happens it's a big effort to figure out what the specific problem is.
This stuff is hard, but a few obvious things could be improved: Dockerfiles need to be able to express the full dependency graph (like Bazel and similar tools do) and not assume linearity. Dockerfiles should also allow you to depend on or include another Dockerfile (note the difference between including another Dockerfile and inheriting from a base image). Coupled with build args, this would probably allow enough abstraction to be useful in the general case (albeit a real, expression-based configuration language is almost certainly the ideal state). Beyond that, the standard tooling should understand how to build base images (maybe this falls out of the include-other-Dockerfiles work above) so you can use a sane development workflow. And lastly, local dev performance issues should be addressed, or at least made easier to debug.
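To be fair, multi-stage builds plus build args already cover a small slice of this. A hedged sketch (stage names, arg names, and paths are made up for illustration):

```dockerfile
# A build arg gives a thin layer of parameterization
ARG PY_VERSION=3.11

# Named stages form a small DAG: later stages can COPY from any earlier one
FROM python:${PY_VERSION}-slim AS deps
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:${PY_VERSION}-slim AS runtime
COPY --from=deps /install /usr/local
COPY . /app
CMD ["python", "/app/app.py"]
```

But this is all within one Dockerfile: you still can't include another Dockerfile or have `docker build` produce an arbitrary base image for you, which is the actual complaint.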
See https://github.com/moby/buildkit. You can enable it today with `DOCKER_BUILDKIT=1 docker build ...`.
There is also buildx, an experimental tool that replaces `docker build` with a new CLI: https://github.com/docker/buildx