As someone dabbling in ML for the first time, setting up the correct Python env has been a nightmare just to run some sample code. I am familiar with Docker and would love if this got more mainstream adoption.

Fighting with dependencies sucks when you aren’t intimately familiar with the ecosystem: Python, Node and the rest of the JS world, and even Go.

What you need, in my opinion, is the conda package manager (from Anaconda Python). Docker is very powerful, but it's better suited to other purposes (e.g. deploying code) than to simply grabbing a bunch of Python modules and running analyses in a notebook.
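For illustration, a typical conda workflow might look something like this (a sketch only; the environment name, Python version, and package list are just examples, and it assumes conda is already installed):

```shell
# Create an isolated environment with Python plus a scientific stack.
# conda resolves binary dependencies itself, so there is no
# compiler or system-library wrangling to get NumPy et al. working.
conda create -n ml-sandbox python=3.11 numpy pandas scikit-learn jupyter

# Switch into the environment and start a notebook.
conda activate ml-sandbox
jupyter notebook
```

The point is that one command gives you a working, self-contained scientific Python install, without touching the system Python.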

I'm asking as someone with mid-level Python & DS experience, and out of genuine curiosity (not a "mine is better than yours" mindset): could you elaborate on what, in your opinion, makes Anaconda superior? I frequently do data munging & low-level ML work, and so far I've been happier on the pip side of things, with a combination of virtualenv, autoenv [1], pip-tools [2], pyup [3] and the rest of the ecosystem.

It's a standard procedure: within two minutes I have a Jupyter notebook up, sensibly automated dependency management, full control over environment variables, and I can deploy or integrate this into any other Python setup (say, if I were to port IPython code to pure Python).
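As a rough sketch of that procedure (file and directory names are illustrative, and it assumes Python 3 and the mentioned tools are installed):

```shell
# Create and enter an isolated environment (virtualenv, or the
# stdlib venv module shown here, both work).
python3 -m venv .venv
source .venv/bin/activate

# Declare only top-level dependencies, then let pip-tools pin the tree.
echo "jupyter" > requirements.in
pip-compile requirements.in   # writes a fully pinned requirements.txt
pip-sync requirements.txt     # installs exactly what is pinned

jupyter notebook
```

With autoenv, activating the environment on `cd` can be automated as well, which is most of what "two minutes" buys you.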

Is this more a case of "to each his own", or am I missing a crucial advantage of Anaconda?

[1] https://github.com/kennethreitz/autoenv

[2] https://github.com/jazzband/pip-tools

[3] https://github.com/pyupio/pyup