What does HackerNews think of pip-tools?
A set of tools to keep your pinned Python dependencies fresh.
For Deno, as the article points out, it's easy: Deno has this feature baked in, and we just have to `deno run` the scripts. It's similar for Go, where the imports are pretty much fully qualified pointers to their source and version.
For Python, we have to do a bit of machinery: we parse the AST, look at the imports, and use the heuristic that most import names correspond to their PyPI package (we also maintain a list of exception mappings). We pip-compile [2] all the imports and get a requirements file that we attach to the script and pip-install before running it (then we do a LOT of magic to cache the dependencies so one doesn't actually have to install them 99.9% of the time).
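This isn't Windmill's actual code, just a minimal sketch of the pipeline described above: scan a script's imports with `ast`, map import names to PyPI names through an exception table, write a `requirements.in`, and hand it to pip-compile. The exception table, file names, and stdlib filtering are illustrative assumptions.

```python
import ast
import subprocess
import sys

# Illustrative exception table: import name -> PyPI distribution name.
IMPORT_TO_PYPI = {"yaml": "PyYAML", "PIL": "Pillow", "bs4": "beautifulsoup4"}

def top_level_imports(path: str) -> set:
    """Collect the top-level module names a script imports."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    return names

def compile_requirements(script: str) -> None:
    # Drop stdlib modules (Python 3.10+), then apply the heuristic that the
    # import name is the PyPI name unless it appears in the exception table.
    imports = top_level_imports(script) - set(sys.stdlib_module_names)
    packages = sorted(IMPORT_TO_PYPI.get(name, name) for name in imports)
    with open("requirements.in", "w") as f:
        f.write("\n".join(packages) + "\n")
    # Let pip-tools resolve and pin everything (pip-tools must be installed).
    subprocess.run(
        ["pip-compile", "requirements.in", "--output-file", "requirements.txt"],
        check=True,
    )

if __name__ == "__main__":
    compile_requirements(sys.argv[1])
```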
Note: if you're interested in a collection of single-file scripts for many APIs that you can just copy-paste and call the main function of, that's what Windmill's hub is for [3].
[1]: https://github.com/windmill-labs/windmill [2]: https://github.com/jazzband/pip-tools [3]: https://hub.windmill.dev/
Instead, install pip-tools[0], then use the pip-compile command. This way I can install the same bill of materials every time.
Why?
pip freeze will also pin dependencies of your dependencies, which makes your requirements.txt hard to read and extend.
Never manually create requirements.txt either because a programmer's job is to automate boring tasks like dependency pinning.
Simple and effective in my experience.
How is this falling short? Why such a push for different tooling? Why the continued complaints about Python packaging being substandard? Honest questions because I've been using this for web and cli applications for years without issue.
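For concreteness, this is roughly what that workflow looks like on disk. The packages and pins below are purely illustrative, and pip-compile's exact header and annotation format varies between versions:

```
# requirements.in  (hand-maintained: top-level deps only)
django
requests

# requirements.txt  (generated by `pip-compile requirements.in`; never edited by hand)
asgiref==3.7.2             # via django
certifi==2023.7.22         # via requests
charset-normalizer==3.3.0  # via requests
django==4.2.5              # via -r requirements.in
idna==3.4                  # via requests
requests==2.31.0           # via -r requirements.in
sqlparse==0.4.4            # via django
urllib3==2.0.7             # via requests
```

Running `pip-sync requirements.txt` afterwards brings the virtualenv exactly in line with the compiled file.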
I also use pipx[2] to install Python based tools onto my system so they have their own virtualenv.
https://github.com/jazzband/pip-tools
And you can still use a standard setup.cfg and pip install -e, unlike Poetry. Also, it's much faster.
1. You have to manually freeze packages (instead of getting an automatic package-lock.json).
2. Each time you install/remove a package, dependent packages are not removed from the freeze. You have to do it manually. (interesting link: https://github.com/jazzband/pip-tools)
3. The freeze is a flat list (npm can restore the tree structure).
* Basically I would have a single bash script that every `.py` entrypoint links to.
* Beside that symlink is a `requirements.in` file that just lists the top-level dependencies I know about.
* There's a `requirements.txt` file generated via pip-tools that lists all the dependencies with explicit version numbers.
* The bash script then makes sure there's a virtual environment in that folder & the installed package list matches exactly the `requirements.txt` file (i.e. any extra packages are uninstalled, any missing/mismatched version packages are installed correctly).
This was great because during development, if you want to add a new dependency or change the installed version (i.e. pip-compile -U to update the dependency set), it didn't matter what the build server had & it could test any diff independently & inexpensively. When developers pulled a new revision, they didn't have to muck about with the virtualenv - they could just launch the script without thinking about Python dependencies. Finally, unrelated pieces of code would have their own dependency chains, so there wasn't even a global project-wide set of dependencies (e.g. if one tool depends on component A, the other tools don't need to).
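The original wrapper was a bash script whose exact contents aren't shown; purely as an illustration, a rough Python equivalent of the behaviour described (venv paths, the pip-sync approach, and the argv convention are all assumptions) could look like this:

```python
#!/usr/bin/env python3
"""Hypothetical re-creation of the wrapper described above (the real one was
a bash script): ensure a local venv exists and matches requirements.txt
exactly, then run the entrypoint given as the first argument.
POSIX venv layout (bin/) assumed."""
import os
import subprocess
import sys
from pathlib import Path

HERE = Path(__file__).resolve().parent
VENV = HERE / ".venv"
REQS = HERE / "requirements.txt"

if not VENV.exists():
    # First run: create the venv and put pip-tools in it so pip-sync exists.
    subprocess.run([sys.executable, "-m", "venv", str(VENV)], check=True)
    subprocess.run([str(VENV / "bin" / "pip"), "install", "pip-tools"], check=True)

# pip-sync installs anything missing or mismatched and uninstalls extras,
# so the venv ends up matching requirements.txt exactly.
subprocess.run([str(VENV / "bin" / "pip-sync"), str(REQS)], check=True)

# Hand off to the real entrypoint inside the synced venv.
entrypoint, *args = sys.argv[1:]
python = str(VENV / "bin" / "python")
os.execv(python, [python, entrypoint, *args])
```

pip-sync (shipped with pip-tools) is what provides the "extra packages uninstalled, missing/mismatched packages installed" behaviour described above.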
I viewed the lack of `setup.py` as a good thing - deploying new versions of tools was a git push away rather than relying on chef or having users install new versions manually.
This was the smoothest setup I've ever used for running python from source without adopting something like Bazel/BUCK (which add a lot of complexity for ingesting new dependencies as you can't leverage pip & they don't support running the python scripts in-place).
This didn't solve having multiple versions of Python on the host. That was managed by a bootstrap script, written in Python 2, that would set up the development environment to a consistent state (i.e. install Homebrew, install required packages) and that anyone wanting to run the tools would run (no "getting started" guides). It also versioned itself & was idempotent (generally robust against being run multiple times). We also shipped this to our external partners in the factory. It generally worked well, as once you ran the necessary scripts no further internet access was required.
It wasn't easy, but eventually it worked super reliably.
For me, most of the pain with Python packaging went away after I started using Pip-tools[0]. It's just a simple utility that adds lockfile capabilities to Pip. Nothing new to learn, no new philosophies or paradigms, no PEP waiting to be adopted by everyone. Just good old requirements.txt + Pip.
Having lived more in the JS ecosystem for the last several years, my ideal workflow would be a copy of how the Yarn package manager works:
- Top-level dependencies defined in an editable metadata file
- Transitive dependencies with hashes generated based on the calculated dependency tree
- All dependencies installed locally to the project, in the equivalent of a `node_modules` folder
- All package tarballs / wheels / etc. cached locally and committed in an "offline mirror" folder for easy and consistent installation (see the sketch after this list)
- Attempting to reinstall packages when they already are installed in that folder should be an almost instantaneous no-op
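The offline-mirror and consistent-install bullets, at least, can be approximated with plain pip today. A hedged sketch (the folder name, the subprocess wrapping, and the two-step split are my own choices, not something from the comment):

```python
"""Vendor all pinned requirements into a committed folder, then install
strictly from that folder with no network access."""
import subprocess
import sys

MIRROR = "vendor"  # hypothetical folder name, committed to the repo

def vendor(requirements: str = "requirements.txt") -> None:
    # Download every pinned requirement (wheels/sdists) into ./vendor.
    subprocess.run(
        [sys.executable, "-m", "pip", "download", "--dest", MIRROR, "-r", requirements],
        check=True,
    )

def install_offline(requirements: str = "requirements.txt") -> None:
    # --no-index disables PyPI; --find-links points pip at the local mirror.
    subprocess.run(
        [sys.executable, "-m", "pip", "install",
         "--no-index", "--find-links", MIRROR, "-r", requirements],
        check=True,
    )

if __name__ == "__main__":
    vendor()
    install_offline()
```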
PEP-582 (adding standard handling for a "__pypackages__" folder) appears to be the equivalent of `node_modules` that I've wanted, but tools don't seem to support it yet. I'd looked through several Python packaging tools over the last year, and none of them did (including Poetry [0]).
The only tool that I can find that really supports PEP-582 atm is `pythonloc` [1], which is really just a wrapper around `python` and `pip` that adds that folder to the path. Using that and `pip-tools`, I was able to mostly cobble together a workflow that mimics the one I want: I wrote a `requirements.in` file with my main deps, generated a `requirements.txt` with the pinned versions and hashes with `pip-compile`, downloaded and cached them using `pip`, and installed them locally with `piploc`.
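For reference, the core of what a PEP 582-style loader has to do is small; roughly this (a simplification of the idea, not pythonloc's actual source):

```python
# Simplified illustration of the PEP 582 lookup: prepend
# ./__pypackages__/<major>.<minor>/lib to sys.path before importing anything.
import sys
from pathlib import Path

pkg_dir = (
    Path.cwd()
    / "__pypackages__"
    / f"{sys.version_info.major}.{sys.version_info.minor}"
    / "lib"
)

if pkg_dir.is_dir():
    # Local packages win over site-packages, much like node_modules.
    sys.path.insert(0, str(pkg_dir))
```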
Admittedly, I've only tried this out once a few weeks ago on an experimental task, but it seemed to work out sufficiently, and I intend to implement that workflow on several of our Python services in the near future.
If anyone's got suggestions on better / alternate approaches, I'd be interested.
[0] https://github.com/python-poetry/poetry/issues/872
[1] https://github.com/jazzband/pip-tools [2] https://docs.pipenv.org/en/latest/ [3] https://tox.readthedocs.io/en/latest/
https://github.com/jazzband/pip-tools
https://github.com/jazzband/pip-tools https://gist.github.com/hynek/5e85706cee589a204251b333595853...
https://github.com/jazzband/pip-tools would be what I used before pipenv came to be.
Standard procedure: it takes two minutes and I have a Jupyter notebook up, with smartly automated base processes for dependency management, full control of environment variables, and something I can deploy/integrate into any other Python setup (if I were to port the IPython code to pure Python).
Is this more a case of to each his own, or am I missing a crucial advantage of Anaconda?
[1] https://github.com/kennethreitz/autoenv [2] https://github.com/jazzband/pip-tools [3] https://github.com/pyupio/pyup
We are using `pip-tools` to manage that: https://github.com/jazzband/pip-tools