I’m a backend python software engineer in my day job, working on production code in a standard git, text editor, .py files workflow. I occasionally use Jupyter notebooks (e.g. yesterday for the AdventOfCode challenges). I am ashamed to say that despite the beautiful front-end that works so well, I absolutely hate using Jupyter notebooks.
- The mutable state with global variables from all cells in scope drives me crazy. I just want to run an ephemeral script repeatedly so I can be sure what the state is during execution.
- The process of starting the server and moving code from version-controlled .py files into notebook cells to become part of Untitled(25).ipynb, which can’t be sanely version controlled, drives me crazy.
- Not being able to use my normal text editor drives me crazy.
Instead of building up lines of well-tested python functions in a disciplined line-by-line fashion, periodically committed to git with readable diffs, I end up with a chaotic mess of code fragments in various cells with no confidence regarding what will happen when some subset of the cells are executed, and none of it in git let alone with a sane commit history.
I’ve tried so many times over the last 10 years, and I feel bad because it’s such an amazing project, but I really dislike the experience of using Jupyter instead of standard python development tools.
The reason I do it is for graphics. This isn’t a Jupyter gripe other than the diverted attention, but why the fuck can’t I just import matplotlib in a normal python file? (under macOS it throws something about “frameworks” which no-one cares to understand. I think there’s some incantation that makes it work but seriously, this is ridiculous). And maybe draw graphics to a GUI graphics widget from the (i)python shell like R does.
(No need to reply "Because you haven't written it"! This is deliberately a rant; I contribute to open source projects.)
I've never tried macOS, but matplotlib works fine under Windows and Linux. Maybe you could save plots to image files on disk instead of showing them? I once ran code that used pyplot on a server, and I needed something like `matplotlib.use('Agg')` to keep the code from crashing because there was no graphical display.
See this SO answer: https://stackoverflow.com/a/34583288/2476920
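For what it's worth, here is a minimal sketch of the "save to a file instead of showing it" approach from a plain .py script. The helper name `save_line_plot`, the sample data, and the output file name are made up for illustration; the only parts taken from the comment above are the `Agg` backend and the idea of writing images to disk. It assumes matplotlib is installed.

```python
import os
import tempfile

import matplotlib

# Select the non-interactive Agg backend before pyplot is imported,
# so no GUI toolkit (and no display) is needed.
matplotlib.use("Agg")
import matplotlib.pyplot as plt


def save_line_plot(xs, ys, path):
    """Draw a simple line plot and write it to `path` without opening a window.

    Hypothetical helper, named here only for illustration.
    """
    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.savefig(path)
    plt.close(fig)  # free the figure so repeated runs don't accumulate open figures
    return path


# Illustrative data and output location (both are assumptions).
out_path = save_line_plot(
    [0, 1, 2, 3],
    [0, 1, 4, 9],
    os.path.join(tempfile.gettempdir(), "example_plot.png"),
)
print(f"wrote {out_path}")
```

Run it like any other script and open the resulting PNG with whatever image viewer you like; nothing here depends on a notebook or a running server.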
Personally, I love having notebook cells so I can code without re-running everything. Especially in the case of deep learning, where training a model takes a long time. Jupyter is very good for creating and debugging code that: A) needs a trained model loaded for it to work, but you want to skip the part of saving/loading the model, or B) saves and then loads a model.
If the "mutable state with global variables" drives you crazy, you may want to avoid reusing the same variable names from one cell to another, and restart the notebook kernel more often. Also, avoid side effects (such as writing to or loading from disk) and try to write what are called pure functions (e.g. avoid singletons or service locators; pass references instead). If your code is clean and does not have too many side effects, you should be able to work in notebooks without headaches.
EDIT: typo.
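To make the "pure functions" suggestion concrete, here is a small sketch. The function names and the data are invented for illustration. Each function reads only its arguments and returns a new value, so the result of a cell does not depend on which other cells ran before it.

```python
def normalize(values):
    """Return a new list scaled so it sums to 1; the input list is not modified."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize values that sum to zero")
    return [v / total for v in values]


def top_k(values, k):
    """Return the k largest values as a new list, largest first."""
    return sorted(values, reverse=True)[:k]


# Each step takes its inputs explicitly and produces a fresh result,
# instead of mutating a shared global that other cells also touch.
raw = [2, 6, 2]
weights = normalize(raw)
best = top_k(weights, 2)
print(raw, weights, best)
```

Because nothing is mutated in place, re-running any one of these lines (or the cell containing it) gives the same answer every time, which removes most of the "what state am I in?" confusion.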
Also, you should be able to use your favorite editor for the code outside notebooks (over time, more and more of the code will live outside your notebook). You might often work in the editor, and at other times in the notebook, depending on the nature of the work. As the project advances, notebooks become less and less important; they mostly serve to kickstart projects.
But in general I wonder whether this is what I'm looking for: https://github.com/daleroberts/itermplot