What does HackerNews think of nbstripout?
strip output from Jupyter and IPython notebooks
Once the hook was in place, git diff worked well enough that no other diffing tool was needed.
You can also set it up as a 'filter', so it automatically runs before any git operations, whether it's add, commit, diff or an interactive rebase.
It eases most of the pain of version control. You can use it as a 'git filter', so that only inputs are shown in diffs and committed (this also works with interactive adding!), while outputs stay in your working tree.
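For anyone curious what the 'filter' setup amounts to, here is a minimal sketch of wiring it up by hand in a throwaway repo. It assumes git is on your PATH and that nbstripout, run with no arguments, strips a notebook from stdin to stdout. The temporary repo is purely illustrative, and the settings that `nbstripout --install` writes may differ in detail between versions.

```shell
# Sketch: configure an nbstripout clean filter by hand in a scratch repo.
# Assumption: git is installed. nbstripout itself only has to be installed
# once git actually runs the filter (on add, commit, diff, ...).
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"

# "clean" runs on content as it is staged; "smudge" runs on checkout.
# Using cat as the smudge command leaves files untouched on checkout.
git config filter.nbstripout.clean nbstripout
git config filter.nbstripout.smudge cat

# Tell git to apply the filter to every notebook in this repo.
printf '*.ipynb filter=nbstripout\n' > .gitattributes

# Show what was configured.
git config --get-regexp '^filter\.nbstripout\.'
```

Because the clean filter only touches what goes into the index, the outputs in your working copy are left alone.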
There's also:
- nbstripout[2] for stripping outputs automatically before every commit
- nbdime[3] for diffing notebooks locally
- jupytext[4] for converting notebooks to Markdown and vice versa
[2] https://github.com/kynan/nbstripout
$ pip install --upgrade nbstripout # install nbstripout bin
$ nbstripout --install # install Git hook in current repo
Then, any .ipynb files that you check in will have their output stripped in the index (without affecting your working copy).
(Surprised it's not mentioned in the article.)
With that approach the notebooks are clean, but they are still fairly poor for evaluating diffs between versions. If code review and diffs matter more than preserving the notebook, you could use a post-save hook to convert the notebook's input to a .py file and its output to .html:
https://towardsdatascience.com/version-control-for-jupyter-n...
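A post-save hook would run this conversion automatically on every save. As a rough sketch of the same conversion done by hand (not the linked article's code), the helper below assumes `jupyter nbconvert` is installed. The function name `export_notebook` and the file name `analysis.ipynb` are made up for illustration.

```shell
# Hypothetical helper: export a notebook's code and its rendered output.
# Assumption: `jupyter nbconvert` is available on PATH.
export_notebook() {
    nb="$1"
    # Code cells only; for a Python notebook this typically yields a .py file.
    jupyter nbconvert --to script "$nb"
    # Rendered notebook with outputs included, as a standalone .html file.
    jupyter nbconvert --to html "$nb"
}

# Usage (analysis.ipynb is a placeholder name):
#   export_notebook analysis.ipynb
```

The .py file is what you would review and diff, and the .html file keeps a readable record of the outputs.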
pip install --upgrade nbstripout
nbstripout --install