There are two competing needs to be considered when figuring out what your workflow should be in regard to history.

Both come from the fundamental question: “When (if) we look back in history, what are we looking for?” Keeping everything as it was reduces the risk of deleting something that will later be important; consolidating is supposed to reduce the risk of missing the needle in the haystack or discouraging looking back at all.

Curating the past is 99% wasted effort since looking back is rare. I think the best compromise is to add some automation if you really care, as Ted suggested.

woolion

>Curating the past is 99% wasted effort since looking back is rare.

This is the worst kind of self-fulfilling prophecy. It is exactly the same as 'tidying your home is not worth it because you will need to search for things anyway'. For history to be useful, you need proper atomic commits and useful messages, just as a home needs good organization; otherwise looking back has little added value. It's also something with little overhead if the discipline is integrated into your workflow.

The only reason you wouldn't need a good history is if your Git repository is 100% bug-free. Only then would you not need to understand why or how a bug was introduced, or whether some weird piece of code is handling a very specific edge case or was just poorly written. Is the bug generalizable? That's also something you'd probably see quickly by knowing whether it was introduced in a local commit or a refactoring one.

Code-wise, 'history-obliviousness' (or the lack of proper Git hygiene) is among the worst banes of programmers, I believe.

fulafel

I guess "curate" could be interpreted either way. It would seem possible you are both arguing for preserving history but interpreting the word differently...

woolion

I agree the parent comment is making a great point, but I disagree with its conclusion that "curating the past is wasted effort", because it is tremendously useful if done right, whereas thinking that way creates the problem: a bad history is useless. You can argue the same about documentation: bad documentation is not read, so it is useless to write it (well, in fact you have to write it properly).

The definition of curate is 'to apply selectivity and taste to' a collection, so I'd say it does mean both. I have not built a theory of what is the most useful, but let's take the two extremes. On the one end you have "push only" development, which commits bad program states and all their fixes. It's bad because it adds way too much noise to the history. On the other end, you have "squash only" development, where one polished feature is pushed in one commit. It's a huge diff that carries little more information than the code itself and loses all sub-feature milestones and discussions, so it is mostly useless.

In a way, imagine you have to teach something by demonstration. You don't want the student to get lost by you screwing up the details. You also need to chunk that information into a set of simpler and well-articulated parts. If done well, your git history carries the information of your process in a very similar way.

You have to be somewhere in the middle, so I'd say do a semantic rebase as the last step before merging. A fantastic tool that is not so well-known is git-absorb, which helps a lot with doing that cleanly and automatically.

https://github.com/tummychow/git-absorb
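
For anyone who hasn't seen the flow this builds on, here is a minimal sketch of the fixup-and-autosquash step using only stock Git commands. The throwaway repository, file names, and commit messages are made up for illustration. git-absorb itself is not invoked here; as I understand it, `git absorb --and-rebase` automates the part where you pick the target commit and create the fixup by hand.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical) so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Two atomic commits, each touching its own file.
echo "parse v1" > parser.txt
git add parser.txt
git commit -q -m "Add parser"

echo "render v1" > renderer.txt
git add renderer.txt
git commit -q -m "Add renderer"

# A late fix that belongs in the first commit, not in a new "oops" commit.
# At this point HEAD~1 is "Add parser". This manual targeting is the step
# that git-absorb is meant to work out from the staged changes.
echo "parse v2" > parser.txt
git add parser.txt
git commit -q --fixup=HEAD~1

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# generated todo list unchanged, so the interactive rebase needs no input.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash --root

# Show the resulting commit subjects, newest first.
git log --format=%s
```

After the rebase the history is back to the two original commits, with the fix folded into "Add parser" rather than trailing behind as noise.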