I wrote some simple bash scripts around git which allow me to very quickly identify the most frequently-edited files, the most recently-edited files, the largest files, etc.

https://github.com/gilesbowkett/rewind

it's for assessing a project on day one, when you join, especially for "rescue mission" consulting. it's most useful for large projects.

the idea is, you need to know as much as possible right away. so you run these scripts and you get a map which immediately identifies which files are most significant. if a file is edited frequently, was edited yesterday, was also edited on the day the project began, and is much bigger than any other file, that's obviously the file to look at first.
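to make the "most frequently-edited" part concrete, here is a minimal sketch. it is not taken from the rewind repo; the function name `most_edited` and the default of 10 are made up for illustration. it only assumes that `git log --name-only --pretty=format:` prints one changed path per line, separated by blank lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the actual rewind script.
# Reads file paths on stdin (one per line, blank lines ignored) and
# prints "count path" for the N most frequently listed paths.
most_edited() {
  local n="${1:-10}"          # how many paths to show (assumed default: 10)
  grep -v '^$' |              # drop the blank separators between commits
    sort |                    # group identical paths together
    uniq -c |                 # count each path
    sort -rn |                # highest count first
    head -n "$n"
}

# Usage, inside a git work tree:
#   git log --name-only --pretty=format: | most_edited 20
# Restricting to recent history is one way to approximate "recently edited":
#   git log --since="1 month ago" --name-only --pretty=format: | most_edited 20
```

keeping the counting separate from the `git log` call means the same function can be pointed at any list of paths, which is handy for trying it on a saved log.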

we tend to view files in a list, but in reality, some files are very central, some files are out on the periphery and only interact with a few other files. you could actually draw that map, by analyzing "require" and "import" statements, but I didn't go that far with this. those vary tremendously on a language-by-language basis and would require much cleverer code. this is just a good way to hit the ground running with a basic understanding which you will very probably revise, re-evaluate, or throw away completely once you have more context.
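for a sense of what the "require"/"import" analysis might look like in its crudest form, here is a sketch that handles only Ruby-style `require 'name'` and `require_relative "name"` lines. the function name `require_counts` is hypothetical, and this is a rough proxy for centrality rather than a real dependency graph: it just counts how often each name is required.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: a crude "how central is this file" signal.
# Reads source lines on stdin and prints "count name" for every name that
# appears in a Ruby-style require or require_relative statement.
# Other languages and dynamic requires are not handled.
require_counts() {
  sed -n -E "s/^[[:space:]]*require(_relative)?[[:space:]]+['\"]([^'\"]+)['\"].*/\2/p" |
    sort |
    uniq -c |
    sort -rn                  # most-required names first
}

# Usage, inside a git work tree:
#   git ls-files -z '*.rb' | xargs -0 cat | require_counts | head
```

names near the top are required from many places, so they are candidates for the "central" files; anything that never shows up is more likely on the periphery.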

but to answer your actual question, you do some analysis like this every time you go into an unfamiliar code base. you also need to get an idea of the basic paradigms involved, the coding style, etc. -- stuff which would be much harder to capture in a format as simple as bash scripts.

one of the best places to start is of course writing tests. Michael Feathers wrote a great book about this called "Working Effectively with Legacy Code." brudgers's comment on this is good too, but I have some small disagreements with it.

Does anybody know of a similar analysis tool for SVN projects? One option would be to convert the SVN repository to git for analysis purposes, but I'd be interested in a better solution.

never tried it myself, but you can take a look at the code accompanying the book "Your Code as a Crime Scene":

https://github.com/adamtornhill/code-maat