I've been playing with some ideas for creating a SQLite database of classes, functions and suchlike found in Python code, so I can analyze my codebases with SQL queries.
I've had some good initial results with https://github.com/davidhalter/jedi - which is the Python introspection library that powers various editor autocomplete implementations. I have a prototype which uses that to create a SQL database of functions, classes and places that they are used.
I've also been playing with https://github.com/github/semantic - it can parse Python, JavaScript and other languages and offers a --json-symbols option which dumps out a JSON object showing the symbols (functions, variables etc) found in the code.
Oh this sounds cool! When I was writing this post I was thinking about ASTs and transforming them into domain-specific semantic graphs.
E.g. that `run_config` example would be a generic "method call" node in an AST, but a domain-specific AST crawler could recognize that a "method call node whose name is `run_config`" should be replaced with a semantically-meaningful `run_config` node.
And maybe that would be an interesting way to build up a conceptual graph of a codebase?