I've been playing with some ideas for creating a SQLite database of classes, functions and suchlike found in Python code, so I can analyze my codebases with SQL queries.

I've had some good initial results with https://github.com/davidhalter/jedi, the Python introspection library that powers various editor autocomplete implementations. I have a prototype that uses it to create a SQL database of functions, classes and the places where they are used.
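
To make the idea concrete, here is a minimal sketch of that kind of index. It is not the jedi-based prototype: it swaps in the standard library's `ast` module so it runs without dependencies, and the table names (`definitions`, `calls`), the `index_source` helper and the sample code are all made up for illustration. It also only matches calls by bare name, which is much cruder than what jedi can resolve.

```python
import ast
import sqlite3


def index_source(conn, path, source):
    """Record definitions and call sites found in `source` (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS definitions "
        "(path TEXT, kind TEXT, name TEXT, line INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS calls (path TEXT, name TEXT, line INTEGER)"
    )
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            conn.execute(
                "INSERT INTO definitions VALUES (?, 'function', ?, ?)",
                (path, node.name, node.lineno),
            )
        elif isinstance(node, ast.ClassDef):
            conn.execute(
                "INSERT INTO definitions VALUES (?, 'class', ?, ?)",
                (path, node.name, node.lineno),
            )
        elif isinstance(node, ast.Call):
            func = node.func
            # f(...) gives a Name; obj.f(...) gives an Attribute; skip anything else.
            if isinstance(func, ast.Name):
                name = func.id
            elif isinstance(func, ast.Attribute):
                name = func.attr
            else:
                name = None
            if name is not None:
                conn.execute(
                    "INSERT INTO calls VALUES (?, ?, ?)", (path, name, node.lineno)
                )
    conn.commit()


# Made-up sample code to index.
example = '''
class Greeter:
    def greet(self, name):
        return format_greeting(name)

def format_greeting(name):
    return "Hello, " + name

print(Greeter().greet("world"))
'''

conn = sqlite3.connect(":memory:")
index_source(conn, "example.py", example)

# Which defined functions and classes are called, and where?
used = conn.execute(
    "SELECT d.kind, d.name, c.line FROM definitions d "
    "JOIN calls c ON c.name = d.name ORDER BY c.line, d.name"
).fetchall()
```

Once the rows are in SQLite, questions like "which functions are never called" become a plain `LEFT JOIN` rather than a custom script.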

I've also been playing with https://github.com/github/semantic, which can parse Python, JavaScript and other languages and offers a --json-symbols option that dumps out a JSON object showing the symbols (functions, variables, etc.) found in the code.

Oh, this sounds cool! When I was writing this post I was thinking about ASTs and transforming them into domain-specific semantic graphs.

E.g. that `run_config` example would be a generic "method call" node in an AST, but a domain-specific AST crawler could recognize that a "method call node whose name is `run_config`" should be replaced with a semantically meaningful `run_config` node.

And maybe that would be an interesting way to build up a conceptual graph of a codebase?
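
A rough sketch of the crawler idea, using Python's standard `ast` module. The `RunConfig` dataclass, the `DomainCrawler` class and the sample calls are hypothetical names invented here, and the sketch collects domain nodes alongside the AST rather than replacing nodes in the tree.

```python
import ast
from dataclasses import dataclass


@dataclass
class RunConfig:
    """Hypothetical domain-specific node for a `run_config` method call."""
    line: int
    keywords: list


class DomainCrawler(ast.NodeVisitor):
    """Walk a generic AST and lift matching method calls into domain nodes."""

    def __init__(self):
        self.nodes = []

    def visit_Call(self, node):
        func = node.func
        # A method call shows up as a Call whose func is an Attribute.
        if isinstance(func, ast.Attribute) and func.attr == "run_config":
            self.nodes.append(
                RunConfig(node.lineno, [kw.arg for kw in node.keywords])
            )
        # Keep descending so nested calls are seen too.
        self.generic_visit(node)


# Made-up input; the real `run_config` call may look different.
source = '''
pipeline.run_config(env="prod", retries=3)
pipeline.cleanup()
other.run_config(env="dev")
'''

crawler = DomainCrawler()
crawler.visit(ast.parse(source))
semantic_nodes = crawler.nodes
```

Each collected node could then become a vertex in the conceptual graph, with edges added by further domain-specific rules.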

I'm with you on this one. I'd actually suggest that https://github.com/CoatiSoftware/Sourcetrail could be extended to do this, though I haven't found the time yet to try it on my own codebases. For example, there is https://github.com/CoatiSoftware/SourcetrailPythonIndexer, and under the hood the file format is SQLite: https://github.com/CoatiSoftware/SourcetrailDB