And there are yet more speed-ups coming in Python 3.12! It will also support the Linux perf profiler, so you can look into Python call stacks and see Python function names in perf output, and the error messages are getting even better.

https://docs.python.org/dev/whatsnew/3.12.html
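
For anyone who wants to try the perf integration, here is a minimal sketch of turning it on from inside a script. It assumes Python 3.12+ on Linux; `enable_perf_trampoline` and `fib` are just illustrative names I made up, and on older or unsupported interpreters the helper simply returns False.

```python
import sys


def enable_perf_trampoline() -> bool:
    """Try to switch on perf support at runtime; return True if it is active."""
    # sys.activate_stack_trampoline was added in Python 3.12.
    activate = getattr(sys, "activate_stack_trampoline", None)
    if activate is None:
        return False  # interpreter is older than 3.12
    try:
        activate("perf")
    except (OSError, ValueError, RuntimeError):
        # Platform or build without perf trampoline support.
        return False
    return sys.is_stack_trampoline_active()


def fib(n: int) -> int:
    """Deliberately slow recursive function so it shows up in a profile."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    active = enable_perf_trampoline()
    print("perf trampoline active:", active)
    fib(20)
```

You can also skip the code change entirely and run the interpreter with `python -X perf script.py` or set `PYTHONPERFSUPPORT=1`, then record with something like `perf record -F 9999 -g -o perf.data python -X perf script.py` and inspect it with `perf report -g -i perf.data`.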

If you're interested in profiling, you should check out Scalene. It's leaps ahead of every other profiling tool I've used in Python, and honestly it might be the best profiler I've used in any language. It gives you per-line results for memory, CPU, and GPU, tells you how much time is spent in C versus Python (NumPy calls, etc.), and it's faster than most other Python profilers.

https://github.com/plasma-umass/scalene
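
To get a feel for the per-line output, here is a small hypothetical workload (call it `demo.py`; the function names are made up for illustration). It uses only the standard library and contrasts a pure-Python loop with the same sum done inside the C-implemented builtin, plus one line that allocates a noticeable amount of memory.

```python
def python_loop(n: int) -> int:
    """Sum 0..n-1 in a pure-Python loop (time spent in the interpreter)."""
    total = 0
    for i in range(n):
        total += i
    return total


def native_sum(n: int) -> int:
    """Same result, but the loop runs inside the C-implemented builtin sum."""
    return sum(range(n))


def allocate(n: int) -> list:
    """Allocate a list of n zeros so a memory profile has something to show."""
    return [0] * n


if __name__ == "__main__":
    n = 1_000_000
    print(python_loop(n) == native_sum(n))
    print(len(allocate(n)))
```

Install with `pip install scalene` and run the script under it. The README has long documented an invocation along the lines of `scalene demo.py`, but the command line has changed between releases, so check `scalene --help` for your version.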