What does HackerNews think of py-spy?

Sampling profiler for Python programs

Language: Rust

#27 in Python
For CPU cycles, py-spy[0] is getting more and more widely used. For RAM, I would like to know of something similar too...

[0] -- https://github.com/benfred/py-spy

There's also py-spy, a profiling tool that can generate flame charts containing a mix of Python and C (or C++) calls.

https://github.com/benfred/py-spy

It's worked really well for my needs.

Similarly, py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spending time on without restarting the program or modifying the code in any way. py-spy is extremely low overhead: it is written in Rust for speed and doesn't run in the same process as the profiled Python program. This means py-spy is safe to use against production Python code.

I'm not sure if it exports results in a format Chrome can render, but it does produce great interactive SVGs and is compatible with speedscope.app.

https://github.com/benfred/py-spy

https://github.com/jlfwong/speedscope
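As a concrete illustration of the attach-to-a-running-process workflow described above, here is a minimal sketch. The script, PID and output filename are made up for illustration; the subcommands in the comments (record, top, dump) and the speedscope output format are the ones py-spy's README documents.

```python
# busy.py -- a made-up workload to profile; run it, note its PID, then attach py-spy.
import time

def slow_square(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def main():
    while True:
        slow_square(1_000_000)
        time.sleep(0.01)

if __name__ == "__main__":
    main()

# In another terminal (PID 12345 is illustrative):
#   py-spy dump   --pid 12345                  # print the current stack once
#   py-spy top    --pid 12345                  # live, top-like view of hot functions
#   py-spy record --pid 12345 -o profile.svg   # sample for a while, write a flame graph SVG
# Recording with --format speedscope produces the file you would load into speedscope.app.
```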

It is pretty cool. py-spy has also been doing this for a few years.

https://github.com/benfred/py-spy

Two other profilers that also let you spy on a running Python program:

- py-spy: https://github.com/benfred/py-spy (written in Rust)

- pyflame: https://github.com/uber-archive/pyflame (C++, seems to no longer be maintained)

The "no performance cost" thing is interesting: my experience writing a similar profiler is that there are a couple of things that can affect performance a little bit:

1. You have to make a lot of system calls to read the memory of the target process, and if you want to sample at a high rate then that does use some CPU. This can be an issue if you only have 1 CPU.

2. You have two choices when reading memory from a process: you can either race with the program and hope you read its memory and get the function stack before it changes what function it's running (and you're likely to win the race, because C is faster than Python), or you can pause the program briefly while taking a sample. py-spy has an option to choose which one you want to do: https://github.com/benfred/py-spy#how-can-i-avoid-pausing-th...

This method is definitely a lot lower overhead than a tracing profiler that instruments every single function call, and in practice it works well.

One thing I think is nice about this kind of profiler is that reading memory from the target process sounds like a complicated thing, but it's not: you can see austin's code for reading memory here, and it's implemented for 3 platforms in just 130 lines of C: https://github.com/P403n1x87/austin/blob/877e2ff946ea5313e47...
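To make the "reading memory from the target process" part concrete, here is a minimal, Linux-only sketch using /proc/<pid>/maps and /proc/<pid>/mem. It only illustrates the mechanism; real profilers like py-spy and austin use platform-specific APIs (process_vm_readv, vm_read, ReadProcessMemory) and additionally have to locate the interpreter's thread-state structures before the bytes mean anything. The PID is made up, and you need the same permissions ptrace would require.

```python
# Linux-only sketch: read a chunk of another process's memory via /proc/<pid>/mem.
# Requires ptrace-equivalent permissions (same user or CAP_SYS_PTRACE, and a
# permissive ptrace_scope setting).

def find_heap_range(pid):
    # Parse /proc/<pid>/maps to find one readable region; the heap is an easy target.
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            if line.rstrip().endswith("[heap]"):
                start, end = line.split()[0].split("-")
                return int(start, 16), int(end, 16)
    raise RuntimeError("no [heap] mapping found")

def read_process_memory(pid, address, size):
    with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem:
        mem.seek(address)
        return mem.read(size)

if __name__ == "__main__":
    pid = 12345                                   # illustrative PID
    start, end = find_heap_range(pid)
    chunk = read_process_memory(pid, start, min(4096, end - start))
    print(f"read {len(chunk)} bytes from {start:#x} of process {pid}")
```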

Very cool. py-spy[1] has been an invaluable tool in my development process since jvns blogged[2] about it. The power of being able to visualize where your code is spending its time is so obvious, and I'm glad people are building tools to make that easier.

As a quick compare-and-contrast between py-spy and pyinstrument: py-spy has the advantage of being able to attach to an already running process, which is super useful when your program is stuck and you don't know why. I haven't used pyinstrument yet, but I do like that it can render its flame graph in the console; sometimes I find saving an SVG file and opening up the browser a bit arduous. Excited to give it a try.

[1] https://github.com/benfred/py-spy

[2] https://jvns.ca/blog/2018/09/08/an-awesome-new-python-profil...
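A minimal sketch of the in-console output mentioned above: the Profiler/start/stop/output_text calls follow pyinstrument's documented API, while the workload function is made up for illustration.

```python
# Minimal pyinstrument sketch: profile a made-up workload and render the
# sampled call tree directly in the terminal, no SVG or browser step needed.
from pyinstrument import Profiler

def workload():
    return sum(i * i for i in range(2_000_000))

profiler = Profiler()
profiler.start()
workload()
profiler.stop()

print(profiler.output_text(unicode=True, color=True))
```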

To expand on inspecting stack traces: I found that I often wanted to know the content of the relevant local variables at each frame of a stack trace. Whenever I got a stack trace, I would often rerun the program in a debugger or add print statements on variables, until I extended the stack trace to just include this information. The tricky part is that printing all local variables is too much, and global variables or attributes (e.g. self.x) might also be involved. So now I parse all variables and attributes from the source code line in the traceback (in a very simple way) and print only those. This covers about 95% of all cases, i.e. in 95% of cases the stack trace contains all the information I need to understand and fix the problem.

I published that here:

https://pypi.org/project/better_exchook/

https://github.com/albertz/py_better_exchook
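A much-simplified sketch of the idea (not better_exchook's actual code): take each traceback frame's source line, extract the plain names and simple attribute accesses it references, and print only those values from the frame. Compound-statement headers such as "for x in xs:" won't parse on their own and are simply skipped here.

```python
# Simplified sketch of the "print only the variables referenced on the failing
# line" idea; better_exchook itself is more thorough.
import ast
import linecache
import sys
import traceback

def names_in_line(source_line):
    """Return names like 'x' and simple attribute chains like 'self.x'."""
    try:
        tree = ast.parse(source_line.strip())
    except SyntaxError:          # e.g. a bare "if x:" header does not parse
        return set()
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            names.add(f"{node.value.id}.{node.attr}")
    return names

def excepthook(exc_type, exc, tb):
    traceback.print_exception(exc_type, exc, tb)
    for frame, lineno in traceback.walk_tb(tb):
        line = linecache.getline(frame.f_code.co_filename, lineno)
        scope = {**frame.f_globals, **frame.f_locals}
        for name in sorted(names_in_line(line)):
            base, _, attr = name.partition(".")
            if base in scope:
                value = getattr(scope[base], attr, "<unknown>") if attr else scope[base]
                print(f"    {name} = {value!r}", file=sys.stderr)

sys.excepthook = excepthook
```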

I also often have a SIGUSR1 handler which will print the stack trace of all threads. This is useful for long-running, multi-threaded processes where you might run into strange hangs or deadlocks.
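A minimal sketch of such a handler, built on sys._current_frames() from the standard library; the choice of SIGUSR1 matches the comment above, everything else is illustrative.

```python
# Dump every thread's Python stack when the process receives SIGUSR1 (Unix-only);
# handy for diagnosing hangs and deadlocks in long-running, multi-threaded programs.
import signal
import sys
import threading
import traceback

def dump_all_thread_stacks(signum, frame):
    names = {t.ident: t.name for t in threading.enumerate()}
    for thread_id, stack in sys._current_frames().items():
        print(f"\n--- thread {names.get(thread_id, '?')} ({thread_id}) ---", file=sys.stderr)
        traceback.print_stack(stack, file=sys.stderr)

signal.signal(signal.SIGUSR1, dump_all_thread_stacks)

# Trigger from a shell with:  kill -USR1 <pid>
```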

In addition to that, if you might get hard crashes (segfaults and the like), something like faulthandler is useful.
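A minimal faulthandler setup from the standard library looks like this; SIGUSR2 is chosen here only to avoid clashing with the SIGUSR1 handler sketched above.

```python
import faulthandler
import signal

faulthandler.enable()                  # dump Python tracebacks on SIGSEGV, SIGFPE, SIGABRT, ...
faulthandler.register(signal.SIGUSR2)  # and on demand:  kill -USR2 <pid>  (Unix-only)
```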

If you also want to see the C stack trace in such cases, I load libSegFault.so, like here: https://github.com/rwth-i6/returnn/blob/5b8e34ec1fd725d0e20b...

Profiling is another topic. Something like py-spy (https://github.com/benfred/py-spy) can be very helpful.

I also found a remote background ZMQ IPython/Jupyter kernel to be useful sometimes. I published that here: https://github.com/albertz/background-zmq-ipython

Interesting article. While I definitely think you should be profiling your code to figure out the hot spots, cProfile has some limitations: it doesn't give you line numbers, doesn't work with threads, and significantly slows your program down.

I wrote a tool, py-spy (https://github.com/benfred/py-spy), that is worth checking out if you're interested in profiling Python programs. Not only does it solve those problems with cProfile; py-spy also lets you generate a flamegraph, profile running programs in production, work with multiprocess Python applications, profile native Python extensions, etc.
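For contrast, the cProfile baseline being described looks roughly like this; the workload is made up, and the py-spy invocation in the trailing comment is the README's launch form rather than the attach form shown earlier.

```python
# Deterministic tracing with cProfile: every call is instrumented, stats are
# per function (no line numbers), and the overhead is paid inside the process.
import cProfile
import pstats

def workload():
    return sum(i * i for i in range(2_000_000))

cProfile.run("workload()", "workload.prof")
pstats.Stats("workload.prof").sort_stats("cumulative").print_stats(10)

# Sampling the same program from outside instead (no code changes, low overhead):
#   py-spy record -o profile.svg -- python my_script.py
```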

Similar to pyflame, there is also py-spy: https://github.com/benfred/py-spy

> While pyflame is a great project, it doesn't support Python 3.7 yet and doesn't work on OSX, Windows or FreeBSD.

I wonder how the CPU profiling in Scalene is different; it does not mention pyflame or py-spy at all in its README. Of course, the memory profiler is a nice extra.

Looks like `pyflame` was recently deprecated & archived.

I've had success using `py-spy` for debugging perf issues. Flamegraphs are much nicer to work with than cProfile's output.

https://github.com/benfred/py-spy

Whoa, I hadn't seen that... sounds like py-spy but implemented in BPF. That's crazy and cool: https://github.com/benfred/py-spy

I wrote something that will get you the Python interpreter stack from any running CPython process (https://github.com/benfred/py-spy/), and rbspy can do the same for Ruby (https://github.com/rbspy/rbspy).

This gives Instagram ideas for how to improve the Python interpreter for their workload. If instead you're interested in profiling to look for ways to improve your Python code, check out https://github.com/uber/pyflame or https://github.com/benfred/py-spy