What does HackerNews think of async-profiler?
Sampling CPU and HEAP profiler for Java featuring AsyncGetCallTrace + perf_events
These days, async profiler (https://github.com/jvm-profiling-tools/async-profiler) is much better than the Go tooling for performance. It is a joy to use and features a top-like view for the hottest methods. It works for locks, allocations and CPU time. It also integrates with JMH.
1. An inotify library for detecting file changes. In this case, IntelliJ detected that it didn't have a binary for my architecture and just let me know it would be slower. Not a big deal.
2. The async profiler[0]. While available for other architectures, only x86_64 binaries are bundled with IntelliJ. And unfortunately, it didn't detect this either; it just quietly failed to work.
Hopefully with this patch they've also fixed these issues on linux-aarch64.
My main project so far this year has been creating actual performance metrics rather than guesstimates. (This is especially important with the JVM, which optimizes code at runtime for your actual payloads.) The best tool so far has been FlameGraphs [1]. I urge everyone to find an implementation for their specific language, as these things are actually interactive. A flame graph is not just a nice graphic; it tells you very directly where you're spending time. We've found countless minor bugs and wasted cycles.
The best Java integration I could find is Async-Profiler [2], which can, as the name implies, be attached to any running JVM. The config is pretty powerful and intuitive. It's one of those magical things that just work.
[1]: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
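For anyone wondering what a flame graph is built from: the usual input is a set of "folded" stacks, one line per unique call path with frames joined by semicolons (root first) and followed by a sample count. Below is a minimal sketch of that folding step in Java. The class name `FoldedStacks`, the `fold` helper, and the sample data are all made up for illustration; this is not how async-profiler itself is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FoldedStacks {
    /**
     * Collapses stack samples into "root;...;leaf" -> sample count.
     * Each sample lists frames innermost first, the same order
     * StackTraceElement[] uses.
     */
    public static Map<String, Integer> fold(List<List<String>> samples) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted for stable output
        for (List<String> sample : samples) {
            List<String> rootFirst = new ArrayList<>(sample);
            Collections.reverse(rootFirst); // folded format wants the root frame first
            counts.merge(String.join(";", rootFirst), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical samples: two caught in parse(), one in write().
        List<List<String>> samples = List.of(
            List.of("parse", "handle", "main"),
            List.of("parse", "handle", "main"),
            List.of("write", "handle", "main"));
        fold(samples).forEach((stack, n) -> System.out.println(stack + " " + n));
    }
}
```

Running `main` prints `main;handle;parse 2` and then `main;handle;write 1`. The width of each box in the rendered graph is proportional to these counts, which is why a wide box points straight at where the time goes.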
In particular, you should always aim to profile your app when running under a production load so that you do not have to make assumptions about its behaviour. Something like async-profiler[0] is good, since it avoids the safepoint bias issue and can also track heap allocations.
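To make the safepoint bias point concrete, here is a rough sketch of the naive approach that async-profiler avoids: a sampler built on the standard `Thread.getAllStackTraces()` call. As far as I understand HotSpot, stack traces obtained this way are captured only once threads have reached a safepoint, so the samples cluster around safepoint locations rather than the code that is actually hot. The class name `NaiveSampler`, the `sampleOnce` helper, and the 10 ms interval are arbitrary choices for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class NaiveSampler {
    /** Takes one sample of every live thread and counts the top frame of each. */
    public static void sampleOnce(Map<String, Integer> hot) {
        for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
            if (stack.length == 0) {
                continue; // no stack trace information available for this thread
            }
            String top = stack[0].getClassName() + "." + stack[0].getMethodName();
            hot.merge(top, 1, Integer::sum);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> hot = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            sampleOnce(hot);
            Thread.sleep(10); // arbitrary sampling interval
        }
        // Print the five most frequently seen top frames.
        hot.entrySet().stream()
            .sorted((a, b) -> b.getValue() - a.getValue())
            .limit(5)
            .forEach(e -> System.out.println(e.getValue() + " " + e.getKey()));
    }
}
```

A profiler based on AsyncGetCallTrace, as async-profiler's description says it is, does not have to wait for a safepoint before recording a stack, which is the reason it is recommended here.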
[1] https://en.wikipedia.org/wiki/Performance_Analyzer [2] https://github.com/jvm-profiling-tools/async-profiler
But there's also a newer way that can do Java stack sampling from perf without the frame pointer: https://github.com/jvm-profiling-tools/async-profiler
We're not running it yet. I want to try it out. Note that the stacks with async-profiler are a bit broken -- Java methods become detached from the JVM -- but I'm hoping that's fixable (it should be with a JVM change, at least).