I think it's worth noting briefly that almost everything discussed in TFA concerns bandwidth (network, disk, other I/O, CPU) and not latency. That's understandable, because performance for a lot of people is about bandwidth. But there are some of us for whom bandwidth definitely takes a back seat, and you need a different set of tools for tuning latency in Linux.
What are the tools and resources you would recommend with respect to latency tuning?
Not the person you asked, but generally you might want to look at "frame-based" profilers. These are typically used in video games, but the concept is general, and can apply to other applications. The "frame" could also be something like a request or transaction being processed. I like Tracy[1], myself.
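
To make the "frame" idea concrete, here's a rough sketch in Python. To be clear, this is not Tracy's API (Tracy is a C++ tool with its own instrumentation); `FrameTimer`, `frame()` and `zone()` are names I made up for illustration. Each "frame" here stands in for one request, and each named zone is a chunk of work inside it.

```python
import time
from contextlib import contextmanager


class FrameTimer:
    """Hypothetical helper: records wall-clock time per frame and per named zone."""

    def __init__(self):
        # One entry per finished frame: (total_ns, {zone_name: ns}).
        self.frames = []
        self._zones = None

    @contextmanager
    def frame(self):
        """Time one whole frame (e.g. one request or transaction)."""
        self._zones = {}
        start = time.perf_counter_ns()
        try:
            yield
        finally:
            total = time.perf_counter_ns() - start
            self.frames.append((total, self._zones))
            self._zones = None

    @contextmanager
    def zone(self, name):
        """Time a named section inside the current frame."""
        start = time.perf_counter_ns()
        try:
            yield
        finally:
            elapsed = time.perf_counter_ns() - start
            self._zones[name] = self._zones.get(name, 0) + elapsed


timer = FrameTimer()
for _request in range(3):  # stand-in for handling three requests
    with timer.frame():
        with timer.zone("parse"):
            sum(range(1_000))
        with timer.zone("handle"):
            sum(range(10_000))

# The slowest frame is usually the interesting one for latency work.
worst_total_ns, worst_zones = max(timer.frames, key=lambda f: f[0])
```

The point of keeping data per frame instead of one aggregate is that you can pull out the single worst frame and see which zone blew the budget, which an averaged profile tends to hide.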

Another latency metric that you'll see, often with respect to web apps and microservices, is "P99" and similar. This is the response time that 99% of requests come in at or under. The higher the percentile, the better an idea you get of worst-case performance.
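
For what it's worth, here's a minimal sketch of computing a percentile from raw latency samples, using the simple "nearest-rank" definition (monitoring systems often interpolate or use histograms instead, so their numbers may differ slightly). The `percentile` helper and the sample data are made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number ranks stay exact in floating point.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up request latencies in milliseconds, with one slow outlier.
latencies_ms = [12, 11, 13, 12, 14, 11, 12, 13, 250, 12]

p50 = percentile(latencies_ms, 50)   # 12
p99 = percentile(latencies_ms, 99)   # 250
mean = sum(latencies_ms) / len(latencies_ms)  # 36.0
```

Note how the median says 12 ms and the mean says 36 ms, while P99 surfaces the 250 ms request that someone actually experienced.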

[1] https://github.com/wolfpld/tracy