Hardware performance counters are super useful. On Linux, the perf tool (and the perf_event API underneath it) makes them usable: https://perf.wiki.kernel.org/index.php/Main_Page

The available counters vary between Intel CPU models, though the most useful ones (e.g. cycle counts) are universal. AMD has similar counters.
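
As a rough illustration of reading the universal counters, here is a minimal Python sketch that runs a command under `perf stat` and collects the counts. It assumes `perf` is installed and that you have permission to use it; the helper names `parse_perf_stat_csv` and `count_events` are made up for this example, and the parsing relies on the CSV layout produced by `perf stat -x,` (value, unit, event name, ...), which perf writes to stderr.

```python
import subprocess


def _to_number(text):
    """Return text as int or float, or None if it is not numeric."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return None  # e.g. "<not counted>" or "<not supported>"


def parse_perf_stat_csv(text):
    """Parse `perf stat -x,` output into {event_name: count or None}."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(",")
        if len(fields) < 3:
            continue  # not a counter line (e.g. output from the command itself)
        value, event = fields[0], fields[2]
        counts[event] = _to_number(value)
    return counts


def count_events(command, events=("cycles", "instructions")):
    """Run `command` under perf stat and return the parsed counter values.

    Assumption: `perf` is on PATH. The measured command's stdout is
    discarded; its stderr shares the pipe with perf's report, so lines with
    fewer than three comma-separated fields are skipped by the parser.
    """
    result = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", ",".join(events), "--", *command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        check=False,
    )
    return parse_perf_stat_csv(result.stderr)
```

Calling `count_events(["sleep", "0.1"])` should return a dict keyed by event name. The exact names can differ by setup (for example, a `:u` suffix when only user-space counting is permitted), so treat the keys as something to inspect rather than rely on.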

ocperf is a wrapper around perf provided by someone at Intel. On first run, it downloads a list of counters specific to the detected CPU, which is pretty cool: https://github.com/andikleen/pmu-tools