It turns out that on x86/AMD chips many of the perf counter events are offset by the (unpredictable) interrupt count, because the interrupt return instruction uop gets counted as both a user and a kernel instruction. On many processors the retired store instruction event avoids this issue.
especially as that builds off of the extensive work done on x86 counter determinism here: https://web.eece.maine.edu/~vweaver/projects/deterministic/
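To make the offset concrete, here is a rough sketch of the arithmetic in Python. It assumes exactly one extra count per interrupt and that the interrupt count can be read separately; both are assumptions for illustration, and the function name and the numbers are made up, not measurements.

```python
def corrected_count(raw_count: int, interrupts: int, overcount_per_interrupt: int = 1) -> int:
    """Remove the assumed per-interrupt overcount from a raw counter reading."""
    if raw_count < 0 or interrupts < 0 or overcount_per_interrupt < 0:
        raise ValueError("counts must be non-negative")
    return raw_count - interrupts * overcount_per_interrupt


# Two hypothetical runs of the same deterministic program: (raw event count, interrupts seen).
# The raw counts differ only because the number of interrupts differed between runs.
runs = [(1_000_012, 12), (1_000_037, 37)]

corrected = [corrected_count(raw, irqs) for raw, irqs in runs]
print(corrected)
```

Under those assumptions both runs collapse to the same value once the interrupt count is subtracted. An event that is not inflated by interrupts at all needs no such correction.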