Ok, that sounds like a good approach to find if it's done by some
kind of emulation or not. Of course, any machine with SMM (even if it
doesn't emulate the PIT per se - maybe it just gets some event related to
overheating or other 'maintenance' stuff) can have occasional hiccups, but
the '120msec' thing is, I think, the real clincher.
Why? Because we only try to wait for 50ms in the first place! Even if
emulation is 100% exact (or even none at all, and the PIT accesses are in
hardware), if we have a 120ms hiccup while waiting for 50ms, then the end
result will obviously be total crap, and yes, that sure explains how you
can get >100% wrong values.
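To put rough numbers on that, here is a minimal sketch. It assumes the stall lands right at the end of the wait, so the loop overshoots by the full stall while still believing only the requested time passed. `reported_khz` is a hypothetical helper for illustration, not kernel code.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel): the TSC frequency in kHz
 * that a calibration loop would report if it believes wait_ms elapsed,
 * but the CPU was stalled for an extra stall_ms before it noticed the
 * wait had ended.
 */
static uint64_t reported_khz(uint64_t true_khz, uint64_t wait_ms,
                             uint64_t stall_ms)
{
    /* kHz * ms = ticks actually counted by the TSC */
    uint64_t ticks = true_khz * (wait_ms + stall_ms);

    /* the loop divides by the time it *thinks* has passed */
    return ticks / wait_ms;
}
```

With a true 2 GHz TSC, a 50ms wait and a 120ms stall, this gives 2000000 * 170 / 50 = 6800000 kHz, i.e. 3.4 times the real value, or 240% too high.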
I think the most trivial approach would be to
 - just keep track of the max TSC difference for each loop iteration.
 - if the max TSC difference is bigger than 1% of the total TSC delta,
   then something is already seriously wrong (either we had very few
   loops indeed, or some of them were very expensive)
 - perhaps loop over the calibration, and make the TSC calibration loop
   increase the delay. Because even if there is a 120ms hiccup, if we had
   used a longer calibration delay, we'd probably not have noticed (well,
   ok, 120ms is pretty damning and is probably just unfixable, but smaller
   hiccups are probably harmless)
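The first two points above could be sketched roughly as follows. This is only an illustration under stated assumptions: the TSC values read on each loop iteration are assumed to be already collected in an array, and `struct calib_stats`, `calib_scan` and `calib_suspect` are hypothetical names, not existing kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one calibration run. */
struct calib_stats {
    uint64_t total;     /* TSC ticks from first to last sample */
    uint64_t max_delta; /* largest gap between consecutive samples */
};

/* Walk the per-iteration TSC samples and record the largest single gap. */
static struct calib_stats calib_scan(const uint64_t *tsc, size_t n)
{
    struct calib_stats s = { 0, 0 };
    size_t i;

    if (n < 2)
        return s;

    for (i = 1; i < n; i++) {
        uint64_t d = tsc[i] - tsc[i - 1];

        if (d > s.max_delta)
            s.max_delta = d;
    }
    s.total = tsc[n - 1] - tsc[0];
    return s;
}

/*
 * Suspect if one iteration took more than 1% of the whole run:
 * either there were very few iterations, or one was very expensive.
 * An empty run (total == 0) is treated as suspect too.
 */
static bool calib_suspect(const struct calib_stats *s)
{
    return s->total == 0 || s->max_delta > s->total / 100;
}
```

A caller could redo the calibration with a longer delay whenever `calib_suspect()` returns true, and give up after a few attempts.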
Additionally doing a min/max comparison to see that the loop is very
_stable_ is of course also a way to validate things, but expecting _too_
much stability may be wrong too. As mentioned, SMM events can happen for
other reasons than emulation.
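A min/max check along those lines might look like the sketch below. The function name `deltas_stable` and the `max_ratio` tolerance are assumptions made for illustration. The tolerance is left as a parameter precisely because demanding too much stability would reject runs that only saw a harmless SMM event.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: treat the loop as stable if the slowest iteration
 * took at most max_ratio times as long as the fastest one.
 * Assumes per-iteration deltas are small enough that min * max_ratio
 * does not overflow 64 bits.
 */
static bool deltas_stable(const uint64_t *delta, size_t n,
                          uint64_t max_ratio)
{
    uint64_t min, max;
    size_t i;

    if (n == 0)
        return false;

    min = max = delta[0];
    for (i = 1; i < n; i++) {
        if (delta[i] < min)
            min = delta[i];
        if (delta[i] > max)
            max = delta[i];
    }
    return max <= min * max_ratio;
}
```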
Linus