OK. I had used "monotonic" in its more general sense earlier in the
thread, and I wanted to be sure.
Sure. But on a given machine, the CPUs are likely to be closely enough
matched that a cycle on one CPU is more or less equivalent to a cycle on
another CPU. The fact that a cycle represents a different amount of
work on an i486 compared to Core2 doesn't matter much. The important
part is that when the scheduler is doling out CPU time, it is comparing
everyone's usage with a common unit.
Yes, but the question is whether it matters all that much. Does it
matter enough to make them two separate concepts, when one seems to
cover all the important points?
Not at all. You might have an unimportant but cpu-bound process which
doesn't merit increasing the cpu speed, but should also be scheduled
properly compared to other processes. I often nice my kernel builds
(which cpufreq takes as a hint to not ramp up the cpu speed) on my
laptop to save power.
It doesn't matter. The scheduler is only important when there's
contention for the cpu, and when there is, what matters is that it
compares process CPU usage using the same unit. What that unit is isn't
inherently very important, so long as it's consistent.
That's true. But this is a case of the left brain not talking to the
right brain: cpufreq might decide to slow a cpu down, but the scheduler
doesn't take that into account. Making the timebase of sched_clock
reflect the current cpu speed (or more specifically, the integral of the
cpu speed over a time interval) is a good way of communicating between
the two subsystems.
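To make "the integral of the cpu speed over a time interval" concrete,
here is a rough userspace sketch of such a timebase. It is not kernel
code: struct scaled_clock and the scaled_clock_*() functions are names
made up for illustration, and it assumes something tells us about every
frequency change. Time is accumulated in "full-speed-equivalent"
nanoseconds, so 2us spent at half speed counts as 1us.

```c
#include <stdint.h>

/* Hypothetical per-cpu state; none of these names are real kernel APIs. */
struct scaled_clock {
	uint64_t last_ns;    /* wall time of the last update, in ns */
	uint32_t cur_khz;    /* frequency the cpu has run at since last_ns */
	uint32_t max_khz;    /* full-speed frequency used as the reference */
	uint64_t scaled_ns;  /* accumulated full-speed-equivalent ns */
};

void scaled_clock_init(struct scaled_clock *c, uint64_t now_ns,
		       uint32_t max_khz)
{
	c->last_ns = now_ns;
	c->cur_khz = max_khz;
	c->max_khz = max_khz;
	c->scaled_ns = 0;
}

/*
 * Fold in the interval since the last update, weighted by the frequency
 * that was in effect during it.  Assumes updates are frequent enough
 * that delta * cur_khz does not overflow 64 bits.
 */
void scaled_clock_update(struct scaled_clock *c, uint64_t now_ns)
{
	uint64_t delta = now_ns - c->last_ns;

	c->scaled_ns += delta * c->cur_khz / c->max_khz;
	c->last_ns = now_ns;
}

/* On a frequency change: close the old interval, then switch speed. */
void scaled_clock_set_freq(struct scaled_clock *c, uint64_t now_ns,
			   uint32_t khz)
{
	scaled_clock_update(c, now_ns);
	c->cur_khz = khz;
}

/* What a speed-aware sched_clock() could return in this model. */
uint64_t scaled_clock_read(struct scaled_clock *c, uint64_t now_ns)
{
	scaled_clock_update(c, now_ns);
	return c->scaled_ns;
}
```

With this, a process that ran for a given wall-clock interval on a
slowed-down cpu is charged proportionally less, without the scheduler
needing to know anything about cpufreq itself.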
As things stand now, there's not much difference from the scheduler's
perspective, since the scheduler takes no action in either case.
So, this is the target process timeslice, in units of sched_clock's
timebase?
I added stolen time accounting to xen-pv_ops last night. For Xen, at
least, it wasn't hard to fit into the clockevent infrastructure. I just
update the stolen time accounting for each cpu when it gets a timer
tick; they seem to get a tick every couple of seconds even when idle.
Similarly, implementing sched_clock as "number of ns the vcpu spent in
running state" is simple and direct (though this makes it an explicitly
per-cpu clock: comparing raw sched_clock values between cpus will be
meaningless, but that's likely true when using the tsc as a timebase too).
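Roughly, the accounting looks like the sketch below. This is a
simplified model rather than the actual xen-pv_ops code: the enum,
struct vcpu_times, struct steal_acct and both functions are
illustrative names, and it assumes the hypervisor exports cumulative
per-state times for each vcpu. Treating runnable-but-not-running and
offline time as "stolen" is also an assumption of the sketch.

```c
#include <stdint.h>

/* Hypothetical vcpu states, loosely modelled on a hypervisor's runstates. */
enum vcpu_state {
	VCPU_RUNNING,   /* actually executing on a physical cpu */
	VCPU_RUNNABLE,  /* wants to run, but is not being run */
	VCPU_BLOCKED,   /* idle by its own choice */
	VCPU_OFFLINE,   /* taken offline */
	VCPU_NSTATES
};

/* Cumulative ns spent in each state, as sampled from the hypervisor. */
struct vcpu_times {
	uint64_t ns[VCPU_NSTATES];
};

/* Per-cpu accounting state, updated from the timer tick. */
struct steal_acct {
	struct vcpu_times prev;  /* snapshot taken at the previous tick */
	uint64_t stolen_ns;      /* total stolen time accounted so far */
};

/* sched_clock as "number of ns the vcpu spent in running state". */
uint64_t vcpu_sched_clock(const struct vcpu_times *now)
{
	return now->ns[VCPU_RUNNING];
}

/* On each timer tick, account whatever was stolen since the last one. */
void steal_acct_tick(struct steal_acct *a, const struct vcpu_times *now)
{
	uint64_t runnable = now->ns[VCPU_RUNNABLE] - a->prev.ns[VCPU_RUNNABLE];
	uint64_t offline = now->ns[VCPU_OFFLINE] - a->prev.ns[VCPU_OFFLINE];

	a->stolen_ns += runnable + offline;
	a->prev = *now;
}
```

Because the clock only advances while the vcpu is running, stolen time
never shows up as time charged to whichever process happened to be
current when the vcpu was descheduled.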
It doesn't matter why you didn't get the time; the important part is
that you know that time went missing. It's true that you may end up with
some spurious rescheds, but that seems like the kind of thing you'd want
to measure as being a problem before getting clever in fixing it.
I'd call that an oversight. Xen has everything needed to implement
sched_clock in terms of non-stolen time.
Not necessarily. The cpu might drop into thermal protection clock
modulation.
J