Published on KernelTrap (http://kerneltrap.org)

Linux: Debating Hertz

By Jeremy
Created Jul 14 2005 - 09:19

The decision to change the default timer frequency (HZ) in the 2.6 Linux kernel from 1000 to 250 [story [1]] continued to be debated until Linux creator Linus Torvalds finally had enough: "get on with your lives. Realize that there is no 'perfect' value for HZ. 250 right now is somewhere reasonable, and for the extreme ends you can always choose your own." He also dismissed concern over picking an ideal number that minimizes long-term time drift: "long-term time drift is something that we inevitably have to use things like NTP to handle, if you want an exact clock." Instead, he highlighted short-term effects, such as the error introduced when converting timevals into jiffies: "in short-term things, the timeval/jiffie conversion is likely to be a _bigger_ issue than the crystal frequency conversion. So we should aim for a HZ value that makes it easy to convert to and from the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all good values for that reason. 864 is not."

Looking forward, he proposed a better solution, "btw, if somebody really gets excited about all this, let me say (once again) what I think might be an acceptable situation." He noted, "I don't think we want a _highfrequency_ timer, I want a _lower_ frequency mode." He described raising the default hertz all the way up to 2000, "or something that we decide is the upper limit of sanity. We then have some timer logic entity that notices that nothing is going to care for the next 100 ticks, so we go into 'slow mode', and reprogram the timer to tick at a frequency of 100Hz, but when it does tick, we just count it as 20." He went on to explain that this hides the 'variable frequency' from everything else, "the timer tick is 2kHz as far as everybody is concerned. It's just that the ticks sometimes come in 'bunches of 20'."



From: David Lang [email blocked]
To: Bill Davidsen [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date:	Wed, 13 Jul 2005 10:24:10 -0700 (PDT)

On Wed, 13 Jul 2005, Bill Davidsen wrote:

> How serious is the 1/HZ = sane problem, and more to the point how many 
> programs get the HZ value with a system call as opposed to including a header 
> or building it in? I know some of my older programs use header files, that 
> was part of the planning for the future even before 2.5 started. At the time 
> I didn't expect to have to use the system call.

in binary 1/100 or 1/1000 are not sane values to start with so I don't 
think that that this is likly to be that critical (remembering that the 
kernel doesn't do floating point math)

David Lang

-- 
There are two ways of constructing a software design. One way is to make it
 so simple that there are obviously no deficiencies. And the other way is to
 make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


From: Vojtech Pavlik [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 20:42:27 +0200

On Wed, Jul 13, 2005 at 10:24:10AM -0700, David Lang wrote:

> >How serious is the 1/HZ = sane problem, and more to the point how many
> >programs get the HZ value with a system call as opposed to including a
> >header or building it in? I know some of my older programs use header
> >files, that was part of the planning for the future even before 2.5
> >started. At the time I didn't expect to have to use the system call.
>
> in binary 1/100 or 1/1000 are not sane values to start with so I don't
> think that that this is likly to be that critical (remembering that the
> kernel doesn't do floating point math)

No, but 1/1000Hz = 1000000ns, while 1/864Hz = 1157407.407ns. If you have
a counter that counts the ticks in nanoseconds (xtime ...), the first
will be exact, the second will be accumulating an error.

It's a tradeoff.

-- 
Vojtech Pavlik
SuSE Labs, SuSE CR

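Vojtech's arithmetic can be checked with a short Python sketch. It assumes, purely for illustration, a counter that adds a whole number of nanoseconds per tick (the tick period truncated to an integer); this is not a claim about how the kernel's xtime code is actually written, and the function name is made up for this example.

```python
NSEC_PER_SEC = 1_000_000_000

def drift_ns_per_second(hz):
    """Nanoseconds lost each second if every tick adds floor(1e9 / hz) ns.

    Illustrative assumption: the per-tick increment is an integer number
    of nanoseconds, truncated from the exact tick period.
    """
    stored_period_ns = NSEC_PER_SEC // hz
    return NSEC_PER_SEC - stored_period_ns * hz

for hz in (100, 250, 864, 1000):
    lost = drift_ns_per_second(hz)
    print(f"HZ={hz:4d}  tick={NSEC_PER_SEC // hz} ns  "
          f"lost per second={lost} ns  lost per day={lost * 86400 / 1e6:.1f} ms")
```

Under that assumption, 100, 250 and 1000 divide a second evenly and lose nothing, while 864 leaves a fractional nanosecond behind on every tick.
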
From: Linus Torvalds [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 12:10:48 -0700 (PDT)

On Wed, 13 Jul 2005, Vojtech Pavlik wrote:
>
> No, but 1/1000Hz = 1000000ns, while 1/864Hz = 1157407.407ns. If you have
> a counter that counts the ticks in nanoseconds (xtime ...), the first
> will be exact, the second will be accumulating an error.

It's not even that we have a counter like that, it's the simple fact that
we have a standard interface to user space that is based on milli-, micro-
and nanoseconds. (For "poll()", "struct timeval" and "struct timespec"
respectively).

It's totally pointless saying that we can do 864 Hz "exactly", when the
fact is that all the timeouts we ever get from user space aren't in that
format. So the only thing that matters is how close to a millisecond we
can get, not how close to some random number.

So we do a lot of conversions from "struct timeval" to "jiffies", and if
you don't take the error in that conversion into account, then you're
ignoring what is likely a _bigger_ error.

Long-term time drift is a known issue, and is unavoidable since you don't
even know the exact frequency of the crystal, since that is not only not
that exact in the first place, it depends on temperature etc. So long-term
time drift is something that we inevitably have to use things like NTP to
handle, if you want an exact clock.

And in short-term things, the timeval/jiffie conversion is likely to be a
_bigger_ issue than the crystal frequency conversion.

So we should aim for a HZ value that makes it easy to convert to and from
the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
good values for that reason. 864 is not.

Linus

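Linus's point about user-space formats can be illustrated with a minimal sketch. The helper names below are hypothetical and are not the kernel's conversion routines; the sketch assumes a timeout given in milliseconds is rounded up to whole jiffies so that a sleep never ends early.

```python
from fractions import Fraction

def jiffy_length_usecs(hz):
    """Exact length of one jiffy in microseconds."""
    return Fraction(1_000_000, hz)

def timeout_to_jiffies(timeout_ms, hz):
    """Round a millisecond timeout up to whole jiffies (ceiling division)."""
    return -(-timeout_ms * hz // 1000)

def actual_wait_ms(timeout_ms, hz):
    """Time actually covered by the rounded-up jiffy count, in milliseconds."""
    return Fraction(timeout_to_jiffies(timeout_ms, hz) * 1000, hz)

for hz in (100, 250, 864, 1000):
    whole = jiffy_length_usecs(hz).denominator == 1
    print(f"HZ={hz:4d}  jiffy={float(jiffy_length_usecs(hz)):9.3f} us  "
          f"whole microseconds={whole}  "
          f"10 ms timeout -> {timeout_to_jiffies(10, hz)} jiffies "
          f"= {float(actual_wait_ms(10, hz)):.3f} ms")
```

At 100, 250 and 1000 Hz a jiffy is a whole number of microseconds (10000, 4000 and 1000), so the conversion is a plain integer multiply or divide. At 864 Hz a jiffy is 31250/27 microseconds, and every conversion carries a fractional remainder.
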
From: Lee Revell [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 15:13:44 -0400

On Wed, 2005-07-13 at 12:10 -0700, Linus Torvalds wrote:
> So we should aim for a HZ value that makes it easy to convert to and from
> the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
> good values for that reason. 864 is not.

How about 500? This might be good enough to solve the MIDI problem.

Lee

From: Dmitry Torokhov [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 14:32:02 -0500

Hi,

On 7/13/05, Lee Revell wrote:
> On Wed, 2005-07-13 at 12:10 -0700, Linus Torvalds wrote:
> > So we should aim for a HZ value that makes it easy to convert to and from
> > the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
> > good values for that reason. 864 is not.
>
> How about 500? This might be good enough to solve the MIDI problem.
>

I would expect number of laptop users significatly outnumber ones driving
MIDI so as a default entry 250 makes more sense IMHO.

-- 
Dmitry

From: Lee Revell [email blocked]
Subject: Re: Linux v2.6.13-rc3
Date: Wed, 13 Jul 2005 13:31:33 -0400

On Tue, 2005-07-12 at 22:05 -0700, Linus Torvalds wrote:
> I think the shortlog speaks for itself.

HZ still defaults to 250. As was explained in another thread, this will
break apps like MIDI sequencers and won't really save much battery
power. The default should remain 1000 until these issues are resolved.

Or am I wasting my time with this?

Lee

From: Linus Torvalds [email blocked]
Subject: Re: Linux v2.6.13-rc3
Date: Wed, 13 Jul 2005 10:51:53 -0700 (PDT)

On Wed, 13 Jul 2005, Lee Revell wrote:
> On Tue, 2005-07-12 at 22:05 -0700, Linus Torvalds wrote:
> > I think the shortlog speaks for itself.
>
> HZ still defaults to 250. As was explained in another thread, this will
> break apps like MIDI sequencers and won't really save much battery
> power.

Stop bothering with this, I've seen the thread, and no, I disagree totally
with "as explained in another thread". That's simply not true.

The only thing that is true is that 100Hz is too low for some use, and
1000Hz is too high for some uses. NOBODY has shown that 250Hz isn't good
enough, there's only been people whining and complaining and saying it
might not be.

The fact is, engineering is about finding something that works "well
enough". If _you_ think that 1000Hz is the right answer, then _you_ select
that. But if you cannot accept the fact that other people are of a
different opinion, then why would anybody want to discuss the issue with
you?

This is a fundamental fact of engineering (and, in fact, pretty much any
other area in life): If you cannot accept that other people have other
aims and needs than than you, then why are you talking to other people in
the first place?

So get on with your lives. Realize that there is no "perfect" value for
HZ. 250 right now is somewhere reasonable, and for the extreme ends you
can always chose your own. Don't try to force your ideas on others.

And btw, the next time somebody complains about HZ, I want HARD DATA. I
don't want whining. Stop cc'ing me in you don't have a real datapoint, and
if you cannot accept that other people have _other_ real datapoints.

Linus

From: Lee Revell [email blocked]
Subject: Re: Linux v2.6.13-rc3
Date: Wed, 13 Jul 2005 14:05:48 -0400

On Wed, 2005-07-13 at 10:51 -0700, Linus Torvalds wrote:
>
> On Wed, 13 Jul 2005, Lee Revell wrote:
>
> > On Tue, 2005-07-12 at 22:05 -0700, Linus Torvalds wrote:
> > > I think the shortlog speaks for itself.
> >
> > HZ still defaults to 250. As was explained in another thread, this will
> > break apps like MIDI sequencers and won't really save much battery
> > power.
>
> Stop bothering with this, I've seen the thread, and no, I disagree totally
> with "as explained in another thread". That's simply not true.
>
> The only thing that is true is that 100Hz is too low for some use, and
> 1000Hz is too high for some uses. NOBODY has shown that 250Hz isn't good
> enough, there's only been people whining and complaining and saying it
> might not be.

OK, point taken, I'm done with this issue as far as LKML is concerned.

Anyone who wants to discuss this further can come over to the
linux-audio-dev list.

Lee

From: Linus Torvalds [email blocked]
To: Lee Revell [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 18:54:11 -0700 (PDT)

On Wed, 13 Jul 2005, Lee Revell wrote:
>
> Interesting. First they say it's impractical to reprogram the PIT, then
> they later imply that's exactly what Windows does, though for some
> reason they don't come out and say it.

I suspect that it is impractical to reprogram the PIT on a very fine
granularity.

Btw, if somebody really gets excited about all this, let me say (once
again) what I think might be an acceptable situation.

First off, I'm _not_ a believer in "sub-HZ ticks". Quite the reverse. I
think we should have HZ be some high value, but we would _slow_down_ the
tick when not needed, and count by 2's, 3's or even 10's when there's not
a lot going on.

In other words, I don't think we want a _highfrequency_ timer, I want a
_lower_ frequency mode.

So let's say that we raise HZ to 2000, or somethign that we decide is the
upper limit of sanity. We then have some timer logic entity that notices
that nothing is going to care for the next 100 ticks, so we go into "slow
mode", and reprogram the timer to tick at a frequency of 100Hz, but when
it does tick, we just count it as 20.

IOW, nothing ever sees any "variable frequency", and there's never any
question about what the timer tick is: the timer tick is 2kHz as far as
everybody is concerned. It's just that the ticks sometimes come in
"bunches of 20".

This also means that there is never any issue of the timer running wild.
The _most_ it will ever run at is limited quite naturally, and some crazy
user asking for a 1ns itimer won't make any difference at all to the
system.

Linus

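The "bunches of 20" idea can be sketched as a small simulation. This is only a toy model of the scheme Linus describes, not kernel code: the constants come from his message (HZ of 2000, a 100 Hz slow mode, a 100-tick look-ahead), while the function name, the timer list, and the decision rule are assumptions made for illustration.

```python
import math

HZ = 2000                            # nominal tick rate everyone else sees
SLOW_HZ = 100                        # hardware interrupt rate in slow mode
TICKS_PER_SLOW_IRQ = HZ // SLOW_HZ   # each slow interrupt counts as 20 ticks
IDLE_THRESHOLD = 100                 # "nothing is going to care for the next 100 ticks"

def simulate(duration_ticks, timer_expiries):
    """Return (final jiffies, interrupts taken, jiffies values at which timers fired)."""
    pending = sorted(timer_expiries)
    jiffies = 0
    interrupts = 0
    fired_at = []
    while jiffies < duration_ticks:
        next_event = pending[0] if pending else math.inf
        if next_event - jiffies >= IDLE_THRESHOLD:
            jiffies += TICKS_PER_SLOW_IRQ   # slow mode: one interrupt, 20 ticks
        else:
            jiffies += 1                    # normal mode: one interrupt, 1 tick
        interrupts += 1
        # Expire every timer whose deadline has been reached.
        while pending and pending[0] <= jiffies:
            pending.pop(0)
            fired_at.append(jiffies)
    return jiffies, interrupts, fired_at

print(simulate(2000, []))       # one fully idle second
print(simulate(2000, [1000]))   # one timer due half-way through
```

In this model an idle second costs 100 interrupts instead of 2000, yet the tick counter still advances by 2000. Because the slow step (20) is smaller than the look-ahead (100), a pending timer is never overshot: the simulation drops back to single ticks before the deadline.
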
From: Ingo Molnar [2] [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Thu, 14 Jul 2005 10:38:43 +0200

* Linus Torvalds [email blocked] wrote:

> I suspect that it is impractical to reprogram the PIT on a very fine
> granularity.

yes - reprogramming the PIT can take up to 10 usecs even on recent PCs.
(in fact the cost is pretty much system-independent due to PIO.) On
modern, PIT-less timesources (e.g. HPET) it can be faster.

> Btw, if somebody really gets excited about all this, let me say (once
> again) what I think might be an acceptable situation.
>
> First off, I'm _not_ a believer in "sub-HZ ticks". Quite the reverse.
> I think we should have HZ be some high value, but we would _slow_down_
> the tick when not needed, and count by 2's, 3's or even 10's when
> there's not a lot going on.

i think that would be an acceptable solution for high-precision timers,
as long as two other problems are solved too:

 - there are real-time applications (robotic environments: fast rotating
   tools, media and mobile/phone applications, etc.) that want 10 usecs
   precision. If such users increased HZ to 100,000 or even 1000,000,
   the current timer implementation would start to creek: e.g. jiffies
   on 32-bit systems would wrap around in 11 hours or 1.1 hours. (To
   solve this cleanly, pretty much the only solution seems to be to
   increase the timeout to a 64 bit value. A non-issue for 64-bit
   systems, that's why i think we could eventually look at this
   possibility, once all the other problems are hashed out.)

 - at very high HZ values the clustering of e.g. network timers is lost,
   creating an artificially high number of timer interrupts. So likely
   we'd still need some way to 'blur' timeouts and to round e.g. network
   timers to the next 1 msec or 500 usecs boundary, to cluster up timers
   for bulk processing.

But in any case, such a solution does not sound nearly as messy as the
sub-jiffies method.

if the 'high precision' uses are not addressed [*] i fear the whole HRT
game starts again: embedded folks trying to standardize on Linux for
everything [**] will want HRT timers and will do addons and sub-jiffy
approaches [***], and will push for inclusion. I think we could as well
solve this whole problem area by making ridiculously high HZ values
practical too!

Ingo

[*] there's also a third problem: timer prioritization. It's not
    necessarily a problem the upstream kernel should care about, but
    it's a problem for things that try to offer hard-real-time, like
    PREEMPT_RT: HRT timers need to be prioritizable. If e.g. the system
    is soaked handling network timers, it should still be possible for
    that single mega-important HRT timer to run and wake up the
    mega-high-priority RT task that will preempt all network activity
    within 10 usecs worst-case. The sub-jiffies approach does this
    prioritization in a natural way, because there HRT timers are
    separate, so the prioritization of them is easy. With the grand
    unified 'big HZ' scheme the HRT folks would have to implement a
    mechanism to split off highprio timers from the stream of normal
    timers. ]

[**] having high precision is also a perception and uniformity of
     platform issue: most embedded developers will find 500 usecs
     precision good enough for most uses, but it does not 'sound' good
     enough, and there's no easy way out either. So _if_ there's the
     occasional need for higher precision they'll have no easy solution
     for Linux, and this prevents them from standardizing on Linux for
     _everything_.

[***] a side-thought about sub-jiffies: the biggest conceptual problem
      the sub-jiffy method has is the sorting needed when timers move
      from the jiffy bucket into the HRT list which doesnt scale - but
      this is really a HRT-timers-internal problem. It could be further
      improved by e.g. dividing the last jiffy up into say 10 usec
      buckets - with a 1msec jiffy clock that's 100 buckets, and having
      a bitmap to see which bucket is active. Then the 'get the next
      timer' act becomes a matter of searching the bitmap for the next
      bit set - at pretty much constant overhead. Even a 1 usec
      precision would mean only 1000 buckets for the last jiffy, with
      1000 bits (128 bytes) to search, still quite ok. But, in any case,
      it would be nice to avoid the "conceptual dualness" of the HRT
      patch.
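
Two of the figures in Ingo's message can be worked through in a short sketch: how quickly a 32-bit jiffies counter wraps at very high HZ, and how a bitmap lets "get the next timer" avoid sorting. Both functions are illustrative only. Their names are invented here, and the bitmap is modelled as a Python integer rather than anything resembling the HRT patch's data structures.

```python
from typing import Optional

def wrap_hours(hz, counter_bits=32):
    """Hours until a counter of the given width wraps when incremented hz times a second."""
    return (2 ** counter_bits) / hz / 3600

def next_active_bucket(bitmap, start) -> Optional[int]:
    """Index of the lowest set bit at position >= start, or None if there is none.

    Bit i set means bucket i (for example, the i-th 10 usec slice of the
    current jiffy) holds at least one pending timer.
    """
    remaining = bitmap >> start          # discard buckets already passed
    if remaining == 0:
        return None
    lowest = remaining & -remaining      # isolate the lowest set bit
    return start + lowest.bit_length() - 1

print(f"HZ=100000:  wraps after {wrap_hours(100_000):.2f} hours")
print(f"HZ=1000000: wraps after {wrap_hours(1_000_000):.2f} hours")

# 100 buckets of 10 usec each for a 1 msec jiffy; timers pending in buckets 3 and 42.
bitmap = (1 << 3) | (1 << 42)
print(next_active_bucket(bitmap, 0), next_active_bucket(bitmap, 4),
      next_active_bucket(bitmap, 43))
```

The wrap times come out just under 12 hours and 1.2 hours, in line with the rough figures quoted in the message. For the 1 usec case, 1000 bits is 125 bytes; the quoted 128 bytes would fit a bitmap padded to 1024 bits, which is a guess about the intent.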



Source URL:
http://kerneltrap.org/node/5430