The decision to change the default HZ in the 2.6 Linux kernel from 1000 to 250 [story] continued to be discussed until Linux creator Linus Torvalds finally had enough: "get on with your lives. Realize that there is no 'perfect' value for HZ. 250 right now is somewhere reasonable, and for the extreme ends you can always choose your own." He also dismissed concern over picking an ideal number that minimizes long-term time drift: "long-term time drift is something that we inevitably have to use things like NTP to handle, if you want an exact clock." Instead, he highlighted conversions that matter in the short term, such as converting timevals into jiffies: "in short-term things, the timeval/jiffie conversion is likely to be a _bigger_ issue than the crystal frequency conversion. So we should aim for a HZ value that makes it easy to convert to and from the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all good values for that reason. 864 is not."
Looking ahead, he described an approach he might accept: "btw, if somebody really gets excited about all this, let me say (once again) what I think might be an acceptable situation." He noted, "I don't think we want a _highfrequency_ timer, I want a _lower_ frequency mode." He described raising HZ all the way up to 2000, "or something that we decide is the upper limit of sanity. We then have some timer logic entity that notices that nothing is going to care for the next 100 ticks, so we go into 'slow mode', and reprogram the timer to tick at a frequency of 100Hz, but when it does tick, we just count it as 20." He went on to explain that this hides the 'variable frequency' from everything else: "the timer tick is 2kHz as far as everybody is concerned. It's just that the ticks sometimes come in 'bunches of 20'."
From: David Lang [email blocked]
To: Bill Davidsen [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 10:24:10 -0700 (PDT)

On Wed, 13 Jul 2005, Bill Davidsen wrote:

> How serious is the 1/HZ = sane problem, and more to the point how many
> programs get the HZ value with a system call as opposed to including a header
> or building it in? I know some of my older programs use header files, that
> was part of the planning for the future even before 2.5 started. At the time
> I didn't expect to have to use the system call.

in binary 1/100 or 1/1000 are not sane values to start with so I don't
think that that this is likly to be that critical (remembering that the
kernel doesn't do floating point math)

David Lang

--
There are two ways of constructing a software design. One way is to make it
so simple that there are obviously no deficiencies. And the other way is to
make it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare
From: Vojtech Pavlik [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 20:42:27 +0200

On Wed, Jul 13, 2005 at 10:24:10AM -0700, David Lang wrote:

> >How serious is the 1/HZ = sane problem, and more to the point how many
> >programs get the HZ value with a system call as opposed to including a
> >header or building it in? I know some of my older programs use header
> >files, that was part of the planning for the future even before 2.5
> >started. At the time I didn't expect to have to use the system call.
>
> in binary 1/100 or 1/1000 are not sane values to start with so I don't
> think that that this is likly to be that critical (remembering that the
> kernel doesn't do floating point math)

No, but 1/1000Hz = 1000000ns, while 1/864Hz = 1157407.407ns. If you have
a counter that counts the ticks in nanoseconds (xtime ...), the first
will be exact, the second will be accumulating an error.

It's a tradeoff.

--
Vojtech Pavlik
SuSE Labs, SuSE CR
From: Linus Torvalds [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 12:10:48 -0700 (PDT)

On Wed, 13 Jul 2005, Vojtech Pavlik wrote:
>
> No, but 1/1000Hz = 1000000ns, while 1/864Hz = 1157407.407ns. If you have
> a counter that counts the ticks in nanoseconds (xtime ...), the first
> will be exact, the second will be accumulating an error.

It's not even that we have a counter like that, it's the simple fact that
we have a standard interface to user space that is based on milli-, micro-
and nanoseconds. (For "poll()", "struct timeval" and "struct timespec"
respectively).

It's totally pointless saying that we can do 864 Hz "exactly", when the
fact is that all the timeouts we ever get from user space aren't in that
format. So the only thing that matters is how close to a millisecond we
can get, not how close to some random number.

So we do a lot of conversions from "struct timeval" to "jiffies", and if
you don't take the error in that conversion into account, then you're
ignoring what is likely a _bigger_ error.

Long-term time drift is a known issue, and is unavoidable since you don't
even know the exact frequency of the crystal, since that is not only not
that exact in the first place, it depends on temperature etc.

So long-term time drift is something that we inevitably have to use things
like NTP to handle, if you want an exact clock. And in short-term things,
the timeval/jiffie conversion is likely to be a _bigger_ issue than the
crystal frequency conversion.

So we should aim for a HZ value that makes it easy to convert to and from
the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
good values for that reason. 864 is not.

Linus
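As an editorial illustration of the conversion argument in the two messages above, here is a minimal Python sketch. It is not kernel code: the function names are hypothetical, and the round-up policy for timeouts is an assumption made for the example (the kernel's real conversion routines differ in detail). It shows that 100, 250 and 1000 Hz give a whole number of nanoseconds per tick and convert a 20 ms timeout exactly, while 864 Hz does neither.

```python
from fractions import Fraction

NSEC_PER_SEC = 1_000_000_000

def tick_period_ns(hz):
    """Exact length of one tick in nanoseconds, kept as a rational number."""
    return Fraction(NSEC_PER_SEC, hz)

def msecs_to_ticks(msecs, hz):
    """Round a millisecond timeout up to whole ticks (assumed policy, for illustration)."""
    return -(-msecs * hz // 1000)  # integer ceiling division

def rounding_error_us(msecs, hz):
    """How much longer than requested the rounded timeout lasts, in microseconds."""
    ticks = msecs_to_ticks(msecs, hz)
    actual_us = Fraction(ticks * 1_000_000, hz)
    return actual_us - msecs * 1000

if __name__ == "__main__":
    for hz in (100, 250, 1000, 864):
        period = tick_period_ns(hz)
        kind = "exact" if period.denominator == 1 else "fractional"
        print(hz, float(period), kind, float(rounding_error_us(20, hz)))
```

With these assumptions a 20 ms timeout at 864 Hz becomes 18 ticks, which is roughly 0.83 ms longer than requested, whereas the three "round" HZ values represent it with no error at all.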
From: Lee Revell [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 15:13:44 -0400

On Wed, 2005-07-13 at 12:10 -0700, Linus Torvalds wrote:
> So we should aim for a HZ value that makes it easy to convert to and from
> the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
> good values for that reason. 864 is not.

How about 500? This might be good enough to solve the MIDI problem.

Lee
From: Dmitry Torokhov [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 14:32:02 -0500

Hi,

On 7/13/05, Lee Revell wrote:
> On Wed, 2005-07-13 at 12:10 -0700, Linus Torvalds wrote:
> > So we should aim for a HZ value that makes it easy to convert to and from
> > the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
> > good values for that reason. 864 is not.
>
> How about 500? This might be good enough to solve the MIDI problem.
>

I would expect number of laptop users significatly outnumber ones
driving MIDI so as a default entry 250 makes more sense IMHO.

--
Dmitry
From: Lee Revell [email blocked]
Subject: Re: Linux v2.6.13-rc3
Date: Wed, 13 Jul 2005 13:31:33 -0400

On Tue, 2005-07-12 at 22:05 -0700, Linus Torvalds wrote:
> I think the shortlog speaks for itself.

HZ still defaults to 250. As was explained in another thread, this will
break apps like MIDI sequencers and won't really save much battery
power. The default should remain 1000 until these issues are resolved.

Or am I wasting my time with this?

Lee
From: Linus Torvalds [email blocked]
Subject: Re: Linux v2.6.13-rc3
Date: Wed, 13 Jul 2005 10:51:53 -0700 (PDT)

On Wed, 13 Jul 2005, Lee Revell wrote:
> On Tue, 2005-07-12 at 22:05 -0700, Linus Torvalds wrote:
> > I think the shortlog speaks for itself.
>
> HZ still defaults to 250. As was explained in another thread, this will
> break apps like MIDI sequencers and won't really save much battery
> power.

Stop bothering with this, I've seen the thread, and no, I disagree totally
with "as explained in another thread". That's simply not true.

The only thing that is true is that 100Hz is too low for some use, and
1000Hz is too high for some uses. NOBODY has shown that 250Hz isn't good
enough, there's only been people whining and complaining and saying it
might not be.

The fact is, engineering is about finding something that works "well
enough". If _you_ think that 1000Hz is the right answer, then _you_ select
that. But if you cannot accept the fact that other people are of a
different opinion, then why would anybody want to discuss the issue with
you?

This is a fundamental fact of engineering (and, in fact, pretty much any
other area in life): If you cannot accept that other people have other
aims and needs than than you, then why are you talking to other people in
the first place?

So get on with your lives. Realize that there is no "perfect" value for
HZ. 250 right now is somewhere reasonable, and for the extreme ends you
can always chose your own. Don't try to force your ideas on others.

And btw, the next time somebody complains about HZ, I want HARD DATA. I
don't want whining. Stop cc'ing me in you don't have a real datapoint, and
if you cannot accept that other people have _other_ real datapoints.

Linus
From: Lee Revell [email blocked]
Subject: Re: Linux v2.6.13-rc3
Date: Wed, 13 Jul 2005 14:05:48 -0400

On Wed, 2005-07-13 at 10:51 -0700, Linus Torvalds wrote:
>
> On Wed, 13 Jul 2005, Lee Revell wrote:
> > On Tue, 2005-07-12 at 22:05 -0700, Linus Torvalds wrote:
> > > I think the shortlog speaks for itself.
> >
> > HZ still defaults to 250. As was explained in another thread, this will
> > break apps like MIDI sequencers and won't really save much battery
> > power.
>
> Stop bothering with this, I've seen the thread, and no, I disagree totally
> with "as explained in another thread". That's simply not true.
>
> The only thing that is true is that 100Hz is too low for some use, and
> 1000Hz is too high for some uses. NOBODY has shown that 250Hz isn't good
> enough, there's only been people whining and complaining and saying it
> might not be.

OK, point taken, I'm done with this issue as far as LKML is concerned.
Anyone who wants to discuss this further can come over to the
linux-audio-dev list.

Lee
From: Linus Torvalds [email blocked]
To: Lee Revell [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Wed, 13 Jul 2005 18:54:11 -0700 (PDT)

On Wed, 13 Jul 2005, Lee Revell wrote:
>
> Interesting. First they say it's impractical to reprogram the PIT, then
> they later imply that's exactly what Windows does, though for some
> reason they don't come out and say it.

I suspect that it is impractical to reprogram the PIT on a very fine
granularity.

Btw, if somebody really gets excited about all this, let me say (once
again) what I think might be an acceptable situation.

First off, I'm _not_ a believer in "sub-HZ ticks". Quite the reverse. I
think we should have HZ be some high value, but we would _slow_down_ the
tick when not needed, and count by 2's, 3's or even 10's when there's not
a lot going on.

In other words, I don't think we want a _highfrequency_ timer, I want a
_lower_ frequency mode.

So let's say that we raise HZ to 2000, or somethign that we decide is the
upper limit of sanity. We then have some timer logic entity that notices
that nothing is going to care for the next 100 ticks, so we go into "slow
mode", and reprogram the timer to tick at a frequency of 100Hz, but when
it does tick, we just count it as 20.

IOW, nothing ever sees any "variable frequency", and there's never any
question about what the timer tick is: the timer tick is 2kHz as far as
everybody is concerned. It's just that the ticks sometimes come in
"bunches of 20".

This also means that there is never any issue of the timer running wild.
The _most_ it will ever run at is limited quite naturally, and some crazy
user asking for a 1ns itimer won't make any difference at all to the
system.

Linus
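The "slow mode" described in the message above can be modelled in a few lines. The following Python sketch is an editorial toy simulation, not a proposal for how the kernel would implement it: the constants come from the email (HZ of 2000, a 100-tick look-ahead, 100 Hz slow mode counted in bunches of 20), while the function name, the sorted-list timer queue and the assumption that reprogramming the hardware is free are all made up for the example.

```python
HZ = 2000                 # logical tick rate everybody sees
SLOW_HZ = 100             # hardware rate while in "slow mode"
BUNCH = HZ // SLOW_HZ     # each slow interrupt counts as 20 ticks
IDLE_THRESHOLD = 100      # enter slow mode if nothing is due for 100 ticks

def run(timeouts, total_ticks):
    """Simulate total_ticks logical ticks.

    timeouts is a list of absolute expiry times in ticks.
    Returns (jiffies, interrupts, fired), where fired holds
    (expiry, jiffies_when_fired) pairs.
    """
    pending = sorted(timeouts)
    jiffies = 0
    interrupts = 0
    fired = []
    while jiffies < total_ticks:
        next_due = pending[0] if pending else total_ticks
        gap = next_due - jiffies
        # Nothing cares for the next 100 ticks: take one slow interrupt
        # and count it as a bunch of 20. Otherwise tick normally.
        step = BUNCH if gap >= IDLE_THRESHOLD else 1
        jiffies += step
        interrupts += 1
        while pending and pending[0] <= jiffies:
            fired.append((pending.pop(0), jiffies))
    return jiffies, interrupts, fired

if __name__ == "__main__":
    print(run([500], 2000))
```

In this model a single timer due at tick 500 still fires exactly at jiffies 500, because the simulation drops back to single ticks once the gap falls below the threshold, yet one simulated second needs far fewer interrupts than the 2000 a fixed 2 kHz tick would take.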
From: Ingo Molnar [email blocked]
Subject: Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Date: Thu, 14 Jul 2005 10:38:43 +0200

* Linus Torvalds [email blocked] wrote:

> I suspect that it is impractical to reprogram the PIT on a very fine
> granularity.

yes - reprogramming the PIT can take up to 10 usecs even on recent PCs.
(in fact the cost is pretty much system-independent due to PIO.) On
modern, PIT-less timesources (e.g. HPET) it can be faster.

> Btw, if somebody really gets excited about all this, let me say (once
> again) what I think might be an acceptable situation.
>
> First off, I'm _not_ a believer in "sub-HZ ticks". Quite the reverse.
> I think we should have HZ be some high value, but we would _slow_down_
> the tick when not needed, and count by 2's, 3's or even 10's when
> there's not a lot going on.

i think that would be an acceptable solution for high-precision timers,
as long as two other problems are solved too:

 - there are real-time applications (robotic environments: fast rotating
   tools, media and mobile/phone applications, etc.) that want 10 usecs
   precision. If such users increased HZ to 100,000 or even 1000,000,
   the current timer implementation would start to creek: e.g. jiffies
   on 32-bit systems would wrap around in 11 hours or 1.1 hours. (To
   solve this cleanly, pretty much the only solution seems to be to
   increase the timeout to a 64 bit value. A non-issue for 64-bit
   systems, that's why i think we could eventually look at this
   possibility, once all the other problems are hashed out.)

 - at very high HZ values the clustering of e.g. network timers is lost,
   creating an artificially high number of timer interrupts. So likely
   we'd still need some way to 'blur' timeouts and to round e.g. network
   timers to the next 1 msec or 500 usecs boundary, to cluster up timers
   for bulk processing.

But in any case, such a solution does not sound nearly as messy as the
sub-jiffies method.

if the 'high precision' uses are not addressed [*] i fear the whole HRT
game starts again: embedded folks trying to standardize on Linux for
everything [**] will want HRT timers and will do addons and sub-jiffy
approaches [***], and will push for inclusion. I think we could as well
solve this whole problem area by making ridiculously high HZ values
practical too!

	Ingo

[*] there's also a third problem: timer prioritization. It's not
    necessarily a problem the upstream kernel should care about, but
    it's a problem for things that try to offer hard-real-time, like
    PREEMPT_RT: HRT timers need to be prioritizable. If e.g. the system
    is soaked handling network timers, it should still be possible for
    that single mega-important HRT timer to run and wake up the
    mega-high-priority RT task that will preempt all network activity
    within 10 usecs worst-case. The sub-jiffies approach does this
    prioritization in a natural way, because there HRT timers are
    separate, so the prioritization of them is easy. With the grand
    unified 'big HZ' scheme the HRT folks would have to implement a
    mechanism to split off highprio timers from the stream of normal
    timers. ]

[**] having high precision is also a perception and uniformity of
     platform issue: most embedded developers will find 500 usecs
     precision good enough for most uses, but it does not 'sound' good
     enough, and there's no easy way out either. So _if_ there's the
     occasional need for higher precision they'll have no easy solution
     for Linux, and this prevents them from standardizing on Linux for
     _everything_.

[***] a side-thought about sub-jiffies: the biggest conceptual problem
      the sub-jiffy method has is the sorting needed when timers move
      from the jiffy bucket into the HRT list which doesnt scale - but
      this is really a HRT-timers-internal problem. It could be further
      improved by e.g. dividing the last jiffy up into say 10 usec
      buckets - with a 1msec jiffy clock that's 100 buckets, and having
      a bitmap to see which bucket is active. Then the 'get the next
      timer' act becomes a matter of searching the bitmap for the next
      bit set - at pretty much constant overhead. Even a 1 usec
      precision would mean only 1000 buckets for the last jiffy, with
      1000 bits (128 bytes) to search, still quite ok. But, in any case,
      it would be nice to avoid the "conceptual dualness" of the HRT
      patch.
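The bucket-and-bitmap idea in the last footnote above can be sketched as follows. This is an editorial illustration in Python rather than anything taken from the HRT patch: the class and method names are hypothetical, and a Python integer stands in for the fixed-size bitmap a C implementation would use. The bucket sizes (10 usec buckets within a 1 msec jiffy, so 100 buckets) follow the email.

```python
JIFFY_USEC = 1000                       # 1 msec jiffy, as in the email
BUCKET_USEC = 10                        # 10 usec buckets
NBUCKETS = JIFFY_USEC // BUCKET_USEC    # 100 buckets

class SubJiffyBuckets:
    """Timers that expire within the current jiffy, grouped by 10 usec bucket."""

    def __init__(self):
        self.bitmap = 0                               # bit i set: bucket i is non-empty
        self.buckets = [[] for _ in range(NBUCKETS)]

    def add(self, offset_usec, timer):
        """Queue a timer at the given offset (in usec) into the jiffy."""
        if not 0 <= offset_usec < JIFFY_USEC:
            raise ValueError("offset must fall inside the jiffy")
        idx = offset_usec // BUCKET_USEC
        self.buckets[idx].append(timer)
        self.bitmap |= 1 << idx

    def next_bucket(self, from_usec=0):
        """Index of the first non-empty bucket at or after from_usec, or None."""
        start = from_usec // BUCKET_USEC
        masked = self.bitmap >> start
        if masked == 0:
            return None
        lowest = (masked & -masked).bit_length() - 1  # position of lowest set bit
        return start + lowest

    def pop_bucket(self, idx):
        """Remove and return all timers in a bucket, clearing its bit."""
        timers, self.buckets[idx] = self.buckets[idx], []
        self.bitmap &= ~(1 << idx)
        return timers

if __name__ == "__main__":
    b = SubJiffyBuckets()
    b.add(250, "a")
    b.add(990, "b")
    print(b.next_bucket(), b.pop_bucket(25), b.next_bucket())
```

Finding the next timer is then a lowest-set-bit search over at most 100 bits rather than a sort, which is the "pretty much constant overhead" the footnote refers to. Timers that share a bucket are not ordered relative to each other in this sketch.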
really excited
While I'm not a kernel hacker and therefore have no room to speak, I do hope someone "gets really excited" and implements the variable-frequency timer Linus refers to.
could someone please explain
Could someone please explain the performance and throughput effects of using a higher or lower HZ?
A higher hertz in the timer (I think) means more accurate short-term timing, allowing a process to sleep for a short but accurate time. But here's the catch: it can only be woken when the timer interrupts (goes bing), so the higher the frequency, the more accurate a short-term sleep becomes. This is why audio developers want high frequencies, so they can work accurately in their sound code. But every time the timer interrupts, the interrupt handler in the kernel is run. True, it runs for a short time, but it can degrade performance, and most apps can deal with an inaccurate timer (audio and video are really the only ones that need a decent timer; other common apps aren't sensitive to it).
It also happens that running the timer interrupt handler can wake up some CPUs, and going to sleep (power-save mode) then coming back so many times a second might waste even more power. There is a patch around that turns the timer down while the CPU is idle, but I don't remember where it is ...
Summary: fast hertz = accurate sleep timing for time-sensitive applications; slow hertz = more throughput on the CPU and possibly more power savings (though I'm not sure how much extra performance or power savings you get ...).
Older hardware (say a Pentium Pro or less?) also likes slower timings; the old 100 hertz works well for them.
Re: could someone please explain
basically;
low HZ == improved throughput (good) but higher latency (bad).
high HZ == lower throughput (bad) but lower latency (good).
It's a tradeoff. You can't have low latency and high throughput at the same time, since low latency will always add extra overhead. The timer interrupt going off more often is great for interactive apps and audio/video apps, since they can time stuff more accurately and respond faster to interactive events, but all those timer interrupts take time (and a little time adds up when you have 1000 interrupts each second). Just handling the interrupt takes time, and then you also have to check all timers for expiration and whatnot. Interactivity may be nice with a high HZ, but overall system throughput will suffer. There's no perfect HZ value for all tasks... Long term we want either a dynamic HZ that the kernel tunes at runtime according to load, or sub-HZ timers, or a tickless kernel. Or we just go on with the current system and let users build their own kernels and pick their preferred fixed HZ at build time.
variable Hz?
In one of my small realtime kernels, I've implemented an 'inverted timelist' version of the kernel timer service: I make the timer generate an interrupt at the (future) moment that one or more 'applications' have requested timing services for. So instead of 'generate a timer interrupt and then see if anyone wants it' (fixed HZ, which is polling), I look at the time-list and then program the timer interrupt subsystem so that it generates an interrupt when the next one is needed. This does not really suffer from the tradeoff that was mentioned: if, for instance, no timer services are required at some moment, then the entire timer subsystem becomes ~100% 'quiet'; if something wants timer interrupts like mad, the timer interrupt subsystem becomes busy, but that is what the 'user' requests at that moment. I would love to see something similar in Linux. The implementation is a little more difficult than a 'standard' timer service, but not by much. Perhaps that is what the previous poster meant by 'tickless kernel'?
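A rough idea of what such an 'inverted timelist' could look like, as an editorial Python sketch rather than the poster's actual implementation: pending timers sit in a min-heap keyed by deadline, and after every change the (simulated) hardware is programmed for the earliest deadline only. The class name, the `programmed_for` attribute and the callback signature are assumptions made for the example.

```python
import heapq
import itertools

class OneShotTimerList:
    """Deadline-ordered timer list that asks for one interrupt at a time."""

    def __init__(self):
        self._heap = []                  # entries are (deadline, seq, callback)
        self._seq = itertools.count()    # tie-breaker so callbacks are never compared
        self.programmed_for = None       # deadline the hardware is set to, or None when idle

    def _reprogram(self):
        # Stand-in for writing the hardware timer: earliest deadline, or silence.
        self.programmed_for = self._heap[0][0] if self._heap else None

    def add(self, deadline, callback):
        """Register callback to run at the given deadline."""
        heapq.heappush(self._heap, (deadline, next(self._seq), callback))
        self._reprogram()

    def on_interrupt(self, now):
        """Run every timer due at or before now and return how many fired."""
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            callback(now)
            fired += 1
        self._reprogram()
        return fired

if __name__ == "__main__":
    timers = OneShotTimerList()
    timers.add(50, lambda now: print("late timer at", now))
    timers.add(10, lambda now: print("early timer at", now))
    timers.on_interrupt(timers.programmed_for)
    timers.on_interrupt(timers.programmed_for)
```

When the heap is empty, `programmed_for` is `None` and no interrupts are requested at all, which corresponds to the "~100% quiet" case in the comment. The sketch ignores the cost of reprogramming the hardware.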
You're not the first one with that idea, but apparently this doesn't work because of hardware issues: setting the timer interrupt is slow.
I believe someone said that it can take up to 18ms to program a slow PIT; that kind of latency isn't acceptable.
I believe that's microseconds, not milliseconds. Still, it's a sizeable number of cycles, so you probably only want to do that a couple of times a second.
How do you handle other interrupts?
Maybe this isn't a factor in your application, but if you program the timer in such a way, how do you handle other hardware interrupts, like a keypress or network packet?
For example, say you look at your wait list and you see that you have nothing to do for 4 seconds, but 150ms after you reprogram the PIT an interrupt arrives. You haven't been counting these 150ms, so how do you know how much remaining time to report to the application waiting on select()?
If the CPU has one, read the TSC. Alternatively, many timers allow reading the current countdown, so you can interpolate.
If this were really a problem, gettimeofday() wouldn't give you better than HZ precision, when actually it does on most hosts.
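The interpolation mentioned in this reply is simple arithmetic. The Python sketch below is an editorial illustration only: it assumes a hypothetical countdown timer whose initial and current counts can both be read, and the 1 MHz clock in the example is chosen for round numbers rather than taken from any real device.

```python
NSEC_PER_SEC = 1_000_000_000

def elapsed_ns(initial_count, current_count, timer_hz):
    """Time since the countdown was started, derived from the remaining count."""
    if current_count > initial_count:
        raise ValueError("a countdown timer cannot be above its initial count")
    ticks = initial_count - current_count
    return ticks * NSEC_PER_SEC // timer_hz

def remaining_timeout_ns(requested_ns, initial_count, current_count, timer_hz):
    """How much of a requested timeout is still left, never negative."""
    return max(0, requested_ns - elapsed_ns(initial_count, current_count, timer_hz))

if __name__ == "__main__":
    # The scenario from the question: a 4 second wait interrupted after 150 ms,
    # on a hypothetical 1 MHz countdown timer.
    print(elapsed_ns(4_000_000, 3_850_000, 1_000_000))
    print(remaining_timeout_ns(4 * NSEC_PER_SEC, 4_000_000, 3_850_000, 1_000_000))
```

The same arithmetic applies to a free-running counter such as the TSC, with the difference taken as current minus initial instead.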
Interbench results
With somewhat fortuitous timing coinciding with releasing Interbench, I have results from interbench which show differences in audio/video performance with HZ=1000 vs 250:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112140080312273&w=2
I know it was discussed to death on LKML, but I still strongly favor the "right solution for the task" approach.
Audio and video, like multimedia in general, are exceptional in that you need some sort of uber-scheduler for them.
Real-time extensions don't solve the problem.
High HZ works around the problems, nothing more.
Take an implementation with specialized threads. Say one thread reads data from disk and does the demux, a second thread decodes into raw streams, and another N (e.g. 2) threads send the decoded data to the respective hardware (e.g. video and audio). The problem is that video and audio processing are so different from each other.
The task is standard. Very, very standard. All multimedia comes down to this: consume from one source, decode and process, send to multiple destinations. Userspace cannot solve the problem on its own; this is specialized scheduling and application-defined prioritization. As long as the kernel does not implement something special to optimize such workloads, Linux will suck at multimedia, just as it does right now. It is funny to find that Mac OS X, with all its bloat, does video playback with post-processing much more smoothly. Linux has half the CPU load on the same task (mplayer and vlc are blazingly fast), but even with high HZ it is still not as smooth as Mac OS X's QuickTime. My perception (on my home hi-fi/cinema) still catches sound jitter and framerate fluctuation from both vlc and mplayer.
P.S. Jitter from alsa is especially annoying: on cheap speakers it is Okay, on normal hi-fi it sucks. Test: cdparanoia CD & play raw wavs on PC; play CD on hi-fi; listen to both. Proper cabling (middle range IXOS), proper sound card (Turtle Beach Santa Cruz) are in place.
> P.S. Jitter from alsa is especially annoying: on cheap speakers it is
> Okay, on normal hi-fi it sucks. Test: cdparanoia CD & play raw wavs on
> PC; play CD on hi-fi; listen to both. Proper cabling (middle range
> IXOS), proper sound card (Turtle Beach Santa Cruz) are in place.
puff puff pass man... cause you're obviously bogarting the crack pipe on this one.
Every single soundcard available is clocked from a local oscillator; there is no way ALSA can affect the jitter of the playback. Either ALSA gets the next block of audio there, or it doesn't and you hear a click.
Now if your soundcard is junk... well, that's not Linux's fault.
The DAC and ADC on my RME Multiface here would be suitable for scientific measurement. There is nothing wrong with the Linux audio subsystem.
CK benchmark
Impressive!
I didn't realize how important the HZ value was.
The 250Hz Video latency is an order of magnitude worse than 1000Hz.
VMware
What I would find very useful is a way to set the HZ value at runtime.
Whoever out there is running 2.6 under VMware knows what I'm talking about. With 1000Hz, the clock falls so far behind that neither NTP nor vmware-tools can compensate. The damn thing loses 15 minutes each hour...
The only solution is to recompile the kernel, changing it to 100Hz.
250HZ may be enough to avoid wasting time recompiling, but it may not.
Is that why my clock keeps drifting? It was driving me nuts, and I couldn't find any reference to it.
Switch to XEN: http://www.cl.cam.ac.uk/Research/SRG/netos/xen/
When Xen can run Windows guests unmodified, along with Linux guests, I'll think about it. Until then, it isn't an option.
Clock drift
Probably not. The way to fix it is to run ntpd. Point it to some NTP peers and let it run for a while; it will take some time to work out the proper drift and compensate in real time.
As I said, it can't. The very nature of virtualization means the drift is not fixed, it is random (depends on the amount of time the VM gets to run, which depends on other processes on the host). NTPd can't correct the drift because it is too big and can't be predicted.
laptops
Another point against using a high default HZ value is laptops whose power electronics make noise when the CPU transitions from halt to running. Power management has to be completely disabled on these machines in order to run with a high HZ value and no noise.
Dynamic HZ value
There's a patch, "Dynamic Tick Timer for Linux (dyn-tick)" from Tony Lindgren and Tuukka Tikkanen for 2.6.12-rc6 kernels, that makes the HZ value depend on the current load. The system runs at full HZ under load and skips ticks when possible while idle. More info can be found here: www.muru.com. I've prepared a patch for the 2.6.12 kernel that uses dyn-tick and other stuff such as plugsched 5.2 from P. Williams and the genetic stuff from Jake Moilanen (genetic-as IO scheduler), etc.
Here you can find link to the patch and an ebuild for Gentoo Linux users: http://forums.gentoo.org/viewtopic-p-2577927.html#2577927
Or way higher
I've got a laptop like this. When I first had the problem, I tried setting HZ to 10kHz instead. After a couple of changes in the kernel (preventing overflows), I got it working and couldn't hear the noise anymore. Unfortunately, the overhead is way too high (10% on my Pentium-M 1.6 GHz).
This applies not only to laptops. I've got an old 400MHz AMD K6II (DFI motherboard) at home, which I use as a diskless/fanless X-terminal (downclocked to 350MHz, CPU heatsink replaced with an oversized unit). Since I started using 2.6 it emits a mildly annoying sound (presumably from the power converter on the MB) when it is idle. I haven't tried the most recent kernels yet - perhaps I'll like the sound of 250HZ better ;-}