Disabling NO_HZ has a serious negative effect on performance -- an extra
70us per I/O.  Prevent users from deselecting it.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>

diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index f06a8a3..55b9a04 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -5,13 +5,9 @@ config TICK_ONESHOT
 	bool
 
 config NO_HZ
-	bool "Tickless System (Dynamic Ticks)"
+	def_bool y
 	depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
 	select TICK_ONESHOT
-	help
-	  This option enables a tickless system: timer interrupts will
-	  only trigger on an as-needed basis both when the system is
-	  busy and when the system is idle.
 
 config HIGH_RES_TIMERS
 	bool "High Resolution Timer Support"

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
OTOH, it causes some systems to not work at all still. Sadly. Thanks, --
It doesn't stop people from specifying nohz=off on the kernel command line. -- Matthew Wilcox Intel Open Source Technology Centre --
Which still doesn't work for some people AFAICT. Thanks, Rafael --
Why remove the ability to make the configuration choice? Why not just add the info about the performance impact to the help text and let me shoot myself in the foot (that is the unix way (tm)) if I desire to? (Yes, I saw your later reply about the kernel command line method of shooting myself in the foot.) -Frank --
$ wc -l .config
2601 .config

It's too hard to get every single config option right ... unless it's a works / doesn't work choice, having a "make my performance suck" config option is a bad idea. I didn't even know I had this config option set the wrong way until I ran powertop on my desktop, implemented its recommendations for things I'd screwed up in my .config, then reran the tests I was doing. Someone with a big server might not run powertop, so wouldn't be told they'd got it wrong. Sure, we could write a special tool to check people's config options to see if they've got some common config options set wrongly ... but I'm rather reminded of Brazil. http://farm1.static.flickr.com/50/109556460_5362c5f2b5_o.jpg -- Matthew Wilcox Intel Open Source Technology Centre --
Agreed that setting config options can be painful! Various options that improve performance for your system can negatively impact the performance of systems and applications that are different from your system. On some of my real time systems, CONFIG_NO_HZ=y results in larger maximum interrupts-disabled values, meaning larger real time latency. For the option in question, would it be sufficient to just add "default y", so that the common case gets the correct value, but the value remains configurable? -Frank --
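A minimal sketch of the "default y" alternative suggested above, assuming the rest of the NO_HZ entry stays exactly as it was before the patch (prompt and help text retained, so the option remains user-selectable). This is only an illustration of the suggestion, not a tested or submitted patch:

```kconfig
# Sketch only: the pre-patch NO_HZ entry with "default y" added.
# Assumption: nothing else in kernel/time/Kconfig changes.
config NO_HZ
	bool "Tickless System (Dynamic Ticks)"
	default y
	depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
	select TICK_ONESHOT
	help
	  This option enables a tickless system: timer interrupts will
	  only trigger on an as-needed basis both when the system is
	  busy and when the system is idle.
```

With the prompt string kept, the option still shows up in the configuration menu and can be turned off; only the value used when no choice has been made changes to y.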
One can argue that configurability is one of the greatest strengths of Linux. OTOH, one can also argue that users tend to get lost and hang themselves when given too much rope; and that the burden of support and maintenance of unnecessary config options squanders valuable resources. Personally, I have two bugs filed against my code that can be reproduced only in tickfull mode that almost nobody uses. Is it a good use of my time to be distracted by configurations that 0.01% use, or focus on issues seen by the other 99.99%? I'm in favor of deleting the config option, and the cmdline option with it, and I applaud Matthew for proposing such. Len Brown, Intel Open Source Technology Center --
I always disable tickless since early on it crashed. I guess I haven't bothered to risk that again, and updating the kernel via 'make oldconfig' means I'm not often presented with the option, apart from the first custom kernel after a new install. There are many items in .config that need better help info to inform on the consequences; how are we (the users) supposed to know the 'new' way is now better than the old? I'll try tickless, if only to gain back some unexpected performance loss on a RAID6 system I built recently. I've not done RAID benchmarking for five years ;) I expect twice the throughput I'm getting, based on a Linux NAS device I looked at recently. I ask for the better help text, unless you can show tickful operation is no longer required anywhere? A better help explanation plus scheduled removal? (I didn't view the patch to know if that's what it does.) --
Hi, I have developed a device driver which uses the default kernel work queue thread for deferring tasks. While executing, this work queue thread sometimes exits in the middle of function calls. It exits sometimes at one point and sometimes at another, without any panic. The next time the work is scheduled, it works normally. Can this problem be due to the stack limit? Regards Raj --
Even at a kconfig variable without prompt, a help section may be useful. In case of this one, you should rather add to the text than remove it. -- Stefan Richter -=====-==-=- ==-- ==--= http://arcgraph.de/sr/ --
A system with fixed HZ is really simpler etc... and 70us per I/O does not sound that bad. Anyway, shouldn't the I/O overhead just be fixed? -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
Also, where exactly are those 70us spent? Once per clock tick? For me, I/O can be an ethernet frame sent or received, and I never noticed such a delay. At least, the commit message sounds very evasive to me. --
This is a rather non-trivial change. The common wisdom was that NO_HZ=y *adds* overhead, not removes it. Thus, I propose: (1) Retaining and elaborating the help text. (2) Allowing the user to change the default. --
Hi, Enabling NO_HZ has a serious negative effect on realtime due to longer interrupt handling and less predictability of the system. We prefer to align the timer interrupt handling to the interrupt handling of our realtime control loops so they do not bother each other. A varying moment of the timer interrupt handling would destroy much realtime responsiveness here. So, what is good for throughput is usually bad for responsiveness... This is an example that the many different applications of the Linux kernel demand a different configuration of the kernel. I prefer to keep it configurable. Kind regards, Remy --
