Linux: Debating The Merits Of Kernel Preemption

Submitted by Jeremy
on October 6, 2004 - 6:36pm

In response to a recent bug report on the lkml, Jeff Garzik, the maintainer of the Serial ATA subsystem [story] and network device drivers, had some harsh words for kernel preemption. The bug report mentioned that kernel preemption was enabled, to which Jeff replied, "I strongly recommend disabling kernel preemption. It is a hack that hides bugs". The comment led into an interesting discussion.

Jeff's two main points were that kernel preemption potentially hides long latency code paths that should instead be fixed, and that it potentially introduces bugs. Andrea Arcangeli [interview], who back in March also had some sharp criticisms of the usefulness of kernel preemption [story], replied to the first point, "with any decent profiler, places where we're wasting cpus will show-up immediately. [...] So I disagree with your claim that preempt risks to hide inefficient code, there are many other (probably easier) ways to detect inefficient code than to check the latencies." 2.6 maintainer Andrew Morton [interview] replied to the second point, "what driver bugs are apparent with preemption which are not already SMP bugs? [The] only thing I can think of is unguarded use of per-cpu data, and we have runtime debug checks for that now."


From: Jeff Garzik [email blocked]
Subject: Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA)
Date: 	Tue, 05 Oct 2004 20:54:14 -0400

Gianluca Cecchi wrote:
> Eventually I can provide config. Btw, I'm using udev and kernel preemption.

I strongly recommend disabling kernel preemption.  It is a hack that 
hides bugs.



> ata1: dev 0 ATA, max UDMA/133, 240121728 sectors:
> ata1: dev 0 configured for UDMA/100
> and the same for ata2?

The controller maximum is set to UDMA/100.



> But taking in parallel top with refresh 1s in another window, during the
> 80
> seconds elapsed above, I had two freezes in top session: one of 20sec and
> another
> of 15 sec: about the half of the time of the total I/O operation!
> During this time there was one cpu used at about 30%, no other cpu constraints.
> I repeated the operation with similar behaviour. Sometimes also 50 contiguos
> secs
> of freeze, until I/O operation finished.
> The freeze is not about all system itself, but only certian things.
> For example I tried also a vmstat session in another window with 3secs of
> delay
> and it had no problems while top was blocked:

Make sure you have the latest BIOS update for your system, and latest 
BIOS update for your card.

	Jeff


From: Roland Dreier [email blocked]
Subject: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 05 Oct 2004 18:00:16 -0700

    Jeff> I strongly recommend disabling kernel preemption.  It is a
    Jeff> hack that hides bugs.

Why do you say that?  Preempt seems to be the cleanest way to low
latency, and if anything it exposes locking bugs and races rather than
hiding anything.

Thanks,
  Roland

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 05 Oct 2004 21:11:59 -0400

Roland Dreier wrote:
>     Jeff> I strongly recommend disabling kernel preemption.  It is a
>     Jeff> hack that hides bugs.
>
> Why do you say that?  Preempt seems to be the cleanest way to low
> latency, and if anything it exposes locking bugs and races rather than
> hiding anything.

Clean?  Hardly.  It breaks up code paths that were never written to be
broken up.

The proper fix is to locate and fix high latency code paths.
Preempt does nothing but hide those high latency code paths, and
discourage people from seeking a better solution.

Fix the drivers, rather than bandaid over them with preempt.

If all code paths in the kernel were low latency, then you would not
need preempt at all.

	Jeff

From: Nick Piggin [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 06 Oct 2004 11:28:20 +1000

Jeff Garzik wrote:
> Roland Dreier wrote:
>
> Clean?  Hardly.  It breaks up code paths that were never written to be
> broken up.

But lots of things change. Unserialising the kernel broke code paths.

> The proper fix is to locate and fix high latency code paths.
> Preempt does nothing but hide those high latency code paths, and
> discourage people from seeking a better solution.
>
> Fix the drivers, rather than bandaid over them with preempt.
>
> If all code paths in the kernel were low latency, then you would not
> need preempt at all.
>

When you say low latency, you mean small lock hold times, *and*
cond_rescheds placed everywhere - it is this second requirement
that isn't the cleanest way of doing things.

With preempt, sure you still need small lock hold times. No big
deal.

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 05 Oct 2004 21:40:03 -0400

Nick Piggin wrote:
> When you say low latency, you mean small lock hold times, *and*
> cond_rescheds placed everywhere - it is this second requirement
> that isn't the cleanest way of doing things.
>
> With preempt, sure you still need small lock hold times. No big
> deal.

And with preempt you're still hiding stuff that needs fixing.  And when
it gets fixed, you don't need preempt.

Therefore, preempt is just a hack that hides stuff that wants fixing anyway.

	Jeff

From: Robert Love [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 05 Oct 2004 21:52:55 -0400

On Tue, 2004-10-05 at 21:40 -0400, Jeff Garzik wrote:

> And with preempt you're still hiding stuff that needs fixing.  And when
> it gets fixed, you don't need preempt.
>
> Therefore, preempt is just a hack that hides stuff that wants fixing anyway.

This actually sounds like the argument for preempt, and against
sprinkling cond_resched() hacks all over the kernel.

	Robert Love

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 5 Oct 2004 21:55:15 -0400

On Tue, Oct 05, 2004 at 09:52:55PM -0400, Robert Love wrote:
> On Tue, 2004-10-05 at 21:40 -0400, Jeff Garzik wrote:
>
> This actually sounds like the argument for preempt, and against

As opposed to fixing drivers???  Please fix the drivers and code first.

> sprinkling cond_resched() hacks all over the kernel.

cond_resched() is not the only solution.

	Jeff

From: Robert Love [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 05 Oct 2004 22:07:39 -0400

On Tue, 2004-10-05 at 21:55 -0400, Jeff Garzik wrote:

> As opposed to fixing drivers???  Please fix the drivers and code first.

No, definitely not, dude.  Fixes for anything--drivers include--is never
superseded by anything else, even the eternal quest for "low latency."

> > sprinkling cond_resched() hacks all over the kernel.
>
> cond_resched() is not the only solution.

Indeed.  Most other solutions (fixing algorithms, lowering lock hold
time) have "automatic" benefits with kernel preemption, though, and that
has been what I have always advocated.

	Robert Love

From: Nick Piggin [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 06 Oct 2004 12:02:48 +1000

Jeff Garzik wrote:
> On Tue, Oct 05, 2004 at 09:52:55PM -0400, Robert Love wrote:
>
>>On Tue, 2004-10-05 at 21:40 -0400, Jeff Garzik wrote:
>>
>>
>>>And with preempt you're still hiding stuff that needs fixing.  And when
>>>it gets fixed, you don't need preempt.
>>>
>>>Therefore, preempt is just a hack that hides stuff that wants fixing anyway.

What is it hiding exactly?

>>
>>This actually sounds like the argument for preempt, and against
>
>
> As opposed to fixing drivers???  Please fix the drivers and code first.
>

I thought you just said preempt should be turned off because it breaks
things (ie. as opposed to fixing the things that it breaks).

But anyway, yeah obviously fixing drivers always == good. I don't
think anybody advocated otherwise.

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 5 Oct 2004 22:07:34 -0400

On Wed, Oct 06, 2004 at 12:02:48PM +1000, Nick Piggin wrote:
>
> What is it hiding exactly?

Bugs and high latency code paths that should instead be fixed.

>
> But anyway, yeah obviously fixing drivers always == good. I don't
> think anybody advocated otherwise.

By _definition_, when you turn on preempt, you hide the stuff I just
described above.

Hiding that stuff means that users and developers won't see code paths
that need fixing.  If users and developers aren't aware of code paths
that need fixing, they don't get fixed.

Therefore, by advocating preempt, you are advocating a solution _other
than_ actually making the necessary fixes.

	Jeff

From: Andrea Arcangeli [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 6 Oct 2004 05:17:26 +0200

Hello,

On Tue, Oct 05, 2004 at 10:07:34PM -0400, Jeff Garzik wrote:
> Hiding that stuff means that users and developers won't see code paths
> that need fixing.  If users and developers aren't aware of code paths

with any decent profiler places where we're wasting cpus will show up
immediately, one obvious example that I can recall is the pte_chains
rmap code, that wasn't necessarily high latency but still it was very
wasteful in terms of cpu, was always at the top of the profiling.

So I disagree with your claim that preempt risks to hide inefficient
code, there are many other (probably easier) ways to detect inefficient
code than to check the latencies.

The one argument I've against preempt is that the claim that preempt
doesn't spread cond_resched all over the place is false. It can spread
even more of them as implicit ones. They're not visible to the developer
but they're visible to the cpu. So disabling preempt and putting
fine-grained cond_resched should allow us to optimize the code better,
and actually _reduce_ the number of cond_resched (cond_resched as the
ones visible to the cpu, not the ones visible to the kernel developer).

I wonder if anybody ever counted the number of implicit cond_resched
placed by preempt and compared them to the number of explicit
cond_resched needed without preempt.

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 05 Oct 2004 23:27:06 -0400

Andrea Arcangeli wrote:
> So I disagree with your claim that preempt risks to hide inefficient
> code, there are many other (probably easier) ways to detect inefficient
> code than to check the latencies.

You're ignoring the argument :)

If users and developers are presented with the _impression_ that long
latency code paths don't exist, then nobody is motivated to profile them
(with any tool), much less fix them.

	Jeff

From: Andrea Arcangeli [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 6 Oct 2004 06:03:23 +0200

On Tue, Oct 05, 2004 at 11:27:06PM -0400, Jeff Garzik wrote:
> You're ignoring the argument :)
>
> If users and developers are presented with the _impression_ that long
> latency code paths don't exist, then nobody is motivated to profile them
> (with any tool), much less fix them.

well, you are assuming those latencies are visible with eyes. they might
be in extreme cases, but normally they're not (what people notices
normally are disk latencies, and few people uses an RT userspace
anyways which means they cannot claim the problem to be a lack of
cond_resched, but more likely they want shorter timeslices in the
scheduler etc..). So my point is that you need a measurement tool anyways...

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 06 Oct 2004 00:08:47 -0400

Andrea Arcangeli wrote:
>
> well, you are assuming those latencies are visible with eyes. they might
> be in extreme cases, but normally they're not (what people notices
> normally are disk latencies, and few people uses an RT userspace
> anyways which means they cannot claim the problem to be a lack of
> cond_resched, but more likely they want shorter timeslices in the
> scheduler etc..). So my point is that you need a measurement tool anyways...

I do agree with that.  I don't think that implies preempt is useful for
anything except hiding stuff that should be fixed anyway, though...

Preempt will always be something I ask people to turn off when reporting
driver bugs; it just adds too much complicated mess for zero gain.

	Jeff

From: Andrew Morton [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 5 Oct 2004 21:46:05 -0700

Jeff Garzik [email blocked] wrote:
>
> Preempt will always be something I ask people to turn off when reporting
> driver bugs; it just adds too much complicated mess for zero gain.

What driver bugs are apparent with preemption which are not already SMP bugs?

Only thing I can think of is unguarded use of per-cpu data, and we have
runtime debug checks for that now.

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 06 Oct 2004 02:04:27 -0400

Andrew Morton wrote:
> Jeff Garzik [email blocked] wrote:
>
>>Preempt will always be something I ask people to turn off when reporting
>> driver bugs; it just adds too much complicated mess for zero gain.
>
>
> What driver bugs are apparent with preemption which are not already SMP bugs?

If your implied answer is true, then we wouldn't need
preempt_{en,dis}able() sprinkled throughout the code so much.

	Jeff

From: Andrew Morton [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 5 Oct 2004 23:16:42 -0700

Jeff Garzik [email blocked] wrote:
>
> Andrew Morton wrote:
> > Jeff Garzik [email blocked] wrote:
> >
> >>Preempt will always be something I ask people to turn off when reporting
> >> driver bugs; it just adds too much complicated mess for zero gain.
> >
> >
> > What driver bugs are apparent with preemption which are not already SMP bugs?
>
> If your implied answer is true, then we wouldn't need
> preempt_{en,dis}able() sprinkled throughout the code so much.
>

Where?

`grep -r preempt_disable drivers' points at one bodgy scsi driver.

`grep -r preempt_disable fs' finds two instances of per-cpu data.

`grep -r preempt_disable mm' finds three instances (wtf is
vmalloc_to_page trying to do?  Looks redundant)

`grep -r preempt_disable ipc' is empty

`grep -r preempt_disable net' is empty

`grep -r preempt_disable include' gets a few.

It's less than I expected, actually.
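
As a rough illustration of the "unguarded use of per-cpu data" case Andrew mentions, here is a toy userspace model in Python. It is not kernel code: the generator-based tasks, the `increment`, `run`, and `demo` names, and the single-slot counter are all assumptions made for this sketch. Each `yield` stands for a point where a preemptible kernel could switch to another task on the same CPU.

```python
# Toy userspace model (not kernel code): tasks are generators, and each
# `yield` marks a point where a preemptible kernel could switch tasks.

def increment(counter, cpu):
    """Unguarded read-modify-write on one CPU's slot of a per-CPU counter."""
    value = counter[cpu]      # read this CPU's slot
    yield                     # a preemptible kernel may switch tasks here
    counter[cpu] = value + 1  # write back a possibly stale value


def run(tasks, preemptible):
    """Run tasks on one CPU; interleave at yields only if preemptible."""
    if not preemptible:
        # Model of a non-preemptible kernel (or a section with preemption
        # disabled): each task finishes its update before the next starts.
        for task in tasks:
            for _ in task:
                pass
        return
    # Model of a preemptible kernel: round-robin, switching at every yield.
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)


def demo(preemptible):
    """Two tasks each increment the same CPU's slot once; return the total."""
    counter = [0]  # one CPU, one slot
    run([increment(counter, 0), increment(counter, 0)], preemptible)
    return counter[0]
```

Under these assumptions `demo(False)` returns 2 and `demo(True)` returns 1: the second update is lost when the first task is switched out between its read and its write. Because each CPU has its own slot, this is not a cross-CPU race, which is why it only appears once preemption is possible.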

From: Nick Piggin [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 06 Oct 2004 13:43:53 +1000

Jeff Garzik wrote:
>
> You're ignoring the argument :)
>
> If users and developers are presented with the _impression_ that long
> latency code paths don't exist, then nobody is motivated to profile them
> (with any tool), much less fix them.
>

But even without preempt you'd still have to profile the latency.

If anyone with !preempt notices unacceptable latency, then they can
report and/or profile and fix it. If not, then !preempt latency must be
acceptable.

From: Jeff Garzik [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Tue, 05 Oct 2004 23:59:07 -0400

Nick Piggin wrote:
>
> But even without preempt you'd still have to profile the latency.

You're making my point for me.  If the bandaid (preempt) is not active,
then the system can be accurately profiled.  If preempt is active, then
it is potentially hiding trouble spots.

The moral of the story is not to use preempt, as it
* potentially hides long latency code paths
* potentially introduces bugs, as we've seen with net stack and many
  other pieces of code
* is simply not needed, if all code paths are fixed

	Jeff

From: Nick Piggin [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 06 Oct 2004 14:22:57 +1000

Jeff Garzik wrote:
> Nick Piggin wrote:
>
>> Jeff Garzik wrote:
>>
>>
>> But even without preempt you'd still have to profile the latency.
>
>
> You're making my point for me.  If the bandaid (preempt) is not active,
> then the system can be accurately profiled.  If preempt is active, then
> it is potentially hiding trouble spots.

No. It can still be accurately profiled. You could profile theoretical
!preempt latency on a running preempt kernel.

*You* are ignoring my point :) *Nothing* has to be fixed if !preempt
users aren't seeing unacceptable latency. By definition, right?

>
> The moral of the story is not to use preempt, as it
> * potentially hides long latency code paths
> * potentially introduces bugs, as we've seen with net stack and many
>   other pieces of code

So does CONFIG_SMP. For better or worse, it is in the kernel and
therefore a preempt bug is a bug and not a reason to turn preempt off.

> * is simply not needed, if all code paths are fixed
>

Jeff, the entire point of preempt is to not have to put cond_resched
everywhere. So yes, you can fix it by putting in lots of cond_rescheds,
or turning on preempt.

What's more, with preempt, those that don't care about 100us latency
don't have to be executing cond_resched 10000 times per second. I'd
actually say that the code needs fixing if the !PREEMPT case is doing
that much work to try to achieve insanely low latencies.

From: Aleksandar Milivojevic [email blocked]
Subject: Re: Preempt? (was Re: Cannot enable DMA on SATA drive (SCSI-libsata, VIA SATA))
Date: Wed, 06 Oct 2004 10:16:08 -0500

Jeff Garzik wrote:
> You're making my point for me.  If the bandaid (preempt) is not active,
> then the system can be accurately profiled.  If preempt is active, then
> it is potentially hiding trouble spots.
>
> The moral of the story is not to use preempt, as it
> * potentially hides long latency code paths
> * potentially introduces bugs, as we've seen with net stack and many
>   other pieces of code
> * is simply not needed, if all code paths are fixed

One can also look onto it from another angle:
* conveniently resolves long latency code paths that can't be avoided
* uncovers bugs that need to be fixed
* implicitly fixes code paths

It seems to me that you are mixing latency of the system, efficiency of
the driver functions, and performance of the system in the way it suits
your arguments.  Those three influence each other, but should be looked at
and solved separately.

Preempt is a fix for latency.  It doesn't (and can't) fix efficiency and
performance.  If you are using latency as a measure for efficiency and
performance, you are mixing apples and oranges with bananas.

Inefficient driver function (or code path) will not become efficient if
you sprinkle it with cond_resched (it will only reduce the latency of
the system).  As you conveniently said, you are just putting band aid on
the problem, instead of fixing it.  Not different than using preempt
kernel, really.  Only more time spent on it by developer, that might be
used better somewhere else (like making code path more efficient).

Performance of the system might be a bit lower with preempt kernel.  But
most of those that would notice or care (0.1% of users?  probably less)
would probably be happier without cond_resched executed thousands of
times per second too, and would happily sacrifice latency to high
performance.

Finally, the bugs.  Bugs need to be fixed.  Period.  If bug goes away
when somebody turns off preempt on uniprocessor system, it may as well
hit back when you move to non-preempt SMP system in even more obscure
ways (because then you really have code paths executed in parallel).
Telling somebody to try with non-preempt kernel should be debugging
step, not the solution.

-- 
Aleksandar Milivojevic [email blocked]    Pollard Banknote Limited
Systems Administrator                     1499 Buffalo Place
Tel: (204) 474-2323 ext 276               Winnipeg, MB  R3T 1L7
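
Several posts in the thread turn on which stretches of kernel code the scheduler cannot interrupt. The following Python sketch is a deliberately simplified model of that, not a measurement: the `worst_case_latency` function, the step format, and the microsecond figures are invented for illustration. It assumes that without preemption the scheduler can only run at explicit cond_resched()-style points, and that with preemption it can run anywhere except while a lock is held. Interrupts and every other source of latency are ignored.

```python
def worst_case_latency(path, preempt):
    """Longest stretch (in microseconds) during which the scheduler cannot run.

    `path` is a list of steps, each either ("work", usecs, locked) or
    ("resched",) for an explicit scheduling point outside any lock.
    """
    worst = current = 0
    for step in path:
        if step[0] == "resched":
            current = 0  # explicit cond_resched()-style scheduling point
            continue
        _, usecs, locked = step
        if preempt and not locked:
            current = 0  # preemptible code: the scheduler can run at any time
            continue
        current += usecs  # this stretch cannot be interrupted
        worst = max(worst, current)
    return worst


# Hypothetical code path: 200us unlocked, 500us under a lock, 300us unlocked.
path = [("work", 200, False), ("work", 500, True), ("work", 300, False)]
# The same path with an explicit scheduling point between each stretch.
with_points = [path[0], ("resched",), path[1], ("resched",), path[2]]

print(worst_case_latency(path, preempt=False))         # 1000
print(worst_case_latency(path, preempt=True))          # 500
print(worst_case_latency(with_points, preempt=False))  # 500
```

In this model the 500 microsecond lock hold bounds the latency whether the kernel relies on preemption or on explicit scheduling points, which matches Nick's remark that small lock hold times are still needed with preempt.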

Developer vs User

Anonymous
on
October 6, 2004 - 8:31pm

I think it's important to point out that Jeff Garzik is recommending that developers/testers disable preemption, not users, necessarily.

When you're a kernel developer you want areas of high latency to jump out at you so you can fix them, but if you're a normal user you want precisely the opposite. Preemption results in lower worst-case latencies which is exactly what you want as a user so your MP3s don't skip or your digital recordings aren't filled with gaps and jitter.

agree

Anonymous
on
October 7, 2004 - 6:30am

I agree.

Developers' needs are different from end users'. I will not use a kernel on my workstation or my laptop unless it has low latency enabled.

heath

uh..

Anonymous
on
October 7, 2004 - 9:41am

low latency and preemptible are not the same thing.

However, they must also run w

Anonymous
on
October 7, 2004 - 12:17pm

However, they must also run with preemption and/or SMP otherwise they won't catch those kinds of bugs.

Preemption does not improve worst case latency!

Korpo
on
October 7, 2004 - 11:34pm

Preemption does not improve worst case latency, since it cannot preempt code paths under SMP-safe locks (where preemption is disabled, too, to protect kernel data structures). Bad luck: some of the longest code paths in system calls take place under those locks. So average latency improves (which gives better responsiveness overall), but the realtime-relevant worst case latency does not improve.

This only changes with the low-latency patches or the TimeSys preemption extensions (which bind locks to the resources they protect, so a single resource lock cannot block all preemption).

BTW, this means preemption hides only one "bug": long system call code paths not under lock. Long system call paths under lock (which hurt a lot in SMP as well) still stand out, and the performance loss for inefficient code paths is even worse: runtime gets longer, because the inefficient code path will have the scheduler run more often, but its own runtime statistics stay the same - there are no hidden performance problems.

So - in the case of long code paths without locks - we have a single error scenario that can be hidden by preemption, but not really "totally". Profiling will still turn up those problems.

Without preemption low latency patches would not be quite as useful, and vice versa. Both together are a very good way towards low worst case latency.

As seen on the Tree Channel

I just want a kernel that works

Anonymous
on
October 7, 2004 - 12:11pm

Why does windows XP feel so much more responsive and solid on the exact same hardware? Whatever it is, why can't I have that? :)

In Linux it's the developers

Anonymous
on
October 7, 2004 - 12:37pm

In Linux it's the developers that decide what users need, not the other way around. :)

its not so much the kernel

Anonymous
on
October 7, 2004 - 1:25pm

It's not the kernel that's making your linux desktop slow, it's gnome, kde, pango, and all that gui shit. If you run linux as a server, it's just as fast as win2k3, if not faster at most things. I mean look at MAC OS X, that runs on top of a modified freebsd kernel, I mean look how fast that is, and it keeps getting faster and more responsive every release. Pango is the biggest problem of them all, it uses unicode which uses lots of bytes, so it takes the CPU longer to calculate all those bytes just to put a fricken character on the screen...

-----Johnny Q =)

Do not confuse CPU overload d

florin
on
October 7, 2004 - 2:12pm

Do not confuse CPU overload due to too many apps running, and latency problems. These are very different matters.
There are some patches made by Ingo Molnar which reduce the latency in a spectacular way, even when running all those apps you mention. They're just not in mainstream yet.

MacOSX fast???

Anonymous
on
October 7, 2004 - 3:08pm

What are you smoking? It must be good stuff. Or you have an extremely fast G5...

MacOS X

Anonymous
on
October 7, 2004 - 4:29pm

MacOS X doesn't use a FreeBSD-derived kernel. Mach 3.x does most of the core duties, and services like networking and file I/O are handled by a 4.4BSD layer with lots of imported FreeBSD and NetBSD code. OS X's low latency is a result of a good audio layer (CoreAudio), and the Mach kernel.

Come on, Mac OS X _is_ BSD

Korpo
on
October 7, 2004 - 11:52pm

Mac OS X is, IIRC, a BSD/Mach.

1) It is a microkernel, but only one monolithic server does implement all (or nearly all, I don't bother) services: a modified BSD kernel.
2) Micro kernel and servers run in the same context to avoid additional context switches - there's not that much of distinction.

Now let's look at the evolution of FreeBSD: FreeBSD has incorporated, especially in its virtual memory layer, Mach kernel features. It reincorporated the external calls of the Mach as internal calls, therefore merging a Mach (or part of it), back in the BSD server.

So, if server(s) and kernel run in the same context with MacOS X, it's nothing more than a little bit cleaner interface - it's not something that helps stability, because if the BSD server goes down, it takes everything with it: it simply carries too much system state in one place. Even if Apple separates out some servers for things it implements itself (like sound), every one of these components can still corrupt kernel data structures (same context). And improved performance? Since in BSD there are already kernel threads, where is the improvement by different schedulable entities in the Mach kernel over BSD?

Mach gives little improvement, while introducing little overhead. It's actually a futile exercise, since it does not bring in the real benefits of the microkernel approach, like for example QNX does: Different contexts, and servers are easily restartable (each lives in its own context like a process does, and needs a system call to enter the kernel, too).

Performance and stability do not improve by this approach, but latency might: Servers are preemptible except when altering microkernel state. This is nearly the same as preemption in Linux: System calls can be preempted, but not if changing kernel state (under SMP/preemption lock).

Where is now the _real_ advantage of MacOS X? In its nice audio layer? That's ok with me, but does not bother me at all. Different users, different needs.

As seen on the Tree Channel

FreeBSD versus Mach

Anonymous
on
October 8, 2004 - 12:18pm

FreeBSD's kernel is still much different than Mach. The reason OSX doesn't use it (basically only uses the network stack), is because Cocoa uses features of Mach that would be difficult/inefficient to port to FreeBSD. And, well, FreeBSD doesn't really run reliably on PowerPC yet.

Is Darwin/XNU a microkernel? No, not by a long shot. But it does have features that the free Unix kernels lack, and its performance isn't abysmal. Why not keep it?

OSX and Linux

Anonymous
on
October 8, 2004 - 4:25am

I've worked fairly extensively on Mac hardware and I must say I'm rather impressed how fast Linux runs on them. It is much faster than OSX. On a desktop, that won't matter much since most machines are beefy enough nowadays, but on the older Mac laptops running OSX is not very fun.

I call BS.

Mr_Z
on
October 8, 2004 - 1:36pm

I call BS on Unicode. Windows has pretty deep Unicode integration, as I understand it. I'd say that the additional cost of moving a few text strings around is NOT the source of any interactivity issues.

All the round-trips through the X server to your desktop manager and other such things are a big source of slowdown. I'm no expert in that department, but it seems the basic architecture of X + window manager + environment manager + apps all going over a network interface is flexible but potentially slow. Any little bit of latency in one-way communication translates to huge latencies as you have several round-trips between all the pieces, just to draw a character on the screen.

The problem isn't network transparency

Anonymous
on
October 8, 2004 - 3:49pm

I'm not a kernel expert but from what I understand Unix domain sockets--the mechanism X clients communicate to the server through--are extremely low-overhead. The performance problem with X11 lies elsewhere, probably due to the obsolescent XFAA driver acceleration API among other things.

XFAA?

Anonymous
on
October 11, 2004 - 7:22am

What is XFAA? I've never heard of it. I couldn't even find any info via Google.

XFAA (actually it's XAA)

Anonymous
on
October 11, 2004 - 7:30am

XFree86 Acceleration Architecture

It's the driver acceleration API that was developed for XFree86 that the XOrg X11 distribution shares. It's widely considered to be deficient in a number of ways and inadequate for utilizing modern GPUs effectively.

From what I've read Keith Packard and the Xorg folks are working on a replacement, but it means that all the chipset drivers would have to be ported.

If you work extensively with

Anonymous
on
October 7, 2004 - 1:24pm

If you work extensively with the network, you won't find Windows XP responsive anymore!
Whatever net problem you have, every application just freezes and you have to wait until the network comes back again.
So don't compare real systems like any unix flavor with some crappy code from Redmond.

Damn Straight

Anonymous
on
October 7, 2004 - 1:27pm

Damn straight, u proved my above point

----Johnny Q =)

re: i just want a kernel that works

Anonymous
on
October 7, 2004 - 1:55pm

The reason your Windows XP feels so responsive is that MS has built all the graphics functions directly into the kernel! This hack does dramatically improve the feeling of quick response, at a horrendous cost of ugly design, instability, and security weaknesses. As another poster has mentioned, it can be achieved without resorting to this hack, as Mac OS-X demonstrates. FOSS desktops will get better with time. There's no corporate daddy demanding we fix this right now, so he can sell more units. So, it's in the kind hands of volunteer developers to do it. They will do it and will do it right! Maybe you'd like to volunteer to help GNOME or others get there.

XP is More Responsive?!

Anonymous
on
October 7, 2004 - 8:17pm

A recent version of Windows XP is the only version of Windows I own right now and, while I bought it for development of Windows software, I otherwise hardly touch it. I was shocked at how unresponsive it is. I also work a lot with MacOS X on Dual G5s and believe me, it's about as responsive as a 500 pound woman in a 5 gallon bucket (heavily re-inforced bucket). Gentoo GNU/Linux on the same hardware responds like lightning--I really cannot exaggerate here.

The real issue here is that you cannot generalize GNU/Linux desktops, because each is put together differently. SuSE's KDE, for example, is reasonable on most hardware. Mandrake's is noticeably slower but has more in it. Knoppix runs off a CD and, once loaded, runs surprisingly well considering. Lycoris Desktop/LX responds like a rabbit with its ass on fire--even on pretty low-end hardware. Yet it's not unreasonably lacking in KDE features.

Having compiled numerous KDE desktop systems on x86-32/64 and ppc-32/64 systems, I can tell you outright: responsiveness (and overall performance) varies *greatly* on (1) how you build KDE (and Debian does it very well while RH does it noticeably less well), (2) how your disk drivers/partitioning are setup, and (3) how you build your kernel--pre-emption does help some.

Sadly, Gentoo's use flags and ebuilds don't facilitate well enough to match Debian per se.. But since you do have the source, it's reasonably easy to dig in deeper than what Gentoo more formally facilitates.

Windows XP is more responsive?? The answer can be yes, but is generally no. Remember, GNU/Linux is a highly versatile set of components to build a working operating system--virtually all comparisons therefore depend greatly on how it's built and put together.

Matthew

I have to agree

Hiryu
on
October 7, 2004 - 11:04pm

KDE3 (on Debian and SuSE) was a lot more responsive on my old computer than XP was on my girlfriend's faster laptop. So was the Windows 2000 Professional I was dual booting with on my older/slower machine (and Debian/SuSE with KDE3 was also more responsive than Windows 2000 on the same hardware and still is on my several times faster new machine). She dual boots with SuSE and XP, and SuSE is mostly a lot faster/more responsive on the same hardware for her.

There's a few areas where KDE (and gnome?) can appear slower. Specifically redraw is the area I notice most. I can't really see it so much on my current system since it's so much faster, but text redraw was noticeable when dragging a window all over my desktop under Linux/FreeBSD w/KDE&Gnome where it wasn't noticeable with Windows 2000.
Issues like these are being specifically addressed by the X.org devs right now.

Someone mentioned work that Ingo Molnar was doing on the kernel side of things to improve latency. He's probably working on several things but one of his projects is voluntary preemption: http://people.redhat.com/~mingo/voluntary-preempt/

You can find some more of his stuff here: http://people.redhat.com/~mingo/

move != redraw

Ano Nymous
on
October 8, 2004 - 4:30pm

The thing seems to be that MSWindows doesn't actually redraw the windows when moving them, the program doesn't get any events anymore when it's moved, so it's more like taking a snapshot and moving that around, and then continuing the program. In X the program keeps running and redrawing all the time when being moved. So it should be no wonder that moving windows in MSWindows feels "snappier"...

Another reason why MSWindows may seem snappier is because all its graphical programs, mainly (Internet) Explorer, are preloaded at startup (that's why the harddisk seems to be killed at each boot..). That stops working if you don't have plenty of RAM though (256 was minimum or not enough for XP?).

I think the preloading of com

Anonymous
on
October 9, 2004 - 2:53pm

I think the preloading of components like IE is the only reason why XP seems more responsive sometimes (if you have a very fast machine with lots of RAM). Plus Konqueror takes a long time to load into memory - it's one thing that KDE should preload.

Slow Redraws With X11 Desktops

Anonymous
on
October 9, 2004 - 7:34pm


There's a few areas where KDE (and gnome?) can appear slower. Specifically redraw is the area I notice most. I can't really see it so much on my current system since it's so much faster, but text redraw was noticeable when dragging a window all over my desktop under Linux/FreeBSD w/KDE&Gnome where it wasn't noticeable with Windows 2000.
Issues like these are being specifically addressed by the X.org devs right now.


Yeah, that's a historical X11 problem dating back to the days when machines had less than a few megabytes total RAM and dozens of users. The problem is that the X11 server doesn't bother "remembering" the pixels that compose the window. Remembering all those pixels consumes too much memory. Instead, when X11 notices a region of a window has been dirtied it sends the application an Expose event. The application then has to redraw all the dirtied pixels. That means lots of wasted CPU cycles and noticeable redraw latency.


There are a couple of solutions. The most recent is X.org's Composite extension. Install the X.org server and you'll notice that dragging windows around is quick, just like MacOS X and Windows XP. The X.org server is storing the pixels that compose the window in offscreen memory. The X.org server redraws windows from the offscreen memory rather than sending Expose events. Feels very fast.
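To make the difference concrete, here is a toy Python model of the two approaches described above. It is only a sketch: it does not use Xlib or mirror the X.org server's real data structures, and every name in it (`ToyWindow`, `ExposeServer`, `CachingServer`, `drag_over`) is made up for illustration. It simply counts how often the application is asked to repaint while something is dragged across its window.

```python
# Toy model only: hypothetical names, not the real X11 or Composite API.

class ToyWindow:
    """A window whose content the application can repaint on request."""

    def __init__(self, name, width, height):
        self.name = name
        self.width = width
        self.height = height
        self.app_repaints = 0  # how many times the application had to redraw

    def paint(self):
        """Application-side redraw: returns a fresh grid of 'pixels'."""
        self.app_repaints += 1
        return [[self.name] * self.width for _ in range(self.height)]


class ExposeServer:
    """Keeps no copy of window contents; every damage costs an app repaint."""

    def show(self, window):
        return window.paint()

    def uncover(self, window):
        # Stand-in for sending an Expose event: the application must redraw.
        return window.paint()


class CachingServer:
    """Keeps each window's pixels offscreen and reuses them on damage."""

    def __init__(self):
        self.offscreen = {}

    def show(self, window):
        self.offscreen[window.name] = window.paint()
        return self.offscreen[window.name]

    def uncover(self, window):
        # Redraw from the stored pixels; the application is not involved.
        return self.offscreen[window.name]


def drag_over(server, window, steps):
    """Simulate something being dragged across `window` for `steps` frames."""
    server.show(window)
    for _ in range(steps):
        server.uncover(window)
    return window.app_repaints
```

With 50 drag steps, the Expose-style server makes the application repaint 51 times (once to show, once per step), while the caching server asks for a single repaint. The cost is the memory held in `offscreen`, which is the trade-off mentioned above.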


The last trick to get lightning speed is actually a perception hack. Keith and Jim are working on it for the next release of X.org. The perception is that redrawing seems to be slower if you notice flicker. They are implementing a double buffer mode into Composite so you'll never even see the pixels redraw.
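The flicker point can be illustrated with a generic double-buffering sketch in Python. This is the textbook technique, not the Composite implementation mentioned above, and the function names are hypothetical. A "viewer" samples the visible buffer after every pixel write and checks whether it ever sees a half-drawn frame.

```python
# Generic double-buffering sketch; names are illustrative assumptions.

def draw_frame(buffer, colour, on_pixel_written=None):
    """Fill `buffer` one pixel at a time, calling the hook after each write."""
    for i in range(len(buffer)):
        buffer[i] = colour
        if on_pixel_written:
            on_pixel_written()


def single_buffered(size, old, new):
    """Draw straight into the visible buffer; record what a viewer could see."""
    visible = [old] * size
    seen = []
    draw_frame(visible, new, lambda: seen.append(tuple(visible)))
    return seen


def double_buffered(size, old, new):
    """Draw into a hidden back buffer, then swap it in as one step."""
    visible = [old] * size
    back = [old] * size
    seen = []
    # While drawing, the viewer still only sees the untouched visible buffer.
    draw_frame(back, new, lambda: seen.append(tuple(visible)))
    visible, back = back, visible  # the swap: the finished frame appears at once
    seen.append(tuple(visible))
    return seen


def has_torn_frame(seen):
    """True if any observed frame mixes old and new pixels."""
    return any(len(set(frame)) > 1 for frame in seen)
```

For a buffer of two or more pixels, `has_torn_frame(single_buffered(3, "old", "new"))` is true because the viewer catches partially drawn frames, while the double-buffered version only ever shows the complete old frame or the complete new one.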


I get the impression you already knew this but I'm filling in the details for other people who might not have known ;-)


-- nathanh

GUI stuff

Anonymous
on
October 8, 2004 - 4:27am

It is because the GUI runs in kernel space and has all sorts of advantages. (Compare with old-style NT, which used to keep the GUI out of the kernel the way Linux does because that is more secure; it was also sluggish under load.) Linux cannot use that model right away, since it's not very Unix-ish and has all sorts of security problems, but there might be other ways for the kernel to give special attention to X. I hope some kernel hackers look into that.

Windows XP Snappiness vs Linux

Anonymous
on
October 8, 2004 - 5:54am

Well, you need to qualify the Linux that you are using. At home, I'm still running RedHat 9 with Ximian Desktop 2. I use the binary NVidia drivers for my 3D acceleration and that's fast. However, my normal desktop setup is sluggish in comparison to what I expect/would like, though it's definitely acceptable. Nevertheless, when I move windows around the screen, the repaints/updates to the background produce a shadow effect. The primary reason for this is that the GNOME 2.2 desktop doesn't employ faster double-buffering. By the way, my box is a Dual 1 GHz Pentium III with 1 GB of RAM (PCI 133).

With that said, I popped Gnoppix 0.8 into my Dell 2.26 GHz Pentium 4 w/256 MB RAM system at work and the video was very smooth. Not a single window that I moved around the screen produced a shadow effect. Menu items popped up quickly. Everything was very snappy. Gnoppix uses GNOME 2.6 and, I believe, the X.org server (Note: GNOME 2.8 is available now). X.org may have some enhancements that XFree86 does not possess, and GNOME 2.6 and later may exploit these X.org enhancements.

Note: My Windows XP on the same Dell hardware is NOT as snappy. XP suffers from the same behavior as my home machine. When I try to move windows around very quickly, either the movement is jerky or I see a window shadow! So, different experiences for different folks.

Anyway, the point is that a number of factors could affect your experience. However, take heart. Great Linux desktop performance is here in the form of the latest GNOME, or KDE, with the new X.org server, and freedesktop.org is overseeing all of these desktop improvements.

RedHat

Mr_Z
on
October 8, 2004 - 1:47pm

I've noticed that every RedHat release seems WAY slower than the previous, at least in terms of user responsiveness. It appears others in this thread have noticed the same thing.

I'm going to give VectorLinux a try on my aging 450MHz/128MB laptop, and on my 300MHz desktop box. RH9 is unusable on both--even just using a handful of gnome-terminals--and yet Win98 is just fine. Even RH6 is fine. Considering I started with Linux on a 8MB 486DX33, and had no problem running X with a dozen xterms then, I consider RH9 a huge step backwards in many ways.

I just need a more modern distro so that I have a chance of using a modern web browser.

Devs vs Users

Anonymous
on
October 8, 2004 - 1:11am

I agree it would be easier for devs to check their bugs when kernel preemption is disabled. Isn't that already the case?

But avoiding these bugs or "hiding" them for the end user is probably best.

Kernel recompile makes the most difference

Anonymous
on
October 15, 2004 - 2:04am

I have played around with all the CPU kernel options, turning them on and off, and what I have noticed is that preemption w/ the deadline scheduler and optimize for size makes the largest difference. I disable all the other schedulers, and programs execute faster and I can shrink buffer sizes in most audio applications. I also get rid of most of the logging that the system does because it seems to cause a lot of slowdown in Linux multimedia applications. I know the security buffs out there are against this, but I have not had much in the way of crashes so I don't need logging too much anyhow. I also don't use modules, disabling as much as I can. SUSE and Mandrake have like 20 daemons running at startup and that will make your system slower on old hardware. Currently, my computer runs circles around Windows XP on the same hardware. I also compile alsa into my kernel ;)

here are the daemons I run at runlevel 5 in SUSE:
K09cups K09xprint K10sshd K17network

Linux is becoming bloatware ... trying to support everyone and everything ;-)
