Linux: CFS and 3D Gaming

Submitted by Jeremy
on July 31, 2007 - 5:55am

One concern raised about the Completely Fair Scheduler (CFS) was that it might not handle 3D games as well as the Staircase Deadline (SD) scheduler. In a recent thread, Ingo Molnar noted, "people are regularly testing 3D smoothness, and they find CFS good enough and that matches my experience as well (as limited as it may be). In general my impression is that CFS and SD are roughly on par when it comes to 3D smoothness." He noted that all known regressions had been reported against earlier versions of CFS and had long since been fixed, and that he was very interested in any new reports of regressions against the current version of the code: "what is more interesting (to me) is not the positive CFS feedback but negative CFS feedback (although positive feedback certain _feels_ good so don't hold it back intentionally ;-)," adding, "there are no open 3D related regressions for CFS at the moment." Ingo then offered benchmarks of 3D performance under load, with numbers showing CFS performing as well as, and in some cases considerably better than, the SD scheduler.

Linus Torvalds noted, "I don't think _any_ scheduler is perfect, and almost all of the time, the RightAnswer(tm) ends up being not 'one or the other', but 'somewhere in between'." He noted that he was confident that he'd made the right decision in merging CFS, then added, "but at the same time, no technical decision is ever written in stone. It's all a balancing act. I've replaced the scheduler before, I'm 100% sure we'll replace it again. Schedulers are actually not at all that important in the end: they are a very very small detail in the kernel."


From:	Kasper Sandberg [email blocked]
Subject: SD still better than CFS for 3d   (was Re: 2.6.23-rc1)
Date:	Fri, 27 Jul 2007 13:43:30 +0200

On Sun, 2007-07-22 at 14:04 -0700, Linus Torvalds wrote: 
> Ok, right on time, two weeks afetr 2.6.22, there's a 2.6.23-rc1 out there.
> 
> And it has a *ton* of changes as usual for the merge window, way too much 
> for me to be able to post even just the shortlog or diffstat on the 
> mailing list (but I had many people who wanted to full logs to stay 
> around, so you'll continue to see those being uploaded to kernel.org).
> 
> Lots of architecture updates (for just about all of them - x86[-64], arm, 
> alpha, mips, ia64, powerpc, s390, sh, sparc, um..), lots of driver updates 
> (again, all over - usb, net, dvb, ide, sata, scsi, isdn, infiniband, 
> firewire, i2c, you name it).
> 
> Filesystems, VM, networking, ACPI, it's all there. And virtualization all 
> over the place (kvm, lguest, Xen).
> 
> Notable new things might be the merge of the cfs scheduler, and the UIO 
> driver infrastructure might interest some people.

Im still not so keen about this, Ingo never did get CFS to match SD in
smoothness for 3d applications, where my test subjects are quake(s),
world of warcraft via wine, unreal tournament 2004. And this is despite
many patches he sent me to try and tweak it. As far as im concerned, i
may be forced to unofficially maintain SD for my own systems(allthough
lots in the gaming community is bound to be interrested, as it does make
games lots better)


<snip>


From: Ingo Molnar [email blocked]
Subject: Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Sun, 29 Jul 2007 19:06:41 +0200

* Kasper Sandberg [email blocked] wrote:

> Im still not so keen about this, Ingo never did get CFS to match SD in
> smoothness for 3d applications, where my test subjects are quake(s),
> world of warcraft via wine, unreal tournament 2004. [...]

here's an update: checking whether Wine could be a factor in your
problem i just tested latest CFS against latest SD with a 3D game
running under Wine: v2.6.22-ck1 versus v2.6.22-cfsv19 (to get the most
comparable kernel), using Quake 3 Arena Demo under Wine (0.9.41).

Here are the results in a pretty graph:

   http://people.redhat.com/mingo/misc/cfs-vs-sd-wine-quake.jpg

or, in text:

          2.6.22-ck1                     2.6.22-cfs-v19
   ------------------------         ------------------------
   quake + 0 loops | 41 fps         quake + 0 loops | 41 fps
   quake + 1 loop  |  3 fps         quake + 1 loop  | 41 fps
   quake + 2 loops |  2 fps         quake + 2 loops | 32 fps
   quake + 3 loops |  1 fps         quake + 3 loops | 24 fps
   quake + 4 loops |  0 fps         quake + 4 loops | 20 fps
   quake + 5 loops |  0 fps         quake + 5 loops | 16 fps

Quake3-under-Wine behavior under SD/-ck: framerate breaks down massively
during any kind of load. The game is completely unusable with 1 CPU loop
running already!

Quake3-under-Wine behavior under CFS: framerate goes down gently with
load, gameplay remains smooth. Framerate is still pretty acceptable and
the game is playable even with a 500% CPU overload. The graph looks good
and the framerate reduction goes roughly along the expected 1/n
'fairness curve' - so it all looks pretty healthy. [Note: quake3 keeps
its fully 41 fps even with 1 competing loop running on the CPU due to
"sleeper fairness".]

[ i've re-tested this using other SD and ck versions and other CFS
  versions such as v2.6.23-rc1 and the results are the same. To get the
  fps result i started a simple game scene: Single Player / Q3DM1 / I
  Can Win, turned on the fps display of Quake3, and did not move the
  player at all, just looked at the framerate that is displayed. (i also
  tried other scenes and other gameplay sections and they all behave
  consistently with the above results.) The system was otherwise
  completely idle. While i trust these numbers take them with a grain of
  salt, i'm obviously not neutral in this thing :-) ]

so Kasper, i'll definitely need your help in tracking down your 3D
smoothness problem under CFS. I have the feeling that it could be some
odd factor that only hits your system, and once we've tracked that down
there will be a simple solution that does not affect the totality of the
scheduler. So far only you have reported any 3D game smoothness problem
against recent CFS versions. (all 3D feedback has been positive, and
that includes a number of gamers as well. Most of the 3D smoothness
problems were fixed in CFS v13..v15 and it has not been reported to have
regressed since then.)

	Ingo

From: Ingo Molnar [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Sun, 29 Jul 2007 22:47:16 +0200

* John [email blocked] wrote:

> Ingo-
>
> Why not perform the same test using the native linux Q3 client to
> compare numbers to wine? [...]

I regularly test native Linux games on CFS, and they all behave well.
While waiting for more detailed data from Kasper i was looking for
atypical stuff in Kasper's description about what his workload involves,
and what looked a bit atypical was that Kasper's workload also involved
gaming under Wine:

> > > my test subjects are quake(s), world of warcraft via wine, unreal
> > > tournament 2004. [...]

and Wine is a more complex server/client scenario instead of a single
(and simple) standalone Quake3 binary that the Linux binary does. So it
looked more interesting from a scheduler workload (and scheduler
regression) POV. In any case i'll need more info from Kasper.

	Ingo

From: Ingo Molnar [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Mon, 30 Jul 2007 13:46:49 +0200

* John [email blocked] wrote:

> On 7/29/07, Ingo Molnar [email blocked] wrote:
> >
> >
> > * John [email blocked] wrote:
> >
> > > Ingo-
> > >
> > > Why not perform the same test using the native linux Q3 client to
> > > compare numbers to wine? [...]
> >
> > I regularly test native Linux games on CFS, and they all behave well.
> > While waiting for more detailed data from Kasper i was looking for
> > atypical stuff in Kasper's description about what his workload involves,
> > and what looked a bit atypical was that Kasper's workload also involved
> > gaming under Wine:
>
> I understand that, I was just wondering if the FPS scales the same
> natively vs. Wine as I typically only run native games. [...]

people are regularly testing 3D smoothness, and they find CFS good
enough:

   http://bhhdoa.org.au/pipermail/ck/2007-June/007816.html

and that matches my experience as well (as limited as it may be). In
general my impression is that CFS and SD are roughly on par when it
comes to 3D smoothness.

The Wine+Quake3 numbers i posted yesterday are so bad under SD that they
must be some artifact in SD (possibly related to yield - i've strace-ed
the tasks under SD today and they are blocking in yield), so they are
not really representative of the general quality of SD (unless you are
being hit by that particular regression). Still it is kind of ironic
that when i tried to find a 3D regression in CFS i found a 3D regression
in SD.

What is more interesting (to me) is not the positive CFS feedback but
negative CFS feedback (although positive feedback certain _feels_ good
so dont hold it back intentionally ;-), and i cannot possibly give you
any definitive answer: at this point CFS could still have artifacts and
bugs, so "check and see yourself" is the best answer. All i can tell you
is that there are no open 3D related regressions for CFS at the moment.

> [...] I have been hesitant to move over to CFS due to reports of 3D
> issues and wanted to see if you had numbers in regards to CFS vs. SD.

i have no numbers now, other than the trivial native 'ppracer' game
where SD and CFS have roughly the same framerate under load:

         SD               CFS
      0: 38.1          0: 38.1
      1: 24.0          1: 24.2
      2: 16.6          2: 16.1
      3: 11.9          3: 12.3
      4:  9.9          4:  9.7
      5:  8.2          5:  8.1

which i'd have expected, ppracer is quite CPU-intense on my test-system,
and the fairness model of SD and CFS is similar for CPU-bound tasks.

But ... numbers from _me_ are suspect by definition, i wrote a good
chunk of the CFS code :-) So it would be much more interesting if others
provided more numbers.

Would you be interested in trying CFS and doing some numers perhaps? It
requires some work: you have to start up your favorite game in a way
that gives a reliable framerate number. (many games allow the display of
FPS in-game) In Quake3 i simply started the game and did not move the
player - that is something easy to reproduce.

then create load the following way, by entering this into a shell:

   while :; do :; done &

that will cause a shell to just loop infinitely, hogging the CPU. This
is the "1 loop" case in the numbers i posted. Start several of them to
get more. (Type 'killall bash' in the same terminal to get rid of them.)

Monitor how the FPS of your game changes when you start more and more
CPU hogs, and note the numbers. Repeat it under SD and CFS as well, and
please post the results into this thread.

and note that CPU hogs are just one type of 'load' that a system can
experience - IO load or networking load could impact your in-game
experience just as much. If you see any artifact or FPS reduction under
CFS i'll give you further info about how to debug it (were you
interested in debugging it).

	Ingo

From: Kenneth Prugh [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Mon, 30 Jul 2007 13:54:59 -0400

Ingo Molnar wrote:
> <large snip>

Hello, I have a gaming rig and would love to help benchmark with my
copy of UT2004(E6600 Core2 and a 7950GTO card). Or if you have
anything else that would better serve as a benchmark I could grab it
and try.

The only problem is I don't know what 2 kernels I should be using to
test the schedulers. I assume 2.6.23-rc1 for CFS, but what about SD?

-- 
Kenneth Prugh

From: Ingo Molnar [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Mon, 30 Jul 2007 21:10:29 +0200

* Kenneth Prugh [email blocked] wrote:

> Ingo Molnar wrote:
> > <large snip>
>
> Hello, I have a gaming rig and would love to help benchmark with my
> copy of UT2004(E6600 Core2 and a 7950GTO card). Or if you have
> anything else that would better serve as a benchmark I could grab it
> and try.
>
> The only problem is I don't know what 2 kernels I should be using to
> test the schedulers. I assume 2.6.23-rc1 for CFS, but what about SD?

.22-ck1 includes it, so that should be fine:

  http://ussg.iu.edu/hypermail/linux/kernel/0707.1/0318.html

	Ingo

From: Kenneth Prugh [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Mon, 30 Jul 2007 17:24:27 -0400

Ingo Molnar wrote:
> * Kenneth Prugh [email blocked] wrote:
>
>> Ingo Molnar wrote:
>>> <large snip>
>> Hello, I have a gaming rig and would love to help benchmark with my
>> copy of UT2004(E6600 Core2 and a 7950GTO card). Or if you have
>> anything else that would better serve as a benchmark I could grab it
>> and try.
>>
>> The only problem is I don't know what 2 kernels I should be using to
>> test the schedulers. I assume 2.6.23-rc1 for CFS, but what about SD?
>
> .22-ck1 includes it, so that should be fine:
>
>   http://ussg.iu.edu/hypermail/linux/kernel/0707.1/0318.html
>
> 	Ingo
>

Alright,

Just got done with some testing of UT2004 between 2.6.23-rc1 CFS and
2.6.22-ck1 SD. This series of tests was run by spawning in a map while
not moving at all and always facing the same direction, while slowing
increasing the number of loops.

CFS generally seemed a lot smoother as the load increased, while SD
broke down to a highly unstable fps count that fluctuated massively
around the third loop. Seems like I will stick to CFS for gaming now.

Below you will find the results of my test with the average number of
FPS.

            CFS               |            SD
UT2004 +  0 loops | 200 FPS      UT2004 +  0 loops | 190 FPS
UT2004 +  1 loops | 195 FPS      UT2004 +  1 loops | 190 FPS
UT2004 +  2 loops | 190 FPS      UT2004 +  2 loops | 190 FPS
UT2004 +  3 loops | 189 FPS      UT2004 +  3 loops | 136 FPS
UT2004 +  4 loops | 150 FPS      UT2004 +  4 loops | 137 FPS
UT2004 +  5 loops | 145 FPS      UT2004 +  5 loops | 136 FPS
UT2004 +  6 loops | 145 FPS      UT2004 +  6 loops | 105 FPS
UT2004 +  7 loops | 118 FPS      UT2004 +  7 loops | 104 FPS
UT2004 +  8 loops |  97 FPS      UT2004 +  8 loops | 104 FPS
UT2004 +  9 loops |  94 FPS      UT2004 +  9 loops |  89 FPS
UT2004 + 10 loops |  92 FPS      UT2004 + 10 loops |  91 FPS

-- 
Kenneth Prugh

From: Miguel Figueiredo [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Mon, 30 Jul 2007 22:34:21 +0100

can you apply the patch [1] that changes the behaviour of sched_yield on SD
and report the results?

SD should scale a lot better after the patch.

1 - http://bhhdoa.org.au/pipermail/ck/2007-July/008297.html

-- 
Com os melhores cumprimentos/Best regards,

Miguel Figueiredo
http://www.DebianPT.org

From: Kenneth Prugh [email blocked]
To: Miguel Figueiredo [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Mon, 30 Jul 2007 18:45:02 -0400

Miguel Figueiredo wrote:
>
> can you apply the patch [1] that changes the behaviour of sched_yield on SD
> and report the results?
>
> SD should scale a lot better after the patch.
>
> 1 - http://bhhdoa.org.au/pipermail/ck/2007-July/008297.html
>

I Applied the patch. SD Seemed a bit smoother over the loads, although
that could be a placebo effect. It wasn't until the 8 or 9th loop
running that I could really notice that the fps were fluctuating in the
map without looking at the fps counter.

SD-Patched
UT2004 +  0 loops | 202 FPS
UT2004 +  1 loops | 201 FPS
UT2004 +  2 loops | 199 FPS
UT2004 +  3 loops | 143 FPS
UT2004 +  4 loops | 145 FPS
UT2004 +  5 loops | 145 FPS
UT2004 +  6 loops | 112 FPS
UT2004 +  7 loops | 110 FPS
UT2004 +  8 loops | 108 FPS
UT2004 +  9 loops |  90 FPS
UT2004 + 10 loops |  89 FPS

-- 
Kenneth Prugh - Ken69267
Gentoo AMD64 Arch Tester

From: Ingo Molnar [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Tue, 31 Jul 2007 11:45:26 +0200

* Kenneth Prugh wrote:

> Alright, Just got done with some testing of UT2004 between 2.6.23-rc1
> CFS and 2.6.22-ck1 SD. This series of tests was run by spawning in a
> map while not moving at all and always facing the same direction,
> while slowing increasing the number of loops.
>
> CFS generally seemed a lot smoother as the load increased, while SD
> broke down to a highly unstable fps count that fluctuated massively
> around the third loop. Seems like I will stick to CFS for gaming now.
>
> Below you will find the results of my test with the average number of
> FPS.

Thanks Kenneth for the testing! I've created a graph out of your
numbers:

   http://people.redhat.com/mingo/misc/cfs-sd-ut2004-perf.jpg

(it also includes the SD numbers you got with the turn-yield-into-NOP
hack applied.)

	Ingo

From: Matthew Hawkins [email blocked]
Subject: Re: [ck] Re: SD still better than CFS for 3d (was Re: 2.6.23-rc1)
Date: Tue, 31 Jul 2007 23:16:09 +1000

On 7/31/07, Ingo Molnar wrote:
> * Kenneth Prugh wrote:
> > CFS generally seemed a lot smoother as the load increased, while SD
> > broke down to a highly unstable fps count that fluctuated massively
> > around the third loop. Seems like I will stick to CFS for gaming now.

My experience was quite similar. I noticed after launching the second
loop that the FPS stuck down to 15 for about 20 seconds, then climbed
back up to 48. After that it went rapidly downhill. This is similar to
other benchmarks I've done of SD versus CFS in the past. At a "normal"
load they're fairly similar but SD breaks down under pressure. The only
other thing of interest is that the -ck kernel had the WM menus appear
in about 3 seconds rather than 5-8 under the other two.

Game: Nexuiz 2.3
OpenGL 2.0 shaders on
Vertex Buffer Objects on
Show FPS on
ultimate quality
1024x768

2.6.23-git
 0 48
 1 48
 2 48
 3 48
 4 40
 5 38
 6 33
 7 28
 8 22
 9 22
10 18

2.6.22.1-ck
 0 48
 1 48
 2 48
 3 12
 4 6
 5 6
 6 5
 7 4
 8 3
 9 3
10 2

2.6.22.1-cfs-v19.1+ckbits [*]
 0 48
 1 48
 2 48
 3 46
 4 45
 5 43
 6 36
 7 32
 8 25
 9 24
10 24

[*] This kernel has the cfq-* and mm-* patches from -ck applied, and the
above-background-load function from pre-SD ck patchsets (or 2.6.23-git)

-- 
Matt

From: Kasper Sandberg [email blocked]
Subject: Re: Linus 2.6.23-rc1
Date: Sat, 28 Jul 2007 11:44:08 +0200

On Fri, 2007-07-27 at 19:35 -0700, Linus Torvalds wrote:
>
> On Sat, 28 Jul 2007, Kasper Sandberg wrote:
> >
> > Im still not so keen about this, Ingo never did get CFS to match SD in
> > smoothness for 3d applications, where my test subjects are quake(s),
> > world of warcraft via wine, unreal tournament 2004. And this is despite
> > many patches he sent me to try and tweak it.
>
> You realize that different people get different behaviour, don't you?
> Maybe not.
Sure.

>
> People who think SD was "perfect" were simply ignoring reality. Sadly,
> that seemed to include Con too, which was one of the main reasons that I
> never ended entertaining the notion of merging SD for very long at all:
> Con ended up arguing against people who reported problems, rather than
> trying to work with them.
Im not saying its perfect, not at all, neither am i saying CFS is bad,
surely CFS is much better than the old one, and i agree with what that
university test you mentioned on kerneltrap says, that CFS and SD is
basically impossible to feel difference in, EXCEPT for 3d under load,
where CFS simply can not compete with SD, theres no but, this is how it
has acted on every system ive tested, and YES, others reported it too,
whether you choose to see it or not. and others people who run games on
linux tells me the exact same thing, and i have had quite a few people
try this.

>
> Andrew also reported an oops in the scheduler when SD was merged into -mm,
> so there were other issues.
And whats the point here? If you are trying to pull the old "Con just
runs away", forget it, its a certainty that he would have put the
required time into fixing whatever issues arise.

>
> > As far as im concerned, i may be forced to unofficially maintain SD for
> > my own systems(allthough lots in the gaming community is bound to be
> > interrested, as it does make games lots better)
>
> You know what? You can do whatever you want to. That's kind of the point
> of open source. Keep people honest by having alternatives.
True that

>
> But the the thing is, if you want to do a good job of doing that, here's a
> big hint: instead of keeping to your isolated world, instead of just
> talking about your own machine and ignoring other peoples machines and
First off, i've personally run tests on many more machines than my own,
i've had lots of people try on their machines, and i've seen totally
unrelated posts to lkml, plus i've seen the experiences people are
writing about on IRC. Frankly, im not just thinking of myself.

> issues and instead of just denying that problems may exist, and instead of
> attacking people who report problems, how about working with them?
As i recall, there was only 1 persons reports that were attacked, and
that was because the person repeatedly reported the EXPECTED behavior as
broken, simply because it was FAIRLY allocating the cpu time, and this
did not meet with the dudes expectations. And it was after multiple
mails he was "attacked"

>
> That was where the SD patches fell down. They didn't have a maintainer
> that I could trust to actually care about any other issues than his own.
You may not have been able to trust Con, but thats because you havent
taken the time to actually really see whats been going on, if you just
read the threads for SD you'd realize that he was more than willing to
maintain it, after all, why do you think he wrote and submitted it? you
think he just wrote it to piss you off by having it merged and leave?

>
> So here's a hint: if you think that your particular graphics card setup is
> the only one that matters, it's not going to be very interesting for
> anybody else.
as explained earlier, its not just my particular setup, but actually
that of alot of people, with lots of different hardware.

>
>
> [ I realize that this comes as a shock to some of the SD people, but I'm
>   told that there was a university group that did some double-blind
>   testing of the different schedulers - old, SD and CFS - and that
>   everybody agreed that both SD and CFS were better than the old, but that
>   there was no significant difference between SD and CFS. You can try
>   asking Thomas Gleixner for more details. ]
>
> I'm happy that SD was perfect for you. It wasn't for others, and it had
> nobody who was even interested in trying to solve those issues.
>
> As a long-term maintainer, trust me, I know what matters. And a person who
> can actually be bothered to follow up on problem reports is a *hell* of a
> lot more important than one who just argues with reporters.
Okay, i wasnt going to ask, but ill do it anyway, did you even read the
threads about SD? Con was extremely polite to everyone, and he did work
with a multitude of people, you seem to be totally deadlocked into the
ONE incident with a person that was unhappy with SD, simply for being a
fair scheduler.

From: Linus Torvalds [email blocked]
Subject: Re: Linus 2.6.23-rc1
Date: Sat, 28 Jul 2007 10:50:48 -0700 (PDT)

On Sat, 28 Jul 2007, Kasper Sandberg wrote:
>
> First off, i've personally run tests on many more machines than my own,
> i've had lots of people try on their machines, and i've seen totally
> unrelated posts to lkml, plus i've seen the experiences people are
> writing about on IRC. Frankly, im not just thinking of myself.

Ok, good. Has anybody tried to figure out why 3D games seem to be such a
special case?

I know Ingo looked at it, and seemed to think that he found and fixed
something. But it sounds like it's worth a lot more discussion.

> Okay, i wasnt going to ask, but ill do it anyway, did you even read the
> threads about SD?

I don't _ever_ go on specialty mailing lists. I don't read -mm, and I
don't read the -fs mailing lists. I don't think they are interesting.

And I tried to explain why: people who concentrate on one thing tend to
become this self-selecting group that never looks at anything else, and
then rejects outside input from people who hadn't become part of the
"mind meld".

That's what I think I saw - I saw the reactions from where external
people were talking and cc'ing me.

And yes, it's quite possible that I also got a very one-sided picture of
it. I'm not disputing that. Con was also ill for a rather critical
period, which was certainly not helping it all.

> Con was extremely polite to everyone, and he did work
> with a multitude of people, you seem to be totally deadlocked into the
> ONE incident with a person that was unhappy with SD, simply for being a
> fair scheduler.

Hey, maybe that one incident just ended up being a rather big portion of
what I saw. Too bad. That said, the end result (Con's public gripes
about other kernel developers) mostly reinforced my opinion that I did
the right choice.

But maybe you can show a better side of it all. I don't think _any_
scheduler is perfect, and almost all of the time, the RightAnswer(tm)
ends up being not "one or the other", but "somewhere in between".

It's not like we've come to the end of the road: the baseline has just
improved. If you guys can show that SD actually is better at some loads,
without penalizing others, we can (and will) revisit this issue.

So what you should take away from this is that: from what I saw over the
last couple of months, it really wasn't much of a decision. The
difference in how Ingo and Con reacted to peoples reports was pretty
stark. And no, I haven't followed the ck mailing list, and so yes, I
obviously did get just a part of the picture, but the part I got was
pretty damn unambiguous.

But at the same time, no technical decision is ever written in stone.
It's all a balancing act. I've replaced the scheduler before, I'm 100%
sure we'll replace it again. Schedulers are actually not at all that
important in the end: they are a very very small detail in the kernel.

		Linus

3D games and kernel scheduling

Dodger73 (not verified)
on
July 31, 2007 - 10:13am

Hi all,

This is a bit of a shot in the dark (IANAKP, I Am Not A Kernel Programmer), but one of the reasons for the 'scheduler invariance' seen here might be that the games listed here won't put just a whole lot of load on the scheduler - they're mostly single threaded (very few games released before 2006 even actually run sound processing in a second thread) and have a fairly straightforward loop only requiring few switches between processes per frame. The typical PC game loop looks kind of like this:

1. get current input device state
2. run physics, animation, AI
3. run gameplay code (dying, spawning, scoring, etc.)
4. run visibility detection
5. send off everything found in 4. to the 3D API to render
6. kick off new sounds
7. swap buffers and repeat

Only steps 1, 5, and 6 make calls to drivers or system routines that may run in a separate thread (such as the OpenGL driver). Step 5 should by definition try to do as little work as possible before handing the data to the GPU, step 1 should be absolutely minimal, and I can't imagine step 6 running for more than maybe a millisecond before the driver is done and the hardware is working on its own (mileage may vary, as this depends somewhat on the hardware configuration and the game engine, of course).

So, assuming there's not a whole lot running on the machine in addition to the game, could this point to the scheduler not having a whole lot of scheduling to do? A game like Supreme Commander (if it can be run in a meaningful way with Wine or Cedega) would probably be a better test, as it is heavily multithreaded in order to take advantage of multi-core and multi-processor systems. Or maybe running UT2007 in windowed mode twice, with two instances running their own processing and making calls to drivers, in order to increase the load on the scheduler.

Again, I might be completely off target, so bear with me ;)

The scheduler matters

Anonymous (not verified)
on
July 31, 2007 - 10:49am

The scheduler matters because it interrupts the game's process(es) to run other processes. A hard core gamer would probably run on Linux with as few other processes as possible, but many people might want to leave other stuff running (if only because they don't want to configure their system to not run some stuff). Also, newer Linux kernels have parts of their device drivers running as user processes, which also must be scheduled.
If you run a game along with something like a web browser with 20 tabs open (like me) and some of those tabs having things like myspace (full of annoying animated graphics/flash) and gmail (fortified with javascript looping), then you've got lots of places for bad scheduling to go on. And don't forget that X windows is another user level process that's also running, so there are 2 programs whose scheduling matters. If the game's processes or X are scheduled the wrong way there may be times when the game can't update the screen or process input for long enough to be noticed and/or affect game play, all because someone's raining stars effect on myspace is using the processor more than you would desire or maybe folding@home is still doing its calculations and not niced really low.

The scheduler is supposed to

e (not verified)
on
July 31, 2007 - 11:13am

The scheduler is supposed to help performance when _other_ tasks are running

3D gaming and scheduler

Carole Freeman (not verified)
on
December 3, 2007 - 12:23pm

Schedulers are really useful and help everything run smoothly, but hopefully we might soon be able to run 3D games without relying so heavily on one. Schedulers will always be there, but as computers get more powerful, with many cores (and maybe many CPUs) and independent 3D cards (in addition to the graphics card), we might be able to run games just as well on Linux.

CFS was smoother

Anonymous (not verified)
on
July 31, 2007 - 1:17pm


> one of the reasons for the 'scheduler invariance' seen here

You seem to have missed that part of the emails where tester after tester said that CFS offered a much smoother gameplay than SD. That was true even after some SD "yield patch" was posted and applied to the SD scheduler and all 3 "yield modes" were tested through. Each tester found CFS better for 3D gaming.

scheduler tests

Anonymous (not verified)
on
August 3, 2007 - 1:27am

Ingo's test didn't make much sense to me; I get the impression that the scene is static, so that's almost like measuring FPS with the GIMP. Someone should actually play the characters and have someone else peek at the FPS as they run around blasting things. This will cause variation in the FPS, but if the differences between CK and CFS are not that large, why complain that "CFS is not as good"? The test needs to use a dynamic 3D setting, and people can say how many loops they run before they get stuttering on the audio, tearing/skipping on video, etc.

Perhaps another option is to run a 3D demo with a 'fly-through' and watch the FPS. If I remember correctly, the CrystalSpace3D project had such a fly-through a year or two ago.

nobody wants dead patches,

Anonymous (not verified)
on
July 31, 2007 - 10:17am

Nobody wants dead patches, so if you are not going to get any further on this SD vs. CFS issue, you might want to look at preemptive tasking instead, which is another 'huge' issue for desktop applications.

Do you mean the rt kernel

Anonymous (not verified)
on
July 31, 2007 - 1:18pm

Do you mean the rt kernel from Ingo Molnar?

Linus asked the question.

Anonymous (not verified)
on
July 31, 2007 - 3:37pm

It's simple really. Microsoft cheats on 3D for games as well.

Windows uses a lot of biases on single user desktop clients.

There is a bias to active windows on client machine.

Direct X also has a bias.

Gamers don't expect fair. They expect bias, so that the game they are playing runs the best at the cost of everything else. And they don't care if it slows down when they click to another application, because they cannot see it.

This is a question that has not been answered for the Linux kernel. Most likely the true solution to this is somewhere between X11, DirectFB and the kernel. It's the one thing that makes the desktop special.

It has been answered

Anonymous (not verified)
on
July 31, 2007 - 5:17pm

> This is a question that has not been answered for the Linux kernel.

Oh yes it has. The answer is called "nice levels".

I agree, as a gamer I DO

Anonymous (not verified)
on
August 1, 2007 - 6:37am

I agree, as a gamer I DO want bias to my games.

As a Linux user I'm quite capable of biasing games by myself. No bias in the kernel please :)

Not so capable

ccurtis
on
August 4, 2007 - 8:39am

One problem is that you actually can't, as a user, give your game a higher priority - you can only give it a lower priority.

It seems like there should be a third nice 'zone', say -5 to +5 (or even +15), where users can freely adjust priority higher or lower. -6 to -20 would remain superuser-only, and +6 (or +16) to +19 would still require superuser intervention to increase the priority (decrease niceness).

Frankly, this seems like a deficiency to me but it is inherent in the kernel. It is probably specified POSIX behavior, but this makes single-user unix systems 'unfriendly', especially with the default nice level set to the highest user priority. Quake (UT, ...) should be able to say "When I'm running, I'm more important than apache" without requiring root privileges.

Sample code:

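#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

#define END 999

int main(void)
{
        /* As an unprivileged user, the steps that lower the nice value
           (to -5, and from 9 back to 4) are expected to fail. */
        int i, level[] = { -5, 0, 9, 4, END };
        for( i = 0; level[i] != END; i++ )
        {
                printf( "Current nice level: %d\n", getpriority( PRIO_PROCESS, 0 ) );
                printf( "Setting level to %d\n", level[i] );
                if( setpriority( PRIO_PROCESS, 0, level[i] ) != 0 )
                        printf( "  failed: %s\n", strerror( errno ) );
        }
        /* Re-read the level so the last change (or failure) is reflected. */
        printf( "Final nice level: %d\n", getpriority( PRIO_PROCESS, 0 ) );
        return 0;
}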

The solution is there

Anonymous (not verified)
on
August 5, 2007 - 2:55am

If you _really_ think you need to give your games higher priority for them to run as you expect, you could create a SUID program that increases the priority up to a given level, drops privileges and executes the game. This program could be part of whatever distribution decides to include it. For example, it could ship with permissions 4710 and owner root:nice, supposing a "nice" group was created for this purpose. Here is some example code:

#include <sys/time.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MIN_NUM_ARGS	(3)
#define ARG_PRIO	(1)
#define ARG_PROG	(2)

#define PROG_NAME	"niceify"

/* print usage instructions */
void print_usage(void)
{
	fprintf(stderr, "Usage: %s PRIORITY PROGNAME [ARGS]\n", PROG_NAME);
}

int main(int argc, char *argv[])
{
	long prio_l;
	int prio;
	char *endp;

	/* check number of arguments */
	if (argc < MIN_NUM_ARGS) {
		print_usage();
		exit(EXIT_FAILURE);
	}

	/* get command line indicated priority */
	prio_l = strtol(argv[ARG_PRIO], &endp, 0);
	if (endp == argv[ARG_PRIO] || *endp != '\0') {	/* reject empty or partly numeric input */
		print_usage();
		exit(EXIT_FAILURE);
	}
	if (prio_l < PRIO_MIN || prio_l > PRIO_MAX) {
		fprintf(stderr, "ERROR: priority out of range [%d, %d]\n",
			PRIO_MIN, PRIO_MAX);
		exit(EXIT_FAILURE);
	}
	prio = (int) prio_l;

	/* set that priority */
	errno = 0;
	setpriority(PRIO_PROCESS, (int) getpid(), prio);
	if (errno != 0) {
		perror(PROG_NAME);
		exit(EXIT_FAILURE);
	}

	/* drop privileges */
	if (setuid(getuid()) != 0) {
		perror(PROG_NAME);
		exit(EXIT_FAILURE);
	}

	/* exec indicated program */
	execvp(argv[ARG_PROG], argv + ARG_PROG);

	/* error executing program */
	perror(PROG_NAME);
	return EXIT_FAILURE;
}

Hardly

ccurtis
on
August 5, 2007 - 8:32am

I wouldn't call that a solution. Any Windows user can, from the task manager, select one of their tasks and assign it one of about 5 priorities. This requires no root/suid privileges, nor should it. As long as the superuser can selectively (de)prioritize a task beyond a user's control, there should be a zone within which users can adjust their own computing 'niceness' and be able to do so freely (i.e. bidirectionally).

Your code does not allow that, and would let any ('nice' group) user run any task at -20 priority, essentially disabling the system even for root. Of course you can change the code to restrict the nice level, but we already have a program called 'renice' that would allow this if the kernel just natively supported the concept.

Re: Hardly

Anonymous (not verified)
on
August 5, 2007 - 3:07pm

I see where you're going. Let's forget about the above program for one second and analyze what unix in general and Linux in particular let you do nowadays. Today, you can also prioritize your programs from the GUI and select one of several available priorities. This does not require special privileges. Any task is started with the highest available "user priority", which is 0, and the range [0, 19] is available for you to decide your processes' priority. So the situation is pretty similar.

The only thing you can't do is the bidirectional adjustment you mention, because currently you can only decrease the priority of your own processes, but you can't increase it beyond its level at a given moment. Being able to do so would be interesting, there's no denying that. However, after thinking a bit about the matter, I'm not sure it's really important. If you have a high quality scheduler, it will distribute the CPU among processes quite nicely. You may only want to play with nice levels if you have several tasks that require a lot of CPU and you are interested in giving priority to a group of them (usually only one, like a game while you compile or calculate something). In that case you can simply lower the priority of the other task(s). Sure, you can't revert your decision, but is it that critical? I don't think so.

Besides, if it really is critical in your case you can create a small tool like the one above that could work like a restricted "renice", only allowing you to renice your own processes and never above 0. Sure, it's a hack, but it's such an unusual task...

Not Unusual

ccurtis
on
August 5, 2007 - 6:25pm

I don't think it's so unusual. The most obvious case is:

-) Start several background tasks (eg, before you leave work).
-) Realize that you forgot and need to complete [foo] before you leave.
-) (Really, contrive any example where multiple things are running)
-) Make [foo] a higher priority.

You can't. Users, by default, run everything at the highest possible [user] priority. That's the same as running everything at -20 and expecting the scheduler to "do the right thing" when you have no way to tell it that you are physically waiting for [thing a] to complete.

Again, imagine widespread corporate adoption of Linux. Users are not given either root or sudo access. It's extremely restrictive to tell someone that they can make something more nice but never a higher priority. Instead, they can make everything except the high priority thing more nice. Then they're stuck with every new task they run being the highest priority because everything else is reniced. u-g-l-y.

Nothing here is specific to 3D games; it really is a system limitation that this simple adjustability is unavailable to standard users.

Hmmm, you can on Ubuntu

Anonymous (not verified)
on
August 23, 2007 - 10:04am

If you run Ubuntu and are in the admin group, then you can already renice your processes.

1. Open the Gnome System Monitor.
2. Go to the processes tab and locate your task.
3. Right-click and select priority.
4. Increase its priority.

You will get the gksudo dialog box asking you to enter your password. Enter the password and your process will be reniced.

I don't think that is too complicated.

Actually there are other

Anonymous (not verified)
on
July 31, 2007 - 8:35pm

Actually there are other problems with gaming.
Basically the scheduler is important, but responsiveness is more important, along with some bias, as someone mentioned here. For example, OS X and NT (and probably XP) have the same classic behavior known from Unix: 1. more responsiveness in the active window at the cost of a few fps, and 2. the scheduler lets background tasks run slower but smoothly. What matters more to me right now is responsiveness rather than more fps in games.
Has anyone heard this or something similar: "My computer is ultra-super fast, and this new OS lets me play my favourite game at max fps while burning my favourite music to DVD in the background"?
Well, this bothers me most, because the active window gets more responsiveness, so the user thinks the game is actually faster and gets max fps.
The problem is with the background tasks.
Have you ever tried, for example, running K3B to burn a full DVD while playing UT2003 and getting "max" (or near-max) fps?
One thing I'm curious about: if we can actually play a game and burn a DVD, which produces a lot of I/O (and don't forget the active window and its extra responsiveness), what impact can latency have on this? Looking at the latency table, the IDE controller appears to tick faster (latency 0) than the video card (latency 248). So how can we get more responsiveness in the active game window if all the I/O is granted to the DVD burn? And, more curious still, if all the I/O is granted that way, how can someone actually play a game when the CPU is at 100% and more concerned with the DVD than with the active window? :)

I hate to RTFM on you but...

Charles Goodwin (not verified)
on
August 1, 2007 - 9:08am

$ man nice

Been possible for a long, long time.

Yes I know I can use 'nice'

Anonymous (not verified)
on
August 1, 2007 - 10:31am

Yes, I know I can use 'nice' to shuffle the priorities of running programs, yet I think it doesn't matter if the CPU is still at 100% doing the copying, burning or whatever the dvd<->hdd transfer is doing in the background. So let's say that right now I have vol. preempt in the kernel, the CFQ scheduler, and the default latencies (0 for IDE and 248 for VGA): do you think 'nice' will change anything?
Let's take the example. I run K3B. Next I use 'nice' to grant the game the highest possible level. Let's skip the rest of the programs in the background and focus on those two: the game in the front and K3B in the back. What do you think would happen?
Don't forget about preempt, which forces everything to be more responsive no matter what the 'niced' level is.
Let me know if I'm wrong (RTFM on me again ;) but I think it will be worse than before granting the game a higher level.

ionice

Anonymous (not verified)
on
August 1, 2007 - 12:54pm

K3B does not need a lot of CPU time, at least not while it's burning the disc per-se. You could use "ionice" to give K3B (or cdrecord or whatever) more priority for IO operations and make sure it doesn't reach a buffer underrun, while giving the game CPU priority with "nice".

This is more and more

Anonymous (not verified)
on
August 2, 2007 - 1:17am

This is more and more exciting :).
Because as you see:
1. I/O is eating my CPU
2. I can't grant more I/O to the burning app because it's already eating all my CPU
3. I can't use 'nice' to grant more privileges to the burning app, because the burning app itself doesn't need a lot of CPU; instead I need to give more to the game
4. I shouldn't be worried about responsiveness because vol. preempt is already providing this
5. I can't give more I/O to the game using 'ionice' because the game itself doesn't need all the I/O
6. I'm not worried about buffer underrun, since the hardware and the software prevent that right now
So basically, it's a little more complicated.
If I turn off vol. preempt I get no responsiveness, and 'ionice' is needed for both the game and the app.
Let me know if I'm wrong, but I think there's no easy solution without using 'nice', 'ionice', and fiddling with the kernel. :)

I/O eating CPU?

Anonymous (not verified)
on
August 2, 2007 - 3:16pm

> I/O is eating my CPU

I/O should not eat your CPU, unless DMA is not available in your system for some very weird reason. In any case, that would be a technical problem, not the way it's supposed to work normally.

Burning a CD by itself is a mere question of taking data from the hard drive and sending it to the CD recorder, and you don't need the CPU continuously during the data transfer. Only to program the devices. Burning a CD is not a CPU intensive task. I/O operations should not be eating your CPU.

Maybe, only maybe, creating the ISO image requires a bit more CPU and that's what's hurting you. However, K3B has an option to disable creating the image on the fly while sending it to the burner. Instead, it will create the image before, and then send it to the burner. By giving your 3D game more CPU priority, with "nice", the image creation process may take longer, but the burning process itself shouldn't be affected.

Well... Hdparm tells me the

Anonymous (not verified)
on
August 2, 2007 - 5:13pm

Well...
Hdparm tells me DMA is on, the disk works as UDMA5 and the DVD drive works as UDMA4, yet all dvd<->hdd tasks are still eating the CPU with both the old and the new ATA drivers.
No matter what the kernel is, it has been like this all the time.
I don't know what the reason is; it is Intel's ICH4 on an old Gigabyte motherboard.
If I turn DMA off using hdparm, it's worse, so I believe it's working.
Both are using 80-conductor cables. Under XP, DMA is also turned on and it is not eating the CPU.
I know, it's weird ;).
And to make this more bizarre, I have another IDE controller, with an hdd and a cdrw, mounted in a PCI slot. And basically every combination of disks and optical drives gives the same result ;).
Both controllers are fully supported and the drivers are marked as "production ready".
Latest possible kernel installed (configured and compiled with libata, which forces DMA if it is available).
Any ideas? ;)

Oops

Anonymous (not verified)
on
August 2, 2007 - 5:39pm

I'm sorry, I should have put the above in a forum post, because it's no longer a CFS vs SD problem ;).

A couple of questions/suggestions

Anonymous (not verified)
on
August 6, 2007 - 11:22am

Are you sure it's really CPU starvation, and doesn't just feel like it?
"Who" is top blaming for the CPU usage?
You wouldn't be using FUSE (especially NTFS) for the image file, would you? I've seen an NTFS FUSE mount go crazy on large files before; although that WAS for writing, it could be related.

Come on, what is the urge to

Anonymous (not verified)
on
August 3, 2007 - 6:29am

Come on,
what is the urge to play AND at the same time do so many things in the meantime? Especially burning a DVD?
Can't you all spare 10 minutes to finish one thing before playing?

Yeah, multitasking is meant to *allow* that, but do you so dramatically *need* that?

With long-lasting operations (say, several hours encoding videos?): who CARES if it lasts a little bit longer? Or even twice as long?

I care - my computer should work without me noticing

Arne Babenhauserheide (not verified)
on
April 25, 2008 - 6:39pm

I care, because I want my computer to do work for me.

I don't like having to wait for my computer to finish something before being able to do something else.

OK, so I'm burning a DVD right now. All nice and well. But I don't want to wait till it's finished. That would be dead time.

I prefer working on something else, while my computer does the repetitive work.

And now look at Gentoo, where I often have big compiles running while I work. For this I'd really need better RAM paging management, which keeps my user programs in memory whenever possible, so access times to my visible programs are as short as possible (that's a factor which is often overlooked, I think).
