On 3/12/07, Con Kolivas <kernel@kolivas.org> wrote:

I think the issue here is that the scheduler is doing what Con expects it to do, but not what Mike Galbraith feels it should do. Maybe Con and Mike are using different definitions of "interactivity", or at least have different ideas of how it is supposed to be accomplished. Does that sound right?

I've begun using RSDL on my machines here, and so far there haven't been any issues with it, in my opinion. From a feel standpoint, it's not what I would call perfectly smooth, but it is better than the other schedulers I've seen. (In the one case where there are still problems, the issue is I/O contention, not CPU -- and using RSDL has made a surprisingly large impact regardless.)

Mike, do you feel that it should be possible to use the CPU at 100% for some task and still maintain excellent interactivity? (It has always seemed to me that if you wanted interactivity, you had to have the CPU idle at least a couple percent of the time. How large that percentage had to be usually depended on how much preemption you put in the kernel, and which CPU scheduler was in it at the time.)

Considering the concepts put out by projects such as BOINC and SETI@Home, I wouldn't be thoroughly surprised by this ideology, although I do question the particular way this test case is being run. That said, I haven't run the test case myself yet, although I will see if I can get the time to do so soon.

In any case, I personally do have a few qualms about this test case being run on HT virtual cores:

* I am curious about why splitting a task and running the pieces on separate HT virtual cores improves interactivity at all. (If it was Amarok on one virtual CPU and one lame on the other, I would get it. But I see two lame processes here -- wouldn't they just be allocated one to each virtual CPU, leaving Amarok out most of the time? How do you get interactivity with that?)
* Does using HT really fill up the CPU better than having the CPU announce itself as the single core it is? My understanding is that throughput goes down somewhat just from using multiple threads with HT, compared to a single thread on the single core, and why would you use more than one lame thread unless you seek throughput?

* Where are the lame processes encoding to/from? For example, are the results for both being sent to /dev/null? To a hard drive? In a real-world test case, I would imagine a user running TWO lame processes would be encoding from two sources to the same hard drive. (Or they might even both be encoding FROM that same hard drive. Or both.) The need for the single HD to seek so much reduces throughput in most of these cases under HT, IIRC, which would probably defeat the point of this case for most users. Of course, my point is negated if they have multiple drives for their use of lame, and/or if they have sufficient memory and bandwidth to handle the issue, or if encoding throughput isn't their aim.

The only reason I can think of that running two lame processes would improve "interactivity" is that if one process gets stuck on a particular portion, there's a chance the other will be working on an easier portion, making it appear like more is being done. This occurs, for example, with POV-Ray and Blender, where some parts of the image may take more time to render than others because the complexity of a 3D image varies from region to region. In this case, using HT or multiple threads makes it more likely that at least one thread will be working on an "easy" spot, which would increase the number of pixels on screen in e.g. the middle of the render. However, the overall render usually wouldn't speed up on HT. In fact, the whole image may even take slightly longer due to the overhead of the threading, although that overhead is trivial for most users.
(And of course, the overhead would be negligible when you had two or more actual cores, because the increase would be more like 1.85x once you factored in the overhead. HT, by contrast, would give you maybe 0.97x. Or whatever.)

As for the realism of the scenario, a Linux server that is broadcasting/encoding the same source at multiple bit rates (e.g. Internet Radio) might run multiple lame instances... on a single core with HT. Not the most common thing in the world, but there are still quite a few of these guys out there. Enough to be of concern, IMO.

Con has indicated somewhere recently that he has an idea about improving negative nice, among other things -- maybe RSDL isn't capable of handling this test case yet, but it might be very soon. Considering that any process not run at the top nice level ends up with pockets of smoothness followed by pauses of a determinable size, processes like lame that handle bits and pieces as fast as they come, but need to do so at a steady rate, may act peculiarly (not smoothly?) under RSDL.

One last thing I'm not sure about: Mike, are you upset about lame interfering with Amarok, or about lame in and of itself? Which process do you feel is getting too much or too little CPU? Why?

I think Con is saying here that you, Mike, are one of two or so people so far to have given a primarily negative feedback report on RSDL. (I think akpm hit a snag on some PPC box with -mm a while back, IIRC, and then there's you. I'm only on -ck, though, so there may be others I haven't heard about. But these two pale in comparison to the compliments I've seen on-list.) Of course, it's still important that we see WHY it doesn't work well for you, while everyone else is having fewer (or no) issues.

This seems to me like he's saying that there has to be a mechanism (outside of nice) that can be used to treat processes that "I" want to be interactive all special-like.
It feels like something that would have been said about the design of the scheduler that was in -ck and is currently in vanilla. To me, that fundamentally clashes with the design behind RSDL. That said, I could be wrong -- Con appears to have something up his sleeve that could be very promising and could come out sooner or later. Once he's written it, of course.

In any case, RSDL seems very promising, for the most part.

--
Michael Chang
