Published on KernelTrap (http://kerneltrap.org)

Active Merge Windows

By Jeremy
Created May 2 2008 - 16:41

"This is starting to get beyond frustrating for me," complained David Miller of the latest merge window, launching what turned into a very lengthy and ongoing discussion [1] about the Linux kernel development process. The concept of a regular "merge window" was first discussed in July of 2005 with the release of the 2.6.14-rc4 kernel, following the 2005 Developers' Summit. From 2.6.14 on, the release of each official 2.6.y kernel has been followed by a two week period during which major changes are merged into the kernel, followed by a 2.6.y-rc1 release. David complained that this particular merge window has been more painful than others, "the tree breaks every day, and it's becoming an extremely non-fun environment to work in. We need to slow down the merging, we need to review things more, we need people to test their [...] changes!"

During the lengthy discussion, Linux creator Linus Torvalds explained:

"The notion that we should even _try_ to aim to slow things down, that one I find unlikely to be true, and I don't even understand why anybody would find it a logical goal? Of course, you will have fewer new bugs if you have fewer changes. But that's not a goal, that's a tautology and totally uninteresting. A small program is likely to have fewer bugs, but that doesn't make something small 'better' than something large that does more. Similarly, a stagnant development community will introduce new bugs more seldom. But does that make a stagnant one better than a vibrant one? Hell no. So what I'm arguing against here is not that we should aim for worse quality, but I'm arguing against the false dichotomy of believing that quality is incompatible with lots of change."


From: David Miller <davem@...>
Subject: Slow DOWN, please!!!
 [1]Date: Apr 29, 10:03 pm 2008

This is starting to get beyond frustrating for me.

Yesterday, I spent the whole day bisecting boot failures
on my system due to the totally untested linux/bitops.h
optimization, which I fully analyzed and debugged.

Today, I had hoped that I could get some work done of my
own, but that's not the case.

Yet another bootup regression got added within the last 24
hours.

I don't mind fixing the regression or two during the merge
window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS!

The tree breaks every day, and it's becomming an extremely
non-fun environment to work in.

We need to slow down the merging, we need to review things
more, we need people to test their fucking changes!
--

From: Rafael J. Wysocki <rjw@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 3:36 pm 2008

On Wednesday, 30 of April 2008, David Miller wrote:
> 
> This is starting to get beyond frustrating for me.
> 
> Yesterday, I spent the whole day bisecting boot failures
> on my system due to the totally untested linux/bitops.h
> optimization, which I fully analyzed and debugged.
> 
> Today, I had hoped that I could get some work done of my
> own, but that's not the case.
> 
> Yet another bootup regression got added within the last 24
> hours.
> 
> I don't mind fixing the regression or two during the merge
> window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS!
> 
> The tree breaks every day, and it's becomming an extremely
> non-fun environment to work in.
> 
> We need to slow down the merging, we need to review things
> more, we need people to test their fucking changes!

Well, I must say I second that.

I'm not seeing regressions myself this time (well, except for the one
that Jiri fixed), but I did find a few of them during the post-2.6.24
merge window and I wouldn't like to repeat that experience, so to speak.

IMO, the merge window is way too short for actually testing anything. I
rebuild the kernel once or even twice a day and there's no way I can
really test it. I can only check if it breaks right away. And if it
does, there's no time to find out what broke it before the next few
hundreds of commits land on top of that.

Thanks,
Rafael
--

From: Andrew Morton <akpm@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 4:15 pm 2008

On Wed, 30 Apr 2008 21:36:57 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> IMO, the merge window is way too short for actually testing anything.

<jumps up and down>

There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1!

_anything_ which appears in 2.6.x-rc1 and which wasn't in 2.6.x-mm1 was
snuck in too late (OK, apart from trivia and bugfixes).

If we decide that we need to fix the oh-shit-lets-slam-this-in-and-hope
problem then I expect we can do so, via fairly relible means. But the
first attempt at solving it should be to ask people to not do that.
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 4:31 pm 2008

On Wed, 30 Apr 2008, Andrew Morton wrote:
> 
> <jumps up and down>
> 
> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1!

The problem I see with both -mm and linux-next is that they tend to be
better at finding the "physical conflict" kind of issues (ie the merge
itself fails) than the "code looks ok but doesn't actually work" kind of
issue.

Why? The tester base is simply too small.

Now, if *that* could be improved, that would be wonderful, but I'm not
seeing it as very likely.

I think we have fairly good penetration these days with the regular -git
tree, but I think that one is quite frankly a *lot* less scary than -mm or
-next are, and there it has been an absolutely huge boon to get the kernel
into the Fedora test-builds etc (and I _think_ Ubuntu and SuSE also
started something like that).

So I'm very pessimistic about getting a lot of test coverage before -rc1.

Maybe too pessimistic, who knows?

			Linus
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 4:05 pm 2008

On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
> 
> IMO, the merge window is way too short for actually testing anything.

That is largely on purpose.

There's two choices:

 - have a longer and calmer merge window, spread out the joy, and have 
   people test and fix their things during the merge window too. In other 
   words, less black-and-white.

 - Really short merge window, and use the extra time *after* it to fix the 
   issues.

and I've obviously gone for the latter. In fact, I'd personally like to 
make it even shorter, because the problem with the long merge window can 
be summed up very simply:

   Long merge windows don't work - because rather than test more, it just 
   means that people will use them to make more changes!

So one of the major things about the short merge window is that it's 
hopefully encouraging people to have things ready by the time the merge 
window opens, because it's too late to do anything later.

And yes, we could have some other way of enforcing that - allow the merge 
window to be longer, but have some other mechanism to make sure that I 
only merge old code. 

In fact, I'd personally *love* to have a hard rule that says "I will only 
pull from trees that were already 'done' by the time the window opened", 
and we've been kind-of moving in that direction.

But that wish is counteracted by the fact that the merges themselves do 
need some development, so expecting everything to be ready before-hand is 
simply not realistic. 

Also, while I'd like trees to be ready when the window opens, at the same 
time I do think that it's good to spread out some of it, and get *some* 
basic testing - even if it's just a nightly build and a few tens of 
developers.

> I rebuild the kernel once or even twice a day and there's no way I can 
> really test it. I can only check if it breaks right away.

And really, that's all that we'd expect during the merge window. We want 
to find the *obvious* problems - build issues, and the things that hit 
everybody, but let's face it, the subtle ones will take time to find 
regardless.

Then, the short merge window means that we have more time when we really 
don't have big changes going in to find the subtle ones.

(And making the release cycle longer would *not* help - that would just 
make the next merge window more painful, so while it can, and does, work 
for some individual release with particular problems, it's not a solution 
in the long run).

			Linus
--

From: Paul Mackerras <paulus@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 7:29 pm 2008

Linus Torvalds writes:

> So one of the major things about the short merge window is that it's
> hopefully encouraging people to have things ready by the time the merge
> window opens, because it's too late to do anything later.

Having things ready by the time the merge window opens is difficult
when you don't know when the merge window is going to open. OK, after
you release a -rc6 or -rc7, we know it's close, but it could still be
three weeks off at that point. Or it could be tomorrow.

That's mitigated at the moment by having the merge window be two weeks
long. So if you open the merge window at a point where I, or someone
downstream of me, thought we still had two weeks to go, we can hurry
up and try to get stuff finished within the first week and still get
it merged.

But if you made a really hard and fast rule that only stuff that is in
linux-next at the point where the merge window opens can be merged,
AND the point at which the merge window opens is unknown and
unpredictable within a period of about 4 weeks, then that makes it
really tough for those of us downstream of you to plan our work.

By the way, if you do want to make that rule, then there's a really
easy way to do it - just pull linux-next, and make that one pull be
the entire merge window. :) But please give us at least a week's
notice that you're going to do that.

Paul.
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 11:47 pm 2008

On Thu, 1 May 2008, Paul Mackerras wrote:
> 
> Having things ready by the time the merge window opens is difficult
> when you don't know when the merge window is going to open. OK, after
> you release a -rc6 or -rc7, we know it's close, but it could still be
> three weeks off at that point. Or it could be tomorrow.

Well, if the tree is ready, you shouldn't need to care ;)

That said:

> By the way, if you do want to make that rule, then there's a really
> easy way to do it - just pull linux-next, and make that one pull be
> the entire merge window. :) But please give us at least a week's
> notice that you're going to do that.

I'm not going to pull linux-next, because I hate how it gets rebuilt every
time it gets done, so I would basically have to pick one at random, and
then that would be it.

I also do actually try to spread the early pulls out a _bit_, so that
if/when problems happen, there's some amount of information in the fact
that something started showing up between -git2 and -git3.

HOWEVER.

One thing that was discussed when linux-next was starting up was whether I
would maintain a next branch myself, that people could actually depend on
(unlike linux-next, which gets rebuilt).

And while I could do that for really core infrastructure changes, I really
would hate to see something like that become part of the flow - because
I'd hope things that really require it should be so rare that it's not
worth it for me to maintain a separate branch for it.

But there could be some kind of carrot here - maybe I could maintain a
"next" branch myself, not for core infrastructure, but for stuff where the
maintainer says "hey, I'm ready early, you can pull me into 'next'
already".

In other words, it wouldn't be "core infrastructure", it would simply be
stuff that you already know you'd send to me on the first day of the merge
window. And if by maintaining a "next" branch I could encourage people to
go early, _and_ let others perhaps build on it and sort out merge
conflicts (which you can't do well on linux-next, exactly because it's a
bit of a quick-sand and you cannot depend on merging the same order or
even the same base in the end), maybe me having a 'next' branch would be
worth it.

But it would have to be low-maintenance. Something I might open after
-rc4, say, and something where I'd expect people to only ask me to pull
_once_ (because they really are mostly ready, and can sort out the rest
after the merge window), and if they have no open regressions (again, the
"carrot" for good behaviour).

I'm not saying it's a great idea, but if that kind of flow makes sense to
people, maybe it should be on the table as an idea or at least see if it
might work.

But let's see how linux-next works out. Maybe all the subsystem
maintainers can just get their tree in shape, see that it merges in
linux-next, and not even need anything else. Then, when the merge window
opens, if you're ready, just let me know.

			Linus
--

From: Jeff Garzik <jeff@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 12:17 am 2008

Linus Torvalds wrote:
> But there could be some kind of carrot here - maybe I could maintain a
> "next" branch myself, not for core infrastructure, but for stuff where the
> maintainer says "hey, I'm ready early, you can pull me into 'next'
> already".
> 
> In other words, it wouldn't be "core infrastructure", it would simply be
> stuff that you already know you'd send to me on the first day of the merge
> window. And if by maintaining a "next" branch I could encourage people to
> go early, _and_ let others perhaps build on it and sort out merge
> conflicts (which you can't do well on linux-next, exactly because it's a
> bit of a quick-sand and you cannot depend on merging the same order or
> even the same base in the end), maybe me having a 'next' branch would be
> worth it.

linux-next is _supposed_ to be solely the stuff that is ready to be sent
to you upon window-open.

The only thing that isn't reliable are the commit ids -- and that's at
the request of a large majority of maintainers, who noted to Stephen R
that the branch he was pulling from them might get rebased -- thus
necessitating the daily tree regeneration.

So, I think a 'next' branch from you would open cans o worms:

- one more tree to test, and judging from linux-next and -mm it's tough
to get developers to test more than just upstream

- is the value of holy penguin pee great enough to overcome this
another-tree-to-test obstacle?

- opens all the debates about running parallel branches, such as, would
it be better to /branch/ for 2.6.X-rc, and then keep going full steam on
the trunk? After all, the primary logic behind 2.6.X-rc is to only take
bug fixes, theoretically focusing developers more on that task. But now
we are slowly undoing that logic, or at least openly admitting that has
been the reality all along.

	Jeff
--

From: Alan Cox <alan@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 5:17 am 2008

> - opens all the debates about running parallel branches, such as, would
> it be better to /branch/ for 2.6.X-rc, and then keep going full steam on
> the trunk? After all, the primary logic behind 2.6.X-rc is to only take

That encourages developers to continue ignoring that stabilizing work. The
stall does have a side effect of refocussing them.

A branch for -rc and a monthly cycle would be interesting as it would mean
that the pushback for not fixing stability problems would be not getting
you work pulled for the main tree if you didn't fix the bugs first - and
could be both sufficient an incentive and not too vicious as it would be
with a 2 month cycle.

Alan
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 12:46 am 2008

On Thu, 1 May 2008, Jeff Garzik wrote:
> 
> linux-next is _supposed_ to be solely the stuff that is ready to be sent to
> you upon window-open.

Yes, the "stuff" may be supposed to be stable. But the trees feeding it
certainly are not. People are rebasing them etc, and it doesn't matter
because I think linux-next starts largely from scratch next time around.

> So, I think a 'next' branch from you would open cans o worms:
> 
> - one more tree to test, and judging from linux-next and -mm it's tough to get
> developers to test more than just upstream
> 
> - is the value of holy penguin pee great enough to overcome this
> another-tree-to-test obstacle?
> 
> - opens all the debates about running parallel branches, such as, would it be
> better to /branch/ for 2.6.X-rc, and then keep going full steam on the trunk?

I do agree. And maybe I should have made it clear that I think it's worth
it to me only if it then means that the merge window can shrink.

If I'd have both a 'next' branch _and_ a full 2-week merge window, there's
no upside.

Btw, it wouldn't be another tree to test, since it would presumaby be what
'linux-next' starts out from - so it would purely be something that
doesn't have the constant re-merging of the more wild-and-crazy
'linux-next' tree.

			Linus
--

From: Rafael J. Wysocki <rjw@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 4:45 pm 2008

On Wednesday, 30 of April 2008, Linus Torvalds wrote:
> 
> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
> > 
> > IMO, the merge window is way too short for actually testing anything.
> 
> That is largely on purpose.
> 
> There's two choices:

Oh well, I don't think it's really that simple.

>  - have a longer and calmer merge window, spread out the joy, and have 
>    people test and fix their things during the merge window too. In other 
>    words, less black-and-white.
> 
>  - Really short merge window, and use the extra time *after* it to fix the 
>    issues.
> 
> and I've obviously gone for the latter. In fact, I'd personally like to 
> make it even shorter, because the problem with the long merge window can 
> be summed up very simply:
> 
>    Long merge windows don't work - because rather than test more, it just 
>    means that people will use them to make more changes!

And what do you think is happening _after_ the merge window closes, when
we're supposed to be fixing bugs?  People work on new code.  And, in fact, they
have to, if they want to be ready for the next merge window.

> So one of the major things about the short merge window is that it's 
> hopefully encouraging people to have things ready by the time the merge 
> window opens, because it's too late to do anything later.
> 
> And yes, we could have some other way of enforcing that - allow the merge 
> window to be longer, but have some other mechanism to make sure that I 
> only merge old code. 

How about, instead, putting limits on the amount of stuff that's going to be
merged during the next window?

> In fact, I'd personally *love* to have a hard rule that says "I will only 
> pull from trees that were already 'done' by the time the window opened", 
> and we've been kind-of moving in that direction.

Well, and when's the time for fixing bugs?  Surely not during the merge window
and also not after that, because otherwise people won't be ready for the next
merge window with the new code.

> But that wish is counteracted by the fact that the merges themselves do 
> need some development, so expecting everything to be ready before-hand is 
> simply not realistic. 
> 
> Also, while I'd like trees to be ready when the window opens, at the same 
> time I do think that it's good to spread out some of it, and get *some* 
> basic testing - even if it's just a nightly build and a few tens of 
> developers.
> 
> > I rebuild the kernel once or even twice a day and there's no way I can 
> > really test it. I can only check if it breaks right away.
> 
> And really, that's all that we'd expect during the merge window. We want 
> to find the *obvious* problems - build issues, and the things that hit 
> everybody, but let's face it, the subtle ones will take time to find 
> regardless.

Exactly.  Moreover, the code is now being merged at a pace that makes it
physically impossible to review it given the human resources we have.
 
> Then, the short merge window means that we have more time when we really 
> don't have big changes going in to find the subtle ones.

Sorry to say that, but I don't think this is realistic.  What happens after the merge
window is people go and develop new stuff.  They look at the already merged
code only if they have to.  Also, there are a _few_ people testing the kernel
carefully enough to see the more subtle problems, let alone debugging and
fixing them.

> (And making the release cycle longer would *not* help - that would just 
> make the next merge window more painful, so while it can, and does, work 
> for some individual release with particular problems, it's not a solution 
> in the long run).

My point is, given the width of the merge windown, there's too much stuff
going in during it.  As far as I'm concerned, the window can be a week long
or whatever, but let's make fewer commits over a unit of time.

Thanks,
Rafael
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 5:37 pm 2008

On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
> > 
> >    Long merge windows don't work - because rather than test more, it just
> >    means that people will use them to make more changes!
> 
> And what do you think is happening _after_ the merge window closes, when
> we're supposed to be fixing bugs?  People work on new code.  And, in fact, they
> have to, if they want to be ready for the next merge window.

Oh, I agree. But at that point, the issue you brought up - of testing and
then having the code change under you wildly - has at least gone away.

And I think you are missing a big issue:

> Sorry to say that, but I don't think this is realistic.  What happens after the merge
> window is people go and develop new stuff.

From a testing standpoint, the *developers* aren't ever even the main
issue. Yes, we get test coverage that way too, but we should really aim
for getting most of the non-obvious issues from the user community, and
not primarily from developers.

So the whole point of the merge window is *not* to have developers testing
their code during the six subsequent weeks, but to have *users* able to
use -rc1 and report issues!

That's why the distro "testing" trees are so important. And that's why
it's so important that -rc1 be timely.

> My point is, given the width of the merge windown, there's too much stuff
> going in during it.  As far as I'm concerned, the window can be a week long
> or whatever, but let's make fewer commits over a unit of time.

I'm not following that logic.

A single merge will bring in easily thousands of commits. It doesn't
matter if the merge window is a day or a week or two weeks, the merge will
be one event.

And there's no way to avoid the fact that during the merge window, we will
get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
9629 commits).

So your "fewer commits over a unit of time" doesn't make sense.

We have those ten thousand commits. They need to go in. They cannot take
forever.

Ergo, you *will* have a thousand commits a day during the merge window.

We can spread it out a bit (and I do to some degree), but in many ways
that is just going to be more painful. So it's actually easier if we can
get about half of the merges done early, so that people like Andrew then
has at least most of the base set for him by the first few days of the
merge window.

So here's the math: 3,500 commits per month. That's just the *average*
speed, it's sometimes more. And we *cannot* merge them continuously,
because we need to have a stabler period for testing. And remember: those
3,500 commits don't stop happening just because they aren't merged. You
should think of them as a constant pressure.

So 3,500 commits per month, but with a stable period (that is *longer*
than the merge window) that means that the merge window needs to merge
that constant stream of commits *faster* than they happen, so that we can
then have that breather when we try to get users to test it. Let's say
that we have a 1:3 ratio (which is fairly close to what we have), and that
means that we need to merge 3,500 commits in a week.

That's just simple *math*. So when you say "let's make fewer commits over
a unit of time" I can onyl shake my head and wonder what the hell you are
talking about. The merge window _needs_ to do those 3,500 commits per
week. Otherwise they don't get merged!

			Linus
--

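
The arithmetic in the message above can be sketched in a few lines of Python. This is only an illustration and was not posted to the thread: the function name required_merge_rate is hypothetical, a month is assumed to be four weeks, and the 1:3 ratio is read as one week of merging for every three weeks of stabilization.

```python
def required_merge_rate(commits_per_month, merge_weeks, stable_weeks,
                        weeks_per_month=4.0):
    """Commits per week that must land during the merge window so that
    no backlog builds up over a full release cycle.

    Assumes commits keep arriving at a constant rate for the whole cycle
    (merge window plus stabilization period) and that all of them are
    merged inside the window.
    """
    cycle_weeks = merge_weeks + stable_weeks
    # Everything written during the whole cycle has to go in at once.
    commits_per_cycle = commits_per_month * cycle_weeks / weeks_per_month
    return commits_per_cycle / merge_weeks


# The 1:3 ratio from the message: one merge week, three stable weeks.
print(required_merge_rate(3500, merge_weeks=1, stable_weeks=3))
# The same ratio stretched to a two-week window and six stable weeks.
print(required_merge_rate(3500, merge_weeks=2, stable_weeks=6))
```

Under these assumptions both calls give the same weekly figure, since only the ratio between merge time and stable time matters, not the absolute length of the window.
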
From: Rafael J. Wysocki <rjw@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 6:23 pm 2008

On Wednesday, 30 of April 2008, Linus Torvalds wrote:
> 
> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
> > > 
> > >    Long merge windows don't work - because rather than test more, it just
> > >    means that people will use them to make more changes!
> > 
> > And what do you think is happening _after_ the merge window closes, when
> > we're supposed to be fixing bugs?  People work on new code.  And, in fact, they
> > have to, if they want to be ready for the next merge window.
> 
> Oh, I agree. But at that point, the issue you brought up - of testing and
> then having the code change under you wildly - has at least gone away.
> 
> And I think you are missing a big issue:
> 
> > Sorry to say that, but I don't think this is realistic.  What happens after the merge
> > window is people go and develop new stuff.
> 
> From a testing standpoint, the *developers* aren't ever even the main
> issue. Yes, we get test coverage that way too, but we should really aim
> for getting most of the non-obvious issues from the user community, and
> not primarily from developers.
> 
> So the whole point of the merge window is *not* to have developers testing
> their code during the six subsequent weeks, but to have *users* able to
> use -rc1 and report issues!
> 
> That's why the distro "testing" trees are so important. And that's why
> it's so important that -rc1 be timely.

That's correct, but since developers are already working on new code at
that point, the bug reports in fact distract them and make them go back to
the "old" stuff, recall why they did that particular changes etc. As a
result, the developers often do not take the bug reports seriously enough,
especially if they do not finger the "guilty" change. That, in turn, makes
the users believe there's no point in testing and reporting bugs.

> > My point is, given the width of the merge windown, there's too much stuff
> > going in during it.  As far as I'm concerned, the window can be a week long
> > or whatever, but let's make fewer commits over a unit of time.
> 
> I'm not following that logic.
> 
> A single merge will bring in easily thousands of commits. It doesn't
> matter if the merge window is a day or a week or two weeks, the merge will
> be one event.

No, technically it doesn't.

> And there's no way to avoid the fact that during the merge window, we will
> get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
> 9629 commits).

Well, do we _have_ _to_ take that much? I know we _can_, but is this really
necessary?

> So your "fewer commits over a unit of time" doesn't make sense.

Oh, yes it does. Equally well you could say that having brakes in a car
didn't make sense, even if you could drive it as fast as the engine allowed
you to. ;-)

> We have those ten thousand commits. They need to go in. They cannot take
> forever.

But perhaps some of them can wait a bit longer.

> Ergo, you *will* have a thousand commits a day during the merge window.

That's only if you insist on handling everything what people push to you.

> We can spread it out a bit (and I do to some degree), but in many ways
> that is just going to be more painful. So it's actually easier if we can
> get about half of the merges done early, so that people like Andrew then
> has at least most of the base set for him by the first few days of the
> merge window.
> 
> So here's the math: 3,500 commits per month. That's just the *average*
> speed, it's sometimes more. And we *cannot* merge them continuously,
> because we need to have a stabler period for testing. And remember: those
> 3,500 commits don't stop happening just because they aren't merged. You
> should think of them as a constant pressure.
> 
> So 3,500 commits per month, but with a stable period (that is *longer*
> than the merge window) that means that the merge window needs to merge
> that constant stream of commits *faster* than they happen, so that we can
> then have that breather when we try to get users to test it. Let's say
> that we have a 1:3 ratio (which is fairly close to what we have), and that
> means that we need to merge 3,500 commits in a week.
> 
> That's just simple *math*. So when you say "let's make fewer commits over
> a unit of time" I can onyl shake my head and wonder what the hell you are
> talking about. The merge window _needs_ to do those 3,500 commits per
> week. Otherwise they don't get merged!

Surely, they don't, but maybe they don't have to.

You can technically handle merging even more, but what about quality? Do we
have a quality assurance process in place? If we do, what is it? How is it
able to handle the 3500 commits a week? Assuming it is, will it be able to
handle more and what's the limit?

IMO, there has to be a limit somewhere, or we will end up in a spiral
driving everybody mad.

Thanks,
Rafael
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 6:31 pm 2008

On Thu, 1 May 2008, Rafael J. Wysocki wrote:
> 
> > And there's no way to avoid the fact that during the merge window, we will
> > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
> > 9629 commits).
> 
> Well, do we _have_ _to_ take that much? I know we _can_, but is this really
> necessary?

Do you want me to stop merging your code?

Do you think anybody else does?

Any suggestions on how to convince people that their code is not worth
merging?

			Linus
--

From: Willy Tarreau <w@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 6:46 pm 2008

On Wed, Apr 30, 2008 at 03:31:22PM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 1 May 2008, Rafael J. Wysocki wrote:
> > 
> > > And there's no way to avoid the fact that during the merge window, we will
> > > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
> > > 9629 commits).
> > 
> > Well, do we _have_ _to_ take that much? I know we _can_, but is this really
> > necessary?
> 
> Do you want me to stop merging your code?
> 
> Do you think anybody else does?
> 
> Any suggestions on how to convince people that their code is not worth
> merging?

I think you're approaching a solution Linus. If developers take a refusal
as a punishment, maybe you can use that for trees which have too many
unresolved regressions. This would be really unfair to subsystem
maintainers which themselves merge a lot of work, but recursively they may
apply the same principle to their own developers, so that everybody knows
that it's not worth working on next code past a point where too many
regressions are reported.

Willy
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 7:20 pm 2008

On Thu, 1 May 2008, Willy Tarreau wrote:
> >
> > Any suggestions on how to convince people that their code is not worth
> > merging?
>
> I think you're approaching a solution Linus. If developers take a refusal
> as a punishment, maybe you can use that for trees which have too many
> unresolved regressions.

Heh. It's been done. In fact, it's done all the time on a smaller scale.
It's how I've enforced some cleanliness or process issues ("I won't pull
that because it's too ugly"). I see similar messages floating around about
individual patches.

That said, I don't think it really works that well as "the solution": it
works as a small part of the bigger picture, but no, we can't see
punishment as the primary model for encouraging better behaviour.

First off, and maybe this is not true, but I don't think it is a very
healthy way to handle issues in general. I may come off as an opinionated
bastard in discussions like these, and I am, but when it actually comes to
maintaining code, I really prefer a much softer approach.

I want to _trust_ people, and I really don't want to be a "you need to do
'xyz' or else" kind of guy.

So I'll happily say "I can't merge this, because xyz", where 'xyz' is
something that is related to the particular code that is actually merged.
But quite frankly, holding up _unrelated_ fixes, because some other issue
hasn't been resolved, I really try to not do that.

So I'll say "I don't want to merge this, because quite frankly, we've had
enough code for this merge window already, it can wait". That tends to
happen at the end of the merge window, but it's not a threat, it's just me
being tired of the worries of inevitable new issues at the end of the
window.

And I personally feel that this is important to keep people motivated.
Being too stick-oriented isn't healthy.

The other reason I don't believe in the "won't merge until you do 'xyz'"
kind of thing as a main development model is that it traditionally hasn't
worked. People simply disagree, the vendors will take the code that their
customers need, the users will get the tree that works for them, and
saying "I won't merge it" won't help anybody if it's actually useful.

Finally, the people I work with may not be perfect, but most maintainers
are pretty much experts within their own area. At some point you have to
ask yourself: "Could I do better? Would I have the time? Could I find
somebody else to do better?" and not just in a theoretical way. And if the
answer is "no", then at that point, what else can you do?

Yes, we have personalities that clash, and merge problems. And let's face
it, as kernel developers, we aren't exactly a very "cuddly" group of
people. People are opinionated and not afraid to speak their mind. But on
the whole, I think the kernel development community is actually driven a
lot more by _positive_ things than by the stick of "I won't get merged
unless I shape up".

So quite frankly, I'd personally much rather have a process that
encourages people to have so much _pride_ in what they do that they want
it to be seen as being really good (and hopefully then that pride means
that they won't take crap!) than having a chain of fear that trickles
down.

So this is why, for example, I have so strongly encouraged git maintainers
to think of their public trees as "releases". Because I think people act
differently when they *think* of their code as a release than when they
think of it as a random development tree.

I do _not_ want to slow down development by setting some kind of "quality
bar" - but I do believe that we should keep our quality high, not because
of any hoops we need to jump through, but because we take pride in the
thing we do.

[ An example of this: I don't believe code review tends to help much in
  itself, but I *do* believe that the process of doing code review makes
  people more aware of the fact that others are looking at the code they
  produce, and that in turn makes the code often better to start with.

  And I think publicly announced git trees and -mm and linux-next are
  great partly because they end up doing that same thing. I heartily
  encourage submaintainers to always Cc: linux-kernel when they send me a
  "please pull" request - I don't know if anybody else ever really pulls
  that tree, but I do think that it's very healthy to write that message
  and think of it as a publication event. ]

Linus
--

From: Rafael J. Wysocki <rjw@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 8:42 pm 2008

On Thursday, 1 of May 2008, Linus Torvalds wrote:
>
> On Thu, 1 May 2008, Willy Tarreau wrote:
[--snip--]
> I do _not_ want to slow down development by setting some kind of "quality
> bar" - but I do believe that we should keep our quality high, not because
> of any hoops we need to jump through, but because we take pride in the
> thing we do.

Well, we certainly should, but do we always remember about it?  Honest, guv?

> [ An example of this: I don't believe code review tends to help much in
>   itself, but I *do* believe that the process of doing code review makes
>   people more aware of the fact that others are looking at the code they
>   produce, and that in turn makes the code often better to start with.

It may help directly, for example when people realize that they work on
conflicting or just related changes.

>   And I think publicly announced git trees and -mm and linux-next are
>   great partly because they end up doing that same thing. I heartily
>   encourage submaintainers to always Cc: linux-kernel when they send me a
>   "please pull" request - I don't know if anybody else ever really pulls
>   that tree, but I do think that it's very healthy to write that message
>   and think of it as a publication event. ]

I totally agree with that.

Still, the issue at hand is that
(1) The code merged during a merge window is somewhat opaque from the tester's
    point of view and if a regression is found, the only practical means to
    figure out what caused it is to carry out a bisection (which generally is
    unpleasant, to put it lightly).
(2) Many regressions are introduced during merge windows (relative to the
    total amount of code merged they are a few, but the raw numbers are
    significant) and because of (1) the process of removing them is generally
    painful for the affected people.
(3) The suspicion is that the number of regressions introduced during merge
    windows has something to do with the quality of code being below
    expectations, that in turn may be related to the fact that it's being
    developed very rapidly.

My opinion is that we need to solve this issue sooner rather than later and so
the question is how we are going to approach that.

Thanks,
Rafael
--

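[ Editorial sketch, not part of the thread: point (1) in Rafael's message says that bisection is the only practical way to locate a regression among the commits merged in a window. The Python below illustrates the idea as a plain binary search over an ordered commit list. The names bisect_first_bad and is_bad are hypothetical; is_bad stands in for the build, boot and test cycle a tester runs by hand. The sketch assumes a single regression, so that every commit after the first bad one also tests bad, and that the newest commit is known to be bad. ]

```python
from typing import Callable, Sequence, Tuple


def bisect_first_bad(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Return (first bad commit, number of tests run).

    Assumes commits are ordered oldest to newest, the newest one is bad,
    and once a commit is bad every later commit is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one build-and-boot cycle in real life
        if is_bad(commits[mid]):
            hi = mid       # first bad commit is at mid or earlier
        else:
            lo = mid + 1   # first bad commit is after mid
    return commits[lo], tests


if __name__ == "__main__":
    # Hypothetical history: 9629 commits, regression introduced at index 6000.
    history = [f"c{i}" for i in range(9629)]
    culprit, tests = bisect_first_bad(history, lambda c: int(c[1:]) >= 6000)
    print(culprit, tests)
```

Under these assumptions the search needs at most 14 tests for a window of 9629 commits (the figure quoted earlier in the thread for 2.6.24 to 2.6.25-rc1), and each test means building and booting a kernel, which is why testers describe the process as unpleasant.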
From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 9:19 pm 2008

On Thu, 1 May 2008, Rafael J. Wysocki wrote:
>
> > I do _not_ want to slow down development by setting some kind of "quality
> > bar" - but I do believe that we should keep our quality high, not because
> > of any hoops we need to jump through, but because we take pride in the
> > thing we do.
>
> Well, we certainly should, but do we always remember about it?  Honest, guv?

Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like
process generates quality?

And I dislike how people try to conflate "quality" and "merging speed" as
if there was any reason what-so-ever to believe that they are related.

You (and Andrew) have tried to argue that slowing things down results in
better quality, and I simply don't for a moment believe that. I believe
the exact opposite.

The way to get good quality is not to put barriers up in front of
developers, but totally the reverse - by helping them.

And yes, that help can quite possibly be in the form of "process" - by
making things more streamlined, and by having people not have to waste
time on wondering where they should send things etc.

But the notion that we should even _try_ to aim to slow things down, that
one I find unlikely to be true, and I don't even understand why anybody
would find it a logical goal?

Of course, you will have fewer new bugs if you have fewer changes. But
that's not a goal, that's a tautology and totally uninteresting.

A small program is likely to have fewer bugs, but that doesn't make
something small "better" than something large that does more. Similarly, a
stagnant development community will introduce new bugs more seldom. But
does that make a stagnant one better than a vibrant one?

Hell no.

So what I'm arguing against here is not that we should aim for worse
quality, but I'm arguing against the false dichotomy of believing that
quality is incompatible with lots of change.

So if we can get the discussion *away* from the "let's slow things down",
then I'm interested. Because at that point we don't have to fight made-up
arguments about something irrelevant.

Linus
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 9:40 pm 2008

On Wed, 30 Apr 2008, Linus Torvalds wrote:
>
> You (and Andrew) have tried to argue that slowing things down results in
> better quality,

Sorry, not Andrew. DavidN.

Andrew argued the other way (quality->slower), which I also happen to not
necessarily believe in, but that's a separate argument.

Nobody should ever argue against raising quality.

The question could be about "at what cost"? (although I think that's not
necessarily a good argument, since I personally suspect that good quality
code comes from _lowering_ costs, not raising them).

But what's really relevant is "how?"

Now, we do know that open-source code tends to be higher quality (along a
number of metrics) than closed source code, and my argument is that it's
not because of bike-shedding (aka code review), but simply because the
code is out there and available and visible.

And as a result of that, my personal belief is that the best way to raise
quality of code is to distribute it. Yes, as patches for discussion, but
even more so as a part of a cohesive whole - as _merged_ patches!

The thing is, the quality of individual patches isn't what matters! What
matters is the quality of the end result. And people are going to be a lot
more involved in looking at, testing, and working with code that is
merged, rather than code that isn't.

So _my_ answer to the "how do we raise quality" is actually the exact
reverse of what you guys seem to be arguing.

IOW, I argue that the high speed of merging very much is a big part of
what gives us quality in the end. It may result in bugs along the way, but
it also results in fixes, and lots of people looking at the result (and
looking at it in *context*, not just as a patch flying around).

And yes, maybe that sounds counter-intuitive. But hey, people thought open
source was counter-intuitive. I spent years explaining why it should work
at all!

Linus
--

From: David Miller <davem@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 9:51 pm 2008

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 30 Apr 2008 18:40:39 -0700 (PDT)

> IOW, I argue that the high speed of merging very much is a big part of
> what gives us quality in the end. It may result in bugs along the way, but
> it also results in fixes, and lots of people looking at the result (and
> looking at it in *context*, not just as a patch flying around).

This is a huge burden to put on people.

The more broken stuff you merge, the more people are forced to track
these problems down so that they can get their own work done.

It punishes people who do put forth the effort to let new changes
cook properly, before pushing, and thus avoid putting turds into the
tree.

You really have to think about the ramifications of this system.
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 10:01 pm 2008

On Wed, 30 Apr 2008, David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Wed, 30 Apr 2008 18:40:39 -0700 (PDT)
>
> > IOW, I argue that the high speed of merging very much is a big part of
> > what gives us quality in the end. It may result in bugs along the way, but
> > it also results in fixes, and lots of people looking at the result (and
> > looking at it in *context*, not just as a patch flying around).
>
> This is a huge burden to put on people.
>
> The more broken stuff you merge, the more people are forced to track
> these problems down so that they can get their own work done.

I'm not saying we should merge crap.

You can take any argument too far, and clearly it doesn't mean that we
should just accept *anything*, because it will magically be gilded by its
mere inclusion into the kernel. No, I'm not going to argue that.

But I do want to argue against the notion that the only way to raise
quality is to do it before it gets merged. It's often better to merge
early, and fix the issues the merge brings up early too!

Release early, release often. That was the watch-word early in Linux
kernel development, and there was a reason for it. And it _worked_. Did it
mean "release crap, release anything"? No. But it did mean that things got
lots more exposure - even if those "things" were sometimes bugs.

Linus
--

From: David Newall <davidn@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 12:03 am 2008

David Miller wrote:
> We need to slow down the merging, we need to review things
> more, we need people to test their fucking changes!

Yes.  The Linux process is becoming unreliable.  Newly "stable" versions
have stability problems.  The development process looks childish. 
Seasoned developers say not to worry, that the process works.  I do
worry.  BSD seems more attractive, and it may even be worth the
considerable effort to switch my entire client-base.  Linux was lucky to
gain the foothold that it did: traditionally, BSD had a better system
with a less restrictive licence, so it is surprising that manufacturers
chose to go with Linux.  BSD still has a less restrictive licence and
when mainstream press becomes interested in Linux's quality problems
its adoption will fall.  BSD is still a good, maybe even better, option.

Linus, this is your baby and so it's your problem.  Only you have the
influence to change things.
--

From: David Miller <davem@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 12:18 am 2008

From: David Newall <davidn@davidnewall.com>
Date: Wed, 30 Apr 2008 13:33:29 +0930

> Yes.

Please don't use my posting as an opportunity to portray
BSD as the best thing since sliced bread.

We're having ONE bad merge window, we're facing the problem
head on, RIGHT NOW, to prevent it in the future.  It's
not a severe ongoing issue as you portray it to be.
--

From: David Newall <davidn@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 9:04 am 2008

David Miller wrote:
> We're having ONE bad merge window, we're facing the problem
> head on, RIGHT NOW, to prevent it in the future.  It's
> not a severe ongoing issue as you portray it to be.

No.  The problem is more than just a bad merge window.  There is poor or
non-existent review; frequent "regressions"; release of kernels as
stable when they are not.  There is resentment and resistance to even
acknowledging these problems.  Take, as an example, the desire to NOT
record who gives good code and who gives bugs: that one clearly hit a
nerve, which it should not have except from people who feel guilty.

I don't claim BSD to be perfect, but it appears to have a consistently
good quality.  Old Linux kernels also have that; new ones not so.
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 10:51 am 2008

On Wed, 30 Apr 2008, David Newall wrote:
>
> I don't claim BSD to be perfect, but it appears to have a consistently
> good quality.

Lol.

You should try VMS. Now *there* was a stable system. Oh, but it didn't
actually make any progress, did it?

The fact is, we're merging a lot. It comes from having a lot of
development. If you don't want that, then you're a fool - because you
aren't looking at the long term.

> Old Linux kernels also have that; new ones not so.

Can you point to any actual stability problem?

The problem under discussion is the fact that some people are unhappy
because we had some merge trouble. The fact is, the problems got fixed in
a few days. And yes, we will probably have to make Ingo follow the rules
that pretty much everybody else also follows, and no, it's not going to
solve all problems either - the fundamental issue is that we are just too
damn good at development.

And that's not a big problem in my view, as long as we are also able to
handle the _result_ of that flood of patches. Which, quite frankly, we
are.

DavidN, you just have an agenda, and you think that mentioning BSD as some
kind of shining example of goodness is a good way to reach that agenda. It
isn't. It just shows that you don't understand the issue, and that you
think that "threatening" developers by saying you'll switch is a great way
to make PR.

But you know what? I really don't care one _whit_ what you do. You can
switch to Vista for all I care, and I really don't mind. All I care about
is doing a good job technically.

And you just show that you don't have a clue what you are talking about.
If you want a stable kernel, don't follow the current -git tree. Don't
mind the fact that in two weeks we merge

	6672 files changed, 373817 insertions(+), 285901 deletions(-)

and instead look at something like the enterprise kernels or other tree
that lags the development tree by half a year or more exactly _because_
they care about stable, not development.

In short: what do you think the git tree is? Is it something that should
prioritize good development, or is it something that should worry about
you making inane arguments?

Ask yourself that.

Linus
--

From: David Newall <davidn@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 2:21 pm 2008

Linus Torvalds wrote:
> Can you point to any actual stability problem?
>

Well of course.  So could you because they are a matter of public record
on the list.  Don't pretend otherwise.  Just to give you some recent,
personal bugaboos, and not even drawing on the many hundreds of relevant
messages on LKML each month:

1. Out of memory, caused by apparent leak somewhere, resulting in
machine effectively hanging for a minute or two (massive disk i/o)
culminating in termination of one or more processes.  (For what it's
worth: 512MB, no swap.)  Problem takes a couple of days to develop
(hence I suspect a leak.)  This is running only Firefox, Thunderbird and
Evince, plus whatever xubuntu wants.  Restarting the killed
application(s) causes the problem to recur.  Restarting X doesn't help. 
Killing almost all processes also doesn't help.  Reboot is required. 
This problem seems not to be in 2.6.17, but is in 2.6.22 (plus whatever
patches xubuntu use) and 2.6.23.  I'm still testing 2.6.25, but probably
going to have to abandon it and go backwards, because...

2. Suspend to disk doesn't resume properly (two out of three times.) 
System comes back but X has severe weirdness.  Draws frames and title
bar, but not window contents.  Text-mode is just as bad: Screen is blank
(erased font table, perhaps?)  Subsequent suspend to disk doesn't resume
at all.

Note the wide range of kernels exhibiting problem 1.  I don't even want
to think about problem 2 at this stage; I just want to stop having to
reboot to reclaim memory, especially when a mate who does Windows
training visits!

> the fundamental issue is that we are just
> too damn good at development.
>

Not so good.  The process is flawed.  Inadequate testing.  Inadequate
review.  This has been mentioned by others, so you know I'm not making
it up.  The real fundamental issue is that people are too keen to
release and don't appear to care enough about correctness.

> you think that mentioning BSD as some
> kind of shining example of goodness is a good way to reach that agenda.

Yes, BSD does seem to be a shining example of goodness, but I didn't
mention it because I think people should switch.  I did so to warn of
competition, to say that the world does not owe Linux a second chance
and isn't going to give it one.  It's pointless to debate the relative
merits of the two systems because, aside from the kernel, they are
identical; and there's little that matters between the kernels, other
than one appears to have a careful, robust and professional development
process.  Make no mistake about this point: I'm not saying that BSD is
better, rather that Linux cannot lose credibility and survive.

> But you know what? I really don't care one _whit_ what you do. You can
> switch to Vista for all I care, and I really don't mind. All I care about
> is doing a good job technically.
>

Sadly, you're doing a bad technical job in certain, important areas. 
You're pushing out buggy kernels and claiming that they're stable.  This
can't continue.  Attrition to BSD is the risk, not some threat that I'm
making.

> And you just show that you don't have a clue what you are talking about.
> If you want a stable kernel, don't follow the current -git tree.

Why are you bringing up git trees (which I don't use)?  I'm presently
plagued with a problem that's 2.6.22 or older, extending to at least
2.6.23 and maybe still current.  I've said quite clearly that I'm
talking about "stable" kernels, yet you presume I mean the git tree. 
Yet it's not the specifics of the problem I'm having that matters, it's
the systemic problems in Linux's development process.

I don't think I've anything to add unless the topic evolves in a
direction that asks what should be changed.  I'm posting this only
because I want on record the answer to the question about actual
stability problems.
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 2:27 pm 2008

On Thu, 1 May 2008, David Newall wrote:
>
> Why are you bringing up git trees (which I don't use)?  I'm presently
> plagued with a problem that's 2.6.22 or older, extending to at least
> 2.6.23 and maybe still current.

Ok, *PLONK*.

You're on an old kernel, don't know if your problem is fixed, and ask us
to slow down development.

That makes sense.

Go away.

Linus
--

From: Chris Friesen <cfriesen@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 3:06 pm 2008

Linus Torvalds wrote:
>
> On Thu, 1 May 2008, David Newall wrote:
>
>>Why are you bringing up git trees (which I don't use)?  I'm presently
>>plagued with a problem that's 2.6.22 or older, extending to at least
>>2.6.23 and maybe still current.
>
>
> Ok, *PLONK*.
>
> You're on an old kernel, don't know if your problem is fixed, and ask us
> to slow down development.
>
> That makes sense.
>
> Go away.

He did say that he was testing 2.6.25, and that suspend-to-disk was
broken in 2.6.25.

Chris
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 3:13 pm 2008

On Wed, 30 Apr 2008, Chris Friesen wrote:
>
> He did say that he was testing 2.6.25, and that suspend-to-disk was
> broken in 2.6.25.

Neither of which had anything to do with the whole "slow down" argument.

If you have a bug, make a bug report, and push it, and make people aware
of it. But don't make it an argument for development to slow down.

Should we all stand around with our thumbs up our *ss because somebody has
a bug? Should the other developers just stop, because suspend-to-disk is
broken for somebody? Should everything come to a standstill because David
Newall doesn't like how there are other things going on that are
independent of _his_ problems?

Do you really believe that?

Linus
--

From: David Newall <davidn@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 3:22 pm 2008

Linus Torvalds wrote:
> Should everything come to a standstill because David
> Newall doesn't like how there are other things going on that are
> independent of _his_ problems?
>

You're being a nasty piece of work this day, Linus, and you're fibbing
by mischaracterising what I said which, by the way, included, "it's not
the specifics of the problem I'm having that matters".

You're taking this far too personally.  Get a grip.
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 3:42 pm 2008

On Thu, 1 May 2008, David Newall wrote:
>
> You're taking this far too personally.

Umm. If you didn't want a personal opinion, why did you Cc me in the first
place then, and ask for my input?

I gave my input to you. I think your arguments are ludicrous, to the point
of being totally idiotic.

You complain how I don't release kernels that are stable, but without any
suggestions on what the issue might be, apart from apparently me merging
too much and making too many releases.

But do you really expect me to stop merging, or hold up releases that fix
hundreds of issues, just because there are other issues pending?

Do you really think development can be stopped? Trust me, we've tried.
Every time, it just leads to worse problems when the floodgates are then
opened.

And yes, there is a solution: don't develop so much. Don't allow thousands
of developers to be involved. Do a small core group, and make development
so hard or inconvenient that you only have a few tens of people who write
code, and vet them and force them to jump through hoops when adding new
features (or fixing old ones, for that matter).

And yes, that *does* result in a "stable" system. Never mind that it's
stable for all the wrong reasons, and generally doesn't actually work well
across a dynamic environment (whether the hardware base below or user
space above).

See? This is why I think your arguments are so silly and misguided.

But if you actually have real constructive ideas on things to actually
*do*, please do mention them. We've changed our models over time, several
times, exactly because we've searched for better ways to do things.

But do realize that

 (a) we can't just stop, or even really slow down. We can only try to
     regulate and to some degree direct the flood, not hold it up for any
     particular issue.

 (b) We do have process in place, and it may not be perfect, but I doubt
     anything is, and what we do have actually has evolved over the years.
     And that's not just my process (ie "two-week merge window, followed
     by about 6-8 weeks of fixups"), but the whole process both before and
     after it (Andrew and now linux-next in front of it, and stable kernel
     tree and the vendors after it).

 (c) the "big picture" discussion is separate from individual issues.

     If you want your suspend-to-disk issue resolved, or a memory leak
     solved, you don't solve those by trying to complain about other
     parts of the system, that are totally separate.

The global flow of patches and releases is not something that we can hold
up for _any_ of your individual problems. I do end up delaying releases
for really core things, so individual problems do obviously affect (for
example) the release timing. But the solution to them is not in
complaining about slowing down development, it is about actually trying to
engage the developers of *that* feature in *that* particular bug.

And finally, trust me, if you want to have people care about your
problems, the last thing you want to do is say "I might switch to BSD".
Because quite frankly, I really don't care. People who think that threats
like that work in any productive way can go screw themselves. I'll flame
idiots like that, and my likelihood of helping people because they think
they hold a gun to my head is almost zero.

Linus
--

From: Andrew Morton <akpm@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 9:31 pm 2008

On Wed, 30 Apr 2008 18:19:56 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:

> You (and Andrew) have tried to argue that slowing things down results in 
> better quality,

eh?  I argued the opposite: that increasing quality will as a side-effect
slow things down.

If we simply throttled things, people would spend more time watching the
shopping channel while merging smaller amounts of the same old crap.

--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: Apr 30, 9:43 pm 2008

On Wed, 30 Apr 2008, Andrew Morton wrote:
>
> eh?  I argued the opposite: that increasing quality will as a side-effect
> slow things down.

Yes, my bad, I realized that when I read through my message and already
sent out a fix for my buggy email ;)

> If we simply throttled things, people would spend more time watching the
> shopping channel while merging smaller amounts of the same old crap.

I agree totally. And although some of the time would probably _also_ be
spent on the frustrating crap that was designed to do the throttling, that
isn't much more productive than watching the shopping channel would be ...

Linus
--

From: Rafael J. Wysocki <rjw@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 6:59 am 2008

On Thursday, 1 of May 2008, Linus Torvalds wrote:
>
> On Wed, 30 Apr 2008, Andrew Morton wrote:
> >
> > eh?  I argued the opposite: that increasing quality will as a side-effect
> > slow things down.
>
> Yes, my bad, I realized that when I read through my message and already
> sent out a fix for my buggy email ;)
>
> > If we simply throttled things, people would spend more time watching the
> > shopping channel while merging smaller amounts of the same old crap.
>
> I agree totally. And although some of the time would probably _also_ be
> spent on the frustrating crap that was designed to do the throttling, that
> isn't much more productive than watching the shopping channel would be ...

Okay, so what exactly are we going to do to address the issue that I described
in the part of my last message that you skipped?

Rafael
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 11:26 am 2008

On Thu, 1 May 2008, Rafael J. Wysocki wrote:
>
> Okay, so what exactly are we going to do to address the issue that I described
> in the part of my last message that you skipped?

Umm. I don't really see anything to say. You said:

> Still, the issue at hand is that
> (1) The code merged during a merge window is somewhat opaque from the tester's
>     point of view and if a regression is found, the only practical means to
>     figure out what caused it is to carry out a bisection (which generally is
>     unpleasant, to put it lightly).
> (2) Many regressions are introduced during merge windows (relative to the
>     total amount of code merged they are a few, but the raw numbers are
>     significant) and because of (1) the process of removing them is generally
>     painful for the affected people.
> (3) The suspicion is that the number of regressions introduced during merge
>     windows has something to do with the quality of code being below
>     expectations, that in turn may be related to the fact that it's being
>     developed very rapidly.

And quite frankly, (2) and (3) are both: "merge windows introduce new
bugs", and that's such an uninteresting tautology that I'm left
wordless. And (1) is just a result of merging lots of stuff.

Of course the new bugs / regressions are introduced during the merge
window. That's when we merge new code. New bugs don't generally happen
when you don't get new code.

And of course finding bugs is always painful to everybody involved.

And of course the bugs indicate something about the quality of code
being merged. Perfect code wouldn't have bugs.

So what you are stating isn't interesting, and isn't even worthy of
discussion. The way you state it, the only answer is: don't take new
code, then. That's what your whole argument always seems to boil down
to, and excuse me for (yet again) finding that argument totally
pointless.

So let me repeat:

 (1) we have new code. We always *will* have new code, hopefully. A few
     million lines per year.

     If you don't accept this, I don't have anything to say.

 (2) we need a merge window. That is a direct result not of wanting to
     have lots of code at the same time, but of the _reverse_ issue: we
     want to have times of relative calm.

     And again, if you continue to see the merge window as the
     "problem", rather than as the INEVITABLE result of wanting to have
     a calm period, there's no point in talking to you.

 (3) Ergo, there's a very fundamental and basic and inescapable result:
     we absolutely _will_ have times when we get lots and lots of new
     code.

So these are not "problems". They are *facts*. Stating them as
problems is stupid and pointless. I'm not going to discuss this with
you if you cannot get over this.

So please accept the facts.

Once you accept the facts, you can state the things you can change. But
the things you cannot change is the merge window, and the fact that we
get a lot of new code at a high rate (where the merge window will
inevitably compress that rate, so that we have _another_ window where
the rate is lower).

So stop arguing against facts, and start arguing about other things that
can be argued about. That's all I'm saying.

Linus
--


From: Rafael J. Wysocki <rjw@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 1:09 pm 2008

On Thursday, 1 of May 2008, Linus Torvalds wrote:
>
> On Thu, 1 May 2008, Rafael J. Wysocki wrote:
> >
> > Okay, so what exactly are we going to do to address the issue that I described
> > in the part of my last message that you skipped?
>
> Umm. I don't really see anything to say. You said:
>
> > Still, the issue at hand is that
> > (1) The code merged during a merge window is somewhat opaque from the tester's
> >     point of view and if a regression is found, the only practical means to
> >     figure out what caused it is to carry out a bisection (which generally is
> >     unpleasant, to put it lightly).
> > (2) Many regressions are introduced during merge windows (relative to the
> >     total amount of code merged they are a few, but the raw numbers are
> >     significant) and because of (1) the process of removing them is generally
> >     painful for the affected people.
> > (3) The suspicion is that the number of regressions introduced during merge
> >     windows has something to do with the quality of code being below
> >     expectations, that in turn may be related to the fact that it's being
> >     developed very rapidly.
>
> And quite frankly, (2) and (3) are both: "merge windows introduce new
> bugs", and that's such an uninteresting tautology that I'm left
> wordless.

Perhaps if they introduced fewer bugs, all of that would be less frustrating
to people who get hit by them, especially by two or more at a time.

Everyone seems to be fine with that until it happens to him personally (like it
happened to David).

> And (1) is just a result of merging lots of stuff.
>
> Of course the new bugs / regressions are introduced during the merge
> window. That's when we merge new code. New bugs don't generally happen
> when you don't get new code.

I obviously agree with that. The question is, however, if we can decrease the
number of bugs introduced during merge windows and you seem to be saying
that no, we can't. Which is disappointing.

> And of course finding bugs is always painful to everybody involved.
>
> And of course the bugs indicate something about the quality of code
> being merged. Perfect code wouldn't have bugs.
>
> So what you are stating isn't interesting, and isn't even worthy of
> discussion. The way you state it, the only answer is: don't take new
> code, then. That's what your whole argument always seems to boil down
> to, and excuse me for (yet again) finding that argument totally
> pointless.

I have never said you shouldn't take new code at all. That's not what I'm
saying and please don't paint me this way.

I see a problem in that you get patches that you shouldn't have got because
they are unfinished and not well thought through. They introduce regressions
which are only possible to find using bisection because of the amount of code
merged at a time and that's frustrating. You seem to be regarding this as
a necessity, but I'm really not convinced that you're right in that.

> So let me repeat:
>
>  (1) we have new code. We always *will* have new code, hopefully. A few
>      million lines per year.
>
>      If you don't accept this, I don't have anything to say.
>
>  (2) we need a merge window. That is a direct result not of wanting to
>      have lots of code at the same time, but of the _reverse_ issue: we
>      want to have times of relative calm.
>
>      And again, if you continue to see the merge window as the
>      "problem", rather than as the INEVITABLE result of wanting to have
>      a calm period, there's no point in talking to you.

However, the width of the merge window is not a predetermined thing and might
be adjusted, for example. Other things might be changed too.

>  (3) Ergo, there's a very fundamental and basic and inescapable result:
>      we absolutely _will_ have times when we get lots and lots of new
>      code.

But that need not include obviously broken patches.

> So these are not "problems". They are *facts*. Stating them as
> problems is stupid and pointless. I'm not going to discuss this with
> you if you cannot get over this.
>
> So please accept the facts.
>
> Once you accept the facts, you can state the things you can change. But
> the things you cannot change is the merge window, and the fact that we
> get a lot of new code at a high rate (where the merge window will
> inevitably compress that rate, so that we have _another_ window where
> the rate is lower).

The problem is the (relatively small) fraction of patches pushed to you that is
broken. Some patches are obviously broken, some of them are just not tested
well enough. The result is pretty much the same in either case.

Now, the question is if we can get rid of that fraction by adjusting the
process somehow. You're arguing that we can't and so be it. [This is your
opinion and BTW there's nothing allowing me to call that unreasonable or saying
that you use made up arguments or something like this.] My opinion is that we
could at least try to do something about it. linux-next is probably a step in
the right direction, though time will tell.

I'm afraid, though, that I personally can't do much more than I've been doing
already to improve things.

> So stop arguing against facts, and start arguing about other things that
> can be argued about. That's all I'm saying.

The message that started this whole thread was not from me and I believe it
was sent for a reason. So the fact is that at least some people lose their
patience over the current handling of merge windows. And I'm not sure that's
necessary.

Thanks,
Rafael
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 1:41 pm 2008

On Thu, 1 May 2008, Rafael J. Wysocki wrote:
>
> I obviously agree with that. The question is, however, if we can decrease the
> number of bugs introduced during merge windows and you seem to be saying
> that no, we can't. Which is disappointing.

No, that's not what I'm saying.

What I *am* saying is that as long as you concentrate on "merge window"
and "lots of code", you're concentrating not on the problems, but on the
facts of life. You can't change facts, and even trying is pointless.

What you should concentrate on is not how many patches there are during
the merge window (because we can't do anything about that) or the fact
that they all happen in a short timeframe, but about quality of patches
_regardless_ of merge window.

So if you can make an argument that does not even *try* to change the
fact that

 - we have lots of patches and
 - we have a merge window and
 - merging patches causes bugs

but argues about quality from some other standpoint, then I can start to
believe that you have a point.

But as long as you argue about the fact that we merge a lot of stuff, and
that bugs come in during the merge window, I'm not interested. Arguing
about facts is totally non-productive.

And as long as people keep saying "let's not merge broken patches" or "we
should never have bugs", I'll just ignore those kinds of idiotic
statements. They aren't even arguments, they are wishes, and they are
unrealistic. If we knew they were broken and had bugs, of course we
wouldn't merge them.

In short - I'm simply not interested in what you _wish_ reality was.
People need to first acknowledge reality, and _then_ they may have
solutions.

So the reality is:

 - we do have tons of patches, and they need to be merged (and furiously)

 - there *will* be bugs. And the number of bugs will inevitably be
   relative to the number of patches. There is no "perfect", and anybody
   who argues for a lower number of bugs by lowering the number of
   patches is an idiot in my book.

 - there *will* be releases, even in the presence of bugs, because
   holding everything up is simply not an option.

Those are the things that we have to accept. Anything else is just
dreaming.

Now, what part _can_ we improve and still be realistic?

We can try to improve average quality - the number of bugs will *still*
be relative to the size of the changes (no getting away from that), but
we may be able to lower the absolute number of bugs. But not to zero!

And that "not to zero" is IMPORTANT. If you think you can aim for zero
bugs, I'm simply not interested in discussing it with you. You live in a
different universe, and we're not talking about the same reality. And if
you're not being realistic, then why the hell would I believe that your
solutions are realistic? I'd rather take some pills and talk to the
little purple man living under the deck in my back yard, because at least
he's amusing, even if he doesn't make much sense either.

And I'm also not in the *least* interested in arguments like "We should
just improve our quality of patches". Of course everybody wishes for
that. Again, it's not an argument, it's just a unrealistic wish, unless
you can actually give a suggestion of a process or other thing that would
actually seem to reach it (without assuming other impossible things like
"we need more time" or "we need more people who just spend their day
looking for bugs").

Same goes for "we should all just spend time looking at each others
patches and trying to find bugs in them". That's not a solution, that's a
drug-induced dream you're living in. And again, if I want to discuss
dreams, I'd rather talk about my purple guy, and the bad things he does
to the hedgehog that lives next door.

So do you have any productive *suggestions*? Some that involve more than
"let's write less code" or "let's just review each others patches more".

			Linus
--

From: Linus Torvalds <torvalds@...>
Subject: Re: Slow DOWN, please!!!
 [1]Date: May 1, 2:30 pm 2008

On Thu, 1 May 2008, Linus Torvalds wrote:
> 
> In other words: do people have realistic ideas for how to make others 
> spend _more_ time looking at patches? And not just _wishing_ people did 
> that?

Just to throw out an example:

 - make a "Random pending patch of the day" google gadget.

I know that's a bit out there, and I'm not sure the google gadget thing is 
realistic, but I bet I'm not the only one who ends up using the google 
homepage all the time. A button that says "this patch looks ok", "this 
patch looks crap", or "I dunno, give me another one to look at" might be a 
fun game that would encourage people to look at a couple of patches a day.

You get five thousand people doing that occasionally (not every day, but 
maybe when they are bored and look for something more rewarding than 
trying to find bad music videos on youtube), and maybe you'd actually get 
feedback on patches.

Make it pick a random commit that is in linux-next but hasn't been merged 
into main -git yet.

Crazy? Probably. But at least it fits my notion of "let's not just wish 
people did more patch commentary" thing.

IOW, if people are really serious about coming up with ways to improve 
code quality, I really think it needs to be about _practical_ things that 
can fit in our flow or can be extensions to it, not just wishing for 
better quality.

"If wishes were horses, beggars would ride"

			Linus
--
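
The selection step in the "random pending patch of the day" idea above can be sketched in a few lines. What follows is only a minimal illustration in Python, not something posted in the thread: the function names are hypothetical, and it assumes a local git clone in which one ref tracks the mainline tree and another tracks linux-next. It relies on `git rev-list A..B`, which lists the commits reachable from B but not from A.

```python
import random
import subprocess


def pending_commits(repo, merged_ref, pending_ref):
    """List non-merge commits reachable from pending_ref but not from merged_ref.

    The ref names are supplied by the caller; which refs track mainline and
    linux-next depends on how the local clone was set up (an assumption here).
    """
    result = subprocess.run(
        ["git", "-C", repo, "rev-list", "--no-merges",
         f"{merged_ref}..{pending_ref}"],
        check=True,
        capture_output=True,
        text=True,
    )
    # rev-list prints one commit hash per line.
    return result.stdout.split()


def pick_patch_of_the_day(commits, rng=random):
    """Pick one pending commit at random, or None when nothing is pending."""
    if not commits:
        return None
    return rng.choice(commits)
```

With such a clone, a call along the lines of `pick_patch_of_the_day(pending_commits(".", "origin/master", "linux-next/master"))` would return a single commit hash to show to a reviewer (both ref names are placeholders). The three voting buttons and the gadget itself are left out of this sketch.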


Source URL:
http://kerneltrap.org/Linux/Active_Merge_Windows