"This is starting to get beyond frustrating for me," complained David Miller of the latest merge window, launching what turned into a very lengthy and ongoing discussion about the Linux kernel development process. The concept of a regular "merge window" was first discussed in July of 2005 with the release of the 2.6.14-rc4 kernel, following the 2005 Developers' Summit. From 2.6.14 on, the release of each official 2.6.y kernel has been followed by a two week period during which major changes are merged into the kernel, followed by a 2.6.y-rc1 release. David complained that this particular merge window has been more painful than others, "the tree breaks every day, and it's becoming an extremely non-fun environment to work in. We need to slow down the merging, we need to review things more, we need people to test their [...] changes!"
During the lengthy discussion, Linux creator Linus Torvalds explained:
"The notion that we should even _try_ to aim to slow things down, that one I find unlikely to be true, and I don't even understand why anybody would find it a logical goal? Of course, you will have fewer new bugs if you have fewer changes. But that's not a goal, that's a tautology and totally uninteresting. A small program is likely to have fewer bugs, but that doesn't make something small 'better' than something large that does more. Similarly, a stagnant development community will introduce new bugs more seldom. But does that make a stagnant one better than a vibrant one? Hell no. So what I'm arguing against here is not that we should aim for worse quality, but I'm arguing against the false dichotomy of believing that quality is incompatible with lots of change."
From: David Miller <davem@...> Subject: Slow DOWN, please!!! Date: Apr 29, 10:03 pm 2008

This is starting to get beyond frustrating for me.

Yesterday, I spent the whole day bisecting boot failures
on my system due to the totally untested linux/bitops.h
optimization, which I fully analyzed and debugged.

Today, I had hoped that I could get some work done of my
own, but that's not the case.

Yet another bootup regression got added within the last 24
hours.

I don't mind fixing the regression or two during the merge
window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS!

The tree breaks every day, and it's becomming an extremely
non-fun environment to work in.

We need to slow down the merging, we need to review things
more, we need people to test their fucking changes!
--
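Editor's illustration: the bisecting David mentions is, at heart, a binary search over the commit history. The sketch below is a simplified model, not how `git bisect` is implemented (git walks a commit graph, not a flat list). It assumes a linear, oldest-to-newest list of commits in which every commit from the first bad one onward fails. The names `first_bad_commit` and `is_bad` are hypothetical, with `is_bad` standing in for "build this kernel, boot it, see whether it breaks".

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an oldest-to-newest commit list for the first bad commit.

    Assumes the newest commit is bad and that once a commit is bad, every
    later commit is bad too. Returns (culprit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is known bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one build-and-boot cycle in real life
        if is_bad(commits[mid]):
            hi = mid      # culprit is at mid or earlier
        else:
            lo = mid + 1  # culprit is strictly after mid
    return commits[lo], tests


# Hypothetical history: 9629 commits, with the regression introduced at 6000.
history = list(range(9629))
culprit, tests = first_bad_commit(history, lambda commit: commit >= 6000)
```

Under these assumptions a history of roughly ten thousand commits needs at most 14 test cycles, but each cycle is a full kernel build and reboot, which is how one regression can consume a day.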
From: Rafael J. Wysocki <rjw@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 3:36 pm 2008

On Wednesday, 30 of April 2008, David Miller wrote:
>
> This is starting to get beyond frustrating for me.
>
> Yesterday, I spent the whole day bisecting boot failures
> on my system due to the totally untested linux/bitops.h
> optimization, which I fully analyzed and debugged.
>
> Today, I had hoped that I could get some work done of my
> own, but that's not the case.
>
> Yet another bootup regression got added within the last 24
> hours.
>
> I don't mind fixing the regression or two during the merge
> window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS!
>
> The tree breaks every day, and it's becomming an extremely
> non-fun environment to work in.
>
> We need to slow down the merging, we need to review things
> more, we need people to test their fucking changes!

Well, I must say I second that.
I'm not seeing regressions myself this time (well, except for the one that
Jiri fixed), but I did find a few of them during the post-2.6.24 merge window
and I wouldn't like to repeat that experience, so to speak.

IMO, the merge window is way too short for actually testing anything. I rebuild
the kernel once or even twice a day and there's no way I can really test it.
I can only check if it breaks right away. And if it does, there's no time to
find out what broke it before the next few hundreds of commits land on top of
that.

Thanks,
Rafael
--
From: Andrew Morton <akpm@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 4:15 pm 2008

On Wed, 30 Apr 2008 21:36:57 +0200
"Rafael J. Wysocki" wrote:

> IMO, the merge window is way too short for actually testing anything.

There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1!
_anything_ which appears in 2.6.x-rc1 and which wasn't in 2.6.x-mm1 was
snuck in too late (OK, apart from trivia and bugfixes).

If we decide that we need to fix the oh-shit-lets-slam-this-in-and-hope
problem then I expect we can do so, via fairly relible means.

But the first attempt at solving it should be to ask people to not do that.
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 4:31 pm 2008

On Wed, 30 Apr 2008, Andrew Morton wrote:
>
> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1!

The problem I see with both -mm and linux-next is that they tend to be
better at finding the "physical conflict" kind of issues (ie the merge
itself fails) than the "code looks ok but doesn't actually work" kind of
issue.

Why?
The tester base is simply too small.
Now, if *that* could be improved, that would be wonderful, but I'm not
seeing it as very likely.

I think we have fairly good penetration these days with the regular -git
tree, but I think that one is quite frankly a *lot* less scary than -mm or
-next are, and there it has been an absolutely huge boon to get the kernel
into the Fedora test-builds etc (and I _think_ Ubuntu and SuSE also
started something like that).

So I'm very pessimistic about getting a lot of test coverage before -rc1.
Maybe too pessimistic, who knows?
Linus
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 4:05 pm 2008

On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
>
> IMO, the merge window is way too short for actually testing anything.

That is largely on purpose.
There's two choices:
- have a longer and calmer merge window, spread out the joy, and have
people test and fix their things during the merge window too. In other
words, less black-and-white.

- Really short merge window, and use the extra time *after* it to fix the
issues.

and I've obviously gone for the latter. In fact, I'd personally like to
make it even shorter, because the problem with the long merge window can
be summed up very simply:

Long merge windows don't work - because rather than test more, it just
means that people will use them to make more changes!

So one of the major things about the short merge window is that it's
hopefully encouraging people to have things ready by the time the merge
window opens, because it's too late to do anything later.

And yes, we could have some other way of enforcing that - allow the merge
window to be longer, but have some other mechanism to make sure that I
only merge old code.

In fact, I'd personally *love* to have a hard rule that says "I will only
pull from trees that were already 'done' by the time the window opened",
and we've been kind-of moving in that direction.

But that wish is counteracted by the fact that the merges themselves do
need some development, so expecting everything to be ready before-hand is
simply not realistic.

Also, while I'd like trees to be ready when the window opens, at the same
time I do think that it's good to spread out some of it, and get *some*
basic testing - even if it's just a nightly build and a few tens of
developers.

> I rebuild the kernel once or even twice a day and there's no way I can
> really test it. I can only check if it breaks right away.

And really, that's all that we'd expect during the merge window. We want
to find the *obvious* problems - build issues, and the things that hit
everybody, but let's face it, the subtle ones will take time to find
regardless.

Then, the short merge window means that we have more time when we really
don't have big changes going in to find the subtle ones.

(And making the release cycle longer would *not* help - that would just
make the next merge window more painful, so while it can, and does, work
for some individual release with particular problems, it's not a solution
in the long run).

Linus
--
From: Paul Mackerras <paulus@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 7:29 pm 2008

Linus Torvalds writes:
> So one of the major things about the short merge window is that it's
> hopefully encouraging people to have things ready by the time the merge
> window opens, because it's too late to do anything later.

Having things ready by the time the merge window opens is difficult
when you don't know when the merge window is going to open. OK, after
you release a -rc6 or -rc7, we know it's close, but it could still be
three weeks off at that point. Or it could be tomorrow.

That's mitigated at the moment by having the merge window be two weeks
long. So if you open the merge window at a point where I, or someone
downstream of me, thought we still had two weeks to go, we can hurry
up and try to get stuff finished within the first week and still get
it merged.

But if you made a really hard and fast rule that only stuff that is in
linux-next at the point where the merge window opens can be merged,
AND the point at which the merge window opens is unknown and
unpredictable within a period of about 4 weeks, then that makes it
really tough for those of us downstream of you to plan our work.

By the way, if you do want to make that rule, then there's a really
easy way to do it - just pull linux-next, and make that one pull be
the entire merge window. :) But please give us at least a week's
notice that you're going to do that.

Paul.
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 11:47 pm 2008

On Thu, 1 May 2008, Paul Mackerras wrote:
>
> Having things ready by the time the merge window opens is difficult
> when you don't know when the merge window is going to open. OK, after
> you release a -rc6 or -rc7, we know it's close, but it could still be
> three weeks off at that point. Or it could be tomorrow.

Well, if the tree is ready, you shouldn't need to care ;)
That said:

> By the way, if you do want to make that rule, then there's a really
> easy way to do it - just pull linux-next, and make that one pull be
> the entire merge window. :) But please give us at least a week's
> notice that you're going to do that.

I'm not going to pull linux-next, because I hate how it gets rebuilt every
time it gets done, so I would basically have to pick one at random, and
then that would be it.

I also do actually try to spread the early pulls out a _bit_, so that
if/when problems happen, there's some amount of information in the fact
that something started showing up between -git2 and -git3.

HOWEVER.
One thing that was discussed when linux-next was starting up was whether I
would maintain a next branch myself, that people could actually depend on
(unlike linux-next, which gets rebuilt).

And while I could do that for really core infrastructure changes, I really
would hate to see something like that become part of the flow - because
I'd hope things that really require it should be so rare that it's not
worth it for me to maintain a separate branch for it.

But there could be some kind of carrot here - maybe I could maintain a
"next" branch myself, not for core infrastructure, but for stuff where the
maintainer says "hey, I'm ready early, you can pull me into 'next'
already".

In other words, it wouldn't be "core infrastructure", it would simply be
stuff that you already know you'd send to me on the first day of the merge
window. And if by maintaining a "next" branch I could encourage people to
go early, _and_ let others perhaps build on it and sort out merge
conflicts (which you can't do well on linux-next, exactly because it's a
bit of a quick-sand and you cannot depend on merging the same order or
even the same base in the end), maybe me having a 'next' branch would be
worth it.

But it would have to be low-maintenance. Something I might open after
-rc4, say, and something where I'd expect people to only ask me to pull
_once_ (because they really are mostly ready, and can sort out the rest
after the merge window), and if they have no open regressions (again, the
"carrot" for good behaviour).

I'm not saying it's a great idea, but if that kind of flow makes sense to
people, maybe it should be on the table as an idea or at least see if it
might work.

But let's see how linux-next works out. Maybe all the subsystem
maintainers can just get their tree in shape, see that it merges in
linux-next, and not even need anything else. Then, when the merge window
opens, if you're ready, just let me know.

Linus
--
From: Jeff Garzik <jeff@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 12:17 am 2008

Linus Torvalds wrote:
> But there could be some kind of carrot here - maybe I could maintain a
> "next" branch myself, not for core infrastructure, but for stuff where the
> maintainer says "hey, I'm ready early, you can pull me into 'next'
> already".
>
> In other words, it wouldn't be "core infrastructure", it would simply be
> stuff that you already know you'd send to me on the first day of the merge
> window. And if by maintaining a "next" branch I could encourage people to
> go early, _and_ let others perhaps build on it and sort out merge
> conflicts (which you can't do well on linux-next, exactly because it's a
> bit of a quick-sand and you cannot depend on merging the same order or
> even the same base in the end), maybe me having a 'next' branch would be
> worth it.

linux-next is _supposed_ to be solely the stuff that is ready to be sent
to you upon window-open.

The only thing that isn't reliable are the commit ids -- and that's at
the request of a large majority of maintainers, who noted to Stephen R
that the branch he was pulling from them might get rebased -- thus
necessitating the daily tree regeneration.

So, I think a 'next' branch from you would open cans o worms:
- one more tree to test, and judging from linux-next and -mm it's tough
to get developers to test more than just upstream

- is the value of holy penguin pee great enough to overcome this
another-tree-to-test obstacle?

- opens all the debates about running parallel branches, such as, would
it be better to /branch/ for 2.6.X-rc, and then keep going full steam on
the trunk? After all, the primary logic behind 2.6.X-rc is to only take
bug fixes, theoretically focusing developers more on that task. But now
we are slowly undoing that logic, or at least openly admitting that has
been the reality all along.

Jeff
--
From: Alan Cox <alan@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 5:17 am 2008

> - opens all the debates about running parallel branches, such as, would
> it be better to /branch/ for 2.6.X-rc, and then keep going full steam on
> the trunk? After all, the primary logic behind 2.6.X-rc is to only take

That encourages developers to continue ignoring that stabilizing work.
The stall does have a side effect of refocussing them. A branch for -rc
and a monthly cycle would be interesting as it would mean that the
pushback for not fixing stability problems would be not getting you work
pulled for the main tree if you didn't fix the bugs first - and could be
both sufficient an incentive and not too vicious as it would be with a 2
month cycle.

Alan
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 12:46 am 2008

On Thu, 1 May 2008, Jeff Garzik wrote:
>
> linux-next is _supposed_ to be solely the stuff that is ready to be sent to
> you upon window-open.

Yes, the "stuff" may be supposed to be stable. But the trees feeding it
certainly are not. People are rebasing them etc, and it doesn't matter
because I think linux-next starts largely from scratch next time around.

> So, I think a 'next' branch from you would open cans o worms:
>
> - one more tree to test, and judging from linux-next and -mm it's tough to get
> developers to test more than just upstream
>
> - is the value of holy penguin pee great enough to overcome this
> another-tree-to-test obstacle?
>
> - opens all the debates about running parallel branches, such as, would it be
> better to /branch/ for 2.6.X-rc, and then keep going full steam on the trunk?

I do agree. And maybe I should have made it clear that I think it's worth
it to me only if it then means that the merge window can shrink.

If I'd have both a 'next' branch _and_ a full 2-week merge window, there's
no upside.

Btw, it wouldn't be another tree to test, since it would presumaby be what
'linux-next' starts out from - so it would purely be something that
doesn't have the constant re-merging of the more wild-and-crazy
'linux-next' tree.

Linus
--
From: Rafael J. Wysocki <rjw@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 4:45 pm 2008

On Wednesday, 30 of April 2008, Linus Torvalds wrote:
>
> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
> >
> > IMO, the merge window is way too short for actually testing anything.
>
> That is largely on purpose.
>
> There's two choices:

Oh well, I don't think it's really that simple.

> - have a longer and calmer merge window, spread out the joy, and have
> people test and fix their things during the merge window too. In other
> words, less black-and-white.
>
> - Really short merge window, and use the extra time *after* it to fix the
> issues.
>
> and I've obviously gone for the latter. In fact, I'd personally like to
> make it even shorter, because the problem with the long merge window can
> be summed up very simply:
>
> Long merge windows don't work - because rather than test more, it just
> means that people will use them to make more changes!

And what do you think is happening _after_ the merge window closes, when
we're supposed to be fixing bugs? People work on new code. And, in fact, they
have to, if they want to be ready for the next merge window.

> So one of the major things about the short merge window is that it's
> hopefully encouraging people to have things ready by the time the merge
> window opens, because it's too late to do anything later.
>
> And yes, we could have some other way of enforcing that - allow the merge
> window to be longer, but have some other mechanism to make sure that I
> only merge old code.

How about, instead, putting limits on the amount of stuff that's going to be
merged during the next window?

> In fact, I'd personally *love* to have a hard rule that says "I will only
> pull from trees that were already 'done' by the time the window opened",
> and we've been kind-of moving in that direction.

Well, and when's the time for fixing bugs? Surely not during the merge window
and also not after that, because otherwise people won't be ready for the next
merge window with the new code.

> But that wish is counteracted by the fact that the merges themselves do
> need some development, so expecting everything to be ready before-hand is
> simply not realistic.
>
> Also, while I'd like trees to be ready when the window opens, at the same
> time I do think that it's good to spread out some of it, and get *some*
> basic testing - even if it's just a nightly build and a few tens of
> developers.
>
> > I rebuild the kernel once or even twice a day and there's no way I can
> > really test it. I can only check if it breaks right away.
>
> And really, that's all that we'd expect during the merge window. We want
> to find the *obvious* problems - build issues, and the things that hit
> everybody, but let's face it, the subtle ones will take time to find
> regardless.

Exactly. Moreover, the code is now being merged at a pace that makes it
physically impossible to review it given the human resources we have.

> Then, the short merge window means that we have more time when we really
> don't have big changes going in to find the subtle ones.

Sorry to say that, but I don't think this is realistic. What happens after the merge
window is people go and develop new stuff. They look at the already merged
code only if they have to. Also, there are a _few_ people testing the kernel
carefully enough to see the more subtle problems, let alone debugging and
fixing them.

> (And making the release cycle longer would *not* help - that would just
> make the next merge window more painful, so while it can, and does, work
> for some individual release with particular problems, it's not a solution
> in the long run).

My point is, given the width of the merge windown, there's too much stuff
going in during it. As far as I'm concerned, the window can be a week long
or whatever, but let's make fewer commits over a unit of time.

Thanks,
Rafael
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 5:37 pm 2008

On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
> >
> > Long merge windows don't work - because rather than test more, it just
> > means that people will use them to make more changes!
>
> And what do you think is happening _after_ the merge window closes, when
> we're supposed to be fixing bugs? People work on new code. And, in fact, they
> have to, if they want to be ready for the next merge window.

Oh, I agree. But at that point, the issue you brought up - of testing and
then having the code change under you wildly - has at least gone away.

And I think you are missing a big issue:

> Sorry to say that, but I don't think this is realistic. What happens after the merge
> window is people go and develop new stuff.

From a testing standpoint, the *developers* aren't ever even the main
issue. Yes, we get test coverage that way too, but we should really aim
for getting most of the non-obvious issues from the user community, and
not primarily from developers.

So the whole point of the merge window is *not* to have developers testing
their code during the six subsequent weeks, but to have *users* able to
use -rc1 and report issues!

That's why the distro "testing" trees are so important. And that's why
it's so important that -rc1 be timely.

> My point is, given the width of the merge windown, there's too much stuff
> going in during it. As far as I'm concerned, the window can be a week long
> or whatever, but let's make fewer commits over a unit of time.

I'm not following that logic.
A single merge will bring in easily thousands of commits. It doesn't
matter if the merge window is a day or a week or two weeks, the merge will
be one event.

And there's no way to avoid the fact that during the merge window, we will
get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
9629 commits).

So your "fewer commits over a unit of time" doesn't make sense. We have
those ten thousand commits. They need to go in. They cannot take forever.
Ergo, you *will* have a thousand commits a day during the merge window.

We can spread it out a bit (and I do to some degree), but in many ways
that is just going to be more painful. So it's actually easier if we can
get about half of the merges done early, so that people like Andrew then
has at least most of the base set for him by the first few days of the
merge window.

So here's the math: 3,500 commits per month. That's just the *average*
speed, it's sometimes more. And we *cannot* merge them continuously,
because we need to have a stabler period for testing. And remember: those
3,500 commits don't stop happening just because they aren't merged. You
should think of them as a constant pressure.

So 3,500 commits per month, but with a stable period (that is *longer*
than the merge window) that means that the merge window needs to merge
that constant stream of commits *faster* than they happen, so that we can
then have that breather when we try to get users to test it. Let's say
that we have a 1:3 ratio (which is fairly close to what we have), and that
means that we need to merge 3,500 commits in a week.

That's just simple *math*. So when you say "let's make fewer commits over
a unit of time" I can onyl shake my head and wonder what the hell you are
talking about. The merge window _needs_ to do those 3,500 commits per
week. Otherwise they don't get merged!

Linus
--
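Editor's illustration: the arithmetic in the message above can be written out directly. The sketch below assumes a steady arrival rate of commits and a four-week month. The names `required_merge_rate` and `commits_per_day` and their parameters are hypothetical, chosen for this example only, and are not part of any kernel tooling.

```python
def required_merge_rate(commits_per_month, merge_weeks, stable_weeks,
                        weeks_per_month=4.0):
    """Commits per week that must be merged while the merge window is open.

    Commits keep arriving at a constant rate over the whole cycle
    (merge window plus stabilization period), but can only be merged
    during the window itself.
    """
    cycle_weeks = merge_weeks + stable_weeks
    commits_per_cycle = commits_per_month * cycle_weeks / weeks_per_month
    return commits_per_cycle / merge_weeks


def commits_per_day(total_commits, window_days):
    """Average daily merge rate over a window of the given length."""
    return total_commits / window_days


# The thread's figures: 3,500 commits a month, 1:3 merge-to-stable ratio.
weekly_rate = required_merge_rate(3500, merge_weeks=1, stable_weeks=3)

# 2.6.24 -> 2.6.25-rc1 was 9629 commits; assume a two-week (14-day) window.
daily_rate = commits_per_day(9629, window_days=14)
```

With a 1:3 ratio the function returns 3,500 commits per week, matching the message, and the result is the same for a two-week window followed by six weeks of stabilization. The 9,629 commits cited for 2.6.24 to 2.6.25-rc1, spread over an assumed two-week window, come to a little under 700 a day, the same order of magnitude as the "thousand commits a day" in the message.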
From: Rafael J. Wysocki <rjw@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 6:23 pm 2008

On Wednesday, 30 of April 2008, Linus Torvalds wrote:
>
> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote:
> > >
> > > Long merge windows don't work - because rather than test more, it just
> > > means that people will use them to make more changes!
> >
> > And what do you think is happening _after_ the merge window closes, when
> > we're supposed to be fixing bugs? People work on new code. And, in fact, they
> > have to, if they want to be ready for the next merge window.
>
> Oh, I agree. But at that point, the issue you brought up - of testing and
> then having the code change under you wildly - has at least gone away.
>
> And I think you are missing a big issue:
>
> > Sorry to say that, but I don't think this is realistic. What happens after the merge
> > window is people go and develop new stuff.
>
> From a testing standpoint, the *developers* aren't ever even the main
> issue. Yes, we get test coverage that way too, but we should really aim
> for getting most of the non-obvious issues from the user community, and
> not primarily from developers.
>
> So the whole point of the merge window is *not* to have developers testing
> their code during the six subsequent weeks, but to have *users* able to
> use -rc1 and report issues!
>
> That's why the distro "testing" trees are so important. And that's why
> it's so important that -rc1 be timely.

That's correct, but since developers are already working on new code at that
point, the bug reports in fact distract them and make them go back to the "old"
stuff, recall why they did that particular changes etc. As a result, the
developers often do not take the bug reports seriously enough, especially if
they do not finger the "guilty" change. That, in turn, makes the users believe
there's no point in testing and reporting bugs.

> > My point is, given the width of the merge windown, there's too much stuff
> > going in during it. As far as I'm concerned, the window can be a week long
> > or whatever, but let's make fewer commits over a unit of time.
>
> I'm not following that logic.
>
> A single merge will bring in easily thousands of commits. It doesn't
> matter if the merge window is a day or a week or two weeks, the merge will
> be one event.

No, technically it doesn't.
> And there's no way to avoid the fact that during the merge window, we will
> get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
> 9629 commits).Well, do we _have_ _to_ take that much? I know we _can_, but is this really
necessary?

> So your "fewer commits over a unit of time" doesn't make sense.

Oh, yes it does. Equally well you could say that having brakes in a car
didn't make sense, even if you could drive it as fast as the engine allowed
you to. ;-)

> We have those ten thousand commits. They need to go in. They cannot take
> forever.

But perhaps some of them can wait a bit longer.
> Ergo, you *will* have a thousand commits a day during the merge window.
That's only if you insist on handling everything what people push to you.

> We can spread it out a bit (and I do to some degree), but in many ways
> that is just going to be more painful. So it's actually easier if we can
> get about half of the merges done early, so that people like Andrew then
> has at least most of the base set for him by the first few days of the
> merge window.
>
> So here's the math: 3,500 commits per month. That's just the *average*
> speed, it's sometimes more. And we *cannot* merge them continuously,
> because we need to have a stabler period for testing. And remember: those
> 3,500 commits don't stop happening just because they aren't merged. You
> should think of them as a constant pressure.
>
> So 3,500 commits per month, but with a stable period (that is *longer*
> than the merge window) that means that the merge window needs to merge
> that constant stream of commits *faster* than they happen, so that we can
> then have that breather when we try to get users to test it. Let's say
> that we have a 1:3 ratio (which is fairly close to what we have), and that
> means that we need to merge 3,500 commits in a week.
>
> That's just simple *math*. So when you say "let's make fewer commits over
> a unit of time" I can onyl shake my head and wonder what the hell you are
> talking about. The merge window _needs_ to do those 3,500 commits per
> week. Otherwise they don't get merged!

Surely, they don't, but maybe they don't have to.
You can technically handle merging even more, but what about quality? Do we
have a quality assurance process in place? If we do, what is it? How is it
able to handle the 3500 commits a week? Assuming it is, will it be able to
handle more and what's the limit?

IMO, there has to be a limit somewhere, or we will end up in a spiral driving
everybody mad.

Thanks,
Rafael
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 6:31 pm 2008

On Thu, 1 May 2008, Rafael J. Wysocki wrote:
>
> > And there's no way to avoid the fact that during the merge window, we will
> > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
> > 9629 commits).
>
> Well, do we _have_ _to_ take that much? I know we _can_, but is this really
> necessary?

Do you want me to stop merging your code?
Do you think anybody else does?
Any suggestions on how to convince people that their code is not worth
merging?

Linus
--
From: Willy Tarreau <w@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 6:46 pm 2008

On Wed, Apr 30, 2008 at 03:31:22PM -0700, Linus Torvalds wrote:
>
> On Thu, 1 May 2008, Rafael J. Wysocki wrote:
> >
> > > And there's no way to avoid the fact that during the merge window, we will
> > > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
> > > 9629 commits).
> >
> > Well, do we _have_ _to_ take that much? I know we _can_, but is this really
> > necessary?
>
> Do you want me to stop merging your code?
>
> Do you think anybody else does?
>
> Any suggestions on how to convince people that their code is not worth
> merging?

I think you're approaching a solution Linus. If developers take a refusal
as a punishment, maybe you can use that for trees which have too many
unresolved regressions. This would be really unfair to subsystem maintainers
which themselves merge a lot of work, but recursively they may apply the
same principle to their own developers, so that everybody knows that it's
not worth working on next code past a point where too many regressions are
reported.

Willy
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 7:20 pm 2008

On Thu, 1 May 2008, Willy Tarreau wrote:
> >
> > Any suggestions on how to convince people that their code is not worth
> > merging?
>
> I think you're approaching a solution Linus. If developers take a refusal
> as a punishment, maybe you can use that for trees which have too many
> unresolved regressions.

Heh. It's been done. In fact, it's done all the time on a smaller scale.
It's how I've enforced some cleanliness or process issues ("I won't pull
that because it's too ugly"). I see similar messages floating around about
individual patches.

That said, I don't think it really works that well as "the solution": it
works as a small part of the bigger picture, but no, we can't see
punishment as the primary model for encouraging better bevaiour.

First off, and maybe this is not true, but I don't think it is a very
healthy way to handle issues in general. I may come off as an opinionated
bastard in discussions like these, and I am, but when it actually comes to
maintaining code, really prefer a much softer approach.

I want to _trust_ people, and I really don't want to be a "you need to do
'xyz' or else" kind of guy.

So I'll happily say "I can't merge this, because xyz", where 'xyz' is
something that is related to the particular code that is actually merged.
But quite frankly, holding up _unrelated_ fixes, because some other issue
hasn't been resolved, I really try to not do that.

So I'll say "I don't want to merge this, because quite frankly, we've had
enough code for this merge window already, it can wait". That tends to
happen at the end of the merge window, but it's not a threat, it's just me
being tired of the worries of inevitable new issues at the end of the
window.

And I personally feel that this is important to keep people motivated.
Being too stick-oriented isn't healthy.

The other reason I don't believe in the "won't merge until you do 'xyz'"
kind of thing as a main development model is that it traditionally hasn't
worked. People simply disagree, the vendors will take the code that their
customers need, the users will get the tree that works for them, and
saying "I won't merge it" won't help anybody if it's actually useful.

Finally, the people I work with may not be perfect, but most maintainers
are pretty much experts within their own area. At some point you have to
ask yourself: "Could I do better? Would I have the time? Could I find
somebody else to do better?" and not just in a theoretical way. And if the
answer is "no", then at that point, what else can you do?

Yes, we have personalities that clash, and merge problems. And let's face
it, as kernel developers, we aren't exactly a very "cuddly" group of
people. People are opinionated and not afraid to speak their mind. But on
the whole, I think the kernel development community is actually driven a
lot more by _positive_ things than by the stick of "I won't get merged
unless I shape up".

So quite frankly, I'd personally much rather have a process that
encourages people to have so much _pride_ in what they do that they want
it to be seen as being really good (and hopefully then that pride means
that they won't take crap!) than having a chain of fear that trickles
down.

So this is why, for example, I have so strongly encouraged git maintainers
to think of their public trees as "releases". Because I think people act
differently when they *think* of their code as a release than when they
think of it as a random development tree.

I do _not_ want to slow down development by setting some kind of "quality
bar" - but I do believe that we should keep our quality high, not because
of any hoops we need to jump through, but because we take pride in the
thing we do.

[ An example of this: I don't believe code review tends to much help in
itself, but I *do* believe that the process of doing code review makes
people more aware of the fact that others are looking at the code they
produce, and that in turn makes the code often better to start with.

And I think publicly announced git trees and -mm and linux-next are
great partly because they end up doing that same thing. I heartily
encourage submaintainers to always Cc: linux-kernel when they send me a
"please pull" request - I don't know if anybody else ever really pulls
that tree, but I do think that it's very healthy to write that message
and think of it as a publication event. ]

Linus
--
From: Rafael J. Wysocki <rjw@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 8:42 pm 2008
On Thursday, 1 of May 2008, Linus Torvalds wrote:
>
> On Thu, 1 May 2008, Willy Tarreau wrote:
[--snip--]
> I do _not_ want to slow down development by setting some kind of "quality
> bar" - but I do believe that we should keep our quality high, not because
> of any hoops we need to jump through, but because we take pride in the
> thing we do.

Well, we certainly should, but do we always remember about it? Honest, guv?

> [ An example of this: I don't believe code review tends to much help in
> itself, but I *do* believe that the process of doing code review makes
> people more aware of the fact that others are looking at the code they
> produce, and that in turn makes the code often better to start with.

It may help directly, for example when people realize that they work on
conflicting or just related changes.

> And I think publicly announced git trees and -mm and linux-next are
> great partly because they end up doing that same thing. I heartily
> encourage submaintainers to always Cc: linux-kernel when they send me a
> "please pull" request - I don't know if anybody else ever really pulls
> that tree, but I do think that it's very healthy to write that message
> and think of it as a publication event. ]

I totally agree with that.

Still, the issue at hand is that
(1) The code merged during a merge window is somewhat opaque from the tester's
point of view and if a regression is found, the only practical means to
figure out what caused it is to carry out a bisection (which generally is
unpleasant, to put it lightly).
(2) Many regressions are introduced during merge windows (relative to the
total amount of code merged they are a few, but the raw numbers are
significant) and because of (1) the process of removing them is generally
painful for the affected people.
(3) The suspicion is that the number of regressions introduced during merge
windows has something to do with the quality of code being below
expectations, that in turn may be related to the fact that it's being
developed very rapidly.

My opinion is that we need to solve this issue sooner rather than later and so
the question is how we are going to approach that.

Thanks,
Rafael
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 9:19 pm 2008
On Thu, 1 May 2008, Rafael J. Wysocki wrote:
>
> > I do _not_ want to slow down development by setting some kind of "quality
> > bar" - but I do believe that we should keep our quality high, not because
> > of any hoops we need to jump through, but because we take pride in the
> > thing we do.
>
> Well, we certainly should, but do we always remember about it? Honest, guv?

Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like
process generates quality?

And I dislike how people try to conflate "quality" and "merging speed" as
if there was any reason what-so-ever to believe that they are related.

You (and Andrew) have tried to argue that slowing things down results in
better quality, and I simply don't for a moment believe that. I believe
the exact opposite.

The way to get good quality is not to put barriers up in front of
developers, but totally the reverse - by helping them. And yes, that help
can quite possibly be in the form of "process" - by making things more
streamlined, and by having people not have to waste time on wondering
where they should send things etc.

But the notion that we should even _try_ to aim to slow things down, that
one I find unlikely to be true, and I don't even understand why anybody
would find it a logical goal?

Of course, you will have fewer new bugs if you have fewer changes. But
that's not a goal, that's a tautology and totally uninteresting. A small
program is likely to have fewer bugs, but that doesn't make something
small "better" than something large that does more.

Similarly, a stagnant development community will introduce new bugs more
seldom. But does that make a stagnant one better than a vibrant one? Hell
no.

So what I'm arguing against here is not that we should aim for worse
quality, but I'm arguing against the false dichotomy of believing that
quality is incompatible with lots of change.

So if we can get the discussion *away* from the "let's slow things down",
then I'm interested. Because at that point we don't have to fight made-up
arguments about something irrelevant.

Linus
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 9:40 pm 2008
On Wed, 30 Apr 2008, Linus Torvalds wrote:
>
> You (and Andrew) have tried to argue that slowing things down results in
> better quality,

Sorry, not Andrew. DavidN.

Andrew argued the other way (quality->slower), which I also happen to not
necessarily believe in, but that's a separate argument.

Nobody should ever argue against raising quality.

The question could be about "at what cost"? (although I think that's not
necessarily a good argument, since I personally suspect that good quality
code comes from _lowering_ costs, not raising them).

But what's really relevant is "how?"

Now, we do know that open-source code tends to be higher quality (along a
number of metrics) than closed source code, and my argument is that it's
not because of bike-shedding (aka code review), but simply because the
code is out there and available and visible.

And as a result of that, my personal belief is that the best way to raise
quality of code is to distribute it. Yes, as patches for discussion, but
even more so as a part of a cohesive whole - as _merged_ patches!

The thing is, the quality of individual patches isn't what matters! What
matters is the quality of the end result. And people are going to be a lot
more involved in looking at, testing, and working with code that is
merged, rather than code that isn't.

So _my_ answer to the "how do we raise quality" is actually the exact
reverse of what you guys seem to be arguing.

IOW, I argue that the high speed of merging very much is a big part of
what gives us quality in the end. It may result in bugs along the way, but
it also results in fixes, and lots of people looking at the result (and
looking at it in *context*, not just as a patch flying around).

And yes, maybe that sounds counter-intuitive. But hey, people thought open
source was counter-intuitive. I spent years explaining why it should work
at all!

Linus
--
From: David Miller <davem@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 9:51 pm 2008
From: Linus Torvalds
Date: Wed, 30 Apr 2008 18:40:39 -0700 (PDT)

> IOW, I argue that the high speed of merging very much is a big part of
> what gives us quality in the end. It may result in bugs along the way, but
> it also results in fixes, and lots of people looking at the result (and
> looking at it in *context*, not just as a patch flying around).

This is a huge burden to put on people.

The more broken stuff you merge, the more people are forced to track
these problems down so that they can get their own work done.

It punishes people who do put forth the effort to let new changes cook
properly, before pushing, and thus avoid putting turds into the tree.

You really have to think about the ramifications of this system.
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 10:01 pm 2008
On Wed, 30 Apr 2008, David Miller wrote:
> From: Linus Torvalds
> Date: Wed, 30 Apr 2008 18:40:39 -0700 (PDT)
>
> > IOW, I argue that the high speed of merging very much is a big part of
> > what gives us quality in the end. It may result in bugs along the way, but
> > it also results in fixes, and lots of people looking at the result (and
> > looking at it in *context*, not just as a patch flying around).
>
> This is a huge burden to put on people.
>
> The more broken stuff you merge, the more people are forced to track
> these problems down so that they can get their own work done.

I'm not saying we should merge crap.

You can take any argument too far, and clearly it doesn't mean that we
should just accept *anything*, because it will magically be gilded by its
mere inclusion into the kernel. No, I'm not going to argue that.

But I do want to argue against the notion that the only way to raise
quality is to do it before it gets merged. It's often better to merge
early, and fix the issues the merge brings up early too!

Release early, release often. That was the watch-word early in Linux
kernel development, and there was a reason for it. And it _worked_. Did it
mean "release crap, release anything"? No. But it did mean that things got
lots more exposure - even if those "things" were sometimes bugs.

Linus
--
From: David Newall <davidn@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 12:03 am 2008
David Miller wrote:
> We need to slow down the merging, we need to review things
> more, we need people to test their fucking changes!

Yes. The Linux process is becoming unreliable. Newly "stable" versions
have stability problems. The development process looks childish.
Seasoned developers say not to worry, that the process works. I do
worry. BSD seems more attractive, and it may even be worth the
considerable effort to switch my entire client-base. Linux was lucky to
gain the foothold that it did: traditionally, BSD had a better system
with a less restrictive licence, so it is surprising that manufacturers
chose to go with Linux. BSD still has a less restrictive licence and
when mainstream press becomes interested in Linux's quality problems
its adoption will fall. BSD is still a good, maybe even better, option.

Linus, this is your baby and so it's your problem. Only you have the
influence to change things.
--
From: David Miller <davem@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 12:18 am 2008
From: David Newall
Date: Wed, 30 Apr 2008 13:33:29 +0930

> Yes.

Please don't use my posting as an opportunity to portray
BSD as the best thing since sliced bread.

We're having ONE bad merge window, we're facing the problem
head on, RIGHT NOW, to prevent it in the future. It's
not a severe ongoing issue as you portray it to be.
--
From: David Newall <davidn@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 9:04 am 2008
David Miller wrote:
> We're having ONE bad merge window, we're facing the problem
> head on, RIGHT NOW, to prevent it in the future. It's
> not a severe ongoing issue as you portray it to be.

No. The problem is more than just a bad merge window. There is poor or
non-existent review; frequent "regressions"; release of kernels as
stable when they are not. There is resentment and resistance to even
acknowledging these problems. Take, as an example, the desire to NOT
record who gives good code and who gives bugs: that one clearly hit a
nerve, which it should not have except from people who feel guilty.

I don't claim BSD to be perfect, but it appears to have a consistently
good quality. Old Linux kernels also have that; new ones not so.
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 10:51 am 2008
On Wed, 30 Apr 2008, David Newall wrote:
>
> I don't claim BSD to be perfect, but it appears to have a consistently
> good quality.

Lol. You should try VMS. Now *there* was a stable system.

Oh, but it didn't actually make any progress, did it?
The fact is, we're merging a lot. It comes from having a lot of
development. If you don't want that, then you're a fool - because you
aren't looking at the long term.

> Old Linux kernels also have that; new ones not so.

Can you point to any actual stability problem?
The problem under discussion is the fact that some people are unhappy
because we had some merge trouble. The fact is, the problems got fixed in
a few days. And yes, we will probably have to make Ingo follow the
rules that pretty much everybody else also follows, and no, it's not going
to solve all problems either - the fundamental issue is that we are just
too damn good at development.

And that's not a big problem in my view, as long as we are also able
to handle the _result_ of that flood of patches. Which, quite frankly, we
are.

DavidN, you just have an agenda, and you think that mentioning BSD as some
kind of shining example of goodness is a good way to reach that agenda. It
isn't. It just shows that you don't understand the issue, and that you
think that "threatening" developers by saying you'll switch is a great way
to make PR.

But you know what? I really don't care one _whit_ what you do. You can
switch to Vista for all I care, and I really don't mind. All I care about
is doing a good job technically.

And you just show that you don't have a clue what you are talking about.
If you want stable kernel, don't follow the current -git tree. Don't mind
the fact that in two weeks we merge

6672 files changed, 373817 insertions(+), 285901 deletions(-)

and instead look at something like the enterprise kernels or other tree
that lags the development tree by half a year or more exactly _because_
they care about stable, not development.

In short: what do you think the git tree is? Is it something that should
prioritize good development, or is it something that should worry about
you making inane arguments? Ask yourself that.

Linus
--
From: David Newall <davidn@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 2:21 pm 2008
Linus Torvalds wrote:
> Can you point to any actual stability problem?
>

Well of course. So could you because they are a matter of public record
on the list. Don't pretend otherwise. Just to give you some recent,
personal bugaboos, and not even drawing on the many hundreds of relevant
messages on LKML each month:

1. Out of memory, caused by apparent leak somewhere, resulting in
machine effectively hanging for a minute or two (massive disk i/o)
culminating in termination of one or more processes. (For what it's
worth: 512MB, no swap.) Problem takes a couple of days to develop
(hence I suspect a leak.) This is running only Firefox, Thunderbird and
Evince, plus whatever xubuntu wants. Restarting the killed
application(s) causes the problem to recur. Restarting X doesn't help.
Killing almost all processes also doesn't help. Reboot is required.
This problem seems not to be in 2.6.17, but is in 2.6.22 (plus whatever
patches xubuntu use) and 2.6.23. I'm still testing 2.6.25, but probably
going to have to abandon it and go backwards, because...

2. Suspend to disk doesn't resume properly (two out of three times.)
System comes back but X has severe weirdness. Draws frames and title
bar, but not window contents. Text-mode is just as bad: Screen is blank
(erased font table, perhaps?) Subsequent suspend to disk doesn't resume
at all.

Note the wide range of kernels exhibiting problem 1. I don't even want
to think about problem 2 at this stage; I just want to stop having to
reboot to reclaim memory, especially when a mate who does Windows
training visits!

> the fundamental issue is that we are just
> too damn good at development.
>

Not so good. The process is flawed. Inadequate testing. Inadequate
review. This has been mentioned by others, so you know I'm not making
it up. The real fundamental issue is that people are too keen to
release and don't appear to care enough about correctness.

> you think that mentioning BSD as some
> kind of shining example of goodness is a good way to reach that agenda.

Yes, BSD does seem to be a shining example of goodness, but I didn't
mention it because I think people should switch. I did so to warn of
competition, to say that the world does not owe Linux a second chance
and isn't going to give it one. It's pointless to debate the relative
merits of the two systems because, aside from the kernel, they are
identical; and there's little that matters between the kernels, other
than one appears to have a careful, robust and professional development
process. Make no mistake about this point: I'm not saying that BSD is
better, rather that Linux cannot lose credibility and survive.

> But you know what? I really don't care one _whit_ what you do. You can
> switch to Vista for all I care, and I really don't mind. All I care about
> is doing a good job technically.
>
Sadly, you're doing a bad technical job in certain, important areas.
You're pushing out buggy kernels and claiming that they're stable. This
can't continue. Attrition to BSD is the risk, not some threat that I'm
making.

> And you just show that you don't have a clue what you are talking about.
> If you want stable kernel, don't follow the current -git tree.

Why are you bringing up git trees (which I don't use)? I'm presently
plagued with a problem that's 2.6.22 or older, extending to at least
2.6.23 and maybe still current. I've said quite clearly that I'm
talking about "stable" kernels, yet you presume I mean the git tree.
Yet it's not the specifics of the problem I'm having that matters, it's
the systemic problems in Linux's development process.

I don't think I've anything to add unless the topic evolves in a
direction that asks what should be changed. I'm posting this only
because I want on record the answer to the question about actual
stability problems.
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 2:27 pm 2008
On Thu, 1 May 2008, David Newall wrote:
>
> Why are you bringing up git trees (which I don't use)? I'm presently
> plagued with a problem that's 2.6.22 or older, extending to at least
> 2.6.23 and maybe still current.

Ok, *PLONK*.

You're on an old kernel, don't know if your problem is fixed, and ask us
to slow down development.

That makes sense.

Go away.
Linus
--
From: Chris Friesen <cfriesen@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 3:06 pm 2008
Linus Torvalds wrote:
>
> On Thu, 1 May 2008, David Newall wrote:
>
>>Why are you bringing up git trees (which I don't use)? I'm presently
>>plagued with a problem that's 2.6.22 or older, extending to at least
>>2.6.23 and maybe still current.
>
>
> Ok, *PLONK*.
>
> You're on an old kernel, don't know if your problem is fixed, and ask us
> to slow down development.
>
> That makes sense.
>
> Go away.

He did say that he was testing 2.6.25, and that suspend-to-disk was
broken in 2.6.25.

Chris
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 3:13 pm 2008
On Wed, 30 Apr 2008, Chris Friesen wrote:
>
> He did say that he was testing 2.6.25, and that suspend-to-disk was
> broken in 2.6.25.

Neither of which had anything to do with the whole "slow down" argument.

If you have a bug, make a bug report, and push it, and make people aware
of it. But don't make it an argument for development to slow down.

Should we all stand around with our thumbs up our *ss because somebody has
a bug? Should the other developers just stop, because suspend-to-disk is
broken for somebody? Should everything come to a standstill because David
Newall doesn't like how there are other things going on that are
independent of _his_ problems?

Do you really believe that?

Linus
--
From: David Newall <davidn@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 3:22 pm 2008
Linus Torvalds wrote:
> Should everything come to a standstill because David
> Newall doesn't like how there are other things going on that are
> independent of _his_ problems?
>

You're being a nasty piece of work this day, Linus, and you're fibbing
by mischaracterising what I said which, by the way, included, "it's not
the specifics of the problem I'm having that matters". You're taking
this far too personally. Get a grip.
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 3:42 pm 2008
On Thu, 1 May 2008, David Newall wrote:
>
> You're taking this far too personally.

Umm. If you didn't want a personal opinion, why did you Cc me in the first
place then, and ask for my input?

I gave my input to you. I think your arguments are ludicrous, to the point
of being totally idiotic. You complain how I don't release kernels that
are stable, but without any suggestions on what the issue might be, apart
from apparently me merging too much and making too many releases.

But do you really expect me to stop merging, or hold up releases that fix
hundreds of issues, just because there are other issues pending? Do you
really think development can be stopped? Trust me, we've tried. Every
time, it just leads to worse problems when the floodgates are then opened.

And yes, there is a solution: don't develop so much. Don't allow thousands
of developers to be involved. Do a small core group, and make development
so hard or inconvenient that you only have a few tens of people who write
code, and vet them and force them to jump through hoops when adding new
features (or fixing old ones, for that matter).

And yes, that *does* result in a "stable" system. Never mind that it's
stable for all the wrong reasons, and generally doesn't actually work well
across a dynamic environment (whether the hardware base below or user
space above).

See? This is why I think your arguments are so silly and misguided.

But if you actually have real constructive ideas on things to actually
*do*, please do mention them. We've changed our models over time, several
times, exactly because we've searched for better ways to do things. But do
realize that

(a) we can't just stop, or even really slow down. We can only try to
regulate and to some degree direct the flood, not hold it up for any
particular issue.

(b) We do have process in place, and it may not be perfect, but I doubt
anything is, and what we do have actually has evolved over the years.

And that's not just my process (ie "two-week merge window, followed
by about 6-8 weeks of fixups"), but the whole process both before and
after it (Andrew and now linux-next in front of it, and stable kernel
tree and the vendors after it).

(c) the "big picture" discussion is separate from individual issues. If
you want your suspend-to-disk issue resolved, or a memory leak
solved, you don't solve those by trying to complain about other parts
of the system, that are totally separate.

The global flow of patches and releases is not something that we can
hold up for _any_ of your individual problems. I do end up delaying
releases for really core things, so individual problems do obviously
affect (for example) the release timing. But the solution to them is
not in complaining about slowing down development, it is about
actually trying to engage the developers of *that* feature in *that*
particular bug.

And finally, trust me, if you want to have people care about your
problems, the last thing you want to do is say "I might switch to BSD".
Because quite frankly, I really don't care. People who think that threats
like that work in any productive way can go screw themselves. I'll flame
idiots like that, and my likelihood of helping people because they think
they hold a gun to my head is almost zero.

Linus
--
From: Andrew Morton <akpm@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 9:31 pm 2008
On Wed, 30 Apr 2008 18:19:56 -0700 (PDT) Linus Torvalds wrote:
> You (and Andrew) have tried to argue that slowing things down results in
> better quality,

eh? I argued the opposite: that increasing quality will as a side-effect
slow things down.

If we simply throttled things, people would spend more time watching the
shopping channel while merging smaller amounts of the same old crap.
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: Apr 30, 9:43 pm 2008
On Wed, 30 Apr 2008, Andrew Morton wrote:
>
> eh? I argued the opposite: that increasing quality will as a side-effect
> slow things down.

Yes, my bad, I realized that when I read through my message and already
sent out a fix for my buggy email ;)

> If we simply throttled things, people would spend more time watching the
> shopping channel while merging smaller amounts of the same old crap.

I agree totally. And although some of the time would probably _also_ be
spent on the frustrating crap that was designed to do the throttling, that
isn't much more productive than watching the shopping channel would be ...

Linus
--
From: Rafael J. Wysocki <rjw@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 6:59 am 2008
On Thursday, 1 of May 2008, Linus Torvalds wrote:
>
> On Wed, 30 Apr 2008, Andrew Morton wrote:
> >
> > eh? I argued the opposite: that increasing quality will as a side-effect
> > slow things down.
>
> Yes, my bad, I realized that when I read through my message and already
> sent out a fix for my buggy email ;)
>
> > If we simply throttled things, people would spend more time watching the
> > shopping channel while merging smaller amounts of the same old crap.
>
> I agree totally. And although some of the time would probably _also_ be
> spent on the frustrating crap that was designed to do the throttling, that
> isn't much more productive than watching the shopping channel would be ...

Okay, so what exactly are we going to do to address the issue that I described
in the part of my last message that you skipped?

Rafael
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 11:26 am 2008
On Thu, 1 May 2008, Rafael J. Wysocki wrote:
>
> Okay, so what exactly are we going to do to address the issue that I described
> in the part of my last message that you skipped?

Umm. I don't really see anything to say. You said:

> Still, the issue at hand is that
> (1) The code merged during a merge window is somewhat opaque from the tester's
> point of view and if a regression is found, the only practical means to
> figure out what caused it is to carry out a bisection (which generally is
> unpleasant, to put it lightly).
> (2) Many regressions are introduced during merge windows (relative to the
> total amount of code merged they are a few, but the raw numbers are
> significant) and because of (1) the process of removing them is generally
> painful for the affected people.
> (3) The suspicion is that the number of regressions introduced during merge
> windows has something to do with the quality of code being below
> expectations, that in turn may be related to the fact that it's being
> developed very rapidly.

And quite frankly, (2) and (3) are both: "merge windows introduce new
bugs", and that's such an uninteresting tautology that I'm left
wordless. And (1) is just a result of merging lots of stuff.

Of course the new bugs / regressions are introduced during the merge
window. That's when we merge new code. New bugs don't generally happen
when you don't get new code.

And of course finding bugs is always painful to everybody involved.

And of course the bugs indicate something about the quality of code
being merged. Perfect code wouldn't have bugs.

So what you are stating isn't interesting, and isn't even worthy of
discussion. The way you state it, the only answer is: don't take new
code, then. That's what your whole argument always seems to boil down
to, and excuse me for (yet again) finding that argument totally
pointless.

So let me repeat:

(1) we have new code. We always *will* have new code, hopefully. A few
million lines per year.

If you don't accept this, I don't have anything to say.

(2) we need a merge window. That is a direct result not of wanting to
have lots of code at the same time, but of the _reverse_ issue: we
want to have times of relative calm.

And again, if you continue to see the merge window as the
"problem", rather than as the INEVITABLE result of wanting to have
a calm period, there's no point in talking to you.

(3) Ergo, there's a very fundamental and basic and inescapable result:
we absolutely _will_ have times when we get lots and lots of new
code.

So these are not "problems". They are *facts*. Stating them as
problems is stupid and pointless. I'm not going to discuss this with
you if you cannot get over this.

So please accept the facts.

Once you accept the facts, you can state the things you can change. But
the things you cannot change is the merge window, and the fact that we
get a lot of new code at a high rate (where the merge window will
inevitably compress that rate, so that we have _another_ window where
the rate is lower).So stop arguing against facts, and start arguing about other things that
can be argued about. That's all I'm saying.

Linus
--
From: Rafael J. Wysocki <rjw@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 1:09 pm 2008
On Thursday, 1 of May 2008, Linus Torvalds wrote:
>
> On Thu, 1 May 2008, Rafael J. Wysocki wrote:
> >
> > Okay, so what exactly are we going to do to address the issue that I described
> > in the part of my last message that you skipped?
>
> Umm. I don't really see anything to say. You said:
>
> > Still, the issue at hand is that
> > (1) The code merged during a merge window is somewhat opaque from the tester's
> > point of view and if a regression is found, the only practical means to
> > figure out what caused it is to carry out a bisection (which generally is
> > unpleasant, to put it lightly).
> > (2) Many regressions are introduced during merge windows (relative to the
> > total amount of code merged they are a few, but the raw numbers are
> > significant) and because of (1) the process of removing them is generally
> > painful for the affected people.
> > (3) The suspicion is that the number of regressions introduced during merge
> > windows has something to do with the quality of code being below
> > expectations, that in turn may be related to the fact that it's being
> > developed very rapidly.
>
> And quite frankly, (2) and (3) are both: "merge windows introduce new
> bugs", and that's such an uninteresting tautology that I'm left
> wordless.

Perhaps if they introduced fewer bugs, all of that would be less frustrating to
people who get hit by them, especially by two or more at a time. Everyone
seems to be fine with that until it happens to him personally (like it happened
to David).

> And (1) is just a result of merging lots of stuff.
>
> Of course the new bugs / regressions are introduced during the merge
> window. That's when we merge new code. New bugs don't generally happen
> when you don't get new code.

I obviously agree with that. The question is, however, if we can decrease the
number of bugs introduced during merge windows and you seem to be saying
that no, we can't. Which is disappointing.

> And of course finding bugs is always painful to everybody involved.
>
> And of course the bugs indicate something about the quality of code
> being merged. Perfect code wouldn't have bugs.
>
> So what you are stating isn't interesting, and isn't even worthy of
> discussion. The way you state it, the only answer is: don't take new
> code, then. That's what your whole argument always seems to boil down
> to, and excuse me for (yet again) finding that argument totally
> pointless.

I have never said you shouldn't take new code at all. That's not what I'm
saying and please don't paint me this way.

I see a problem in that you get patches that you shouldn't have got because
they are unfinished and not well thought through. They introduce regressions
which are only possible to find using bisection because of the amount of code
merged at a time and that's frustrating.

You seem to be regarding this as a necessity, but I'm really not convinced
that you're right in that.

> So let me repeat:
>
> (1) we have new code. We always *will* have new code, hopefully. A few
> million lines per year.
>
> If you don't accept this, I don't have anything to say.
>
> (2) we need a merge window. That is a direct result not of wanting to
> have lots of code at the same time, but of the _reverse_ issue: we
> want to have times of relative calm.
>
> And again, if you continue to see the merge window as the
> "problem", rather than as the INEVITABLE result of wanting to have
> a calm period, there's no point in talking to you.

However, the width of the merge window is not a predetermined thing and might
be adjusted, for example. Other things might be changed too.

> (3) Ergo, there's a very fundamental and basic and inescapable result:
> we absolutely _will_ have times when we get lots and lots of new
> code.

But that need not include obviously broken patches.

> So these are not "problems". They are *facts*. Stating them as
> problems is stupid and pointless. I'm not going to discuss this with
> you if you cannot get over this.
>
> So please accept the facts.
>
> Once you accept the facts, you can state the things you can change. But
> the things you cannot change is the merge window, and the fact that we
> get a lot of new code at a high rate (where the merge window will
> inevitably compress that rate, so that we have _another_ window where
> the rate is lower).

The problem is the (relatively small) fraction of patches pushed to you that
is broken. Some patches are obviously broken, some of them are just not
tested well enough. The result is pretty much the same in either case.

Now, the question is if we can get rid of that fraction by adjusting the
process somehow. You're arguing that we can't and so be it. [This is your
opinion and BTW there's nothing allowing me to call that unreasonable or saying
that you use made up arguments or something like this.]

My opinion is that we could at least try to do something about it. linux-next
is probably a step in the right direction, though time will tell. I'm afraid,
though, that I personally can't do much more than I've been doing already to
improve things.

> So stop arguing against facts, and start arguing about other things that
> can be argued about. That's all I'm saying.

The message that started this whole thread was not from me and I believe
it was sent for a reason. So the fact is that at least some people lose their
patience over the current handling of merge windows. And I'm not sure that's
necessary.

Thanks,
Rafael
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 1:41 pm 2008

On Thu, 1 May 2008, Rafael J. Wysocki wrote:
>
> I obviously agree with that. The question is, however, if we can decrease the
> number of bugs introduced during merge windows and you seem to be saying
> that no, we can't. Which is disappointing.

No, that's not what I'm saying.

What I *am* saying is that as long as you concentrate on "merge window"
and "lots of code", you're concentrating not on the problems, but on the
facts of life. You can't change facts, and even trying is pointless.

What you should concentrate on is not how many patches there are during
the merge window (because we can't do anything about that) or the fact
that they all happen in a short timeframe, but about quality of patches
_regardless_ of merge window.

So if you can make an argument that does not even *try* to change the fact
that
- we have lots of patches
and
- we have a merge window
and
- merging patches causes bugs

but argues about quality from some other standpoint, then I can start to
believe that you have a point.

But as long as you argue about the fact that we merge a lot of stuff, and
that bugs come in during the merge window, I'm not interested. Arguing
about facts is totally non-productive.

And as long as people keep saying "let's not merge broken patches" or "we
should never have bugs", I'll just ignore those kinds of idiotic
statements. They aren't even arguments, they are wishes, and they are
unrealistic. If we knew they were broken and had bugs, of course we
wouldn't merge them.

In short - I'm simply not interested in what you _wish_ reality was.
People need to first acknowledge reality, and _then_ they may have
solutions.

So the reality is:
- we do have tons of patches, and they need to be merged (and furiously)

- there *will* be bugs. And the number of bugs will inevitably be
relative to the number of patches. There is no "perfect", and anybody
who argues for a lower number of bugs by lowering the number of patches
is an idiot in my book.

- there *will* be releases, even in the presence of bugs, because holding
everything up is simply not an option.

Those are the things that we have to accept. Anything else is just
dreaming.

Now, what part _can_ we improve and still be realistic?

We can try to improve average quality - the number of bugs will *still* be
relative to the size of the changes (no getting away from that), but we
may be able to lower the absolute number of bugs. But not to zero!

And that "not to zero" is IMPORTANT. If you think you can aim for zero
bugs, I'm simply not interested in discussing it with you. You live in a
different universe, and we're not talking about the same reality.

And if you're not being realistic, then why the hell would I believe that
your solutions are realistic? I'd rather take some pills and talk to the
little purple man living under the deck in my back yard, because at least
he's amusing, even if he doesn't make much sense either.

And I'm also not in the *least* interested in arguments like "We should
just improve our quality of patches".

Of course everybody wishes for that. Again, it's not an argument, it's
just an unrealistic wish, unless you can actually give a suggestion of a
process or other thing that would actually seem to reach it (without
assuming other impossible things like "we need more time" or "we need
more people who just spend their day looking for bugs").

Same goes for "we should all just spend time looking at each other's
patches and trying to find bugs in them". That's not a solution, that's a
drug-induced dream you're living in. And again, if I want to discuss
dreams, I'd rather talk about my purple guy, and the bad things he does to
the hedgehog that lives next door.

So do you have any productive *suggestions*? Some that involve more than
"let's write less code" or "let's just review each other's patches more".

Linus
--
From: Linus Torvalds <torvalds@...> Subject: Re: Slow DOWN, please!!! Date: May 1, 2:30 pm 2008

On Thu, 1 May 2008, Linus Torvalds wrote:
>
> In other words: do people have realistic ideas for how to make others
> spend _more_ time looking at patches? And not just _wishing_ people did
> that?

Just to throw out an example:

- make a "Random pending patch of the day" google gadget.

I know that's a bit out there, and I'm not sure the google gadget thing is
realistic, but I bet I'm not the only one who ends up using the google
homepage all the time. A button that says "this patch looks ok", "this
patch looks crap", or "I dunno, give me another one to look at" might be a
fun game that would encourage people to look at a couple of patches a day.

You get five thousand people doing that occasionally (not every day, but
maybe when they are bored and look for something more rewarding than
trying to find bad music videos on youtube), and maybe you'd actually get
feedback on patches.

Make it pick a random commit that is in linux-next but hasn't been merged
into main -git yet.

Crazy? Probably. But at least it fits my notion of "let's not just wish
people did more patch commentary" thing.

IOW, if people are really serious about coming up with ways to improve
code quality, I really think it needs to be about _practical_ things that
can fit in our flow or can be extensions to it, not just wishing for
better quality.

"If wishes were horses, beggars would ride"

Linus
--
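
The "random pending patch" idea in the last message comes down to picking uniformly from the commits that are in linux-next but not yet in mainline. A minimal sketch in Python, assuming the two lists of commit ids have already been gathered somehow (for instance with `git rev-list` on each branch). The function name `pick_pending_commit` and the sample ids are hypothetical; nothing like this code appears in the thread.

```python
import random
from typing import Optional, Sequence


def pick_pending_commit(
    next_commits: Sequence[str],
    mainline_commits: Sequence[str],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return a random commit that is in linux-next but not yet in mainline.

    Both arguments are lists of commit ids (hypothetical input; in practice
    they might come from running `git rev-list` on each branch).
    Returns None when nothing is pending.
    """
    if rng is None:
        rng = random.Random()
    merged = set(mainline_commits)
    # Keep linux-next order, drop everything mainline already has.
    pending = [c for c in next_commits if c not in merged]
    if not pending:
        return None
    return rng.choice(pending)


if __name__ == "__main__":
    # Made-up commit ids, for illustration only.
    linux_next = ["a1", "b2", "c3", "d4"]
    mainline = ["a1", "c3"]
    print(pick_pending_commit(linux_next, mainline))  # prints b2 or d4
```

The "looks ok" / "looks crap" / "give me another" buttons would sit on top of a picker like this; how votes are stored and reported is left open here, as it is in the message.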

Typo
The concept of a regular "merge window" was first discussed in July of 2005 with the release of the 2.6.24-rc4 kernel
That bold "2" should be "1", I guess.
Linus is right
Torvalds is right. There's no need to slow down the merges. It's the opposite. The development needs to go faster. There are lots of things that need to be merged and take too long. As far as I know Linus from his interviews, he is a practical/pragmatic guy. So instead of complaints, people should propose practical alternatives.
So where is this gadget?
Or a Gnome (sorry KDE) applet to do the same... Could be fun.
nash
[Who actually enjoys reading patches]
We don't want to slow down
We don't want to slow down, because we want the kernel to evolve fast and get new cool stuff.
But at the same time, David Miller has a valid reason to complain. We must remember that these are people, and it is a lot of work for them.
Someone write the widget and
Someone write the widget and hook it up to Amazon's Mechanical Turk
virtual machines to the rescue?
With every commit, build a new machine. If it compiles and boots, accept the patch; if not, return a list of errors to the developer. 3500 commits per week works out to about 21 rebuilds / hour, or one roughly every 3 minutes. Distributed workload would make that manageable. Give the power to commit regardless of test results to only the highest level folks.
Over time, add in optional/mandatory regression tests...
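
The rebuild rate quoted in the comment above can be checked in a few lines of Python. This is only a sanity check of the arithmetic, assuming one build per commit spread evenly around the clock; the 3500 commits per week figure is the commenter's, and the function name is made up.

```python
from typing import Tuple


def rebuild_rate(commits_per_week: float) -> Tuple[float, float]:
    """Return (builds per hour, minutes between builds) at one build per commit."""
    hours_per_week = 7 * 24  # 168 hours
    per_hour = commits_per_week / hours_per_week
    minutes_between = 60 / per_hour
    return per_hour, minutes_between


if __name__ == "__main__":
    per_hour, gap = rebuild_rate(3500)
    # 3500 / 168 is about 20.8 builds per hour, i.e. one every 2.88 minutes.
    print(f"{per_hour:.1f} builds/hour, one every {gap:.1f} minutes")
```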