If the goal for 2.6.20 was to be a stable release (and it was), the goal
for 2.6.21 is to have just survived the big timer-related changes and some
of the other surprises (just as an example: we were apparently unlucky
enough to hit what looks like a previously unknown hardware erratum in one
of the ethernet drivers that got updated, etc.).
So it's been over two and a half months, and while it's certainly not the
longest release cycle ever, it still dragged out a bit longer than I'd
have hoped for and it should have. As usual, I'd like to thank Adrian (and
the people who jumped on the entries Adrian had) for keeping everybody on
their toes with the regression list - there's a few entries there still,
but it got to the point where we didn't even know if they were real
regressions, and delaying things further just wasn't going to help.
So the big change during 2.6.21 is all the timer changes to support a
tickless system (and even with ticks, more varied time sources). Thanks
(when it no longer broke for lots of people ;) go to Thomas Gleixner and
Ingo Molnar and a cadre of testers and coders.
Of course, the timer stuff was just the most painful and core part (and
thus the one that I remember most): there are a lot of changes all over. The
appended changelog is just for the fixes since -rc7, so it doesn't look
very impressive; the full changes since 2.6.20 are obviously a *lot*
bigger (and you're better off reading the individual -rc changelogs).
We now return you to your regular scheduler discussions,
Linus
---
Akinobu Mita (1):
fault injection: add entry to MAINTAINERS
Alan Cox (3):
exec.c: fix coredump to pipe problem and obscure "security hole"
pata_sis: Fix oops on boot
[SPARC] openprom: Switch to ref counting PCI API
Alexey Dobriyan (1):
paride drivers: initialize spinlocks
Alexey Kuznetsov (1):
[NETLINK]: Infinite recursion in netlink.
Andi Kleen (5):
x86: Fix gcc 4.2 _proxy_pda workaround
...
Number of different known regressions compared to 2.6.20 at the time
of the 2.6.21 release:
14
Number of different known regressions compared to 2.6.20 at the time
of the 2.6.21 release that were first reported in March or earlier:
8
Number of different known regressions compared to 2.6.20 at the time
of the 2.6.21 release with patches available at the time of the 2.6.21
release [1]:
3
What I will NOT do:
Waste my time with tracking 2.6.22-rc regressions.
We have an astonishing amount of -rc testers, but obviously not the
developer manpower for handling them.
If we would take "no regressions" seriously, it might take 4 or 5 months
between releases due to the lack of developer manpower for handling
regressions. But that should be considered OK if avoiding regressions
was considered more important than getting as quick as possible to the
next two week regression-merge window.
But releasing with so many known regressions is insulting for the many
people who spent their time testing -rc kernels.
cu
Adrian
[1] http://lkml.org/lkml/2007/4/25/496
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
On Thu, 2007-04-26 at 06:08 +0200, Adrian Bunk wrote:
Adrian, please reconsider. Without you the issues I've reported (most
likely to the wrong people) would have been missed too.
And also keep in mind that it takes two to tango. If reporters can nail
down issues to single config options/patches, or go further by adding
printk's, the chances that things get fixed increase a lot.
Soeren.
--
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
-
On Thu, Apr 26, 2007 at 06:08:06AM +0200, Adrian Bunk wrote:
> What I will NOT do:
> Waste my time with tracking 2.6.22-rc regressions.
I seriously hope you'll reconsider. If you hadn't done this, things
would have been a *lot* worse imo.
But either way, thanks for doing what remains a really grotty job that
may not get you as many kernel groupies as rewriting the process
scheduler, but is just as (if not more) important.
Dave
--
http://www.codemonkey.org.uk
-
I sure hope you don't do this. Tracking these is tough, and I think you are doing a great job with it. No release will have no regressions, there's just too many different combinations of hardware and sometimes people don't have the time to test to see if their original report is even fixed or not. And some of them will get fixed with patches coming in the next kernel release, which will then be tracked down and added to the -stable releases. So if you can, please keep it up, if you think it's a thankless job, here's my hearty thanks for doing this work. It's really needed and I really appreciate it. thanks, greg k-h -
Fifthed here, Adrian. It could potentially become one of the best things to happen to the mainline release process (and I believe has already been worthwhile). Even if it takes a while for people to get on board, or some regressions slip through. And note, a release with regressions doesn't make your hard work useless -- you've still got the important who, when, how, etc. info that can be used in future, and it could serve as a "known issues for upgraders" document as well. -- SUSE Labs, Novell Inc. -
Adrian,
I understand your concerns, it's more and more common to see developers
considering their work worthless. But it's not. You should see the
current development model as a pipeline. What you feed at the input can
take some time to reach the output, and if we wait for the whole pipeline
to flush, more crap gets released. What is needed is a higher priority on
fixes for known regressions.
I find your summary above more readable than the large lists of
regressions. I think that you should reply to Linus' announcements with
something that short, starting from the known-with-patch,
known-for-more-than-1-month, and all-known-regressions. It may help Linus
focus even more on those. Also, while it will not prevent any release
with regressions, at least it will prevent such a stupid case as known
regressions with a patch available.
Also, check how many regressions you have reported and which have been
fixed during the -rc stage. You'll see your work really was useful.
Maybe Linus should agree to dedicate -final to known regressions only, to
force a check in this area? Whether or not all of them get fixed is not
the real problem, but at least we will not have any regressions with a
pending patch unapplied!
Please do continue that task if you have the time to do so!
Thanks,
Willy
-
Adding my voice to the chorus, I too hope you'll reconsider and continue doing a great job with the regressions lists. It's really useful! -- Jens Axboe -
A clarification:
I am aware that my work had some effect, and I am aware that my work
gets appreciated - there's no need for everyone to repeat this.
The point is: I'm not satisfied with the result.
Linus said 2.6.20 was a stable kernel. My impression was that at least
two of the regressions from my 2.6.20 regressions list should have been
fixed before 2.6.20.
They have both been fixed through -stable, but I also remember a quite
experienced kernel maintainer running into one of them after 2.6.20 was
released and spending half a day tracking it down - and my answer was
"known unfixed regression, first reported more than a month ago".
There is a conflict between Linus trying to release kernels every
2 months and releasing with few regressions.
Trying to avoid regressions might in the worst case result in an -rc12
and 4 months between releases. If the focus is on avoiding regressions
this has to be accepted.
And a serious delay of the next regression-merge window due to unfixed
regressions might even have the positive side effect of more developers
becoming interested in fixing the current regressions for getting their
shiny new regressions^Wfeatures faster into Linus' tree.
0 regressions is never realistic (especially since many regressions
might not be reported during -rc), but IMHO we could do much better than
what happened in 2.6.20 and 2.6.21.
These are just my personal opinions, and other people consider the
resulting 2.6.20 and 2.6.21 kernels OK.
I'm not satisfied with the result, and the world won't stop turning when
I'm not tracking 2.6.22-rc regressions.
cu
Adrian
-
Adrian,
Nevertheless, thanks for your efforts and time spent. You did a great
Nobody is satisfied with pending regressions. I can completely understand
your frustration, but you need to adjust your expectations on that as
well. Your regression lists are extremely useful, as they point folks like
me to the burning points. I try to follow LKML as far as I can, but I
have to admit that I occasionally go the easy way of marking 10000 mails
as read in one go after a week of travelling. I don't do this to my
That happens all the time. I have a dozen boxen around and I can't do
tests on all of them continuously. So trapping into some known regression
Yes, it's a conflict, but one that is unresolvable except we want to go
Maybe we need to coordinate changes better. 2.6.21 got three big updates
which affected suspend/resume - one of them is my fault. But fiddling out
which one of those - we had nested problems as well - makes it quite hard
to grok them in time, especially if they happen only on one reporter's
system.
Your reports are not invalid when Linus releases a final. They are still
there and worked on.
I believe we are getting better at that, and one reason for this is your
relentless effort to poke the experts^culprits to actually solve the
Please take a couple of days to reconsider. I personally would welcome it
if you carry on.
Thanks,
tglx
-
Yes. _If_ we had known how painful the timer changes would end up being, we'd probably have done them separately from everything else. That is the kind of thing that looks obvious in hindsight: merge stuff that is questionable and scary alone, and don't do anything else that release cycle. But while the timer code is obviously pretty core, I think everybody expected it to be a lot easier to merge (and it had existed as patches in various forms for some time). So we simply didn't know beforehand that it was going to cause the kinds of regressions it did cause (and in fact, some of the regressions were initially blamed on other things entirely - some of them looked like IO regressions). Water under the bridge. It's also easy to say in hindsight that something should have been merged separately and been given a release cycle all its own. Linus -
No.
Regressions _increase_ with longer release cycles. They don't get fewer.
The fact is, we have a -stable series for a reason. The reason is that the
normal development kernel can work in three ways:
(a) long release cycles, with two subcases:
(a1) huge changes (ie a "long development series"). This is what we
used to have. There's no way to even track the regressions,
because things just change too much.
(a2) keep the development limited, just stretch out the
"stabilization phase". This simply *does*not*work*. You might
want it to work, but it's against human psychology. People
get bored, and start wasting their time discussing esoteric
scheduler issues which weren't regressions at all.
(b) Short and staggered release cycle: keep changes limited (like a2),
but recognize when it gets counter-productive, and cut a release so
that the stable team can continue with it, while most developers (who
wouldn't have worked on the stable kernel _anyway_) don't get
frustrated.
And yes, we've gone for (b). With occasional "I'm not taking any half-way
No. You are ignoring the reality of development. The reality is that you
have to balance things. If you have a four-month release cycle, where
three and a half months are just "wait for reports to trickle in from
testers", you simply won't get _anything_ done. People will throw their
No. Quite the reverse.
2.6.20 was actually really good. Yes, it had some regressions, but I do
believe that it was one of the least buggy releases we've had. The process
_worked_.
True. However, it's sad that you feel like you can't bother to track them.
They were _very_ useful. The fact that you felt they weren't is just
because I think you had unrealistic expectations, and you think that the
stable people shouldn't have to have anything to do.
You're maintaining 2.6.16 yourself - do you not see what happens when you
decide that "zero regressions" is ...
-
<SCNR>
They get frustrated because they focussed on developing new features
instead of fixing regressions, and now it takes longer until their new
features get merged because no one fixed the regressions...
"wait for reports to trickle in from testers" is exactly the opposite of
our problem.
I started the regression lists originally to prove the fairy tale
"no one tests -rc kernels" some kernel developers spread as wrong.
Look at the facts:
8 out of 14 regressions in my current list were reported in March or earlier.
And for many regressions fixed it took several weeks until debugging
by a kernel developer was started.
We do not lack testers for getting bug reports quickly.
There's not a realistic chance for 0 regressions, and 4 months was
a worst case, not the average case.
And all the people who have to upgrade to 2.6.21 for getting an
important security fix run into a dozen known (and many unknown)
regressions.
If we had the developer manpower to get each reported regression
debugged and fixed [1] within three weeks, 2.6.21 might be in the shape
I would have liked it to be today.
But there are the three interdependent variables time, developer
manpower and quality. And little developer manpower and little time
result in a lower quality of release, which I'm not happy with.
Life has taught me that sometimes I'm right, sometimes I'm wrong, and
sometimes both sides have a possible solution. We might agree to
disagree, and you are the one whose opinion counts. I can only say that
I am not happy with the result, and that I will therefore not spend my
time on maintaining regression lists for 2.6.22 - and maintaining such
cu
Adrian
[1] "fixed" can also be e.g. "patch reverted" or "not a bug"
-
I would disagree: They get frustrated because they are blocked on some
small regression which is stopping a ton of other fixes, including
features people need (like new hardware support), from being released.
The "no regressions" model doesn't really work when you ask about the
greater good of the userbase. The goal of no regressions is great and the
regression lists for ATA were certainly very helpful, but the greater
good comes first.
Alan
-
"no regressions" is definitely not feasible.
14 known regressions, some of them not yet debugged at all, are
different from your "some small regression".
And look e.g. at the many (and non-trivial) changes between -rc7 and
-final, resulting in more than one report from people who were running
-rc7 without problems - and 2.6.21 doesn't work for them.
It's not a choice between "regressions don't matter" and "no regressions",
it's about the place in the area between these two extremes. I have my
opinions on what I want to expect from a stable Linux kernel, and other
cu
Adrian
-
Yes, but when were some of these regressions reported? Past a certain
point, I think it's reasonable to look at the regression, decide how many
people would be affected by it, and why it hadn't been noticed earlier,
and in some cases, decide that it's better to get this
Everyone is going to disagree to some extent; and their own comfort zone.
So a certain amount of compromise is always going to be necessary. Of
course, it's up to you to decide whether this has gone beyond the zone
where you aren't comfortable working with other people's development
style.
Regards,
- Ted
-
8 of them have been reported in March or earlier. [1]
Patches for 2 of these 8 were available at the time of the release. [2]
While the question whether to merge one of them into 2.6.21 was
controversial, the other one was not controversial.
For one of the bugs, it became obvious when someone looked at it after
the release of 2.6.21 that between the bug report on March 31st and the
cu
Adrian
[1] http://lkml.org/lkml/2007/4/26/2
[2] http://lkml.org/lkml/2007/4/25/496
[3] http://lkml.org/lkml/2007/4/26/496
[4] and although it turned out this specific regression was already
    fixed in 2.6.21, I hope you get my point
-
What Adrian was doing, or anybody in the future, is not going to be productive unless Linus holds the people who are responsible for the bug to get it fixed or report why they can't. My $.02 Steve -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) -
I can imagine that the dvb oops bugfix got held back to avoid some noise with dvb developers who claimed that they didn't get notified about how that patch got into the v4l-dvb repository (it didn't get reviewed by these people for weeks because it simply got ignored and some of them were aware of that) On the other side if I read bugreports like the following one: http://www.mail-archive.com/linux-dvb%40linuxtv.org/msg23028.html (My Nova-T 500 crashes quite regularly. My machine has been running for about a week and in that time has had 5 oopses.) It doesn't solve the Nova-T disconnects, but it at least solves that the machine doesn't go down when this happens till the driver gets fixed. I think it would have been nice to get that patch into it -- Markus Rechberger -
And the next kernel will go out with no list to warn users, and no to-do list for -stable. -- Bill Davidsen <davidsen@tmr.com> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot -
I agree. That's part of it. But part of it is not just the "it's 2 months
until the next release", part of it is also very much a "nothing has
happened in the normal kernel for the last 8 weeks, this is boring, so
I'll do my own exciting stuff".
So one _fundamental_ issue is that all the people who aren't directly
involved with a particular regression are simply bored. And bored is not
good. You want people productive - and that means that you want an active
development kernel that they can work with, since they aren't going to
help with the regressions anyway.
This is why the -stable tree is so useful. It's not only that users want
a stable tree - it allows people who do *not* have regressions on their
plate to not be stuck twiddling their thumbs - they can be on the regular
I'm saying that four months wouldn't even have *helped* in the case of
2.6.21.
Do you really think bugs get fixed faster just because there wasn't a
release? Quite the reverse. Bugs get _found_ faster thanks to a release
(simply because you tend to get more information thanks to more users),
giving the stable people more information, causing the bugs to be able to
be found and fixed _more_quickly_ in the stable release than if we had
waited for four months to release 2.6.21.
The two last weeks of 2.6.21-rc were almost entirely "wasted", apart from
getting the e1000 issue at least resolved (which was the reason for that
delay, so I'm not complaining - I'm just saying that not a lot of people
actually were able to _help_ with regressions during that time, and for
some of them, we might well be better off with more information about the
issue).
Did we fix other bugs? Yes. There was one long-time bug (since 2.6.15 or
something) that happened to come in during that time, and we had some
cleanups, we had MIPS bugs, we found some networking issues etc etc.
But I disagree. Quite often, having 5 people report the same thing is
actually more useful (because you see a pattern) ...
It is my time, and it's therefore my decision what I consider to make
sense spending it for.
Instead of continuing our discussion it makes more sense that we simply
accept that we disagree regarding when a kernel is ready for being
released instead of repeating the same arguments in a lengthy
cu
Adrian
-
Quite possible, given the (very) limited range of the bugs. Most people
just can't debug them. This isn't IMHO fundamentally wrong, and releasing
a ".0" kernel with known problems isn't fundamentally wrong either.
What is missing is easily accessible KNOWN_PROBLEMS information for
released kernels. While I think your work documenting etc. known
regressions is a very good thing, publishing it with the released kernels
(certainly .0 and next stable releases, perhaps "quite stable" rc
versions as well) would be ideal.
I think the process worked with 2.6.21 as well. Obviously no two releases
Anyway, I and many others are satisfied with the result. I think it's one
of the few "quite recent" things which are great improvements. Other such
things are using that weird git thing :-) and perhaps the most important
- the length of devel cycle under control (I mean the
We've got stable series. With KNOWN_PROBLEMS information, sysadmins can
decide if they can safely upgrade to .0 or if they have to wait for .123.
Pressing the responsible people to fix the problems in .123 (would) help
it greatly.
--
Krzysztof Halasa
-
Listing regressions like the following will most likely be zero help
for them when deciding whether to upgrade now or later (and BTW, the
latter might imply running a kernel with known security issues):
Subject    : acpi_pm clocksource loses time on x86-64
References : http://lkml.org/lkml/2007/4/17/143
Submitter  : Mikael Pettersson <mikpe@it.uu.se>
Handled-By : John Stultz <johnstul@us.ibm.com>
             Len Brown <lenb@kernel.org>
cu
Adrian
-
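For anyone wanting to post-process such list entries (say, for a KNOWN_PROBLEMS file or for per-developer reminders), here is a rough sketch of a parser for the Subject/References/Submitter/Handled-By fields quoted above. It is only an illustration: the one-field-per-line layout with indented continuation lines is an assumption about how the entry looked before the archive flattened it, and the names `Regression`, `parse_entry` and `SAMPLE` are made up for this example, not part of any existing tool.

```python
import re
from dataclasses import dataclass, field

# Field names taken from the entry quoted in the thread; the line-oriented
# layout (one "Name : value" per line, continuation lines without a field
# name) is an assumption.
FIELD_RE = re.compile(r"^(Subject|References|Submitter|Handled-By)\s*:\s*(.*)$")


@dataclass
class Regression:
    """Hypothetical container for one regression-list entry."""
    subject: str = ""
    references: list = field(default_factory=list)
    submitter: str = ""
    handled_by: list = field(default_factory=list)


def parse_entry(text):
    """Parse one entry; lines without a field name continue the previous field."""
    entry = Regression()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            current, value = match.group(1), match.group(2).strip()
        else:
            value = line  # continuation of the previous field
        if current == "Subject":
            entry.subject = (entry.subject + " " + value).strip()
        elif current == "References":
            entry.references.append(value)
        elif current == "Submitter":
            entry.submitter = value
        elif current == "Handled-By":
            entry.handled_by.append(value)
    return entry


SAMPLE = """
Subject    : acpi_pm clocksource loses time on x86-64
References : http://lkml.org/lkml/2007/4/17/143
Submitter  : Mikael Pettersson <mikpe@it.uu.se>
Handled-By : John Stultz <johnstul@us.ibm.com>
             Len Brown <lenb@kernel.org>
"""

entry = parse_entry(SAMPLE)
print(entry.subject)
print(entry.handled_by)
```

Keeping Handled-By as a list means a later step could send each person only the entries they are named on, rather than the whole list.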
On the other hand when I was responsible for a bug, it was a great help to see this sort of mail. Not only as it reminded me I was responsible for a screw-up but when I sent a fix out, these mails let me know it was being picked up and ensured it made it to mainline. -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab -
The biggest problem I see is that developers want to make improvements in
an area, like ide, but they don't seem to look at the old code and make
sure the new code supports everything the old code did. This causes
hardware that used to work to not work, or work in a degraded fashion.
My $.02
Steve
-
Krzysztof Halasa wrote:
For how long do you plan to maintain the 2.6.x.y -stable series for each
2.6.x release? The thing is that there will probably be NO .123
"revision" (with maybe the exception of 2.6.16, thanks to Adrian again).
The end result is that there will be just no stable kernels *at all*,
because when the next 2.6.x will start looking more or less useful due to
the 2.6.x.y series, there will be a new 2.6.x+1, and work with 2.6.x
stops...
It's not the case currently, but this way ("let's fix the bugs in 2.6.x.y
-stable series, don't bother releasing 2.6.x in a good shape"), we can
finally come to the above situation...
/mjt
-
Michael Tokarev wrote: Actually, bugs are fixed in --> 2.6.(x+1)-rc <--, and shortly thereafter a small selection of the fixes are "backported" to 2.6.x.y and to various distributor kernels. -- Stefan Richter -=====-=-=== -=-- ==-== http://arcgraph.de/sr/ -
Problem is, not enough developers pay attention to the -stable series. Adrian, maybe you could shift your attention there and stop trying to track the bleeding edge? -
on what Adrian does or stops doing. Apparently, unless Adrian posts his list of known regressions, most people don't look at the bugzilla at all. Maybe it'd be useful to create a per-release bug tracker in the bugzilla or collect them in one of kernel.org's wikis, to make it easier to follow the current state of all the "important" regressions. -
Any web-based interface is a no-no. It's one reason I don't use bugzilla
a lot. If I can't get it by email, it doesn't exist, as far as I'm
concerned.
I bet that's true even of a lot of people who are more "web oriented"
than I am. They may look at webpages, but getting notified by email is
still the wakeup call. There's a difference between "active and directed
pushing to the involved people" and "the resource exists, that people
could look at".
So it would have to be more than just a wiki or a bugzilla entry. It
would have to have that weekly email status thing, and I think that it
needs to have some human who tries to find messages on the kernel mailing
list too, and make a first-level judgement on the bugs.
Adrian was doing a good job. But it doesn't necessarily need somebody
with intimate knowledge of the kernel. In fact, almost everybody who
*does* have intimate knowledge tends to have so in a very specific area
(nobody knows everything - and that very much includes people like me and
Andrew too) and may be skewed in other ways too, so a "generalist" is
probably more useful than somebody who is a "deep coder" in some
subsystem.
And it almost certainly doesn't have/prefer to be _one_ person. I suspect
that this is something where it actually might be better to have some
collection of people interested in it, and yes, perhaps editing a wiki is
part of the process, but with at least that "automated email" thing going
on in addition (and it needs to go to the people involved, not just the
kernel mailing list - so part of it is not just gathering the reports
themselves, but also gathering target addresses from MAINTAINERS files
and perhaps git logs etc).
And yes, it's quite possibly a good way to get into kernel development -
it definitely helps to know about programming, but as mentioned, I don't
think it is something where you even need to know specifically about
*kernel* programming per se. For example, I don't think it was an ...
Bugzilla sucks quite a lot at email, but you can answer emails and they
get into the bugzilla database; and there are two mailing lists (listed
in Documentation/HOWTO) that send notifications about every new bug
added/modified. I know it's not the perfect email interface every hacker
wants, but it's better than nothing.
I suggested some time ago that it'd be useful to send every new bug
notification from bugme-new to the LKML (and/or other lists). The volume
should not be so high as to make it annoying to the point of being
useless, and at least it makes the bugzilla-haters aware of the bugs
reported; and since bugzilla tracks the answers to emails and the
reporter's email address is in the email, it makes it easier for
bugzilla-haters to ask for more data and try to fix the problem, without
starting any browser.
I can understand Adrian's resignation. Bugzilla is crap, but there are
users reporting bugs there and willing to cooperate to fix them, and
they're not being listened to. There are even a few descriptions of
patches (ie: "line 6 in foo.c is wrong and it breaks our testing, it
should read like this:") that have been sitting there for *years* without
getting merged. I guess that Adrian tried to channel the important
regressions to the hackers, and he got tired of apparently being the only
one who cares about getting them fixed.
So I, or anyone else, could try to do Adrian's job. But if Adrian (a guy
that sends patches to make global functions static 8) got tired of doing
that job, I suspect that I, or anyone else, would get tired of it even
sooner. There are other big projects with probably more bug reports than
linux; they don't work this way, and they look more successful in their
bug handling. So in my humble opinion there's a problem with how the
whole bug reporting/fixing process works. With the current linux
development model, a good bug reporting/fixing process doesn't look
optional, since it's important to fix bugs ASAP to get the fixes into
-stable.
The fix may go ...
No, it's *not* better than nothing.
The thing is, these reports MUST NOT go to "everybody". If they do, that
is actually *worse* than nothing, because people will just ignore them
entirely, since they aren't "directed".
The emails need to be directed to the appropriate parties, not go to
everybody. There is nobody who is interested in seeing all regressions,
except perhaps me and Andrew. Most *real* developers (as opposed to
people like me, who are integrators, not "real developers") want to be
notified about problems in *their* area, and if it's just automation that
sends out everything, it just dilutes the value of the thing, to the
point where people will ignore it even for the cases when they happen to
be related to
I don't know a lot of developers who actually read LKML. I know a lot of
people who look for interesting subject lines and interesting people, but
read LKML in the sense of reading everything? Not likely.
That's why I think Adrian did a great job: he took the "noise" and made
it something worth looking at! And part of that is very much to make it
directed to only relevant parties (yes, they *also* got cc'd to
linux-kernel, but people would get them in their personal mailboxes and
I personally refuse to have anything at all to do with bugzilla. The
interface is so horrible that it's just not worth my time. I know there
are a few people who use it productively, but I'm always amazed that they
can do that.
The *big* problem with bugzilla is that it's such a "detail-oriented"
thing. It's fine if you have *one* bug that you're tracking. But whenever
that's not the case, it's almost totally useless.
Let me put it another way: I would never use a source control system that
forces me to look at my 22,000 files one at a time. I think such a system
is fundamentally broken, because it makes it impossible to get the big
picture ("what changed in the last week" kind of thing).
The same is true of bugzilla: if you *know* which bug you're ...
Seems like the bug tracking system needs to be re-evaluated. Perhaps Bugzilla can be modified to better serve the needs of kernel developers. There might be alternatives, JIRA (http://www.atlassian.com/software/jira/) for example. I am sure there are others. Finally, I have over 7 years experience writing various web based systems (using WebObjects, J2EE and others). I would be quite willing to volunteer my spare time and experience if this community decides that a custom written solution (one that would handle email bug submissions/resolutions :) is in order. Kind regards, Marek Wawrzyczny -
And if it's a bug in an unmaintained subsystem, a user could do this for
100 weeks without any effect.
There's no value in keeping reporters busy with useless tasks. [1]
Don't forget:
A good bug report is an important contribution.
We are already quite good at ignoring bug reports that come through
linux-kernel, and it's an _advantage_ of the kernel Bugzilla to see more
than 1600 open bugs because this tells how bad we are at handling bugs.
How many thousand bug reports have been ignored during the same time on
linux-kernel?
If a developer asked for further information and the submitter didn't
answer within 1 month, I will close this bug. [2] And "I will" is not
talking about the future; I've been doing this in the kernel Bugzilla for
three years or so.
The problem is we have three states of subsystems and drivers:
- unmaintained
- maintained [3]
- maintained and maintainer looks after bug reports
I do hereby promise you to manually ask the submitters of all 1600 open
bugs in the kernel Bugzilla within one month whether their problem is
still present with 2.6.21 and to forward all bugs where the answer was
"yes" to whoever is the right recipient if you promise me that all bugs
where the submitter said "yes" will be debugged by a kernel developer
cu
Adrian
[1] "Still present with the latest kernel?" is a valid question
if a developer intends to debug further, but otherwise it's silly.
[2] sometimes a bit later, but I'll do it
[3] I'll not do public maintainer bashing, but we have very active
maintainers with nearly zero activity in debugging user bug reports
[4] I don't think you'll do - but if you do that's a serious offer
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
However, look at this bug:

http://bugme.osdl.org/show_bug.cgi?id=7760

It's outside my knowledge to be able to fix for various reasons:

1. I don't know _anything_ about IXP4xx hardware, so I'm not the right person to own this bug.
2. I've no idea who's looking after IXP4xx stuff now. (When it was posted to the ARM kernel list by Rob, it was ignored so I guess the IXP4xx maintainer isn't watching for bugs, if such a person even exists.)
3. I've little idea about why the MM page allocation is failing early.
4. The ARM DMA bounce code is a hack, and I'm pretty sure that this failure is a result of trying to use this contorted code instead of the relevant subsystems having a correct DMA mask.

Can you make any suggestion what should be done about this bug? I'm personally very tempted to close it as "won't fix" (I wish there was a "can't fix" category.)

As for my other bugs:

http://bugme.osdl.org/show_bug.cgi?id=8149

Again, EP93xx is not my thing, but I've recently merged a patch to allow AMBA PL010 uarts (which are present on this platform) to use the clock control API. The EP93xx people (provided they're willing) now need to do whatever's required to resolve that bug. (Hopefully they've taken ownership of that bug now.)

http://bugme.osdl.org/show_bug.cgi?id=4270
http://bugme.osdl.org/show_bug.cgi?id=5875
http://bugme.osdl.org/show_bug.cgi?id=7750

I'm no longer serial maintainer. Bug IDs after about 7000 reflect bugs submitted since I've resigned my serial maintainership, and therefore I've ignored them. It's far easier to ignore bug reports in bugzilla than it is to get categories reassigned (to whom? - dunno) or even deleted (if no one steps up presumably that's what needs to happen?)

http://bugme.osdl.org/show_bug.cgi?id=7389

This one isn't a regression, or even a bug IMHO, but could be viewed as undesirable behaviour. Fixing it is utterly non-trivial though. A serial port which happens to have an IrDA ...
So this is a completely debugged bug in a well-maintained subsystem (no matter what the status in Bugzilla is).

That's one of the problems: Unmaintained subsystems.

Since you stepped down as serial maintainer (and it's your right as maintainer to do so), the serial subsystem is unmaintained.

That's exactly where Linus' "drop any bug reports that are more than a week old" suggestion is completely flawed - no matter what the submitter

New subsystems always get default owners like drivers_ieee1394@kernel-bugs.osdl.org, and people interested in such bugs can edit their preferences to watching all (pseudo) users in whose bugs they are interested.

drivers_serial@kernel-bugs.osdl.org [1] would

cu
Adrian
[1] or @linux-foundation.org
-
You're being very optimistic. I'm not sure where you get the idea that it's "completely debugged". It isn't - I've no real idea what the problem is, let alone what the solution might be. I've only one guess based upon what is sane in the kernel, and that isn't even based on the data provided in the bug report. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
You talk, but what do you actually *suggest*?

Talk is cheap. You used to do the walk too, but you've already said that you're not interested in that any more. So excuse me if I'm not impressed.

The thing is, bugzilla is totally broken because it's designed to help track bugs, but it's *not* designed to actually handle the much harder problem, which is to actually get the *right* developers to be aware of the *right* bugs!

And both of those "right"s are important. Spamming everybody will just mean that everybody tunes them out. And spamming even the right developers with useless bugreports will also just cause them to tune out the good ones.

The thing is, the "tracking bugs" part that bugzilla _can_ do is also totally _useless_ without that much more important phase, namely the "connect the parties".

That's what you really used to do. You made developers connect to the reports, because your reports were _useful_ and not overly noisy.

But go back and look - did you notice that once you connect the dots, it turns out that bugzilla itself isn't all that wonderful. Quite a lot of your regression reports had other ways of pointing to the problems, including very much mailing list archive pointers etc! So bugzilla isn't actually all it's touted to be even _once_ the connection between reporter and developer has been established.

I really don't see why you are so hung up about bugzilla, when your own regression reports didn't generally point to it all that often! (I just went back and double-checked: you had more than twice as many pointers to kernel mailing list archives than you had pointers to bugzilla in the one series I looked at. And I'm _not_ saying that's wrong at all: I think the mailing list is actually likely to be at LEAST as useful as bugzilla is as a bug-tracker!)

And bugzilla actually falls down even more than the mailing list does for the whole (and MUCH MORE IMPORTANT!) phase of connecting developers to bug reports. And THAT ...
This means we need people who figure out who to assign bugs to. Aka bugmasters.

In theory it could be nearly automated. Figure out which files relate to the bug and assign it to the last 5 people who submitted patches for them and/or signed off. Ok I suppose it's not that easy -- you would need some human judgement.

BTW one big problem in our current bugzilla is that a lot of people cannot reassign bugs they don't own. I sometimes see bugs that I don't own but I know who is responsible, but bugzilla doesn't allow me to reassign them.

So I think what would help:
- Ask more people to just categorize and reassign bugs (anybody interested?)
- Give more people in bugzilla the power to reassign arbitrary bugs (bugzilla maintainers would need to do that)

-Andi
-
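As a rough illustration of the "last 5 people who submitted patches and/or signed off" heuristic above, here is a minimal Python sketch. It assumes the text of `git log` for the affected files has already been captured (for example with `git log -- <path>`), and it only looks at `Author:` and `Signed-off-by:` lines. The function name `suggest_assignees` and the sample names and addresses are made up for the example; as the message itself says, a human would still need to sanity-check the result.

```python
import re
from typing import List

# Matches "Author: Name <email>" and "Signed-off-by: Name <email>" lines
# as they appear in plain `git log` output (leading indentation allowed).
_PERSON_RE = re.compile(
    r"^[ \t]*(?:Author|Signed-off-by):[ \t]*(.+?)[ \t]*$", re.MULTILINE
)


def suggest_assignees(git_log_text: str, limit: int = 5) -> List[str]:
    """Return up to `limit` distinct people, most recent commit first.

    `git log` prints newest commits first, so scanning top to bottom
    yields the most recent authors and sign-offs before older ones.
    """
    seen: List[str] = []
    for person in _PERSON_RE.findall(git_log_text):
        if person not in seen:
            seen.append(person)
        if len(seen) == limit:
            break
    return seen


# Hypothetical log excerpt for the files touched by a bug.
SAMPLE_LOG = """\
commit 3333333
Author: Carol Example <carol@example.org>

    foo: fix oops on boot

    Signed-off-by: Carol Example <carol@example.org>
    Signed-off-by: Dave Example <dave@example.org>

commit 2222222
Author: Alice Example <alice@example.org>

    foo: initialize spinlocks

    Signed-off-by: Alice Example <alice@example.org>
"""

if __name__ == "__main__":
    for person in suggest_assignees(SAMPLE_LOG):
        print(person)
```

The sketch deliberately works on captured text rather than calling git itself, so the matching rule can be tried out without a repository.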
Folks might want to take a look at the Debian Bug Tracking System (BTS). It has a web interface which you can use to query history, but *everything* is e-mail driven, and the way you submit, close, update, tag/classify bugs --- everything --- is via e-mail.

More importantly, anyone is allowed to recategorize and reassign bugs. If someone does so maliciously or incorrectly, you can always revert it, and if someone is being truly malicious, you can always blacklist that one person. In this respect, it is far more wiki-like than bugzilla, which has always been too much like a straightjacket.

It's not perfect, but it's better than bugzilla --- but then again, just about *anything* would be better than bugzilla. (Hmm, except maybe SourceForge's very tragic bug tracking system... :-)

Of course, as Linus has said, it's not a complete solution --- you still need humans to be smart about things --- but if the goal is to make it easier to archive and track information about a bug, at *least* with the Debian BTS, when you reply to an e-mail message, the reply is automatically appended to the bug log!

- Ted
-
It might not be bad to write up an email-based BTS-alike bug-tracking system just for the Linux kernel. It should probably even be implemented 100% via email at first, with a web-based status viewer as a later add-on.

Here's a possible email format:

  [kbugger: action1 arg1 arg2 ..., action2 arg1 arg2 ...]

Make it almost totally message-id and thread based, and make it an implicit part of LKML (i.e. subscribe the kbugger program to LKML). People who are flagged "admin" may ban/unban users and make certain large-scale changes.

Supported actions:

  create, create-parent, create-thread:
    Create a new bug associated with this message. The arguments specify the title. This would automatically happen for new threads with titles like "[BUG] foo: It's broken"

  merge:
    Merges the current bug and/or email thread into an existing bug. The arguments are a list of bug numbers and/or message-IDs to merge together with this one.

  prune, prune-parent, prune-thread:
    Prunes a given thread from the current bug. Optional argument specifies a referential message-ID

  settitle:
    Change the title of the current bug

  fixed, broken:
    Mark the bug as fixed or broken in a particular version/configuration. Arguments are used as opaque strings representing configurations where it is known to be fixed or broken. For example [kbugger: fixed 2.6.16 2.6.20-x86, broken 2.6.20-ppc] would just store the list of strings and statuses. If the bug was auto-created with a title like "[BUG ppc] foo: It's broken" or "[BUG 2.6.20] bar: I dunno", then the argument to the [BUG] title portion will be auto-passed to [kbugger: broken].

  status:
    Get a brief status report on the current bug.

  info:
    Get a detailed status report on the current bug.

  history:
    Get detailed information about the history of the current bug. This only sends the reply to the author.

  stop:
    Stop parsing the rest of this email. Useful when teaching somebody ...
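To make the proposed `[kbugger: ...]` syntax concrete, here is a small Python sketch of how such a tag might be pulled out of a message body. The function name `parse_kbugger` is invented for the example, and the rules it follows (comma-separated actions, whitespace-separated arguments, `stop` ending all further parsing) are one reading of the description above, not a specification. It only parses; nothing is stored or acted on.

```python
import re
from typing import List, Tuple

# One "[kbugger: ...]" tag; the payload is everything up to the closing "]".
_KBUGGER_RE = re.compile(r"\[kbugger:\s*([^\]]*)\]")

Command = Tuple[str, List[str]]


def parse_kbugger(body: str) -> List[Command]:
    """Extract (action, args) pairs from every kbugger tag in `body`.

    Actions are separated by commas, arguments by whitespace. A "stop"
    action ends parsing for the remainder of the message.
    """
    commands: List[Command] = []
    for match in _KBUGGER_RE.finditer(body):
        for chunk in match.group(1).split(","):
            words = chunk.split()
            if not words:
                continue  # tolerate stray commas such as "a, , b"
            action, args = words[0], words[1:]
            if action == "stop":
                return commands
            commands.append((action, args))
    return commands


if __name__ == "__main__":
    mail = (
        "Works for me on x86 now, still broken on ppc.\n"
        "[kbugger: fixed 2.6.16 2.6.20-x86, broken 2.6.20-ppc]\n"
    )
    print(parse_kbugger(mail))
    # [('fixed', ['2.6.16', '2.6.20-x86']), ('broken', ['2.6.20-ppc'])]
```

One limitation of splitting on commas: a `settitle` whose title contains a comma would be cut in two, so a real implementation would need a quoting rule for such arguments.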
On Sun, Apr 29, 2007 at 07:55:35PM -0400, Theodore Tso wrote: > but if the goal is to > make it easier to archive and track information about a bug, at > *least* with the Debian BTS, when you reply to an e-mail message, the > reply is automatically appended to the bug log! bugzilla does that too. Dave -- http://www.codemonkey.org.uk -
[Oops, the first try of this mail got out from my local address, sorry]

I've started hacking on some bash scripts to do the e-mail part, based on the requirements I've gathered from reading this thread. So far, I got one script that can be plugged into procmail and processes mails to do the following (right now):
- Create a bug
- Set the bug type to "regression"
- Update the timestamp of the last action for that bug
- Assign the bug to a subsystem
- Track the submitter and whoever grabs the bug

More (hopefully) to come...

Those commands are just added to any email in the thread discussing a bug like this:

  @bugthing
  mine # Mark myself as owner
  regression # Mark this bug as a regression (used for reports)
  subsystem mm # Assign the bug to mm
  needinfo # Tell the tracker to bug the reporter if nothing happens
  Thanks

That block can appear anywhere in the email, so if you're discussing some problem on lkml and want to track that bug, you can just add such a block to your email and turn an untracked email conversation into a tracked bug.

Tracking does _not_ mean that all emails are stored, those can be looked up on lkml.org or MARC, where the created reports will probably contain URLs to the latter, because it supports lookups based on message ids. Tracking does just mean that the state of the bug is stored somewhere. That means (currently, suggestions welcome):
- What's its name? (E-Mail subject)
- Who reported it?
- Who (if any) stepped up to own that bug?
- What type of bug is it?
- Which subsystem does it belong to?
- What's its current state? (new, owned, fixed, ...)
- When did the last action on this bug happen?

Based on that information, I've started writing some scripts that create "reports" (all of them currently being pretty incomplete):
- Bugs that belong to a specific subsystem (on request, currently through a procmail triggered script; this is meant to satisfy Adrian's request of asking for example for all SATA bugs.)
- ...
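The scripts described above are bash and are not shown in the thread, so the following is only a Python sketch of how the `@bugthing` block could be turned into stored bug state. Several details are assumptions made for the example: that the block starts on a line containing just `@bugthing`, that `#` starts a comment, that a line reading `Thanks` ends the block (borrowed from the way Debian BTS control mails end), and that `mine` moves the state from "new" to "owned". The function name and dictionary keys are hypothetical, and the timestamp handling mentioned in the message is left out.

```python
from typing import Dict, Optional


def apply_bugthing_block(
    body: str, sender: str, bug: Optional[Dict[str, object]] = None
) -> Dict[str, object]:
    """Apply the commands of an @bugthing block to a bug record.

    Returns a new dict; the record passed in is not modified.
    Unknown commands are ignored rather than treated as errors.
    """
    bug = dict(bug or {"state": "new"})
    in_block = False
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not in_block:
            in_block = line == "@bugthing"
            continue
        if not line:
            continue
        if line.lower() == "thanks":  # assumed block terminator
            break
        command, _, argument = line.partition(" ")
        if command == "mine":
            bug["owner"] = sender
            bug["state"] = "owned"
        elif command == "regression":
            bug["type"] = "regression"
        elif command == "subsystem":
            bug["subsystem"] = argument.strip()
        elif command == "needinfo":
            bug["needinfo"] = True
    return bug


if __name__ == "__main__":
    mail = """\
I can reproduce this on my box, looking into it.

@bugthing
mine # Mark myself as owner
regression # Mark this bug as a regression
subsystem mm # Assign the bug to mm
needinfo # Ask the reporter again if nothing happens
Thanks
"""
    print(apply_bugthing_block(mail, "dev@example.org"))
```

Only the state record is kept, in line with the point above that tracking does not mean storing the emails themselves.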
Plus it has the very user-friendly reportbug and querybts commandline interfaces. No going to web pages to query bugs, and you can just download the email thread for a bug report as an mbox file and then reply via email. (Querybts currently only works if you know the package name, it can't search across all packages so it wouldn't be that useful for the kernel. But for Debian packages this tool is gold.) Johannes -
Bugzilla can do that too.

But I'm not convinced this is a good idea. We had this some years ago with the jitterbug experiment and usually it tended to faithfully keep track of very long off topic threads that had drifted far from the original bug. The resulting collections of emails were usually not very useful.

While other interfaces might be a culture shock for some people at least they force them to concentrate on the particular issue -- contribute to a specific bug.

-Andi
-
this, and the fact that anyone can add to the bug log by just sending an e-mail are a nice feature however, I had a reason to take a look at the debian BTS late last week to see if the bugs and patch that I sent to the sysklog maintainer (both debian and upstream) got included in debian 4.0. talk about depressing. there are about a dozen bugs _with_ patches sitting in the queue for several years David Lang -
Yes. But not using bugzilla.

I don't know if you've noticed already, but I'm not the only one that doesn't have a very high opinion of bugzilla ;)

What works is somebody who is a bugmaster, and it doesn't really matter *what* bug tracker he points to (bugzilla being one of the possibilities, although not necessarily the best, and absolutely NOT the only choice), and turns them into emails.

Once they are emails, bugzilla can track them.

I do think you're pretty optimistic. I think it's true for trivial driver bugs, but even for trivial driver bugs the initial report is often not enough to pinpoint the driver.

Let's take the sis900 bug as an example (not because I want to rag on that being a problem, but it happened to be a _trivial_ bug in 2.6.21, so it's a good case of something really really easy - and if that easy case isn't handled trivially and obviously, then the bug-reporting doesn't work).

In that case, the initial report was (condensed version, but fairly accurate): "system hangs on boot at random points. 2.6.21-rc7 worked well".

Now, realistically, if that entry had been in bugzilla, what would you do?

Equally realistically, let's ignore bugzilla for a moment, and ask what the best method for handling something like this would be? Have an open mind, no rules on "have to use bug tracker XYZ".

You know what? The report went to me and the kernel mailing list as email. And that was the *right* thing in that case. There was no good sign of who it should go to, and while there wasn't a whole lot of information there, there *was* a very tight timeframe (ie it could pinpoint to within about a week when it started). But the only thing I could really ask for was for the person to bisect it.

Would bugzilla have helped?

HELL NO. It would have been a disaster. It would have wasted reporter time, it would have wasted developer time, and it would likely have been ignored because the bug report wasn't specific enough to really trigger any ...
So you want to do the work that should be done by a computer in

You need a few people in bugzilla that ask the questions to narrow it down (= bugmasters). e.g. the opensuse bugzilla works this way :- everything new gets assigned to a few screening people who

An unmaintained bugzilla yes. A well maintained one would have someone asking them the (often quite repetitive) questions to narrow it down

There have been a couple of email thread trackers; like jitterbug -- in fact bugzilla can do that with an email interface. But in my experience they don't work well because a bug tracking system has slightly different requirements than normal emails (e.g. it wants you to roughly stay on topic for the current bug). With email people always forget that and in the end you end up with lots of stuff in there not related to the bug at all.

Also with distributed solutions it would be hard to get a global "how many regressions do we really have right now" statistic, which is fairly important.

The web interface is slow and ugly but at least it puts the people in the right mind set for this unlike email. And it gives you a central point to get an overview of the bugs.

Anyways I'm sure bugzilla could be improved, I just don't know of anything better currently.

-Andi
-
Didn't you even *read* my email?

I already told you: we have real bugs getting reported and fixed that don't hit bugzilla or any bugtracker AT ALL.

This is not an "all or nothing" situation. There is absolutely _zero_ real

Yes. The ones that *work*. Plain email is preferable to bugzilla 90% of the time.

But quite frankly, if you think you can make bugzilla work (and realize that a lot of people will _not_ be looking at it or reporting bugs into it), go ahead. I don't care. The only thing I care about is *REALITY*, and that means:

- a lot of reporters will not use bugzilla, because it's damn inconvenient even for reporting.

  If you propose something that uses _only_ bugzilla, you'd better also have the people who enter other people's bugreports into there.

- a lot of developers will not use bugzilla, because it's even more inconvenient for developers, with no sane ways to interact with the right people.

  So if you propose using bugzilla, you'd really better have the manpower to turn bugzilla into emails (and no, the bugzilla cc list etc is _not_ the primary one - the email cc's are the primary ones, because that's where it is much easier to bring in new people)

In other words, anything that thinks that bugzilla is "primary" is just broken. It can be a _part_ of the thing, but drop the belief of it being a primary tracker. It's just too inconvenient.

Linus
-
Don't think that's true. There are plenty of projects who only accept bugs through bugzilla (mozilla, various distributions, etc.) and I don't see any evidence of your claim being true. Sure there will be always people who cannot be bothered to use any kind of interface for bugs, but then these are unlikely to stay on board during a longer remote debugging q'n'a session either. So those people can be just ignored; they essentially don't exist in the bug report universe. Anyways it only works if people are willing to use it too and there are enough people who maintain bugs (aka ask questions to find out who to reassign, prune old bugs etc.) If that's not there then it won't work well obviously, like it is currently the case. I don't think the "keep it in Andrew's/Adrian's head" method is going to scale longer term at least (and one of them has already thrown in the towel) The "send it to a gigantic mailing list and hope someone catches it" method also doesn't seem to be that great. At least there are lots of lost reports in my experience this way. I suspect the real reason is more "Linus doesn't like web interfaces for no particular good reason". Not much can be done about that. Well perhaps someone can write a gopher based bugzilla interface or something to solve that instead @) -Andi -
From: Andi Kleen <andi@firstfloor.org>

That explains why my bugs don't get looked at for months if not years when I submit them to such projects.

I reported a bug that eats people's hard disks due to a bug in the X.ORG PCI support code on sparc, NOBODY has fixed the bug in 2 years even though a full bugzilla entry with even a full patch fix is in there.

Bugzilla sucks, email rules because it is in your face and gets people to work on things.
-
Well but at least they could find it again if they wanted.

If you sent it by email and it had gotten lost for some reason (nobody interested, which seems to be the real issue here) then it would be lost forever.

Of course a database is not a silver bullet by itself, it still needs people to use it correctly and then actually work on the bugs. A database is just a tool to move some work humans are bad at (keeping track of a lot of data and deriving trends out of it) to a computer which is much better at this.

-Andi
-
From: Andi Kleen <andi@firstfloor.org>

WRONG! If I had sent it to the main developer list the damn patch would be applied by now.

WHY? BECAUSE EMAIL ENGAGES PEOPLE AND BUGZILLA DOES NOT!

Nobody looks at the bugzilla because there is too much junk in there to make the signal at all useful to search for, there's simply too much noise.
-
That means just x.org doesn't have a working bugmaster setup.

Again a technical solution doesn't fix misorganization or missing people; it just pushes some work humans are bad at to computers if you have the right structure

The normal bugzilla workflow is that some people categorize the bugs, ask the necessary questions and then figure out which developer to assign it to. Then the developer doesn't end up with "too much noise" but just a limited set of bugs to look at. And the responsible developer then gets an email and looks at the bug.

This already happens in some limited fashion in kernel.org bugzilla: I get bugs assigned occasionally and while it's slow I tend to look at (near) all of them and try to improve things there.

If a single developer ends up with too many bugs this way or there is nobody to assign a bug to or nobody processes the incoming bugs then the project has a problem. Yes bugzilla doesn't work then if the project is not well organized. But that's in no way different from what would happen with your email sent to a mailing list if it had the same problem.

-Andi (who finds it a bit bizarre he has to explain the concept of "things can get easier when some work is pushed to computers" to a hacker)
-
There's an ARM category in the kernel.org bugzilla. Folk are completely free to submit ARM bugs to either the (closed) mailing list or bugzilla.

99.99999999% of bug reports in the ARM community come via the mailing list. I think to date there have been about 10 bugzilla entries since Feb 2004. This is in spite of me linking to the kernel.org bugzilla from my website.

So it seems that virtually all the folk involved with the ARM kernel, given a completely free choice, _prefer_ to send email-based bug reports over touching bugzilla.

That's quite a different metric to projects forcing bugzilla on people, and I'd say is a more valid metric to gauge whether bugzilla is really suitable.

-- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of:
-
This, actually, might work if the report is 'flagged' in a specific way. For example, if there's a message sent to LKML with a combination of '[BUG]' and 'suspend' in the subject, I have no problems whatsoever with spotting it. ;-) Greetings, Rafael -
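The subject-line flagging described here could be sketched as a simple filter. The watcher table, the example address, and the function name `recipients_for` below are all invented for illustration; the only rule taken from the message is "'[BUG]' plus a keyword such as 'suspend' in the subject".

```python
from typing import Dict, List

# Hypothetical mapping from subject keywords to interested developers.
WATCHERS: Dict[str, List[str]] = {
    "suspend": ["pm-watcher@example.org"],
}


def recipients_for(subject: str, watchers: Dict[str, List[str]] = WATCHERS) -> List[str]:
    """Return the watchers whose keyword appears in a '[BUG]' subject line.

    Matching is case-insensitive; subjects without the '[BUG]' flag are
    never forwarded, whatever keywords they contain.
    """
    lowered = subject.lower()
    if "[bug]" not in lowered:
        return []
    found: List[str] = []
    for keyword, addresses in watchers.items():
        if keyword.lower() in lowered:
            for address in addresses:
                if address not in found:
                    found.append(address)
    return found


if __name__ == "__main__":
    print(recipients_for("[BUG] 2.6.21: suspend to RAM hangs on resume"))
    print(recipients_for("Re: [PATCH] suspend: cleanup"))
```

A plain substring match like this will produce false positives, so it is best read as a way of catching someone's eye rather than as an assignment mechanism.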
So far, it seems that most people's opinions on bug reporting and tracking can
be divided into 2 groups:
- People who argue (and they're right) that bugzilla and web interfaces in general
suck and that email + an "Adrian-like" solution works better
- People who argue that a bug tracker better than a mailing list is absolutely
needed (and they're right). They also argue that while bugzilla sucks, it's
better than nothing.
There's a common point between both groups: bugzilla sucks. The ideal
solution would be to replace bugzilla with some alternative and better
opensource bug tracking software, but I doubt it exists (there must be a
reason why everybody uses bugzilla). A good bug tracker should feel like
it makes your work easier, instead of making you feel like you're wasting
time (which is what bugzilla does)
I don't see why a web interface bug tracker should be bad for bug tracking,
as long as it's good and integrates 100% in the mailing lists. In my humble
opinion the "perfect" bug tracker for Linux should be something like this:
- Has an email interface (like the Debian bug tracking database).
- Has a web interface that completely follows the email threads
(reading/posting), but makes the comments real emails, not just
database fields.
If done well (unlike the current bugzilla-to-email hack), it should be possible
to do many nice things, like adding an lkml bug report to the bug tracking
database (which shouldn't be a "real" database, but just an lkml mail
archive with a list of message IDs that are considered a bug and its state)
by just replying to the thread, CCing the bug tracker and telling it to include
the thread in the database.
So unless someone is willing to write such a tool (which I doubt, since it
doesn't look easy), all this discussion seems pointless, and we should
stick with this http://kernelnewbies.org/known_regressions page
which is proving to be quite useful :)
-
This list currently contains 29 known regressions.
Someone has to manually add them.
Someone has to manually track the status of all of them.
Someone has to manually group related regressions (e.g. in the same
subsystem) together.
The kernel Bugzilla currently contains 1600 open bugs.
Maintaining a regression list by hand was really a pain when during
2.6.21-rc we had 36 known regressions.
Any approach that does not involve some kind of tracker with some kind
of database simply doesn't scale.
Bugzilla might not be perfect, but it works and it's better than doing
it by hand.
cu
Adrian
-
The good thing about the wiki is that it doesn't exclude bugzilla. It's just a "regressions list", it doesn't intend to replace bugzilla.

If a bug doesn't get fixed for a while, I don't think it's very useful to keep it forever in the list like you do in the bugzilla, because I don't think it's possible to fix every single bug, and it steals your time to fix bugs that you are able to fix.

It's not great but it's the best clone of you we've found 8)
-
What exactly is the purpose of the 2.6.21 regressions list in the Wiki?
cu
Adrian
-
AFAIK, submitting its contents to the list periodically CCing the developers, like you did with your lists. If developers care to fix it or not or how much Linus cares about that list before releasing a new version is another question.

I think it's useful because it makes those bugs look more important than the 1600 stored in the bugzilla...it won't help to fix those 1600, but it attracts some attention to the "release critical" ones and encourages developers to fix them, even if not all of them get fixed. I don't think you can do many other things to get as many bugs fixed as possible, unless we reward bug fixers with weekends in the Playboy mansion.

I think the fundamental question here is: is there a way to make hackers follow and fix _all_ the bugs? I'd love it if it were possible, but AFAIK all the projects that have tried to be ultra-stable and have adopted a policy to fulfill such a goal have fallen behind competing projects that cared more about working on improving their software.
-
Apart from this many bugs are found and get fixed in the process of developing new code, so the 'ultra stable' approach is not really practical. Greetings, Rafael -
It's for -stable team. Regards, Michal -- Michal K. K. Piotrowski Kernel Monkeys (http://kernel.wikidot.com/start) -
cu
Adrian
-
Don't be silly, did any of the developers say that he has spare time to read your regression lists?

Michal posted it to LKML with the relevant developers in CC including the stable team, so they are in the loop for updates, which is a Good Thing!

Hey, you did this yourself. It is a great help and as your lists did, Michal's list pointed me to a bug, which I would have missed otherwise.

So what are you complaining about? Folks stepped up and built a regression list and posted it to LKML. What's wrong with that?

tglx
-
It worked because several people (including Linus) emphasized that
fixing regressions from this list was important.
And it failed because many regressions still stayed unfixed and some
cu
Adrian
-
Right. Simply because these lists are assembled by someone
- who knows how to pick those reports from the mailing lists
- who knows how to sort them in a useful way
- who knows how to add the relevant folks on CC
- ....

No, it did not fail. It is not perfect. Way more bugs, which have been fixed or are in the debugging process, would have been unnoticed and

It will work not much differently from your lists. It'll not be perfect either.

tglx
-
That all needs to be done by someone initially yes. But then tracking what happens afterwards is something that can be distributed. A difficult bug can take a long time to resolve and generate a lot of messages; you don't want to require the initial sorter to handle all that too. It's much more scalable to let the developers update the state themselves then once they handle the bug. -Andi -
The actual list of known regressions is wiki based. Everyone can Regards, Michal -- Michal K. K. Piotrowski Kernel Monkeys (http://kernel.wikidot.com/start) -
Well do they know about it? Also something a little more structured would seem better for this. How do you query a wiki? -Andi -
It depends on what you consider failure and what you consider success.
For me, it failed. Not because it wasn't perfect, but because we could
have done much better with fixing the known regressions, and also by not
introducing several regressions between the last -rc and the final
kernel (and people who did test -rc7 and would most likely also have
cu
Adrian
-
I hope that this discussion about bugs will change something in Linux regressions front. Regards, Michal -- Michal K. K. Piotrowski Kernel Monkeys (http://kernel.wikidot.com/start) -
Adrian, why do you keep harping on this, and ignoring reality?

Kernel bugzilla has 1600 open bugs BECAUSE IT SUCKS.

How many of those are interesting and valid? How many of them are relevant? How many of them are duplicates?

You don't know. Nobody does. So why do you bother reporting that number? That number is exactly as relevant as the number of dog-hairs on our couch ("in the millions"). An impressively large number, definitely uncountable, and definitely also not relevant to anything at all. Not at all unlike that "1600 open bugs" number that you bring up.

Do you think the number of dog-hairs on our couch is an argument for or against people trying to track regressions? If not, why do you keep bringing up bugzilla?

Linus
-
OK, how do you suggest to track bugs in a way that doesn't suck?
Bug reports to linux-kernel have the big problem that they are lost if
What I do know is that the majority of them has never been proper
I tracked 2.6.21-rc regressions, and I do not scale for higher numbers
of bugs. When I had to track 36 known regressions it was a real
nightmare.
Bugzilla tracks regressions and scales for higher numbers of bugs.
Let me ask you two questions:
- Do you think regression tracking makes any sense at all?
- If yes, which scalable way of regression tracking would in your
cu
Adrian
-
I suspect some bug reports get ignored deliberately. Why? Because it's hard to write good bug reports, and thus a lot of them suck. In some cases you are lucky and get a test case which makes the bug easily reproducible, or you get a nice, comprehensible Oops message, and can pinpoint the bug right away. But usually you have to interact with the user to get more information, have her try various patches or do a bisect. And sometimes this interaction drives you crazy because the user is slow to answer, gives you incomplete or even false information, and doesn't do what you ask her to do to narrow the bug, or you just have no clue what the bug could be etc. Probably every developer went through this experience a couple of times, wasting a lot of time and causing a lot of grief. Thus it's just natural that you learn from the experience to ignore every bug report which even remotely smells fishy ;-/ Y'know, a lot of developers don't just work for the money, they work for the fun of it. Thus they try to avoid pain and grief. :-) Bugzilla just makes this visible because it doesn't forget. Which is stupid and discouraging for both (potential) bug reporters and developers. The lesson to learn is that there are some very valid reasons why bug reports get ignored (some not mentioned here), and there's nothing you can do about it. And it has nothing to do with the method or tool used for reporting or tracking. And I also think that ignoring bad bug reports _increases_ the software quality, because you can use the saved time working on something productive. And it makes developers happier :-) [Just to avoid misunderstandings: By no means do I advocate ignoring *every* bug report. But ignoring the bad ones is just the sane thing to do.] Johannes -
Hello, I don't know, but what about telling the hapless person who went through the process of posting a bug what's wrong with the bug report? Or is software quality the only thing you care about, and you don't want to waste time on teaching people to write better bug reports? If you want to scare away bug reporters, just ignore the first (and thus likely bad) bug report they write. There isn't much that is less motivating than a deafening silence. Just compare it with writing a patch. Though, in a sense, if the software quality is measured by the number of bug reports, your tactic might "improve" it indeed. That said, if someone is an obvious idiot, ignoring saves time. But I think that's quite rare, and in general you should give the reporter feedback, and then ignore the bug report. (Until it improves.) Greetings, Indan -
Well, I'm a moody bastard, and I would hope other people handle this better than I. However, all the bugzillas of this world are full of old, ignored bugs, and I thought this might serve as an explanation why. Developers are just humans and if they have no incentive to act on a bug report they will ignore it. I think this is a fact that you have to deal with. It's also not necessarily the fault of the reporter if a bug report gets ignored, but for every report a developer has to make a decision to handle it or not, and there are lots of reasons why he may decide to not handle it, or at least not now (and then forget about it). But I'm quite sure that an important bug would be reported again until fixed. Johannes -
Reporters are just humans too and if they have no incentive to post bugs they won't. So it's a lose/lose situation, really. With a group of people working together they should try to motivate each other, not demoralize everyone. (I know, each bug report is a pain, voicing someone's failure. So ignoring it might make people feel better, but it doesn't True. There's also a difference between a bad bug report and one that a specific developer won't handle. In the former case anyone could recognize it and tell the reporter about it. The latter is a bit trickier, but if the developer thinks about looking at it later, he better can tell the reporter just that. A short "I'll take a look at it, later, when I've more time." I wouldn't be so sure about that. What's worse, why would the reporter bother telling that the bug is fixed in version N+1? No one cared about it anyway, so there's no one to tell it to. That would explain a lot open bugs too. Greetings, Indan -
It's a tedious process you keep doing over and over and over and over again, and my experience shows it's sheer luck if people can actually fill in the missing bits given the list. Usually you have to ask thrice to obtain even the most essential information such as version. Let alone vendor patches. Anyways, the solution to this problem is someone _politely_ asking reporters to provide necessary information and also point out that they cannot ever hope to have their bug fixed without making a best-effort attempt at answering all questions the first time they're being asked. There are notable exceptions, people pinpointing code fragments at fault And that is what happens all too often (not in absolute figures, but in the developer's perception of it) - insufficient information to debug. Yes I know, some of the bugs hide themselves so well you actually need four or five reports by different people to actually pinpoint the bug, perhaps accompanied by insufficient interface documentation that make it difficult to verify assumptions/expectations or assess potential solutions (such as the res_init() issue in fetchmail, or probably the khubd going south issue in Linux), but that's not the point. -- Matthias Andree -
I've tried to explain. Bugzilla can be one _part_ of it, but anybody who thinks it's the "main part" is really not being realistic. It's too cumbersome, and it's too stupid. Quite frankly, "lkml + google" is probably in many ways a *better* way to search for problems. But yes, some manual smarts (and the _occasional_ pointer to bugzilla) is probably currently the only option. Exactly because I don't think anybody has shown any better automation than bugzilla. But that doesn't make bugzilla "the One Choice". That's not how it works. If there is no automation, manual tracking is still better than ..and this is different from bugzilla exactly _how_? Those things are lost too. As you yourself have pointed out. The fact that you can search for them is _exactly_ as relevant as the fact that you can And you blame the developers, but not bugzilla? Why are you so unable to see bugzilla as part of the *cause* of the problem? You're perfectly happy to blame other things, but bugzilla is somehow above blame? Linus -
Have you seen/used RT? -> http://bestpractical.com/rt We use it here at work and it works great. People can report bugs both by email or via web interface. We get everything that comes in emailed to us and we can respond by email and RT recognizes the responses being in the same thread and lumps them into the same bug (and when the origin was by email that is even without evil bug numbers appearing in the subject with the help of some perl script magic (aka RT action script)). The only time I ever go into the web interface is about once a week to have a look at my list of open bugs and to do some tidying like merging bug reports and things like that. It also has some cool features like "extract this into the FAQ" and there is a "FAQ" in RT that contains an autogenerated FAQ from what people have pulled out in that way. Only problem is for the kernel we would need a beefy system (needs fast database or it gets very slow when you get into 100k+ bugs region) and someone who knows RT well and has a lot of spare time to set it up to your liking and then to maintain it... (RT takes a while to set up because you can tweak just about everything and you can add/modify/remove functionality at will as it is very modular and written in Perl so pretty much anyone can adapt it to do exactly what they want without even needing to wait for lengthy recompiles to happen...) You could for example automate sorting of bug reports into queues (e.g. SCSI, Net, FS, etc) by grepping the emailed bug report (or website generated one although on the website people can choose the queue by hand if they want) and sorting appropriately. Admittedly for this to be at all useful someone would have to spend some time working out intelligent things to grep for or all bugs would match to all queues when they contain dmesg output for example... (-; This might not be perfect but in comparison to bugzilla it is actually usable and at least we here at ...
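The queue sorting described above could look roughly like the following. This is only a sketch in Python, not RT's own mechanism (RT is written in Perl and has its own action scripts): the queue names, the keyword lists, and the `route_report` helper are all made up for illustration. Lines that look like dmesg output are dropped before matching, which is one possible answer to the "all bugs would match all queues" worry.

```python
import re

# Hypothetical queues and keyword patterns; none of this comes from RT itself.
QUEUE_PATTERNS = {
    "SCSI": re.compile(r"\b(scsi|libata|sata)\b", re.IGNORECASE),
    "Net": re.compile(r"\b(netlink|tcp|ethernet|e1000)\b", re.IGNORECASE),
    "FS": re.compile(r"\b(ext3|xfs|nfs|reiserfs)\b", re.IGNORECASE),
}

# Lines that look like dmesg output, e.g. "[   12.345678] scsi0 : ata_piix"
DMESG_LINE = re.compile(r"^\s*\[\s*\d+\.\d+\]")


def route_report(subject, body):
    """Return the queue with the most keyword hits, or "Unsorted" if none hit."""
    # Drop dmesg-style lines so that a pasted boot log does not match every queue.
    prose = "\n".join(
        line for line in body.splitlines() if not DMESG_LINE.match(line)
    )
    text = subject + "\n" + prose
    scores = {
        queue: len(pattern.findall(text))
        for queue, pattern in QUEUE_PATTERNS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unsorted"


if __name__ == "__main__":
    report_body = (
        "[   12.345678] scsi0 : ata_piix\n"
        "[   13.000000] e1000: eth0 up\n"
        "Filesystem goes read-only after resume.\n"
    )
    print(route_report("ext3 corruption after resume", report_body))
```

Anything that scores zero everywhere lands in "Unsorted", so a human still has to look at it; the keyword lists are the part that would need real thought.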
I do completely agree with you on this.
The main parts are people doing some sorting and forwarding of an
incoming bug (currently mostly Andrew) and someone with deeper subsystem
I had the regressions stored in a plain textfile.
For getting regressions reasonably grouped for my regression emails, I
used paper, pen and scissors - and this is not a joke.
That really didn't scale when we had 36 regressions.
So some tool is needed if the bug numbers are bigger - no matter whether
It depends on how you look at bugs.
My ideal was always that reported bugs should be fixed.
If you accept that this is anyway impossible because more bugs get added
If Andrew forwarded a bug reported in Bugzilla to a developer, and
the developer doesn't answer, is this Bugzilla's fault? Or in any other
cu
Adrian
-
..and this is where we differ. OF COURSE bugs should be fixed. But you seem to think that there is something magical and special about every single bug-report. You have a new home assignment: watch the "every sperm is sacred" thing from Monty Python's "Meaning of Life". Google for it. And if you cannot appreciate the absurdity and humor of that thing, maybe you should think about it a bit more. And once you _can_ appreciate the humor of that song/skit, look yourself in the mirror, and ask yourself: "is every bug report sacred?" That's a TOTALLY IDIOTIC argument. That goes from "every sperm is sacred" to "sperm doesn't count at all". Can you not see how stupid that statement of yours really is? Can you not see that anybody who thinks in those kinds of black-and-white terms is simply not FUNCTIONAL! Bugs are neither sacred, _nor_ should they be ignored. Ponder that, grasshopper. And until you can see that things are not "either-or", "black-and-white", "all or nothing", I don't think I really can have anything worthwhile to add in this discussion to you. People who think in absolutes are simply not worth talking to. Linus -
I like the Flying Circus and the other Monty Python films (including
"The Crimson Permanent Assurance"), but "Meaning of Life" didn't impress
I never expected the reality to become as white as my ideal or the
washed things in washing powder ads.
My ideal was white, and the shade of grey of the current reality is
darker than I think it should be.
At least theoretically reachable are things like:
- every incoming bug report is quickly handled by one or more
kernel developers who know the drivers and subsystems involved
- there's a last -rc kernel published for a few days of testing,
and except for the Makefile change the -final is identical to it
(or a new -rc published)
There are sometimes nonsensical bug reports and "handling" could be
rejecting a bug e.g. due to a tainted kernel.
And sometimes mysterious bugs are more or less undebuggable.
But these two points would have resulted in 2.6.21 being released more
cu
Adrian
-
This reminds me very much of what the brilliant computing scientist Edsger W. Dijkstra more than once wrote: `Confusing "love of perfection" with "claim of perfection", people will accuse you of the latter and then blame you for the first.' (EWD709) Also relevant to the discussion, I think, but in another way, is this: `One of them is the dogma that striving for perfection is counterproductive in the sense that it would make software development much too expensive. But what are the main causes of the soaring costs of software development? A major cost, in terms of both manpower and unforeseen delays, is debugging, and one can save a lot by investing more in preventing the bugs from entering the design in the first place. Since the errors are so expensive, in general the high-quality design is also by far the cheaper. Another major cause is that many systems are built on shifting foundations in the sense that the underlying software of operating systems and compilers is too shaky to be stable, with the result that each new release of that underlying software requires possibly extensive adaptation of what has been built on top of it. Finally, many of the tools the programmer is supposed to work with are so poorly documented that they force him to find out by experiment what they might be able to do for him. Since these experiments can be pretty expensive and time-consuming and inductive reasoning being what it is an educated guess is the best the poor programmer can hope for, the poor programmer is really in a miserable position. So here you see three major sources of cost explosion traced down to someone's assumption that striving for perfection is counterproductive!' (EWD952) For the complete documents, see http://www.cs.utexas.edu/users/EWD/transcriptions/EWD07xx/EWD709.html and http://www.cs.utexas.edu/users/EWD/transcriptions/EWD09xx/EWD952.html, respectively. Regards from Vegard's -
This is another case of "perfect is the enemy of good". Trying to reach perfect is not only guaranteed to fail, but trying to reach it AND NOT REALIZING that it's stupid and wrong is actually much WORSE than just trying to do a reasonable job. And if you put some _totally_idiotic_ expectation that all bugreports can be fixed, and should always be totally blocking, that's guaranteed to just cause a totally unusable bug reporting system. And your bugzilla arguments seem to be exactly that. A na
Debian has a bug track system which interacts mainly with the users through email. Seems rather nice to use and doesn't make you sign up to submit things, and has no issues with mailing lists being "subscribed" So in other words, basically the debian bug track system, except perhaps with an ability to submit bugs through a web interface too -- Len Sorensen -
Hi Diego, On 29/04/07, Diego Calleja <diegocg@gmail.com> wrote: Thanks for your help with this! Regards, Michal -- Michal K. K. Piotrowski Kernel Monkeys (http://kernel.wikidot.com/start) -
Useful? For what? Having seen the link at the time I was fighting with yet another rearranged USB setup, and I swear the boot process uses Schrödinger's cat to determine which device found in the usb scan gets the first ttyUSB, and which gets the second, so I went to this site to see if it could be more productive than bugzilla (I'd like to toss about 10 pounds of C4 into the only code repository on the planet that thing is built from), but no, sorry, no biscuit. Why? First, it's yet another password I have to scribble on the wall along with enough surrounding data so I might be able to find it the next time. You can't have it even do a search to see if it already has something similar without creating an account and logging in. Since I'm out of wall space, and the missus is bugging me to paint over all that, I left. I repeat: Useful? For what? -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Simon: (to Jayne) "Enemies? You? No, how can it be?" --Episode #7, "Jaynestown" -
Well, that's not a bugzilla problem. upstream bugzilla allows anonymous search. In fact bugme.osdl.org allows search right on the frontpage. And if you want to dig deeper, use the query function. This is the quicksearch on "USB": Try again. Gruss Bernd -
And that output I'll have to admit, is much more useful. However, in your wildest dreams I couldn't recompose that link line if I was stuck in a pool -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) He who is content with his lot probably has a lot. -
These projects are probably losing plenty of trivial bug reports from people who shouldn't have to register with a bug tracker and Most of the time bugzilla appears to be a great way to ignore bugs. Every large project seems to have one with bugs that stay open for years - bugzilla's own inability to quick search without javascript[1] being a good example (it's fixed now). If no one can get around to fixing reported bugs why should anyone bother submitting more? They even released their software with a link to the bug if you tried to do a quick search without javascript: [1] https://bugzilla.mozilla.org/show_bug.cgi?id=70907 People are doing something about this, at least for tracking the regressions at http://kernelnewbies.org/known_regressions (which is read-only unless you register, people have already I have had experience of this with the dvb mailing list (this was a couple of months before their news post referencing bugzilla too), even when I included a patch to fix it. Someone suggested another way to fix a bug and I replied that it worked - that fix never went anywhere else and other people could be having the same problem today because it's still not been applied anywhere. (I will submit a -1 +1 patch myself when Bugzilla's advanced search interface is far too crowded, that should be clear to anyone. The simpler search usually isn't much use because either it finds too many bugs or "zarro boogs" and you're left wondering if the bug is there or not. -- Simon Arlott -
Right. Dig your head in the sand, and ignore all the other people who piped up and said they hate bugzilla too, and find email much more convenient. It's "just Linus" in your dreamworld. Linus -
I completely agree. My personal experience with bugzilla is that it's very unfriendly to reporters. IMHO it's suitable for tracking unresolved problems along with debug patches, system information etc., but not for _reporting_ new ones. Greetings, Rafael -
What did you find unfriendly? While I also cannot say I love it I don't find it any less friendly than an email. Besides the primary point of bug tracking is not to be friendly to someone, but to (a) fix the bugs and (b) know how many bugs there are for a given release. Any replacement would need to solve this problem too. Email does not solve it as far as I can see. -Andi -
- You are required to select a category and 'component' for your report, which often is difficult (especially if you're not a kernel expert) - You need to have a bugzilla account (or to create one, if you don't) - If you want to add an address to the CC list, it must be known to bugzilla and there's no (obvious) way to check which addresses are known (bugzilla rejects the report if there's a 'wrong' email address in the list) [IMO this is really really broken.] - You are asked to provide many details that need not be relevant and casual reporters don't know that they can skip this part For _tracking_ bugs, the bugzilla is more-or-less suitable. For _reporting_ bugs, IMHO, it's not. And I think they are two _totally_ conceptually different things. You report a bug to let somebody know that there's a problem and this doesn't necessarily mean that the problem has to be tracked. It may be very simple and immediately resolvable, in which case registering it in bugzilla is a loss of time and resources. If the problem _turns_ _out_ to be difficult, _then_ it'll need to be tracked, but usually it's not known whether or not this is the case until someone 'in the know' looks at the initial report. For this reason there should be a simple means of filing initial bug reports with someone to look at them and forward them to appropriate people who will decide if the problem needs to be tracked. If they do, it's time to use You are right, email is not suitable for tracking bugs. Still, it works quite well as a means of sending initial reports. Greetings, Rafael -
The Novell bugzilla actually has that fixed. You have a search email button to look up addresses. Perhaps that feature will be ported someday into Anyways that are mostly all detail (except the registration requirement) that The problem is we need a way to route those reports to the right people. Routing it to a single person or broadcasting it just doesn't scale. And the best way I know of is to use some database that keeps track of the state. The only sane way to do that would be to save them somewhere and keep a list and then let a group of people process them. I disagree. It works small scale but does not really scale well. -Andi -
is the 'keep a list' thing, but that's not a rocket science, IMHO. [For example, you can create a bugzilla entry with a link to the lkml.org copy of the relevant message, so why to require the reporter to file the report with the bugzilla himself?] _Moreover_, some LKML archives, for example at http://marc.info/?l=linux-kernel, keep track of each thread separately, so you can browse any of them at any time. In particular, you can see the _history_ of each bug report sent to LKML if you have a link to any message in its thread. Really, if we ask reporters to put '[BUG]' in the subjects of their messages, you'll even be able to use the lkml.org archives plus wget and a couple of shell scripts to cherry pick the links to all bug reports sent to the list within a given time interval. IMO that depends on how you handle it. Greetings, Rafael -
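The '[BUG]' subject convention suggested above is easy to filter on. The following is a sketch only, and it swaps the proposed wget-plus-shell approach for Python working on an index that has already been fetched: the `(date, subject, url)` tuple layout, the `pick_bug_links` name, and the `archive.example` URLs are assumptions, not the format of lkml.org or marc.info.

```python
from datetime import date


def pick_bug_links(messages, start, end, tag="[BUG]"):
    """Return URLs of thread-starting messages tagged with `tag` and posted
    between `start` and `end` (both inclusive).

    `messages` is an iterable of (posted_date, subject, url) tuples, a
    hypothetical layout for an archive index that was downloaded earlier.
    """
    links = []
    for posted, subject, url in messages:
        # Skip replies so that each report is listed once, by its first message.
        if subject.lower().startswith("re:"):
            continue
        if tag.lower() in subject.lower() and start <= posted <= end:
            links.append(url)
    return links


if __name__ == "__main__":
    index = [
        (date(2007, 4, 26), "[BUG] 2.6.21: suspend hangs", "http://archive.example/1"),
        (date(2007, 4, 27), "Re: [BUG] 2.6.21: suspend hangs", "http://archive.example/2"),
        (date(2007, 5, 10), "[BUG] oops in khubd", "http://archive.example/3"),
        (date(2007, 4, 28), "[PATCH] fix typo", "http://archive.example/4"),
    ]
    for link in pick_bug_links(index, date(2007, 4, 25), date(2007, 4, 30)):
        print(link)
```

Since the archives keep each thread together, the one link per report is enough to reach the whole history of that report.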
How can I get the functionality "show me all unfixed SATA bugs"?
That's one of the important functionalities of every bug tracking
cu
Adrian
-
That's the missing piece, obviously. BTW, I didn't want to say that one could entirely replace a bug-tracking system with tracking the LKML archives. What I wanted to say was that the email messages sent to the LKML were easily trackable and could be hooked up into a bug-tracking system, for example with the help of URLs. In such a setup people could send initial reports to the LKML and the links to these messages might be put into a bug-tracking system as soon as it turned out that the bugs were worthy of tracking. Greetings, Rafael -
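A rough sketch of what such a setup might store, and of how the "show me all unfixed SATA bugs" question could then be answered. Everything here is hypothetical: the `TrackedBug` record, its field names, and the free-form subsystem label are assumptions for illustration, not the schema of bugzilla or of any existing tracker.

```python
from dataclasses import dataclass


@dataclass
class TrackedBug:
    summary: str
    subsystem: str      # free-form label such as "SATA" (an assumption)
    thread_url: str     # link to the original report in a list archive
    fixed: bool = False


def unfixed(bugs, subsystem):
    """Return the bugs in `subsystem` that are not marked as fixed."""
    wanted = subsystem.lower()
    return [b for b in bugs if b.subsystem.lower() == wanted and not b.fixed]


if __name__ == "__main__":
    bugs = [
        TrackedBug("disk not detected", "SATA", "http://archive.example/10"),
        TrackedBug("timeout on resume", "SATA", "http://archive.example/11", fixed=True),
        TrackedBug("khubd going south", "USB", "http://archive.example/12"),
    ]
    for bug in unfixed(bugs, "SATA"):
        print(bug.summary, bug.thread_url)
```

The record only holds a link plus the little state needed for queries; the discussion itself stays in the mail archive.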
Who is doing this "might be put", and why don't you start with asking
the submitter to submit bugs in a bug tracking system and forward the
bug report from the bug tracking system (manually or automatically) to
cu
Adrian
-
Because quite often I know what the problem is after having asked the reporter a couple of simple questions or I can forward his report to a list on which there are the right people and the problem gets fixed quickly, so it just doesn't need to be formally registered. The problem with the bugzilla is that it requires a considerable setup for each bug which is a loss of time if the bug is trivial. Using the bugzilla for handling trivial bugs just doesn't make sense, IMHO. It's too heavyweight for that. [It _is_ useful nevertheless. For example, the suspend metabug that you've created for me is really a nice thing, so thanks a lot again. :-) Perhaps something like this might be used for tracking the regressions ...] Unfortunately, you can't say whether or not the bug is trivial until you see the report. If you've got it from bugzilla, you're stuck with it, so it's just more efficient to use something else in the initial phase of resolving problems. Greetings, Rafael -
Last time I did this, bugzilla at osdl.org won't let me add original reporter to goddamn CC list. It would be el neat, because not everyone followed instructions and forwarding emails between reporter and bugzilla sucks. -
That's related to what I said before. The requirement that the addresses on the CC list must be 'known' to bugzilla is deadly wrong in every case I can imagine. -
But unknown-to-bugzilla email addresses are accepted when they're sent to bugme-daemon@kernel-bugs.osdl.org. This is why I'll very often switch a bug report to email, copying individuals and mailing lists and bugme-daemon. Then bugzilla just sits silently in the background recording everything. But once a bug has switched to email, it needs to stay there - it would be bad if someone were to update the bug via the web UI because none of the emailed participants would know of the update. So I'll often explicitly ask "please follow up via emailed reply-to-all". It's not great, but there's certainly enough material here for people to get in and work on the bug, should they be so inclined. My overall approach with this stuff is: short-term bugs are handled via email and long-term ones are tracked in bugzilla. Hence someone (Hi, Mom) needs to track all the emailed-only bug reports and get them filed in bugzilla once they go stale. -
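The "file it once it goes stale" step is the part that lends itself to automation. A minimal sketch, assuming the emailed reports are already available as simple records: the dictionary keys, the `reports_to_file` name, and the 14-day threshold are all invented here and are not part of the workflow described above.

```python
from datetime import date, timedelta


def reports_to_file(reports, today, stale_after=timedelta(days=14)):
    """Return subjects of email-only reports that have gone quiet.

    A report qualifies when it has no bugzilla entry yet, is not known to be
    fixed, and its last reply is at least `stale_after` old. The record
    layout and the default threshold are assumptions for illustration.
    """
    return [
        r["subject"]
        for r in reports
        if r["bugzilla_id"] is None
        and not r["fixed"]
        and today - r["last_reply"] >= stale_after
    ]


if __name__ == "__main__":
    reports = [
        {"subject": "suspend hangs", "last_reply": date(2007, 4, 1),
         "bugzilla_id": None, "fixed": False},
        {"subject": "oops on boot", "last_reply": date(2007, 4, 28),
         "bugzilla_id": None, "fixed": False},
        {"subject": "netlink recursion", "last_reply": date(2007, 3, 1),
         "bugzilla_id": None, "fixed": True},
    ]
    for subject in reports_to_file(reports, today=date(2007, 4, 30)):
        print(subject)
```

Someone still has to do the filing; the script would only produce the to-do list.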
If I wanted to do this, what would I have to do? I mean, assume I have a bug report that I want to send to someone whom the bugzilla doesn't like. I can do that with suspend/hibernation-related bug reports, but sometimes I'm not sure who is the right person to notify in the first place. -
Sorry, have been out sick, and someone removed me from the cc list, which didn't help. In response to various bits: Firstly a general comment - we're about to upgrade versions, which will ease a few of these issues. I should really finish the creation of virtual category owners for *all* categories. Will see if we can batch that, as it's a total pain to do. Andi Kleen wrote: > - Ask more people to just categorize and reassign bugs (anybody > interested?) The category owners should be able to do that, and help spread the load. The virtual category owner stuff enables many people to "watch" new bugs for that category and help out. > - Give more people in bugzilla the power to reassign arbitrary bugs > (bugzilla maintainers would need to do that) Fairly easy to do, just a permissions issue. Either I can add a bunch of "known" people, or let everyone do it and then slap people if Yes, though we could do with some improved email hooks still, I guess. I much prefer having people watch categories than spamming lists, but if people want lists spammed, we can have that. -
Amen. Both of those are show-stoppers. It may be ok for kernel developers, but it's not good for random people. It's one of the reasons I don't generally use bugzilla for other projects - if they require me to sign up etc, they can take care of their own bugs. (For the same reason I consider closed mailing lists to be useless for bugreports. If you have to be a member to send email, it's not a bug-report thing, it's just a secret society). I realize that spam is a problem, but there are better spam solutions than alienating the people who just want to send a report. Also, I don't know if anybody has ever tried to avoid sending duplicate bugs, but every time I use bugzilla to send a bug-report (which I do for things like FC7 live CD's breaking etc where I care enough, and I have a bugzilla account on that bugzilla _anyway_ for other reasons), I am ready to _kill_ somebody whenever I see that "humorous" zarro bugs found message, after it made it almost impossible for me to even figure out how to do a good search in the first place! So I gnash my teeth, and fill in the bug report anyway, and if it's a duplicate because bugzilla didn't have any sane way to get any kind of overview at all, hey, it's a duplicate. Nothing I can do about it. And the reason I gnash my teeth is that I realize that because I can't even get any overview of the bugzilla entries as a random submitter (and dammit, I probably have a better idea of what the problem might be than most people), I sure as hell can't expect a random user to be better. So I bet my bad reports are a lot better than the average, and a lot of people probably just give up and don't report anything at all. Linus -
It is much too complicated for new reporters. I remember I sent a patch by mail for NTP to people at ISC, and they asked me to pass through bugzilla because it was important for them to track it. What initially was a 5 minutes email turned to a 30 minutes nightmare with doubts at every click, and it was even difficult for me to attach the patch. Later they replied to me by mail, which I consulted from another address, I replied and was rejected because it was not the same address... I had to be very very motivated to use such a crap. Definitely the tool we need if we want to reduce the number of bug reports! Just my experience as a bug reporter... Willy -
Email fixes a _lot_ more bugs than bugzilla does. End of story. I don't think anybody who cannot accept that UNDENIABLE FACT should even participate in this discussion. Wake up and look at all the bugs we fix - most of them have never been in bugzilla. That's a FACT. Don't go around ignoring reality. Linus -
* Newsgroups: gmane.linux.kernel
* Date: Sun, 29 Apr 2007 10:50:22 -0700 (PDT)
I'm seeing this long (198) thread and just have no idea how it has
ended (wiki? hand-mailing?).
Just two general questions to Adrian.
1) You were the maintainer of the woody backports, weren't you[0]? Why didn't
you propose (or use) Debian's BTS as an alternative to bugzilla, and how did
you do your regression tracking?
What exactly doesn't fit? Full control by e-mail, comprehensive
management, ML handling/redirection, tagging, sorting, searching? Finally,
reportbug tool and web interface?
2) Your decision to stop your activity in Debian, was that because Sarge was
released with a known security hole in the kernel[1]?
I'm just wondering.
[0] google: "woody backports Adrian Bunk"
[1] Message-ID: <20070331194728.GA31853@powerlinux.fr>
Xref: news.gmane.org gmane.linux.debian.devel.kernel:27730
[Just take your news readers and have fun with Gmane!]
[For those, who don't know what it is -- web :]
Archived-At: <http://permalink.gmane.org/gmane.linux.debian.devel.kernel/27730>
--*--
Unfortunately this message is from a man who was punished in a very
unfair manner by "fellow developers". I'm not trying to raise this
issue (sorry if I'm trolling), I just want to say that life can be
very unfair when the wrong people are in power...
Message-ID: <20070529053026.GA28352@powerlinux.fr>
Xref: news.gmane.org gmane.linux.debian.devel.project:12330
Archived-At: <http://permalink.gmane.org/gmane.linux.debian.devel.project/12330>
____
-
Direct or indirect results:
- See Michal Piotrowski's periodic posts and
http://kernelnewbies.org/known_regressions .
- Meanwhile, the people who maintain bugzilla.kernel.org seem to work
on improvements. I noticed that (a) each page now has a backlink to
the bugzilla.kernel.org start page, (b) the show_bug.cgi=... page
layout is now an unreadable mess, (c) e-mail integration is still
the same (it's impossible at least for me to send e-mails to bugs).
[...]
BTS has been mentioned in that thread in a few posts; mostly positively
as I recall.
--
Stefan Richter
-=====-=-=== -==- -===-
http://arcgraph.de/sr/
-
On Thu, Jun 14, 2007 at 05:33:40PM +0200, Stefan Richter wrote:
I know that most developers here are not working/familiar with what
Debian has as its bug shooting weapon: ``The system is mainly controlled
by e-mail, but the bug reports can be viewed using the WWW.''[0]
I thought somebody who is familiar with that might propose to set up/tune
it, rather than doing yet another NIH thing, especially from an e-mail
integration POV. I doubt mozilla guys can think about it without
javascript and/or java servlets :)
[0] <http://www.debian.org/Bugs/>
____
-
The problem isn't Bugzilla, and the Debian BTS wouldn't solve any
problem.
What is missing?
We need people who know one or more subsystems and who are willing to
regularly handle bug reports in their area.
And we need a release process that makes debugging, and if possible
fixing, all regressions prior to the release mandatory. You might never
come down to zero regressions and might not be able to handle all
last-minute reported regressions, but the 2.6.21 situation with 3 week
old known regressions not ever being debugged by a kernel developer
before the release has much room for improvements.
Changing the BTS would make sense if some core developers would state
that they would start using the BTS after this change. But otherwise it
doesn't matter which BTS to use.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
I think if somebody, by example will show how it can be handled in more
convenient way, that will eventually become mainstream. As we know,
nothing gets from vacuum just like that, without taking energy and time.
And my question was not about this social problem of acceptance, support
etc.
Linus had spent some time in this thread trying to explain what problems
are: as from that (social, think scheduler :) POV, as also from
"zarro bogs found" one.
Also, after i saw Linus' message about doing mostly tools for the last couple
of years, i wonder why you, Adrian, didn't think about your tools first,
before you started regression tracking? You are not running in front
of a train, unlike you know who, plus bugzilla issues have been known for
years. Luckily the Fedora kernel guys are also upstream developers, thus lkml
and other MLs are under their view.
After having read all that, i asked you my question, as the person
who supposedly used BTS as a maintainer.
Yes, in current form it might not be in suitable configuration, i.e.
kernel sub-systems instead of packages etc, anyway main thing is the way
BTS is handled. While i was looking and replying for bug reports in the
Debian kernel, that i saw in lkml, i've noticed, just how guys work with
it there. Now they even came up with tracking upstream bugzilla, it
seems [0]. I left that activity due to RL some months ago, but now trying
to catch up things again.
Thus it's just my curiosity about all this. And BTS is like, you know,
why not, if it fits by mostly all parameters?
[0] Message-ID: <handler.s.C.118074292516912.transcript@bugs.debian.org>
Xref: news.gmane.org gmane.linux.debian.devel.kernel:29426
So, as i wrote before: one must give them a pretty-shiny tool, kindly
barking in their inboxes, instead of for example
"Guilty: **** ***** <????@****.com>",
as it was on the very beginning.
____
-
My tool was a textfile with a text editor.
For a smaller amount of regressions that works fine.
And it's not that Linus started developing the Linux kernel with writing
git, the first 10 years of Linux development were without any SCM.
Both the Debian BTS and Bugzilla are usable programs with their own
advantages and disadvantages.
A pretty-shiny tool wouldn't change anything.
What you need are humans debugging the regressions and humans reminding
other humans that they should debug the regressions.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
I may have missed something, but I recall that Adrian's bugtracking, while
it lasted, and now Michal's continuing it mostly came into being because
Adrian just started doing it and others soon found it very useful.
--
Stefan Richter
-=====-=-=== -==- -===-
http://arcgraph.de/sr/
-
I'm hoping it's not "ended".
IOW, I really don't think we _resolved_ anything, although the work that
Adrian started is continuing through the wiki and other people trying to
track regressions, and that was obviously something good.
But I don't think we really know where we want to take this thing in the
long run. I think everybody wants a better bug-tracking system, but
whether something that makes people satisfied can even be built is open.
It sure doesn't seem to exist right now ;)
Linus
-
The problem is not the bug tracking system, be it manual tracking in a
text file or a Wiki or be it in Bugzilla or any other bug tracking
system.
One problem is the lack of experienced developers willing to debug bug
reports.
But what really annoyed me was the missing integration of regression
tracking into the release process, IOW how _you_ handled the regression
lists.
If we want to offer something less of a disaster than 2.6.21, and if we
want to encourage people to start and continue testing -rc kernels, we
must try to fix as many reported regressions as reasonably possible.
This means going through every single point in the regression list
asking "Have we tried everything possible to solve this regression?".
There are very mysterious regressions and there are regressions that
might simply be reported too late. But if at the time of the final
release 3 week old regressions hadn't been debugged at all there's
definitely room for improvement. And mere mortals like me reminding
people is often not enough, sometimes an email by Linus Torvalds himself
asking a patch author or maintainer to look after a regression might be
required.
And a low hanging fruit to improve the release would be if you could
release one last -rc, wait for 48 hours, and then release either this
-rc unchanged as -final or another -rc (and wait another 48 hours).
There were at least two different regressions people ran into in 2.6.21
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
*debug*
I hope you saw what subject i've chosen to bring this discussion back.
Yes, "tracking", as the first brick for big wall.
Your arguments about developers and users, you've said already, but i've
asked different questions, have i?
Let's look on regular automatic report, like this one:
Message-ID: <E1Hz5HK-0007uB-MO@merkel.debian.org>
Xref: news.gmane.org gmane.linux.debian.devel.general:116248
Archived-At: <http://permalink.gmane.org/gmane.linux.debian.devel.general/116248>
And what we see? Basic packages, like ``dpkg'', ``grub'', ``mc'' are in
the list, requesting help. And as you can see for quite some time. And
it's *OK*, because distribution is working, development is going on.
*tracking*
Despite of tools, Debian have such thing as long release cycles, so
called ``Debian sickness''. And reason, i see, is what you've just pointed
out: less disaster, zer0 RC bugs. Plus everybody is volunteer, big chunk
of bureaucracy-based decisions. And all this for about 15000 packages.
* + reporting*
One side Linux is facing is hardware, and that kind of thing is very-very
diverse. LKML traffic is huge, yet there's no suitable tracking and
*social* (first approximation)
That's a social problem, just like Debian losing good kernel team
members.
For example you feel, that you've wasted time. But after all, if you've
came up with some kind of tool, everybody else could take it. No problems,
useful ideas must and will evolve. But _ideally_ this must not be from
ground zero every time. _Ideally_ from technical, not personal point of
view ;).
That's why people in Debian have started *team* maintenance with alioth.
Unfortunately problems with individuals in big machine with bad people,
got randomly elected, can't be solved (IMHO). Even LKML's rule "patches
are welcome", that is very technical, thus good, doesn't work there.
What about Linus' tree is a development tree, Andrew's one is a "crazy
development one" (quoting Linus)?
What about open (web ...
Tracking regressions is not a real problem.
I sent weekly regression reports.
Not automatically generated but manually - but that doesn't matter.
no, *debugging*
I tracked regressions for the 2.6.21 disaster, and the not debugged
As I expected, someone else has picked up regression tracking.
The problem for the Linux kernel is that for a better bug handling you'd
need people willing to learn other people's code and to do the hard work
of debugging bug reports. E.g. writing a new filesystem is simply much
more fun than learning and debugging other people's code in some old
filesystem.
Talking about "team maintenance" sounds nice, but the problem in the
kernel starts with code that has zero maintainers. And if there's
already a maintainer, it's unlikely that he'll not accept patches from
some new person debugging bug reports. But how to find people who will
Debian has its own problems, Linux kernel development at least has a
People aren't that dumb that some empty words like "release is a reward"
Bug tracking for the kernel is more or less working.
The main problem is getting people to debug bug reports.
We need the main problem fixed, not a different tool in an area that is
not the main problem.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
[I've added Herbert as former kernel team member in the debian(AFAIK),
sorry, if i'm wrong and you have no opinion on that, Herbert.]
I've tried to express different point of view. We have different ones {0}.
The term ``like'' here means people are not able/willing to do work, they
might/can do. And cause of it is *social*, not technical. {1},{2} are
results of that problem/behavior. But according to {0}, you think,
So you expected good, doing bad. ``Bad'' means bringing pointless flame
about what everybody should do, without constructive approach like: "OK,
i can't do it due to my POV{0}, useless manual work. Everybody willing to
bring another way of dealing with it is welcome."
Your first reply:
"And it's not that Linus started developing the Linux kernel with writing
git, the first 10 years of Linux development were without any SCM." {3}
to my note about, that you are not hurry anywhere, that after all that
years of Open Source and Free Software development you are not trying
to deal with such important thing like regression/bug tracking in
... {0}++
Do you know, for example, why i'm not making my "hacker's career"
doing that?
1. because i ended up with lynx, slrn, mutt, emacs-nox. Including
"zarro bogs found" kind of thing and other "userspace suck" problems.
2. i have no way to know if something *really* broken, unless it right
on my hardware
This all unlike Debian BTS using reportbug, where you have basic
information, mbox format messages for easy "mutt -f", and other funny
things, real maintainers aware of (i'm trying to know, learn about).
Thus organized, non brain-damaged way of reporting and tracking is the
It's like in {3} -- i don't like it (personally), so i'm going along.
Floppy went pretty fine, before it was started to be maintained, and
you know that. But you also told that unmaintained and not-working are
different things.
Thus, if that just happen to break, well reports are welcome, and if
long time run will show, ...
I am dealing with bug tracking in the kernel Bugzilla.
I did regression tracking for the kernel.
Michal is currently tracking regressions.
Andrew is doing an enormous amount of work in these areas.
You might not see it because you are not active in this area, but it is
working in an organized way.
Bugzilla is a usable tool for bug and regression reporting and
tracking.
I have been using the Debian BTS for 8 years and I've used many Bugzillas.
Both are usable, and the real problem in kernel development is not
unmaintained != unused
user != developer
worked != went pretty fine
Stuff can easily get broken and no one looks after bugs if there's no
maintainer both knowing the code and willing to debug bugs.
The floppy driver is actually an example of code that has been broken
quite often by patches simply because no one who completely understands
this driver reviewed patches.
It somehow works and it might work for some years, but there is a
IMHO your point of view is simply not related to the real current
quality problems in kernel development.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
[...]
Linus also said that Andrew's tree is abused too often for broken stuff.
My goal for the little driver subsystem I'm maintaining is
- everything that Andrew pulls from me builds and runs and doesn't
introduce regressions to my and the submitters' knowledge. I am
slowly expanding my test procedures to catch things that fail that
goal.
- Everything that Linus pulls from me fulfills the above criteria
and, in addition, had reasonable time and publication for test and
review, depending on the kind of patch.
I had a few regressions in Linus' releases. None of them were known
before release. All of them were debugged and fixed rather soon after
report, AFAIR.
So what _I_ need is neither better regression tracking nor more manpower
for debugging of regression reports. What I need is more own spare time
and equipment for tests, more own knowledge and experience, and more
people who run-time test -rc kernels or at least my subsystem updates on
top of older kernels.
(Note, I'm talking only about regressions here, not old bugs.
There my requirements are different; the by far most important
one is more manpower for debugging and fixing.)
Well, if _other_ subsystems would get regressions in Linus' tree fixed
quicker, there might perhaps be more people who would consider to run
-rc kernels and would catch and report "my" regressions.
[Oleg, sorry that I too digressed from the subject of your thread, but
your remark about "[crazy] development tree"s caught my eye. IMO people
should care for quality already in Andrew's tree --- more so than at the
moment.]
[Adrian, I'm not saying "too few users run -rc kernels", I'm saying "too
few FireWire driver users run -rc kernels".]
--
Stefan Richter
-=====-=-=== -==- =----
http://arcgraph.de/sr/
-
Getting more people testing -rc kernels might be possible, and I don't
think it would be too hard. And not only FireWire would benefit from
this, remember e.g. that at least 2 out of the last 5 kernels Linus
released contained filesystem corruption regressions.
The problem is that we aren't able to handle the many regression reports
we get today, so asking for more testing and regression reports today
would attack it at the wrong part of the chain.
Additionally, every reported and unhandled regression will frustrate the
reporter - never forget that we have _many_ unhandled bug reports
(including but not limited to regression reports) where the submitter
spent much time and energy in writing a good bug report.
If we somehow gain the missing manpower for debugging regressions we can
actively ask for more testing. Missing manpower (of people knowing some
part of the kernel well) for debugging bug reports is IMHO the one big
source of quality problems in the Linux kernel. If we get this solved,
things like getting more testers for -rc kernels will become low hanging
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
Hi all,
Adrian, I agree with _all_ your points.
I bet that developers will hate me for this.
Please consider for 2.6.23
Regards,
Michal
--
LOG
http://www.stardust.webpages.pl/log/
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
--- linux-work-clean/Documentation/SubmitChecklist 2007-06-17 11:18:37.000000000 +0200
+++ linux-work/Documentation/SubmitChecklist 2007-06-17 11:29:26.000000000 +0200
@@ -90,3 +90,8 @@ kernel patches.
 patch style checker prior to submission (scripts/checkpatch.pl).
 You should be able to justify all violations that remain in your patch.
+
+
+
+If the patch introduces a new regression and this regression was not fixed
+in seven days, then the patch will be reverted.
-
Those regressions where we know which patch caused them are the easy
ones. Often we don't know which patch (or even which subsystem merge) is
at fault.
I think.
How many of the present 2.6.22-rc regressions which you're presently
tracking have such a well-identified cause?
-
Here lies the problem. git-bisect is a killer app, people should start
using it.
Regards,
Michal
--
LOG
http://www.stardust.webpages.pl/log/
-
It's OK _only_ in case of unknown, hard to find *hardware* bugs.
If you think it's "a good thing" for bad, untested by developer code,
then something is completely wrong. And if there's no debugger in the
mainline kernel, which is developer's tool, then why do you think testers
must stick with git-bisect, as their debugger-like tool (bandwidth in
most and time consuming in some cases)?
That's wrong if developers are tending to reply only one thing --
git-bisect. If things are going to be that bad, then better to start
dealing with the cause, not consequences.
In this situation requesting test-cases is a better way, as it's going to
influence developer as cause of potential problems. If tests will show
*hardware* side of problem, then, well some parts may be not obvious,
thus bisecting is a way to continue.
Sorry if i'm from the abnormally different side yet one more time.
____
-
Oh, I've just fixed two purely software bugs pointed out by binary
searching in the code that I'm sure has been tested, not only by its
developers, but the bugs only showed up in my configuration (on one out
of four test boxes).
There are so many different kernel configurations possible that there's
no way a developer can test them all.
Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth
-
With the current state of affairs it's hard not only for developers, but
also for users: <20070221220520.GA20659@artselect.com>,
<20070429230037.95120@gmx.net>
I'm trying to re-do some kbuild stuff, but i'm getting rather offensive
answers :( <1182020654.8176.398.camel@chaos>
(Even if i'm academic with free Internet, i doubt i even tried to
think to improve something, if i didn't have one, because i wouldn't knew
huge lkml traffic, problems, etc.)
Maybe i'm wrong. But reducing amount of traffic/files and ease of
(re-)configuration are not last things to be done for better testing.
All for speed of getting and compiling kernel. Latter for avoiding
bugs and noise due to inconsistent build configuration.
Finally again, bug-reporting and tracking tools, i've tried to discuss
are major problems out there I think it's plain easy and deal with. One
more example:
<handler.s.C.117647526113388.transcript@bugs.debian.org>
Xref: news.gmane.org gmane.linux.debian.devel.kernel:28095
<http://permalink.gmane.org/gmane.linux.debian.devel.kernel/28095>
____
-
Uwe has an attitude that made many people (including Linus himself)
set their mail filters to deliver his emails directly to /dev/null.
Parts of the contents of his emails were usable including usable
regression reports - but the way he treats people simply disqualified
I'm not seeing anything in Thomas' email that could be considered
offensive. He told you in a technical way why he disagrees with you.
If you call this email "rather offensive", you should _really_
unsubscribe from lkml (or even any Debian mailing lists). And this is
not meant against you, it's simply that for the standards of lkml there
is nothing offensive in this email.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
most people who report bugs don't know enough about what's actually going
wrong to be able to write a test case (those that do can probably just
write a patch to fix it). Along similar lines a debugger wouldn't be of
much use either.
the fact that git-bisect doesn't require any knowledge other than
knowledge the reporter has demonstrated that they already have (the
ability to compile and install their own kernel) puts it within the reach
of testers. unfortunately, as good as it is it can take a lot of effort,
especially if the bug takes time to show up.
it's not perfect, but it's a huge help.
and developers aren't always responding with 'do a bisect', sometimes
they respond with 'yes, we know about that' or 'that sounds like X', so
it's still worthwhile for people to report the problem first before going
to the effort of doing a bisect.
David Lang
-
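The bisection idea discussed in the messages above can be sketched as a plain binary search over a linear history. This is only an illustration, not git's implementation: the commit names, the `first_bad` helper and the `build_and_test` callback are hypothetical, and the sketch assumes that the first commit is known good, the last is known bad, and every commit after the culprit is also bad.

```python
from typing import Callable, List, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    Assumes commits[0] is good, commits[-1] is bad, and that badness
    persists once it has been introduced.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # culprit is at mid or earlier
        else:
            lo = mid  # culprit is after mid
    return commits[hi]


# Hypothetical history: 1001 commits, regression introduced at "c700".
history = [f"c{i}" for i in range(1001)]
tested: List[str] = []


def build_and_test(commit: str) -> bool:
    """Stand-in for 'build this kernel, boot it, check for the bug'."""
    tested.append(commit)
    return int(commit[1:]) >= 700


culprit = first_bad(history, build_and_test)
print(culprit, "found after", len(tested), "build-and-test cycles")
```

The number of cycles grows with the logarithm of the history length, which is why the approach is within reach of testers, but each cycle is a full build and boot, which is where the effort mentioned above comes from.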
Sorry for my English. Requesting test cases from the developers of
course. Or at least results of some kind of testing, so people may run
and check them as well, if something is suspected. And this from my POV
leads again to organized way of filtering noise and collecting structured
I think, positive feedback from {0} to the LTP may improve that.
That two are _exactly_ what reportbug tool is doing. That's why i'm
talking about it. And i'm *no* wonder why developers are boring -- at
__
Bits for Adrian.
*ML*
I *use* Gmane. I'm not subscribed (receiving e-mail to my mbox) to any ML,
except <kbuild@sf.net>.
Nearly every my e-mail here is with Gmane links. You seem ignored all of
them. As for me it's result of *your personal*, rather than technical
activity.
*offense*
I'm not talking about personal offense, you are seem thinking about, but
technical one. I.e. when possible benefit might be even more, than NOHZ
on x86 and a like[0], with much less effort. I still think, unless i will
develop or fail, that reducing traffic on one or two order of magnitude
is possible as well as improving kbuild/kconfig to reduce of the noise of
mis-configurations/tester's .config length. Discouraging that effort is
my source of offense.
(FYI Until Linus checked in my _RFT_ kbuild patches, i realized how
*many* people are willing to understand and try to test kbuild stuff.)
[0] I bet VGA, DRAM, HDD are far more power hungry room-heaters. Unless
you can substantially lower frequency, you might have no benefit at
all. Whana know how it's done in perfectly designed embedded
MCU/CPU? Please, see for instance MSP430 from TI (i know, it uses
SRAM, but i've asked to look on processor core design).
____
-
Except when the bisection points us to a patch exposing a bug that is
present regardless (see http://lkml.org/lkml/2007/6/13/273 for example).
Besides, if a patch is merged before -rc1 as a bugfix, there are several
patches depending on it and only after -rc5 has been released we find out
People should test _all_ of the -rc kernels and report problems.
Otherwise, we may assume that there are no problems and go on.
Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth
-
Fine with me, but:
There are not so simple cases like big infrastructure patches with
20 other patches in the tree depending on it causing a regression, or
even worse, a big infrastructure patch exposing a latent old bug in some
completely different area of the kernel.
And we should be aware that reverting is only a workaround for the real
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
It is a different case.
"If the patch introduces a new regression"
introduces != exposes an old bug
Removal of 20 patches will be painful, but sometimes you need to
Regards,
Michal
[1] the quote from "The Last Wish/Minor Evil" by Andrzej Sapkowski :)
--
LOG
http://www.stardust.webpages.pl/log/
-
My remark was meant as a note "this sentence can't handle all
regressions" (and for a user it doesn't matter whether a new
regression is introduced or an old regression exposed).
And this is something I want to emphasize again.
How can we make any progress with the real problem and not only the
symptoms?
There's now much money in the Linux market, and the kernel quality
problems might result in real costs in the support of companies like
IBM, SGI, Redhat or Novell (plus it harms the Linux image which might
result in lower revenues).
If [1] this is true, it might even pay off for them to each assign
X man hours per month of experienced kernel developers to upstream
kernel bug handling?
This is just a wild thought and it might be nonsense - better
suggestions for solving our quality problems would be highly welcome...
cu
Adrian
[1] note that this is an "if"
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
I think that we can handle bug reports like we handle modifications of
code. Namely, for each subsystem there can be a person (or a team)
responsible for handling bugs, by which I don't mean fixing them, but
directing bug reports at the right developers or subsystem maintainers,
following the history of each bug report etc. [Of course, these people
can choose to use the bugzilla or any other bug tracking system they
want, as long as it works for them.] The email addresses of these people
should be known (and even documented), so that everyone can notify them
if need be and so that it's clear who should handle given bug reports.
Just an idea. :-)
Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth
-
Currently, these people are "Andrew Morton" and the addresses are
linux-kernel@vger.kernel.org and http://bugzilla.kernel.org/ - and this
part is working.
Although there is room for improvement in this area, the problem in the
pipeline is really to find developers who know the code in question and
who are willing to debug bug reports.
There are unmaintained parts of the kernel.
And there are parts of the kernel where the maintainers are developing
code, reviewing code and handling patches but are not willing or simply
not capable of looking at bug reports. That's not against these people
and they might do great work, but then there's simply an additional
person missing who would be willing to learn the subsystem in question
and handle bug reports.
All bug handling becomes moot and every request for more information from
the submitter a waste of time if there's no one available for
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
Those are very good ideas indeed. The whole development process came to
the point when all realize that something needs to be done for the team
to balance out new development and old and recent unresolved issues that
are piling up...
I've looked through a number of bugzillas recently and here is my scoop
on shortcomings and some ideas. I am not sure how realistic they are,
probably might fall into "wishful thinking" category.
The way bugs get tracked and resolved is definitely a "no guarantee", and
main reasons are:
- not enough time for a maintainer to attend to them all
- nobody else (except at best very few busy people) knows about majority
  of the problems. Andrew and Adrian and Michal post the most pressing
  ones. But there are many many smaller ones that are not assessed and
  not being taken care of.
- many problems are not easily reproducible and not easy to verify
  because there is no identical system, motherboard, application, etc. in
  case if reporter doesn't stick around until the end of the bug's life.
Maybe along with bugzilla there should be another tracking tool - for
resources and systems that are available to individual developers.
Someone might have same or similar system to verify fixes in case if the
reporter disappears or "the system is gone now". Requests for specific
hardware can be automatically generated by the bugzilla say. Those can be
posted once in a while for everyone to see and chip in and acknowledge if
they happen to have such hardware and able to run a quick test to at
least verify the patch. Statistically, such need doesn't happen often for
each type of hardware, so it shouldn't be a big burden for owners.
Besides, the database and resources can be useful for developers who want
to test their new patches on variety of hardware. This might prevent
future regressions which often caused by lack of testing as we all know.
There are problems that require more research and thinking such as
implementing new features or redesigning old ones. ...
I agree. In addition, there is only a limited time window in which it
makes sense to debug given problem before the kernel changes too much
(that of
For that, I think, some "professional testers" would be needed ...
Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth
-
Hi,
IMO we should concentrate more on preventing regressions than on fixing
them. In the long-term preventing bugs is cheaper than fixing them
afterwards.
First let me tell you all a little story...
Over two years ago I've reviewed some _cleanup_ patch and noticed three
bugs in it (in other words I potentially prevented three regressions). I
also asked for more thorough verification of the patch as I suspected
that it may have more problems. The author fixed the issues and replied
that he hasn't done the full verification yet but he doesn't suspect any
problems...
Fast forward...
Year later I discover that the final version of the patch hit the
mainline. I don't remember ever seeing the final version in my mailbox
(there are no cc: lines in the patch description) and I saw that I'm not
credited in the patch description. However the worse part is that it
seems that the full verification has never been done. The result?
Regression in the release kernel (exactly the issue that I was worried
about) which required three patches and over a month to be fixed
completely. It seems that a year was not enough to get this ~70k
_cleanup_ patch fully verified and tested
issues and as a reward I got no credit at all and extra frustration from
the fact that part of my review was forgotten/ignored (the part which
resulted in real regression in the release kernel)... Oh and in the past
the said developer has already been asked (politely in private message)
to pay more attention to his changes (after I silently fixed some other
regression caused by his other patch).
But wait there is more, I happened to be the maintainer of the subsystem
which got directly hit by the issue and I was getting bugreports from the
users about the problem... :-)
It wasn't my first/last bad experience as a reviewer... finally I just
gave up on reviewing other people's patches unless they are strictly for
IDE subsystem.
The moral of the story is that currently it just doesn't pay off to do
code ...
I dunno. I suspect (hope) that this was an exceptional case, hence one
yup, Reviewed-by: is good and I do think we should start adopting it,
although I haven't thought through exactly how.
On my darker days I consider treating a Reviewed-by: as a prerequisite for
Ignoring a review would be a wildly wrong thing to do. It's so unusual
that I'd be suspecting a lost email or an i-sent-the-wrong-patch.
As for high regressions/patches ratio: that'll be hard to calculate and
tends to be dependent upon the code which is being altered rather than
who is doing the altering: some stuff is just fragile, for various
reasons.
One ratio which we might want to have a think about is the patches-sent
We of course do want to minimise the amount of overhead for each
developer. I'm a strong believer in specialisation: rather than requiring
that *every* developer/maintainer integrate new steps in their processes
it would be better to allow them to proceed in a close-to-usual fashion
and to provide for a specialist person (or team) to do the sorts of
things which you're thinking about.
-
How about the following "algorithm":

* Step 1: Send a patch as an RFC to the relevant lists/people and only if there are no negative comments within at least n days, you are allowed to proceed to the next step. If anyone has reviewed/acked the patch, add their names and email addresses as "Reviewed-by"/"Acked-by" to the patch in the next step.

* Step 2: Send the patch as an RC to the relevant lists/people _and_ LKML and if there are no negative comments within at least n days, you can proceed to the next step. If anyone has reviewed/acked the patch, add their names and email addresses as "Reviewed-by"/"Acked-by" to the patch in the next step.

* Step 3: Submit the patch for merging to the right maintainer (keeping the previous CC list).

where n is a number that needs to be determined (I think that n could be 3).

Still, even very experienced developers make trivial mistakes, so there should be a way to catch such things before they hit -rc or even -mm kernels

Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth
-
I think that n should be a function of the number of accepted patches that this person sent in before, and the number of regressions he caused in the past. Ie, new developers have to wait a considerable amount of time - while experienced developers who never caused a regression should be able to write patches that are immediately applied. Also, if anyone causes a regression - that would lead to them having to wait longer the next time before they can apply the patch - a good reason for a developer to put extra time into making sure there are no regressions. -- Carlo Wood <carlo@alinoe.com> -
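To make the suggestion above concrete, here is a minimal sketch of a waiting period that depends on the submitter's track record. Every number in it (14 days for a newcomer, 50 accepted patches to count as trusted, 7 extra days per regression) is a made-up assumption for illustration only; nobody in this thread proposed specific values, and the function name is hypothetical.

```python
def review_wait_days(accepted_patches: int,
                     regressions_caused: int,
                     newcomer_days: int = 14,   # assumed wait for a first-time submitter
                     trusted_after: int = 50,   # assumed patch count at which the base wait reaches 0
                     penalty_days: int = 7) -> int:  # assumed extra wait per past regression
    """Return a hypothetical review period n (in days) for one submitter."""
    if accepted_patches < 0 or regressions_caused < 0:
        raise ValueError("counts must be non-negative")

    # Base wait shrinks linearly from newcomer_days down to 0 as the
    # number of accepted patches grows towards trusted_after.
    progress = min(accepted_patches, trusted_after) / trusted_after
    base_wait = round(newcomer_days * (1 - progress))

    # Every regression caused in the past adds a fixed penalty.
    return base_wait + penalty_days * regressions_caused


if __name__ == "__main__":
    for accepted, regressions in [(0, 0), (25, 0), (50, 0), (50, 1)]:
        print(accepted, regressions, review_wait_days(accepted, regressions))
```

With these assumed constants a newcomer waits two weeks, a developer with 50 clean patches is applied immediately, and a single regression puts that same developer back to a one-week wait. The objection raised in the next reply (patch size and impact matter more than reputation) would need additional inputs that this sketch does not model.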
The character of the patch (potential impacts, size...) and availability of reviewers and testers influence the required review time so much that other factors, like reputation of the submitter, hardly matter. -- Stefan Richter -=====-=-=== -==- =---= http://arcgraph.de/sr/ -
So we need a bug/regression/patch tracking system based on MMORPG game ;) Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ -
Will send in pm.

I've been around too long not to learn a few things...

rule #3 of successful kernel developer

Ignore reviewers - fix the bugs but don't credit reviewers (crediting them makes your patch and you look less perfect), if they are asking questions requiring you to do the work (verification of taken assumptions etc.) do not check anything - answer in a misleading way and present the assumptions you've taken as a truth written in stone - eventually they will do the verification themselves.

I really shouldn't be giving these rules out (at least for free 8) so this time only #3 but there are many more rules and they are as dead serious as

Adding Reviewed-by for reviews which highlighted real issues is obvious (with more detailed credits for noticed problems in the patch description).

Also when somebody reviewed your patch but in the discussion it turned out that the patch is valid - the review itself was still valuable so it would

Easy to work around by a friendly my "Reviewed-by:" for your "Reviewed-by:"

It is not unusual at all. I mean patches which affect code in such a way that it is difficult to prove its (in)correctness without doing a time consuming audit.

i.e. let's imagine doing a small patch affecting many drivers - you've tested it quickly on your driver/hardware, then you skip the part of verifying correctness of the new code in other drivers and just push the patch

As a patch author you can either assume "works for me" and push the patch or do the audit (requires good understanding of the changed code and could be time consuming).

It is usually quite easy to find out which approach the author has chosen - a very sparse patch description combined with changes in code behavior not mentioned in the patch description should raise the red flag. :)

As a reviewer having enough knowledge in the area of code affected by the patch you can see the potential problems but you can't prove them without doing the time consuming part. You may try to NACK ...
Suppose you have modified the patch as a result of a review and you post the modified version. Is it still right to put "Reviewed-by" into it? Personally, I don't think so, because that suggests that this particular

Yes, IMO in such a case it would be appropriate to do that. Also, the review need not lead to any negative comments from the reviewer, but in that case it's also appropriate to add a "Reviewed-by" to the patch.

Generally, if someone comments on my patches, I add his/her address to the next version's CC list, which sort of documents that the reviewer was involved. Then, if the reviewer ACKs the patch, that will be recorded.

I think that for "Reviewed-by" to work correctly, we ought to have a two-stage process of accepting patches, where in the first stage the patch is reviewed and if there are no objections, the "Reviewed-by" (or "Acked-by") records are

First of all, the author should have a good understanding of what he's doing and why. If there are any doubts with respect to that, the patch is likely to introduce bugs.

This also depends on who will be handling the bug reports related to the patch.

Well, IMHO, the author of the patch should convince _you_ that the patch is correct, not the other way around. If you have doubts and make him think twice about the code and he still can't prove his point, this means that he

Unless you are the poor soul having to handle bug reports related to the

I don't think that education alone will be enough. IMO we need to have a system that promotes the reviewing of code.

Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth
-
Well, if you got the "fix issues" part right in the modified version it shouldn't really matter. ;)

But yes, we may wait with adding "Reviewed-by" after the modified patch

This is a nice theory, practice differs greatly.

Sometimes you are not in a position to prevent suspicious patches from being merged and sometimes you just don't want to do it for various reasons (not

Sure, we need to start somewhere...

Bart
-
If we introduce a "Reviewed-by" with reasonably clear semantics (different from Signed-off-by; e.g. the reviewer is not a middle-man in patch forwarding; the reviewer might have had remaining reservations... very similar to but not entirely the same as "Acked-by" as currently defined in -mm) --- and also make the already somewhat established "Tested-by" more official, --- then the maintainers could start to make it a habit to add Reviewed-by and Tested-by. Plus, reviewers and testers could formally reply with Reviewed-by and Tested-by lines to patch postings and even could explicitly ask the I don't think that a maintainer (who signs off on patches after all) can easily afford to take the "bastard approach". I may be naive. -- Stefan Richter -=====-=-=== -==- =--=- http://arcgraph.de/sr/ -
Tested-by would be good too. Because over time, we will generate a list of people who own the relevant hardware and who are prepared to test changes. So if you make changes to random-driver.c you can do `git-log random-driver.c|grep Tested-by` to find people who can test your changes for you. Not that many people are likely to bother. The consequences of being slack are negligible, hence there is little incentive to do the extra work. -
Why not include a user-space tool that, when invoked, if you agree to send personal info, sends your hardware vs driver info to a web database + your email address (maybe even your .config, etc.) ... In case of help for testing new patches/finding a bug/etc. your email

You wouldn't even need to search in GIT.

Maybe even whenever a patchset is being proposed a mail could be sent to an appropriate hardware/or feature pseudo-auto-generated mailing list?

On lkml I mostly try to follow patches/bugs associated with hardware I use. Why not try to automate the process and get more testers in?

- vin
-
I think this is an excellent point. One data point could be a field in bugzilla to input the hardware information. Simple query can select common hardware and platform. So far it's not working when hardware is just mentioned in the text part. -
if it's free text it'll be useless for search ... I suppose we could do drop-downs for architecture at least? Not sure much beyond that would work ... *possibly* the common drivers, but I don't think we'd get enough coverage for it to be of use. M. -
How about several buckets for model/BIOS version/chipset etc., at least optional, and some will be relevant some not for particular cases. But at least people will make an attempt to collect such data from their system, boards, etc. --Natalie -
Mmm. the problem is that either they're:

1. free text, in which case they're useless, as everyone types mis-spelled random crud into them. However, free-text search through the comment fields might work out.

2. Drop downs, in which case someone has to manage the lists etc, they're horribly crowded with lots of options. trying to do that for model/BIOS version/chipset would be a nightmare.

If they're mandatory, they're a pain in the butt, and often irrelevant ... if they're optional, nobody will fill them in. Either way, they clutter the interface ;-(

Sorry to be a wet blanket, but I've seen those sort of things before, and they just don't seem to work, especially in the environment we're in with such a massive diversity of hardware.

If we can come up with some very clear, tightly constrained choices, that's a decent possibility. Nothing other than kernel architecture (i386 / x86_64 / ia64) or whatever springs to mind, but perhaps I'm being unimaginative.

Nothing complicated ever seems to work ... even the simple stuff is difficult ;-(

M.
-
I do agree. It _sounds_ like a great idea to try to control the flow of patches better, but in the end, it needs to also be easy and painfree to the people involved, and also make sure that any added workflow doesn't require even *more* load and expertise on the already often overworked maintainers.. In many cases, I think it tends to *sound* great to try to avoid regressions in the first place - but it also sounds like one of those "I wish the world didn't work the way it did" kind of things. A worthy goal, but not necessarily one that is compatible with reality. Linus -
Sure, simplicity is key - but most of the reporters on bugs are pretty professional folks (or very well rounded amateurs :)

We can still try, why not? The worst that can happen will be empty fields.

Maybe searching free text fields can then be implemented. Then every message exchange in bugzilla can be used for extracting such info - questions about HW specifics are asked a lot, in almost every one. It's a shame we can't use this information. I was once searching for "VIA" and got "zero bugs found", but in reality there are hundreds! Probably something that makes sense to bring up with the bugzilla project?

However, I've been working with other bugzillas (have to admit they were mostly company/corporate), where this was a required field that didn't seem to cause difficulties.

I am planning to do some more research and get some more ideas from other bugzillas. I suppose we can have them discussed and revised sometime.

--Natalie
-
mmm. added complexity and interface clutter for little or no benefit is what I'm trying to avoid - they did that in the IBM bugzilla and it turned into a big ugly unusable monster. You can call me either "experienced" or "bitter" depending what mood you're in ;-) Not sure I'd agree that most of the bug submitters are all that That should work now ... seems to for me. http://bugzilla.kernel.org/buglist.cgi?query_format=advanced&short_desc_type=allwo... Produces a metric-buttload of results. Go to the advanced search option and do "A Comment" contains the string "VIA". By default "Status" is Would be good, thanks. I tend to favour keeping things as simple as possible though, we have very little control over our users, and they're a very broad base. Making the barrier to entry for use as low as possible is the design we've been pursuing. M. -
Yes it works great! Thanks... I'd say this should be really a default search, because first search screen is misleading - it promises find Actually, as long as search above is possible - it is going to work. I must say the new bugzilla interface is very nice in general, and well designed and easy to use. --Natalie -
OK, or at the very least we can fix the text to indicate it'll only search summaries (and likely only of open bugs at that ...) M. -
[Dear Debbugs developers, i hope your ideas will be useful.]

* From: Linus Torvalds
* Newsgroups: gmane.linux.kernel

There were some ideas, i will try to summarize:

* New patches (or sets) need tracking, review, testing

  Zero (tracking) is done by sending (To, or bcc) the [RFC] patch to something like submit@pts.e-mail.example.com (like the BTS now). Notifications will be sent to interested maintainers (if meta-information is ok) or testers (see below).

  First is mostly done by maintainers or interested individuals. Result is "Acked-by" and "Cc" entries in the next patch sent. Due to lack of tracking these things are done manually, generally in a trusted manner. But bad things like <200706172053.41806.bzolnier@gmail.com> sometimes happen.

  When a patch is sent to this PTS, your lovely checkpatch/check-whatever-crap-has-being-sent tools can be set up as gatekeepers, thus making impossible stupid errors with MIME coding, line wrapping, whatever style you've come up with now in checking sent crap.

* Tracking results of review (Acked-by).

  This can be mostly e-mail exchange with comments and agreements. "Acked-by" semantic may be implemented in form of a control message to the tracking system, and this system will generate an e-mail confirmation to the patch author in form of "Acked-by: Developer's Name <message-id of e-mail with acked-by>". Thus, the next patch will have this entry. And if testing of this version or a regression happens, there's info about who is/was interested/involved.

* Testing.

  Mainly same for "Tested-by" (newly suggested by Stefan <4675C083.6080409@s5r6.in-berlin.de>)

|-*- Feature Needed -*-

  In addition, needed is the hardware the user/tester has/had/used. Currently the ``reportbug'' tool includes package specific and system specific additions automatically gathered and inserted into the e-mail sent to the BTS. (e.g. <http://permalink.gmane.org/gmane.linux.debian.devel.kernel/29740>) Formats of that hardware profile(as system information in ...
The goal is to get all patches for a maintained subsystem submitted to
The -mm kernel already implements what your proposed PTS would do.
Plus it gives testers more or less all patches currently pending
inclusion into Linus' tree in one kernel they can test.
The problem are more social problems like patches Andrew has never heard
The problem is that most problems don't occur on one well-defined
kind of hardware - patches often break in exactly the areas the patch
author expected no problems in.
And in many cases a patch for a device driver was written due to a bug
report - in such cases a tester with the hardware in question is already
"useless pet"?
Be serious.
How many open source projects use Bugzilla and how many use the Debian BTS?
And then start thinking about why the "useless pet" has so many more
users...
The Debian BTS requires you to either write emails with control messages
or generate control messages with external tools.
In Bugzilla the same works through a web interface.
I know both the Debian BTS and Bugzilla and although they are quite
different they both are reasonable tools for their purpose.
cu
Adrian
-
[Dropping noise for Debbugs, because interested people may join via Gmane]
Having all-in-one patchset, like -mm is easy thing to provide
interested parties with "you know what you have -- crazy development"
However [P]TS is tracking, recording, organizing tool. {1} Andrew's cron
daemon easily can run script to check status of particular patch (cc,
tested-by, acked-by). If patch have no TS ID, Andrew's watchdog is
barking back to patch originator (with polite asking to send patch as:
* TS as "To:" target
* patch author as "Cc:" target, this is useful to require:
. author can check that copy himself with text-only pager program
(to see any MIME coding crap)
. preventing SPAM
Crazy development{0}. Somebody knows, that comprehensively testing
hibernation is their thing. I don't care about it, i care about foo, bar.
Thus i can apply for example lguest patches and implement and test new
asm-offset replacement, *easily*. Somebody, as you know, likes new fancy
file system, and no-way other. Let them be happy testing that thing
*easily*. Because another fancy NO_MHz will hang their testing bench
Linus' watchdog, as well, asking for valid patch id, or just doesn't
care (in similar manner Linus does :).
So far no human is involved in social things. Do you agree?
Human power is worth and needed in particular patch discussion and
testing under the participation (by Cc, acking, test-ok *e-mails*) of
I tried to test that new fancy FS, and couldn't boot because of
yet-another ACPI crap. See theme{0}?
Overall testing, like Andrew does, is doubtless brave thing, but he have
Tracking all possible testers (for next driver update, for example) is
I might be stupid, but i faced it. On my amd64 512M laptop i *cannot* run
mozillka any more, for example! And i don't care, because Linus said his
web interface? If you did .........</dev/random dd bs=1 count=13.....
As you just might have seen here, i was talking about organizing,
tracking, hopefully saving and ...
That's for developers, not for users. There are different people involved in
- patch handling,
- bug handling (bugs are reported by end-users),
therefore don't forget that PTS and BTS have different requirements.
--
Stefan Richter
-=====-=-=== -==- =--==
http://arcgraph.de/sr/
-
Sure. But if tracking was done, possible bugs were killed, user's bug
report seems to depend on that patch (bisecting), why not have a
linkage here? Usefulness for a developer (in sub-system association),
next time to see what went wrong, check test-cases, users might be
interested to have them run too before crying (again) about broken
system. Bug report can become part of (reopened) patch discussion (as
i wrote). Until then, as bug-candidate without identified patch it
can be associated to some particular sub-system or abstract one
bug-category {1}.
Reversed time. As "do-bisection" shows, problems are not happening
just simply because of something abstract. If problem worth of solving
it, eventually there will be patch trying solve that, in both cases:
* when breaking patch (bisection) actually correct, but hardware
(or similar independent) problem arise.
* something different, like feature request or something.
So, these guys are candidates for a patch, and can have an ID numerically
from the same domain as the patch ID, but with a different prefix, like
"i'm just a candidate for a patch". Bugs {1} are obviously in this category.
Current identification of problems and patch association
have completely zero level of tracking or automation, while Bugzilla is
believed by somebody to have positive efficiency in bug tracking.
Those two (patch/bug tracking) aren't that perpendicular to each other at
all.
Eventually it might be that perfect unification, that bug-tracking can be
obsolete, because of good tracking of patches/features-added and what
they did/do.
In any case, i would like to ask mentors to write at least something
similar to technical task, if that, what i'm saying is accessible for
you. Because your experience is treasure, that must be preserved and
possibly automated/organized.
____
-
Of course there are certain links between bugs and patches, and thus there are certain links between bug tracking and patch tracking. I, as maintainer of a small subsystem, can personally track bug--patch relationships with bugzilla just fine, on its near-zero level of automation and integration.

Nevertheless, would a more integrated bug/patch tracking system help me improve quality of my output? ---

a) Would it save me more time than it costs me to fit into the system (time that can be invested in actual debugging)? This can only be answered after trying it.

b) Would it help me to spot mistakes in patches before I submit? No.

c) Would I get quicker feedback from testers? That depends on whether such a system attracts testers and helps testers to work efficiently. This is also something that can only be speculated about without trying it.

The potential testers that I deal with are mostly either very non-technical persons, or persons which are experienced in their hardware/application area but *not* in kernel internals and kernel development procedures.
--
Stefan Richter
-=====-=-=== -==- =--==
http://arcgraph.de/sr/
-
I'm not a wizard, if i will answer now: "No." [1:] If you ever tried to report bug with reportbug tool in Debian, you may understand what i meant. Nothing can substitute intelligence. Something They also don't bother subscribing to mailing lists and like to write blogs. I'm not sure about hw databases you talked about, i will talk about gathering information from testers. Debian have experimental and unstable branches, people willing to have new stuff are likely to have this, and not testing or stable. BTS just collects bugreports <http://bugs.debian.org/>. If kernel team uploads new kernel (release or even rc recently), interested people will use it after next upgrade. Bug reports get collected, but main answer will be, try reproduce on most recent kernel.org's one. Here, what i have proposed, may play role you expect. Mis-configuration/malfunctioning, programmer's error (Linus noted) in organized manner may easily join reporting person to kernel.org's testing. On driver or small sub-system level this may work. Again it's all about information, not intelligence. ____ -
Oleg Verych wrote: Seamonkey isn't interoperable with Debian's BTS? Lucky me that I frequently use other MUAs too. -- Stefan Richter -=====-=-=== -==- =--== http://arcgraph.de/sr/ -
Quite a big part of -mm are git trees of maintainers.
Where are they in your tool?
And I still don't think your tool would make sense.
But hey, simply try it - that's the only way for you to prove me wrong.
People said similar things about the 2.6.16 kernel or my regression
Patch dependencies and patch conflicts will be the interesting parts
when you will implement this.
E.g. new fancy filesystem patch in -mm might depend on some VFS change
that requires changes to all other filesystems.
I'm really looking forward to see how you will implement this for
something like -mm with > 1000 patches (many of them git trees that
themselves contain many different patches) without offloading all the
No.
Forcing people to use some tool (no matter whether it's Bugzilla or
For getting people to use your tool, you will have to convince them that
I doubt the placing of some Acked-by: tags in patches is really what
is killing Andrew's time.
How does Andrew check the status of 1500 patches in -mm in your PTS?
And how do you implement the use case that Andrew forwards a batch of
200 patches to Linus? How does the information from your tool come into git?
But hey, write your tool and convince Andrew of its advantages if you
Spamming people who have some hardware with information about patches
won't bring you anything. You need people willing to test patches that
won't bring them any benefit - and if you have such people they are
Did Linus state he would actually actively use a Debian BTS?
There's a difference between a discussion email and a control message in
How do people sell and buy goods at eBay?
eBay has a "do everything through the web interface plus notification
emails" quite similar to Bugzilla.
Or Wikis?
Or Blogs?
cu
Adrian
...That's right. But the production of subsystem test patchkits is volunteer work which will be hard to unify. I'm not saying it's impossible to reach some degree of organized production of test patchkits; after all we already have some standardization regarding patch submission which is volunteer work too. -- Stefan Richter -=====-=-=== -==- =--== http://arcgraph.de/sr/ -
But still there's no one opinion about against what tree to base the patch. For somebody it's Linus's mainline, for somebody it's bleeding edge -mm. And there will be no one. Thus, particular patch entry might have as -mm as Linus's re-based versions or (as Adrian noted) VFS.asof02-07-2007 FANCYFS. For example, Rusty did that, after somebody asked him to have not only -mm lguest version. So, for really intrusive feature/patch (and not in-middle-development, Adrian) author can have a version (with git branch, patch directory or something). Counter-example: Scheduler patches are extraordinary with large threads or replies, but that is (one of) classical release-early and often. Proposed bureaucracy doesn't apply ;) ____ -
Well, to be honest, I've actually over the years tried to have a policy of *never* really having black-and-white policies.

The fact is, some maintainers are excellent. All the relevant patches *already* effectively go through them.

But at the same time, other maintainers are less than active, and some areas aren't clearly maintained at all.

Also, being a maintainer often means that you are busy and spend a lot of time talking to *people* - it doesn't necessarily mean that you actually have the hardware and can test things, nor does it necessarily mean that you know every detail.

So I point out in Documentation/ManagementStyle (which is written very much tongue-in-cheek, but at the same time it's really *true*) that maintainership is often about recognizing people who just know *better*

Not really. The "problem" boils down to this:

	[torvalds@woody linux]$ git-rev-list --all --since=100.days.ago | wc -l
	7147
	[torvalds@woody linux]$ git-rev-list --no-merges --all --since=100.days.ago | wc -l
	6768

ie over the last hundred days, we have averaged over 70 changes per day, and even ignoring merges and only looking at "pure patches" we have more than an average of 65 patches per day. Every day. Day in and day out.

That translates to five hundred commits a week, two _thousand_ commits per month, and 25 thousand commits per year. As a fairly constant stream.

Will mistakes happen?

Hell *yes*.

And I'd argue that any flow that tries to "guarantee" that mistakes don't happen is broken. It's a sure-fire way to just frustrate people, simply because it assumes a level of perfection in maintainers and developers that isn't possible.

The accepted industry standard for bug counts is basically one bug per a thousand lines of code. And that's for released, *debugged* code.

Yes, we should aim higher. Obviously. Let's say that we aim for 0.1 bugs per KLOC, and that we actually aim for that not just in _released_ code, but in patches.

What does ...
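The per-day, per-week, per-month and per-year figures quoted in the message above can be re-derived from the two `git-rev-list` counts. The sketch below only redoes that arithmetic; the 30-day month and 365-day year are assumptions of this example, and the function name is hypothetical.

```python
def commit_rates(commits: int, days: int) -> dict:
    """Scale a commit count over `days` days to average rates."""
    per_day = commits / days
    return {
        "per_day": per_day,
        "per_week": per_day * 7,
        "per_month": per_day * 30,   # assumes a 30-day month
        "per_year": per_day * 365,   # assumes a 365-day year
    }


# Counts quoted in the message: all commits, and commits without merges,
# both over the last 100 days.
all_commits = commit_rates(7147, 100)
no_merges = commit_rates(6768, 100)

if __name__ == "__main__":
    for label, rates in (("all commits", all_commits),
                         ("without merges", no_merges)):
        print(label, {name: round(value, 1) for name, value in rates.items()})
```

The exact results (about 71.5 commits per day including merges and 67.7 without) are consistent with the rounded "over 70", "more than 65", "five hundred a week", "two thousand a month" and "25 thousand a year" in the text.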
Linus,

Nice quote. I'm trying to make a proposition/convince Adrian, who is in

I'm proposing a kind of smart tracking, summarized before. I'm not an idealist, doing manual work. Making tools -- is what i've picked up from

This one is the last, at least from me. Sorry for taking your time.
____
-
Don't get me wrong, I wasn't actually responding to you personally, I was actually responding mostly to the tone of this thread.

So I was responding to things like the example from Bartlomiej about missed opportunity for taking developer review into account (and btw, I think a little public shaming might not be a bad idea - I believe more in *social* rules than in *technical* rules), and I'm responding to some of the commentary by Adrian and others about "no regressions *ever*".

These are things we can *wish* for. But the fact that we might wish for them doesn't actually mean that they are really good ideas to aim for in practice.

Let me put it another way: a few weeks ago there was this big news story in the New York Times about how "forgetting" is a very essential part about remembering, and people passed this around as if it was a big revelation. People think that people with good memories have a "good thing".

And personally, I was like "Duh". Good memory is not about remembering everything. Good memory is about forgetting the irrelevant, so that the important stuff stands out and you *can* remember it. But the big deal is that yes, you have to forget stuff, and that means that you *will* miss details - but you'll hopefully miss the stuff you don't care for. The keyword being "hopefully". It works most of the time, but we all know we've sometimes been able to forget a detail that turned out to be crucial after all.

So the *stupid* response to that is "we should remember everything". It misses the point. Yes, we sometimes forget even important details, but it's *so* important to forget details, that the fact that our brains occasionally forget things we later ended up needing is still *much* preferable to trying to remember everything.

The same tends to be true of bug hunting, and regression tracking.

There's a lot of "noise" there. We'll never get perfect, and I'll argue that if we don't have a system that tries to actively ...
By reading only known persons[1]? Fine, it is OK. But i hope, i did useful statements. In fact, noise reduction stuff WRT bug reports was before in my analysis of Adrian's POV here (reportbug tool). Also it showed again, when i've wrote about traces, where testers (bug reporters) can find test cases, before they will cry (again) about some issues. I see this, example is bugzilla @ mozilla -- known history. [1] Noise filtering -- that's obvious for me, after all :) By not flaming further, i'm just going to try to implement something. Hopefully my next patch will be usefully smart tracked. Thanks! ____ -
This is the most crucial point so far in my opinion.

Well, not only are people who report bugs smart - they are curious, enthusiastic, and passionate about their system, and job, hobby - whatever linux means to them. They often do their own investigations, give lots of detail, and often others jump in with "me too" and give even more detail (and more noise). But the real detail that would help in bug assessment is not there, and needs to be requested in lengthy exchanges (time wise, since every request takes hours, days, months...)

I think it would help to make some attempt to lead them on to giving out what's important. Cold and impersonal upfront fields and drop-down menus are taking a lot of noise and heat off the actual report.

Another observation - things like "me too" should be encouraged to become separate reports because generally only the maintainer and people who work directly on the module can sort out if this is the same problem, and in fact real problems get lost and not accounted for when getting in the wrong buckets this way.
-
Even generating the perfect signal is a complete waste of time if
cu
Adrian
-
My argument is that *if* we had "more signal, less noise", we'd probably get more people looking at it. In fact, I guarantee that's the case. You may not be 100% happy with the regression list, but every single maintainer/developer I've talked to has said they appreciated it and it made it easier (and thus more likely) for them to actually look at what the outstanding issues were. Linus -
The problems are the parts of the kernel without maintainer or with a
maintainer who is for whatever reason not able to look after bug
reports.
And you often need someone with a good knowledge of a specific area of
the kernel for getting a bug fixed.
Let me make an example:
During 2.6.16-rc, I reported a bug (not a regression) in CIFS: during
big writes to a Samba server, my computer froze completely (no SysRq
possible) after some 100 MBs (not a fixed amount of data, but 100%
reproducible when transferring 1 GB). And there is
nothing more I (or any other submitter) could have given as information -
in fact it even took me several days to isolate CIFS as the source of
these freezes.
Steve French and Dave Kleikamp told me to try some mount option.
With this option, I got an Oops instead of a freeze.
After they fixed the Oops, it turned out the patch also fixed the
freeze. The patch went into 2.6.16, and it was therefore fixed
in 2.6.16.
That's one important value of maintainers.
In many other parts of the kernel, my bug report wouldn't have had any
effect.
We need more maintainers who look after bugs - but where to find them,
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
Exactly: We cannot get a regression free or even bug free kernel.
But we could handle the reported regressions (or even the reported bugs)
better than we do.
Lesson #6:
Get the data.
Some real life numbers from 2.6.21 development:
- 80 days between 2.6.20 and 2.6.21
- 98 post-2.6.20 regressions were reported before 2.6.21 was released
- 15 open post-2.6.20 regression reports at the time of the 2.6.21 release
- 8 open post-2.6.20 regression reports at the time of the 2.6.21 release
that were reported at least 3 weeks before the 2.6.21 release
This:
- only includes regressions with reasonably usable reports [1] and
- confirmed to be regressions and
- reported by the relatively small number (compared to the complete
number of Linux users) of -rc testers and
- reported before the release of 2.6.21.
We weren't even able to handle all reported recent regressions in
2.6.21, and for other bugs our numbers won't be better.
When Dave Jones says that for a kernel for a new RHEL release that is
based on a "stable" upstream kernel they spend 3 months only for shaking
out bugs in the kernel that's IMHO a good description of our "stable"
kernels.
I'm not claiming the kernel could become bug-free, but aiming at being
able to handle all incoming bug reports is IMHO a worthwhile and not
completely unrealistic goal with benefits for all Linux users (and the
overall image of Linux).
cu
Adrian
[1] submitter has given all information requested
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
The BTS, while fairly good at tracking issues for distributions made up of thousands of packages (like Debian), is rather suboptimal for dealing with the workflow of a single (relatively) monolithic entity like the linux kernel. Since the ultimate goal is presumably to apply a patch to a git tree, some sort of system which is built directly on top of git (or intimately intertwined) is probably required. Some of the metrics that the BTS uses, like the easy ability to use mail to control bugs may be useful to incorporate, but I'd be rather surprised if it could be made to work with the kernel developer's workflow as it exists now. It may be useful for whoever ends up designing the patch system to take a glimpse at how it's done in debbugs, but since I don't know how the workflow works now, and how people want to have it work in the end, I can't tell you what features from debbugs would be useful to use. Finally, at the end of the day, my own time and effort (and the primary direction of debbugs development) is aimed at supporting the primary user of debbugs, the Debian project. People who understand (or want to understand) the linux kernel team's workflow are the ones who are going to need to do the heavy lifting here. Don Armstrong -- N: Why should I believe that?" B: Because it's a fact." N: Fact?" B: F, A, C, T... fact" N: So you're saying that I should believe it because it's true. That's your argument? B: It IS true. -- "Ploy" http://www.mediacampaign.org/multimedia/Ploy.MPG http://www.donarmstrong.com http://rzlab.ucr.edu -
How about an easy way to send multiple hardware profiles to your bugzilla user account, simultaneously linked to an online pciutils database and/or a hardware list database similar to overclocking web sites, and why not even with a link to the git repository when possible? Some sort of really useful "send your profile" of RHN that would link the driver with the discovered hardware and add you to the appropriate mailing lists to test patches, help reproduce and solve problems, etc. In the end, plenty of statistics and hardware compatibility lists could be made. For example, it would make my life easier to know what level of compatibility Linux can offer for the old HP9000 K-boxes that we still have running at the office, and presumably to find people to contact for help. - vin -
This is definitely something that can be done (and should be) - well, especially having the ability to search by certain criteria - then all sorts of statistics and databases can be created. Everything that helps to find a way to work on a patch and to test more easily should be done to make bug fixing easier and even possible. Oftentimes the most knowledgeable people are not able to make a quick fix just because there is no way to reproduce the case or get access to the HW. --Natalie -
Hardware Compatibility Lists/ Databases already exist, for driver
subsystems, for distributions...
Some issues with those databases are:
- Users typically can only test one specific combination of a
hardware collection and software collection, at one or a few points
in time.
- Users have difficulties or don't have the means to identify chip
revisions, used protocols etc.
- The databases are typically not conceived to serve additional
purposes like bidirectional contact between developer and user.
These issues notwithstanding, these databases are already highly useful
As has been mentioned elsewhere in the thread,
- bug---hardware associations are sometimes difficult or impossible
to make. For example, the x86-64 platform maintainers are bothered
with "x86-64 bugs" which turn out to be driver bugs on all
platforms.
(We want detailed descriptions of the hardware environment in a bug
report, but this means we must be able to handle the flood of
false positives in bug---hardware associations, i.e. successively
narrow down which parts of the hardware/software combo are actually
affected, and what other combinations could be affected too.)
- Patch---hardware associations, especially for preemptive regression
tests, are virtually impossible to make. Murphy says that the
regression will hit hardware which the patch submitter or forwarder
thought could never be affected by the patch.
Of course, /sensible/ patch---hardware associations are (1) to try out
fixes for known issues with a specific hardware, (2) to test that a
cleanup patch or refactoring patch or API changing patch to a driver of
very specific hardware ( = a single type or few types with little
variance) does not introduce regressions for this hardware.
--
Stefan Richter
-=====-=-=== -==- =--==
http://arcgraph.de/sr/
-
Well, I'm not doing it myself but I find it tempting... ;) For a maintainer, the "bastard approach" is more about not discouraging developers by holding patches for too long than about getting credit. Bart -
The maintainer who is about to suffocate in newly contributed code is actually a lucky guy: He can ask his eager contributors to also help with cross-reviewing and bug fixing, otherwise all the fine work will be stuck in the clogged pipeline. (E.g. post a subsystem todo-list now and then, as a subtle hint.) -- Stefan Richter -=====-=-=== -==- =--=- http://arcgraph.de/sr/ -
I think that this is a very good idea - especially for large, intrusive patches. Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ -
...
Perhaps make lists of
- bug reports which never lead to any debug activity
(no responsible person/team was found, or a seemingly responsible
person/team did not start to debug the report)
- known regressions on release,
- regressions that became known after release,
- subsystems with notable backlogs of old bugs,
- other categories?
Select typical cases from each category, analyze what went wrong in
these cases, and try to identify practicable countermeasures.
Another approach: Figure out areas where quality is exemplary and try
to draw conclusions for areas where quality is lacking.
--
Stefan Richter
-=====-=-=== -==- =---=
http://arcgraph.de/sr/
-
No maintainer or no maintainer who is debugging bug reports is the
ieee1394 has a maintainer who is looking after all bug reports he gets.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
I noticed some areas are well maintained because there is an awesome maintainer, or a good and well coordinated team - and this is mostly in the "fun" areas ;) But there are "boring" areas that are about to be deprecated or where no new development is expected etc. It will be hard to get a dedicated person to take care of such. How about having people on rotation, or jury duty so to speak - for a period of time (completely voluntary!) Nice stats in the report about contributions in non-native areas would be a great accomplishment for a developer and also a good chance to look into other things! Besides, this way "old parts" will get attention and be revised and re-implemented sooner. And we can post a "Temp maintainer needed" list... -
I'd vote for that, I've already seen a lot of very bad code within some
subsystems and critical problems which have just been ignored by some
maintainers.
It mostly helps if some volunteers read through existing code and
state their concerns about implementations which they don't
like.
I just grep'ed some examples I noticed (note I do not want to step
on anyone's toes here, just give some examples):
(sn9c102_ov7660.c)
...
err += sn9c102_i2c_write(cam, 0x12, 0x80);
err += sn9c102_i2c_write(cam, 0x11, 0x09);
err += sn9c102_i2c_write(cam, 0x00, 0x0A);
err += sn9c102_i2c_write(cam, 0x01, 0x80);
err += sn9c102_i2c_write(cam, 0x02, 0x80);
err += sn9c102_i2c_write(cam, 0x03, 0x00);
... (around 150 lines directly after each other doing such writes and
adding error values to a variable. I don't understand why someone
should add up the errors but continue with sending 150 more updates;
what if one write failed but others succeeded for any reason?)
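One way to address the concern above is to drive the register writes from a table and stop at the first failure, so the caller learns which write failed instead of getting a summed error value. The following is only a minimal userspace sketch in plain C, not code from the sn9c102 driver: struct reg_val, write_reg_table(), struct fake_bus and fake_write() are hypothetical names, the fake bus merely stands in for a real write routine such as sn9c102_i2c_write(), and -5 is used as a stand-in for -EIO.

```c
#include <stddef.h>

/* Hypothetical register/value pair; not a structure from the driver. */
struct reg_val {
    unsigned char reg;
    unsigned char val;
};

/* Hypothetical write callback: 0 on success, negative on failure. */
typedef int (*reg_write_fn)(void *ctx, unsigned char reg, unsigned char val);

/*
 * Write each table entry in order and stop at the first failure.
 * Returns 0 if every write succeeded, otherwise the error of the
 * failed write. If failed_idx is non-NULL, it receives the index of
 * the entry that failed.
 */
static int write_reg_table(void *ctx, reg_write_fn write,
                           const struct reg_val *tbl, size_t n,
                           size_t *failed_idx)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int err = write(ctx, tbl[i].reg, tbl[i].val);

        if (err) {
            if (failed_idx)
                *failed_idx = i;
            return err;
        }
    }
    return 0;
}

/* Fake bus for demonstration: the write with index fail_at fails. */
struct fake_bus {
    int writes;  /* number of writes attempted so far */
    int fail_at; /* index of the write that fails, -1 for never */
};

static int fake_write(void *ctx, unsigned char reg, unsigned char val)
{
    struct fake_bus *bus = ctx;

    (void)reg;
    (void)val;
    if (bus->writes++ == bus->fail_at)
        return -5; /* stand-in for -EIO */
    return 0;
}
```

With a table holding the first few pairs from the quoted snippet and a fake bus that fails on the third write, write_reg_table() returns the error immediately and no further writes are attempted. Whether aborting is right for a given sensor is a separate question; some hardware may need a reset sequence after a partial init.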
(tvp5150.c)
static int tvp5150_read(struct i2c_client *c, unsigned char addr)
{
        unsigned char buffer[1];
        int rc;
        buffer[0] = addr;
        if (1 != (rc = i2c_master_send(c, buffer, 1)))
                tvp5150_dbg(0, "i2c i/o error: rc == %d (should be 1)\n", rc);
        msleep(10);
        if (1 != (rc = i2c_master_recv(c, buffer, 1)))
                tvp5150_dbg(0, "i2c i/o error: rc == %d (should be 1)\n", rc);
        tvp5150_dbg(2, "tvp5150: read 0x%02x = 0x%02x\n", addr, buffer[0]);
        return (buffer[0]);
}
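One possible reading of the problem with tvp5150_read() as quoted is that a failed transfer is only logged, and the function still returns whatever happens to be in buffer[0]. Below is a hedged sketch of an alternative, assuming callers can treat a negative return value as an error. It is plain userspace C, not a patch for the driver: struct i2c_ops, reg_read(), struct fake_dev, fake_send() and fake_recv() are hypothetical, modelled only on the convention visible in the quoted code that a successful one-byte transfer returns 1. Again -5 stands in for -EIO.

```c
/* Hypothetical transfer callbacks: return bytes transferred or < 0. */
struct i2c_ops {
    int (*send)(void *ctx, const unsigned char *buf, int len);
    int (*recv)(void *ctx, unsigned char *buf, int len);
};

/*
 * Read one register. Returns the value (0..255) on success or a
 * negative error code if either transfer did not move exactly 1 byte.
 */
static int reg_read(const struct i2c_ops *ops, void *ctx, unsigned char addr)
{
    unsigned char buffer[1];
    int rc;

    buffer[0] = addr;
    rc = ops->send(ctx, buffer, 1);
    if (rc != 1)
        return rc < 0 ? rc : -5; /* short transfer: stand-in for -EIO */

    rc = ops->recv(ctx, buffer, 1);
    if (rc != 1)
        return rc < 0 ? rc : -5;

    return buffer[0];
}

/* Fake device for demonstration: 256 registers, optional read failure. */
struct fake_dev {
    unsigned char regs[256];
    unsigned char selected; /* register address set by the last send */
    int fail_recv;          /* non-zero makes every receive fail */
};

static int fake_send(void *ctx, const unsigned char *buf, int len)
{
    struct fake_dev *dev = ctx;

    if (len != 1)
        return -5;
    dev->selected = buf[0];
    return 1;
}

static int fake_recv(void *ctx, unsigned char *buf, int len)
{
    struct fake_dev *dev = ctx;

    if (dev->fail_recv || len != 1)
        return -5;
    buf[0] = dev->regs[dev->selected];
    return 1;
}
```

The trade-off is that every caller then has to check for a negative result, which is probably why such conversions tend to be done driver by driver rather than in one sweep.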
(i2c issues within some driver)
/* This code detects calls by card attach_inform */
if (NULL == t->i2c.dev.driver) {
        tuner_dbg ("tuner 0x%02x: called during i2c_client "
                   "register by adapter's attach_inform\n", c->addr);
        return;
}
... that code doesn't even work anymore since the i2c.dev.driver is
always initialized.
just reading through it and cleaning up some code ...Doing cleanups is a good way to get into the matter, to become able to Everybody is allowed to submit. But there is a certain degree of both persistence and adaptability required to get one's first submissions upstream. However, these qualities are also required to fix difficult bugs. -- Stefan Richter -=====-=-=== -==- =-==- http://arcgraph.de/sr/ -
Deja-kernel. Just two messages: <http://permalink.gmane.org/gmane.linux.debian.devel.general/116453> <http://permalink.gmane.org/gmane.linux.debian.devel.general/116463> Tell me I'm wrong if a similar thing cannot be implemented here. Again, the key word is _tracking_ system... I am just trying to attract attention to the point that the time of ignorance and manual work must end. There must be a new time, a time of *tracking*, of *counting* opinions and any kernel work anybody wants to contribute. I just got bored of the repeated talk about work, code, etc. that is no fun. The manager who will do that no-fun work is an automated tracking system. Based on e-mail, with additional tools, like * ``reportbug''-- reporting (improved REPORTING-BUGS, EVERY-WORK-IS-APPRECIATED-THANK-YOU) * ``bts''-- command line interface, etc. I want to change it, and I will try to work on that. The important thing is -- being in the corner *alone*, even with a good open source example system such as the Debian BTS, is not going to work. WRT this, the opinions and doings of people in this thread, who have spent much more time in Linux development than I have, are just counterproductive (fine, fine, but I have a right to have a different, wrong opinion on that :). -- Frenzy -o--=O`C #oo'L O <___=E M -
...but doesn't fix them all, and is usually slow with fixes. He should spend less time conversing on LKML. :-) -- Stefan Richter -=====-=-=== -==- =---= http://arcgraph.de/sr/ -
It is unworkable in wiki. There is a new regression field in bugzilla, but it is only the first Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ -
Just one comment. We don't try to recruit new skilled testers - it's a big problem. Skilled tester can narrow down the problem, try to fix it etc. There are too many "something between 2.6.10 and 2.6.21 broke my laptop" reports... Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ -
The measurement of "evil" is subjective. That's why there are releases with known regressions. -- Stefan Richter -=====-=-=== -==- =---= http://arcgraph.de/sr/ -
Hi Stefan, On 16/06/07, Stefan Richter <stefanr@s5r6.in-berlin.de> wrote: Rafael is working on translation of "Linux Kernel Tester's Guide" (it's almost finished). I hope you will get more -rc testers. Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ -
I know you hate bugzilla ... but at least I can try to make that bit of the process work better. The new version just rolled out does have a simple "regression" checkbox (and you can search on it), which will hopefully help people keep track of the ones already in bugzilla more easily. Thanks to Jon T, Dave J et al. for helping to figure out methods and implement them. M. -
Yes, good work, thanks a lot for it! The new interface is much better and more useful. Greetings, Rafael PS BTW, would that be possible to create the "Hibernation/Suspend" subcategory of "Power Management" that I asked for some time ago, please? :-) -- "Premature optimization is the root of all evil." - Donald Knuth -
While I'm not reading this entire thread for lack of time:
This looks exactly like the kind of bug tracking that Timo Sirainen of
Dovecot fame concocted and talked about on the dovecot-users list,
quote:
| Dovecot BTS
| -----------
|
|The preferred way to report bugs is to send them to dovecot-bugs at
|dovecot.org. The only thing it does is prefix the subject line with [BUG
|#nnn] and forward it to dovecot at dovecot.org.
|
|Now everyone can reply to it just as it was a normal mailing list mail.
|As long the subject contains the "[BUG #nnn]" prefix, it's part of the
|bug.
|
|Existing mailing list threads can also be turned into bugs by replying
|to the thread's root message with To: dovecot-bugs at dovecot.org. This
|again causes the new reply to contain [BUG #nnn] prefix.
|
|Then comes the web part. [...]"
-- quoting Timo Sirainen
<http://www.dovecot.org/list/dovecot/2007-January/018786.html>
Perhaps it's a stripped-down sibling of the Debian BTS without itself
knowing :-) anyways, it's E-Mail centric, low ceremony, devised from the
currently implemented workflow.
I haven't followed its details, portability, shape, state or anything,
but the requirements appear very similar to Linus's -- at least to me
with entirely outside view (I've never used kernel bugzilla), so it
might be a starting point, conceptually or with some luck even
implementation-wise.
(Yes I know it's going to be tough to obtain all the precioussssssss
bugssssssssssss from bugzzzzzzzzzzzzzzzilla but anyways, if nobody likes
to use it, something will be done and if only neglect...)
HTH
--
Matthias Andree
-
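The scheme described in the quote hinges on recognising the "[BUG #nnn]" prefix in a subject line. As an illustration only (this is not Dovecot's implementation, and bug_id_from_subject() is a hypothetical helper), extracting the bug number from a subject could be sketched in C like this:

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the bug number from a subject containing "[BUG #nnn]",
 * or -1 if no well-formed tag is present. Only the first
 * occurrence of "[BUG #" is examined.
 */
static long bug_id_from_subject(const char *subject)
{
    const char *tag = strstr(subject, "[BUG #");
    long id;
    char close;

    if (!tag)
        return -1;
    /* Expect a number directly followed by the closing bracket. */
    if (sscanf(tag, "[BUG #%ld%c", &id, &close) != 2)
        return -1;
    if (close != ']' || id < 0)
        return -1;
    return id;
}
```

A mail filter built on something like this could file every reply under its bug number regardless of any "Re:" prefixes, which is what makes the approach low ceremony: the mailing list thread itself carries the tracking key.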
It has to be able to suck in reports from people who don't know much
about how the Linux guys handle bugs, and has to keep the reporters
involved up until a bug can be closed. IOW it has to be compatible with
integrators, developers, and reporters/testers. Luckily though, not all
of them need all of that system's features. ('system' == the right
people with the right tools)
--
Stefan Richter
-=====-=-=== -=-- ===-=
http://arcgraph.de/sr/
-
[...] http://bugzilla.kernel.org/faq.cgi says, although it doesn't make a lot of sense: "Q. If a bug has an owner does that mean they are working on it? A. No. If it is not in the ASSIGNED state then no one is working on it. The owner defaults to the subsytem maintainer. However, anyone who wants to submit a patch or add more info to a bug can do so. If the bug is reassigned to someone then the owner field will reflect that change." So the "owner" field is bogus per default. It would be better if the bugzilla admins used only meta-addresses instead of a person's address for any automatically filled-in "owner" field, unless a person specifically wants to assume this automatic owner role. I for example am not automatic owner of IEEE 1394 bugs; drivers_ieee1394@kernel-bugs.osdl.org is. And I am watching this pseudo owner. So in fact, the "owner" field should be replaced by - a mail exploder for each component which can be watched by interested people, - an "assignee" field which is filled in when a bug is assigned to a person. Now that I am at it, another quote from http://bugzilla.kernel.org/faq.cgi : "Q. What does a subsystem maintainer do? A. He or She will track new bugs and assign them to people or reject it for various reasons. They periodically check to make sure things are getting worked on and review fixes to make sure they are well written." A maintainer in the project called linux kernel will almost never assign bugs to people (besides to himself). He could if he employed or otherwise supervised people to assign bugs to. This especially applies to so-called "subsystem maintainers in kernel tracker", which are not what many people think "subsystem maintainers" are: "Q. Why are the subsystem maintainers in kernel tracker sometimes different than the person listed in the MAINTAINER file? A. The subsystem maintainers in kernel tracker are volunteers to help track bugs in an area they are interested in. ...
... and it gets better: I've been trying to fix/improve stuff I can, with
my limited abilities as coder for some time now in the linux kernel, but there
are simply cases where, although you're trying to even come up with a proper fix
and adhere to all the kernel coding standards and _even_ produce a patch which
can serve as rough cut for a probable fix, your mail simply disappears into the
void unanswered. Then you send again, and again, yet it remains unreplied and then you
give up and turn to something else. Some of those patches, for example, were the
rework of the debugging scheme of libata i did more than a year ago which got
reviewed and then .. forgotten, some kernel janitor cleanups, etc...
--
Regards/Gruß,
Boris.
-
No, it just shows that bugzilla doesn't matter for most of the kernel. Don't say that "bugzilla tells how bad we are at handling bugs". It tells how bad *bugzilla* is for handling bugs, nothing more. Trying to play politics by pointing to bugzilla is pointless. Bugzilla is used for a few subsystems (ACPI seems to use it actively, for example), but I doubt most developers use it. Would it be good to have a better bug-tracking setup? Yes. But I think it takes man-power, and it would take something *fundamentally* better than bugzilla. Maybe the new "http://kernelnewbies.org/known_regressions" thing will evolve to something worth tracking. Right now, bugzilla isn't it (although it can be a useful tracking place for individual bugs, *once* you've found and gotten the right developer involved - but that's a huge step that bugzilla generally does *not* do for us). Linus -
Bugzilla has an email interface.
Andrew forwards bugs from Bugzilla to developers.
There might be small room for improvements, but I don't see how
"*once* you've found and gotten the right developer involved" is the
real problem, not how to track bugs.
And not only a developer active in this area, more important a
developer who knows the subsystem/driver involved *and is willing
to work on bug reports*.
*This* is *the* problem.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
From: Adrian Bunk <bunk@stusta.de> Therefore, bugzilla only works at all when Andrew forwards things around by-hand. -
That's not entirely true. There are people watching the bugs which might be relevant for them on their own. It does not make bugzilla better though. The user interface sucks and getting things correlated is simply not possible. tglx -
Yes. And if you search around, you'll even see that I occasionally use it. But it's useful only once the bug has been assigned and somebody has actually *looked* at it. The fact that some people do this is a big credit I'm talking about getting the developers to _look_ at the bugs in the first place. Bugzilla is not very good at that, because it has no useful interfaces for doing so (unless you can specify your area of interest so exactly that you can actually set yourself up as a maintainer of one particular area). Almost none of the subtle (and thus harder) bugs tend to fall into that kind of nice category. For example, look at suspend/resume bugs. Do you realize that 99% of them are device driver issues, but how the heck do you connect a "my laptop doesn't resume" to the _right_ device driver maintainer? And do you realize (and acknowledge) that it would be total madness to send all suspend/resume bugs to _everybody_ who maintains any device driver at all? See? THAT is the problem with bugzilla. It only works for the "easy" cases. It works for the case where a reporter can say with certainty that the reason his machine doesn't boot is a particular network device driver (like the sis900 regression we had in 2.6.21). But once you know the subarea that precisely, bugzilla doesn't even help you that much, and it's likely easier and more productive to just send email directly to the right And I agree 100%. So why are you pushing bugzilla? There are actually better bug trackers around. One of them is "google". For oopses, one of the things I do is to put in the most relevant information (backtrace etc) into google, and ask google to try to find the pattern. That sometimes actually does pretty well - you can get a real feel for "oh, there's a pattern here - they're all AMD machines with the NVidia chipset" kind of thing. Bugzilla doesn't offer anything even remotely as useful. It's the "big picture" that tends to be hard to ...
I totally disagree here, bugzilla is a very good tool. If someone is too lazy to look at it it's his problem. Kernel Janitors can pick out some bugs which aren't addressed by anyone or got left behind. I also found some bugs there which could have been solved by anyone here, the matter is just that many people aren't interested in mainly bug fixing and many also work on different other topics here. How else should bugs get handled, sending them to the lkml? I'm 100% sure some bugreports will also get lost then, but on the lkml they'll very likely remain lost whereas in the bugzilla they'll remain as for the em28xx I actively use it, but I also set up a mailinglist what are your suggestions to improve a bugreporting tool, I'm very sure that many people, especially people who want to get into existing I'd say this is a personal opinion, some people will get along with it and some of them will not... Markus -
From: "Markus Rechberger" <mrechberger@gmail.com> No, Bugzilla really does suck, and I personally refuse to use it when I have a choice. And guess what? You better be concerned about that because I maintain all of the networking code :-) It puts the onus FAR too much on the developer and not enough on the reporter and other minions. We have a small resource of developers, yet lots of users, bug reporters, and minions, so something that doesn't take advantage of the larger resource we have is going to not function efficiently at all. Yet that is what bugzilla does. It's made way too much work for me every time I come in contact with it, it wastes my time instead of making good use of it. As a developer I do not want to get pounded with emails containing state changes and other bullshit that typically comes with being assigned to or on the CC of a bugzilla entry. I don't want to be reminded that a bug hasn't been touched in weeks, if the reporter doesn't care I don't care and I'll work on things that people do care about. It makes me delete all the bugzilla email, even the ones with important information in them, because it's ridiculous to have to sift through all of that crap. People only use bugzilla because it is well understood and nothing better has reached critical mass yet. -
I'll say that as a user I hate having to deal with bugzilla. there's nothing more frustrating than spending a good chunk of time trying to find a similar bug, then jumping through all the bugzilla hoops to file a report to eventually (days/weeks later) get a message 'closed because it's a duplicate report', then have to go and track down what it's a duplicate of, read through that bug report, only to find that it's not solved there either, and to top it off, the people working on that bug won't see my report or that I'm available to troubleshoot it. from a user point of view, e-mailing the kernel list (retrying a few days later if there is no response) tends to work _much_ better. David Lang -
Ideally, joining duplicate reports should be a low-cost, lossless operation. That said, when bug B is marked as duplicate of bug A, people at bug A at least get a link to bug B, don't they? If they are too lazy to read the report B, they obviously are not very interested in A either. Tough luck. Vice versa, people at bug B get notified that the matter is now continued at bug A and can add their Cc there. Of course that addition is one of the very few things that could probably be automated. Joining duplicate reports at a mailinglist involves responding to multiple threads and sending links into web archives of the list, which happens to be redundant to and disparate from your local e-mail storage. What I from a maintainer's POV agree with is that a report to the appropriate mailinglist is often easier to triage than a report at bugzilla, because the reporter often needs initial help to properly define the problem. Bugzilla becomes useful after a report reached a minimum level of quality (after minimum initial triage) and if the bug can be clearly associated with a maintained subsystem of the kernel (as e.g. Linus already pointed out in this thread). -- Stefan Richter -=====-=-=== -=-- ===-= http://arcgraph.de/sr/ -
PS: Of course what _does_ work better on mailinglists than on bugzilla is to recognize duplicates as such in the first place, when the symptoms seem only loosely related. (I.e. seeing the big picture and recognize patterns.) -- Stefan Richter -=====-=-=== -=-- ===-= http://arcgraph.de/sr/ -
I'm glad we finally found _the_ person using it! More seriously, it's such a complicated interface! It's hard to bring more people into a discussion, it's hard to comment on code or suggested patches, etc... Mail is by far better suited to the job! See how many times people do public patch reviews here. You get one comment every 5-10 lines. I have yet to see how this could be done in bugzilla. And maintainers have to _think_ about going there. Mail comes in without deliberate action. This is especially important because only your eye is involved in noticing bugs affecting areas where you can help. What _may_ be useful would be to send digests or batches of recently opened bugs to LKML. But not all of them. Maybe doing this once a week in the same way we post patch lists for review. We could have a batch of "[BUG 1/23] quick subject". At least more people will notice them and will be able to comment on them. And given the number of people reading LKML, the bugs should only be posted once. Because if a bug posted to LKML in a noticeable manner does not get assigned in a week, it will never be. Personally, I got used to reviewing Greg & Chris' announcements of stable patches to find out if some of them could affect 2.4. It's far easier to ask for details in a mail you weren't expecting than it is to go somewhere to search for something you don't know exists. Willy -
To continue on the sarcastic tangent: This flaw of bugzilla is irrelevant for subsystems where there are less than three or two persons who steadily hunt bugs anyway. At the field I work on, I wouldn't have anybody else to bring in in the first place, except that I sometimes suggest to reporters to subscribe to a bug ticket. -- Stefan Richter -=====-=-=== -=-- ===-= http://arcgraph.de/sr/ -
If you think so, try reading my email and responding constructively on how the issues there can be resolved. That email contains good examples where bugzilla fails, and bugs end up sitting around for ages untouched. And no, it's not because I'm "lazy". -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
You must be doing things very differently from a lot of other people if IF that happened, it would actually be great. That's what I'm arguing for. Actually, looking at Adrian's regression lists, yes. lkml worked better What's the difference between bugzilla and lkml.org? Both have search I don't know what the perfect setup is, but I do know that bugzilla is very close to be totally useless for the top-level maintainers. Try to think like a person who doesn't maintain *one* specific file in the kernel, but who can actually make a good judgement about a lot of things, or at least funnel a problem report to the right person? And now, imagine that that person is also fairly busy (exactly *because* he's not looking at a single file, he may be maintaining a huge subsystem that has multiple submaintainers etc). I think bugzilla really only works for very "directed" issues. If you already know exactly which driver is affected (which is often wrong anyway: some of the bugs that were due to timer breakage got blamed as disk hangs!) it's almost totally useless. And yes, maybe that's why you have a much higher opinion of bugzilla than I do. To _me_ bugzilla is a total mess. There's absolutely _zero_ useful information there. And I'm pretty certain that is true of a *lot* of other people too. But if you have a small project, or you maintain a very specific (and clearly delineated) part of a big project, bugzilla probably looks a lot more palatable. Linus -
Mailing lists don't track bugs.
The _only_ reason why I originally started regression lists was because
several kernel developers were spreading the fairy tale "no one tests -rc
kernels".
This was only possible because so many bug reports to linux-kernel never
get any reply, and are therefore lost.
After I started the regression lists, it suddenly turned out how many
people test -rc kernels, and that even with regression lists at least
weekly, it still often took weeks until some developer even started
debugging a regression.
So what's the difference between bugzilla and lkml.org?
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
On Sun, Apr 29, 2007 at 02:05:42AM +0200, Adrian Bunk wrote:
> On Sat, Apr 28, 2007 at 04:40:31PM -0700, Linus Torvalds wrote:
> >...
> > What's the difference between bugzilla and lkml.org? Both have search
> > buttons. Both archive the old stuff. Both can be pointed to.
> Mailing lists don't track bugs.
There's no reason they can't. Store them in folders 'fixed' 'pending' 'notabug' etc. Move mails between them when states change. reply-to them when necessary. Bounce them. Add people to Ccs. etc. etc.
The only remaining piece of the puzzle is "how does everyone see the states of the various pools", which could be solved in a number of ways, daily uploads to a place on kernel org for eg. Hell, you could store them in a git tree if it came to it. That would also solve the problem for those with an aversion to web browsers to see bugs :-)
Dave
-- http://www.codemonkey.org.uk -
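As a rough illustration of the folder scheme Dave describes, here is a minimal Python sketch built on the standard library's mailbox.Maildir. The folder names 'pending', 'fixed' and 'notabug' come from the mail above; the helper names (open_tracker, file_report, move_report, summary) and the sample report are hypothetical, and nothing here reflects a tool anyone on the list actually ran.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage

STATES = ("pending", "fixed", "notabug")  # folder names taken from the mail


def open_tracker(root):
    """Open (or create) a Maildir with one subfolder per bug state."""
    box = mailbox.Maildir(root, create=True)
    existing = set(box.list_folders())
    for state in STATES:
        if state not in existing:
            box.add_folder(state)
    return box


def file_report(box, msg, state="pending"):
    """Store a report in the folder for `state`; returns its Maildir key."""
    return box.get_folder(state).add(msg)


def move_report(box, key, old_state, new_state):
    """Move one report between state folders when its status changes."""
    src = box.get_folder(old_state)
    dst = box.get_folder(new_state)
    new_key = dst.add(src[key])
    src.remove(key)
    return new_key


def summary(box):
    """Count reports per state: the 'see the state of the pools' view."""
    return {state: len(box.get_folder(state)) for state in STATES}


# Usage: file one (made-up) report, then mark it fixed.
root = os.path.join(tempfile.mkdtemp(), "bugs")  # fresh path, created by Maildir
box = open_tracker(root)
report = EmailMessage()
report["Subject"] = "boot failure: exception in interrupt routine"
report.set_content("Works with 2.6.20, fails with 2.6.21-rc.")
key = file_report(box, report)
before = summary(box)
move_report(box, key, "pending", "fixed")
after = summary(box)
print(before, after)
```

The output of summary() is the piece that would be uploaded somewhere public in the scheme above; because a Maildir is just a directory tree, it could equally be committed to a git tree.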
and archive off the old reports into folders as well. While I don't think that closing a report because it hasn't seen an update in a week is reasonable, closing a report that hasn't seen an update or retest in a couple of 2.6.x releases definitely is (this is a 4-6 month period; at the rate of change in the kernel the odds are pretty good that the code is no longer the same at that
if people are interested in doing this it's not a technically hard problem.
get bugs to a 'bug maintainer' e-mail address. this can either be a copy of the full lkml firehose, or volunteers who pull selected messages from the list. with a server that discards duplicates it's safe to have multiple people bounce the same message to the bug list; if they forward the message (to add a comment on who they think may need to work on it) then it will take manual weeding out of duplicates.
have an IMAP server available for this address. make it read-only to the world and read-write to a list of volunteers who will sort the messages.
sort the messages into the different categories, with a subfolder for each bug (note: the structure at this point is arbitrary, the volunteers can further organise the folders).
with a good IMAP server you can add the address of the particular bug folder to the cc list if you want (bugs+drivers.ethernet.3com.123@kernel.org for a complex example) to have the follow-ups go directly into that folder.
now the problem with this is that developers would still have to look at it for an overview (the volunteers would still need to copy the developers when they create a new bug).
the cyrus mail server will scale well for this sort of thing, and I believe that it can also export the mailboxes as news feeds as well (I haven't ever configured this so I can't say the exact details).
for searching, just make it available to google (I'm sure google would cooperate with this sort of thing to find the best way to do it).
the key is to find people to volunteer to do the ...
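To make the two mechanical parts of this proposal concrete (discarding duplicate bounces and routing follow-ups by address), here is a small Python sketch. It assumes that a bounced mail keeps its original Message-ID while a forwarded mail gets a new one, which is why only bounces can be deduplicated automatically. The address bugs+drivers.ethernet.3com.123@kernel.org is the example from the mail; folder_for, BugList and the "INBOX" fallback are hypothetical names, not part of Cyrus or any real server.

```python
from email.utils import parseaddr


def folder_for(address, base="bugs"):
    """Map 'bugs+drivers.ethernet.3com.123@host' to a list of folder names.

    Addresses without a '+detail' part fall back to a top-level INBOX,
    where a volunteer would sort them by hand.
    """
    local = parseaddr(address)[1].split("@", 1)[0]
    user, _, detail = local.partition("+")
    if user != base or not detail:
        return ["INBOX"]
    return detail.split(".")


class BugList:
    """In-memory stand-in for the bug mailbox: dedup plus folder routing."""

    def __init__(self):
        self.seen = set()    # Message-IDs already delivered
        self.folders = {}    # "a/b/c" -> list of Message-IDs

    def deliver(self, message_id, to_address):
        """Return True if stored, False if it was a duplicate bounce."""
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        path = "/".join(folder_for(to_address))
        self.folders.setdefault(path, []).append(message_id)
        return True
```

Two volunteers bouncing the same report both call deliver() with the same Message-ID, and the second call is dropped; a forward with an added comment carries a fresh Message-ID and would slip through, matching the "manual weeding" caveat above.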
Since 1992, lkml (with "Cc:" to the appropriate subsystem mailing list if applicable) and the presumed responsible parties are the only channels I've used to report the bugs I encounter. Other methods come and go, but old habits die hard, particularly when they have a knack for producing the desired result. Historically, requiring a developer to fire up a GUI to read a bug report decreases the chance that bug report will be noticed. The development community can do whatever flips its collective switch as far as tracking bugs, but the bugs have to be reported and noticed before tracking becomes a meaningful activity. One more thought and I'll get off your screens... We've steadfastly resisted making lkml and friends subscriber-only mailing lists precisely because we don't want to miss a potential bug report because a would-be submitter isn't subscribed. If people aren't looking for bug reports here, what's the point? --Bob Tracy rct@frus.com -
since there are subsystems out there which are managed separately this doesn't work out. I wasn't happy when I noticed that patches got applied to the source code I contributed without notifying me while I still worked on that code separately. It was moreover the fault of the subsystem maintainer to not notify me back then, but a centralized bug reporting tool (such as bugzilla) would at least have notified me, or I would have been able to see the suggested changes there. it's just easy to miss something here: if an ext3 bug comes in and all the people who are involved in the ext3 filesystem are on vacation, I'm sure they won't read all the mails which came in during a week. now take a part of the kernel which is smaller than the ext3 filesystem (eg. usb gadgets, smaller drivers) Markus -
Well I'm behind the stuff I'm doing because I'm interested in it. And if some bugs are introduced by my work or derived from my work I'd like to get them cleaned up in the end. If I see that someone reports bugs which don't really concern my work at all I just forward them to the subsystem/maintainer who's "in
I'm very sure that happens, maybe it's just not visible to everyone because there are so many open issues. (I just take myself as an example here, I didn't do too much with other bugs but at least some of my work closed 5 other bugs this year beside the bugreports I'm
Yes Adrian did a very good job with collecting every bugreport and
Both have search buttons yes, but the lkml doesn't leave an unread mail open on top of the lkml as bugzilla does. if you look for open bugs bugzilla keeps the bugs open at least, on the lkml I tend to skip days sometimes.
Many people who consider themselves maintainers of a subsystem are assigned to a subsection on bugzilla, if it really doesn't work out we have to change the corresponding maintainer. If that maintainer doesn't know where to go with that bugreport he can easily send it to the lkml and some people will recognize the sender/email and pay extra attention to it (that's just how I think
well are there any bugs that cannot be forwarded/directed to a corresponding maintainer? Maybe I don't see something here, can you point me to a bugreport which cannot be handled at all? As a reference I'll take the following bugreport: http://thread.gmane.org/gmane.linux.kernel/521185 the bug doesn't even mention what device is affected, asking for further detailed information (dmesg) shows up what's left at least.. (in the meanwhile the bug even got solved)
Markus -
It's easy to send the different categories to different mailing lists, if that's what we want to do. The only snag is that some aggressive filtering on the SCSI lists etc. stops me from bouncing messages to them, but that's fixable. Yes, human involvement from someone with half a brain would be better. Andrew does a lot of that. Not a particularly good use of talent really, but still. As Andrew has pointed out before though - even though he forwards the bugs, nobody does anything with them. The sad truth seems to be that people have very little interest in fixing bugs when they are
Go to http://bugzilla.kernel.org. Hit query. Find the box that says "Bug Changes, Only bugs changed in the last __ days". Stick 7 in it. 74 bugs found.
I'm reluctant to drop / close them. We could fairly easily move them to a "STALE" state if you want, and have that ping the user. Not sure what we'd ping them with apart from "Nobody seems to give a toss about your bug. Life's a bitch. Try sending chocolates, flowers, or fireworks". I'm still unconvinced the users or the tool are the problem, but if it
What would you want from a smarter / better bugzilla or other bug tracking tool? A list of requirements / suggestions would be nice. The main complaint we had before was lack of an email interface, and that was fixed a long time ago. I admit development has not exactly been active since, but the only person I got real feedback from was Dave J, and we've been fixing his UI issues.
M. -
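The "changed in the last 7 days" query and the proposed "STALE" state both reduce to comparing each bug's last-change date against a cutoff. A minimal Python sketch of that comparison follows. It does not talk to bugzilla at all: the bug ids and dates are made up, the function names are hypothetical, and the 150-day staleness threshold is an arbitrary assumption chosen only for the example.

```python
from datetime import date, timedelta


def changed_within(last_changed, days, today):
    """Bug ids whose last change is at most `days` days before `today`."""
    cutoff = today - timedelta(days=days)
    return sorted(bug for bug, when in last_changed.items() if when >= cutoff)


def stale_candidates(last_changed, days, today):
    """Bug ids untouched for more than `days` days: candidates for STALE."""
    cutoff = today - timedelta(days=days)
    return sorted(bug for bug, when in last_changed.items() if when < cutoff)


# Made-up data: bug id -> date of last change (not real bugzilla entries).
last_changed = {
    101: date(2007, 4, 27),
    102: date(2007, 4, 1),
    103: date(2006, 10, 15),
}
today = date(2007, 4, 29)
recent = changed_within(last_changed, 7, today)
old = stale_candidates(last_changed, 150, today)
print(recent, old)
```

With this data, only bug 101 shows up in the 7-day view and only bug 103 would be pinged as stale; bug 102 sits in between, which is the range where a human still has to decide.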
And what part of the "directed" did you miss? Do you really expect me to go there every day to look at all bugs? That's not a bug tracker. That's just a noise-maker. It needs to be email, not some "mouse around for 30 seconds and type thing", and it needs to be *directed*. Preferably with somebody who actually did some manual scanning over it and spent a few minutes just looking at whether it looks like a worthy bug. In other words: we shouldn't have all developers wasting time doing this. It would be much better to have _one_ person (or a group of people) doing it, and actually turning your "Not hard to do" into real information, rather than just random data. Adrian did. The good news is that it looks like now that people are aware of it, we hopefully have others who will help do this kind of thing. Linus -
I am doing that. It's only 5-10 a day - routing them to the relevant culprit is very little work. It's also very little work for said culprits to totally ignore said routing, which is a tougher problem. -
If the result is fixing things which then don't get fixed in mainline, as Adrian notes, then there is something wrong with the process, and why will people bother to work on stable if they have doubts that there will be long term benefit. With all the effort the regressions list takes and the stable group puts into fixes, someone in charge should insist that regressions fixed in stable be fixed in mainline. Since there's only one "someone in charge" of policy, I think that's a reasonable commitment to the people doing the work. -- Bill Davidsen <davidsen@tmr.com> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot -
That whole premise is flawed. The *rule* for the stable tree is that things don't get merged into the stable tree unless they are fixed in mainline already. We had that problem in the 2.4.x / 2.5.x split. I think we learnt our lesson. Linus -
If Adrian cares to note which two regressions he had in mind in his previous post <20070426125802.GL3468@stusta.de> or what the exact timing I'd love to think that's the case. -- Bill Davidsen <davidsen@tmr.com> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot -
I think there is an issue with two different things being conflated, and this causes real stability problems. 2.6.x is both the first kernel in a series that is judged to be "stable" and the kernel that is the split between 2.6.x.y and 2.6.x+1. This is a fundamental problem, because it means that 2.6.x must have all of the problems that are being debugged by the people who understand the areas they are in, because 2.6.x+1 has to start so that people who are clueless about all of the areas with remaining bugs don't spend their time putting more regressions into their submissions for 2.6.x+1. It is also a problem because it is easily possible for a problem to exist in 2.6.x-rcN which can only be correctly fixed by doing intrusive things, but can be papered over in an obviously-safe way. (E.g., the issue with legacy interrupt delivery when MSI is enabled). The intrusive patch could easily break a bunch of unrelated stuff, so that's no good for 2.6.x-rcN, but papering over bugs is no good for mainline. These bugs have to be fixed after the split, which means that the version at the fork must contain the bug. Furthermore, everybody (people reporting bugs, people fixing them, and people merging fixes) seem to doze off late in -rc kernels. Having an announcement of something with a qualitatively different version wakes them up. I say have a target of no known regressions in 2.6.21.1, with 2.6.21 being pretty good, and don't count too much on the stability of 3-number kernel I think the "stick" can't be delaying the window, because that's too broad. I think it has to be making people who are needed for fixing stuff miss the window. People aren't going to go learn a new area of the kernel to resolve regressions in it, but they're more likely to keep their own I don't think 2.6.x can be OK, by policy. I think 2.6.20.y got to an OK state eventually, which is to say that there's no need now to use a 2.6.19.y kernel. I think that 2.6.21 isn't OK yet, but I ...
Without someone holding Linus feet to the fire the next release may be a real POS. I think you have done the perfect job, identifying the show stoppers, quantifying the obscure and minor regressions, and serving to give testing targets as purported fixes are applied. I don't think you should judge your work by leaving some targets for -stable and 2.6.22, but rather from the number of problems you detected, documented, and caused to be addressed. If it were my week to be God, I would insist that the rcN to final step was regressions-only, and that all regressions be classified as (a) acceptable results of changes to fix other problems, (b) must be fixed before release, or (c) obscure enough to tolerate for a short time, must be fixed in stable and mainline before N+1 release. Measuring releases or your own value against perfection is thankless! -- Bill Davidsen <davidsen@tmr.com> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot -
IMO, the closer you look, the more warts you find. Before you started doing your work with kernel regressions, no one was really tracking it. I bet you have helped cut down on the regressions, but I have no good way to quantify my gut feeling.
Additional comments on developers and fixing regressions:
* Sometimes seeing a long list, people's eyes glaze over. It's just human nature. A long list also gives us no idea of scale, or severity. I bet a weekly "top 10 bugs and regressions" email would help focus developer attention.
* To be effective, lists, either long or top-10, must be pruned if you get a sense that only one user is affected. [With oopses and BUGs as a clear exception,] many problems benefit from at least two users reporting a bug.
* It gets a bit tiresome to field the large number of driver bug reports that eventually turn out to be related to broken interrupt handling somehow. I think we developers need to get better at showing users how to isolate driver vs. PCI/ACPI/core bugs. Maybe drivers need to start introducing interrupt delivery tests into their probe code. Overall, broken interrupt handling manifests in several ways, most of which initially appear symptomatic of a broken driver.
Jeff -
Top 10 ordered by what?
I always tried to put the most mysterious ones like "kwin dies silently"
or "gammu no longer works" at the top of my lists hoping some developer
might become interested in getting clues about them.
But when there were 15 outstanding suspend regressions, these were also
important.
And how to order the rtl8139 netdriver regression reported one month ago
against the snd_hda_intel regression reported one month ago?
Both are "only" drivers no longer working for some users, but there
might be many more users for whom they don't work in 2.6.21 because the
maintainers didn't bother to debug and fix them.
Ideally, maintainers should debug and fix regressions (and other bugs)
without requiring weekly regression lists.
If bug reports and weekly reminders aren't enough, I don't think
First of all we had several cases where one report by one user resulted
in a serious bug being fixed. There might be only one -rc tester with a
strange workload or some unusual hardware.
A bigger point is that "at least two users reporting a bug" assumes you
always know whether two bug reports are for the same bug.
Some examples for problems:
- during early 2.6.21-rc, it was not unusual for users to run into
three unrelated suspend related regressions on one computer
- during 2.6.20-rc, we had 4 similar looking reports, 3 for one
regression, and the fourth for a completely different regression
- during 2.6.21-rc, it turned out the following reports were caused
by the same regression:
kwin dies silently
Regressions have the advantage of being able to compare with a
known-good kernel.
If a driver works with 2.6.20 but doesn't work with 2.6.21-rc, ask the
submitter for dmesg's from both and diff them. In my experience that
usually gives a good indication whether or not it's an interrupt related
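The dmesg comparison suggested here is easy to script. The following Python sketch, using only the standard library's difflib and re, strips the leading "[ seconds ]" timestamps (so that timing jitter does not show up as a change), diffs the two logs, and then picks out changed lines that mention interrupts. The keyword list (irq, interrupt, apic, msi), the function names and the sample log lines in the usage note are assumptions made for illustration, not a tested heuristic.

```python
import difflib
import re

# Leading printk timestamp such as "[   12.345678] "
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
_IRQ_HINT = re.compile(r"irq|interrupt|apic|msi", re.IGNORECASE)


def normalize(dmesg_text):
    """Split a dmesg dump into lines with the timestamps removed."""
    return [_TIMESTAMP.sub("", line) for line in dmesg_text.splitlines()]


def dmesg_diff(good, bad, good_name="2.6.20", bad_name="2.6.21-rc"):
    """Unified diff between a known-good and a regressed dmesg."""
    return list(difflib.unified_diff(
        normalize(good), normalize(bad),
        fromfile=good_name, tofile=bad_name, lineterm=""))


def irq_lines(diff):
    """Added or removed lines that look interrupt related."""
    return [
        line for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++", "---"))   # skip the file headers
        and _IRQ_HINT.search(line)
    ]
```

Usage would be along the lines of irq_lines(dmesg_diff(open("dmesg-2.6.20").read(), open("dmesg-2.6.21-rc").read())), with the two file names being whatever the submitter attached. An empty result does not rule out an interrupt problem; it only means the logs do not differ in any line matching the keywords.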
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many ...
Adrian Bunk <bunk@stusta.de> :
Pointer for the rtl8139 regression please ?
-- Ueimor Anybody got a battery for my Ultra 10 ? -
Subject : boot failure: rtl8139: exception in interrupt routine References : http://lkml.org/lkml/2007/3/31/160 Submitter : Stephen Clark <Stephen.Clark@seclark.us> Status : unknown cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed -
The poster says rtl8139, but doesn't provide more info. His lspci says "RTL8169SC", which sounds more like r8169 to me. Jeff -
Interesting.
I didn't notice it, Andrew didn't notice it, and it seems that although
it was in three posted versions of my regression lists, no one looked at
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
Yes, thanks. A few bugfixes have been added to the r8169 driver between 2.6.21-rc5 and now. Some of them address latent bugs exposed by timing changes. Stephen, would you mind testing 2.6.21 and opening a PR at bugzilla.kernel.org if the bug does not go away ? Bugzilla e-mails end up directly in my mailbox. l-k traffic can temporarily go unnoticed (especially on Saturday night). -- Ueimor Anybody got a battery for my Ultra 10 ? -
Sure I'll give it a try in the next couple of days. Steve -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) -
Yeah, that was my bad, it is an RTL8169SC, and the problem was intermittent: sometimes it caused a panic, other times it didn't. It is a laptop that does not have a serial port and I couldn't get the kernel to boot using a usb serial port, so I couldn't get a screen capture of the intermittent panic. Steve -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) -
Jeff, If hardware worked in the previous version of the kernel can't users expect the same hardware to work in this kernel? Steve -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) -
I think that is indeed a reasonable expectation. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) It is exactly because a man cannot do a thing that he is a proper judge of it. -- Oscar Wilde -
Failure of that assumption is the heart of the whole "regression" discussion. It's not limited to hardware, kernel security might be an issue, some network protocols might work faster and less reliably, etc. Kernel behavior changes sometimes totally break user software which makes unwarranted assumptions. That's not a regression, although users may see it that way. When a change in fork() changed the child-runs-first behavior, many programs broke, as was true with threading changes. Bad reliability is the reward for bad code, but if a kernel change makes that obvious some people think it's a regression. -- Bill Davidsen <davidsen@tmr.com> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot -
Many people have already said this, but it needs saying again. You are doing a great job that really helps, both with the regression lists and the trivial patch monkey stuff. Please keep up the good work, you are really helping the kernel. -- Jesper Juhl <jesper.juhl@gmail.com> Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html -
Hey, if I knew enough to help, I would. But since there are so many more talented developers, I can only contribute by testing. Without your help, I would not have gotten STD/STR working on my notebook for 2.6.21-rcx. We definitely need a more robust system. Thank you, Jeff. -
I count 13. (v2) had 15 items, of which 2 were subsequently fixed or found The -stable team can presumably take care of these in 2.6.21.1, right? That leaves 10 that need developer attention. John Stultz seems to be taking care of 3 of them. Oliver Neukum has 1. 2 are particular drivers (ali_pata and rtl8139, according to the reports). 2 seem to be ACPI-related; at least one has a candidate patch now. 1 seems to be an ALSA problem. 1 is STD and being debugged. It looks like all of the known regressions are being worked on, and getting fixes in for them is -stable material at this point. Furthermore, it doesn't look to me like anyone who is needed for dealing with these regressions is trying to get stuff into the 2.6.22 merge window. I think it's clear that this is the right point for Linus to start the 2.6.22 cycle and leave the rest of the 2.6.21 work to the -stable team, who are the experts of taking care of this sort of stuff. Furthermore, it seems like -rc testers at this point have found everything in 2.6.21-rc they're going to, so, again, it's time for new regressions. Personally, I'd vote for having Linus leave off at 2.6.X-final, and have 2.6.X be the first -stable release of the series, where the remaining known regressions get fixed, but that's an issue of nomenclature, not development process. I think you've allowed for a well-tested 2.6.21, and a good chance of a 2.6.21.1 or .2 with no known regressions against 2.6.20, which seems to me like you succeeded as far as everything except making Linus a release engineer. -Daniel *This .sig left intentionally blank* -
Two of them are heavily discussed patches, and I'm therefore not sure
they will ever reach the 2.6.21 branch now that the attention has been
You are overinterpreting the Handled-By field in my reports.
It does not imply that this person promised to fix this issue, it only
says that this person is or was working on this issue.
And more than 50% of the issues were reported first last month or
earlier and are still unfixed despite repeated reminder emails - if
they weren't fixed until now they might as well become similar to
"foo seems to be broken since at about 2.6.9" issues.
It sounds highly unrealistic that all issues unfixed for a month
suddenly become fixed even though the main focus of everyone shifts to
That's a wrong impression, nearly every active kernel developer has at
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
I've just compiled 2.6.21, seems to run fine, in tickless mode, on a Dell D610. Is there any way to check how many timer ticks are firing, except for monitoring /proc/interrupts? Thanks for all the hard work! Jan -- She has an alarm clock and a phone that don't ring -- they applaud. -
[Offtopic] Today, April 26, 21 years have passed since the Chernobyl Nuclear Power Plant disaster, and Linus announced .... *drum roll* .... 2.6.21 !!! What a mysterious coincidence... -
I really appreciate the lot of -rcs, especially if there are so many intrusive changes/regressions. Like Andrew, I have a feeling that it gets buggier, but at least, it seems to be made up every ... two releases. 2.6.16 was a good one, .18, and .20. I suppose the bug/regression distribution between [2.6.16-2.6.17, 2.6.17-2.6.18] was biased like [70, 50]. About 2.6.21 - will see, rc has been to my liking. Since a picture says more than a thousand words, have http://jengelh.hopto.org/GFX0/kernel-rating-2.6.21.png (The last kernel to have only 5 -rcs was 2.6.14 - interesting) Happy hacking, Jan -- -
I wouldn't say that, but yes, there is at least *some* tendency to not merge scary stuff after a painful release. For example, I can certainly say that after 2.6.21, I'm likely to be very unhappy merging something that isn't "obviously safe". I knew the timer changes were potentially painful, I just hadn't realized just *how* painful they would be (we had some SATA/IDE changes too, of course, it's not all just about the timers, those just ended up being more noticeable I actually hope that 2.6.21 isn't even all that bad, despite all the worries about it. And I may be complaining about the problems the timers caused, but it was definitely something that was not only worth it, it was overdue - and those NO_HZ issues had been brewing literally for years. So considering issues like that, I think we're actually doing fairly well. One of the bigger issues is that I think -mm (and I'm pretty sure Andrew will agree with me on this) has really had a rather spotty history. It's been unstable enough at times that I suspect people have largely stopped testing it, with just the most die-hard testers running -mm. So -mm is still very useful just because *Andrew* tests it, and finds all kinds of issues with it, but I literally suspect that Andrew himself is personally a big part of that, which is kind of wasteful - we should be able to spread out the pain more. Andrew is also too damn polite when something goes wrong ;) So we should have somebody like Christoph running -mm, and when things break, we'll just sic Christoph on whoever broke it, and teach people proper fear and respect! As it is, I think people tend to send things to -mm a bit *too* eagerly, because there is no downside - Andrew is a "cheap date" testing-wise, and always puts out ;) Linus -
And with Al Viro doing random code review and fill in the commits for regression fixes, even long established developers will check their code twice before submitting ;-) Willy -
test.kernel.org also picks up -mm and additional machines run -mm kernels. It doesn't catch everything but a few bugs get rattled out. That said, it's automated with Andy Whitcroft and Steve Fox kicking it along periodically. After spending the day tracking just two issues in -mm and simple ones at that, the lack of testing may be simply because it's really bloody boring even with the access to an automated system. There is no A few more "you're a spanner" mails probably would not hurt even though -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab -
Perhaps do one at a time [ at the cost of queueing other stuff, yeah :( ] Like: 2.6.21 - only NO_HZ & hrtimers, and the SATA code in .22. Probably does not work out in reality, so perhaps just live with long rc cycles. Yes, perhaps we need a weakchanges-mm ("weak" is intended, not to be confused with week) that can carry stuff like doc updates, Kconfig updates, etc. - patches that are a little more than -trivial. Jan -- -
