Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"

Previous thread: [ANNOUNCE] Userland suspend/hibernation tool v0.8 released by Rafael J. Wysocki on Wednesday, January 2, 2008 - 11:36 am. (1 message)

Next thread: UML woes in 2.6.24-rc6-mm1 by Miklos Szeredi on Wednesday, January 2, 2008 - 1:53 pm. (7 messages)
To: <linux-kernel@...>
Cc: Matthew Wilcox <matthew@...>, James Bottomley <James.Bottomley@...>, Andrew Morton <akpm@...>, Linus Torvalds <torvalds@...>
Date: Wednesday, January 2, 2008 - 12:25 pm

revert commit:

commit 6f5391c283d7fdcf24bf40786ea79061919d1e1d
Author: Matthew Wilcox <matthew@wil.cx>
Date: Tue Sep 25 12:42:04 2007 -0400

[SCSI] Get rid of scsi_cmnd->done

this is a supposed-to-be-cleanup commit, but apparently it causes
regressions:

Bug 9370 - v2.6.24-rc2-409-g9418d5d: attempt to access beyond end of device
http://bugzilla.kernel.org/show_bug.cgi?id=9370

this patch should be reintroduced in a more split-up form to make
testing of it easier.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
drivers/scsi/scsi.c | 20 +++-----------------
drivers/scsi/scsi_error.c | 1 +
drivers/scsi/scsi_lib.c | 14 ++++++++++++++
drivers/scsi/scsi_priv.h | 1 -
drivers/scsi/sd.c | 28 ++++++++++------------------
drivers/scsi/sr.c | 21 +++++++++++++++------
include/scsi/scsi_cmnd.h | 2 ++
include/scsi/scsi_driver.h | 1 -
include/scsi/sd.h | 13 +++++++++++++
9 files changed, 58 insertions(+), 43 deletions(-)

Index: linux/drivers/scsi/scsi.c
===================================================================
--- linux.orig/drivers/scsi/scsi.c
+++ linux/drivers/scsi/scsi.c
@@ -59,7 +59,6 @@
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_dbg.h>
#include <scsi/scsi_device.h>
-#include <scsi/scsi_driver.h>
#include <scsi/scsi_eh.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_tcq.h>
@@ -368,8 +367,9 @@ void scsi_log_send(struct scsi_cmnd *cmd
scsi_print_command(cmd);
if (level > 3) {
printk(KERN_INFO "buffer = 0x%p, bufflen = %d,"
- " queuecommand 0x%p\n",
+ " done = 0x%p, queuecommand 0x%p\n",
scsi_sglist(cmd), scsi_bufflen(cmd),
+ cmd->done,
cmd->device->host->hostt->queuecommand);

}
@@ -654,12 +654,6 @@ void __scsi_done(struct scsi_cmnd *cmd)
blk_complete_request(rq);
}

-/* Move this to a header if it becomes more generally us...

To: Ingo Molnar <mingo@...>
Cc: <linux-kernel@...>, Matthew Wilcox <matthew@...>, Andrew Morton <akpm@...>, Linus Torvalds <torvalds@...>
Date: Wednesday, January 2, 2008 - 12:46 pm

I disagree with this. We only have one reporter of a problem and it
appears to be some type of obscure interaction with pktdvd which no-one
can track down (although it's not really helped by the reporter not
being very responsive).

The correct thing to do is root cause the problem and fix it at source,
since it's very likely that this is a pre-existing bug that was simply
uncovered by the patch you're recommending we revert.

Unfortunately, I suspect it won't get fixed until someone else actually
manages to reproduce it (which I haven't been able to so far).

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Ingo Molnar <mingo@...>, <linux-kernel@...>, Matthew Wilcox <matthew@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 3:19 pm

It's totally immaterial if we have one reporter or many. The fact is, that
thing has been outstanding for almost two months now. The root cause is
almost certainly known (and Matthew is apparently even aware of it), but
it has been dicked-around-with by having totally weak excuses ("is it an

No, the correct thing would have been to fix it a month ago.

By now, it needs to either have a patch, or to be reverted. It's too late
to make excuses.

And no, it doesn't really need any more reporters or ways to reproduce it,
all it needs is looking at the patch and seeing what the differences are
AT A SOURCE level. And quite frankly, I see one huge honking difference,
which is that new test for

if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC)

which will disable all the crud that sd_done() does.

And guess what? Look at what sd_done() does: it checks whether the request
was partially filled, and tries to handle that end-of-medium case nicely.
Which is *exactly* what seems to have broken.

So I think you are making totally idiotic excuses. I may not know the SCSI
layer all that intimately, and maybe I'm wrong and there is some other
cause for this, but quite frankly, I doubt it. And I can look at just the
patch and have a damn good idea of why something is broken, but I can't
actually fix it myself because I don't know how to look up a a
"scsi_driver" from a "scsi_cmnd" sanely.

But almost two months after the fact, we should have had a patch that does
that, or we should revert it.

NO MORE BOGUS EXCUSES!

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 3:40 pm

sd_done and sr_done are called for REQ_TYPE_FS -- if the request comes
in through one of the SG interfaces, we call scsi_setup_blk_pc_cmnd()

That's in the patch. But it would be the wrong thing to do because SG
commands should not be molested by the sr/sd driver.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--

To: Matthew Wilcox <matthew@...>
Cc: Linus Torvalds <torvalds@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 4:12 pm

OK ... I'll revert it. However, I still think it's the wrong course of
action, because as far as my analysis goes, this code is functionally
equivalent to what went before with the exception that we now rely on
the request->cmd_type information in the post processing (previously we
just relied on the cmnd->done pointer).

As far as I understand what's going on, the reporter is misusing the
interface. He sets up a packet mapping and then accesses the underlying
device, which is wrong (you're supposed to access now via the packet
device). The one thing the packet mapping does is to alter the
blocksize, which could lead to the errors reported (because the
underlying fs block size and the actual block size differ) ... however,
I think it should do this all the time, so what I'm unable to explain is
why it doesn't write past the end of the device with this commit
reverted.

But all of the above is the hallmark of an existing bug that this commit
exposes ... revert it, and we'll cover the underlying problem up again.

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 4:45 pm

To say that another way:

"the code is functionally equivalent, EXCEPT IT ISN'T, and it's
known to be broken".

wouldn't you say my version is more honest and correct?

The old code did a per-command callback. The new one doesn't. The code was
*supposed* to be equivalent, but it clearly isn't. Why argue the point?

And no, maybe it's not that REQ_TYPE_BLOCK_PC should be calling ->done,
maybe it's that some REQ_TYPE_FS commands should *not* be calling ->done.
Or maybe we somehow got the wrong ->done in the first place, because we
now get it from a different source.

I don't know, but what I'm arguing (very strongly) against is this
attitude of "we don't know what's wrong, but wë́'ll leave it broken because
we can't be bothered to figure it out".

That is exactly what reverting is there for. It doesn't matter one *whit*
if the new code is cleaner and prettier, if it doesn't work.

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 7:33 pm

No. Just because a bug appears when a particular piece of code is in
and disappears when it is reverted doesn't automatically equate to the
code in question being buggy. We seem to get a lot of these second
order effect type things; sometimes its just a problem caused by a
particular routine compiling to a longer byte sequence and pushing
something else out.

Do give us credit for thinking "functional equivalency problem" when
this bug report first came in ... I've had myself and several other
people over the code. If there's an inequivalency somewhere I'm damned
if I can spot it. The most promising other failure mode we tried was
request type changes over the lifetime of the command, but we can't make
that one fly either.

Look at the taxonomy of the bug. This is the form of the error:

buffer I/O error on device sr0, logical block 20304
attempt to access beyond end of device
sr0: rw=0, want=81224, limit=40944

The last limit is the most suggestive, that comes straight from
bdev->bd_inode->i_size>>9 and is supposed to be the size of the block
device in 512 byte blocks. For a 4.7GB DVD, it's a little small.
Nothing in the sr code sets this directly (although it does come from
get_blkdev() for the first opener). pktcdvd does set it, though ... and
probably wrongly if the drive in question isn't UDF formatted.

I have also tried on many occasions to reproduce this without success
(there's a simple recipe in the bug report, but it just doesn't work for
me). My setup is with an aic94xxx->expander->SATAPI DVD, whereas the
original reporter is ata_piix -> PATA DVD, so it could be stack
differences---but again, if it is, the bug itself can't be a simple one
in the generic code. The fact that there are no other reporters of
problems like this also indicate to me that it isn't a widespread
problem (again, pointing to something more specific in the setup of the
reporter).

The unreproduceability coupled with the lack of other error reports
leads me to b...

To: James Bottomley <James.Bottomley@...>
Cc: Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 9:58 pm

.. but you're ignoring the fact that if pktcdvd sets it wrong, then it
should be visible with the pre-commit kernel *also*.

In other words, you continue to ignore the fact that BEHAVIOUR CHANGED.

Why?

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Saturday, January 5, 2008 - 10:55 pm

pktcdvd sets it when opening the /dev/pktcdvd device, but when the
drive is later opened as /dev/scd0, there is nothing that sets it
back. (Btw, 40944 is possible if the disk is a CDRW that was formatted
with "cdrwtool -m 10236".)

The problem is that pktcdvd opens the cd device in non-blocking mode
when pktsetup is run, and doesn't close it again until pktsetup -d
is run. The effect is that if you meanwhile open the cd device,
blkdev.c:do_open() doesn't call bd_set_size() because bdev->bd_openers
is non-zero.

I don't know the correct way to fix this. Maybe adding bd_set_size()
to sr.c:get_sectorsize() which already does set_capacity() would

I can repeat this bug, both with and without the scsi patch that is
claimed to make a difference, both with an external USB drive and an
internal IDE drive.

To repeat:

1. Start with an empty drive.
2. pktsetup 0 /dev/scd0
3. Insert a CD containing an isofs filesystem.
4. mount /dev/pktcdvd/0 /mnt/tmp
5. umount /mnt/tmp
6. Press the eject button.
7. Insert a DVD containing a non-writable filesystem.
8. mount /dev/scd0 /mnt/tmp
9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null
10. If the DVD contains data beyond the physical size of a CD, you
get I/O errors in the terminal, and dmesg reports lots of
"attempt to access beyond end of device" errors.

--
Peter Osterlund - petero2@telia.com
http://web.telia.com/~u89404340
--

To: Peter Osterlund <petero2@...>
Cc: Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 9:57 am

Could be ... this is deep viro magic, though; I've added him to the Cc

Brilliant! I can confirm the reproduction of the bug too (that's with
the originally fingered commit reverted).

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 10:47 am

may i point out the obvious at this stage? The thing that finally got
movement into this bug was ... :

exposure on lkml

The reproducer came to you via Peter Osterlund who has never authored a
single drivers/scsi/ commit before (according to git-log) and who (and
here i'm out on a limb guessing it) does not even follow
linux-scsi@vger.kernel.org.

this bug was obscure and hidden on linux-scsi@vger.kernel.org for
_months_, (it is a rarely visited and rarely read mailing list) and
there was just not enough "critical mass" to get this issue fixed.

_THAT_ is the power of lkml. People who are not generally interested in
your subsystem come and help. There is extra noise, but it's manageable.

so may i at this point suggest that you as the SCSI maintainer start
reading SCSI bugreports on lkml and start participating in SCSI topics
there, without extra prompting? It _is_ an important aggregation mailing
list for development, just like -mm or the upstream kernel is an
aggregation point of all things Linux.

I believe the "I only read linux-scsi, please post bugs there" approach
is harmful. If lkml traffic is too big then i'd suggest to set up email
filters to separate out mails that have 'SCSI' in their subject line or
body. Fortunately it's a really easy key to filter on. [ Scheduler mails
are much harder to filter out :-/ ] In fact i'd suggest to do all SCSI
development on lkml. We've got one upstream kernel codebase, one git
stream of commits and hence we should use one lkml feed to discuss
things on.

Ingo
--

To: Ingo Molnar <mingo@...>
Cc: James Bottomley <James.Bottomley@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 1:29 pm

Good idea. Minor flaw: If somebody forgets to Cc LSML, he might also

This yields false positives whenever a .config is posted.

Also, filtering based on message body contents is costlier than
filtering by headers. I for one use Sieve to sort mails into different
IMAP folders at my mail provider's server, and I think Sieve doesn't
offer tests for patterns in message bodies at all.
--
Stefan Richter
-=====-==--- ---= --==-
http://arcgraph.de/sr/
--

To: Stefan Richter <stefanr@...>
Cc: James Bottomley <James.Bottomley@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 4:26 pm

these two patterns:

"^CONFIG_"
"^# "

will exclude .config-ish lines.

Ingo
--

To: Ingo Molnar <mingo@...>
Cc: Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 11:20 am

I won't disagree with that. That's why my philosophy is to try to force
all bug reports out of bugzilla and on to the relevant mailing list

If I were you, I'd actually make a cursory effort to get my facts
straight before spouting off.

This bug was actually hidden in bugzilla for ages, where Matthew Wilcox
was trying to deal with it on his own. The first I heard of it (apart
from a linux-scsi question on 13 November, when regrettably, I was busy
with other things) was on 18 Dec when Natalie added me to the bugzilla
cc list. The first thing I did on that date was finger pktcdvd and add
Jens to the cc list ... however, since there was no mailing list thread
to follow he ended up asking for context which no-one provided.

The whole problem with this bug was generated precisely because it was
kept in bugzilla where too few people actually looked at it. You're the
one who annotated the bugzilla entries with trite little homilies asking
why there was no action *without* ever notifying any mailing list, I
might add.

The fault lies in our bug processing methodology. Bugzilla is a fine
tracking tool, but it's a bloody useless workflow one for actually
solving problems because, as you say, and I agree, the mailing lists are

Oh good grief ... we do add other mailing lists to the cc list (most
often lkml) when it becomes evident that it's not a SCSI problem. Most
bug reports actually start off going to a set of lists (including lkml)
anyway, so there's usually full context. You may love drinking from the
firehose ... I find it makes me want to pee a lot. Forcing your work
habits on everyone else isn't really a very community way of solving
anything.

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 1:11 pm

Not true. I sent my response to the bug to linux-scsi and cc'd bugzilla.
Unfortunately, the bug reporter sent the information requested to bugzilla
instead of linux-scsi. So it got lost and overlooked.

If there's willingness to change, I'm willing to put some effort into
moving us to a bug tracking system that fits our workflow better than
bugzilla. But if that effort will be disregarded, then let me know now
so I don't waste my time.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--

To: Matthew Wilcox <matthew@...>
Cc: Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 1:36 pm

Well, I'd say yes, certainly, and I'll support the effort ... but the
problem is that I'm not one of the powers that be who control our
current bugzilla ... that's the constituency we need to convince. As I
keep saying, just getting new SCSI bug reports tipped onto the SCSI
mailing list will give us about 90% of what we need, but I haven't even
managed to get that to happen.

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 2:34 pm

As long as the process will be that much complicated, it will never
correctly work. The primary requirement for a useful bug reporting
workflow is *not to put the burden on the reporter*.

I agree with Ingo here. How can a user know where to post ? He has
a problem with his Linux distro. He finally understands or gets told
that the strange numbers on the screen indicate a kernel oops and
that he must report it if he wants it to be fixed. He then googles
for how/where to report a bug, and the first reply says :

http://www.kernel.org/pub/linux/docs/lkml/reporting-bugs.html

in which you can read :

"If you are totally stumped as to whom to send the report, send
it to linux-kernel@vger.kernel.org".

So *this* is the official central entry point, like it or not. And
in fact it works, given the number of off-topic reports we get. It
proves that end users know how to report their problems there.

Other lists should be used when the problem is clearly qualified.
And most of the time, it's not up to the end user to qualify his
problem, but to *us*. Our mission is not to blindly write lines of
code, but to spend some time getting user feedback, and bug reports
are part of this feedback.

In my opinion, the most important reader contribution to LKML is
just reading bugs reports and redirecting them to the proper list
if obviously required. People do this all the time and it has
always worked.

In parallel, bug entries may be added into bugzilla, either by the
reporter once suggested to do so, or by the person redirecting him
to the proper list.

So a working workflow would look like this :

1) from: user
to: lkml
subject: help needed with my CD burner

2) from: any LKML reader
to: user
cc: lkml, linux-scsi
subject: re: help needed with my CD burner

Message forwarded to linux-scsi. You may accelerate resolution
by describing your problem in bugzilla [url, mail...]

=> user knows his problem is being considered (ver...

To: Willy Tarreau <w@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 2:56 pm

Where does this opinion end users would find Bugzilla too difficult come
from? Many other projects use Bugzilla without big problems.

Is this just plain FUD or based on real reports from end users?
In the latter case, please give pointers to them so that whatever
problems exist can be fixed.

And different to LKML, you don't run into problems like majordomo

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

--

To: Adrian Bunk <bunk@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 3:10 pm

I, as an end user of ntpd, have been harrassed to use it to get an
ntp bug reported "because by mail it would get lost". What complicated
an interface it is when you don't know it ! I remember I wanted to
attach a patch and it didn't even get through the first time. I did
it wrong. Blame me if you want, but an interface which need training
for proper use is certainly not for casual end users.

Also, it's very annoying to have to create an account somewhere, leaving
there one of the passwords you use on many other sites, just to help a
random developer fix a bug in his code. You quickly wonder if someone
else will report it and have more patience.

Another recent example: a coworker recently told me he installed the
latest beta from ubuntu, and that he had some problems with his WIFI
randomly hanging. I asked him if he filed a bug, he replied me "no,
it's too much boring, I'm not the only one with this hardware, others
have certainly already done it". When the release went out, he insisted
telling me he was right not filing the bug because indeed it was fixed !

We must accept that end users :
1) do not like creating accounts (remember or divulgate passwords,
and risk of getting spam)

2) do not know how to classify their problem, and are not even
sure it's a real bug. On the first page, when uncertain they
would probably click "Other". Adding doubt in the reporter's
mind is counter-productive as it will refrain him from being
precise about what he did to get the problem.

3) are not familiar with our vocabulary :
- "Tree" : mainline? mm? mjb? ac? what's that ?
- "Component" : Configuration? LSM? Modules? Other?

=> finally, I'm not sure I had to click "Other" in the first place,
I want to choose something else, I click "Back" and I get back
to the login page! Bye bye.

Also :
"No binary modules - NVIDIA users this means YOU!"
=> about half the reporters will wonder if they should stop here
or not. Most ...

To: Willy Tarreau <w@...>
Cc: Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 11:25 am

I'll agree with what Willy wrote here, Bugzilla is a pain to use, you
can't just dump an email into it and have it captured. I think we
should be looking at something more like 'WebRT' which is an *issue*
tracker software. But that too might be too heavy weight and too
noisy as well.

And suddenly it would starting putting ticket numbers onto all the
non-problem conversations we have here on lkml as well, which I'm not
sure we really want.

Does it mean that we needs to have a 'bugs@kernel.org' email for
people to report bugs/problems/issues, while lkml remains (but is
copied for all 'bugs@kernel.org' emails) primarily the developer point
of contact?

The question to me really revolves around how do you automate the
process in a transparent manner so that people don't have to change
much how they interact with lkml.

John
--

To: John Stoffel <john@...>
Cc: Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 3:04 pm

I think the more important questions are:
- Are there people who know how to get reports to developers?
(We can't train all potential reporters to get reports right.)
- Are there enough developers to work towards bug fixes?
The technical means to capture reports, let alone the bugtracking tools,
are of secondary importance.
--
Stefan Richter
-=====-==--- ---= --===
http://arcgraph.de/sr/
--

To: Stefan Richter <stefanr@...>
Cc: John Stoffel <john@...>, Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 3:59 pm

>>>>> "Stefan" == Stefan Richter <stefanr@s5r6.in-berlin.de> writes:

Stefan> I think the more important questions are:
Stefan> - Are there people who know how to get reports to developers?
Stefan> (We can't train all potential reporters to get reports right.)

Sure, but telling them "email to helpme@vger.kernel.org" isn't tough.

Stefan> - Are there enough developers to work towards bug fixes?

Nope, never enough.

Stefan> The technical means to capture reports, let alone the
Stefan> bugtracking tools, are of secondary importance.

The technical means are secondary, it's the human factors of how bug
reports are captured which is of prime concern. It's obvious people
aren't happy with bugzilla. Most, but not all, are happy with LKML,
but a dumb mailing list isn't ideal for tracking bug reports.

As a SysAdmin, I love being able to tell me user 'just email help' to
open a ticket. As a SysAdmin, I can then interact with the tickets
which are created and managed using 'RT' (http://www.fsck.org/rt) from
inside my email client without having to think about it too much.

Again, transparency to the end-user AND tothe developers is key. If
neither finds if easy to use, it won't be. So we need to come up with
a work flow which works with us (read you, I just read and contribute
bug reports myself... :-) and for us. But which never gets in the way
if at all possible.

John

--

To: Willy Tarreau <w@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 3:58 pm

Noone knows how many thousand bug reports have never reached lkml
since majordomo silently dropped them.

What exactly is the problem with attaching files?

If you already lack patience at this point, would you be willing to

Send _one_ email to lkml and you'll get forever spam to this address.

If the bug ends at Other/Other that's not a problem - this usually gets

Works fine here.

If majordomo didn't drop it.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

--

To: Adrian Bunk <bunk@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 5:08 pm

[Empty message]
To: Willy Tarreau <w@...>
Cc: Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 4:50 pm

Personally, I've lost count of the number of times I've *not* reported a
bug/oops just because of the "NVidia users this means you" statement, only
to have the exact same issue reported several weeks/months later by somebody

Exactly.

To: <Valdis.Kletnieks@...>
Cc: Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 5:37 pm

If you can reproduce a bug reliably, you can reproduce it without the
nvidia module loaded.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--

To: Matthew Wilcox <matthew@...>
Cc: Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 7:04 pm

Theoretically, at least. Sometimes, in the real world, other constraints
enter into it...

You're welcome to stop by and figure out why (I've sunk multiple tens of hours
into trying already) the NVidia binary driver can handle the Gateway monitor
hanging off my docking station and the open-source one selects a non-functional
sync rate, or pay for a replacement (management doesn't want to shell out $$ to
replace a monitor that works just fine when the manufacturer's drivers/config
are installed - surprising, that). Or I could get behind the docking station,
unplug the ergonomic keyboard, the power brick, the trackball, and the cat-5,
set the laptop up on the desk, find some way to arrange the laptop, cat-5,
trackball, power brick, and keyboard so things are reasonable to use for a long
day of typing, and wait for it to *maybe* happen again, and then put everything
back when it's safe to use the monitor again. Oh, and the open source driver
has some *horrid* color-banding issues on the laptop's LCD, as well, you're
welcome to fix those while you're at it.. ;)

And no, I'm *not* just winging it with the laptop for the day - I have enough
carpal tunnel issues *with* an ergonomic keyboard and a trackball, I'm not
going to press my luck with the onboard keyboard and mouse.

Or I can just say "screw it" and not bother reporting any bugs that aren't
easily repeatable before GDM gets started.

Guess which one ends up happening in the real world?

To: <Valdis.Kletnieks@...>
Cc: Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 7:19 pm

So you're saying that you can't find reliable ways to reproduce problems
on demand? Those are some of the lower quality bug reports, so I don't
think we're losing much by having you not report them.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--

To: Matthew Wilcox <matthew@...>
Cc: Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Wednesday, January 9, 2008 - 12:03 am

And in the next e-mail in my lkml folder we see:

Are you saying that we're not losing much if Parag says "screw it" and
doesn't report the problem?

To: Matthew Wilcox <matthew@...>
Cc: Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Wednesday, January 9, 2008 - 12:01 am

I'm sure that *everybody* on this list would *love* to know how you find
a reliable way to reproduce all the bugs that start off with "after X days of
uptime". But when you're chasing what might be a race condition with a
very small timing hole, you may need an event to happen several million times
before the accumulated chance of hitting it becomes appreciable.

To: <Valdis.Kletnieks@...>
Cc: Matthew Wilcox <matthew@...>, Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Al Viro <viro@...>
Date: Wednesday, January 9, 2008 - 12:10 am

I must say that the number of bugs which actually go away when the user
stops using nvidia/fglrx/ndiswrapper/etc is a small minority.

And you can usually tell beforehand too: if the user reports bad_page
warnings or pte table scroggage or whatever and they're using nvidia I just
hit 'd'. But people who think that removing the nvidia driver will
magically fix that khubd-got-stuck-in-D-state bug are urinating up an
incline.

Facts:

- lots of people use nvidia/etc

- most bugs they report aren't caused by nvidia/etc

- we need lots of testers

draw you own conclusions.
--

To: Andrew Morton <akpm@...>
Cc: <Valdis.Kletnieks@...>, Matthew Wilcox <matthew@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Al Viro <viro@...>
Date: Wednesday, January 9, 2008 - 2:03 am

On Tue, Jan 08, 2008 at 08:10:42PM -0800, Andrew Morton wrote:

Thanks Andrew for this demonstration. At least now I know I'm not the only
one to think that. And no, I do not have any nvidia/etc. It's just that I
value their users' reports as much as the other ones just because otherwise
we would only track some elite's bugs, thus reducing the amount of information
we have to understand the circumstances under which it happens.

Cheers,
Willy

--

To: Matthew Wilcox <matthew@...>
Cc: <Valdis.Kletnieks@...>, Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Tuesday, January 8, 2008 - 12:47 pm

Or those are the more difficult problems.
--
Stefan Richter
-=====-==--- ---= -=---
http://arcgraph.de/sr/
--

To: Stefan Richter <stefanr@...>
Cc: Matthew Wilcox <matthew@...>, <Valdis.Kletnieks@...>, Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Tuesday, January 8, 2008 - 1:11 pm

Indeed. If it's some race condition, or dependent on memory pressure at
just the right time, or a use-after-free that corrupts some memory that
normally nobody will even notice (it's freed, after all, and not
necessarily re-allocated), reproduction can be really very hard.

It's happily not exactly *common*, but it's certainly not unheard of
either, when you need to run some specific workload for hours to trigger
the bug - and then when it doesn't happen, you have to ask yourself: "was
I just lunky, punk?"

Some of those things also go away magically between kernel versions or
subtly different configurations. A use-after-free problem might be obvious
in one config, but then another configuration might change the size of a
structure, and suddenly the two kmalloc's that used to be in the same slab
(and made the problem more visible) end up in different slabs, and now you
suddenly cannot reproduce it with that particular load at all any more!

These things *are* fairly rare (most bugs by _far_ are of the trivial
stupid kind), but some of those things can stay around for a long time,
and it can take months of different people reporting similar problems
until somebody finally puts two and two together and sees the pattern.

When we get a good bug-reporter that is willing and able to reproduce and
test kernels, that's wonderful, but when we get some "background noise" of
bad bug-reports, that's usually good too - even if it's good only in the
long run (ie sometimes we just have to accept that the bug-report didn't
contain enough information for us to really do anythign about it, and just
let it be - and hope that future events will clarify things)

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: Stefan Richter <stefanr@...>, Matthew Wilcox <matthew@...>, <Valdis.Kletnieks@...>, Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Tuesday, January 8, 2008 - 4:01 pm

one common pattern i've noticed is bug dependency. In some areas we need
to fix a series of 2-3 increasingly less trivial bugs to get enough test
exposure, tester confidence and developer attention to trigger (and fix)
the _truly_ bad bugs.

That's why agressive regression elimination (and prevention) is so
important IMHO - trivial regressions can totally block testing of
certain areas of code, and with an agressive 90 days release schedule
every day counts.

Ingo
--

To: <Valdis.Kletnieks@...>
Cc: Willy Tarreau <w@...>, Adrian Bunk <bunk@...>, James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Monday, January 7, 2008 - 5:31 pm

And I've lost count of the number of bugs reported that turned out to be
the Nvidia modules.

Alan
--

To: Willy Tarreau <w@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 6:25 pm

If someone works in a company where the default MUA setting is to also
attach the text as HTML and to add a vCard to all emails people might
try once to submit their bug report, not notice that it didn't arrive

People reporting bugs together with a patch to fix it are not the usual

This depends on how people answer in Bugzilla.

But an advantage of Bugzilla is that each email contains a link to the

If you report a regression in the kernel and are not willing to bisect
the probability of the bug being resolved becomes _much_ smaller.

Partially due to this requiring much more developer time, but also
partially due to the fact that many regressions are undebuggable without

People tend to report bugs if and only they have no other choice (like
some workaround).

So when they report a bug they really need a fix for their problem.

And have you ever worked in a company that pays a seven digit amount of
Euros each year to Oracle for licences and support for their database?
I have. It's not that spending some man days on debugging or working
around one of the many regressions in the POS they ship to their paying
costumers was unusual.

But you might need the new release e.g. because the older release no
longer has security support or has another bug that is fixed in the
latest release, so your boss has no choice than assigning you for
as long as required at helping Oracle support to figure out what they

It's good for finding what caused a regression.

If a user doesn't want to spend some time helping to find a problem he
experiences there's nothing we can do.

Where we can and should improve is to no longer scare people who do
report bugs and who are willing to spend some time on helping to debug

That's neither our fault nor our problem.

Our problem are the people who whine because bugs they actually

When the expensive part of bug reporting is pasting the bug report
somewhere the submitter most likely hasn't spent enough time on writing
a...

To: James Bottomley <James.Bottomley@...>
Cc: Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 12:12 pm

Huh? The bugzilla just tracked a bug reported to lkml. The very
description of the bugzilla says:

Subject : v2.6.24-rc2-409-g9418d5d: attempt to access beyond end of device
Submitter : Thomas Meyer <thomas@m3y3r.de>
References : http://lkml.org/lkml/2007/11/13/250

so no, it was evidently not "hidden in bugzilla for ages" - all the

again, this bugzilla entry originated from lkml. I did ping the bugzilla
because i saw that the suspected commit's author was Cc:-ed already. Why
should every bug reporter and debugger be fully aware of the absolutely
SILLY little details and preferences of maintainers about how and whom
to report bugs?

YOU made the workflow fragile in the first place, by going away to
linux-scsi and ignoring lkml reports. Or in your own words, in the
bugzilla, comment #9:

http://bugzilla.kernel.org/show_bug.cgi?id=9370#c9

Reply-To: James.Bottomley@HansenPartnership.com

[...]
Erm, actually no ... this is the first I've heard of it, except as a
passing question from matthew. It's usually safe to assume if it's
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
not on linux-scsi I haven't seen it.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

that's fundamentally flawed. For testers there should be only one,
simple as possible rule:

"if you have a problem with the Linux kernel, then report it to lkml"

[ or report it to your distro or bugzilla.kernel.org, where it will be
propagated towards lkml by others. ]

not to "report it to one of the 100+ lists listed here - good luck
getting it right":

L: accessrunner-general@lists.sourceforge.net
L: acpi4asus-user@lists.sourceforge.net
L: alsa-devel@alsa-project.org (subscribers-only)
L: atl1-devel@lists.sourceforge.net
L: autofs@linux.kernel.org
L: blinux-list@redhat.com
L: bluesmoke-devel@lists.sourceforge.net
L: bluez-devel@lists.sf.net
L: bonding-devel@lists.sourceforge.net
L: bridge@lists.linux-foundation.org
L: cbe-oss-dev@ozl...

To: Ingo Molnar <mingo@...>
Cc: Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 1:10 pm

... and your original accusation was "this bug was obscure and hidden on
linux-scsi@vger.kernel.org for _months_" which I was pointing out wasn't
true.

Even the original lkml report was obscured by sweeping the report into
bugzilla and forgetting about it, so in fact, no action happened, even
on lkml.

The history is that I was made aware of the bug on 18 Dec. I suggested
then it was a pktcdvd problem and actually cc'd the wrong maintainer ...
again an error which is compounded by following this single person
workflow bugzilla requires. I also told you in no uncertain terms that
it wasn't caused by the commit you were trying to blame, but that didn't
stop the crusade to affix blame and close the bug without doing proper
root cause analysis.

Can we stop it with the recriminations and blame shifting now. The
problem we need to solve is fixing our broken bug resolution workflow.

My suggestion (for actually the third time in various fora) is:

to get the best of both worlds, file a bugzilla and note the
bugid. Then email a complete report to the relevant list, but
add [BUG <bugid>] to the subject line and cc
bugme-daemon@bugzilla.kernel.org If you do this, bugzilla will
keep track of the entire discussion as it progresses and allow
those who track bugs through bugzilla to get a pretty accurate
idea of the status. You should never need to touch bugzilla
again once the initial bug report is filed: all future
information flow is via the mailing lists.

I don't give a toss what you recommend the default list to be ... use
lkml if you wish. There are a lot of people who will vector it back on
to linux-scsi and keep lkml in the cc.

I also think that bug reports need to be complete, so copy the
information from the mailing list, don't just give a URL ... one of the
other enforced breaks in workflow is that the URL in the original
bugzilla wasn't working when we all tried to look at the or...

To: James Bottomley <James.Bottomley@...>
Cc: Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Tuesday, January 8, 2008 - 12:55 pm

you are right - let me rephrase it as: "this issue was mainly hidden due
to the unhealthy ping-pong between lkml (which you said you didnt read),

all the "action" already happened on the first day of reporting the bug.
(the wrong commit was identified, but that's besides the point - it all
sat inactive after that point. I pinged the bugzilla to get the lkml
discussion active again, not to debug it there.)

what "blame shifting" ???

all i'm worried about here is the long latency for a bugfix which very
apparently (to me) happened due to the isolation of linux-scsi and the
resulting bug processing inefficiencies. Bugs happen and nobody is to be
"blamed" for the bug itself - but the bug processing flow was broken and
i've pointed that out. (If you see similar cases for code i maintain,
and if you can pinpoint the reason why you think it happened and how to
improve that, then please point it out to me as well.)

Ingo
--

To: James Bottomley <James.Bottomley@...>
Cc: Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 11:45 am

The problem is that mailing lists are far too often equivalent to
/dev/null for many bug reports.

Tracking e.g. helps with not missing regressions and getting more of
them fixed.

And another of the advantages of using Bugzilla is that it gives us
numbers how bad we are in terms of introducing regressions and having
unfixed bugs, so developers are no longer able to tell we didn't have a

Bugzilla for tracking and mailing lists for discussing are not mutually
exclusive.

What about asking the Bugzilla admins to set the default owner of new
SCSI bugs to linux-scsi@vger.kernel.org?

This way all SCSI bugs submitted in Bugzilla will automatically be

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

--

To: Adrian Bunk <bunk@...>
Cc: Ingo Molnar <mingo@...>, Peter Osterlund <petero2@...>, Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 12:00 pm

I didn't ever say they were. What I said was we needed the work flow on

Well, I've only been asking for this for the last two years. If you can
actually make it happen, I'd be eternally grateful.

James

--

To: Peter Osterlund <petero2@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Saturday, January 5, 2008 - 11:43 pm

Hmm.. I wouldn't say that's the correct way to fix it. The thing is, if
somebody has explicitly set the size of the device, than that *is* the
size.

The kernel should do what it is told, and it very much on purpose does the
size probing only on the first open: exactly because other open fd's in
progress may be there explicitly to fix up something, or may simply depend
on some size it already knew about (ie we don't want to change the size
behind its back).

And in particular, setting the size with bd_set_size also sets the
block-size, and while most things migt react fairly well to the pure
*size* changing, they will react very badly indeed to the blocksize
changing!

So no, doing a "bd_set_size()" in any but the outermost opener simply
isn't acceptable. It would cause serious problems and total chaos for the
block cache (we do not handle aliasing of multiple different blocksizes).
So we are very careful indeed to only call bd_set_size() when bd_openers
is zero.

part of the problem here seems to be that the "media change" notification
that /dev/scd0 hopefully handled correctly and caused the "set_capacity()"
hass not made it to the i_size thing (that the block device layer checks).

So the way things are *supposed* to work is that the media-change function
("revalidate_disk()") should have triggered as part of the media change,
and that *should* have already done the set_capacity(), and that in turn
is the thing that should do it all (sets the disk capacity *without*
changing the blocksize!)

But since "set_capacity()" doesn't actually change "i_size", only
dev->capacity (which is correct - there may be many inodes associated with
one device), we actually ended up having this subtle dependency on
calling bd_set_size() (which we can *only* do on the first open, due to
the blocksize issues).

Which means that media-change won't fix these things like it is supposed
to. And I suspect we've had this bug (well, it *appears* to be a bug) for
a ...

To: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>
Cc: Linus Torvalds <torvalds@...>, Peter Osterlund <petero2@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 12:19 pm

James Matthew.
I have a (very) wild guess at what maybe have changed with the cmnd->done
patch:

Do you remember the effective loop in scsi_lib:scsi_end_request() where
if bufflen was smaller then original request size, do to truncation
of bufflen by ULD, then the remaining of the request is re-queued again
as a new scsi-command. Well I think that the old system would call
cmnd->done for every iteration, and the new system, since the done is
called by the block-Q, does not see the resubmit of the new command.

I have not followed all code path of the matter, but I know that sr does
alters bufflen in some cases.
All this is not a bug in itself, but it is a change in behavior that might
cause the current sr hack to fail.

Boaz
--

To: Boaz Harrosh <bharrosh@...>
Cc: Matthew Wilcox <matthew@...>, Linus Torvalds <torvalds@...>, Peter Osterlund <petero2@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 12:47 pm

Actually, this is cmnd->done, not req->done we're removing.
cmnd->done() isn't seen by the block layer; all its uses are in the SCSI

It's a good thought. You're right, the old code calls done for every
iteration. However, it calls it in scsi_finish_completion. The new
code will actually call drv->done() in that same spot for every
iteration as well.

The requeue is done via scsi_requeue_request which calls
blk_requeue_request, which resets the START flag and sends the command
right back through the system (including the prep function because
scsi_requeue_request unpreps the command), so even with the new code
we'll go back through all the same done paths.

James

--

To: Linus Torvalds <torvalds@...>
Cc: Peter Osterlund <petero2@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 10:42 am

Actually, I think that code is fine ... the block size shouldn't change
across positive values of bd_openers. I think the true fix is below:
pktcdvd shouldn't be mucking with the size of the underlying CD/DVD ...
it can change its own layered device capacity, certainly, but it
shouldn't be mucking with the capacity that was already set in the
sr_probe routine. I'm in two minds as to whether the set capacity on
the underlying device should be removed as well ... I think it should,
but it is harmless as the sr_probe will also reset it.

However, I'll defer to what Al wants to do.

James

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 3535ef8..10f3692 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2344,7 +2344,6 @@ static int pkt_open_dev(struct pktcdvd_device *pd, int write)

set_capacity(pd->disk, lba << 2);
set_capacity(pd->bdev->bd_disk, lba << 2);
- bd_set_size(pd->bdev, (loff_t)lba << 11);

q = bdev_get_queue(pd->bdev);
if (write) {

--

To: James Bottomley <James.Bottomley@...>
Cc: Peter Osterlund <petero2@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 2:14 pm

The block size should indeed not change, and that's why we must *not* do
bd_set_size() there (because that changes the block size too, not just the
size of the device).

But I would argue that if we have had a device invalidation event (ie
media change), then we should indeed change the actual *size* of the block
device as seen by anybody who opens the file again.

And yes, it will affect old openers of the same inode too (since the size
is in the inode, not the file descriptor), but considering that this
really should only happen for media change events, I think that's better
than what we used to have.

Now, we could have made it even more clear by moving the "i_size" setting
into the same "if (dbev->bd_invalidated)" conditional, but the thing is,
we only set bd_invalidated for devices that have minors, so things like
floppies etc that don't have partition support also don't have
bd_invalidated set (in effect, bd_invalidated really *is* just a flag for
the partitioning code, not for disk change in general).

I'm not sure if it should be mucking with the size or not, but it
definitely shouldn't be mucking with the block-size, because that can
indeed cause huge problems.

So removing the bd_set_size() is almost certainly the right thing to do
(because it's always incorrect to do at an "inner" open), and the real
size should have already been set by the regular open.

But this should be tested by somebody who uses the dang thing. My personal
favourite for a real fix would actually be to always make the capacity of
the pktcdvd device match the capacity of the underlying device, ie the
diff would look something like the appended (untested as usual), but that
may be a bit too extensive a change.

But regardless, I think the i_size change makes sense - at least in some
form. It doesn't necessarily have to be the explicit i_size setting: I
also considered just changing the block_dev.c do_open() make bd_set_size()
_unconditionally_, and then move th...

To: James Bottomley <James.Bottomley@...>
Cc: Peter Osterlund <petero2@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 2:44 pm

Hmm. Looking closer, it's probably ok in that case, because it does do a
"bd_claim()" to make sure that it has exclusive access, so while there may
be other openers around, at least those other openers won't be filesystem
mounts or anything that opened with O_EXCL.

So changing the blocksize is probably ok in this case.

That still leaves the question whether pktcdvd *should* muck with the base
device at all, and I'm not at all sure about that. But I'm no longer sure
that the pktcdvd code is necessarily *clearly* broken, now it's more of a
"should it really do that?" thing.

is likely a good thing to do (in conjunction with my patch that made
i_size be "reliable" after an open), but there may be some reason why
pktcdvd really wants to control the size rather than be on the receiving
end of the size.

Peter, this is your decision. Apparently my one-liner fixes the immediate
bug (but it's not really a regression either - I think the i_size issue
has been there since pretty much day #1), and what pktcdvd does is
somewhat less critical an issue?

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: Peter Osterlund <petero2@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 2:54 pm

I think perhaps the true bug lies in the way we handle layered devices
like this. pktcdvd holds the underlying device open, so its refcount
never drops to zero. This is what causes the gendisk/block layer never
to update the sizes, and what lead to pktcdvd doing it instead.

However, what perhaps really needs to happen is that pktcdvd needs to
take over the media change events as well. That way it could see the
disk change and invalidate and reread its own setting of the block size
(and possibly re set the size of the underlying device).

I agree, though, this isn't a regression. It's probably obscure enough
in reproduction to warrant not holding up 2.6.24 --- especially as I
think the true fix will do small perturbations to a lot of subsystems.
If this were a product and I were the release manager, I'd be updating
the release notes with a note about having to break pktcdvd binding
across media changes to work around this bug and a promise to fix it in
the next release.

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Linus Torvalds <torvalds@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 11:01 am

That code was added to fix a similar bug, see:

http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-08/4288.html

Maybe that fix was wrong and should have just set bd_inode->i_size instead
of calling bd_set_size().

--
Peter Osterlund - petero2@telia.com
http://web.telia.com/~u89404340
--

To: Linus Torvalds <torvalds@...>
Cc: Peter Osterlund <petero2@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 10:04 am

It isn't even a secondary effect like I thought. This commit genuinely
didn't have anything to do with the bug, it was purely accidental. It
came about because if you look at the reporter's recipe to reproduce,
which all of us tried without success, it's missing several steps. To
get the bug, these steps must have been done somehow, but I bet by pure
chance they weren't when reverting
6f5391c283d7fdcf24bf40786ea79061919d1e1d which led to wrongly fingering
this commit.

Now, if only someone who understood the mechanics of what the commit was
doing tried to stop you reverting it we could have saved a lot of
trouble ...

James

--

To: Linus Torvalds <torvalds@...>
Cc: James Bottomley <James.Bottomley@...>, Matthew Wilcox <matthew@...>, Ingo Molnar <mingo@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Jens Axboe <jens.axboe@...>, Al Viro <viro@...>
Date: Sunday, January 6, 2008 - 6:17 am

It does fix my scenario, with the trivial fix of adding bdev-> at the
beginning of that line, ie:

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 993f78c..a8ed344 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1191,6 +1191,7 @@ static int do_open(struct block_device *bdev, struct file *file, int for_part)
}
if (bdev->bd_invalidated)
rescan_partitions(bdev->bd_disk, bdev);
+ bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9;
}
}
bdev->bd_openers++;

--
Peter Osterlund - petero2@telia.com
http://web.telia.com/~u89404340
--

To: Matthew Wilcox <matthew@...>
Cc: James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 3:57 pm

Why do you think that REQ_TYPE_BLOCK_PC has anything to do with SG?

It has *nothing* to do with SG, and anybody who uses SG in this day and
age on a block device is just crazy.

The way you do generic SCSI commands (on perfectly normal block device
nodes) is using the SCSI ioctl() interfaces. That's how you are supposed
to do things like burn DVD's or do any kind of special ops.

So REQ_TYPE_BLOCK_PC does quite commonly happen on perfectly regular block
devices, it's how all commands that aren't pure reads or writes done by
the kernel behave.

If you actually use /dev/sg*, you will be using the SG driver, and if you
don't want that to have a ->done callback, then just set "done" to NULL
for sg_driver.

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 4:18 pm

I spoke imprecisely; a raw command one through SG or ioctl is
REQ_TYPE_BLOCK_PC and did not go through sd_done or sr_done before this

Unless you think that we should see different behaviour when using
/dev/sg* and /dev/sd*, the aspect of the patch you criticised was
correct.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--

To: Linus Torvalds <torvalds@...>
Cc: Matthew Wilcox <matthew@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 4:17 pm

I think you misunderstood Matthew here. REQ_TYPE_BLOCK_PC is indeed
used by any kind of SG_IO or similar passthrough no matter where it
originates. And exactly because REQ_TYPE_BLOCK_PC are entirely passthru
the actual driver (sd, sr or sg) is not doing the actual command
completion but it's handled in the scsi layer because it's exactly
the same no matter what driver it came on.

--

To: Christoph Hellwig <hch@...>
Cc: Matthew Wilcox <matthew@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 4:49 pm

You say that, but you then ignore that something *did* change.

Maybe it's not that one suspicious test. Maybe it's somethign else. But
that commit was confirmed to break something, almost two months ago. You
guys seem to be in denial, and saying "it didn't change anything".

And no, waiting for more reporters when one reporter has already narrowed
it down to the exact (smallish) commit, is simply not good. Either you can
fix it by looking at the source, or it gets reverted.

I was hoping somebody in SCSI-land would actually look at the commit and
try to find out what's wrong, instead of all of you apparently trying to
say "nothing is wrong".

I hate the excuses of "but, but, but.. it *should* work". It doesn't. Face
that, *then* you can argue about why.

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: Christoph Hellwig <hch@...>, James Bottomley <James.Bottomley@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Date: Wednesday, January 2, 2008 - 4:53 pm

I've spent about a week of time looking at it over the last couple of
months. I haven't been able to figure it out. That's why I'm calling
for it to be reverted, not because I'm "in denial".

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--

Previous thread: [ANNOUNCE] Userland suspend/hibernation tool v0.8 released by Rafael J. Wysocki on Wednesday, January 2, 2008 - 11:36 am. (1 message)

Next thread: UML woes in 2.6.24-rc6-mm1 by Miklos Szeredi on Wednesday, January 2, 2008 - 1:53 pm. (7 messages)