On Wed, 2008-01-02 at 12:45 -0800, Linus Torvalds wrote:No. Just because a bug appears when a particular piece of code is in and disappears when it is reverted doesn't automatically equate to the code in question being buggy. We seem to get a lot of these second order effect type things; sometimes its just a problem caused by a particular routine compiling to a longer byte sequence and pushing something else out. Do give us credit for thinking "functional equivalency problem" when this bug report first came in ... I've had myself and several other people over the code. If there's an inequivalency somewhere I'm damned if I can spot it. The most promising other failure mode we tried was request type changes over the lifetime of the command, but we can't make that one fly either. Look at the taxonomy of the bug. This is the form of the error: buffer I/O error on device sr0, logical block 20304 attempt to access beyond end of device sr0: rw=0, want=81224, limit=40944 The last limit is the most suggestive, that comes straight from bdev->bd_inode->i_size>>9 and is supposed to be the size of the block device in 512 byte blocks. For a 4.7GB DVD, it's a little small. Nothing in the sr code sets this directly (although it does come from get_blkdev() for the first opener). pktcdvd does set it, though ... and probably wrongly if the drive in question isn't UDF formatted. I have also tried on many occasions to reproduce this without success (there's a simple recipe in the bug report, but it just doesn't work for me). My setup is with an aic94xxx->expander->SATAPI DVD, whereas the original reporter is ata_piix -> PATA DVD, so it could be stack differences---but again, if it is, the bug itself can't be a simple one in the generic code. The fact that there are no other reporters of problems like this also indicate to me that it isn't a widespread problem (again, pointing to something more specific in the setup of the reporter). The unreproduceability coupled with the lack of other error reports leads me to be about 90% confident the problem isn't in the code you want reverted. However, I grant that we cannot seem to find the root cause, and reverting the code will cause our bug metrics to go down by one (at least until something else causes it to reappear), so it is the corporate thing to do, I suppose. I'll send in a reversion with the sr_mod removal bug fix. James --
| Linus Torvalds | Linux 2.6.27-rc8 |
| Greg KH | [GIT PATCH] driver core patches against 2.6.24 |
| Linus Torvalds | Linux 2.6.20-rc6 |
| Mike Snitzer | Re: Distributed storage. |
git: | |
| Gerrit Renker | [PATCH 03/37] dccp: List management for new feature negotiation |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| David Miller | [GIT]: Networking |
| Herbert Xu | Re: Kernel oops with 2.6.26, padlock and ipsec: probably problem with fpu state ch... |
