[ Sent again now with correct CC to linux-usb ]
Hi,
Recently I found a problem with a buggy camera that doesn't mount anymore with
2.6.27 (its memory is available via usb-storage), since commit
04ebd4aee52b06a2c38127d9208546e5b96f3a19
The camera is an Olympus X-840. The original issue comes from the camera
itself: its format program creates a partition with an off by one error,
while the device reports that its memory has 42079 sectors, the partition
table reports also that the only partition on the disk has the size of 42079,
but it fails to account for the first sector in the memory that contains the
partition table, so in the end the partition exceeds the limit of the device
size (42080, first sector plus 42079 from the first partition).
In previous kernels (2.6.26 and before), I still could mount and access the
device (/dev/sdb1), although with the following errors:
sd 6:0:0:0: [sdb] Assuming drive cache: write through
sdb: sdb1
sdb: p1 exceeds device capacity
sd 6:0:0:0: [sdb] Attached SCSI removable disk
sd 6:0:0:0: Attached scsi generic sg2 type 0
usb-storage: device scan complete
attempt to access beyond end of device
sdb: rw=0, want=42080, limit=42079
__ratelimit: 16 messages suppressed
Buffer I/O error on device sdb1, logical block 42078
attempt to access beyond end of device
sdb: rw=0, want=42080, limit=42079
If you note the log snippet above the first notable thing is "p1 exceeds
device capacity", so looking at commit
04ebd4aee52b06a2c38127d9208546e5b96f3a19 it is clear why sdb1 isn't created
anymore.
After formatting the camera is this you get from fdisk (display units in
sectors):
Disk /dev/sdb: 21 MB, 21544448 bytes
6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdb1 * 1 42079 21039+ 1 FAT12
Partition 1 has different physical/logical endings:
...I have to believe that it _is_ desired. Why else would that commit have been merged? Why clear the partition table? Just replace the bogus entry with a This doesn't resemble the original partition table at all. Did you create this strange table or did the camera's firmware change the data That is not a valid procedure. You have to go into Expert mode and set the number of sectors, heads, and tracks first. Most likely you'll want to set them to the same values used by the firmware; it looks like the firmware thinks there are 16 sectors/track, 8 heads, and 329 Adding quirks to alter partition sizes sounds very dangerous. Your best bet is simply to write a valid partition table. Alan Stern --
Several people objected to similar patches when they were posted on LKML in the past, usually for the reason "this breaks my forensics use case." I guess no one objected this time around. http://marc.info/?l=linux-kernel&m=116069718613683&w=2 (I don't have an opinion, I'm just providing collective memory.) -- Bob Copeland %% www.bobcopeland.com --
Well, Herton can object and ask to have the offending commit reverted. Alan Stern --
Clearing partition table seemed easier than fixing values with fdisk, to make 8 heads? it reports 6 from the fdisk output: Yes, but this is breaking other devices, not only this olympus camera. Bogdano reported what could be the same case with his cel. phone (Bogdano can you give more details?), also on IRC today Damien Lallement complained about what looks the same issue of "p* exceeds device capacity" (Damien can you give more info too?). But I'm afraid of how much devices this partition error fatal check now can affect, after all it's not "user friendly" to instruct -- []'s Herton --
No. phys=(328, 5, 16) doesn't mean that the device has 329 tracks, 6 heads, and 16 sectors. It means that the last sector of the partition (as recorded in the table) is track 328, head 5, sector 16. Note that partitions don't have to end on cylinder boundaries. Guessing that the firmware intends there to be 16 sectors per track, and also guessing that the 42079 value is too low by one, we get a total of 42080 / 16 = 2630 as the product of heads and tracks. Since 2630 = 8 * 328.75 this seems to say that the firmware thinks there are approximately 329 tracks and 8 heads. That explains the "328" above. Continuing this reasoning, the physical end (328,5,16) corresponds to sector (328*8*16) + (5*16) + 15 = 42079 (the 15 is because sector numbers start at 1 instead of at 0). Which would be valid if the first sector of the partition was sector 1, or phys=(0,0,2), and if the capacity was 42080. However you didn't tell us what value was stored You should complain to the people who wrote and accepted the troublesome commit. Tell them it caused a regression; that always gets people's attention! Alan Stern --
Ok, thanks for the explanation, indeed you are right, it's about the physical end, I confused things, I'm not used to CHS... The start sector is 1, (0,0,2), so what you said is valid. I attach the dump of first sector of the Ok, here goes a patch that revert the relevant part of the commit and goes back to behaviour of previous kernels, and fixes the regression here. If considered a good solution, can be applied. --- fs/partition/check.c: revert part of commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19 Fix regression introduced by commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19, where kernel changed behaviour making fatal the error when some partition exceeds the limit of the device size. Some buggy devices become inacessible because of errors in their partition table if the error is fatal. This closes http://bugzilla.kernel.org/show_bug.cgi?id=11554 diff --git a/fs/partitions/check.c b/fs/partitions/check.c index 7d6b34e..15c70df 100644 --- a/fs/partitions/check.c +++ b/fs/partitions/check.c @@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev) if (!size) continue; if (from + size > get_capacity(disk)) { - printk(KERN_ERR " %s: p%d exceeds device capacity\n", + printk(KERN_WARNING + " %s: p%d exceeds device capacity\n", disk->disk_name, p); - continue; } res = add_partition(disk, p, from, size, state->parts[p].flags); -- []'s Herton
And I forgot my Signed-off.... -- []'s Herton --
On Fri, Sep 12, 2008 at 4:14 PM, Herton Ronaldo Krzesinski It might make sense to comment here that we intentionally want to add the partition anyway. That might keep someone from re-adding the continue (same patch has already come up at least 3 times). -- Bob Copeland %% www.bobcopeland.com --
Yes, here goes a new version: ___ fs/partition/check.c: revert part of commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19 Fix regression introduced by commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19, where kernel changed behaviour making fatal the error when some partition exceeds the limit of the device size. Some buggy devices become inacessible because of errors in their partition table if the error is fatal. This closes http://bugzilla.kernel.org/show_bug.cgi?id=11554 Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> diff --git a/fs/partitions/check.c b/fs/partitions/check.c index 7d6b34e..2dd346d 100644 --- a/fs/partitions/check.c +++ b/fs/partitions/check.c @@ -499,9 +499,13 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev) if (!size) continue; if (from + size > get_capacity(disk)) { - printk(KERN_ERR " %s: p%d exceeds device capacity\n", + printk(KERN_WARNING + " %s: p%d exceeds device capacity\n", disk->disk_name, p); - continue; + /* note: we don't want to break access to + * devices with buggy partition tables, so we + * don't want to fail here, just go on and + * add partition */ } res = add_partition(disk, p, from, size, state->parts[p].flags); if (res) { -- []'s Herton --
On Fri, 12 Sep 2008 18:07:27 -0300
Well gee. Given a choice, I went and replied to the wrong thread.
Here's what I think:
From: Andrew Morton <akpm@linux-foundation.org>
Herton Krzesinski reports that the error-checking changes in
04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
fs/partition/check.c: check value returned by add_partition") cause his
buggy USB camera to no longer mount. "The camera is an Olympus X-840.
The original issue comes from the camera itself: its format program
creates a partition with an off by one error".
Buggy devices happen. It is better for the kernel to warn and to proceed
with the mount.
Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: Abdel Benamrouche <draconux@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/partitions/check.c | 1 -
1 file changed, 1 deletion(-)
diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
--- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
+++ a/fs/partitions/check.c
@@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
if (from + size > get_capacity(disk)) {
printk(KERN_ERR " %s: p%d exceeds device capacity\n",
disk->disk_name, p);
- continue;
}
res = add_partition(disk, p, from, size, state->parts[p].flags);
if (res) {
_
--
So that now deserves to be KERN_WARN not KERN_ERR, yes? --
On Fri, 12 Sep 2008 16:46:08 -0700 spose so. I'm fairly unenthused about the recent KERN_correctness fad since it went and broke sysrq-T output (you have to set the loglevel beforehand to avoid getting only partial output). --
I have no idea where that came from. Wasn't in the original or On development systems I generally "echo 8 > /proc/sysrq*" to make sure KERN_DEBUG isn't hidden. In this case it's just that I saw flamage a few minutes earlier from someone trying to keep a distro boot from spewing scarey messages for things that were NOT errors. Like ... this message. :) --
On Fri, 12 Sep 2008 16:59:20 -0700
Your email client added it. What I sent:
0008A0: 68 65 63 6B 2E 63 0A 40 40 20 2D 35 34 30 2C 37 >heck.c@@@ -540,7<
0008B0: 20 2B 35 34 30 2C 36 20 40 40 20 69 6E 74 20 72 > +540,6 @@ int r<
0008C0: 65 73 63 61 6E 5F 70 61 72 74 69 74 69 6F 6E 73 >escan_partitions<
0008D0: 28 73 74 72 75 63 74 20 67 65 6E 64 69 73 6B 20 >(struct gendisk <
0008E0: 2A 64 69 0A 20 09 09 69 66 20 28 66 72 6F 6D 20 >*di@ @@if (from <
0008F0: 2B 20 73 69 7A 65 20 3E 20 67 65 74 5F 63 61 70 >+ size > get_cap<
000900: 61 63 69 74 79 28 64 69 73 6B 29 29 20 7B 0A 20 >acity(disk)) {@ <
000910: 09 09 09 70 72 69 6E 74 6B 28 4B 45 52 4E 5F 45 >@@@printk(KERN_E<
000920: 52 52 20 22 20 25 73 3A 20 70 25 64 20 65 78 63 >RR " %s: p%d exc<
000930: 65 65 64 73 20 64 65 76 69 63 65 20 63 61 70 61 >eeds device capa<
000940: 63 69 74 79 5C 6E 22 2C 0A 20 09 09 09 09 64 69 >city\n",@ @@@@di<
000950: 73 6B 2D 3E 64 69 73 6B 5F 6E 61 6D 65 2C 20 70 >sk->disk_name, p<
000960: 29 3B 0A 2D 09 09 09 63 6F 6E 74 69 6E 75 65 3B >);@-@@@continue;<
000970: 0A 20 09 09 7D 0A 20 09 09 72 65 73 20 3D 20 61 >@ @@}@ @@res = a<
your reply:
000C50: A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 >@@@@@@@@@@@@@@@@<
000C60: A0 70 72 69 6E 74 6B 28 4B 45 52 4E 5F 45 52 52 >@printk(KERN_ERR<
000C70: 20 22 20 25 73 3A 20 70 25 64 20 65 78 63 65 65 > " %s: p%d excee<
000C80: 64 73 20 64 65 76 69 63 65 20 63 61 70 61 63 69 >ds device capaci<
000C90: 74 79 5C 6E 22 2C 0A 3E 20 A0 A0 A0 A0 A0 A0 A0 >ty\n",@> @@@@@@@<
000CA0: A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 >@@@@@@@@@@@@@@@@<
000CB0: A0 A0 A0 A0 A0 A0 A0 A0 A0 64 69 73 6B 2D 3E 64 >@@@@@@@@@disk->d<
000CC0: 69 73 6B 5F 6E 61 6D 65 2C 20 70 29 3B 0A 3E 20 >isk_name, p);@> <
000CD0: 2D A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 >-@@@@@@@@@@@@@@@<
000CE0: A0 A0 A0 A0 A0 A0 A0 A0 63 6F 6E 74 69 6E 75 65 ...In addition it deserves to be commented so that somebody doesn't go and change it yet again. Alan Stern --
On Sat, Sep 13, 2008 at 1:36 AM, Andrew Morton I was happy to see the original fix, as if causes real problems for userspace, that the kernel creates invalid block devices with a size that exceeds the physical disk. So, if we can not make that partition to skip, like original patch did, because of broken hardware we don't want to break, can we make it at least do the obvious thing, and limit the partition with the broken entry to the size of the underlying hardware. So that the kernel does no longer pretend to have devices of a size which the hardware does not have. It breaks all sort of userspace tools which read the "size" file in sysfs, or do BLKGETSIZE and we get, if we are lucky, only: attempt to access beyond end of device sda: rw=0, want=1953535936, limit=976773168 in other cases it might cause corruption, or lead mkfs to create devices which will fail when they get used. Thanks, Kay --
From: Kay Sievers <kay.sievers@vrfy.org>
Subject: block: sanitize invalid partition table entries
We currently follow blindly what the partition table lies about the
disk, and let the kernel create block devices which can not be accessed.
Trying to identify the device leads to kernel logs full of:
sdb: rw=0, want=73392, limit=28800
attempt to access beyond end of device
Here is an example of a broken partition table, where sda2 starts
behind the end of the disk, and sdb3 is larger than the entire disk:
Disk /dev/sdb: 14 MB, 14745600 bytes
1 heads, 29 sectors/track, 993 cylinders, total 28800 sectors
Device Boot Start End Blocks Id System
/dev/sdb1 29 7800 3886 83 Linux
/dev/sdb2 37801 45601 3900+ 83 Linux
/dev/sdb3 15602 73402 28900+ 83 Linux
/dev/sdb4 23403 28796 2697 83 Linux
The kernel creates these completely invalid devices, which can not be
accessed, or may lead to other unpredictable failures:
grep . /sys/class/block/sdb*/{start,size}
/sys/class/block/sdb/size:28800
/sys/class/block/sdb1/start:29
/sys/class/block/sdb1/size:7772
/sys/class/block/sdb2/start:37801
/sys/class/block/sdb2/size:7801
/sys/class/block/sdb3/start:15602
/sys/class/block/sdb3/size:57801
/sys/class/block/sdb4/start:23403
/sys/class/block/sdb4/size:5394
With this patch, we ignore partitions which start behind the end of the disk,
and limit partitions to the end of the disk if they pretend to be larger:
grep . /sys/class/block/sdb*/{start,size}
/sys/class/block/sdb/size:28800
/sys/class/block/sdb1/start:29
/sys/class/block/sdb1/size:7772
/sys/class/block/sdb3/start:15602
/sys/class/block/sdb3/size:13198
/sys/class/block/sdb4/start:23403
/sys/class/block/sdb4/size:5394
These warnings are printed to the kernel log:
sdb: p2 ignored, start 37801 is behind the end of the disk
sdb: p3 size 57801 limited ...Kay, I added this variant. -- Jens Axboe --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano |
