Re: cd/dvd inaccessible in 2.6.24-rc2

To: <linux-kernel@...>
Date: Saturday, November 10, 2007 - 12:27 am

Hello,

Motherboard: Gigabyte GA-P35-DS4 (rev. 1.1)
Chipset: Intel P35 + ICH9R
PATA port runs off JMicron controller
CD/DVD Device: BENQ DW1640 16X

I cannot access my DVD burner under 2.6.24-rc2; I have no problems under
2.6.23. The drive is detected OK and everything looks fine, but as
soon as I try to use it, errors like this occur:

ata9.00: qc timeout (cmd 0xa0)
ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata9.00: cmd a0/00:00:00:02:00/00:00:00:00:00/a0 tag 0 cdb 0x5a data 2 in
         res 51/54:03:00:02:00/00:00:00:00:00/a0 Emask 0x5 (timeout)
ata9.00: status: { DRDY ERR }
ata9: soft resetting link
ata9.00: revalidation failed (errno=-2)
ata9: failed to recover some devices, retrying in 5 secs
ata9: soft resetting link
ata9.00: qc timeout (cmd 0xa1)
ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata9.00: revalidation failed (errno=-5)
ata9: failed to recover some devices, retrying in 5 secs
ata9: soft resetting link
ata9.00: qc timeout (cmd 0xa1)
ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata9.00: revalidation failed (errno=-5)
ata9.00: disabled
ata9: soft resetting link
ata9: EH complete

The drive locks up after the first attempt to access it, and the
tray cannot be ejected.

dmesg output

http://paste.ubuntu-nl.org/43948/

kernel .config

http://paste.ubuntu-nl.org/43944/

lspci -vvvxxxx

http://paste.ubuntu-nl.org/43950/

Regards,

Will Trives

-

To: Will Trives <will@...>
Cc: <linux-kernel@...>, IDE/ATA development list <linux-ide@...>
Date: Saturday, November 10, 2007 - 7:35 pm

Is 2.6.24-rc1 also broken?

Jeff

-

To: Jeff Garzik <jeff@...>, <linux-kernel@...>
Date: Sunday, November 11, 2007 - 12:40 am

Hello Jeff,

Yes it is. I'll keep testing with previous kernels.

Dmesg does look different between 2.6.23 and 2.6.24-rc2.

This is 2.6.23:

scsi8 : pata_jmicron
scsi9 : pata_jmicron
ata9: PATA max UDMA/100 cmd 0x000000000001c000 ctl 0x000000000001c102 bmdma 0x000000000001c400 irq 17
ata10: PATA max UDMA/100 cmd 0x000000000001c200 ctl 0x000000000001c302 bmdma 0x000000000001c408 irq 17
ata9.00: ATAPI: BENQ DVD DD DW1640, BSRB, max UDMA/33
ata9.00: configured for UDMA/33
ata9: EH pending after completion, repeating EH (cnt=4)
scsi 8:0:0:0: CD-ROM BENQ DVD DD DW1640 BSRB PQ: 0 ANSI: 5

This is 2.6.24-rc2:

scsi8 : pata_jmicron
scsi9 : pata_jmicron
ata9: PATA max UDMA/100 cmd 0xc000 ctl 0xc100 bmdma 0xc400 irq 17
ata10: PATA max UDMA/100 cmd 0xc200 ctl 0xc300 bmdma 0xc408 irq 17
ata9.00: ATAPI: BENQ DVD DD DW1640, BSRB, max UDMA/33
ata9.00: configured for UDMA/33
scsi 8:0:0:0: CD-ROM BENQ DVD DD DW1640 BSRB PQ: 0 ANSI: 5
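
One way to make the difference between the two logs explicit is a plain line diff. The following is only a sketch using Python's standard `difflib`; the excerpts are the ata9 lines copied from the two logs above, and the variable names are made up for illustration.

```python
import difflib

# ata9 lines copied from the 2.6.23 log above
old = """\
ata9: PATA max UDMA/100 cmd 0x000000000001c000 ctl 0x000000000001c102 bmdma 0x000000000001c400 irq 17
ata9.00: ATAPI: BENQ DVD DD DW1640, BSRB, max UDMA/33
ata9.00: configured for UDMA/33
ata9: EH pending after completion, repeating EH (cnt=4)
""".splitlines()

# ata9 lines copied from the 2.6.24-rc2 log above
new = """\
ata9: PATA max UDMA/100 cmd 0xc000 ctl 0xc100 bmdma 0xc400 irq 17
ata9.00: ATAPI: BENQ DVD DD DW1640, BSRB, max UDMA/33
ata9.00: configured for UDMA/33
""".splitlines()

diff = list(difflib.unified_diff(old, new, "2.6.23", "2.6.24-rc2", lineterm=""))

# Keep only changed lines, skipping the "---"/"+++" file headers.
removed = [l[1:] for l in diff if l.startswith("-") and not l.startswith("---")]
added = [l[1:] for l in diff if l.startswith("+") and not l.startswith("+++")]

for line in removed:
    print("only in 2.6.23:    ", line)
for line in added:
    print("only in 2.6.24-rc2:", line)
```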

Regards,

Will Trives

-

To: Jeff Garzik <jeff@...>
Cc: <linux-kernel@...>
Date: Monday, November 12, 2007 - 2:00 am

Hello,

My mistake: it looks like the issue affects writing only.

Mounting a standard DVD works fine with 2.6.24-rc2-git2.

As soon as I try to use wodim or load k3b, the drive locks up.

The issue was still there with 2.6.23-git15; I will continue testing
with earlier ones.

Regards,

Will Trives

-

To: Will Trives <will@...>
Cc: Jeff Garzik <jeff@...>, <linux-kernel@...>
Date: Monday, November 12, 2007 - 11:23 am

I think I now know what's wrong with all these ATAPI issues. I'm
working on a generic solution. Please stand by a bit.

Thanks.

--
tejun
-

To: Will Trives <will@...>
Cc: <linux-kernel@...>, <linux-ide@...>, Rafael J. Wysocki <rjw@...>
Date: Saturday, November 10, 2007 - 6:49 pm

Thanks for letting us know.

Added linux-ide Cc.

Rafael, it looks like another regression.
-

To: Andrew Morton <akpm@...>
Cc: Will Trives <will@...>, <linux-kernel@...>, <linux-ide@...>, Rafael J. Wysocki <rjw@...>
Date: Saturday, November 10, 2007 - 7:05 pm

On Sat, 10 Nov 2007 14:49:23 -0800

Could be an IRQ/ACPI regression, and in fact to me it looks more like
that than an IDE one. Probably worth trying the various IRQ routing
options and seeing if they help.

Alan
-

To: Alan Cox <alan@...>
Cc: Will Trives <will@...>, <linux-kernel@...>, <linux-ide@...>, Rafael J. Wysocki <rjw@...>, <linux-acpi@...>
Date: Saturday, November 10, 2007 - 8:13 pm

Yup.

Please, if you have time, could you bisect it down to the offending commit?

There's info at http://www.kernel.org/doc/local/git-quick.html which should
help.

Thanks.
-

To: Alan Cox <alan@...>, Tejun Heo <htejun@...>
Cc: Andrew Morton <akpm@...>, Will Trives <will@...>, <linux-kernel@...>, <linux-ide@...>, Rafael J. Wysocki <rjw@...>
Date: Saturday, November 10, 2007 - 7:28 pm

Agreed, though the output is indeed signalling an error... IMO, upon
timeout, the EH should handle the error if the device is signalling one,
rather than just going ahead and resetting the device.

It's similar to the case where ATA devices on PCI SFF controllers signal
a DMA error via timeout, where EH must inspect the BMDMA Status register
to determine if it's a DMA error signalled by hardware, or something that
requires additional autopsy.
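
The BMDMA check described above could be sketched roughly as follows. This is an illustration in Python rather than the kernel's C, not libata's actual code path: the bit values are those of the bus-master IDE status register (libata names them ATA_DMA_ACTIVE, ATA_DMA_ERR and ATA_DMA_INTR), while `classify_timeout` and its return strings are hypothetical names.

```python
# Bus-master IDE status register bits (same values as libata's constants).
ATA_DMA_ACTIVE = 0x01  # DMA engine still running
ATA_DMA_ERR = 0x02     # controller reported a DMA error
ATA_DMA_INTR = 0x04    # device raised an interrupt


def classify_timeout(bmdma_status):
    """Hypothetical helper: decide what a timed-out command looks like
    from the BMDMA Status byte alone."""
    if bmdma_status & ATA_DMA_ERR:
        # Hardware flagged the error itself; the timeout is just how it surfaced.
        return "dma-error"
    # No error bit set: something else went wrong and needs a closer look.
    return "needs-autopsy"


for value in (0x02, 0x04, 0x06):
    print(hex(value), classify_timeout(value))
```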

EH for ATAPI is quite different from EH for ATA, so there may be some
areas where we don't handle things the right way for ATAPI.

Decoding the error message we have:

    cdb 0x5a ==
        MODE SENSE(10)
    status 0x51 ==
        DRDY
        command-specific flag (aka SERV, in !overlap case)
        CHK (check condition, aka error)
    error 0x54 ==
        ABRT (command aborted or command parameter invalid)
        sense key 0x5 (illegal request)
    ireason 0x3 ==
        the hardcoded values (bits 0 and 1) remain hardcoded, all good
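
The decoding above can be reproduced mechanically. Below is a minimal Python sketch, not libata code. It assumes the `res` line prints the result taskfile with the Status byte first, the Error byte second and the interrupt reason third, which matches the values decoded above. The helper names (`parse_res`, `decode_status`, `decode_atapi_error`) are made up for illustration; bit positions follow the ATA/ATAPI Status and Error register layout.

```python
import re

# Status register bits; labels follow the names used in the message above.
STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "SERV/DSC (command-specific)",
    0x08: "DRQ",
    0x01: "CHK/ERR",
}

# Low bits of the ATAPI Error register; the high nibble is the sense key.
ERROR_BITS = {0x04: "ABRT", 0x02: "EOM", 0x01: "ILI"}


def parse_res(line):
    """Extract (status, error, ireason) from a libata 'res' log line."""
    m = re.search(r"res ([0-9a-f]{2})/([0-9a-f]{2}):([0-9a-f]{2})", line)
    if m is None:
        raise ValueError("no result taskfile found")
    return tuple(int(g, 16) for g in m.groups())


def decode_status(status):
    """Return the names of the Status bits that are set."""
    return [name for bit, name in STATUS_BITS.items() if status & bit]


def decode_atapi_error(error):
    """Return (flag names, sense key) for an ATAPI Error register value."""
    flags = [name for bit, name in ERROR_BITS.items() if error & bit]
    return flags, error >> 4


line = "res 51/54:03:00:02:00/00:00:00:00:00/a0 Emask 0x5 (timeout)"
status, error, ireason = parse_res(line)
print("status :", decode_status(status))
print("error  :", decode_atapi_error(error))
print("ireason:", hex(ireason))
```

Note that BSY (0x80) is absent from the decoded Status value, which is the basis for the observation that follows.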

Since BSY is not set in the Status register, and given the other
information derived from the decoded values, it looks like the device is
otherwise happy and ready to accept additional commands.

It appears to have chewed on an ATAPI command, spit it out, but failed
to send a completion interrupt.

So it's an open question whether it's a device not completing this
errored-out command, or whether it's IRQ/ACPI stuff infecting libata.

Jeff

-