On Tuesday 18 March 2008, Linus Torvalds wrote:Which is with compliance with PIO-in protocol specification and years of usage of the task_in_intr() code for fs code paths and HDIO_DRIVE_TASKFILE ioctl has proven that real hardware also works this way (please note that the changed code was used only for HDIO_DRIVE_CMD ioctl requests). If INTRQ is asserted and "BSY is cleared to zero and DRQ is cleared, then the device has completed the command with an error." (thus task_in_intr() assumed ERR bit to be set), otherwise the value ERR may not be defined (however task_in_intr() still assumed that it is OK to check ERR bit). Additionally handling of premature shared PCI interrupts comes into the picture making the whole thing much more messier. It could happen in the past that drive_is_ready() was called before drive had time to assert BSY so later also task_in_intr() assumed that the drive is ready. However this should be already fixed as we now always guarantee sufficient delay after the command was written so how's about the following patch which just makes DRQ == 0 || BSY == 1 || ERR == 1 an error (ideally we should also proceed with transfer if DRQ == 1 && ERR == 1 but it may have other gotchas so lets keep it as it was for now): --- preferably this should get some testing in -mm first drivers/ide/ide-taskfile.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) Index: b/drivers/ide/ide-taskfile.c =================================================================== --- a/drivers/ide/ide-taskfile.c +++ b/drivers/ide/ide-taskfile.c @@ -431,14 +431,9 @@ static ide_startstop_t task_in_intr(ide_ struct request *rq = HWGROUP(drive)->rq; u8 stat = ide_read_status(drive); - /* new way for dealing with premature shared PCI interrupts */ - if (!OK_STAT(stat, DRQ_STAT, BAD_R_STAT)) { - if (stat & (ERR_STAT | DRQ_STAT)) - return task_error(drive, rq, __FUNCTION__, stat); - /* No data yet, so wait for another IRQ. */ - ide_set_handler(drive, &task_in_intr, WAIT_WORSTCASE, NULL); - return ide_started; - } + /* TODO: more complex handling is needed for ERR == 1 */ + if ((stat & DRQ_STAT) == 0 || (stat & (BUSY_STAT | ERR_STAT))) + return task_error(drive, rq, __func__, stat); ide_pio_datablock(drive, rq, 0); Now back to the recent debug log from Anders: Mar 18 16:02:14 tippex hda: tf: feat 0xd2 nsect 0xf1 lbal 0x00 lbam 0x4f lbah 0xc2 dev 0x00 cmd 0xb0 Mar 18 16:02:14 tippex hda: hob: nsect 0x00 lbal 0x00 lbam 0x00 lbah 0x00 Mar 18 16:02:14 tippex task_in_intr: hda: stat=50 This is SMART command with SMART ENABLE ATTRIBUTE AUTOSAVE sub-command (feat == 0xd2, nsect == 0xf1) but it should use no-data protocol instead of PIO-in which brings us back to commit 18a056feccabdfa9764016a615121b194828bc72 ("ide: don't enable local IRQs for PIO-in in driver_cmd_intr() (take 2)"): @@ -638,19 +638,22 @@ static ide_startstop_t drive_cmd_intr (ide_drive_t *drive) { struct request *rq = HWGROUP(drive)->rq; ide_hwif_t *hwif = HWIF(drive); - u8 *args = (u8 *) rq->buffer; - u8 stat = hwif->INB(IDE_STATUS_REG); + u8 *args = (u8 *)rq->buffer, pio_in = (args && args[3]) ? 1 : 0, stat; - local_irq_enable_in_hardirq(); - if (rq->cmd_type == REQ_TYPE_ATA_CMD && - (stat & DRQ_STAT) && args && args[3]) { + if (pio_in) { u8 io_32bit = drive->io_32bit; + stat = hwif->INB(IDE_STATUS_REG); + if ((stat & DRQ_STAT) == 0) + goto out; drive->io_32bit = 0; hwif->ata_input_data(drive, &args[4], args[3] * SECTOR_WORDS); drive->io_32bit = io_32bit; stat = wait_drive_not_busy(drive); + } else { + local_irq_enable_in_hardirq(); + stat = hwif->INB(IDE_STATUS_REG); } - +out: if (!OK_STAT(stat, READY_STAT, BAD_STAT)) return ide_error(drive, "drive_cmd", stat); /* calls ide_end_drive_cmd */ as can be seen before this patch protocol to use was decided basing on DRQ bit being set - what would you call _that_ if taskfile code is horrible, he? ;-) [ I tried really hard to make patchset which converted HDIO_DRIVE_CMD to use the common code as bisectable and simple as possible (by separating real changes in behavior from pure code transformations and making patches of small granularity) since potential problems were kind of expected but I must admit that I completely overlooked the hidden meaning of DRQ_STAT check... ] Later commit 4d977e43d8ae758434e603cf2455d955f71c77c4 just triggered the bug... The "real" fix (Anders, please test it): --- BTW libata seems to have exactly the same problem but it is hidden by the lack of quirk for "premature PCI IRQs" drivers/ide/ide-taskfile.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) Index: b/drivers/ide/ide-taskfile.c =================================================================== --- a/drivers/ide/ide-taskfile.c +++ b/drivers/ide/ide-taskfile.c @@ -771,15 +771,25 @@ int ide_cmd_ioctl (ide_drive_t *drive, u tf->lbam = 0x4f; tf->lbah = 0xc2; tfargs.tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_IN_NSECT; + + /* SMART READ DATA / LOG */ + if (tf->feature == 0xD0 || tf->feature == 0xD5) + tfargs.data_phase = TASKFILE_IN; + else + tfargs.data_phase = TASKFILE_NO_DATA; } else { tf->nsect = args[1]; tfargs.tf_flags = IDE_TFLAG_OUT_FEATURE | IDE_TFLAG_OUT_NSECT | IDE_TFLAG_IN_NSECT; + + if (args[3]) + tfargs.data_phase = TASKFILE_IN; + else + tfargs.data_phase = TASKFILE_NO_DATA; } tf->command = args[0]; - tfargs.data_phase = args[3] ? TASKFILE_IN : TASKFILE_NO_DATA; - if (args[3]) { + if (tfargs.data_phase == TASKFILE_IN) { tfargs.tf_flags |= IDE_TFLAG_IO_16BIT; bufsize = SECTOR_WORDS * 4 * args[3]; buf = kzalloc(bufsize, GFP_KERNEL); --
| Greg Kroah-Hartman | [PATCH 004/196] Chinese: add translation of SubmittingPatches |
| Andi Kleen | Re: [patch] Add basic sanity checks to the syscall execution patch |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
| Stoyan Gaydarov | From 2.4 to 2.6 to 2.7? |
git: | |
| Elijah Newren | Trying to use git-filter-branch to compress history by removing large, obsolete bi... |
| Matthieu Moy | git push to a non-bare repository |
| Johannes Schindelin | Re: Git as a filesystem |
| Jakub Narebski | Re: VCS comparison table |
| Richard Stallman | Real men don't attack straw men |
| Joachim Schipper | Re: OpenBSD/alpha Status |
| Theo de Raadt | Re: hardware needed for network stack performance work |
| Marcus Andree | Re: Cyrus IMAP performance problems [Long] |
| Andrew Morton | Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY" |
| John Rigby | [PATCH] [Rev2] MPC5121 FEC support |
| Pekka Enberg | Re: [rfc][patch 1/3] slub: fix small HWCACHE_ALIGN alignment |
| Ilpo Järvinen | [PATCH] [TCP]: Separate lost_retrans loop into own function |
