Re: INITIO scsi driver fails to work properly

To: Boaz Harrosh <bharrosh@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Wednesday, December 19, 2007 - 4:48 am

I tried patch[2] (the addition of sg++) on 2.6.24-rc5-mm1, but the
system hangs a few seconds after the initio driver loads.
I will try patch[1] next week to see what happens.

--

To: Filippos Papadopoulos <psybases@...>
Cc: Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Friday, December 21, 2007 - 3:30 pm

There is also a Fedora bug report against 2.6.23. The user has
applied commit e9e42faf47255274a1ed0b9bf1c46118023ec5fa from
2.6.24-rc plus the two additional fixes under discussion and it
hangs for him too.

https://bugzilla.redhat.com/show_bug.cgi?id=390531
--

To: Chuck Ebbert <cebbert@...>
Cc: Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Friday, December 21, 2007 - 5:03 pm

It really sounds like there's some problem applying the patches. The
consistent report throughout is this one:

initio: I/O port range 0x0 is busy.

Which should be fixed by 99f1f534922a2f2251ba05b14657a1c62882a80e. I
didn't actually find that in the bug thread anywhere, but maybe I missed
it?

--

To: James Bottomley <James.Bottomley@...>
Cc: Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Friday, December 21, 2007 - 6:43 pm

The "I/O port 0" bug just prints the message and the system continues
to run. It's only after that is fixed that the system just hangs on
boot shortly after loading the driver.
--

To: Chuck Ebbert <cebbert@...>
Cc: Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Friday, December 21, 2007 - 6:49 pm

That shouldn't happen unless the PCI BAR is genuinely misconfigured; it's
saying we got zero when we requested the starting address of BAR0. What
does lspci -vv show for this device?

James

--

To: <linux-scsi@...>
Cc: Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, January 3, 2008 - 8:18 pm

First of all, let me wish everyone a happy new year.
I came back from vacation and compiled the initio driver with

#define DEBUG_INTERRUPT 1
#define DEBUG_QUEUE 1
#define DEBUG_STATE 1
#define INT_DISC 1

I used the sources from the 2.6.24-rc6-git9 kernel. At boot time the initio
driver prints the following:

" scsi: Initio INI-9X00U/UW SCSI device driver
Find scb at c0c00000
Append pend scb c0c00000;"

After 3 seconds the whole system freezes there and I have to reboot.

P.S. Here is the output of 'lspci -vv' under the 2.6.16.13 kernel:

"00:08.0 SCSI storage controller: Initio Corporation 360P (rev 02)
	Subsystem: Unknown device 9292:0202
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
		ParErr- Stepping- SERR- FastB2B-
	Latency: 32, Cache Line Size 08
	Interrupt: pin A routed to IRQ 11
	Region 0: I/O ports at d000 [size=256]
	Region 1: Memory at ef000000 (32-bit, non-prefetchable) [size=4K]
	[virtual] Expansion ROM at 50000000 [disabled] [size=128K]
"
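
As a rough illustration of the BAR0 question raised earlier in the thread,
the sketch below pulls the I/O base and size out of an lspci -vv region
line. It is a user-space example only: parse_io_region and bar_is_usable
are hypothetical helpers, not part of the initio driver or of pciutils.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse an lspci -vv line of the form
 *   "Region N: I/O ports at <hex> [size=<decimal>]"
 * Returns 1 and fills in the outputs on a full match, 0 otherwise.
 * On a failed match the outputs may have been partially written.
 */
static int parse_io_region(const char *line, int *region,
                           unsigned long *base, unsigned long *size)
{
    assert(line && region && base && size);
    /* The leading space in the format skips any indentation. */
    return sscanf(line, " Region %d: I/O ports at %lx [size=%lu]",
                  region, base, size) == 3;
}

/* A BAR whose start address reads back as zero is the suspicious case. */
static int bar_is_usable(unsigned long base)
{
    return base != 0;
}
```

For the "Region 0" line in the listing above this should yield base 0xd000
and size 256, which is a non-zero BAR0; the "Region 1: Memory at ..." line
is rejected because it is not an I/O port region.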
--

To: Filippos Papadopoulos <psybases@...>
Cc: <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 1:16 am

This proves BAR0 is non-zero, but I also take it from your report
that the

initio: I/O port range 0x0 is busy.

I think there's still one remaining bug from the sg_list conversion,
namely that cblk->sglen is never set, but it is used to count the number
of elements in the sg array. Could you try this patch (on top of
everything else) and see if the problem is finally fixed?

Thanks,

James

---
diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 01bf018..d038459 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2603,6 +2603,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c
 	nseg = scsi_dma_map(cmnd);
 	BUG_ON(nseg < 0);
 	if (nseg) {
+		cblk->sglen = nseg;
 		dma_addr = dma_map_single(&host->pci_dev->dev, &cblk->sglist[0],
 					  sizeof(struct sg_entry) * TOTAL_SG_ENTRY,
 					  DMA_BIDIRECTIONAL);
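
To see why an unset element count matters, here is a small user-space
model of the problem described above. It is only a sketch: the structure
layout, the value of TOTAL_SG_ENTRY, and the model_* function names are
assumptions made for illustration and are not copied from initio.c.

```c
#include <assert.h>
#include <string.h>

#define TOTAL_SG_ENTRY 32	/* illustrative value, not taken from the driver */

struct sg_entry {
    unsigned int data;	/* bus address of the segment (unused here) */
    unsigned int len;	/* length of the segment in bytes */
};

struct ctrl_blk {
    unsigned char sglen;	/* number of valid entries in sglist */
    struct sg_entry sglist[TOTAL_SG_ENTRY];
};

/* Fill the SG list; optionally "forget" to record how many entries it has. */
static void model_build_scb(struct ctrl_blk *cblk, const unsigned int *lens,
                            int nseg, int set_sglen)
{
    assert(nseg >= 0 && nseg <= TOTAL_SG_ENTRY);
    memset(cblk, 0, sizeof(*cblk));
    if (set_sglen)
        cblk->sglen = (unsigned char)nseg;
    for (int i = 0; i < nseg; i++)
        cblk->sglist[i].len = lens[i];
}

/* A consumer that trusts sglen to bound its walk over the SG list. */
static unsigned int model_total_len(const struct ctrl_blk *cblk)
{
    unsigned int total = 0;

    for (int i = 0; i < cblk->sglen; i++)
        total += cblk->sglist[i].len;
    return total;
}
```

With three segments of 4096, 4096 and 512 bytes, model_total_len() sees
0 bytes when sglen is left unset and 8704 bytes once it is set to the
segment count, even though the list contents are identical in both cases.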

--

To: James Bottomley <James.Bottomley@...>
Cc: <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 5:54 am

On Jan 11, 2008 7:16 AM, James Bottomley

--

To: Filippos Papadopoulos <psybases@...>
Cc: <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 11:44 am

Sorry ... we appear to have several reporters of different bugs in this
thread. That message was copied by Chuck Ebbert from a Red Hat

First off, has this driver ever worked for you in 2.6? Just booting
SLES9 (2.6.5) or RHEL4 (2.6.9) ... or one of their open equivalents to
check a really old kernel would be helpful. If you can get it to work,
then we can proceed with a patch reversion regime based on the
assumption that the problem is a recent commit.

Thanks,

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Filippos Papadopoulos <psybases@...>, <linux-scsi@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 1:52 pm

Our reporter has applied patches since then and now reports the exact
same symptoms that Filippos does. (It just hangs after loading the driver.)
--

To: James Bottomley <James.Bottomley@...>
Cc: <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 12:44 pm

On Jan 11, 2008 5:44 PM, James Bottomley

Yes it works under 2.6.16.13. See the beginning of this thread, i
--

To: Filippos Papadopoulos <psybases@...>
Cc: James Bottomley <James.Bottomley@...>, <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 1:01 pm

It worked (ish.. it has problems and always has had) before the big
updates, and according to my tester after the big update + two patches
that escaped somewhere in the process. Unfortunately my tester no longer
has the card to dig further.

The 0x0 bug was fixed a while ago but seems to have sat in -mm for a bit.
Don't know about further stuff.
--

To: Alan Cox <alan@...>
Cc: Filippos Papadopoulos <psybases@...>, <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 1:33 pm

The statement that OpenSuse 10.3, based on 2.6.22.5, also fails
indicates there may be something else that predates your reorganisation
at the root of this (depending on whether the vendor kernel contains a
back port or not). That's why I want to see what happens on this system
with a vanilla 2.6.22

James

--

To: Filippos Papadopoulos <psybases@...>
Cc: <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, January 11, 2008 - 1:01 pm

Could you try with a vanilla 2.6.22 kernel? The reason for all of this
is that 2.6.22 predates Alan's conversion of this driver (which was my
95% candidate for the source of the bug). I want you to try the vanilla
kernel just in case the opensuse one contains a backport.

Thanks,

James

--

To: James Bottomley <James.Bottomley@...>
Cc: <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Sunday, January 13, 2008 - 8:28 am

Yes, you are right. I compiled vanilla 2.6.22 and the initio driver works.
--

To: Filippos Papadopoulos <psybases@...>
Cc: <linux-scsi@...>, Chuck Ebbert <cebbert@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Tuesday, January 15, 2008 - 11:16 am

That's good news ... at least we know where the issue lies; now the
problem comes: there are two candidate patches for this issue: Alan's
driver update patch and Tomo's accessors patch. Unfortunately, due to
merge conflicts the two are pretty hopelessly intertwined. I think I
already spotted one bug in the accessor conversion, so I'll look at that
again. Alan's also going to acquire an initio board and retest his
conversions.

I'm afraid it might be a while before we have anything for you to test.

James

--

To: <psybases@...>, <James.Bottomley@...>
Cc: <linux-scsi@...>, <cebbert@...>, <bharrosh@...>, <akpm@...>, <linux-kernel@...>
Date: Wednesday, January 16, 2008 - 1:59 am

On Tue, 15 Jan 2008 09:16:06 -0600

Can you try this patch?

Thanks,

diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 01bf018..6891d2b 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2609,6 +2609,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c
 		cblk->bufptr = cpu_to_le32((u32)dma_addr);
 		cmnd->SCp.dma_handle = dma_addr;
 
+		cblk->sglen = nseg;
 
 		cblk->flags |= SCF_SG;	/* Turn on SG list flag */
 		total_len = 0;
--

To: FUJITA Tomonori <fujita.tomonori@...>
Cc: <psybases@...>, <linux-scsi@...>, <cebbert@...>, <bharrosh@...>, <akpm@...>, <linux-kernel@...>
Date: Wednesday, January 16, 2008 - 10:57 am

We already tried a variant of this here:

http://marc.info/?l=linux-scsi&m=120002863806103&w=2

The answer was negative. Although I've saved the patch because it's
clearly one of the bugs.

James

--

To: James Bottomley <James.Bottomley@...>
Cc: FUJITA Tomonori <fujita.tomonori@...>, <psybases@...>, <linux-scsi@...>, <cebbert@...>, <bharrosh@...>, <akpm@...>, <linux-kernel@...>
Date: Monday, January 21, 2008 - 6:20 pm

OK, my attempt to get the card failed, so we are going to have to do this
the hard way. See where this patch crashes and what it prints.

(On top of the other patches)

diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.24-rc8-mm1/drivers/scsi/initio.c linux-2.6.24-rc8-mm1/drivers/scsi/initio.c
--- linux.vanilla-2.6.24-rc8-mm1/drivers/scsi/initio.c 2008-01-19 14:22:43.000000000 +0000
+++ linux-2.6.24-rc8-mm1/drivers/scsi/initio.c 2008-01-21 14:54:48.000000000 +0000
@@ -2537,10 +2537,12 @@
 	struct Scsi_Host *dev = dev_id;
 	unsigned long flags;
 	int r;
-
+
+	printk("ISR\n");
 	spin_lock_irqsave(dev->host_lock, flags);
 	r = initio_isr((struct initio_host *)dev->hostdata);
 	spin_unlock_irqrestore(dev->host_lock, flags);
+	printk("ISR DONE %d\n", r);
 	if (r)
 		return IRQ_HANDLED;
 	else
@@ -2643,6 +2645,7 @@
 	struct initio_host *host = (struct initio_host *) cmd->device->host->hostdata;
 	struct scsi_ctrl_blk *cmnd;
 
+	printk("SCB QUEUE\n");
 	cmd->scsi_done = done;
 
 	cmnd = initio_alloc_scb(host);
@@ -2650,7 +2653,9 @@
 		return SCSI_MLQUEUE_HOST_BUSY;
 
 	initio_build_scb(host, cmnd, cmd);
+	printk("SCB EXEC\n");
 	initio_exec_scb(host, cmnd);
+	printk("SCB EXEC DONE\n");
 	return 0;
 }
 
@@ -2766,6 +2771,8 @@
 	struct scsi_cmnd *cmnd;	/* Pointer to SCSI request block */
 	struct initio_host *host;
 	struct scsi_ctrl_blk *cblk;
+
+	printk("SCB POST\n");
 
 	host = (struct initio_host *) host_mem;
 	cblk = (struct scsi_ctrl_blk *) cblk_mem;
@@ -2934,9 +2941,11 @@
 
 	pci_set_drvdata(pdev, shost);
 
+	printk("SAH\n");
 	error = scsi_add_host(shost, &pdev->dev);
 	if (error)
 		goto out_free_irq;
+	printk("SSH\n");
 	scsi_scan_host(shost);
 	return 0;
 out_free_irq:
--

To: Alan Cox <alan@...>
Cc: James Bottomley <James.Bottomley@...>, FUJITA Tomonori <fujita.tomonori@...>, <linux-scsi@...>, <cebbert@...>, <bharrosh@...>, <akpm@...>, <linux-kernel@...>
Date: Tuesday, January 22, 2008 - 1:50 pm

I get the following:
SAH
SSH
SCB Q
SCB EXEC
SCB EXEC DONE

After ~3 secs the system freezes.

--

To: Filippos Papadopoulos <psybases@...>
Cc: Alan Cox <alan@...>, FUJITA Tomonori <fujita.tomonori@...>, <linux-scsi@...>, <cebbert@...>, <bharrosh@...>, <akpm@...>, <linux-kernel@...>
Date: Friday, January 25, 2008 - 12:49 pm

Actually, I suspect your issues should be fixed by this patch:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi...

Could you download 2.6.24 and try it out to see if they are?

Thanks,

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Alan Cox <alan@...>, FUJITA Tomonori <fujita.tomonori@...>, <linux-scsi@...>, <cebbert@...>, <bharrosh@...>, <akpm@...>, <linux-kernel@...>
Date: Friday, January 25, 2008 - 5:04 pm

Well, 2.6.24 fixes the problem.
Thanks to all of you!
--

To: Filippos Papadopoulos <psybases@...>
Cc: Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Wednesday, December 19, 2007 - 9:29 am

No, it wouldn't. Bugzilla is a place where bug reports go to be
ignored. Witness 9370 where despite my best efforts to move discussion
to the mailing list, it's been thoroughly ignored because the original
reporter insists on posting additional information there instead of to
the mailing list.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--

To: Matthew Wilcox <matthew@...>
Cc: Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Wednesday, December 19, 2007 - 12:50 pm

That's a bit harsh on bugzilla. It is of use to people whose job it is
to track outstanding bugs.

However, Matthew is completely correct: it's useless for getting bugs
fixed *if* the information isn't on the mailing list. The reason for
using the mailing list is the more-eyes principle: if you email linux-scsi,
all the SCSI experts will see it, not just the one email listed as owner
in bugzilla. Likewise, as the bug goes through analysis, if it turns
out to be in a different area, that area's mailing list can be added to
the Cc list.

So, to get the best of both worlds, file a bugzilla and note the bugid.
Then email a complete report to the relevant list, but add [BUG <bugid>]
to the subject line and cc bugme-daemon@bugzilla.kernel.org. If you do
this, bugzilla will keep track of the entire discussion as it progresses
and allow those who track bugs through bugzilla to get a pretty accurate
idea of the status. You should never need to touch bugzilla again once
the initial bug report is filed: all future information flow is via the
mailing lists.

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Wednesday, December 19, 2007 - 1:05 pm

The problem is that it appears to the casual observer as if they can
then add information to the bug through the web interface. But that
information will never be forwarded to the mailing list. Unless there's
a way of marking bugs as 'unchangeable through the web interface' or 'all
messages appended to this bug need to be forwarded', Bugzilla just
doesn't fit our needs.

The Debian BTS fits our way of working much better. Perhaps somebody
should investigate a migration.

--

To: Matthew Wilcox <matthew@...>
Cc: James Bottomley <James.Bottomley@...>, Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Thursday, December 20, 2007 - 5:32 am

This is an excellent observation by Matthew and James. There is no magic
in bugzilla not being loved; it is just "not the right set of features
for effective work on a problem". It doesn't support collaboration
among multiple developers well.
This distaste is not universal, since some people don't have a problem
with bugzilla as is, maybe those who tend to work on problems
"alone"...
But making it a workable tool for everyone is definitely worth it.
Any other favorite bugzillas that are nice to work with and that have
--

To: Natalie Protasevich <protasnb@...>
Cc: Matthew Wilcox <matthew@...>, Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Thursday, December 20, 2007 - 11:08 am

We have actually been trying for over two years to get bugzilla fixed so
that it suits our email and list publishing workflow for fixing bugs. I
surmise that 90% of our problems with bugzilla could be solved if it
simply tipped a SCSI bug report onto the SCSI list when it was created
in such a way that all replies were gathered back into bugzilla.
Unfortunately, no-one who maintains our bugzilla has actually been able
to make this happen. The other 10% of the problem is that bugzilla
doesn't seem to have a way properly to integrate people who insist on
using its web interface to reply into the email flow.

James

--

To: Natalie Protasevich <protasnb@...>
Cc: Matthew Wilcox <matthew@...>, James Bottomley <James.Bottomley@...>, Filippos Papadopoulos <psybases@...>, Boaz Harrosh <bharrosh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Thursday, December 20, 2007 - 11:14 am

Actually, Bugzilla *could* be configured so that, say, linux-scsi was
copied for all SCSI bugs (linux-scsi could just be added to the cc
list). The problem though is that Bugzilla will then proceed to cc
linux-scsi for all the Bugzilla state change details which might annoy
the denizens of the linux-scsi list.

But if new entries on the Bugzilla entry could be set to forward to
the appropriate mailing list with the messaging *looking* a lot more
like a mail message, I suspect it could be acceptable. One of the
advantages of the Debian BTS is that it's much more integrated into
the e-mail workflow. (Although it lacks the roll up and reporting
capabilities that are beloved by managers...)

But hey, it could be worse. We could have chosen the Sourceforge bug
tracker. :-)

- Ted
--

To: Filippos Papadopoulos <psybases@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-scsi@...>
Date: Wednesday, December 19, 2007 - 6:08 am

Please pull from the scsi-rc-fixes git tree first. It has a couple
of other fixes for initio, plus patch[2] included
(maybe it's already in the -mm tree, I'm not sure).
I would prefer linux-scsi ml

<snip>

Boaz
--
