I tried patch[2] (addition of sg++) on 2.6.24-rc5-mm1, but the
system hangs a few seconds after the initio driver loads.
I will try patch[1] next week to see what happens.
--
There is also a Fedora bug report against 2.6.23. The user has
applied commit e9e42faf47255274a1ed0b9bf1c46118023ec5fa from
2.6.24-rc plus the two additional fixes under discussion and it
hangs for him too.
It really sounds like there's some problem applying the patches. The
consistent report throughout is this one:

initio: I/O port range 0x0 is busy.

Which should be fixed by 99f1f534922a2f2251ba05b14657a1c62882a80e. I
didn't actually find that in the bug thread anywhere, but maybe I missed
it?
--
The "I/O port 0" bug just prints the message and the system continues
to run. It's only after that is fixed that the system just hangs on
boot shortly after loading the driver.
--
That shouldn't happen unless the PCI BAR is genuinely misconfigured; it's
saying we got zero when we requested the starting address of BAR0. What
does lspci -vv show for this device?

James
--
First of all, let me wish you a happy new year.
I came back from vacation and compiled the initio driver with:

#define DEBUG_INTERRUPT 1
#define DEBUG_QUEUE 1
#define DEBUG_STATE 1
#define INT_DISC 1

I used the sources from the 2.6.24-rc6-git9 kernel. At kernel boot time the initio
driver prints the following:

" scsi: Initio INI-9X00U/UW SCSI device driver
Find scb at c0c00000
Append pend scb c0c00000;"

After 3 seconds the whole system freezes there and I have to reboot.
P.S. Here is the info from 'lspci -vv' running the 2.6.16.13 kernel:
"00:08.0 SCSI storage controller: Initio Corporation 360P (rev 02)
Subsystem: Unknown device 9292:0202
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Latency: 32, Cache Line Size 08
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at d000 [size=256]
Region 1: Memory at ef000000 (32-bit, non-prefetchable) [size=4K]
[virtual] Expansion ROM at 50000000 [disabled] [size=128K]
"
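As a rough illustration of what to look for in that output, here is a minimal userspace sketch that checks whether the "Region 0" line describes an I/O port range with a non-zero base. The helper names parse_io_region() and bar0_looks_sane() are hypothetical, and the parsing assumes the exact line format shown above; it is not part of the driver or of lspci.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a line such as "Region 0: I/O ports at d000 [size=256]".
 * Returns 1 and stores the region index and base address on success,
 * 0 if the line is not an I/O port region line.
 * (Hypothetical helper, assumes the lspci -vv format quoted above.)
 */
static int parse_io_region(const char *line, int *region, unsigned long *base)
{
	int idx;
	unsigned long addr;

	assert(line != NULL);
	if (sscanf(line, " Region %d: I/O ports at %lx", &idx, &addr) != 2)
		return 0;
	*region = idx;
	*base = addr;
	return 1;
}

/* True only for a Region 0 I/O port line whose base address is non-zero. */
static int bar0_looks_sane(const char *line)
{
	int region;
	unsigned long base;

	return parse_io_region(line, &region, &base) && region == 0 && base != 0;
}
```

Fed the "Region 0: I/O ports at d000 [size=256]" line above, bar0_looks_sane() reports a usable base; a memory region line or a zero base is rejected.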
--
This proves the BAR0 to be non zero, but I also take it from your report
that the

initio: I/O port range 0x0 is busy.

I think there's still one remaining bug from the sg_list conversion,
namely that cblk->sglen is never set, but it is used to count the number
of elements in the sg array. Could you try this patch (on top of
everything else) and see if the problem is finally fixed?

Thanks,
James
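
The effect of an unset element count can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct ctrl_blk, build_sglist(), total_transfer_len() and the TOTAL_SG_ENTRY value here are simplified, hypothetical stand-ins, not the driver's real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOTAL_SG_ENTRY 32	/* illustrative value, not taken from the driver */

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct sg_entry {
	uint32_t data;	/* bus address of the segment */
	uint32_t len;	/* length of the segment in bytes */
};

struct ctrl_blk {
	uint8_t sglen;	/* number of valid entries in sglist */
	struct sg_entry sglist[TOTAL_SG_ENTRY];
};

/* Fill the table from an array of segment lengths, as a build step would. */
static void build_sglist(struct ctrl_blk *cblk, const uint32_t *lens,
			 int nseg, int set_sglen)
{
	assert(nseg >= 0 && nseg <= TOTAL_SG_ENTRY);
	for (int i = 0; i < nseg; i++) {
		cblk->sglist[i].data = 0x1000u * (uint32_t)(i + 1);	/* fake address */
		cblk->sglist[i].len = lens[i];
	}
	if (set_sglen)
		cblk->sglen = (uint8_t)nseg;	/* the kind of assignment the patch adds */
}

/* A consumer that trusts sglen to bound its walk over the table. */
static uint32_t total_transfer_len(const struct ctrl_blk *cblk)
{
	uint32_t total = 0;

	for (int i = 0; i < cblk->sglen; i++)
		total += cblk->sglist[i].len;
	return total;
}
```

With set_sglen left at 0, the table is filled in but any code that walks it by sglen sees no entries at all; with it set, the same walk covers every segment.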
---
diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 01bf018..d038459 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2603,6 +2603,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c
nseg = scsi_dma_map(cmnd);
BUG_ON(nseg < 0);
if (nseg) {
+ cblk->sglen = nseg;
dma_addr = dma_map_single(&host->pci_dev->dev, &cblk->sglist[0],
sizeof(struct sg_entry) * TOTAL_SG_ENTRY,
DMA_BIDIRECTIONAL);
--
On Jan 11, 2008 7:16 AM, James Bottomley
--
Sorry ... we appear to have several reporters of different bugs in this
thread. That message was copied by Chuck Ebbert from a Red Hat

First off, has this driver ever worked for you in 2.6? Just booting
SLES9 (2.6.5) or RHEL4 (2.6.9) ... or one of their open equivalents to
check a really old kernel would be helpful. If you can get it to work,
then we can proceed with a patch reversion regime based on the
assumption that the problem is a recent commit.

Thanks,
James
--
Our reporter has applied patches since then and now reports the exact
same symptoms that Filippos does. (It just hangs after loading the driver.)
--
On Jan 11, 2008 5:44 PM, James Bottomley
Yes it works under 2.6.16.13. See the beginning of this thread, i
--
It worked (ish; it has problems and always has had) before the big
updates and, according to my tester, after the big update plus two patches
that escaped somewhere in the process. Unfortunately my tester no longer
has the card to dig further.

The 0x0 bug was fixed a while ago but seems to have sat in -mm for a bit.
Don't know about further stuff.
--
The statement that OpenSuse 10.3, based on 2.6.22.5, also fails
indicates there may be something else that predates your reorganisation
at the root of this (depending on whether the vendor kernel contains a
back port or not). That's why I want to see what happens on this system
with a vanilla 2.6.22

James
--
Could you try with a vanilla 2.6.22 kernel? The reason for all of this
is that 2.6.22 predates Alan's conversion of this driver (which was my
95% candidate for the source of the bug). I want you to try the vanilla
kernel just in case the opensuse one contains a backport.

Thanks,
James
--
Yes, you are right. I compiled vanilla 2.6.22 and the initio driver works.
--
That's good news ... at least we know where the issue lies. Now the
problem: there are two candidate patches for this issue, Alan's
driver update patch and Tomo's accessors patch. Unfortunately, due to
merge conflicts the two are pretty hopelessly intertwined. I think I
already spotted one bug in the accessor conversion, so I'll look at that
again. Alan's also going to acquire an initio board and retest his
conversions.

I'm afraid it might be a while before we have anything for you to test.
James
--
On Tue, 15 Jan 2008 09:16:06 -0600
Can you try this patch?
Thanks,
diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 01bf018..6891d2b 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2609,6 +2609,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c
cblk->bufptr = cpu_to_le32((u32)dma_addr);
cmnd->SCp.dma_handle = dma_addr;

+ cblk->sglen = nseg;
cblk->flags |= SCF_SG; /* Turn on SG list flag */
total_len = 0;
--
We already tried a variant of this here:
http://marc.info/?l=linux-scsi&m=120002863806103&w=2
The answer was negative, although I've saved the patch because it's
clearly one of the bugs.

James
--
Ok my attempt to get the card failed so we are going to have to do this
the hard way. See where this patch crashes and what it prints.

(On top of the other patches)
diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.24-rc8-mm1/drivers/scsi/initio.c linux-2.6.24-rc8-mm1/drivers/scsi/initio.c
--- linux.vanilla-2.6.24-rc8-mm1/drivers/scsi/initio.c 2008-01-19 14:22:43.000000000 +0000
+++ linux-2.6.24-rc8-mm1/drivers/scsi/initio.c 2008-01-21 14:54:48.000000000 +0000
@@ -2537,10 +2537,12 @@
struct Scsi_Host *dev = dev_id;
unsigned long flags;
int r;
-
+
+ printk("ISR\n");
spin_lock_irqsave(dev->host_lock, flags);
r = initio_isr((struct initio_host *)dev->hostdata);
spin_unlock_irqrestore(dev->host_lock, flags);
+ printk("ISR DONE %d\n", r);
if (r)
return IRQ_HANDLED;
else
@@ -2643,6 +2645,7 @@
struct initio_host *host = (struct initio_host *) cmd->device->host->hostdata;
struct scsi_ctrl_blk *cmnd;

+ printk("SCB QUEUE\n");
cmd->scsi_done = done;

cmnd = initio_alloc_scb(host);
@@ -2650,7 +2653,9 @@
return SCSI_MLQUEUE_HOST_BUSY;

initio_build_scb(host, cmnd, cmd);
+ printk("SCB EXEC\n");
initio_exec_scb(host, cmnd);
+ printk("SCB EXEC DONE\n");
return 0;
}

@@ -2766,6 +2771,8 @@
struct scsi_cmnd *cmnd; /* Pointer to SCSI request block */
struct initio_host *host;
struct scsi_ctrl_blk *cblk;
+
+ printk("SCB POST\n");

host = (struct initio_host *) host_mem;
cblk = (struct scsi_ctrl_blk *) cblk_mem;
@@ -2934,9 +2941,11 @@

pci_set_drvdata(pdev, shost);
+ printk("SAH\n");
error = scsi_add_host(shost, &pdev->dev);
if (error)
goto out_free_irq;
+ printk("SSH\n");
scsi_scan_host(shost);
return 0;
out_free_irq:
--
I get the following:
SAH
SSH
SCB Q
SCB EXEC
SCB EXEC DONE

After ~3 secs the system freezes.
--
Actually, I suspect your issues should be fixed by this patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi...
Could you download 2.6.24 and try it out to see if they are?
Thanks,
James
--
Well, 2.6.24 fixes the problem.
Thanks to all of you!
--
No, it wouldn't. Bugzilla is a place where bug reports go to be
ignored. Witness 9370 where despite my best efforts to move discussion
to the mailing list, it's been thoroughly ignored because the original
reporter insists on posting additional information there instead of to
the mailing list.
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--
That's a bit harsh on bugzilla. It is of use to people whose job it is
to track outstanding bugs.

However, Matthew is completely correct: it's useless for getting bugs
fixed *if* the information isn't on the mailing list. The reason for
using the mailing list is the more eyes principle: if you email linux-scsi,
all the SCSI experts will see it, not just the one email listed as owner
in bugzilla. Likewise, as the bug goes through analysis, if it turns
out to be in a different area, that area's mailing list can be added to
the Cc list.

So, to get the best of both worlds, file a bugzilla and note the bugid.
Then email a complete report to the relevant list, but add [BUG <bugid>]
to the subject line and cc bugme-daemon@bugzilla.kernel.org. If you do
this, bugzilla will keep track of the entire discussion as it progresses
and allow those who track bugs through bugzilla to get a pretty accurate
idea of the status. You should never need to touch bugzilla again once
the initial bug report is filed: all future information flow is via the
mailing lists.

James
--
The problem is that it appears to the casual observer as if they can
then add information to the bug through the web interface. But that
information will never be forwarded to the mailing list. Unless there's
a way of marking bugs as 'unchangeable through the web interface' or 'all
messages appended to this bug need to be forwarded', Bugzilla just
doesn't fit our needs.

The Debian BTS fits our way of working much better. Perhaps somebody
should investigate a migration.
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--
This is an excellent observation by Matthew and James. There is no magic
in bugzilla not being loved; it is just "not the right set of features
for effective work on a problem". It doesn't support collaboration
between multiple developers well.
This distaste is not universal, since some people don't have a problem
with bugzilla as is, maybe those who tend to work on problems
"alone"...
But making it a workable tool for everyone is definitely worth it.
Any other favorite bugzillas that are nice to work with and that have
--
We have actually been trying for over two years to get bugzilla fixed so
that it suits our email and list publishing workflow for fixing bugs. I
surmise that 90% of our problems with bugzilla could be solved if it
simply tipped a SCSI bug report onto the SCSI list when it was created
in such a way that all replies were gathered back into bugzilla.
Unfortunately, no-one who maintains our bugzilla has actually been able
to make this happen. The other 10% of the problem is that bugzilla
doesn't seem to have a way properly to integrate people who insist on
using its web interface to reply into the email flow.

James
--
Actually, Bugzilla *could* be configured so that, say, linux-scsi was
copied for all SCSI bugs (linux-scsi could just be added to the cc
list). The problem though is that Bugzilla will then proceed to cc
linux-scsi for all the Bugzilla state change details which might annoy
the denizens of the linux-scsi list.

But if new entries on the Bugzilla entry could be set to forward to
the appropriate mailing list with the messaging *looking* a lot more
like a mail message, I suspect it could be acceptable. One of the
advantages of the Debian BTS is that it's much more integrated into
the e-mail workflow. (Although it lacks the roll up and reporting
capabilities that are beloved by managers...)

But hey, it could be worse. We could have chosen the Sourceforge bug
tracker. :-)

- Ted
--
Please pull from the scsi-rc-fixes git tree first. It has a couple
of other fixes for initio, plus patch[2] is included.
(Maybe it's already in the -mm tree; I'm not sure.)
I would prefer linux-scsi ml
<snip>
Boaz
--
