Re: Issue with qla2xxx_probe_one

From: Alan D. Brunelle
Date: Tuesday, April 29, 2008 - 10:34 am

I /think/ that there is an issue with this routine /if/ the firmware
images are not loaded properly - on a 16-way ia64 box I am starting to
see this with an up-stream kernel (Jens Axboe's origin/io-cpu-affinity
branch). In any event, it looks to me that:

        if (qla2x00_initialize_adapter(ha)) {
                qla_printk(KERN_WARNING, ha,
                    "Failed to initialize adapter\n");

                DEBUG2(printk("scsi(%ld): Failed to initialize adapter - "
                    "Adapter flags %x.\n",
                    ha->host_no, ha->device_flags));

                ret = -ENODEV;
                goto probe_failed;
        }

skips around:

        ret = scsi_add_host(host, &pdev->dev);

which is needed to properly initialize the freelist (via:
scsi_setup_command_freelist).

When qla2xxx_probe_one ends up calling scsi_host_put in this error path
it eventually gets to scsi_destroy_command_freelist and we get the error
below.
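
To make the failure mode above concrete, here is a minimal user-space sketch. It is not kernel code: `model_host`, `model_host_alloc`, `model_add_host`, `destroy_would_follow_null` and `model_probe` are hypothetical names, and the model assumes the host structure starts out zero-filled and that the teardown loop has the shape `while (!list_empty(&free_list))`. Under those assumptions, a list head that was never initialized does not look empty, so the loop would try to follow a NULL `next` pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Same test the kernel uses: a list is empty when it points at itself. */
static bool list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Hypothetical, stripped-down host: only the free list matters here. */
struct model_host {
    struct list_head free_list;
};

/* Models allocation (assumption): zero-filled, free_list not set up yet. */
void model_host_alloc(struct model_host *h)
{
    assert(h != NULL);
    memset(h, 0, sizeof(*h));
}

/* Models the step the error path skips: the free list gets initialized. */
void model_add_host(struct model_host *h)
{
    assert(h != NULL);
    init_list_head(&h->free_list);
}

/*
 * True when a teardown loop of the form
 *     while (!list_empty(&h->free_list)) { entry = h->free_list.next; ... }
 * would enter its body and dereference a NULL next pointer.
 */
bool destroy_would_follow_null(const struct model_host *h)
{
    return !list_empty(&h->free_list) && h->free_list.next == NULL;
}

/*
 * Models the probe routine: when adapter initialization fails, we jump
 * straight to teardown without ever calling model_add_host().
 * Returns whether the teardown would be unsafe.
 */
bool model_probe(struct model_host *h, bool adapter_init_ok)
{
    model_host_alloc(h);
    if (adapter_init_ok)
        model_add_host(h);
    return destroy_would_follow_null(h);
}
```

In this model, `model_probe(&h, false)` reports an unsafe teardown and `model_probe(&h, true)` does not, which matches the observation that the problem only shows up when the firmware fails to load.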

There's a lot of code here to go through for me, but perhaps someone out
there has a quicker way of figuring out what is really wrong and/or
being able to provide a fix.

BTW: I have had the issue with firmware for a while, just never got
around to fixing it - typically just:

modprobe -r qla2xxx
modprobe qla2xxx

has gotten it to work in the past, but now with the NaT issue I can't
unload and reload the module.

Alan D. Brunelle
HP

=========================================================

qla2xxx 0000:2a:01.0: Found an ISP2312, irq 100, iobase 0xc0000f4010040000
qla2xxx 0000:2a:01.0: Configuring PCI space...
qla2xxx 0000:2a:01.0: Configure NVRAM parameters...
qla2xxx 0000:2a:01.0: Verifying loaded RISC code...
qla2xxx 0000:2a:01.0: Firmware image unavailable.
qla2xxx 0000:2a:01.0: Firmware images can be retrieved from:
ftp://ftp.qlogic.com/outgoing/linux/firmware/.
qla2xxx 0000:2a:01.0: Failed to initialize adapter
insmod[1828]: NaT consumption 17179869216 [1]
Modules linked in: qla2xxx(+) firmware_class ...
--

From: Andrew Vasquez
Date: Tuesday, April 29, 2008 - 10:44 am

Wasn't something like this posted recently to linux-scsi:

http://lkml.org/lkml/2008/4/27/333

this is sitting in scsi-misc-2.6.git:

[SCSI] bug fix for free list handling
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=a79cbe...

--

From: Alan D. Brunelle
Date: Tuesday, April 29, 2008 - 1:12 pm

My apologies for not having seen that.

But after looking at it, doesn't it still have a hole?

o  scsi_setup_command_freelist initializes the free_list list.

o  It then invokes scsi_get_host_cmd_pool, if this fails there is no
need to invoke scsi_put_host_cmd_pool (it wasn't gotten).

o  If scsi_get_host_cmd_pool succeeds but scsi_pool_alloc_command fails,
it will (correctly) invoke scsi_put_host_cmd_pool.

However, if either of scsi_get_host_cmd_pool or scsi_put_host_cmd_pool
happens to fail, we'll end up in scsi_destroy_command_freelist - and
since the free_list was initialized, the while loop will be bypassed,
but scsi_put_host_cmd_pool will be invoked an extra time. And this is
badness, right?

Wouldn't the attached patch [boot tested on my previously failing
system] be correct (and perhaps cleaner - you're not looking at the
innards of the list data structure to determine things)?
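
The attached patch is not reproduced in this archive. What follows is only a user-space sketch of the general idea, assuming the fix amounts to recording whether the pool is actually held and checking that record at teardown instead of inspecting the list. All `model_*` names are hypothetical, and the `users` counter stands in for whatever reference counting the real pool uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool with a plain user count. */
struct model_pool {
    int users;
};

/* Hypothetical host: cmd_pool == NULL means "nothing to release". */
struct model_host {
    struct model_pool *cmd_pool;
    bool has_backup_cmd;
};

/* Take a reference on the pool; get_ok == false simulates failure. */
static struct model_pool *model_get_pool(struct model_pool *p, bool get_ok)
{
    if (!get_ok)
        return NULL;
    p->users++;
    return p;
}

/* Drop a reference; an unbalanced put trips the assertion. */
static void model_put_pool(struct model_pool *p)
{
    assert(p->users > 0);
    p->users--;
}

/* Returns 0 on success, -1 on failure; on failure no pool is held. */
int model_setup_freelist(struct model_host *h, struct model_pool *p,
                         bool get_ok, bool alloc_ok)
{
    h->has_backup_cmd = false;
    h->cmd_pool = model_get_pool(p, get_ok);
    if (!h->cmd_pool)
        return -1;              /* pool never obtained: nothing to put */

    if (!alloc_ok) {
        model_put_pool(h->cmd_pool);
        h->cmd_pool = NULL;     /* record that the reference is gone */
        return -1;
    }

    h->has_backup_cmd = true;
    return 0;
}

/* Safe to call after a failed or skipped setup, and safe to call twice. */
void model_destroy_freelist(struct model_host *h)
{
    if (!h->cmd_pool)
        return;                 /* setup failed or never ran */

    h->has_backup_cmd = false;
    model_put_pool(h->cmd_pool);
    h->cmd_pool = NULL;
}
```

With this shape, every successful get is matched by exactly one put, whichever of the two setup steps fails, and teardown of a host whose setup never ran is a no-op.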

Alan
--

From: Andrew Vasquez
Date: Tuesday, April 29, 2008 - 2:26 pm

...
<snip>

Hmm, I'll defer to James B. on that...

--
av
--

From: FUJITA Tomonori
Date: Tuesday, April 29, 2008 - 3:57 pm

On Tue, 29 Apr 2008 16:12:51 -0400

scsi_put_host_cmd_pool doesn't fail but I think that you are right. If
scsi_get_host_cmd_pool or scsi_pool_alloc_command in

Looks correct to me.
--

From: James Bottomley
Date: Tuesday, April 29, 2008 - 5:39 pm

Yes, that looks like a better fix.  I tidied up your change log, because
it's helpful to identify the original problem commit, but otherwise
applied it unchanged.

Thanks,

James

