Hi Andrew & Alex, I get following panic in rpaphp_register_slot() on 2.6.25-rc8-mm1. Known issue ? Thanks, Badari pci_hotplug: PCI Hot Plug PCI Core version: 0.5 rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1 Unable to handle kernel paging request for data at address 0x00000070 Faulting instruction address: 0xc00000000031e4b8 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=32 NUMA pSeries Modules linked in: NIP: c00000000031e4b8 LR: c00000000031dd60 CTR: 0000000000000000 REGS: c0000000700dba30 TRAP: 0300 Not tainted (2.6.25-rc8-mm1) MSR: 8000000000009032 <EE,ME,IR,DR> CR: 42002022 XER: 20000009 DAR: 0000000000000070, DSISR: 0000000040000000 TASK = c0000000700d7990[1] 'swapper' THREAD: c0000000700d8000 CPU: 0 GPR00: c0000000006c6e90 c0000000700dbcb0 c000000000769140 c0000000703c1440 GPR04: c000000070226400 0000000000000000 0000000020000009 d000080080000004 GPR08: 0000000000000001 0000000000000000 c000000071fff1b8 0000000000000000 GPR12: 0000000000004000 c000000000690380 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 4000000001c00000 0000000000000000 0000000002247c38 0000000000000000 GPR24: 0000000002247ea8 c000000071fff1b8 0000000000000000 c0000000023e8f48 GPR28: c0000000703c1440 c000000070ff7780 c0000000006f7778 c0000000006c6e90 NIP [c00000000031e4b8] .rpaphp_register_slot+0xd8/0x164 LR [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260 Call Trace: [c0000000700dbcb0] [c0000000700dbd70] 0xc0000000700dbd70 (unreliable) [c0000000700dbd40] [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260 [c0000000700dbe20] [c000000000636864] .rpaphp_init+0x34/0x68 [c0000000700dbea0] [c0000000006103ec] .kernel_init+0x1f8/0x38c [c0000000700dbf90] [c000000000023f80] .kernel_thread+0x4c/0x68 Instruction dump: 48000020 e8bd0020 e89e8010 e87e8048 4bd38c09 60000000 3860fff5 48000080 e93d0028 e89d0030 7f83e378 e9290038 <e9290070> 80a90004 78a5eee2 4bffee7d ---[ end trace 561bb236c800851f ]--- Kernel panic ...
Hi Badari, I know about it now... :-/ Any chance you could turn debugging on for both rpaphp and pci_hotplug? Thanks. --
I did set "debug" variables in rpaphp_core.c and pci_hotplug_core.c to 1. But it didn't give any more information. How do I enable the debug ? I just changed dbg() to printk() in rpaphp_register_slot() to print out more information. Thanks, Badari pci_hotplug: PCI Hot Plug PCI Core version: 0.5 rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1 rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16] Unable to handle kernel paging request for data at address 0x00000070 Faulting instruction address: 0xc00000000031e4ac Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=32 NUMA pSeries Modules linked in: NIP: c00000000031e4ac LR: c00000000031e434 CTR: 800000000013f270 REGS: c0000000700dba30 TRAP: 0300 Not tainted (2.6.25-rc8-mm1) MSR: 8000000000009032 <EE,ME,IR,DR> CR: 22002022 XER: 00000004 DAR: 0000000000000070, DSISR: 0000000040000000 TASK = c0000000700d7990[1] 'swapper' THREAD: c0000000700d8000 CPU: 0 GPR00: c0000000006c6f10 c0000000700dbcb0 c0000000007691c0 c000000070372440 GPR04: c000000070217400 0000000000000001 0000000000000000 0000000000000001 GPR08: c0000000007917cc 0000000000000000 00000000000050b3 c0000000007917c8 GPR12: 0000000000004000 c000000000690400 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 4000000001c00000 0000000000000000 0000000002247c38 0000000000000000 GPR24: 0000000002247ea8 c000000071fff1b8 0000000000000000 c0000000023e8f48 GPR28: c000000070372440 c00000006e036780 c0000000006f77f8 c0000000006c6f10 NIP [c00000000031e4ac] .rpaphp_register_slot+0xcc/0x158 LR [c00000000031e434] .rpaphp_register_slot+0x54/0x158 Call Trace: [c0000000700dbcb0] [c00000000031e434] .rpaphp_register_slot+0x54/0x158 (unreliable) [c0000000700dbd40] [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260 [c0000000700dbe20] [c000000000636864] .rpaphp_init+0x34/0x68 [c0000000700dbea0] ...
If those modules are built into your kernel, then you'll need to
modify the kernel command line to add:
rpaphp.debug=1 pci_hotplug.debug=1
If you're manually modprobing them:
$ modprobe pci_hotplug debug=1
Hrm, this is a little more information, but still not quite
enough. I'm going to take a stab in the dark and say I'm probably
doing something wrong on this line, maybe dereferencing a pointer
incorrectly:
retval = pci_hp_register(php_slot, slot->bus,
PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
But what the real answer is, I don't know, since I don't know
anything about POWER.
Linas -- can you help me out and tell me if I'm doing anything
obviously stupid?
Thanks.
--
Sorry. I thought you knew this already. Disassembly clearly showed that slot->dn->child is NULL. I confirmed it by adding printk also. Thanks, Badari --
This patch is a complete guess on my part (since I've not been
able to understand pseries architecture) but I think it should
fix your issue.
Can you give it a try and let me know? It applies on top of the
-mm tree that includes my physical pci_slot series.
Also, I'm hoping Linas will speak up and let me know what the
real answer might be. ;)
Thanks.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: rpaphp: correctly call pci_hp_register for empty PCI slots
Unpopulated device_node slots do not have children, and
attempting to dereference them will result in a panic.
Instead, attempt to derive the PCI slot number from the bus
itself, and failing that, default to 0.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c
index 0d4cfc7..91ce6a6 100644
--- a/drivers/pci/hotplug/rpaphp_slot.c
+++ b/drivers/pci/hotplug/rpaphp_slot.c
@@ -121,6 +121,7 @@ int rpaphp_register_slot(struct slot *slot)
{
struct hotplug_slot *php_slot = slot->hotplug_slot;
int retval;
+ int slot_nr;
dbg("%s registering slot:path[%s] index[%x], name[%s] pdomain[%x] type[%d]\n",
__FUNCTION__, slot->dn->full_name, slot->index, slot->name,
@@ -132,8 +133,11 @@ int rpaphp_register_slot(struct slot *slot)
return -EAGAIN;
}
- retval = pci_hp_register(php_slot, slot->bus,
- PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
+ if (slot->bus->self)
+ slot_nr = PCI_SLOT(slot->bus->self->devfn);
+ else
+ slot_nr = 0;
+ retval = pci_hp_register(php_slot, slot->bus, slot_nr);
if (retval) {
err("pci_hp_register failed with error %d\n", retval);
return retval;
--
On Mon, 7 Apr 2008 17:42:55 -0600 Did we hear back from Badari on this? Thanks. --
Haven't heard a peep. :( Badari? Did it help? Thanks. /ac --
No. It didn't help. I had to hack "slot_nr = 0" all the time to boot my machine. I was trying to get some ppc expert to review the patch, as Alex was not comfortable with the fix/hack. Linas is no longer with IBM :( Thanks, Badari --
It didn't help because you're still getting a NULL deref? Or are you seeing some other failure mode? Thanks. /ac --
I am pretty sure it was NULL deref. I can double check again, but have to wait till tomorrow morning - don't have access to console right now. Thanks, Badari --
Alex, I verified this patch again. It seem to have worked this time :( I am not able to reproduce the problem with this problem (tried many times). Sorry for the false alarm :( Thanks, Badari --
Hi Badari, Thanks for testing. Please do let me know if you see any more problems. Ben, I would still appreciate a code review on: http://lkml.org/lkml/2008/4/7/319 Thanks! /ac --
Hi Ben, Could you take a look at this patch and tell me what I'm doing wrong? Thanks. /ac --
Hrm, I'll have a look but I'd need some context first... This is to fix a problem introduced by another serie of patches ? Can you give me some pointers here ? The pSeries PCI stuff can be tricky so I'll need some time tomorrow to context switch and remind myself of everything involved :-) So I'd like to make sure by that time, I have all the elements. Thanks, --
Hi Ben, Thanks for taking a look. The patch series that got accepted into -mm is here: http://lkml.org/lkml/2008/3/25/228 There have been several fixes on top of that series. I'm not sure what the canonical way to refer to -mm patches is, but if you navigate to this URL: http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-m... You'll also need to apply: pci-hotplug-introduce-pci_slot-fix.patch pci-hotplug-introduce-pci_slot-fix-fix.patch pci-hotplug-introduce-pci_slot-fix-2.patch pci-hotplug-introduce-pci_slot-fix-99.patch pci-hotplug-introduce-pci_slot-fix-3.patch pci-hotplug-acpi-pci-slot-detection-driver-fix.patch drivers-acpi-pci_slotc-fix-build-with-config_dmi=n.patch The basic idea, which I keep botching on pSeries, is that when we make a call to pci_hp_register, we now need to pass it: pci_hp_register(struct hotplug_slot *slot, struct pci_bus *bus, int slot_nr) I am having trouble figuring out the slot_nr argument. Basically, I want to get the devfn of the slot we're looking at. Thanks again. --
On Wed, 16 Apr 2008 11:11:27 -0600 I folded it all down to two patches. Unfortunately they don't apply very well to mainline: 1 out of 4 hunks FAILED -- saving rejects to file drivers/pci/hotplug/acpiphp_core.c.rej 2 out of 3 hunks FAILED -- saving rejects to file drivers/pci/hotplug/acpiphp_ibm.c.rej 1 out of 6 hunks FAILED -- saving rejects to file drivers/pci/hotplug/pciehp_core.c.rej 1 out of 4 hunks FAILED -- saving rejects to file drivers/pci/hotplug/shpchp_core.c.rej 1 out of 5 hunks FAILED -- saving rejects to file include/linux/pci.h.rej I guess because I reworked everything to fit on Greg's tree(s). fwiw, http://userweb.kernel.org/~akpm/bh.gz (agains -rc9) contains all of -mm up to and including pci-hotplug-acpi-pci-slot-detection-driver.patch and is suitable for review/repair/etc. --
Ok, I'll use that. Cheers, Ben. --
Hi Ben, *poke* Any update on this? Thanks, /ac --
Not yet no. I've looked a bit, but ran out of time, I'll look more this week-end or next week. I'm also trying to figure out who in IBM is responsible for that hotplug stuff so they can get involved too. BTW. How would you do if the answer was we simply can't declare hotplug "slots" ? ie. if I end up finding out (which is what I think will happen) that there is simply no way for us to know in advance any concept of "slot" with a devfn for hotplug ? Basically, when doing hotplug, the hypervisor sends us new bits of device-tree with things potentially ranging from host bridges, P2P bridges to devices, but I'm not certain at this stage we can know in advance where they'll hook up (in fact, for PHBs, we can't for sure) and thus even if we end up supporting hotplug of actual slots, we don't even know in advance the devfn where devices will appear. It's all hidden from us by the hypervisor. How would that fit in your infrastructure ? Can we just disable usage of your slots abstraction in our case ? Cheers, Ben. --
Hm, I may be getting lost in the twisty maze of pseries, but I guess I don't really understand this statement. Today, before calling rpaphp_register_slot, we first call rpaphp_enable_slot. rpaphp_enable_slot makes a call to pcibios_find_pci_bus which must succeed before we ever try to register the slot. So if there is a pci_bus, then it must have a ->self, which means it has a ->devfn. Right? Are you saying that it is not accurate to use this I suppose you could just pass in 0 as slot_nr/devfn. That is what my fixup patch did if it couldn't find a pci_bus->self. The result would be that for a given pci_bus, you would only see the first "slot" with this 0 slot_nr appear in sysfs, and it would have whatever name originally associated with your dn. I think if I were to understand more about this issue, we could figure out a better solution... thanks, /ac --
Yes but we can have a PHB with no self and suddenly the HV brings in multiple devices behind it. Oh well, I need to dig more, if I manage to get the bloody hotplug stuff working here at all.. In some case, we don't even have the PHB, so nothing will appear in sysfs but that's probably mostly harmless. The main thing is that those hotplug operations are never driven that way, they are driven by the management console which talks to a daemon which talks to the HV etc... Ben. --
I haven't looked in details yet, but I can already tell that things on pSeries aren't that simple because we don't necessarily know in advance about "slots"... When you add a physical PCI device to a partition, things can range from: - A device gets added to an existing bridge (ie. slot) - A whole P2P brigde gets added with that device below it (that's a slot too) - A whole PCI host bridge gets added with that device below it (or with a P2P bridge below it and the device below that). In the later case, it's hard to have any concept of slot since you don't know anything about the host bridge before it gets added to your partition :-) So I'm not sure how much we can use of your slot infrastructure, I'll have to look, I suspect it can cover some cases but not all of them. Cheers, Ben. --
