sysdev .suspend/.resume is invoked with irq disabled, GFP_ATOMIC should be used in lapic_resume. Without this patch, I got the following warning messages after resume. [ 109.780371] BUG: sleeping function called from invalid context at mm/slub.c:793 [ 109.780782] in_atomic(): 0, irqs_disabled(): 1, pid: 1391, name: bash [ 109.781024] Pid: 1391, comm: bash Not tainted 2.6.37-rc5+ #182 [ 109.781264] Call Trace: [ 109.781501] [<ffffffff8104156a>] __might_sleep+0xeb/0xf0 [ 109.781743] [<ffffffff8110d5e2>] slab_pre_alloc_hook.clone.33+0x28/0x31 [ 109.781987] [<ffffffff8110dcef>] __kmalloc+0x88/0x115 [ 109.782224] [<ffffffff81025e4a>] ? kzalloc.clone.19+0x13/0x15 [ 109.782465] [<ffffffff81025e4a>] kzalloc.clone.19+0x13/0x15 [ 109.782704] [<ffffffff81025ff0>] alloc_ioapic_entries+0x20/0x82 [ 109.782948] [<ffffffff81024201>] lapic_resume+0x3a/0x245 [ 109.783189] [<ffffffff813a8329>] ? cpufreq_resume+0x30/0xb0 [ 109.783431] [<ffffffff812ec4ee>] __sysdev_resume+0x25/0xc5 [ 109.783673] [<ffffffff812ec644>] sysdev_resume+0xb6/0xfb [ 109.783914] [<ffffffff81083256>] suspend_devices_and_enter+0x13c/0x1c1 [ 109.784159] [<ffffffff810833b8>] enter_state+0xdd/0x12e [ 109.784398] [<ffffffff81082a61>] state_store+0xae/0xcb [ 109.784640] [<ffffffff81225737>] kobj_attr_store+0x17/0x19 [ 109.784882] [<ffffffff81173311>] sysfs_write_file+0x114/0x150 [ 109.785124] [<ffffffff8111aebb>] vfs_write+0xac/0xff [ 109.785365] [<ffffffff8111b0c2>] sys_write+0x4a/0x6e [ 109.785604] [<ffffffff8100ac82>] system_call_fastpath+0x16/0x1b Signed-off-by: Zhang Rui <rui.zhang@intel.com> --- arch/x86/include/asm/io_apic.h | 2 +- arch/x86/kernel/apic/apic.c | 4 ++-- arch/x86/kernel/apic/io_apic.c | 7 +++---- 3 files changed, 6 insertions(+), 7 deletions(-) Index: linux-2.6/arch/x86/kernel/apic/io_apic.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c +++ ...
You can't do the allocation before disabling irqs when intr_remapping_enabled is set? --
yes, we can. The first idea came into my mind is to register a pm notifier callback to allocate/free the memory. But that one duplicates the code of alloc_ioapic_entries, which doesn't look nice, neither. Plus, is there any problem with this one? thanks, --
We try to avoid GFP_ATOMIC whenever possible and this seems like a particularly trivial case. You can simply move the alloc_ioapic_entries() and NULL check before local_irq_save() and GFP_KERNEL will work fine. --
I'm afraid not.
lapic_resume is invoked in sysdev_resume, which is done with irq
disabled, please refer to the code in kernel/power/suspend.c.
arch_suspend_disable_irqs();
BUG_ON(!irqs_disabled());
error = sysdev_suspend(PMSG_SUSPEND);
if (!error) {
if (!suspend_test(TEST_CORE) &&
pm_check_wakeup_events()) {
error = suspend_ops->enter(state);
events_check_enabled = false;
}
sysdev_resume();
}
arch_suspend_enable_irqs();
BUG_ON(irqs_disabled());
To pre-allocate the memory, we need to build a notifier bloack and call
register_pm_notifier.
thanks,
--
what about this one? Note that it just builds okay, I have not test the patch yet. pre-allocate the memory used in lapic_resume because lapic_resume, as a sysdev .resume callback, is always invoked with irq disabled. Without this patch, I got the following warning messages after resume. [ 109.780371] BUG: sleeping function called from invalid context at mm/slub.c:793 [ 109.780782] in_atomic(): 0, irqs_disabled(): 1, pid: 1391, name: bash [ 109.781024] Pid: 1391, comm: bash Not tainted 2.6.37-rc5+ #182 [ 109.781264] Call Trace: [ 109.781501] [<ffffffff8104156a>] __might_sleep+0xeb/0xf0 [ 109.781743] [<ffffffff8110d5e2>] slab_pre_alloc_hook.clone.33 +0x28/0x31 [ 109.781987] [<ffffffff8110dcef>] __kmalloc+0x88/0x115 [ 109.782224] [<ffffffff81025e4a>] ? kzalloc.clone.19+0x13/0x15 [ 109.782465] [<ffffffff81025e4a>] kzalloc.clone.19+0x13/0x15 [ 109.782704] [<ffffffff81025ff0>] alloc_ioapic_entries+0x20/0x82 [ 109.782948] [<ffffffff81024201>] lapic_resume+0x3a/0x245 [ 109.783189] [<ffffffff813a8329>] ? cpufreq_resume+0x30/0xb0 [ 109.783431] [<ffffffff812ec4ee>] __sysdev_resume+0x25/0xc5 [ 109.783673] [<ffffffff812ec644>] sysdev_resume+0xb6/0xfb [ 109.783914] [<ffffffff81083256>] suspend_devices_and_enter +0x13c/0x1c1 [ 109.784159] [<ffffffff810833b8>] enter_state+0xdd/0x12e [ 109.784398] [<ffffffff81082a61>] state_store+0xae/0xcb [ 109.784640] [<ffffffff81225737>] kobj_attr_store+0x17/0x19 [ 109.784882] [<ffffffff81173311>] sysfs_write_file+0x114/0x150 [ 109.785124] [<ffffffff8111aebb>] vfs_write+0xac/0xff [ 109.785365] [<ffffffff8111b0c2>] sys_write+0x4a/0x6e [ 109.785604] [<ffffffff8100ac82>] system_call_fastpath+0x16/0x1b Signed-off-by: Zhang Rui <rui.zhang@intel.com> --- arch/x86/kernel/apic/apic.c | 38 +++++++++++++++++++++++++++++++++----- 1 file changed, 33 insertions(+), 5 deletions(-) Index: linux-2.6/arch/x86/kernel/apic/apic.c =================================================================== --- ...
Quite frankly, I very much prefer the simple patch doing an atomic allocation. Using GFP_KERNEL in a sysdev callback is a plain bug and should be fixed, preferably in the simplest possible way (ie. replace GFP_KERNEL with GFP_ATOMIC). The question whether or not we can avoid allocating memory in there is a different one and I'm not sure if the solution below is the most straightforward one. Thanks, --
