Hi.
I looked at the sys_init_module() and found that the ->init callback
for the module is called without the module_mutex held and *after*
the module's symbols are exported. Doesn't this create the race when
loading two modules in parallel? Like this.Consider the first module to be (without any internal locking for
simplicity)=== foo.c ===
static struct list_head foo_list;void foo_register(struct list_head *elem)
{
list_add(elem, &foo_list);
}
EXPORT_SYMBOL(foo_register);static int foo_init(void)
{
INIT_LIST_HEAD(&foo_list);
}
module_init(foo_init);
===and the second one
=== bar.c ===
struct bar_struct {
struct list_head list;
...
} bar;static int bar_init(void)
{
foo_register(&bar.list);
}
module_init(bar_init);
===If I understand the sys_init_module() right, the following code
flow is possible:1CPU 2CPU
sys_init_module(/* foo module */) sys_init_module(/* bar module */)
mutex_lock(&module_mutex);
load_module();
/* export the foo_init here */
mutex_unlock(&module_mutex);
mutex_lock(&module_mutex);
load_module();
/* resolve the foo_init here */
mutex_unlock(&mudule_mutex);
mod->init(); /* bar_init */
/*
* OOPS! The foo_list is not ready yet, because the foo_init
* is not called yet.
*/
mod->init(); /* foo_init... too late */Is this analysis correct?
Thanks,
Pavel
-
Hi Pavel,
In a word, no. See "strong_try_module_get()".
Cheers,
Rusty.
-
Hi Rusty,
I've seen a symbol-resolving race on s390. The qeth module uses symbols
from qdio and although the loading order seems correct and the qdio
symbols should be available the following error appears:qdio: loading QDIO base support version 2
qeth: Unknown symbol qdio_synchronize
qeth: Unknown symbol do_QDIO
qeth: Unknown symbol qdio_initialize
qeth: Unknown symbol qdio_cleanup
qeth: Unknown symbol qdio_activate
qeth: Unknown symbol qdio_synchronize
qeth: Unknown symbol do_QDIO
qeth: Unknown symbol qdio_initialize
qeth: Unknown symbol qdio_cleanup
qeth: Unknown symbol qdio_activate
qeth: loading qeth S/390 OSA-Express driver
qeth: Device 0.0.f5f0/0.0.f5f1/0.0.f5f2 is a OSD Express card (level: 087a) with link type OSD_1000 (portname: OSAPORT)
qeth: Hardware IP fragmentation not supported on eth0
qeth: VLAN enabled
qeth: Multicast enabled
qeth: IPV6 enabled
qeth: Broadcast enabled
qeth: Using SW checksumming on eth0.
qeth: Outbound TSO enabledAfter that both drivers work fine but I'm curious why this happens.
Cheers,
Jan-
FWIW i see such messages all the time with usb serial on my workstation too.
It's not new.For some reason it never makes it to the log file though, i only see
it flashing by on the console.-Andi
-
Looks like qdio does something which triggers qeth to load, but of course qdio
isn't finished initializing yet so its symbols aren't available.It's not obvious what's triggering the load, but you could probably find it by
using printk's through qdio.c's init_QDIO().Cheers,
Rusty.
-
Digging through the module loader I found what triggers the error...
CPU0 (sys_init_module for qdio) CPU1 (sys_init_module for qeth)
mutex_lock()
-> load_module()
mod->state = COMING
mutex_unlock()
mutex_lock()
init().......takes some time load_module()
-> resolve_symbols()
-> use_module()
-> stong_try_module_get() bails out
because state == COMING
-> simplify_symbols() complains with the warningmutex_lock()
mod->state = LIVE
mutex_unlock()So is it correct that sys_init_module() is called for qeth even if qdio is
not yet in MODULE_STATE_LIVE?-jang
-
Jan - did you get anything?
Jon.
-
I still think this is a bug in the module loader since qdio does not
depent on qeth, qdio can be modprobe'd without automatically loading
qeth. Adding a printk to the end of init_QDIO() inidicates that
qeth_init() runs _after_ init_QDIO() is finished.Here is the debug output if DEBUGP is enabled in module.c:
d to find symbol journal_get_write_access
Failed to find symbol journal_unlock_updates
Failed to find symbol journal_lock_updates
Failed to find symbol journal_stop
Failed to find symbol journal_extend
Failed to find symbol journal_restart
Failed to find symbol journal_start
load_module: umod=0000020000000010, len=265456, uargs=00000000800905a0
Core section allocation order:
.text
.fixup
.exit.text
.rodata
.rodata.str1.2
__ex_table
__versions
.eh_frame
.symtab
.strtab
.data
.data.rel.ro
.data.rel
.data.rel.local
.gnu.linkonce.this_module
.bss
Init section allocation order:
.init.text
final section addresses:
0x208953a8 .text
0x208ae0a8 .fixup
0x208ae0f0 .exit.text
0x20815000 .init.text
0x208ae130 .rodata
0x208af2e8 .rodata.str1.2
0x208b1ed0 __ex_table
0x208b1f40 __versions
0x208b57c0 .eh_frame
0x208c00cc .data
0x208c00d0 .data.rel.ro
0x208c0428 .data.rel
0x208c0970 .data.rel.local
0x208c0e00 .gnu.linkonce.this_module
0x208c5100 .bss
0x208b7bd0 .symtab
0x208bd558 .strtab
ext3 does not use jbd!
Allocating new usage for ext3.
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
Absolute symbol: 0x00000000
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
ext3 uses jbd!
load_module: umod=0000020000000010, len=148432, uargs=00000000800...
Of course it does: it has to.
qeth needs qdio, and qdio hasn't finished init when qeth tries to load.
Something tries to load qeth again, and this time it works (qdio's init has
presumably finished).We fail rather than sleep in the "dependency isn't ready" case. Partially
because it's not happened before, but partially because we risk nasty loops.Cheers,
Rusty.
-
If we fail since we have to in the "dependency isn't ready" case then
the warning seems quite useless. I'll send a patch in a seperate mail
to remove the warning.-jang
-
| Peter Zijlstra | [RFC][PATCH 7/7] lockdep: spin_lock_nest_lock() |
| Gabriel C | Re: 2.6.24-rc2-mm1 |
| Andrew Morton | Re: [PATCH 2.6.21] cramfs: add cramfs Linear XIP |
| Jiri Kosina | Re: 2.6.21-rc5-mm4 |
git: | |
| Gregory Haskins | [RFC PATCH 00/17] virtual-bus |
| Jarek Poplawski | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| David Miller | [GIT]: Networking |
| Gerrit Renker | [PATCH 0/37] dccp: Feature negotiation - last call for comments |
