* Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca) wrote:This should work. Untested for now. Can you give it a try ? Fix Immediate Deadlock Jason Baron <jbaron@redhat.com> : In testing this patch, i've run across a deadlock...apply_imv_update() can get called again before, ipi_busy_loop() has had a chance to finsh, and set wait_sync back to its initial value. This causes ipi_busy_loop() to get stuck indefinitely and the subsequent apply_imv_update() hangs. I've shown this deadlock below using nmi_watchdog=1 in item 1). Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> : Hrm, yes, Jan pointed out the exact same problem in my ltt-test-tsc TSC test module in LTTng a few days ago. His fix implied to add another barrier upon which the smp_call_function() caller must wait for the ipis to finish. Since this immediate value code does the same I did in my ltt-test-tsc code, the same fix will likely apply. Thanks to Jason Baron for finding this bug and proposing an initial implementation. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Jason Baron <jbaron@redhat.com> --- kernel/immediate.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) Index: linux-2.6-lttng/kernel/immediate.c =================================================================== --- linux-2.6-lttng.orig/kernel/immediate.c 2008-02-26 18:16:10.000000000 -0500 +++ linux-2.6-lttng/kernel/immediate.c 2008-02-26 18:29:32.000000000 -0500 @@ -53,12 +53,12 @@ static void ipi_busy_loop(void *arg) do { /* Make sure the wait_sync gets re-read */ smp_mb(); - } while (atomic_read(&wait_sync) > loop_data.value); + } while (atomic_read(&wait_sync) > 2 * loop_data.value); atomic_dec(&wait_sync); do { /* Make sure the wait_sync gets re-read */ smp_mb(); - } while (atomic_read(&wait_sync) > 0); + } while (atomic_read(&wait_sync) > loop_data.value); /* * Issuing a synchronizing instruction must be done on each CPU before * reenabling interrupts after modifying an instruction. Required by @@ -67,6 +67,7 @@ static void ipi_busy_loop(void *arg) sync_core(); flush_icache_range(loop_data.imv->imv, loop_data.imv->imv + loop_data.imv->size); + atomic_dec(&wait_sync); local_irq_restore(flags); } @@ -113,7 +114,7 @@ static int apply_imv_update(const struct kernel_text_lock(); get_online_cpus(); online_cpus = num_online_cpus(); - atomic_set(&wait_sync, 2 * online_cpus); + atomic_set(&wait_sync, 3 * online_cpus); loop_data.value = online_cpus; loop_data.imv = imv; smp_call_function(ipi_busy_loop, NULL, 1, 0); @@ -122,7 +123,7 @@ static int apply_imv_update(const struct do { /* Make sure the wait_sync gets re-read */ smp_mb(); - } while (atomic_read(&wait_sync) > online_cpus); + } while (atomic_read(&wait_sync) > 2 * online_cpus); text_poke((void *)imv->imv, (void *)imv->var, imv->size); /* @@ -133,6 +134,12 @@ static int apply_imv_update(const struct atomic_dec(&wait_sync); flush_icache_range(imv->imv, imv->imv + imv->size); + /* + * Wait until all other CPUs are done so that we don't overwrite + * loop_data or wait_sync prematurely. + */ + while (unlikely(atomic_read(&wait_sync) > 1)) + cpu_relax(); local_irq_restore(flags); put_online_cpus(); kernel_text_unlock(); -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
| Borislav Petkov | 2.6.23-rc1: no setup signature found... |
| Andrew Morton | Re: [PATCH] Memory management livelock |
| Greg KH | [GIT PATCH] driver core patches against 2.6.24 |
| Steven King | ti_usb_3410_5052 breakage in 2.6.24-rc1 |
git: | |
| Linus Torvalds | cleaner/better zlib sources? |
| Theodore Ts'o | [PATCH] Add --no-reuse-delta, --window, and --depth options to git-gc |
| Karl | Re: [PATCH] Add a birdview-on-the-source-code section to the user manual |
| Yossi Leybovich | corrupt object on git-gc |
| GVG GVG | ssh_exchange_identification: Connection closed by remote host |
| Richard Stallman | Re: Real men don't attack straw men |
| David Newman | setting dscp or tos bits |
| Stijn | Re: [i386/Thinkpad T41]USB mouse + Xorg obsd 4.1 |
| Quentin Garnier | [cube@cubidou.net: Re: Moving ethfoo in the main tree] |
| cube | Moving ethfoo in the main tree |
| Blair Sadewitz | PCI latency timer values |
| Oliver Gould | kqemu: driver(9)/filedesc(9) questions |
