CONFIG_PREEMPT causes corruption of application's FPU stack

To: <linux-kernel@...>
Date: Saturday, May 17, 2008 - 12:31 pm

I am running the Einstein@home application (version 4.35, 
http://einstein.phys.uwm.edu). This application does lots of computations, 
mostly with FPU and SSE instructions.
After I started experimenting with real-time optimized kernels, the 
application began to crash with floating point errors like the one in the 
following message:

APP DEBUG: Application caught signal 8.

FPU status word ffffa0e1, flags:  ERR_SUMM STACK_FAULT PRECISION INVALID
Obtained 6 stack frames for this thread.
Use gdb command: 'info line *0xADDRESS' to print corresponding line 
numbers.
einstein_S5R3_4.35_i686-pc-linux-gnu[0x8069e7e]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x818d436]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x805db8f]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x806b11c]
/lib/libc.so.6(__libc_start_main+0xe0)[0xb7e14fe0]
einstein_S5R3_4.35_i686-pc-linux-gnu(shmat+0x59)[0x804bda1]
Stack trace of LAL functions in worker thread:
GetSemiCohToplist at line 3177 of 
file /home/bema/einsteinathome/HierarchicalSearch/EaH_build_release_einstein_S5R3_4.35/extra_sources/lalapps-CVS/src/pulsar/hough/src2/HierarchicalSearch.c
At lowest level status code = 0, description: NO LAL ERROR REGISTERED
called boinc_finish
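
For readers who want to check the flags line against the raw value, here is a minimal sketch in C that decodes the exception bits of an x87 status word. It assumes the standard x87 layout (exception flags in bits 0-5, stack fault in bit 6, error summary in bit 7, top-of-stack pointer in bits 11-13) and looks only at the low 16 bits of the value. The names ERR_SUMM, STACK_FAULT, PRECISION and INVALID are taken from the message above; the other four names and the helper functions fpu_top() and fpu_describe() are hypothetical and are not taken from the Einstein@home sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One entry per status-word bit we report, highest bit first so the
 * output order matches the application's message. */
struct fpu_flag {
    unsigned bit;
    const char *name;
};

static const struct fpu_flag fpu_flags[] = {
    { 7, "ERR_SUMM" },     /* ES: error summary                   */
    { 6, "STACK_FAULT" },  /* SF: register stack over/underflow   */
    { 5, "PRECISION" },    /* PE: inexact result                  */
    { 4, "UNDERFLOW" },    /* UE: name assumed for illustration   */
    { 3, "OVERFLOW" },     /* OE: name assumed for illustration   */
    { 2, "ZERO_DIVIDE" },  /* ZE: name assumed for illustration   */
    { 1, "DENORMAL" },     /* DE: name assumed for illustration   */
    { 0, "INVALID" },      /* IE: invalid operation               */
};

/* Top-of-stack pointer, bits 11-13 of the status word. */
static unsigned fpu_top(uint32_t sw)
{
    return (sw >> 11) & 0x7u;
}

/* Write the names of all set flags into out, separated by spaces.
 * Returns the number of characters written (excluding the NUL). */
static size_t fpu_describe(uint32_t sw, char *out, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; i < sizeof fpu_flags / sizeof fpu_flags[0]; i++) {
        if (!(sw & (1u << fpu_flags[i].bit)))
            continue;
        int n = snprintf(out + used, len - used, "%s%s",
                         used ? " " : "", fpu_flags[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            out[used] = '\0';  /* drop a partially written name */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

Passing 0xffffa0e1 to fpu_describe() yields the same four flag names as the message above, and fpu_top() returns 4 for that value. With STACK_FAULT set, the C1 bit (bit 9) distinguishes overflow from underflow of the register stack; it is clear in this value, which under the standard x87 rules points to a stack underflow.
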

I tracked this down to a single kernel configuration option. If 
CONFIG_PREEMPT is set to 'y' the application will start crashing. If 
CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the application 
will run without errors.

The problem is reproducible in the sense that the error always occurs when 
CONFIG_PREEMPT is set, but the time to the first occurrence varies greatly, 
from a few minutes to more than 10 CPU hours.

I found this error first on an openSUSE kernel 2.6.22.17-0.1-rt. I verified 
the problem on the following kernel versions:

openSUSE 2.6.22.17-0.1-default
openSUSE 2.6.23.17-ccj64-rt
kernel.org 2.6.26-rc1
kernel.org 2.6.26-rc2-git5

My CPU is an Intel Core2Duo 6420, running two instances of the Einstein 
application in 32-bit mode. From a discussion on the Einstein message boards 
I know that other users of the application are also affected.

Please let me know if you need any additional information to track this 
down.
              Jürgen
--
Messages in current thread:
CONFIG_PREEMPT causes corruption of application's FPU stack, Jürgen Mell, (Sat May 17, 12:31 pm)