Re: [Bug #10815] 2.6.26-rc4: RIP find_pid_ns+0x6b/0xa0

From: Alexey Dobriyan
Date: Monday, June 23, 2008 - 5:50 pm

[rcutorture failures with PREEMPT_RCU]

Status update:
* The bug reproduces on another box with the very same symptoms:
  an SMP=y kernel booted with maxcpus=1 occasionally fails, SMP=n is fine.
  That box is also a Core 2 Duo, x86_64 [1]

  The race window is wide -- 60 seconds of rcutorture is enough to hit it.

So far tried without effect:
	not doing SMP-alternatives
	NO_HZ=y/n
	HIGH_RES_TIMERS=y/n
	compiling with gcc 3.4.6/4.1.2
	different HZ
	s/asm/asm volatile/g in the percpu asm code and the PDA asm code
	turning on and off varying CONFIG_DEBUG_ options
	CONFIG_DEBUG_PREEMPT
	softlockup on/off
	making x86_64 cpu_idle() same as 32-bit one wrt rcu_pending et al
	sched_setaffinity() in __synchronize_sched doesn't fail

I have probably forgotten something, but not a single one of these makes
the bug go away in the SMP=y case.

Using the SMP percpu code for the UP case failed miserably: the patch was
incomplete and caused a hard hang. I'm still keeping this option for
doomsday.

I'm going to try a 32-bit setup and read the rcupreempt disassembly with
a microscope.

[1]

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz
stepping	: 11
cpu MHz		: 800.000
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida
bogomips	: 4791.74
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz
stepping	: 11
cpu MHz		: 800.000
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida
bogomips	: 4787.76
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:
-----------------------------------------------------------------------------
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU          6400  @ 2.13GHz
stepping	: 2
cpu MHz		: 2135.041
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips	: 4272.61
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU          6400  @ 2.13GHz
stepping	: 2
cpu MHz		: 2135.041
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips	: 4270.14
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:
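
For anyone comparing the two dumps above, the differences visible by eye
are the model name, stepping, cache size and the "ida" flag (plus the
current clock and bogomips). A minimal sketch of doing that comparison
mechanically follows. The helper names parse_cpuinfo() and diff_cpus() are
made up for illustration, and the set of ignored fields is an assumption
about what is noise here; nothing below comes from the kernel tree.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into a list of dicts, one per CPU."""
    cpus, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines and separator rows carry no field
        key = key.strip()
        # A "processor" line starts a new record, even without a blank line.
        if key == "processor" and current:
            cpus.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        cpus.append(current)
    return cpus


# Fields assumed to be per-core or run-time noise (hypothetical choice).
IGNORED = ("processor", "core id", "bogomips", "cpu MHz")


def diff_cpus(a, b, ignore=IGNORED):
    """Return {field: (value_a, value_b)} for fields that differ.

    For "flags" the values are (only_in_a, only_in_b) as sorted lists.
    """
    out = {}
    for key in sorted(set(a) | set(b)):
        if key in ignore:
            continue
        va, vb = a.get(key), b.get(key)
        if key == "flags":
            sa, sb = set((va or "").split()), set((vb or "").split())
            if sa != sb:
                out[key] = (sorted(sa - sb), sorted(sb - sa))
        elif va != vb:
            out[key] = (va, vb)
    return out


if __name__ == "__main__":
    import sys

    # Usage: python cpudiff.py boxA.cpuinfo boxB.cpuinfo
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        cpu_a = parse_cpuinfo(fa.read())[0]
        cpu_b = parse_cpuinfo(fb.read())[0]
    for field, (va, vb) in diff_cpus(cpu_a, cpu_b).items():
        print(f"{field}: {va!r} vs {vb!r}")
```

Only the first processor of each file is compared, on the assumption that
both cores of one package report the same static fields.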

