Find command causes occasional panic

Submitted by Anonymous
on January 25, 2008 - 8:03am

Lately I have been seeing kernel panics caused by the find command (from GNU findutils 4.1.20) on kernel 2.6.9-55.0.12.EL with CentOS.
Can anyone translate what is happening based on what I pulled from /var/log/messages after a reboot?

progmain kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
progmain kernel: printing eip:
progmain kernel: c018fa8d
progmain kernel: *pde = 1a3b7001
progmain kernel: Oops: 0000 [#1]
progmain kernel: SMP
progmain kernel: Modules linked in: md5(U) ipv6(U) parport_pc(U) lp(U) parport(U) tun(U) thermal(U) processor(U) uhci_hcd(U) ehci_hcd(U) bcm5700(U) tg3(U) sg(U) st(U) aic79xx(U) ext3(U) jbd(U) ata_piix(U) libata(U) aacraid(U) sd_mod(U) scsi_mod(U)
progmain kernel: CPU: 2
progmain kernel: EIP: 0060:[] Not tainted VLI
progmain kernel: EFLAGS: 00010246 (2.6.9-55.0.12.EL.p44g)
progmain kernel: EIP is at sysfs_readdir+0xc7/0x1ea
progmain kernel: eax: 00000000 ebx: f7cd3580 ecx: ffffffff edx: 00000002
progmain kernel: esi: f7cd3584 edi: 00000000 ebp: f7cefb84 esp: d8005f44
progmain kernel: ds: 007b es: 007b ss: 0068
progmain kernel: Process find (pid: 11024, threadinfo=d8004000 task=da9f3150)
progmain kernel: Stack: 0000000e 00000000 00001941 00000004 00000000 c82d984c 00018800 c82d9840
progmain kernel: c016b9f6 d8005fa0 ed0e5340 c02f6340 ed0e5340 f7c76550 f7c765bc c016b6cb
progmain kernel: d8005fa0 c016b9f6 09cb4f84 ffffffda ed0e5340 00000000 c016bcc4 09cb512c
progmain kernel: Call Trace:
progmain kernel: [] filldir64+0x0/0x10d
progmain kernel: [] vfs_readdir+0x77/0x8c
progmain kernel: [] filldir64+0x0/0x10d
progmain kernel: [] sys_getdents64+0xb2/0xba
progmain kernel: [] syscall_call+0x7/0xb
progmain kernel: [] unix_shutdown+0xa9/0x11e
progmain kernel: Code: 14 85 ff 75 12 8b 36 39 74 24 14 75 ee 83 c4 2c 31 c0 5b 5e 5f 5d c3 89 d8 e8 7a eb ff ff b9 ff ff ff ff 89 44 24 10 89 c7 31 c0 ae f7 d1 49 0f b7 43 1c 8b 53 20 66 c1 e8 0c 89 54 24 08 8b
progmain kernel: <0>Fatal exception: panic in 5 seconds
progmain syslogd 1.4.1: restart.

I am at a loss as to why find would do this. The command that was being used at the time was roughly: 'find / -name blah* -exec rm -f {} \;'

deleting stuff in /sys

Submitted by strcmp
on January 25, 2008 - 1:32pm

Of course this is a kernel bug that should not happen. The function that crashed is sysfs_readdir() (readdir gathers the list of filenames in a directory), and /sys is a strange place for recursive deletes anyway. If your recursive delete is the only trigger for this bug, the best workaround is to stop doing that. Deleting files in the virtual filesystems /sys and /proc should not work, and you should not attempt it there or in /dev, so rewrite your find to skip these directories. Maybe deleting something in /sys (which should have failed, but that may be the bug) caused the oops later on; there have historically been refcounting problems in /sys.
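
A minimal sketch of what skipping those directories could look like, assuming a POSIX shell and GNU find. The function name safe_delete and its two arguments are made up for illustration; they are not from the original post. The pattern is passed quoted so that the shell does not expand blah* before find sees it.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post):
#   safe_delete ROOT PATTERN
# Deletes regular files under ROOT whose names match PATTERN,
# without descending into ROOT/sys, ROOT/proc or ROOT/dev.
safe_delete() {
    root=$1
    pattern=$2
    # Strip one trailing slash so that "/" yields "/sys", "/proc", "/dev".
    base=${root%/}
    # -prune stops find from entering the listed directories;
    # everything else falls through to the -name / -exec branch.
    find "$root" \
        \( -path "$base/sys" -o -path "$base/proc" -o -path "$base/dev" \) -prune \
        -o -type f -name "$pattern" -exec rm -f {} \;
}
```

For the case in this thread it would be called as safe_delete / 'blah*'. Replacing -exec rm -f {} \; with -print gives a dry run that only lists what would be removed. Another option is find's -xdev, which stays on the starting filesystem and so never descends into /sys or /proc when they are separate mounts.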

Have you entered this into the CentOS bug tracker? They and Red Hat may be the only people still willing to maintain 2.6.9 (+ RH patches).
