Re: [DRBD-user] Crash on 2.6.22

Previous thread: 2.6.23-rc8: cannot make netconsole work by Andrey Borzenkov on Friday, September 28, 2007 - 2:27 am. (3 messages)

Next thread: Re: Serial ATA does not find partitions (Hitachi HD, new? ATI controller) where old SATA works by Tejun Heo on Friday, September 28, 2007 - 2:45 am. (10 messages)
From: Laurent CARON
Date: Friday, September 28, 2007 - 2:00 am

Hi,

I ran into a rather strange problem (at least to me) on the first
node of our 2-node cluster.

This is basically an imap/smtp/http proxy server.

This morning, one of the imapd processes started to use a lot of CPU
and memory.

The OOM killer kicked in and killed slapd, imapd, amavisd, ...

I then restarted those processes manually, and it went fine.

A few moments later I got the following messages on my ssh terminal:

kernel: Bad page state in process 'swapper'
kernel: Bad page state in process 'swapper'
kernel: page:c1032a40 flags:0x40000400 mapping:00000000 mapcount:0 count:0
kernel: page:c1032a40 flags:0x40000400 mapping:00000000 mapcount:0 count:0
kernel: Trying to fix it up, but a reboot is needed
kernel: Trying to fix it up, but a reboot is needed
kernel: Backtrace:
kernel: Backtrace:
kernel: Bad page state in process 'swapper'
kernel: Bad page state in process 'swapper'
kernel: page:c1032a40 flags:0x40000400 mapping:00000000 mapcount:0 count:0
kernel: page:c1032a40 flags:0x40000400 mapping:00000000 mapcount:0 count:0
kernel: Trying to fix it up, but a reboot is needed
kernel: Trying to fix it up, but a reboot is needed
kernel: Backtrace:
kernel: Backtrace:
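
For anyone comparing reports like this one, here is a minimal sketch that splits the "page:" detail line into its fields and collapses repeated lines. The field layout is taken only from the excerpt above and may differ on other kernel versions; the function names are hypothetical. It lists which bit positions are set in "flags" but does not try to name them, since that mapping depends on the kernel version.

```python
import re

# Field layout copied from the log excerpt above (assumption: other kernel
# versions may print different fields or a different order).
PAGE_RE = re.compile(
    r"page:(?P<page>[0-9a-f]+)\s+"
    r"flags:(?P<flags>0x[0-9a-f]+)\s+"
    r"mapping:(?P<mapping>[0-9a-f]+)\s+"
    r"mapcount:(?P<mapcount>-?\d+)\s+"
    r"count:(?P<count>-?\d+)"
)


def parse_bad_page(line):
    """Return the fields of a 'Bad page state' detail line, or None."""
    m = PAGE_RE.search(line)
    if m is None:
        return None
    flags = int(m.group("flags"), 16)
    return {
        "page": int(m.group("page"), 16),
        "flags": flags,
        # Positions of the bits that are set; no flag names are assigned.
        "set_bits": [b for b in range(flags.bit_length()) if (flags >> b) & 1],
        "mapping": int(m.group("mapping"), 16),
        "mapcount": int(m.group("mapcount")),
        "count": int(m.group("count")),
    }


def unique_reports(lines):
    """Collapse repeated detail lines, keeping first-seen order."""
    seen = []
    for line in lines:
        rec = parse_bad_page(line)
        if rec is not None and rec not in seen:
            seen.append(rec)
    return seen
```

Fed the excerpt above, unique_reports() yields a single record, since every detail line refers to the same page (c1032a40) with the same flags.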


The machine then locked up completely and rebooted (thanks to the watchdog).

This server is an HP DL380 G5 with 12 GB of memory, 8 SAS disks, ... a quite
standard box.


The $HOME directories are stored on a drbd (version: 0.7.24
(api:79/proto:74)) partition (with an XFS filesystem).

The only 'non-standard' thing I use is a swap file instead of a swap
partition.

$ free
             total       used       free     shared    buffers     cached
Mem:      12471932    7420364    5051568          0       3984    6680868
-/+ buffers/cache:     735512   11736420
Swap:       393208          0     393208


$ grep swap /etc/fstab
/var/tmp/swapfile swap    swap    defaults    0   0
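
Note that the fstab entry only shows what is configured; what the kernel is actually using is listed in /proc/swaps, which has a "Type" column reading "file" or "partition". Below is a minimal sketch for parsing that listing, assuming the usual five whitespace-separated columns (Filename, Type, Size, Used, Priority). The function names are hypothetical, and no sample output from this machine is implied.

```python
def parse_proc_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts (sizes in KiB).

    Assumes five whitespace-separated columns after a one-line header:
    Filename, Type, Size, Used, Priority.
    """
    areas = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or unexpected lines
        name, kind, size, used, priority = fields
        areas.append(
            {
                "name": name,
                "type": kind,
                "size_kib": int(size),
                "used_kib": int(used),
                "priority": int(priority),
            }
        )
    return areas


def swap_files(areas):
    """Return the names of swap areas backed by a regular file."""
    return [a["name"] for a in areas if a["type"] == "file"]
```

On a live system this could be used as swap_files(parse_proc_swaps(open("/proc/swaps").read())).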


Might this be the cause (or one of the causes) of this problem?

.config is available here: ...
From: Hannes Dorbath
Date: Friday, September 28, 2007 - 2:05 am

Was the deadlock with 2.6.22 only in 0.8.x? Is 0.7.x fine with 2.6.22?


-- 
Regards,
Hannes Dorbath
-

From: Laurent CARON
Date: Friday, September 28, 2007 - 2:12 am

I only experienced it with 2.6.22.

Since I pulled the latest svn tree of drbd 0.7.x, it should be pretty
deadlock-safe.

Shouldn't it?

Laurent
-

From: Stefan Seifert
Date: Friday, September 28, 2007 - 2:21 am

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


The deadlock also occurs with 0.7.x. A patch for that is floating around.

Regards,
Stefan Seifert

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4-svn0 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFG/Meh1QuEJQQMVrgRAsJxAJ46+//VfKIxFBmauhsXdq0DFpo6wQCeOuHa
clUidp3JcwzW8ZDGeFdAXhM=
=xD65
-----END PGP SIGNATURE-----
-

From: Laurent CARON
Date: Friday, September 28, 2007 - 2:25 am

Here is a transcript from a mail I sent to Lars Ellenberg

It should 'normally' be fixed.

Am I wrong?

Thanks





now, since you are so pushy today  :)
I just checked in something (r3062),
which should do the job, but is untested by me so far.
please test and report back.
-
