Linux: Kernel Crash Dumps

Submitted by Jeremy on October 6, 2005 - 10:33am.
Linux news

A kernel crash dump is a snapshot of system state taken at the moment the kernel crashed, useful for finding and debugging the problem that caused the crash. There is no standard mechanism for automatically collecting a crash dump on Linux, but a number of existing projects are working toward this goal. A "Linux Kernel Dump Summit" was recently mentioned on the lkml, with participants from some of the many crash dump projects looking to standardize the dump process and the information collected. A followup email noted, "as memory size grows, the time and space for capturing kernel crash dumps really matter." It went on to examine two strategies: partial dumps, and full dumps that are compressed. The former risks omitting information needed for proper debugging, while the latter risks greatly increasing the time required to collect a dump.

There are a number of existing projects for automatically collecting kernel crash dumps on Linux, including Linux Kernel Crash Dump (LKCD), Mini Kernel Dump (mkdump), kdump, and diskdump. Some of these projects also include tools for examining the resulting dump files. Other projects focus solely on tools for analyzing kernel crash dumps, including the Perl-based Alicia (the Advanced LInux Crash-dump Interactive Analyzer) and Red Hat's crash analysis tool, "loosely based on the SVR4 UNIX crash command, but significantly enhanced by completely merging it with the GNU gdb debugger."


From: Hiro Yoshioka [email blocked]
To:  linux-kernel
Subject: Linux Kernel Dump Summit 2005
Date:	Wed, 21 Sep 2005 20:55:50 +0900 (JST)

To whom may concern

We had a Linux Kernel Dump Summit 2005.

The participants are

Dump tools Session
diskdump -- Fujitsu
mkdump   -- NTT Data Intellilink
LTD      -- Hitachi
kdump    -- Turbolinux
Summary  -- Miracle Linux

Dump Analysis tools Session
Alicia/crash -- Uniadex

Other participants are
VA Linux/NEC/NSSOL/IPA/OSDL/Toshiba

Some discussion topics are (but not limited to)

- What kind of information do we need?
    trace information
    all of registers
    the last log of panic, oops
    LTD (Linux Tough Dump) has some nice features

- We need a partial dump
- We have to minimize the down time

- We have to dump all memory
    how can we distinguish from the kernel and user if
    kernel data is corrupted

- How we are not able to dump data
    device
    power management
    we need a generic mechanism to reset a device

- Hang
    NMI watch dog
    mount

- It is very difficult to debug a memory corrupt bug
- hardware error

- Where will we go to?
   IHV and Linux Kernel community collaboration are needed

Dump Analysis tools are very important

- There is a concern that the development process of 'crash'
is not open.
- Do we have to extend gdb?
- We'd like to collaborate 'crash'

- kexec/kdump, mkdump, LTD, all of them use the second kernel
to dump it.

- We have to share the test data, check list, test tools of
dump tool developments.

We agree to have the Linux Kernel Dump Summit.

Regards,
  Hiro


From: OBATA Noboru [email blocked]
Subject: Re: Linux Kernel Dump Summit 2005
Date: Thu, 06 Oct 2005 21:17:18 +0900 (JST)

Hi, Hiro,

On Wed, 21 Sep 2005, Hiro Yoshioka wrote:
>
> We had a Linux Kernel Dump Summit 2005.

> - We need a partial dump
> - We have to minimize the down time
>
> - We have to dump all memory
>     how can we distinguish from the kernel and user if
>     kernel data is corrupted

As memory size grows, the time and space for capturing kernel
crash dump really matter.

We discussed two strategies in the dump summit.

  1. Partial dump
  2. Full dump with compression

PARTIAL DUMP
============

Partial dump captures only pages that are essential for later
analysis, possibly by using some mark in mem_map[].

This certainly reduces both time and space of crash dump, but
there is a risk because no one can guarantee that a dropped page
is really unnecessary in analysis (it can be a tragedy if
analysis went unsolved because of the dropped page).

Another risk is a corruption of mem_map[] (or other kernel
structure), which makes the identification of necessary pages
unreliable.

So there would be best if a user can select the level of partial
dump.  A careful user may always choose a full dump, while a user
who is tracking the well-reproducible kernel bug may choose fast
and small dump.

FULL DUMP WITH COMPRESSION
==========================

Those who still want a full dump, including me, are interested
in dump compression.  For example, the LKCD format (at least v7
format) supports pagewise compression with the deflate
algorithm.  A dump analyze tool "crash" can transparently
analyze the compressed dump file in this format.

The compression will reduce the storage space at certain degree,
and may also reduce the time if a dump process were I/O bounded.

WHICH IS BETTER?
================

I wrote a small compression tool for LKCD v7 format to see how
effective the compression is, and it turned out that the time
and size of compression were very much similar to that of gzip,
not surprisingly.

Compressing a 32GB dump file took about 40 minutes on Pentium 4
Xeon 3.0GHz, which is not good enough because the dump without
compression took only 5 minutes; eight times slower.

Besides, the compress ratios were somewhat picky.  Some dump
files could not be compressed well (the worst case I found was
only 10% reduction in size).

After examining the LKCD compress format, I must conclude that
the partial dump is the only way to go when time and size really
matter.

Now I'd like to see how effective the existing partial dump
functionalities are.

Regards,

--
OBATA Noboru [email blocked]
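As a rough check on the figures in the message above, the two dump times can be converted into average throughput. This is only a back-of-the-envelope sketch: it assumes "32GB" means 32 GiB and that both runs processed the whole image, neither of which the message states explicitly. The function name is illustrative.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def throughput_mib_per_s(size_bytes: int, minutes: float) -> float:
    """Average throughput in MiB/s for processing size_bytes in the given time."""
    return size_bytes / MIB / (minutes * 60)

# Assumption: "32GB" in the message is treated as 32 GiB.
dump_size = 32 * GIB

raw = throughput_mib_per_s(dump_size, 5)          # dump without compression
compressed = throughput_mib_per_s(dump_size, 40)  # dump with pagewise deflate
slowdown = 40 / 5                                 # the "eight times slower" figure

print(f"uncompressed: {raw:.1f} MiB/s")
print(f"compressed:   {compressed:.1f} MiB/s")
print(f"slowdown:     {slowdown:.0f}x")
```

Under these assumptions the uncompressed dump runs at roughly 109 MiB/s and the compressed one at roughly 14 MiB/s, which suggests the compression step, not the disk, was the bottleneck in that test.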




FreeBSD

October 6, 2005 - 11:57am
Anonymous (not verified)

In FreeBSD, if there is a screwup in the kernel code (such as a null pointer dereference), you immediately drop into the debugger 'ddb', and from there you either debug online (look at the stack trace, etc.) or, even better, call panic to dump the core to swap (or whatever is set as 'dumpdev'). On reboot, savecore runs and puts the core file into a certain directory. You can then do source-level debugging using gdb. I thought this was the case with Linux as well. Am I missing something here?

Re: FreeBSD

October 6, 2005 - 6:06pm

Yes. If your server goes down, you want it back up as fast as possible, at any time of day or night, so you want it to reboot without manual intervention. OTOH, you want to be able to track down the problem, so you need to collect the information.
I agree, though, that an interactive debugger with crashdump features is the best way to go to debug a problem on a development system. kgdb can do that for you, but I don't think you can get it to dump to disk.

What you're missing is that L

October 7, 2005 - 11:43am
Anonymus (not verified)

What you're missing is that Linux has not one, but *many* crashdump tools.

Boy...I think you've never se

October 7, 2005 - 3:02pm
Anonymous (not verified)

Boy...I think you've never seen a "real" unix.

Linux has no "real" crashdump utility because it is a "moving" target.

Linux has *many* crashdump tools as you wrote, but not one of them is really usable. Try to compare them with one of the commercial unices. Or with at least the *BSDs.

But you've no idea what I'm talking about because you've never had to support a real customer.
For example running Oracle and the server hangs.

Think again, kid. I was raise

October 7, 2005 - 4:10pm
Anonymus (not verified)

Think again, kid. I was raised on AIX in a huge company. And I bet I know UNIX internals waaaay better than you.

"Real UNIX" is a myth by lamers who haven't ever used a "Real UNIX". Thinking about the times with "Real UNIX" makes my neck-hair stand up. We've had to rewrite our applications because the thing would scale like shit once you had a certain amount of files in a directory. And when my company paid for the horribly expensive update for the next version to overcome some of those shortcommings, the the commandline options changed(!). We had to rewrite all of our scripts.

What was that with "moving target" again?

There is a saying... AIX is n

October 14, 2005 - 8:34pm
Anonymous (not verified)

There is a saying... AIX is not a "real" Unix... ;)
-Ask anybody...

(I started with Solaris and now work with HP-UX)

Yeah. commercial unix version upgrades. Lets say...1-2-3 years or the issue with patches. -But the fundamentals never change within a version AND you have documentation! ;)

But! With Linux, new kernel versions every...what? Every month nowadays?
(Thanks God! )

And...memory handling models come and go...

What? You have to rewrite some shell script? Ohh no!!!
But! What has it to do with crash dump analysis????

And the main question was: have you ever tried to analyze a crash dump?

Sure. You've no idea what it

October 7, 2005 - 3:06pm
Anonymous (not verified)

Sure. You've no idea what it means to analyze a crash dump. -No commercial unix knowledge yes?

Linux has many crash dump tools but not one of them is really usable.

I'm sure you've never had to support a customer with a server hung. -For example running Oracle...

*LOL* Kid, I've seen SAP run

October 7, 2005 - 4:14pm
Anonymus (not verified)

*LOL*
Kid, I've seen SAP running on hardware you can only dream of...

Yep! Kid, I don't think so

October 14, 2005 - 8:42pm
Anonymous (not verified)

Yep!

Kid, I don't think so. I can see this type of hardware every day...
A little Starfire, a little Superdome... ;)

Better compressor

October 6, 2005 - 5:54pm
Anonymous (not verified)

This is the best compressor that I know, better than RAR (on file size).

http://www.7-zip.org/

Unix version

October 6, 2005 - 5:55pm
Anonymous (not verified)

thanks´, that i´m looking f

October 9, 2005 - 2:15pm

Thanks, that's what I'm looking for.

7-zip is also far slower then

October 6, 2005 - 8:11pm
Anonymous (not verified)

7-zip is also far slower than deflate, which already increased the time it takes by a factor of 8.

That's not the real obstacle...

October 6, 2005 - 9:18pm

You could speed up Zip with more conservative compression settings. The mediocre compression rate has the most to do w/ the fact they only compress pages, which are 4kB. To get reasonable compression rates, you need a much larger block size.

Even w/ gzip -1 or -3, I think you'd see pretty good compression if you compressed 32kB to 128kB at a time instead of 4kB.
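The block-size argument above can be illustrated with a small experiment: deflate each fixed-size chunk independently and compare the totals for 4kB, 32kB, and 128kB chunks. This is a sketch only. The sample data is synthetic (a few pseudo-random "template" pages repeated), chosen so that redundancy exists between pages but not within one; real memory images will behave differently, and `compressed_size` and `make_sample` are illustrative names, not part of any dump tool.

```python
import random
import zlib

PAGE_SIZE = 4096  # page size on ia32, as discussed in the thread

def compressed_size(data: bytes, chunk_size: int, level: int = 1) -> int:
    """Total size when every chunk_size slice is deflated independently."""
    total = 0
    for offset in range(0, len(data), chunk_size):
        total += len(zlib.compress(data[offset:offset + chunk_size], level))
    return total

def make_sample(pages: int = 256, templates: int = 4, seed: int = 0) -> bytes:
    """Synthetic stand-in for a memory image (an assumption, not real data).

    Each template page is pseudo-random, so a single page is incompressible,
    but the same templates recur every few pages.
    """
    rng = random.Random(seed)
    template_pages = [
        bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE))
        for _ in range(templates)
    ]
    return b"".join(template_pages[i % templates] for i in range(pages))

sample = make_sample()
for chunk_kb in (4, 32, 128):
    size = compressed_size(sample, chunk_kb * 1024)
    print(f"{chunk_kb:>3} kB chunks: {size / len(sample):.1%} of original size")
```

With 4kB chunks the compressor never sees two copies of the same page in one call, so it cannot exploit the repetition; larger chunks let matches span pages. Note that deflate's 32kB window still limits how far back a match can reach, regardless of chunk size.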

or you could use LZO which is

October 7, 2005 - 2:29pm
Anonymous (not verified)

or you could use LZO which is fast

http://www.oberhumer.com/opensource/lzo/

LZO1X decompression in optimized assembler: ~20 MB/sec

Yeah, but how fast does it compress?

October 7, 2005 - 4:21pm

Decompression isn't the issue. Most compression algorithms have a very asymmetric compression vs. decompression time cost. From the webpage, compression seems to be about 1/4th the speed of decompression, for the default compression levels. To get to compression levels that are competitive with gzip etc. I wager it gets a lot slower.

Also, that 20MB/sec number isn't terribly impressive by itself. It only becomes impressive if you point out that it was measured on a Pentium 133.

These crash dumps are on the order of 32GB. I'm not sure why such slow numbers were turned in by gzip. Perhaps dialing back the compression a bit and compressing on larger groups of pages would take the time penalty away.

And to those of you mentioning bzip2: It has too large of a memory footprint to be useful in the kernel (think megabytes), and it would definitely require aggregating large numbers of pages prior to compression.

--

Edit: I looked up the compression speed numbers for LZO and it is indeed impressive and has a reasonable footprint. It might be worth a shot.

The second kernel dump system

October 7, 2005 - 9:48pm
Anonymous (not verified)

The most promising Linux kernel crash dump tool today is kdump, which boots a second kernel upon a crash. The memory image of the first OS can then be accessed via /proc/vmcore or /dev/oldmem from a _userspace_ application. Thus, using the bzip2 algorithm is not impossible.

The point is that the LKCD v7 compress format only supports pagewise compression, which means a compressed chunk is limited to 4096 bytes on ia32. Worse, every compressed chunk must have its own Huffman tree.

So, as you mention, some new dump compression format which allows multiple pages in one chunk, while allowing the fast random access, would be a help.
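One way such a format might look, purely as an illustration: compress several pages per chunk and keep a small index of where each compressed chunk starts, so that reading one page only requires decompressing its chunk. Everything here is hypothetical (the layout, the 16-pages-per-chunk default, and the names `write_chunked` and `read_page`); it is not the LKCD format or the format of any existing tool.

```python
import zlib

PAGE_SIZE = 4096

def write_chunked(data: bytes, pages_per_chunk: int = 16, level: int = 1):
    """Deflate data in multi-page chunks.

    Returns (blob, index), where index[i] is the (offset, length) of
    compressed chunk i inside blob.
    """
    chunk_size = pages_per_chunk * PAGE_SIZE
    blob = bytearray()
    index = []
    for start in range(0, len(data), chunk_size):
        comp = zlib.compress(data[start:start + chunk_size], level)
        index.append((len(blob), len(comp)))
        blob += comp
    return bytes(blob), index

def read_page(blob: bytes, index, page_no: int, pages_per_chunk: int = 16) -> bytes:
    """Random access: decompress only the chunk that holds page_no."""
    chunk_no, page_in_chunk = divmod(page_no, pages_per_chunk)
    offset, length = index[chunk_no]
    chunk = zlib.decompress(blob[offset:offset + length])
    start = page_in_chunk * PAGE_SIZE
    return chunk[start:start + PAGE_SIZE]

# Toy image: page i is filled with the byte value i % 256.
image = b"".join(bytes([i % 256]) * PAGE_SIZE for i in range(40))
blob, index = write_chunked(image)
assert read_page(blob, index, 17) == bytes([17]) * PAGE_SIZE
```

The trade-off is the usual one: more pages per chunk gives the compressor more context, but each random page read then costs one larger decompression.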

Hmm?

October 7, 2005 - 5:43am
Anonymous (not verified)

How is it compared with bzip2?
