A kernel crash dump is a snapshot of system state taken at the time that the kernel crashed, useful for finding and debugging the problem that caused the crash in the first place. There is no standard mechanism for automatically collecting a crash dump on Linux, but there are a number of existing projects working toward efficiently meeting this goal. A "Linux Kernel Dump Summit" was recently mentioned on the lkml, with participants from some of the many crash dump projects looking to standardize the dump process and information collected. A followup email noted, "as memory size grows, the time and space for capturing kernel crash dumps really matter." It went on to examine partial dumps, and full dumps that are compressed. The former risks not collecting information necessary for proper debugging, while the latter risks greatly increasing the amount of time required to collect a dump.
There are a number of existing projects for collecting automatic kernel crash dumps on Linux, including Linux Kernel Crash Dump (LKCD), Mini Kernel Dump (mkdump), kdump, and diskdump (detailed here). Some of these projects also include tools for examining the obtained dumpfiles. Other projects focus just on tools for analyzing kernel crash dumps, including the perl-based Alicia (the Advanced LInux Crash-dump Interactive Analyzer) and Red Hat's crash analysis tool "loosely based on the SVR4 UNIX crash command, but significantly enhanced by completely merging it with the GNU gdb debugger."
From: Hiro Yoshioka [email blocked]
To: linux-kernel
Subject: Linux Kernel Dump Summit 2005
Date: Wed, 21 Sep 2005 20:55:50 +0900 (JST)
To whom may concern
We had a Linux Kernel Dump Summit 2005.
The participants are
Dump tools Session
diskdump -- Fujitsu
mkdump -- NTT Data Intellilink
LTD -- Hitachi
kdump -- Turbolinux
Summary -- Miracle Linux
Dump Analysis tools Session
Alicia/crash -- Uniadex
Other participants are
VA Linux/NEC/NSSOL/IPA/OSDL/Toshiba
Some discussion topics are (but not limited to)
- What kind of information do we need?
trace information
all of registers
the last log of panic, oops
LTD (Linux Tough Dump) has some nice features
- We need a partial dump
- We have to minimize the down time
- We have to dump all memory
how can we distinguish from the kernel and user if
kernel data is corrupted
- How we are not able to dump data
device
power management
we need a generic mechanism to reset a device
- Hang
NMI watch dog
mount
- It is very difficult to debug a memory corrupt bug
- hardware error
- Where will we go to?
IHV and Linux Kernel community collaboration are needed
Dump Analysis tools are very important
- There is a concern that the development process of 'crash'
is not open.
- Do we have to extend gdb?
- We'd like to collaborate 'crash'
- kexec/kdump, mkdump, LTD, all of them use the second kernel
to dump it.
- We have to share the test data, check list, test tools of
dump tool developments.
We agree to have the Linux Kernel Dump Summit.
Regards,
Hiro
From: OBATA Noboru [email blocked]
Subject: Re: Linux Kernel Dump Summit 2005
Date: Thu, 06 Oct 2005 21:17:18 +0900 (JST)
Hi, Hiro,
On Wed, 21 Sep 2005, Hiro Yoshioka wrote:
>
> We had a Linux Kernel Dump Summit 2005.
> - We need a partial dump
> - We have to minimize the down time
>
> - We have to dump all memory
> how can we distinguish from the kernel and user if
> kernel data is corrupted
As memory size grows, the time and space for capturing kernel
crash dump really matter.
We discussed two strategies in the dump summit.
1. Partial dump
2. Full dump with compression
PARTIAL DUMP
============
Partial dump captures only pages that are essential for later
analysis, possibly by using some mark in mem_map[].
This certainly reduces both time and space of crash dump, but
there is a risk because no one can guarantee that a dropped page
is really unnecessary in analysis (it can be a tragedy if
analysis went unsolved because of the dropped page).
Another risk is a corruption of mem_map[] (or other kernel
structure), which makes the identification of necessary pages
unreliable.
So there would be best if a user can select the level of partial
dump. A careful user may always choose a full dump, while a
user who is tracking the well-reproducible kernel bug may choose
fast and small dump.
FULL DUMP WITH COMPRESSION
==========================
Those who still want a full dump, including me, are interested
in dump compression. For example, the LKCD format (at least v7
format) supports pagewise compression with the deflate
algorithm. A dump analyze tool "crash" can transparently
analyze the compressed dump file in this format.
The compression will reduce the storage space at certain degree,
and may also reduce the time if a dump process were I/O bounded.
WHICH IS BETTER?
================
I wrote a small compression tool for LKCD v7 format to see how
effective the compression is, and it turned out that the time
and size of compression were very much similar to that of gzip,
not surprisingly.
Compressing a 32GB dump file took about 40 minutes on Pentium 4
Xeon 3.0GHz, which is not good enough because the dump without
compression took only 5 minutes; eight times slower.
Besides, the compress ratios were somewhat picky. Some dump
files could not be compressed well (the worst case I found was
only 10% reduction in size).
After examining the LKCD compress format, I must conclude that
the partial dump is the only way to go when time and size really
matter.
Now I'd like to see how effective the existing partial dump
functionalities are.
Regards,
--
OBATA Noboru [email blocked]
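As a rough check on the figures in the message above, the two timings can be turned into effective throughput. This is only a back-of-the-envelope sketch: it assumes "32GB" means 32 GiB and that both runs processed the whole file, and the helper name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def throughput_mib_per_s(size_bytes, minutes):
    """Average throughput in MiB/s for moving size_bytes in the given time."""
    return size_bytes / MIB / (minutes * 60)


dump_size = 32 * GIB                              # assumption: "32GB" read as 32 GiB
plain = throughput_mib_per_s(dump_size, 5)        # dump without compression
compressed = throughput_mib_per_s(dump_size, 40)  # dump with deflate compression
slowdown = 40 / 5

print(f"uncompressed: {plain:.1f} MiB/s")
print(f"compressed:   {compressed:.1f} MiB/s")
print(f"slowdown:     {slowdown:.0f}x")
```

Under those assumptions the uncompressed dump runs at roughly 109 MiB/s and the compressed one at under 14 MiB/s, which suggests the compressor, not the disk, was the bottleneck in that test.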
FreeBSD
In FreeBSD, if there is a screwup in the kernel code (such as a null pointer dereference), you immediately drop into the debugger 'ddb' and from there, you either debug online (look at the stack trace etc.) or, even better, you can call panic to dump the core to the swap (or whatever is set as 'dumpdev'). On reboot, savecore runs and puts the core file into a certain directory. You can then do source-level debugging using gdb. I thought this was the case with Linux as well. Am I missing something here?
Re: FreeBSD
Yes. If your server goes down, you want it back up as fast as possible, at any time of day or night, so you want it to reboot without manual intervention. OTOH, you want to be able to track down the problem, so you need to collect the information.
I agree, though, that an interactive debugger with crashdump features is the best way to go to debug a problem on a development system. kgdb can do that for you, but I don't think you can get it to dump to disk.
What you're missing is that Linux has not one, but *many* crashdump tools.
Boy...I think you've never seen a "real" unix.
Linux has no "real" crashdump utility because it is a "moving" target.
Linux has *many* crashdump tools as you wrote, but not one of them is really usable. Try to compare them with one of the commercial unices. Or with at least the *BSDs.
But you've no idea what I'm talking about because you've never had to support a real customer.
For example running Oracle and the server hangs.
Think again, kid. I was raised on AIX in a huge company. And I bet I know UNIX internals waaaay better than you.
"Real UNIX" is a myth by lamers who haven't ever used a "Real UNIX". Thinking about the times with "Real UNIX" makes my neck-hair stand up. We've had to rewrite our applications because the thing would scale like shit once you had a certain amount of files in a directory. And when my company paid for the horribly expensive update for the next version to overcome some of those shortcomings, the command-line options changed(!). We had to rewrite all of our scripts.
What was that with "moving target" again?
There is a saying... AIX is not a "real" Unix... ;)
-Ask anybody...
(I started with Solaris and now work with HP-UX)
Yeah. commercial unix version upgrades. Lets say...1-2-3 years or the issue with patches. -But the fundamentals never change within a version AND you have documentation! ;)
But! With Linux -new kernel versions every...what? every month nowadays?
(Thanks God! )
And...memory handling models come and go...
What? You have to rewrite some shell script? Ohh no!!!
But! What has it to do with crash dump analysis????
And the main question was: -Have you ever tried to analyze a crash dump????
Sure. You've no idea what it means to analyze a crash dump. -No commercial unix knowledge yes?
Linux has many crash dump tools but not one of them is really usable.
I'm sure you've never had to support a customer with a server hung. -For example running Oracle...
*LOL*
Kid, I've seen SAP running on hardware you can only dream of...
Yep!
Kid, I don't think so. I can see this type of hardware every day...
A little Starfire, a little Superdome... ;)
Better compressor
This is the best compressor that I know, better than RAR (on file size).
http://www.7-zip.org/
Unix version
http://sourceforge.net/projects/p7zip/
Thanks, that's what I'm looking for.
7-zip is also far slower than deflate, which already increased the time it takes by a factor of 8.
That's not the real obstacle...
You could speed up Zip with more conservative compression settings. The mediocre compression rate has the most to do w/ the fact they only compress pages, which are 4kB. To get reasonable compression rates, you need a much larger block size.
Even w/ gzip -1 or -3, I think you'd see pretty good compression if you compressed 32kB to 128kB at a time instead of 4kB.
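The block-size argument above can be tried out with a small experiment. The sketch below compresses the same buffer in independent chunks of 4 kB, 32 kB and 128 kB using Python's zlib at level 1 and adds up the compressed sizes. The sample data is synthetic (random picks from a few kernel-flavoured words), so the ratios it prints say nothing about real crash dumps; only the trend between chunk sizes is of interest. The helper names are made up for illustration.

```python
import random
import zlib

PAGE = 4096


def make_sample(size, seed=0):
    """Build a compressible, deterministic test buffer (not real kernel memory)."""
    rng = random.Random(seed)
    words = [b"inode", b"dentry", b"page", b"task_struct",
             b"spinlock", b"vm_area", b"buffer_head", b"skb"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(words) + b" "
    return bytes(out[:size])


def chunked_size(data, chunk, level=1):
    """Total size when every chunk is compressed on its own, as in a pagewise format."""
    return sum(len(zlib.compress(data[i:i + chunk], level))
               for i in range(0, len(data), chunk))


sample = make_sample(1024 * 1024)
sizes = {chunk: chunked_size(sample, chunk)
         for chunk in (PAGE, 32 * 1024, 128 * 1024)}

for chunk, total in sizes.items():
    print(f"{chunk // 1024:4d} kB chunks: {total} bytes "
          f"({total / len(sample):.1%} of original)")
```

Each independently compressed chunk pays for its own stream header and Huffman tables and starts with an empty history window, which is why the 4 kB case comes out largest here.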
or you could use LZO which is fast
http://www.oberhumer.com/opensource/lzo/
LZO1X decompression in optimized assembler: ~20 MB/sec
Yeah, but how fast does it compress?
Decompression isn't the issue. Most compression algorithms have a very asymmetric compression vs. decompression time cost. From the webpage, compression seems to be about 1/4th the speed of decompression, for the default compression levels. To get to compression levels that are competitive with gzip etc. I wager it gets a lot slower.
Also, that 20MB/sec number isn't terribly impressive by itself. It only becomes impressive if you point out that it was measured on a Pentium 133.
These crash dumps are on the order of 32GB. I'm not sure why such slow numbers were turned in by gzip. Perhaps dialing back the compression a bit and compressing on larger groups of pages would take the time penalty away.
And to those of you mentioning bzip2: It has too large of a memory footprint to be useful in the kernel (think megabytes), and it would definitely require aggregating large numbers of pages prior to compression.
--
Edit: I looked up the compression speed numbers for LZO and it is indeed impressive and has a reasonable footprint. It might be worth a shot.
The second kernel dump system
The most promising Linux kernel crash dump mechanism today is kdump, which boots a second kernel upon a crash. The memory image of the first OS can be accessed via /proc/vmcore or /dev/oldmem, from a _userspace_ application. Thus, using the bzip2 algorithm is not impossible.
The point is that the LKCD v7 compress format only supports pagewise compression, which means a compressed chunk is limited to 4096 bytes on ia32. Worse, every compressed chunk must have its own Huffman tree.
So, as you mention, some new dump compression format which allows multiple pages in one chunk, while allowing the fast random access, would be a help.
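One way such a format could look is sketched below: several pages are compressed together as one chunk, and a small index records where each compressed chunk starts, so a reader only has to decompress the chunk that holds the page it wants. This is a hypothetical layout for illustration, not the LKCD format or any existing tool's format; the names, the 16-pages-per-chunk setting and the in-memory index are all assumptions.

```python
import zlib

PAGE_SIZE = 4096
PAGES_PER_CHUNK = 16  # hypothetical setting: 64 kB of memory per compressed chunk


def write_dump(memory, pages_per_chunk=PAGES_PER_CHUNK, level=1):
    """Compress memory in multi-page chunks.

    Returns (blob, index), where index[i] is the (offset, length) of
    compressed chunk i inside blob.
    """
    chunk_bytes = PAGE_SIZE * pages_per_chunk
    blob = bytearray()
    index = []
    for start in range(0, len(memory), chunk_bytes):
        packed = zlib.compress(memory[start:start + chunk_bytes], level)
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_page(blob, index, page_no, pages_per_chunk=PAGES_PER_CHUNK):
    """Random access: decompress only the chunk that contains page_no."""
    offset, length = index[page_no // pages_per_chunk]
    chunk = zlib.decompress(blob[offset:offset + length])
    start = (page_no % pages_per_chunk) * PAGE_SIZE
    return chunk[start:start + PAGE_SIZE]


# Toy "memory image": 40 pages, each filled with its own page number.
memory = b"".join(bytes([n]) * PAGE_SIZE for n in range(40))
blob, index = write_dump(memory)
page = read_page(blob, index, 37)
```

The trade-off is the one discussed above: larger chunks compress better but make each random page lookup decompress more data, so the chunk size would need tuning against how an analysis tool actually reads the dump.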
Hmm?
How is it compared with bzip2?