This patch makes panic() and die() registers available to, for example, panic notifier functions. Panic notifier functions are quite useful for recording crash information, but they don't get passed the register values. This makes it hard to print register contents, do stack backtraces, etc. The changes in this patch save the register state when panic() is called and introduce a function for die() to call that allows it to pass in the registers it was passed.

Following this patch are more patches, one per architecture. These include two types of changes:

o  A save_ptregs() function for the processor. I've taken a whack at doing this for all of the processors. I have tested x86 and MIPS versions. I was able to find cross compilers for ARM, ... and the code compiles cleanly. Everything else, well, what you see is sheer fantasy. You are welcome to chortle with merriment.

o  When I could figure it out, I replaced the calls to panic() in exception handling functions with calls to panic_with_regs() so that everyone can leverage these changes without much effort. Again, not all the code was transparent, so there are likely some places that should have additional work done.

Note that the pointer to the struct pt_regs may be NULL. This is to accommodate those processors which don't have a working save_ptregs(). I'd love to eliminate this case by providing a save_ptregs() for all architectures, but I'll need help to do so.

Signed-off-by: David VomLehn <dvomlehn@cisco.com>
---
 include/linux/kernel.h | 5 ++
 include/linux/ptrace.h | 9 ++++
 include/linux/ptreg.h | 62 +++++++++++++++++++++++++++++++
 kernel/panic.c | 96 +++++++++++++++++++++++++++++++++++++++++------
 4 files changed, 159 insertions(+), 13 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 328bca6..d73ae5c 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -162,6 +162,11 @@ extern struct atomic_notifier_head ...
could you post a sample module that you're using to test with here ? presumably you have some simple code that registers a notify handler and then calls panic_with_regs() ... -mike --
Can the use of va_start() clobber lots of registers, thereby rendering the exercise pointless on some arches? Also, can the save_ptregs() function be out of line asm? The FRV constructed inline statement is huge (and wrong). David --
The implementations I'm familiar with only need one or two registers. What it *does* do is to force the contents of registers being used to pass argument values onto the stack. This is roughly what gcc does for asm() statements when you With this implementation it has to be inline. One use of the saved registers is to backtrace the stack. If you call a function to save the registers, the stack pointer and program counter would be those of the called function, which will not be valid after it returns. I expect that you could come up with an alternative out-of-line function: on every processor I know, you could backtrace one frame to get reasonable values for those registers. Unfortunately, you would run the risk of clobbering other registers by doing the function call. The more you change register values from those in the function that calls panic(), the less useful this becomes. In this case, I think an inline function is worth the effort to get working. (I'd be interested in knowing more details about how --
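The inline-versus-out-of-line point can be illustrated in userspace. What follows is only a sketch, not the patch's save_ptregs(): struct mini_regs and the capture_* helpers are hypothetical names, and it assumes a GCC- or Clang-compatible compiler for __builtin_frame_address() and the inlining attributes. It records just the frame address, once from a forced-inline helper and once from a real function call.

```c
#include <stdint.h>

/* Hypothetical stand-in for a per-architecture register save area.
 * Only the frame address is recorded, via a GCC/Clang builtin. */
struct mini_regs {
    uintptr_t frame;    /* frame address seen at the capture site */
};

/* Forced inline: the builtin is evaluated in the caller's frame. */
static inline __attribute__((always_inline))
void capture_inline(struct mini_regs *r)
{
    r->frame = (uintptr_t)__builtin_frame_address(0);
}

/* A real call: the builtin reports this function's own frame, which
 * no longer exists once the function has returned. */
static __attribute__((noinline))
void capture_out_of_line(struct mini_regs *r)
{
    r->frame = (uintptr_t)__builtin_frame_address(0);
}

/* Returns 1 if the inline capture recorded the caller's frame. */
int inline_matches_caller(void)
{
    struct mini_regs r;

    capture_inline(&r);
    return r.frame == (uintptr_t)__builtin_frame_address(0);
}

/* Returns 1 if the out-of-line capture recorded the caller's frame. */
int out_of_line_matches_caller(void)
{
    struct mini_regs r;

    capture_out_of_line(&r);
    return r.frame == (uintptr_t)__builtin_frame_address(0);
}
```

Calling inline_matches_caller() should report a match, while out_of_line_matches_caller() should not, because the called function has a frame of its own. A backtrace started from the out-of-line value would begin in a frame that is already dead.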
How about something like Sparc, where you can pass up to 8 arguments (if I remember correctly) in registers. I'm not sure how Sparc handles varargs

Yes. The easiest way might be to write the saver in assembly and call it from

Indeed, but the more things panic() does, the more likely it is to clobber registers anyway. Note, also: panic() is __attribute__((noreturn)), which means that the compiler calling it is not required to save the return address or registers

As I mentioned above, you can use asm for this. For instance, you can write an inline asm statement that saves onto the stack the registers that need to be clobbered to make a jump, then make the jump, and then have the saver routine retrieve the register values from the stack and place them in the storage area.

In fact, you could insert a prologue wrapper on panic() with a bit of asm to save the registers, for example on FRV:

    panic:
        subi sp,#-8,sp
        stdi.p gr4,@(sp,#0)                 # save GR4/GR5 on stack
        addi sp,#8,gr5
        sethi.p %hi(__panic_reg_save),gr4   # get the save space addr
        setlo %lo(__panic_reg_save),gr4
        sti gr5,@(gr4,#REG_SP)              # save orig stack pointer
        stdi gr2,@(gr4,#REG_GR(2))          # save GR2/GR3
        ldi @(sp,#0),gr5
        sti gr5,@(gr4,#REG_GR(4))           # save orig GR4
        ldi @(sp,#4),gr5
        sti gr5,@(gr4,#REG_GR(5))           # save orig GR5
        stdi gr6,@(gr4,#REG_GR(6))          # save GR6/GR7
        stdi gr8,@(gr4,#REG_GR(8))          # save GR8/GR9
        ...
        lddi.p @(sp,#0),gr4                 # restore GR4/GR5 from stack
        addi sp,#8,sp
        bra real_panic                      # chain

Most load/store instructions come in two types, and you need to modify the opcode according to the addressing mode and indicate that you're interpolating a memory dereference argument rather than an address:

    asm("ldd%I1 %M1,%0" : "=e"(counter) : "m"(v->counter));

You're also trying to load data into GR0, which won't achieve anything. GR0 is hardwired to 0. It's used as the target of instructions where you don't care about the calculated result (eg: compare is ...
From: David Howells <dhowells@redhat.com> 6 arguments, and all arguments get popped onto the stack into the argument save area when doing varargs so you can access them as an array. Stack looks like: struct register_window window; unsigned long args[...]; --
Wouldn't it be much easier to implement panic with an illegal op and let the exception handler set up the pt regs structure instead? Just like some architectures do that already for warnings. Have a look at lib/bug.c and at various arch/<...>/include/asm/bug.h. BUG_FLAG_PANIC would do the trick. But I'm still wondering what the use case would be. You haven't posted any code that would actually use this. --
You also have the problem that there are panic() statements before exception vectors are set up. Eg, using lmb, you might want to allocate a page for an L2 page table so you can set up the exception vectors, but the lmb allocator has a panic statement which will be used on failure to allocate a page. The result is that you don't know why you didn't boot since there's no diagnostics from the kernel. At least with the current setup, merely (re)directing the kernel printk output results in something you can read. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: --
You could do this so long as your exception handler wasn't compromised. When you get here, your system is already known to have failed, so the idea is to be I'm working my way towards this, but the general idea is to be able to log all system state from within a panic handler. So, you'd want to print the registers, the stack, the window of memory around instructions, the stack trace, etc., all of which need some set of the register values. --
No, not necessarily. panic() is varargs. David --
Can you explain why you want this? I'm wondering about the value of saving the registers; normally when a panic occurs, it's because of a well defined reason, and not because something went wrong in some CPU register; to put it another way, a panic() is a more controlled exception than a BUG() or a bad pointer dereference. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: --
On Mon, 12 Apr 2010 13:27:45 +0100 I'm curious about the potential use case as well. So far I only wanted to know the registers if the panic has been triggered due to an unexpected fault with panic_on_oops=1 or in_interrupt()==1. If that happens the die() handler prints the registers. An open coded panic is easy to analyze, imho no need for the registers. -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. --
Good example, because it helps focus the issue. I'm recording a subset of kernel state information from an embedded system for collection at a central point. The register values printed by die() go to the console, where they disappear. One of the things in this patch involves passing a pointer to those die() registers to a registered panic notifier handler. So, there is one path on which panic handlers are called from die() and another from panic(), and this patch makes register values available in both cases.
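The shape of such a handler can be sketched in userspace. This is an illustration under stated assumptions, not the kernel notifier API and not code from the patch: struct demo_regs, demo_register_notifier(), demo_call_chain() and demo_log_notifier() are all hypothetical names. The one detail taken from the patch description is that the register pointer may be NULL, so the handler has to cope with both cases.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical register snapshot; a real struct pt_regs is
 * architecture-specific and much larger. */
struct demo_regs {
    unsigned long pc;
    unsigned long sp;
};

/* Handler type. regs may be NULL when no snapshot could be taken. */
typedef int (*demo_notifier_fn)(const char *msg,
                                const struct demo_regs *regs,
                                char *out, size_t outlen);

#define DEMO_MAX_NOTIFIERS 4
static demo_notifier_fn demo_chain[DEMO_MAX_NOTIFIERS];
static size_t demo_chain_len;

/* Add a handler to the chain; returns 0 on success, -1 if full. */
int demo_register_notifier(demo_notifier_fn fn)
{
    if (demo_chain_len == DEMO_MAX_NOTIFIERS)
        return -1;
    demo_chain[demo_chain_len++] = fn;
    return 0;
}

/* Run every registered handler in registration order. Both the
 * panic() path and the die() path would end up here. */
void demo_call_chain(const char *msg, const struct demo_regs *regs,
                     char *out, size_t outlen)
{
    size_t i;

    for (i = 0; i < demo_chain_len; i++)
        demo_chain[i](msg, regs, out, outlen);
}

/* Example handler: writes a one-line record into out instead of
 * printing to a console where it would be lost. */
int demo_log_notifier(const char *msg, const struct demo_regs *regs,
                      char *out, size_t outlen)
{
    if (regs == NULL)
        snprintf(out, outlen, "%s: no registers", msg);
    else
        snprintf(out, outlen, "%s: pc=%#lx sp=%#lx",
                 msg, regs->pc, regs->sp);
    return 0;
}
```

A caller would register demo_log_notifier() once and then invoke demo_call_chain() with either a filled-in snapshot or NULL; the record lands in the caller's buffer, from where it could be stored or sent upstream.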
More context probably helps, starting with noting that the platform is an embedded one. (I'm at the Embedded Linux Conference, which is why my reply is so tardy). In embedded systems we frequently (probably?) don't have the resources to create or store a crash dump. A very common approach is to record a subset of the system state that is likely to help diagnose the failure. The state information is then stored on the system or sent upstream to some network-connected node. When I gave a talk about this at the ELC, I polled the audience and got at least half a dozen other companies using the same approach. This is the first set of patches to allow this common embedded community requirement to be met. My expectation is that register values will be wanted by everyone doing this kind of targeted state reporting. I should also note that I regard this as more the beginning of a conversation on how to diagnose kernel (and, possibly, application) failures on crash dump-less systems. I would expect to see some number of patches, some of which will ultimately be dropped in favor of other approaches. --
+1.
I found in FS-Cache and CacheFiles that often the things I most wanted to know
when I had something of the form:
    if (A == B)
        BUG();

was A and B, so I made the following macro:

    #define ASSERTCMP(X, OP, Y)                                         \
    do {                                                                \
        if (unlikely(!((X) OP (Y)))) {                                  \
            printk(KERN_ERR "\n");                                      \
            printk(KERN_ERR "AFS: Assertion failed\n");                 \
            printk(KERN_ERR "%lu " #OP " %lu is false\n",               \
                   (unsigned long)(X), (unsigned long)(Y));             \
            printk(KERN_ERR "0x%lx " #OP " 0x%lx is false\n",           \
                   (unsigned long)(X), (unsigned long)(Y));             \
            BUG();                                                      \
        }                                                               \
    } while (0)

which I could then call like this:

    ASSERTCMP(A, ==, B);

and if the assertion failed, it prints A and B explicitly. This is much
easier than trying to pick the values out of a register dump, especially as
the compiler may be free to clobber A or B immediately after testing them.
David
--
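For trying the idea outside the kernel, here is a rough userspace analogue. It is a sketch with hypothetical names (CHECKCMP, checkcmp_report), not the FS-Cache macro: instead of printk() and BUG() it formats the report into a caller-supplied buffer and returns 0 on failure, which makes the behaviour easy to inspect. As with the macro above, X and Y are evaluated more than once, so they should be free of side effects.

```c
#include <stdio.h>

/* Helper behind CHECKCMP: on failure, write both operands in decimal
 * and hex into buf and return 0; on success, clear buf and return 1. */
static int checkcmp_report(char *buf, size_t len, int ok, const char *op,
                           unsigned long x, unsigned long y)
{
    if (ok) {
        if (len > 0)
            buf[0] = '\0';
        return 1;
    }
    snprintf(buf, len, "%lu %s %lu is false (0x%lx %s 0x%lx)",
             x, op, y, x, op, y);
    return 0;
}

/* #OP turns the operator token into a string for the report. */
#define CHECKCMP(buf, len, X, OP, Y)                            \
    checkcmp_report((buf), (len), (X) OP (Y), #OP,              \
                    (unsigned long)(X), (unsigned long)(Y))
```

With a char buf[64], CHECKCMP(buf, sizeof(buf), 3, ==, 4) returns 0 and leaves "3 == 4 is false (0x3 == 0x4)" in buf, so the operand values are available without digging through a register dump.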
This is great if you're in a development environment, and can focus on a single, well characterized case. Unfortunately, I'm staring at hundreds of thousands of systems in the field, all of which have a large number of panic() statements for which this approach has not been taken. So, I have no alternative but to pick the value out of --
Perhaps I'm missing something obvious, but is there some reason why you can't just reuse the crash_setup_regs() code? MIPS doesn't implement it presumably because it's lacking kexec crash kernel support, but it would make sense to make current ptregs saving more generic if there are going to be multiple users for it. --
