I sent Jan & Jason earlier a quick review of the kgdb code currently in git-x86#mm.
The kgdb code in git-x86#mm is right now is totally broken of course because the CFI
annotations in assembler code are gone now, but they are needed for gdb use.
And the bogus fault handling code is still in there, but Jason apprently fixed that one.Here are the other commments I wrote just in case someone is interested.
Overall I would say there is quite a lot of stuff to clean up still.<snip>
ok, doing a review on the code in git-x86#mm. I have not read
everything in detail, just a quick skip over.My personal test for clean kernel debugger integration is if the
debugger could be made a loadable (but not unloadable) module with
minor/no effort. How far away is your code from that?The 32bit in_exception_stack code looks weird. Are you sure you cannot
just use the pt_regs passed to the die notifier?It might be safer to explicitely disable interrupts early in the notifier.
You should probably use simple_strtoul() instead of inventing an
own hex parser in kgdb.c. And sprintf instead of an own hex writer.
In general more use sprintf would probably shorten a lot of the parser
code.The num_online_cpu()s check in getthread() looks weird. What is that
supposed to do?kgdb_softlock_notify: I think the right way to handle this is just
calling touch_softlockup_watchdog() (or perhaps better touch_nmi_watchdog)
Actually it should be only needed once when you exit the debugger
for soft lockup, but regularly for nmi watchdog.On x86 it is ok, but on other architectures i'm suspect the SMP
synchronization with atomics might benefit from more memory barriers.gdb_cmd_reboot: always calling emergency_sync looks dangerous.
In fact i suspect if you hold the other CPUs the changes are good
it will lock up. You should at least offer a option to not call that
(or better don't do it by default). Even machine_start is likely
lock up actually if the other CPUs are truly halted (as th...
..
Speaking of which.. the kernel implementation of snprintf() seems
to have a bug somewhere, in that it returns an incorrect count in
some situations -- mostly around where the buffer is too small to
hold the data being written. There's an off-by-one bug there somewhere,
but I have not had time yet to track it down more precisely.Cheers
--
This loss of CFI data is unfortunate for other potential consumers
too, such as systemtap, which intends to supplement its current stack
backtracing heuristics with CFI data.- FChE
--
yes, but we removed those patches meanwhile because Roland McGrath has
agreed to clean up the CFI mess. (Thanks Roland! We were very close to
nuking them altogether because the unreadability they introduced was
causing real bugs.)[ btw., Andi's characterisation of it being "totally broken" is
incorrect and alarmist. It only means that the "backtrace" command
will not work in kgdb in certain types of contexts (only the current
function is displayed, not the parent function(s)), but there's no
"breakage", just less information available to the debugger. ]in any case, while there are still holes in the frame annotations and it
all needs a good bit of cleanup, that is a largely separate issue from
whether kgdb is merge-worthy or not.I'd like to thank Andi for his kgdb review efforts and comments.
Unfortunately they were not done against kgdb-light which is being
offered for upstream merge, but against an older version of kgdb. (Sorry
about the 12 hour delay in my reply - Andi's mail did not have any of
the kgdb or x86 persons Cc:-ed so it was not immediately noticed on a
busy Monday, only in the lower-prio lkml scans.)after a quick glance i'm happy to conclude that all but one of Andi's
review comments were already addressed in kgdb-light -v7 AFAICS, and the
only one not addressed (emergency_resync()) is now addressed in -v8.the -v8 kgdb-light tree (against v2.6.25-rc1) can be pulled from:
git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-kgdb.git
shortlog, diffstat and patch attached below. Tip of the -v8 tree is
commit f20d70a6dd5785c9dd4c8215daf21f34365ce1c7.In any case, if there are any open issues we are very much ready and
willing to address them now or after any potential upstream merge of
this codebase. This has meanwhile become one of the best-reviewed pieces
of kernel code in living memory ;-)Ingo
------------------>
Ingo Molnar (3):
pids: add pid_max prototype
uaccess: add probe_kernel...
...dst and src are twisted. Fix below (should make sw-breakpoints work
again).Jan
diff --git a/kernel/kgdb.c b/kernel/kgdb.c
index dce89d1..10fe113 100644
--- a/kernel/kgdb.c
+++ b/kernel/kgdb.c
@@ -153,14 +153,14 @@ int __weak kgdb_validate_break_address(unsigned long addr)
{
char tmp_variable[BREAK_INSTR_SIZE];- return probe_kernel_read((char *)addr, tmp_variable, BREAK_INSTR_SIZE);
+ return probe_kernel_read(tmp_variable, (char *)addr, BREAK_INSTR_SIZE);
}int __weak kgdb_arch_set_breakpoint(unsigned long addr, char *saved_instr)
{
int err;- err = probe_kernel_read((char *)addr, saved_instr, BREAK_INSTR_SIZE);
+ err = probe_kernel_read(saved_instr, (char *)addr, BREAK_INSTR_SIZE);
if (err)
return err;--
thanks, applied :-) Merged, uploaded the updated tree to the usual
place. New tip of head is 59b64c80d706ae3504ccc76a9c00cdfea4a10701.Ingo
--
> after a quick glance i'm happy to conclude that all but one of Andi's
At least the weird in_interrupt_stack() code seems to be still
in there.I also still see a lot of open coded hex manipulation.
The synchronization code looks as bad as it was before.
The getthread magic pids are also still in there. What is that
good for?The notifiers are also still run with interrupts enabled,
so could in theory be preempted (only in some very exotic cases though)The cpu hotplug races are still there I think
I have not rechecked everything else.
-Andi
--
i've uploaded kgdb-light -v9, it can be pulled from:
git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-kgdb.git
find the shortlog, diffstat, full patch below.
There are a number of changes in -v9 - a patch from Jason to clean up
kernel-messages-over-the-kgdb-console behavior and a bugfix from Jan.fixed. (That code is gone now - the pt_regs passed in to the notifier
points to the right kind of register context already, so the whole
shadow task business on 64-bit was an unnecessary complication - as you
suspected it. With it went another bunch of kernel/kgdb.c complicationi reworked and cleaned up all the kgdb locking code completely. No more
NR_CPUS spinlocks, there's now proper use of barriers, etc. No more
silly "timeouts" for locking either ... There's now a very well-defined
and clean locking state-machine for the primary CPU and the secondarypreviously Mark indicated some sort of sprintf return value breakage he
observed, and kgdb would rely on sprintf return values so i'm inclinedi fixed them too.
thanks for your suggestions, anything else you can think of?
Ingo
------------------>
Ingo Molnar (3):
pids: add pid_max prototype
uaccess: add probe_kernel_write()
x86: kgdb supportJason Wessel (3):
kgdb: core
consoles: polling support, kgdboc
kgdb: document parametersDocumentation/kernel-parameters.txt | 5 +
arch/x86/Kconfig | 1 +
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/kgdb.c | 555 ++++++++++++
drivers/char/tty_io.c | 47 +
drivers/serial/8250.c | 58 ++
drivers/serial/Kconfig | 3 +
drivers/serial/Makefile | 1 +
drivers/serial/kgdboc.c | 163 ++++
drivers/serial/serial_core.c | 70 ++-
include/asm-x86/kgdb.h | 81 ++
include/linux/kgdb.h | 271 ++++++
include/linux/pid.h | ...
First nobody answered the "kgdb clean enough for a module"
A timeout for waiting for other CPUs is actually not a bad idea for a
debugger. After all you still want to debug even if some other CPUsPretty much all the proc output and sysfs show functions rely on these return
The 64bit gdb actually supports segment registers:
static int amd64_linux_gregset64_reg_offset[] =
{
RAX * 8, RBX * 8, /* %rax, %rbx */
RCX * 8, RDX * 8, /* %rcx, %rdx */
RSI * 8, RDI * 8, /* %rsi, %rdi */
RBP * 8, RSP * 8, /* %rbp, %rsp */
R8 * 8, R9 * 8, /* %r8 ... */
R10 * 8, R11 * 8,
R12 * 8, R13 * 8,
R14 * 8, R15 * 8, /* ... %r15 */
RIP * 8, EFLAGS * 8, /* %rip, %eflags */
CS * 8, SS * 8, /* %cs, %ss */
DS * 8, ES * 8, /* %ds, %es */
FS * 8, GS * 8, /* %fs, %gs */
-1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1,
ORIG_RAX * 8You silently overwrite any user ptrace hw breakpoints right? To do it cleanly
All the kerneldoc comments are useless if you don't add the file
I don't think that case should happen during roundup for once. If it happens
something is wrong.That seems weird. I think other parts try to support user mode debugging
too. In theory there is no reason it shouldn't be able to do thisOk I have not checked, but I hope you have strong protections against
reentry of the debugger just in case an exception here falls down toThat wrapper should not be needed; everybody can use instruction_pointer()
Hmm, so idle thread numbers depend on a sysctl? That seems weird.
It's still likely to deadlock on MP I think
[...] Haven't read further.
-Andi
--
this is kgdb-light, version -v10 (against Linus-latest), and can be
pulled from:git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-kgdb.git
shortlog, diffstat and full patch can be found further below.
in -v10 i have depleted Andi Kleen's (known ;) pool of last-minute
correct (just like SysRq-B can hang) - in -v10 i've fixed this by
it's a sensible check, it validates a remote-provided "tid" against the
range we accept. (It's a bit superfluous as find_task_by_pid() is safe
enough.)yes, i thought about that too yesterday, but was unsure how the rest of
the code would code with that. It's no big issue and i have no strong
feelings either way.GDB wants to track threads, but on the Linux side each idle task has PID
GDB simply needs an ID it can work with - there's nothing weird about
as Roland explained it, the simplicity and of this code intentional.
That's wrong: in-source documentation is never useless. If it happens to
meet DocBook style that's an added bonus. If the DocBook template isno, not all architectures have it. This is a weak alias that is
The currently submitted code does not try to support user mode
debugging. It might be a nice add-on later, if done cleanly andyes, this means if the user has configured kcrash as well then that will
i disagree that a kernel debugger should necessarily be a kernel module.
It might be nice at a future stage, but right now it would justi went for correctness and simplicity first. If a system is hung, the
debugging CPU might hang too at any time. A timeout on the other hand
introduces the possibility of a 'dead' CPU just coming back to life
after the 'timeout', corrupting debugger data. So for now the rule is
very simple.you are wrong, as procfs does not rely on it for a protocol stack. It
relies on it mostly for user-space "this many bytes were processed"
return values and those values are often wrong and user-space just
handles it gratiously.Anyway,...
Hi,
I am just trying to play with this, I am not able to make it work.
Assume I am able with git, I use it every day on my software, then I
am correctly tracking and building your branch on git.kernel.org.My config has both KGDB and KGDB_SERIAL_CONSOLE. kernel command line
has kgdbwait kgdboc=ttyS0,115200. linux console is not on serial but if
do not enable KGDB_SERIAL_CONSOLE I am then unable to hit SysRq+g. If
instead I use linux serial console, gdb messes with getty and things
quickly get worse.1. at boot, kernel waits for a gdb connection
2. gdb connection is established from another box
3. gdb breaks in
4. hit continue at gdb prompt
5. target box returns to life. everything looks ok (it is 2.6.25-rc1!)
6. SysRq+h on the target correctly shows the help on the console
7. SysRq+g on the target, it prints "SysRq : GDB" and hangs forever
8. gdb on the host does not break inis really that simple, am i overlooking anything? thank you.
regards,
Domenico#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25-rc1
# Mon Feb 11 11:34:22 2008
#
CONFIG_PARISC=y
CONFIG_MMU=y
CONFIG_STACK_GROWSUP=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME=y
CONFIG_TIME_LOW_RES=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_IRQ_PER_CPU=y
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
# CONFIG_IKCONFIG_PROC is not set
CONFIG_LOG_BUF_SHIFT=15
# CONFIG_CGROUPS is ...
I doubt the .config you posted was the one you used, because there were
no KGDB options specified at all.If you want to be using your vga device as the console that is ok. You
can still activate the sysrq with running the command: echo g >
/proc/sysrq-triggerYou might consider doing "set debug remote 1" in the gdb session if you
want to see what is going on. After the continue, the next time you
break in with the debugger it should send $T.........etc which is a stop
packet.Jason.
--
indeed it was of a parisc.. :)
here is the right one:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25-rc1
# Tue Feb 12 13:11:26 2008
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
# CONFIG_HAVE_SETUP_PER_CPU_AREA is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
# CONFIG_IKCONFIG_PROC is not set
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_CGROUPS is not set
# CONFIG_FAIR_GROUP_SCHED is not se...
Can't be very many because oprofile needs it and it works on most archs now.
Anyways, the right thing is to just add it to the architecturesHmm, I'm pretty sure I saw some code that was only useful for the
user case. Perhaps you should take all that out then if it doesn't
work anyways?Seems wrong intended behaviour to me. If kgdb is active it should have priority
The question is less about actually having it as a module, but
just if the interfaces are clean enough to allow it as a module.If all code is correct, it likely won't need a debugger.
Ok it just makes kgdb useless for a wide range of kernel problems.
If the return value of read() is wrong not even cat(1) will work
correctly, but lose bytes.You're saying cat is not supposed to work on /proc? That's certainly
Sure a lot of ugly code works in practice. If it's clean and maintaintable
code is another question of course.-Andi
--
i'm glad that there are no other valid-looking (to me) objections left
against kgdb-light -v10, other than "it would be nice to have more kgdb(It seems we do have a fundamental disagreement about how kgdb should
act: in my opinion it should never break a correct system, while in your
opinion apparently it's OK to compromise on that to make debugging
easier. (find below more detailed arguments about this.) If you do not
change your position about that issue we'll have to agree to disagree
about that point.)So unless i forgot about something (please yell if so), it seems to me
kgdb is now pretty ready for an upstream merge.here are my replies to the most recent points you brought up [in order
of how i perceive the topic's importance]:i gave you very specific technological reasons for why we dont want to
do spinning for now: we dont _ever_ want to break a correctly working
system with kgdb.A valid counter-argument is _not_ to argue "but it would be nice to have
if the system is broken in X, Y and Z ways" (like you did), but to point
it out why the behavior we chose is wrong on a correctly working system.Yes, a buggy system might misbehave in various ways but my primary
interest is in keeping correctly working systems correct.And note that kgdb is not just a "debugger", it's a system inspection
tool. An intelligent, human-controlled printk. A kernel internals
learning tool. An extension to the kernel console concept. Yes, people
frequently use it for debugging too, but the other uses are actuallybut your contention is simply wrong. Most of our debugging
infrastructure is non-modular for a good reason. Modularization
increases complexity and that's exactly the wrong direction forit's not reimplemented - kgdb_arch_pc() does not directly map to
that's already all taken out in -v10. (if you find any leftovers then
that's the approach we are taking: be as unintrusive as possible. This
means that the notifier here is registered at th...
This is not a technical argument, but I am not a big fan of hard hanging
the system if you cannot sync all the CPUs. The original intent was to
at least provide a sync error message to the end user after some
reasonable time. Then allow someone to collect any data you can get and
you basically have to reboot. The reboot was never forced, but assumed
the end users of this knew what they were doing in the first place.Certainly in a completely working system where you use kgdb only for
inspection this is not an issue, unless you use a breakpoint or single
step one of the smp_call functions. As we all know there are lots ofWe might be best served to add a comment to explain the purpose of
kgdb_arch_pc() and put it in the optional implementation function
headers in include/linux/kgdb.hOn some archs certain exceptions do not report the address that the
exception occurred at when you call instruction_pointer(). This optional
function allows for an arch to perform a "fixup" to get the address the
exception actually occurred at.Kgdb requires the actual exception address so a sanity check can be
performed to make sure kgdb did not hit an exception while in a chunk of
code kgdb requires for its functionality. If you hit one of these
conditions kgdb makes its best attempt to try to "patch the wound"
inflicted by shooting yourself but at least you get notified vs a silent
hang :-)Jason.
--
That was for the old longjmp checks, but that code is long gone isn't it
In what cases is that still needed?
-Andi
--
No this is a kgdb integrity check.
Basically when you reach this chunk of code it is before the hand off
to the source debugger. We cannot continue because there was a
breakpoint in a part of the system kgdb was using while doing its
normal work. The reality is that KGDB is not self contained. It
relies on some external I/O methods, atomic page fault handling and
some other pieces. If you take an exception there, the kgdb integrity
check absolutely needs to fail.It was the source of some great pain for folks who tried to use kgdb
for more than an inspection tool. There are some parts of the systemThis is a totally stupid test case but it should illustrate the problem.
- connect to kgdb
- place a break point at kgdb_disable_hw_debug()
- continue
- Now try to break in againYou will see the kgdb integrity check fail because you put a
breakpoint in a place that kgdb actually needs to perform its work.
In this case kgdb will clean up the exception and restore operation
and you will be ok. The kgdboc has a much smaller swath of code that
you cannot debug vs something like kgdboe where there are lot more
places you cannot debug due to the amount of code to services the
ethernet device, irq sync etc...This check is absolutely required to help prevent silent death via
dumb breakpoints or stepping around in random places (which is ill
advised anyway).Jason.
--
Don't you just need a simple recursion counter for this?
I cannot think of a reliable simple way to check for this using the instruction
pointer. The only way would be to use explicit annotations for all
possible code similar to what kprobes does (__kprobes), but that would be
hugely intrusive all over the tree.With the recursion counter the only problem would be someone
setting a break point on the early notifier code itself that contains
a recursion check, but that would be only a few lines of code
so the risk of someone setting a break point exactly there wouldI believe you, but I don't believe that your implementation
of this handles all cases.-Andi
--
It is more than a simple recursion check (which is already in the code)
because there are some conditions we can recover from. I'd rather not
crash the system out if it can be recovered.kgdb_reenter_check() has the "simple" recursion check with the
exception_level variable.It also has several special checks against the breakpoint situation I
described before as well as trying to remove all the breakpoints.
Ultimately if all those checks fails, it results in panic, because at
the end of the day that is all you can do if you re-enter the debugger
from the debugger (IE: you are toast).I don't see any reason to change the code there unless it can be further
improved some way. We want to fail loudly and verbosely when this kind
of thing happens so we can determine if it was the end user that caused
the problem or if there is a real defect.Jason.
--
Ok I'm trying to understand the code as you describe it. As far
as I can see (in kgdb-light-v10) it is:+ addr = kgdb_arch_pc(ks->ex_vector, ks->linux_regs);
+ kgdb_deactivate_sw_breakpoints();
+
+ /*
+ * If the break point removed ok at the place exception
+ * occurred, try to recover and print a warning to the end
+ * user because the user planted a breakpoint in a place that
+ * KGDB needs in order to function.
+ */
+ if (kgdb_remove_sw_break(addr) == 0) {and
+static int kgdb_remove_sw_break(unsigned long addr)
+{
+ int i;
+
+ for (i = 0; i < KGDB_MAX_BREAKPOINTS; i++) {
+ if ((kgdb_break[i].state == BP_SET) &&
+ (kgdb_break[i].bpt_addr == addr)) {
+ kgdb_break[i].state = BP_REMOVED;
+ return 0;
+ }
+ }
+ return -ENOENT;correct?
I don't think that code does what you describe at all. Are you
sure we're talking about the same thing?There is certainly no real protection against break points in
debugger code in there as far as I can see (except for the reentry
counter)-Andi
--
Perhaps we are talking about different things. I will agree that there
is no protection against putting a breakpoint in some code that is part
of the debugger. That was certainly not what I was talking about.You might be able to perform some protections at the breakpoint set time
or request to single step, but that is much more complex than is worth
it for the time being. The recursion check guards against basic
stupidity or accidental stepping out of a frame you didn't mean to.
Also there are a lot of non-obvious code paths that can get executed via
kgdb such as the fault handlers. The simple recursion check covers
enough cases that you don't want to live without it.If you want to improve it please provide some patches or further
elaborate on what needs to be fixed.Jason
--
I don't know -- I have not reread everything. Please don't consider
my comments as approval of the code base. I still think it does quite
a lot of dubious and ugly things overall and should get far more clean upStopping all CPUs for indefinite time very much seems like
"breaking a correctly working system" to me. In a correctly working systemThe only way I know of to do that is gdb vmlinux /proc/kcore
For that gdb vmlinux /proc/kcore already works fine. Or fireproxy.
The main complexity in module handling is handling (or rather preventing)
module unload. I explicitely excluded that in my earlier mail.
Module loading on the other hand tends to be relatively easy.I did a modular kernel debugger on my own some time ago and once
the interfaces were clean it was very simple. I think the reverse
is true too -- if having it as a module is easy then the interfaces
are clean too. That is why I asked for it. It's a good basicIf that is true then it is definitely misnamed and likely
Yeah, it is consistently wrong agreed.
-Andi
--
well, this is a small detail, but still you are wrong, and on a
correctly working system this will not occur. (if yes, tell me how)KGDB does a very straightforward "all CPUs enter controlled state"
transition when the session begins, and at the end an "all CPUs
continue" transition.I'm not sure what you mean exactly under "stopping all CPUs for
indefinite amount of time" (your statement is sufficiently vague to be
hard to counter via specifics) - that does not happen, unless the system
is so buggy that a CPU is not able to process an NMI anymore [which is
rather rare] - in that case the whole system is likely locked up anyway.
In that case the simplest and safest behavior is what kgdb-light does
currently: it will only proceed if all CPUs have responded. Note that
you are wrong to suggest that "KGDB locks up", the system _has already
locked up_.yes, we could "time out" and force a KGDB session even if some CPUs do
not respond. But it's obviously not a completely safe system state,
because other CPUs might be changing things under the feet of the
debugger. So the safest first-level approach is to not enter the
debugger in this case.Ingo
--
Quite frankly, I don't see why the kernel kgdb layer should have *any*
code like this at all.The one who is actually debugging is the one who should decide which CPU's
get stopped, and which don't.I realize that the gdb remote protocol is probably a piece of crap and
cannot handle that, but hey, that's not my problem, and more importantly,
I don't think it's even a *remotely* valid reason for making bad decisions
in the kernel. gdb was still open source last time I saw, and I think it's
reasonable to just say:- the kgdb commands should always act on the *current* CPU only
- add one command that says "switch over to CPU #n" which just releases
the current CPU and sends an IPI to that CPU #n (no timeouts, no
synchronous waiting, no nothing - it's like a "continue", but with a
"try to get the other CPU to stop"Yes, other CPU's will obviously often end up stopping due to waiting for
some spinlock or other if we stop one, but that's a separate issue, and
quite often it might be sufficient - and what we want.And yes, you'd likely have to add some support to gdb to make this
_usable_, but now all that usability crap, all those timeouts for "stop
all CPU's" are now in user space on the _debugger_ side. That can be as
fancy as it wants to be.And maybe this isn't realistic. I'm not saying "we _must_ do it this way",
I just want to say that the kernel kgdb layer should be as thin ass
humanly possible, and maybe the right thing to do is to simply totally
punt on the whole "stop other cpu's" issue and make it a debugger-side
question.In other words, is it perhaps possible to just *get*rid*of* that
"kgdb_active" and "nmicallback" and the whole multi-CPU roundup? Just use
a kgdb spinlock around the stuff that actually sends and receives
individual packets, and expect the debugger side to sort them out (yeah, I
suspect this involves having to add the CPU ID to each packet).Linus
--
The problem I see here is that the kernel tends to get badly confused
if one CPU just stops responding. At some point someone does an global
IPI and that then hangs. You would need to hotunplug the CPU which
is theoretically possible, but quite intrusive. Or maybe the "isolate CPUs
in cpusets" frame work someone posted recently on l-k could be used. Still
would probably have all kinds of tricky issues and races.-Andi
--
Yes. A stopped CPU is very visible and hence can change the behaviour of
I don't think you'd want to be poking around in kernel internals while some
of the CPUs are continuing to run. It sounds rather creepy. You want
everything to stop. Including time-related things.Bear in mind that one of the things you do with kgdb is to modify kernel
memory - I'd do things likeint foo;
...
if (foo == 1)
special_stuff();
...to trigger a particular behaviour at a particular time. If you're making
multiple changes, you want them "atomic" wrt all CPUs. (Of course, if you
happeed to breakpoint one CPU while it was partway through reading multiple
locations, you lose. But that's a teeny window).OT: another thing you can do with kgdb is error-path testing:
foo = kmalloc(...)
BP-> if (!foo)
recover();put a breakpoint on the !foo test and set foo to zero by hand.
--
Hi -
Just for completeness, keep in mind that one can already
If "foo" is a global within a particular compliation unit, any old
function in that CU can be probed to set/get the global. (Setting
incoming function parameters works too.)# cat delayed-set.stp
# set a systemtap script variable based on a /proc control file
probe procfs("activate").write { setit = $value }
global setit
# check the systemtap global in order to set the kernel global
probe kernel.function("foo_checking_fn") {
if (setit=="1") { setit = ""; $foo = 1 }
}
# stap -g -m ds delayed-set.stp &# cat setit2.stp
probe kernel.statement("*@dir/function.c:222") { /*if (desired)*/ $foo = 0 }
# stap -g setit2.stp &- FChE
--
You don't even need systemtap: it has been always possible with
gdb -write vmlinux /proc/kcoreThe only value add of the kgdb stub is really that you can single
step too and do post mortem (although the later works fine using
kdump/crash these days too)-Andi
--
Note I wrote the one above only as a reply to Linus' proposal, not
because I was advocating "live debugging" (or rather I think if you want
that just use gdb vmlinux /proc/kcore)I agree with you in principle, but what do you do if one of the CPUs doesn't
answer?Ingo seems to think it's ok for the debugger then to just hang too,
I think it should eventually time out and debug anyways.Also there are some limits on how much the system should be frozen
down. For example if you're strict about this requirement you
would need to require full DMA quiescence (like kexec does). But that's
obvious not a good idea for the debugger. So it's always shades
of gray, not black/white.What I find strange with the current patch is that it goes to extreme
measures to stop the CPUs (will hang forever), but not do the whole
thing (DMA quiescence too)-Andi
--
Andi, please refrain from misrepresenting my position, i can speak for
myself.As i stated it before, i'm not against adding some configuration on
roundup behavior as long as it's clearly add-on optional system policy.
In fact i already implemented that in the kgdb-light tree.But if your goal is to create smoke where there is none, then feel free
to continue to beat down on this poor strawman, ok? ;-) Meanwhile we'll
continue to clean up kgdb to make it ready for an eventual (and IMO well
deserved) .26 merge. All the useful bits of your comments have made kgdb
cleaner - thanks for that. But please lets keep this constructive.
Thanks,Ingo
--
You're thinking about this totally *wrong*.
You definitely do not want to hot-unplug or isolate anything at all.
That's explicitly against the whole point of kgdb not changing what it is
trying to measure.Just let the other CPU's hang naturally if they need to wait for IPI's
etc. What's the downside? That's what you were trying to do in the first
place by havign the kgdb callback!So you can't have it both ways. Either serializing other cpu's with kgdb
is good (the whole "kgdb_nmicallback" thing or whatever it was called), in
which case it's also perfectly ok to just let them stop when waiting for
IPI's.My point was *not* that kgdb should take control of one CPU, and the other
CPU's should continue to work as if nothing happened. That is insane and
impossible (since you may be stopping a CPU while it holds central
spinlocks etc). No, my point was that I think kgdb should be as light and
non-intrusive as possible, and that any "higher level behaviour" (like the
decision of whether to try to synchronize other CPU's or not) should be
left to the debugger.But only if that makes kgdb patches less intrusive!
In other words, I'm not at all trying to push any particular solution
here, except for the "keep it simple, and anything even remotely debatable
or intrusive to the system should be excised". And I wanted to point out
that maybe all these timeout etc decisions can be pushed to the debugger.So I think we can either:
- have no timeouts or other fancy crap _at_all_, with very simple locking
(ie looks what v10 mostly seems to do)- or you do the fancy dance entirely in the remote debugger.
I don't care. The only thing I care about is that kgdb support never
_ever_ shows up in any interesting code, and that it remains totally
invisible to essentially all of the kernel except the place that would
otherwise print out an oops.And I absolutely don't want it to be fancy, I want it to be so simple that
even _I_ can look at it and say "I...
I agree that it wouldn't be a good idea -- i was just pointing out
There tend to be timeouts (e.g. softlock/nmi watchdog at least). I think
some of the IPIs eventually time out too. In general losing a lot
of time can lead to weird side effects.-Andi
--
I do agree that kgdb and watchdogs aren't like to work well together. And
I don't think you can sanely expect to not have a "disable lockup
detection" when you start poking around with kgdb.We also just have to expect that time will also stop while sometbody is
messing around with kgdb.So I don't dispute that any kernel debugger will *always* be intrusive in
those ways. That's pretty inevitable. I just think the code itself can try
to avoid hooking into various places all over the map.Linus
--
i actually think that the notion of "stopping all system state" is
rather intuitive from a debugging POV: when you have a bug trigger
somewhere then getting an NMI to all CPUs and stopping them dead in
their tracks preserves us the system in its most useful state.also, when using kgdb to "look at system state" it's best to have as
little "unrelated" activity as possible.the timeout argument brought up by Andi was IMO really just pulled here
by its hair, it never happened to me in practice and i was surprised it
even came up. (i never before saw "KGDB roundup" fail - and i tried it
on an 8-way system too) I think the memory-access routine cleanups were
far more interesting from a failure scenario POV. Maybe Andrew could
shed some light on his experience and preference here - he's been using
KGDB for many years.Ingo
--
on a second thought - i actually think it's rather possible and
straightforward to do what you suggest. Stopping of all CPUs is still
useful, but should be an optional property. I'll play with this a bit
and see how GDB reacts.Ingo
--
Stopping with interrupts off. Nothing scheduled anymore.
An easy definition for the condition is anything that requires
touch_{nmi,softlockup}_watchdog [which kgdb definitely does,While that is a slight risk that problem is already there anyways.
Lots of agents in the system could do that. Do you plan to stop
all DMA too for example if you're so worried about this?
Or how about SMM code changing something?Anyways the slight risk of the other CPUs eventually recovering
would seem a acceptable trade off versus not being able to use
the debugger to debug the system with hanging CPUs.A possible compromise between my and your position on this would
be also having an option for this, with default to off
(although I would expect that would be a inconvenient default for
many people)-Andi (who retires from this thread now, I already spent too much on this)
--
Quite frankly, if kgdb starts doing somethign "fancy", there is no way
I'll merge it.This includes things like having "breakpoint reservations" (discussed
earlier) and just generally trying to add lots of infrastructure to make
kgdb "fit in" to the kernel.The fact is, not having kgdb enabled should have _zero_ impact on the
kernel, but the implication is also that when you *do* enable kgdb, that
shouldn't change anything visible either: the rest of the kernel should
simply not _possibly_ be able to even tell or care (which is why
breakpoint reservations are out - they are against the whole point of
debugging a unmodified kernel).So kgdb in my opinion will *have* to step on some other kernel
infrastructure. I also think that things like timeouts are not something
the client should do - if a kgdb lock never gets through, then it should
be the debugging end that should just react to ^C and tough titties: it's
not like there's a whole lot to be done.In other words: I think the kgdb patches have become a lot more palatable
over the last week, but I think so exactly because they have gotten
smaller and less invasive. Anything that evokes any discussion AT ALL
should just be removed. It really should be that simple.[ The exception being that I think hw breakpoint support should be added
back in - never mind that it won't work if the "native kernel" also uses
them. Tough.If the debugger screws up the hw breakpoint state, that's a debugger
error. I'd hope that the remote debugger *defaults* to just re-writing
the instruction stream for that reason, but if you want to do a data
breakpoint, you simply *have* to use the hw breakpoints for kgdb.And you just need to live with the fact that it simply won't be possible
to do if the client wants to use hw breakpoints too. If the client
starts using hw breakpoints - tough titties, it takes them. ]So keep the damn thing really simple, and don't try to handle every
possible thing. W...
HW breakpoints (particularly data breakpoints) are extremely useful
and I am proposing to put them back into kgdb-light. The HW
breakpoint patch worked exactly as described. If the user space
writes into the HW breakpoint registers via ptrace, those take
priority over the kernel HW breakpoints.We have to assume that the person debugging the kernel ultimately has
full control of the system, and will do what ever is needed to
debug/inspect the kernel how he or she chooses.Of course at some future time if there is a generic API to use for the
kernel level HW breakpoints, kgdb can adapt like anything else.Jason.
--
yes, that's what i've been doing. When anything became questionable even
just a little bit, it was the axe or i changed it to something really
obvious and correct.by 2.6.26 kgdb-light will become so neutral to the rest of the kernel
yeah. I believe we need to achieve a "known zero impact, 100% trustable"
yes - if something locks up, it will be the NMI watchdog starting the
debugger eventually - not the other way around. The NMI watchdog is
already system policy which can be turned on/off and is well known and
expresses the user's wishes with what should happen to the system.The debugger should really be a fundamentally very passive "console"
type of thing, with as little direct policy as possible. That minimizes
the chance that it accidentally messes up something and also makes it
more likely that it will actually work reliably and dependably.Ingo
--
I think that part is actually mostly ok now (old kgdb stubs were
much worse in this regard)I still think the ultimative proof for this would be working
While agreed in theory there are some exceptions: any time an oops
happens or a crash dump would happen automatically kgdb should definitely
do something very visible. Unfortunately that's not always the caseThe problem here is that the timeout we were talking about
is not integrated into the kgdb event loop, it is rather the loop
that protects the event loop. If you hang there all inputIf you want areally simple kgdb the best option would be re-porting
the really simple (tm) 2.4 era x86-64 kgdb stub I did long ago.
It had a small fraction of the code size of the current code, but
was also missing a lot of features of course.-Andi
--
To pick up this idea again I did the experimental patch below. It
applies against Jason's latest kgdb-light patch queue:http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=s...
The patch nicely demonstrates what deeper dependencies on kernel
services currently exist in kgdb-light. The following symbols were
unresolvable:o genapic - for send_IPI_allbutself, ie. CPU roundup
o machine_emergency_restart - for implementing "R0" gdb packet
o idle_task - kgdb tells the per-cpu idle tasks apart
o clocksource_touch_watchdog - obviousFor simplicity reasons I just gpl-exported all of them. The result is
a fully modular kgdb that may help to reduce concerns regarding its
intrusiveness.Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
arch/x86/kernel/Makefile | 4 +++-
arch/x86/kernel/genapic_64.c | 1 +
arch/x86/kernel/{kgdb.c => kgdb-x86.c} | 0
arch/x86/kernel/reboot.c | 1 +
arch/x86/mach-generic/probe.c | 1 +
kernel/Makefile | 1 -
kernel/sched.c | 1 +
kernel/time/clocksource.c | 1 +
lib/Kconfig.kgdb | 2 +-
{kernel => lib}/kgdb.c | 8 ++++++--
10 files changed, 15 insertions(+), 5 deletions(-)
rename arch/x86/kernel/{kgdb.c => kgdb-x86.c} (100%)
rename {kernel => lib}/kgdb.c (100%)diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4cd39cd..2e733b1 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -58,7 +58,6 @@ obj-$(CONFIG_MODULES) += module_$(BITS).o
obj-$(CONFIG_ACPI_SRAT) += srat_32.o
obj-$(CONFIG_EFI) += efi.o efi_$(BITS).o efi_stub_$(BITS).o
obj-$(CONFIG_DOUBLEFAULT) += doublefault_32.o
-obj-$(CONFIG_KGDB) += kgdb.o
obj-$(CONFIG_VM86) += vm86_32.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o@@ -79,6 +78,9 @@ endif
obj-$(CONFIG_SCx200) += scx200.o
scx200-y...
Very nice! If it's that simple then the kgdb integration is really
I would rather export some generic wrapper for that than the full genapic
Hmm, might be a bit dangerous to call this directly -- there are various
quirks with e.g. not rebooting on CPU #0 and not resetting APIC
state. But ok [this is not directly related to the fact that it's
exported now, just mentioning this in general]-Andi
--
kgdb was never expected to be any better than using a sysrq-b. Unless
you put some really ugly code in to deal with lots of edge cases there
are certainly times it is just going to force you to press the power
button. If there is a better function for "reset now and cross fingers"
kgdb can be changed easily enough to do that.Also given that you can have more than one type of "R" packet it would
be easy enough to add other reboot types, such as an R1 which might call
to the bios or another means to reset. For now the intent is pretty
clear and there is not likely a need to expand the scope.Jason.
--
see? There's the difference between us. The initial merge of KGDB does
not want to include policies for nuances that pose an "acceptable slight
risk".Had we done it, you could have (rightfully in that case, i have to say)
complained about that "slight but real danger" of a debugger activatingyes, i said this the first time you mentioned this issue that while it's
all no big deal, it could perhaps be a sensible (but default-off)
layered-on option for later. But the initial merge is for obvious stuff
and it does not want to do such things.Ingo
--
Ok fine. Please implement DMA quiescence on kgdb entry then; otherwise
the code won't conform to your standards at all.-Andi
--
well, it does not take a mindreader to conclude that nothing we do could
possibly result in you "concluding" that kgdb would be ready for a merge
;-)for me it suffices that you've run out of technical arguments and are
Ingo
--
Let's put it like this: the amount of problems I found by reading
only about 20% of the kgdb patch justifies the statement above. Also
contrary to your repeated claims many of those are not fixed yet
as far as I can see.Also every time I dig into something new problems raise their
head -- e.g. see earlier exchange with Jason about the likely
bogus recursion check.Overall that's not a sign of a clean merge ready code base.
-Andi
--
There was work underway on that before (hw_breakpoint). I'm not entirely
sure you want to use fancy stuff like that in kgdb. It's nice for kgdb to
be as self-contained as possible, so you can debug everything else with it.
If you're going to rely on lots of higher-level infrastructure, it could be
using kprobes for its software breakpoint handling, too.At any rate, I think it would be good if the hw breakpoint support in kgdb
were chopped out into a separate patch. First make kgdb work with no code
touching debug registers at all. Then a second patch can add the hw
breakpoint support. The latter one can be mulled over or replaced with a
hw_breakpoint merge or just put off for a while because it's a big can of
worms to multiplex the use of the debug registers.Thanks,
Roland
--
agreed, and i've done that now. Thanks,
Ingo
--
1) The content is valid no matter the formatting
2) It is a well known format
3) And it is widely usedUsing a wellknown format does not require it to be postprocessed.
It is by far better than "one-format-for-each-developer".Sam
--
They seem to have reverted these patches now, so that problem
is gone.Yes there are a couple of other consumers of CFI too, like crash
or the out of tree (but not dead) in kernel unwinder.-Andi
--
| Tarkan Erimer | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Greg Kroah-Hartman | [PATCH 001/196] Chinese: Add the known_regression URI to the HOWTO |
| Benjamin Herrenschmidt | Re: [PATCH] Remove process freezer from suspend to RAM pathway |
| Bart Van Assche | Re: Integration of SCST in the mainstream Linux kernel |
git: | |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Arjan van de Ven | Re: [GIT]: Networking |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| Natalie Protasevich | [BUG] New Kernel Bugs |
