Re: [ANNOUNCE] Merkey's Kernel Debugger

From: jmerkey
Date: Sunday, August 3, 2008 - 10:22 am

This is a linux port of the kernel debugger I wrote in 2000 for the
MANOS/Gadugi Operating System.  I created this particular
port in June of this year from the MANOS/Gadugi source code I released
under the GNU public license in 2000.

I wrote the SMP debugger used in SMP Netware in 1994 and 1995, and that was
later rolled into the main Netware kernel, though a lot of folks
contributed and helped merge it into Netware.  This debugger closely resembles
the legacy Netware kernel debugger, and I find it easier to use than kdb,
with fewer crashes and problems.

This version is ia32 only at present, but I am completing x86_64 support
and will post it as it is completed.  I basically wrote this tool for
my own internal use and for my projects since I could not find a debugger
in linux I was used to.  I add support to it as I need it for my own
internal use.

This linux port of my kernel debugger does not require kdb or the kdb hooks
and is more minimal than kdb and has some features kdb does not, such as
Intel style disassembly with dereferencing of data during disassembly
and a very robust mathematical numeric support with conditional breakpoints.

I created a far more robust version of this debugger in 2001 which
included source level support, integrated screen and keyboard support,
remote networking capability, and loader support and licensed it to another
company.  I was placed under a 5 year non-compete not to port
this tool to Linux until end of year 2007.  The folks who licensed it
did absolutely nothing with it of consequence, and 2007 has come and gone,
so I am released from the non-compete and decided to port the debugger
from my old Open Source operating system and I figured it might be as useful
to others as it has been for my projects.

I will be posting user space modules which can be loaded with this version
at some point which will enable source level debugging and a bunch of
other features.  These add-ons may get farmed out to another company for
support.

KNOWN ...
From: jmerkey
Date: Sunday, August 3, 2008 - 12:36 pm

This patch is formally submitted for consideration for inclusion in the
base linux kernel.

ftp://ftp.wolfmountaingroup.org/pub/mdb/mdb-2.6.26-ia32-08-02-08.patch

Jeff

--

From: Rene Herman
Date: Sunday, August 3, 2008 - 1:00 pm

Haven't actually looked, but you should've probably waited just a bit 
for people to start using and then getting fed up with kgdb...

Rene.
--

From: Josh Boyer
Date: Sunday, August 3, 2008 - 5:14 pm

Formally submitted patches should be sent to the list inline.  Reviewing
something on an FTP server just becomes that much harder.

josh

--

From: jmerkey
Date: Sunday, August 3, 2008 - 7:19 pm

Jeff


--

From: Stefan Richter
Date: Monday, August 4, 2008 - 6:41 am

Some non-technical comments to the patch series:
  - Each patch posting in a patch series should have an own Subject and
    changelog which specifically describes the included patch.
  - The Developer's Certificate of Origin is written simply as a single
    line:
    Signed-off-by: Jeffrey Vernon Merkey <email@address>
    This line needs to be included in the changelog of each patch, i.e.
    precedes the diff.  (Tools which harvest patches from mboxes are
    trained to pick the changelog up from before the diff.)
  - The MUA rewrapped some lines.
  - File name and date of last change are redundant information and are
    better left out of the source files.
  - Understandably for a port from other kernels, there are clashes with
    Linux kernel's coding style like CamelCase names, comment style,
    indentations.
  - Why define LONGLONG, WORD, BYTE and so on?  They could be plain
    unsigned char etc., or u8 etc. if you like it brief.
  - Boolean values should be the standard true and false, not locally
    defined TRUE and FALSE.
  - Usually the #include's are not collected in an intermediary header
    (as in patch 7/25) but put directly into the files which require
    a particular #include.
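
The points above about LONGLONG/WORD/BYTE and TRUE/FALSE can be illustrated with a small userspace sketch. In-kernel code would take u8/u16/u64 from <linux/types.h>; the typedefs below are stand-ins built on <stdint.h> so the fragment compiles outside the kernel. The struct and function names are hypothetical and are not taken from the mdb patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's fixed-width types (assumption:
 * inside the kernel these come from <linux/types.h>). */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint64_t u64;

/* Hypothetical breakpoint record, for illustration only. */
struct bp_record {
	u64  address;   /* fixed 64-bit width on every architecture */
	u16  flags;     /* instead of a WORD typedef */
	u8   length;    /* instead of a BYTE typedef */
	bool enabled;   /* standard bool instead of local TRUE/FALSE */
};

/* A breakpoint counts as armed when it is enabled and has a length. */
static bool bp_is_armed(const struct bp_record *bp)
{
	assert(bp);
	return bp->enabled && bp->length != 0;
}
```

A caller would fill in a struct bp_record and test it with bp_is_armed(), comparing against true and false rather than locally defined constants.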

I haven't looked in detail at the patches; it's far out of my area of
experience...
-- 
Stefan Richter
-=====-==--- =--- --=--
http://arcgraph.de/sr/
--

From: jmerkey
Date: Monday, August 4, 2008 - 7:33 am

OK,  Sounds like I get a D- on patch format submission.   I will rework
the patches, switch back to GPL2 (since I guess GPL 3 is still not there
yet) and clean up this list of issues.   ULONG, etc. is Microsoft syntax
for cross platform compatibility.  Since this is a LINUX SPECIFIC PATCH,
I'll rip out and rework the Gates-isms in the code.

All that aside, the damn thing works so at least folks can start using it while
I perform code beautification.



--

From: Geert Uytterhoeven
Date: Tuesday, August 5, 2008 - 2:41 am

You're aware that the Microsoft assumption

    typedef unsigned long  ULONG

is not compatible with 64-bit platforms in the rest of the world?
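
A minimal sketch of the problem, assuming the usual data models: on LP64 systems such as 64-bit Linux, unsigned long is 64 bits wide, while code written against the Microsoft convention expects ULONG to stay 32 bits. Fixed-width types avoid the dependency. The struct and helper names below are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical register dump that must keep 32-bit fields everywhere. */
struct reg_dump {
	uint32_t eflags;   /* always 32 bits, unlike unsigned long */
	uint32_t eip;
};

/* The layout is checked at compile time and does not depend on the
 * width of unsigned long on the build target. */
_Static_assert(sizeof(struct reg_dump) == 8, "fixed-width layout expected");

/* Width of unsigned long in bits on the build target: typically 32 on
 * ia32 and 64 on x86_64 Linux. */
static unsigned int ulong_bits(void)
{
	return (unsigned int)(sizeof(unsigned long) * CHAR_BIT);
}
```

Any field that was declared ULONG on the assumption of 32 bits would silently double in size on an LP64 build, whereas the uint32_t fields above do not change.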

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds
--

From: jmerkey
Date: Tuesday, August 5, 2008 - 8:02 am

No I was not, but I am now.  At any rate, I removed the Microsoft-isms
from the code.  I can cut yet another patch for git6, but git5 was there
-- GPL2 and all.  How about putting it into the kernel, guys -- :-)

Jeff



--

From: Nick Piggin
Date: Tuesday, August 5, 2008 - 8:33 am

Seriously? Because it doesn't seem to have had enough peer review,
it hasn't had widespread testing in somewhere like linux-next or
-mm, and we already have kgdb so you have to also explain why you
can't improve kgdb in the areas it trails mdb.

But the ideal outcome would be if you could contribute patches to
kgdb to the point where it is as good as mdb. It is already in the
tree and supported by a handful of architectures... any chance of
that? (I don't know kernel debugger code, so I ask as an interested
user)
--

From: jmerkey
Date: Tuesday, August 5, 2008 - 8:19 am

If you go back to LKML from 2000, this debugger has been around for 10
years.  I agree not in the hands of the public, but it's very mature.

I plan to work on kdb and yes, there is a version of this that runs
as an alternate debugger of kdb - you can even switch back and forth
between them - but that misses the point as well.

I can wait until it's more widespread -- or not.

Jeff



--

From: Nick Piggin
Date: Tuesday, August 5, 2008 - 8:45 am

OK I don't doubt that at all, but I just mean in terms of being reviewed
by Linux people and how it merges with the current kernel (eg. we now

That would be great if you do work on kgdb... But I guess I do miss
the point, then. Is there a technical difference with kgdb that cannot
be worked around, a difference of opinion with maintainers, a wish to
have mdb features at short notice?
--

From: jmerkey
Date: Tuesday, August 5, 2008 - 8:32 am

Nick, it's OK.  There have been 27,453 downloads of the patches from my ftp
server since yesterday when I posted it -- from what I am seeing people are
voting with their feet.   People can get it and I even posted it to
SourceForge as well.  After ten years of working on Linux I thought it
would be nice for something I wrote to end up there.  It will happen when
it's time.  As it stands, people are using it and it is going to help a lot



--

From: Nick Piggin
Date: Tuesday, August 5, 2008 - 9:38 am

That's all well and good :). But it didn't exactly answer my question.
My question was not what is the point of you writing these patches, but
what is the point of merging it into the kernel (over the alternatives).
It may seem like a trivial question, but it is one that must be answered
in order to be considered to get merged.
--

From: jmerkey
Date: Tuesday, August 5, 2008 - 9:45 am

Integrated kernel debugger in linux (minimal one) and given that there are
already patches to add tickets and text to locks and other tools, one more
can only help.    This is by no means the full MDB debugger you have seen,
just a pared down core I submitted.  The entire MDB debugger is much
larger.

I have been working on it for ten years, and you may or may not have
noticed, I typically do not ask many questions these days from the
community for my appliance and router development, nor ask for help for
any of the companies I have created and sold based on Linux over the past
ten years since I have tools to fix my stuff without needing a hardware
based inverse assembler like most folks need to debug hardware and file

Jeff

--

From: Rene Herman
Date: Wednesday, August 6, 2008 - 12:47 pm

Nick, please note there is/was some mis-communication between the two of 
you with respect to kgdb, the currently merged GNU debugger interface, 
and kdb, the SGI kernel debugger.

Merkey responded to you as if you asked about differences with KDB while 
what you did was ask about differences with KGDB. Both KDB and MDB are 
significantly different from KGDB at least insofar as the latter is a 
remote debugger; it requires two machines. KDB and MDB are local.

This makes KDB and MDB more accessible for small time use at least. The 
other most profound advantage is of course that it's not GDB.

Rene.
--

From: Chris Friesen
Date: Tuesday, August 5, 2008 - 9:04 am

Without public use, it's difficult to determine that there aren't any 
nasty interactions.

If you want to maximize your chances of getting this code into the 
kernel, you might want to read Jonathan Corbet's post, "[PATCH] A 
development process document, V2".  It discusses the normal process, how 
to prepare patches for submission, etc.

Chris
--

From: jmerkey
Date: Tuesday, August 5, 2008 - 9:39 am

Read it already.  Quite a few large companies are using it at present and
have been since 2000, BTW.

Jeff




--

From: Daniel Barkalow
Date: Thursday, August 7, 2008 - 1:43 pm

The criterion for kernel inclusion isn't really whether it works, however. 
It's whether other people would be able to understand it well enough to 
support it if you disappear (or if somebody else has changes that require 
changes to it). If it works well but isn't nice code, nobody really 
benefits from having it in the kernel distribution rather than external 
(like it's been for the past 8 years). If it is nice code (somewhat 
regardless of whether it happens to work right now), people can work on it 
and keep it in sync with the kernel as they change things.

	-Daniel
*This .sig left intentionally blank*
--

From: jmerkey
Date: Thursday, August 7, 2008 - 2:02 pm

It both works and is nice code.   But I may not be impartial.

Jeff

--

From: jmerkey
Date: Thursday, August 7, 2008 - 2:04 pm

You activate it from the keyboard the same way as kdb -- pause/break or
int X or exceptions.

Jeff


--

From: Paul Mundt
Date: Tuesday, August 5, 2008 - 10:21 am

That's great, except kgdb has existed in the kernel for various
architectures well before that as well. ppc32's stub dates back to 1998,
sh had it since 2001, mips around the same time, etc, etc. While the
current rework and tidying of the stubs is something new, kgdb itself is
kgdb and kdb are totally different things, kgdb is what is generally
available and worth improving in-kernel.

While it's certainly good to have options, having multiple in-kernel
debuggers is not going to help matters for the vast majority of users. I
agree with Nick, it would be nice to see what we have in-kernel being
extended and worked on by more people, especially those with a background
in these things.

On the other hand, it seems like there's sufficient interest in your
project out-of-tree, so there's not really much point in merging it if
you're content with the interface as it exists today and it continues to
work for your users.

One of the things we can do however is try to provide cleaner
abstractions for the various debuggers to tie in to, so we don't end up
with each debugger piling on its own set of ifdefs in all of the same
places (int3 handling comes to mind, which you could already do more
cleanly through the die chain today). Perhaps it would be more useful to
see what sort of hooks mdb wants in the architecture and core code, how
those overlap with kgdb, and how we might extend kgdb in areas where mdb
is more feature complete.
--

From: jmerkey
Date: Tuesday, August 5, 2008 - 10:10 am

Not your call to make.  Kernel Debuggers are very personal choices and
it's pure arrogance to assume any of us can make a choice for someone else
with tools.  My tastes in debuggers is like my tastes in food, or women,


This is a great suggestion.  mdb already uses an alternate debugger
interface with the hooks into traps_XX.c and reboot_XX.c.   I still would
like to see it in kernel, but an alternate debugger interface as you
point out is almost a necessity at this point.  There's a good example in
mdb.c and mdb-list.c.


Jeff

--

From: Andi Kleen
Date: Tuesday, August 5, 2008 - 8:08 pm

I don't think kgdb and a simple assembler debugger 
are directly comparable. kgdb always requires a remote machine,
which has many advantages, but is also often very inconvenient
or impossible to arrange. A low overhead assembler debugger
can always be compiled in just in case.

Also at least for the x86 port the debugger interfaces should
be general enough now (see die hooks as a "debug vfs") that it would
be quite possible to have a multitude of debuggers just using 
them. In fact that's already the case, kprobes and kgdb and 
kdump are all kinds of debuggers using such hooks.

As long as it doesn't impact the core code and the mdb 
code itself is considered merge worthy and has clean interfaces 
that would seem fine to me. It essentially would just live somewhere in 
its own directory using the existing interfaces. My standard
test for seeing if a debugger has clean interfaces is to see
if it can be loaded as a module.

There are enough different debugging styles around that offering
developers different tools of which they can pick whatever suits
them is not a bad idea. Also as everyone knows debugging
is often a major time eater and if more tools are available that 
can only help the kernel.

That said I haven't read the mdb code, am not judging its general
merge-worthiness, nor am I completely sure what all the details
of a "netware style debugger" are; this is just a general high level
comment on debuggers. Judging by the patch sizes it at least
doesn't seem particularly bloated.  But of course it would need full
proper review first.

-Andi
--

From: Nick Piggin
Date: Tuesday, August 5, 2008 - 10:50 pm

OK thanks for the info. I don't actually know debugger code as I
said, so I wasn't against merging mdb if it offers things that
kgdb fundamentally cannot.

If so, then ensuring clean interfaces indeed would seem like a
good first step to getting it merged.
--

From: Christoph Lameter
Date: Thursday, August 7, 2008 - 10:45 am

The competing implementation is kdb not kgdb. kgdb is just a stub for remote
debugging using gdb. kdb is an in-kernel debugger like the one proposed here.
--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 11:08 am

Is there work underway to get kdb merged?  (I'm just asking because I 
don't know; I personally don't need kdb nor mdb.)
-- 
Stefan Richter
-=====-==--- =--- --===
http://arcgraph.de/sr/
--

From: Christoph Lameter
Date: Thursday, August 7, 2008 - 12:10 pm

KDB still exists in patches but the merge effort was given up when Linus
stated that he did not want a kernel debugger. No problem to start merge
attempts again AFAICT. Jay?

--

From: Jay Lan
Date: Thursday, August 7, 2008 - 12:47 pm

To merge KDB or any other RAS tool, you need to deal with kdump. Kdump
hijacks panic() before the die calling chain. For KDB or a RAS tool to
work, an infrastructure such as the "add new notifier function" by
Takenori Nagano should be in place.

His last attempt fell short, in my opinion, partly because his
"[PATCH 3/3] Move crash_kexec() into  panic_notifier" did not do what it
was meant to do: fit kexec/kdump into the new infrastructure. That is
not fatal; it can be fixed to make it right. If the community is interested
in getting a kernel debugger into the kernel, we can continue Takenori's
work. Once the infrastructure is accepted, then merging KDB or any other
kernel debugger will make sense.

Regards,

--

From: jmerkey
Date: Thursday, August 7, 2008 - 12:34 pm

As I look through entry_32.S and traps_32.c I do not see where kdump is
hooking the notify_die handler which would intercept calls to a debugger.

Where does kdump hook this path?



--

From: Vivek Goyal
Date: Thursday, August 7, 2008 - 6:26 pm

kdump uses crash_kexec() call for hooking. It hooks in panic(), die_nmi()
and die().

Thanks
Vivek
--

From: Andi Kleen
Date: Thursday, August 7, 2008 - 1:06 pm

Imho kdump should just be fixed to use die chains.

-Andi
--

From: Bernhard Walle
Date: Thursday, August 7, 2008 - 1:07 pm

Well, we had that discussion several times. I'm not against it
(instead, I would like it), but I don't think that repeating the
discussion over and over does help ...




Bernhard
-- 
Bernhard Walle, SUSE LINUX Products GmbH, Architecture Development
--

From: Andi Kleen
Date: Thursday, August 7, 2008 - 1:09 pm

Just needs some code? 

-Andi
--

From: Bernhard Walle
Date: Thursday, August 7, 2008 - 1:11 pm

No, it was rejected with the argument that in the panic case, as little code
as possible should be executed before kexec'ing the panic kernel.

See also: http://kerneltrap.org/node/14050 (for example)



Bernhard
-- 
Bernhard Walle, SUSE LINUX Products GmbH, Architecture Development
--

From: Keith Owens
Date: Thursday, August 7, 2008 - 3:28 pm

Violently agree, especially since the IA64 handling of NMI type
events is significantly different from x86 and requires at least two
callbacks via the die chain.

Alas the kdump authors are adamant that they will not use die chains,
which makes it almost impossible for any other RAS code to coexist with
kdump.  This intransigence on the part of kdump is one of the reasons
that I gave up on getting _any_ RAS code (not just KDB) into the Linux
kernel.

See http://kerneltrap.org/node/14050 and
http://marc.info/?l=linux-arch&m=116304508731232&w=2, the latter
explains why you need die chains to handle IA64 correctly.  x86
debugging is relatively easy, ia64 is hard due to interactions between
the firmware and the OS, either can stop the other cpus.  If your
debugging framework does not handle ia64 INIT and MCA events, then you
cannot debug most of the interesting ia64 events.

In any case, we have gone round this loop too many times for me to care
about it any more.  I have given up on Linux RAS code.

--

From: Vivek Goyal
Date: Thursday, August 7, 2008 - 6:15 pm

I am doing a quick source code grep and in all the cases except panic,
kdump gets a chance to run in the end. We are running die notifications
first. For example, in the case of nmi, in the case of traps,
in the case of mce, notifier list is being executed first. So a debugger
or any other RAS tool on the notifier chain will get a chance to
run first.
 
panic() is the only place where kdump gets a chance to run first and
panic notifiers are not executed.
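
The ordering described above can be modelled in a few lines of userspace C. This is only a sketch of the control flow as stated in this thread (crash_kexec() is tried first in panic(), and when a crash kernel is loaded the panic notifiers never run); the struct and function names are hypothetical and nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_EVENTS 8

/* Records which steps of the modelled panic path were reached. */
struct panic_model {
	bool crash_kernel_loaded;
	const char *events[MODEL_MAX_EVENTS];
	size_t nr_events;
};

static void model_record(struct panic_model *m, const char *what)
{
	assert(m->nr_events < MODEL_MAX_EVENTS);
	m->events[m->nr_events++] = what;
}

/* Models panic(): the kdump hook runs before anything on the panic
 * notifier list gets a chance. */
static void model_panic(struct panic_model *m)
{
	model_record(m, "crash_kexec");
	if (m->crash_kernel_loaded)
		return;	/* the real hook would jump to the capture kernel */
	model_record(m, "panic_notifier_list");
}
```

With crash_kernel_loaded set, only "crash_kexec" is recorded, which is the situation a debugger registered on the panic notifier list finds itself in.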

To me so far only an in-kernel debugger seems to be a reasonable candidate
which needs to run before kdump after a panic event. If a debugger
is really getting merged into the kernel, then I think the debugger can
put a hook in panic() before kdump. Wouldn't this solve the problem?

Thanks
Vivek
--

From: Andi Kleen
Date: Thursday, August 7, 2008 - 7:29 pm

To be fully clear panic() that is called outside oops/exception context


Yes a kernel debugger should be able to hook into panic()

In fact it can do that already by just setting a break point,
but clearly having a real notifier is preferable.

The use case would be then that the kernel debugger would

kgdb is already merged. Also the x86 notifiers are general
enough that there are a couple of debuggers floating around
that are just using existing interfaces (as in need very little in terms

Yes it would, but right now there is no such hook. Also if there 
was such a hook kdump could use it like everyone else.  

There's a priority scheme in notifiers so you can still run usually last.
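
A userspace sketch of such a priority scheme, assuming the kernel's convention that higher-priority entries are called first and that a callback may stop the chain. All names are hypothetical stand-ins; the real interface lives in <linux/notifier.h>.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MODEL_NOTIFY_DONE 0
#define MODEL_NOTIFY_STOP 1

struct model_notifier {
	int (*call)(unsigned long event, void *data);
	int priority;			/* higher runs earlier */
	struct model_notifier *next;
};

/* Insert n so the list stays sorted by descending priority; entries
 * with equal priority keep registration order. */
static void model_register(struct model_notifier **head,
			   struct model_notifier *n)
{
	assert(n && n->call);
	while (*head && (*head)->priority >= n->priority)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}

/* Call every entry in order until one asks to stop. */
static int model_call_chain(struct model_notifier *head,
			    unsigned long event, void *data)
{
	int ret = MODEL_NOTIFY_DONE;

	for (; head; head = head->next) {
		ret = head->call(event, data);
		if (ret == MODEL_NOTIFY_STOP)
			break;
	}
	return ret;
}

/* Example callbacks that log the order in which they ran. */
static char model_log[8];
static size_t model_log_len;

static int model_debugger(unsigned long event, void *data)
{
	(void)event; (void)data;
	if (model_log_len < sizeof(model_log))
		model_log[model_log_len++] = 'D';
	return MODEL_NOTIFY_DONE;
}

static int model_kdump(unsigned long event, void *data)
{
	(void)event; (void)data;
	if (model_log_len < sizeof(model_log))
		model_log[model_log_len++] = 'K';
	return MODEL_NOTIFY_DONE;
}
```

Registering model_kdump with priority INT_MIN and model_debugger with priority 0 makes the debugger callback run first and the kdump callback last, regardless of registration order.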

-Andi
--

From: Cliff Wickman
Date: Friday, August 8, 2008 - 5:08 am

Agree.
And here is another example of the need for such a hook:

In a partitioned system [I work for SGI, so I'm talking about an Altix],
there is memory sharing among multiple single-system images. And if
one of those partitions were to panic the other partitions need to
be informed that they cannot address the panic'd partition's memory.
(Once that partition is rebooted any such access will cause an MCA
in the accessor.)

So the cross-partition driver (xpc) needs to run a callback there, too.
It seems to me, as Keith has voiced, that it should be the user's choice

-- 
Cliff Wickman
Silicon Graphics, Inc.
cpw@sgi.com
(651) 683-3824
--

From: Andi Kleen
Date: Friday, August 8, 2008 - 5:20 am

There are already existing shutdown hooks. Aren't they good enough
for that?

I would feel uneasy about having arbitrary drivers hook into panic().
While I'm sure your code is great there is unfortunately a lot 
of crappy driver code around.

-Andi
--

From: jmerkey
Date: Friday, August 8, 2008 - 6:19 am

I hooked panic last night and inserted a notify_die hook -- there is even
a state defined for it already -- DIE_PANIC.  The rest of the code should
be ok.  My only question was where to harvest the regs variable since
panic is not a real exception.

Here's a first stab.  You also must add #include <linux/kdebug.h> to the
top of panic as well.

diff -Naur linux-2.6.27/kernel/panic.c linux-2.6.27-mdb/kernel/panic.c
--- linux-2.6.27/kernel/panic.c	2008-08-07 15:32:29.000000000 -0600
+++ linux-2.6.27-mdb/kernel/panic.c	2008-08-07 15:29:09.000000000 -0600
@@ -82,6 +82,12 @@
 	printk(KERN_EMERG "Kernel panic - not syncing: %s\n",buf);
 	bust_spinlocks(0);

+        // call the notify_die handler for any resident debuggers which
+        // may be active and pass the message string.   On a software
+        // fault return at least some sort of regs  for a remote debugger
+        // to look at.
+	notify_die(DIE_PANIC, buf, get_irq_regs(), 0, 0, 0);
+
 	/*
 	 * If we have crashed and we have a crash kernel loaded let it handle
 	 * everything else.


Jeff


--

From: Cliff Wickman
Date: Friday, August 8, 2008 - 8:06 am

For shutdown, yes.  But on a panic crash_kexec() gets called

That is Eric Biederman's concern as well.  But it seems we should
have a way for a user/customer to customize those events and their order,
as I noted in a previous post.

-- 
Cliff Wickman
Silicon Graphics, Inc.
cpw@sgi.com
(651) 683-3824
--

From: Vivek Goyal
Date: Friday, August 8, 2008 - 6:29 am

Hi Andi,

IIUC, there are two lists for exception and panic notifications. All the
exception and NMI related notifications go through "die_chain" and
all the panic notifications are done through "panic_notifier_list".

Are you suggesting that kdump should be put onto panic_notifier_list, in
such a way so that it runs last?

Just few points to ponder.

- panic_notifier_list is exported and any module can register and make use
  of it. As you mentioned in your other mail, there are a lot of drivers out
  there with crappy code and if we do it, all the drivers get a chance
  to do stuff after panic() and there is no guarantee that kdump code will
  ever get a chance to run.

- Kdump is built on the philosophy that after a panic(), one should do
  as little as possible in the kernel and all the actions should be
  deferred to the new kernel. That's why we recommend that all the panic
  notifier actions (except debugger), should be done in second kernel. It
  does introduce a little delay in notification but it also makes it more
  reliable.

- Neil Horman has already provided infrastructure so that one can put
  user space code in the second kernel's initrd and it will be executed. 
  This can be easily done for modules also. 

But somehow nobody seems to be interested in doing things in second kernel
and everybody wants to run its post panic code in the first kernel. So
far, except debugger, we have not run into any strong case which needs to
run post panic code in first kernel and things will not work out if post
panic actions are taken in second kernel.

That's why there is always resistance from our side to move kdump to panic
notifier list so that we can make modules do the right thing and that
is, run in second kernel. The moment kdump is put onto panic_notifier_list,
nobody will think of doing anything in second kernel (because it takes extra
effort). Everybody will register a panic notifier handler in first kernel
and be happy..

If everybody thinks that they can do ...
From: Cliff Wickman
Date: Friday, August 8, 2008 - 7:50 am

In the case of the cross-partition driver, running panic notification in the
second kernel is an interesting idea.

I discussed it with Robin Holt, who is more knowledgeable than I on the
details of that driver, and he told me that there is a great deal of
state information needed for the notification.  It's easy to do in the
first kernel, but extremely difficult in a second kernel.

Couldn't we have some tunable flexibility in that area, to determine what
should run on a panic, and in what order?

 
--

From: Jay Lan
Date: Friday, August 8, 2008 - 9:57 am

KDB registers to the panic_notifier_list, but since crash_kexec()
takes control early in panic(), the panic_notifier_list is essentially
dead if kdump is chkconfig'ed on.

I think a kernel debugger is not complete if it does not have an option
to create a kernel dump. Unfortunately we have to tell KDB users to
not chkconfig on kdump.

I am working on KDB to allow KDB to co-exist with kdump. But it is
done through a hack to place KDB ahead of crash-kexec(). It would be
preferred to have a formal notifier_list.

Regards,

--

From: Vivek Goyal
Date: Monday, August 11, 2008 - 5:56 am

Maybe that's the way forward. Export the list of registered handlers on
panic_notifier_list through sysfs or debugfs and also provide flexibility
so that the user can change the priorities from userspace. That should work
for all. 

Thanks
Vivek
--

From: Andi Kleen
Date: Friday, August 8, 2008 - 11:03 am

The point was that kernel debuggers have an at least as legitimate
need as kdump to run early on panic. In particular they
should run before kdump because kdump can be triggered from
the debugger.

But for modular kernel debuggers the hook would need to be exported,
so in theory everyone could use it. In theory code review should
catch that. Another alternative would be to re-add the old namespaces
patches I posted some time ago, which allowed exporting symbols only
to specific modules (but that would also be unfortunate for out of tree
debuggers) 

Since we have nearly all other needed hooks for kernel debuggers
anyways it doesn't really make sense to stop at panic. So this
earlier requirement should be relaxed.

Perhaps code review can solve the problem?

-Andi
--

From: Vivek Goyal
Date: Monday, August 11, 2008 - 6:02 am

I think given that so many people want kdump on panic_notifier_list,
it would be worthwhile to experiment with the different approach.

- Move kdump to panic_notifier_list.
- Export panic_notifier_list to user space and provide flexibility
  so that a user can change the priorities of registered handlers
  dynamically.

This will allow an admin to explicitly see who all are going to run
in what order in case of panic and also give him capability that he
can choose to change the order.

This kind of list should keep all kinds of users happy. Those who
want to run all the other modules before kdump, they will be able to
do so and those who don't want, they can boost the priority of kdump
to put it ahead in the list.
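
The reordering part of this proposal could look roughly like the following userspace sketch. It assumes a flat table of named handlers with integer priorities (higher runs earlier) and leaves out the sysfs/debugfs plumbing entirely; the names are hypothetical and this is not Takenori's patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_handler {
	const char *name;
	int priority;		/* higher runs earlier */
};

/* Stable insertion sort, descending by priority. */
static void model_sort(struct model_handler *h, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		struct model_handler key = h[i];
		size_t j = i;

		while (j > 0 && h[j - 1].priority < key.priority) {
			h[j] = h[j - 1];
			j--;
		}
		h[j] = key;
	}
}

/* Change one handler's priority and re-sort the table.
 * Returns 0 on success, -1 if no handler has that name. */
static int model_set_priority(struct model_handler *h, size_t n,
			      const char *name, int priority)
{
	assert(h && name);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(h[i].name, name) == 0) {
			h[i].priority = priority;
			model_sort(h, n);
			return 0;
		}
	}
	return -1;
}
```

An admin who wants kdump to run before everything else would raise its priority above the other entries; one who wants the debugger and other modules to run first would leave kdump at the bottom.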

I think Takenori had some working patches in the past for this. Probably
time to revisit the patches. (Somebody willing to look into it?).

Thanks
Vivek
--

From: jmerkey
Date: Monday, August 11, 2008 - 6:11 am

I found a problem with APIC NMI support which seems to affect all the
debuggers, but appears machine specific -- at least I can reproduce it
with all of the MDB, KDB, and KGDB modules on my ACER 2410 dual
core laptop.  It explains the mysterious hangs I would see in KDB all the
time on SMP systems.

The call:

send_IPI_allbutself(vector)

will hard hang on an ACER laptop with dual core processors if issued while
any one of the processors is actively inside an INT 1 handler, then takes
a SECOND NMI inside of this path, and nests.   It hangs the requesting
(focus) processor during nested interrupts if a target processor is A)
inside an INT 1 exception B) takes an NMI interrupt C) returns from the
NMI back into the INT1 D) receives a second NMI.

I am aware that a second NMI will not propagate to a processor currently
servicing an NMI until the processor sees an IRET instruction (at least
this is how intel worked years back).

I have not been able to reproduce it on the Xeon based motherboards.  I
have seen the APIC bus hang this way on my other OS project -- when the
APIC was programmed incorrectly, and assume it must be a bug in the APIC,
how the APIC is programmed by Linux, etc.

I am coding around the problem to prevent such convoluted nesting levels
in MDB (this was from testing) but this was the final test for enabling
SSB and all the fixes before I post an rc3 patch series which really
cleans up the code, and there's a mystery with send_IPI_allbutself().

Jeff

--

From: Andi Kleen
Date: Monday, August 11, 2008 - 6:50 am

A couple of laptop BIOSes (e.g. some thinkpads) are unfortunately
not NMI safe. There is no known workaround other than not using NMIs
on these systems.

There's unfortunately no global blacklist for these systems, although
having one would be useful for a couple of subsystems.

-Andi

--

From: jmerkey
Date: Monday, August 11, 2008 - 9:16 am

I seem to have nailed down the "voodoo" sequence for reproducing it and
the sequence of failure on the Acer 9410.

Processors 0,1

first set a global breakpoint (schedule) and load registers DR6/DR7

0 -> trigger int1 breakpoint
1 -> trigger int1 breakpoint
0 -> get debugger lock
1 -> spin at debugger lock
0 -> NMI all processors but self
1 -> gets NMI while spinning at debugger lock
1 -> enters NMI code loop and spins
0 -> enter debugger console
0 -> leave debugger console
0 -> release spinning processors
1 -> leave NMI code issues IRETD (returns to debugger spinlock and spins)
0 -> release debugger lock
1 -> get debugger lock
1 -> NMI all processors but self
...hard hang in send_IPI_allbutself(APIC_DM_NMI)....

If a delay is placed in the code that calls send_IPI_allbutself() that
waits until processor 0 has left the int1 exception handler and issued an
IRETD, then the hang does not occur.  Seems to be the workaround for this
problem.
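
The delay described above amounts to waiting for the other processor to leave its int1 handler before sending the NMI. A userspace model of that wait, using C11 atomics, might look like this. The per-CPU flag and the function name are hypothetical; in-kernel code would use its own primitives (and something like cpu_relax() in the loop), and this is not the code in MDB.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU state: in_int1 is nonzero while that CPU is
 * still inside the int1 handler, i.e. before it has issued IRETD. */
struct model_cpu {
	atomic_int in_int1;
};

/* Spin until the other CPU has left the int1 handler, giving up after
 * max_spins extra polls. Returns true when it is considered safe to
 * send the NMI IPI, false on timeout. */
static bool model_wait_for_iret(struct model_cpu *other,
				unsigned long max_spins)
{
	assert(other);
	for (unsigned long i = 0; i <= max_spins; i++) {
		if (atomic_load_explicit(&other->in_int1,
					 memory_order_acquire) == 0)
			return true;
	}
	return false;
}
```

In this model the processor that has just taken the debugger lock would call model_wait_for_iret() on the previous owner and only issue the equivalent of send_IPI_allbutself() once it returns true.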

This problem seems specific to my Acer 9410 laptop, and as you described
seems hardware related, though I am going to attempt to instrument a
workaround for it anyway.

Jeff




--

From: jmerkey
Date: Thursday, August 7, 2008 - 10:53 am

I don't consider them competing, just different tools for people from
different development backgrounds.  GNU and DOS/Windows.

Jeff

--

From: Nick Piggin
Date: Friday, August 8, 2008 - 1:40 am

Yes, so Andi said a couple of days ago ;)
--

From: Bill Davidsen
Date: Wednesday, August 6, 2008 - 6:11 am

That idea sounds familiar, the "suspend2" response, when something new 
and significantly different is offered, instead of putting it in and 
letting people choose in configuration, take the position that what is 
there is good enough, and if the author of the new solution will just 
drop all their ideas and slap some band-aids on the existing code it 
will be "gooder enough" without actually offering people a choice of 
something different.

I totally agree with this, the whole idea of a remote machine implies 

In addition to "Bravo!" I will add that tools which work somewhat 
differently will increase the chances of having a tool which will work 

I would suggest that if it meets coding standards and doesn't break 
anything else it could be included in -mm (assume there's no objection 
there) and let people beat on it there, with the assumption that unless 
problems are found it will be promoted.

The need for a special setup makes spur-of-the-moment investigation of 
unusual behavior difficult for anyone but a hard-core developer who does 
daily work on a setup with the remote machine available at hand. I think 
this new approach would encourage people to do quick checks when the 
behavior is observed.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

--

From: Stefan Richter
Date: Wednesday, August 6, 2008 - 6:37 am

To be fair, choice in "leaf" features like a debugger is not entirely 
comparable to choice in central features.  If the infrastructure does 
not support all use cases reasonably well, it's better to fix the 
infrastructure or replace it by a working one, rather than adding a 
second infrastructure which is also not general enough.

In this case:  Make a side-by-side comparison of features and 
shortcomings of the available debuggers (as in Andi's response), then 
decide how the best of both worlds can be achieved + used + maintained 
most easily --- by having both side-by-side, or by taking over some or 
all of one's features into the other.  Either way requires contributors 
to be interested.
-- 
Stefan Richter
-=====-==--- =--- --==-
http://arcgraph.de/sr/
--

From: Olivier Galibert
Date: Wednesday, August 6, 2008 - 6:54 am

It's a little too early for that.  Right now it's at the phase "how to
make it better integrate with the kernel", with the use of existing
hooks, adding the needed hooks to be more complete, working as a
module, etc.  When that is done then the philosophical aspects can
come into play, but it's not there yet.

  OG.
--

From: jmerkey
Date: Wednesday, August 6, 2008 - 6:45 am

I have removed the hooks into the /arch/x86 sections and converted the
debugger to use kprobes and notify_die as Andi suggested.  It also builds
and loads as a module.

One serious point has to do with NMI handling on SMP since the notify_die
handlers use this priority calling mechanism.  I am still testing on SMP
but it seems to work -- I just am a little uncomfortable with trusting an
interface (notify_die) that can let someone come in and hook the NMI
handlers when I MUST BE ABLE TO NMI AND HALT non-focus processors first.

I am adding a special NMI state to the chain notifier to handle this case
where IT MUST BE CALLED FIRST and IT MUST BE THE ONLY EVENT CALLED.  I
used the DIE_KERNELDEBUG to hook the keyboard handler in
drivers/char/keyboard.c so we have the general hook into kprobes to handle
enter debugger events.
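
The concern about ordering can be illustrated with a small user-space
model of a priority-ordered notifier chain. It mimics how the kernel's
notifier chains run higher-priority callbacks first and stop walking
the chain once a callback asks for that, but it is not kernel code. The
names model_notifier, model_register(), model_call_chain(), MODEL_DONE,
MODEL_STOP and the two handlers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MODEL_DONE = 0, MODEL_STOP = 1 };  /* hypothetical return codes */

struct model_notifier {
    int (*call)(struct model_notifier *self, int event);
    struct model_notifier *next;
    int priority;   /* higher runs earlier */
    int hits;       /* how many times this handler ran */
};

/* Insert `n` so the list stays sorted by descending priority. */
static void model_register(struct model_notifier **head,
                           struct model_notifier *n)
{
    assert(n != NULL && n->call != NULL);
    while (*head != NULL && (*head)->priority >= n->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Run handlers in order; stop as soon as one returns MODEL_STOP. */
static int model_call_chain(struct model_notifier *head, int event)
{
    int ret = MODEL_DONE;
    for (struct model_notifier *n = head; n != NULL; n = n->next) {
        ret = n->call(n, event);
        if (ret == MODEL_STOP)
            break;
    }
    return ret;
}

/* Debugger-style handler: claims the event so nothing after it runs. */
static int debugger_nmi_handler(struct model_notifier *self, int event)
{
    (void)event;
    self->hits++;
    return MODEL_STOP;
}

/* Some other subscriber that merely observes the event. */
static int other_handler(struct model_notifier *self, int event)
{
    (void)event;
    self->hits++;
    return MODEL_DONE;
}
```

In this model the debugger handler is "first and only" solely because
it registered with the highest priority and returns MODEL_STOP. Any
later registrant with a higher priority would run ahead of it, which is
the weakness the message above worries about.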

Jeff

--

From: Nick Piggin
Date: Wednesday, August 6, 2008 - 7:16 am

No. First try to integrate them together so you have the best of both
from one code base is what I was saying. I specifically said if they
are significantly different and can't be reconciled then it could be
merged.
--

From: Jason Wessel
Date: Wednesday, August 6, 2008 - 10:21 am

It depends how you look at the problem.  I would agree that the use of
gdb + kgdb vs an assembly debugger are completely different cases.
The kgdb core in the mainline kernel can actually allow one to write
such a front end, however.  The kgdb core has an API for I/O and it is
possible to write an I/O module that implements an in kernel assembly
debugger.  The kgdb test suite is not a great example, but it is a
complete example of using the kgdb core directly without a second
machine.

If there is truly missing functionality from kgdb in terms of the way
the kgdb core is used vs mdb, it would be good to at least consider
what is missing.  It is entirely possible to add functionality such
that mdb could be implemented as a kgdb I/O module.  In this case you
would be able to make use of zero runtime impact when a kgdb I/O
module is not configured or make use of it as an early/late/ondemand

I would agree that the possibility exists to use the hooks directly,
and clearly the mdb code base as it stands in this patch set does not
accomplish this.

If one were to consider integrating mdb as a kgdb I/O module, it would
have a greater degree of platform independence.  The primary arch
dependencies should be narrowed down to the back tracing / disassembly
interface. The SMP / threading / breakpoints / exception handling,
would all be shared between the debugger front ends that way.  The mdb
code base currently relies on re-implementing HW/SW breakpoints for
each architecture you desire to support.

Unifying some of the debugging technology is a noble goal where it
makes sense to do so.  Using some of the existing kernel hook points
is a first pass requirement before a merge of mdb could be considered
for the mainline kernel.

Jason.




--

From: Andi Kleen
Date: Wednesday, August 6, 2008 - 11:57 am

> It depends how you look at the problem.  I would agree that the use of

Yes I left the possibility of "someone writing an in-kernel kgdb UI" 
out. Indeed that would be a possibility. 

On the other hand I'm not sure it would save all that much code 
versus just directly working on top of die notifiers.


It's not just a possibility, they are already used by multiple
debugger-like subsystems, e.g. kprobes is certainly a kind of debugger.

-Andi
--

From: jmerkey
Date: Thursday, August 7, 2008 - 5:45 am

UPDATE:

As per everyone's recommendations, the debugger has been fully
module-ized, and I have run checkpatch.pl and am cleaning up the slew of
messages checkpatch spits out of its tailpipe.

It would be nice if checkpatch also could FIX those areas it complains about.

I tested kprobes with NMI cross processor calls on SMP and I am unable
to break it, and the module loads and unloads very well.  There is a need
for early initialization of the debugger if someone wants to debug kernel
startup and I am including support for this with another .config option,
but I am concerned about the reliance of kprobes on rcu and whether this
will break early init of the debugger.  The code looks ok, but another set
of eyes would be helpful when I post the next patch series.

I will generate another patch series after I finish cleaning up the
checkpatch.pl report.  I am still going through it.

Also, whoever wrote "/Documentation/volatiles_are_evil" must not have
worked with the busted-ass GNU compiler that optimizes away global
variables and busts SMP dependent code.  I am not going to remove the
volatile declarations needed for SMP coordination in the debugger since
the code breaks when removed.  GCC will cause massive breakage of SMP code
if you do not declare certain variables as volatile.

Whoever wrote that section doesn't understand low level SMP coding for
operating systems design and apparently has not spent over a week running
down an SMP bug only to discover it was caused by the busted-ass GCC
compiler arbitrarily deciding to optimize away a low level flag used to
signal between processors -- I have spent the time running down Stallman's
bugs.

That text should be removed from the kernel or qualified that it's
advertising for GCC's malfunctioning optimization code.

Jeff



--

From: Peter Zijlstra
Date: Thursday, August 7, 2008 - 8:17 am

Even with proper barrier() usage?


--

From: Andi Kleen
Date: Thursday, August 7, 2008 - 9:07 am

The Linux way to handle this is to use gcc memory barriers.
mb()/barrier()/wmb()/rmb()/smp_rmb()/smp_wmb() etc.
Normally everything that volatile can do can be expressed by them.

On x86 such a memory barrier tells gcc that memory might
have been clobbered and needs to be flushed and also prevents the compiler 
from reordering memory accesses. On other architectures it also forces ordering
on the CPU level, although that's not needed on x86 (except
in some special situations like using write-combining)

See Documentation/memory-barriers.txt
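
As a rough illustration of the point above, here is a user-space sketch
of a flag polled without volatile, relying on a compiler barrier
instead. The barrier() macro uses the same empty asm with a "memory"
clobber that the kernel's barrier() expands to with gcc. Everything
else (release_flag, spin_until_released(), release_waiters()) is a
hypothetical name made up for this example. It covers compiler
reordering only. CPU-level ordering on weakly ordered architectures
would need the smp_rmb()/smp_wmb() family inside the kernel, and
portable user-space code would normally use C11 atomics.

```c
#include <assert.h>

/* Compiler-only barrier: empty asm that tells gcc memory may have changed. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical flag written by another CPU/thread; deliberately not volatile. */
static int release_flag;

/*
 * Poll release_flag up to max_spins times.  The barrier() forces gcc to
 * reload release_flag from memory on every iteration instead of hoisting
 * the load out of the loop.  Returns 1 if the flag was seen set, else 0.
 */
static int spin_until_released(unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (release_flag)
            return 1;
        barrier();
    }
    return release_flag != 0;
}

/* Publish the flag; the barrier keeps gcc from moving earlier stores after it. */
static void release_waiters(void)
{
    barrier();
    release_flag = 1;
}
```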

-Andi

--

From: jmerkey
Date: Thursday, August 7, 2008 - 8:52 am

Andi,

I'll instrument this as described in the documentation you referenced and
remove the volatile declarations.  If this passes testing, I will repost
with these corrections.

Jeff




--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 10:04 am

Take care though that neither memory barriers nor volatile are what you 
want if accesses need to be atomic on whatever given data structure. 
(E.g. bitfield manipulations, counter increments, accesses to virtually 
anything that is bigger than an integer or a pointer...)
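
A small user-space sketch of that distinction, using C11 atomics as a
stand-in: a counter increment and a bit set are read-modify-write
operations, so they need an atomic primitive, and neither volatile nor
a barrier makes them safe. The names event_count, cpu_mask,
count_event() and mark_cpu_stopped() are hypothetical. Inside the
kernel the rough counterparts would be atomic_inc() and
test_and_set_bit().

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_uint event_count;   /* hypothetical shared counter */
static atomic_uint cpu_mask;      /* hypothetical bitmask, one bit per CPU */

/* Atomically add one; returns the value before the increment. */
static unsigned int count_event(void)
{
    return atomic_fetch_add(&event_count, 1u);
}

/* Atomically set bit `cpu`; returns nonzero if it was already set. */
static int mark_cpu_stopped(unsigned int cpu)
{
    assert(cpu < 32);
    unsigned int bit = 1u << cpu;
    return (atomic_fetch_or(&cpu_mask, bit) & bit) != 0;
}
```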
-- 
Stefan Richter
-=====-==--- =--- --===
http://arcgraph.de/sr/
--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 5:28 pm

scripts/Lindent can at least help with some of the whitespace changes. 
It's long ago though that I used it myself, so I have no idea how well 
that works.
-- 
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/
--

From: jidong xiao
Date: Monday, August 11, 2008 - 3:36 am

Well I think given the fact that kdb is not accepted by Linus, there
is little possibility that mdb will be included in mainline kernel.
Though I don't know why kgdb is acceptable.

Regards
Jason Xiao
--
