Re: [git pull] x86 arch updates for v2.6.25

To: H. Peter Anvin <hpa@...>
Cc: John Stoffel <john@...>, Linus Torvalds <torvalds@...>, Maxim Levitsky <maximlevitsky@...>, Ingo Molnar <mingo@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>
Date: Friday, February 8, 2008 - 2:24 pm

On Tue, 5 Feb 2008, H. Peter Anvin wrote:

If the machine can be hooked up over FireWire to another Linux box, you can
use the OHCI1394 controller's "physical DMA" feature to get lots of info:

What you need:

- An OHCI1394-compliant (nearly all are) FireWire port in the oopsing system
- A second Linux system (or, with other tools, another OS) with any FireWire port
- A FireWire cable which connects the two ports (there are two kinds of plugs)
- ftp://ftp.suse.de/private/bk/firewire/tools/firescope-0.2.2.tar.bz2

What you do:

1 - You boot both systems and connect them with the FireWire cable
2 - You initialize FireWire on both systems and ensure that 'physical DMA'
    is enabled in the OHCI1394-compliant FireWire controller of the oopsing
    machine.
3 - You transfer the System.map of the debugged kernel to the second system.

    - Everything after this point does not need the cooperation of the CPU
      of the debugged system, the PCI bus needs to be working and unlocked,
      but the CPU(s) of the debugged system can otherwise do anything or
      nothing at all anymore.

4 - You trigger the oops/crash/hang, but leave the debugged machine turned on.
5 - You install firescope (URL in the list above) on the second machine and
    run it for example with:

       firescope -A System.map-of-debugged-kernel

6 - You press Ctrl-D (this is just one way to do it)

What you get are the contents of the in-memory printk buffer, i.e. the
messages which the kernel logged (such as oopses), read directly from the
other machine's RAM over remote DMA (which is called "physical DMA" in
FireWire language).

Of course you can put firescope into "auto update mode" and just watch or
log the printk buffer as it gets messages on the remote system - in real time.
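
As a rough illustration of why the System.map is needed on the second
machine: a reader has to know where the printk buffer lives before it can
fetch it over physical DMA. The sketch below only shows the symbol lookup.
It is not taken from firescope; the symbol name __log_buf, the sample
addresses and the helper names are assumptions for illustration, and the
virtual-to-physical conversion assumes a 32-bit x86 kernel with the default
PAGE_OFFSET of 0xc0000000.

```python
PAGE_OFFSET = 0xC0000000  # assumption: default 3G/1G split on 32-bit x86


def find_symbol(system_map_text, name):
    """Return the address of `name` from System.map text, or None."""
    for line in system_map_text.splitlines():
        parts = line.split()
        # Each System.map line is: <hex address> <type letter> <symbol name>
        if len(parts) == 3 and parts[2] == name:
            return int(parts[0], 16)
    return None


def lowmem_virt_to_phys(virt):
    """Convert a lowmem kernel virtual address to a physical address."""
    return virt - PAGE_OFFSET


# Made-up sample data, not from a real kernel build.
sample = """\
c0100000 T _text
c03a1b40 b __log_buf
c03c1b40 D log_buf_len
"""

virt = find_symbol(sample, "__log_buf")
print(hex(virt), hex(lowmem_virt_to_phys(virt)))
```

The physical address is what matters here, since the remote side addresses
RAM directly and never goes through the page tables of the debugged kernel.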

--------------------------------------------------------------------------

There is new code in mainline now to debug early boot issues:

If the oops/hang/crash happens early in boot, before the normal ohci1394 kernel
driver can initialize the OHCI1394-compatible controllers on the debugged
system, such as in the ACPI initialisation or so, you can employ my early
initialisation patch for OHCI1394 controllers to get remote access early:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=f212ec...

It has been committed into mainline (for 2.6.25) with the x86 merge on
January 30 and it contains more documentation on debugging over FireWire.

To reassure everybody: The code is not compiled by default, and even if
compiled, it runs __only__ when a specific boot parameter is given on boot.

However, the initialization of the ohci1394 driver allows physical DMA to
all FireWire nodes unless it is loaded with phys_dma=0. See the PS of
this mail for more regarding security.
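
To verify on a running system whether the early code was requested at all,
one can look at the kernel command line. The sketch below parses a command
line string for a given parameter. The parameter name ohci1394_dma=early is
not stated in this mail; it is assumed here from the documentation that
accompanies the commit and should be checked against your kernel tree. The
helper name has_boot_param is made up for this example.

```python
def has_boot_param(cmdline, key, value=None):
    """Return True if `key` (optionally with `value`) is on the command line."""
    for token in cmdline.split():
        k, _sep, v = token.partition("=")
        if k == key and (value is None or v == value):
            return True
    return False


# Made-up sample; on a live system the string could be read from /proc/cmdline.
sample_cmdline = "root=/dev/sda1 ro ohci1394_dma=early console=tty0"

print(has_boot_param(sample_cmdline, "ohci1394_dma", "early"))
```

Without the parameter the function returns False, which matches the
behaviour described above: the early code stays inactive.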

You can also use the new code for debugging suspend/resume from disk/ram:

To debug suspend:

1 - enable the early firewire patch as documented in the patch
2 - have the second linux system set up and connected on boot already
3 - ensure that no OHCI1394 driver is ever loaded, as suspending as
    well as unloading the driver would disable the OHCI1394 controller.
4 - suspend the machine to get the oops/hang/crash.

To debug resume from disk:

Nothing to add to the above. The new code gets called when the boot kernel
initializes the machine hardware on resume, so you can get access.

To debug resume from ram:

Disable the __init{,data} flags in the patch and insert a call to

	init_ohci1394_dma_on_all_controllers()

early on resume, after the registers of PCI cards are accessible.

Testing and submitting that would be the next thing on my todo list for
debugging with firewire, but if someone tries it, I'd be happy to hear how
it went (or to look at the resulting patch).

My plan would be to make that a config option which can be enabled when
needed to debug resume from ram with a new flag to the boot option which
the new init_ohci1394 commit added.

The documentation in the commit also includes a link to firedump, a tool
which can pull the contents of all PCI-accessible memory below the 4GB
limit over FireWire to the second machine at high speed. But it does not
parse the e820 map yet, so it crashes the remote system hard if it tries
to read from a memory hole.
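
To illustrate what parsing the e820 map would involve: the idea is to read
only ranges which the BIOS reports as usable RAM, clipped at 4GB, and to
skip everything else. This is a minimal sketch, not firedump code. It
assumes the map is available as "BIOS-e820:" lines from the boot log of the
debugged kernel; the sample map and the function name are made up.

```python
import re

# Matches boot log lines like:
#  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
E820_LINE = re.compile(
    r"BIOS-e820:\s*([0-9a-fA-F]+)\s*-\s*([0-9a-fA-F]+)\s*\((.+?)\)"
)
LIMIT_4GB = 1 << 32


def usable_ranges_below_4gb(boot_log_text):
    """Return (start, end) pairs of usable RAM, with end clipped to 4GB."""
    ranges = []
    for match in E820_LINE.finditer(boot_log_text):
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        kind = match.group(3)
        # Skip reserved/ACPI ranges and anything starting at or above 4GB.
        if kind != "usable" or start >= LIMIT_4GB:
            continue
        ranges.append((start, min(end, LIMIT_4GB)))
    return ranges


# Made-up sample map for illustration only.
sample = """\
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007fff0000 (usable)
 BIOS-e820: 0000000100000000 - 0000000180000000 (usable)
"""

for start, end in usable_ranges_below_4gb(sample):
    print(f"{start:#012x} - {end:#012x}")
```

A dump tool restricted to these ranges would never touch the holes between
them, which is where the hard crashes described above come from.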

Best Regards,
Bernhard Kaindl

PS:

Using gdb (and even kgdb) over firewire is possible:

For very advanced users, the documentation in the commit contains a link
to a gdb stub for firewire (currently named fireproxy) which allows gdb to
access all data which can be referenced from symbols found by gdb using
a vmlinux file which is compiled with debug information.

With it, you can get symbolic kernel stack backtraces of the remote system
by using the gdb macros which were written for kdump.

The latest version of the gdb proxy (fireproxy-0.34) can communicate
with kgdb over a generic memory-communication module (kgdbom). This
implementation is just a proof-of-concept which can only exchange a very
small number of messages over physical DMA memory back and forth, which
means that it is not yet usable for real debugging work.

Regarding security:

On the software side: The new fw-ohci driver seems to allow physical DMA only
to devices which pretend to be FireWire disks (it is the specified way to
transfer the data) unless a disk on a remote FireWire bus is being
plugged-in. That makes it a bit harder to use physical DMA, but not
impossible. Loading no FireWire driver is also possible, but you'd have
to check that your BIOS does not enable FireWire when your machine reboots
(for boot from FireWire).

With FireWire enabled by the BIOS, a hacker could (in theory) load his
own operating system into the system RAM and boot it by changing the
code which the boot loader executes while it is running, to trick the
CPU into jumping directly into the bootstrapping code of the loaded OS.

Loading the ohci1394 driver with phys_dma=0 would disable physical DMA.
It could mean that some FireWire drivers like sbp2 might not work, so
if the BIOS does not enable FireWire on boot, you should be safe from
the FireWire side with that.

To be really safe from this kind of system tapping, you have to prevent
physical access to your FireWire ports (and your mainboard, to exclude
debug cards): Filling hot glue into the FireWire ports makes it harder
to use them and a padlock on the system housing gives some resistance
to physical system intrusions, but there is always a big enough bolt
cutter to open it. How much you need depends on the threat you have.