Re: perf hw in kexeced kernel broken in tip

From: Yinghai Lu
Date: Wednesday, December 1, 2010 - 1:00 am

First kernel:
[    1.139418] calling  init_hw_perf_events+0x0/0xb77 @ 1
[    1.159111] Performance Events: PEBS fmt1+, Nehalem events, Intel PMU driver.
[    1.159567] ... version:                3
[    1.179121] ... bit width:              48
[    1.179353] ... generic registers:      4
[    1.179593] ... value mask:             0000ffffffffffff
[    1.199211] ... max period:             000000007fffffff
[    1.199554] ... fixed-purpose events:   3
[    1.219108] ... event mask:             000000070000000f
[    1.219454] initcall init_hw_perf_events+0x0/0xb77 returned 0 after 11719 usecs

.....
[   20.220997] checking TSC synchronization [CPU#0 -> CPU#11]: passed.
[   20.260818] NMI watchdog enabled, takes one hw-pmu counter.

Kexeced kernel:
[    1.169470] calling  init_hw_perf_events+0x0/0xb77 @ 1
[    1.189265] Performance Events: PEBS fmt1+, Nehalem events, Broken PMU hardware detected, software events only.
...
[   21.010407] NMI watchdog failed to create perf event on cpu14: fffffffffffffffe

caused by:

commit 33c6d6a7ad0ffab9b1b15f8e4107a2af072a05a0
Author: Don Zickus <dzickus@redhat.com>
Date:   Mon Nov 22 16:55:23 2010 -0500

    x86, perf, nmi: Disable perf if counters are not accessible
   
    In a kvm virt guests, the perf counters are not emulated.  Instead they
    return zero on a rdmsrl. The perf nmi handler uses the fact that crossing
    a zero means the counter overflowed (for those counters that do not have
    specific interrupt bits). Therefore on kvm guests, perf will swallow all
    NMIs thinking the counters overflowed.
   
    This causes problems for subsystems like kgdb which needs NMIs to do its
    magic. This problem was discovered by running kgdb tests.
   
    The solution is to write garbage into a perf counter during the
    initialization and hopefully reading back the same number.  On kvm
    guests, the value will be read back as zero and we disable perf as
    a result.
   
    Reported-by: Jason Wessel ...
From: Peter Zijlstra
Date: Wednesday, December 1, 2010 - 4:27 am

*sigh*, and people ask me why kexec/kdump are such bad ideas..

apparently kexec doesn't properly shut down the first kernel and leaves
a counter running, then when we write and read the counter value they
don't match because it's still running and voila, crap happens.

I've CC'ed the kexec people, maybe they have a clue as to how to sort this.


--

From: Vivek Goyal
Date: Wednesday, December 1, 2010 - 9:06 am

So we can shutdown counters while first kernel is going down. Is there a
simple function already which I can call?

Thanks
Vivek
--

From: Peter Zijlstra
Date: Wednesday, December 1, 2010 - 9:11 am

Dunno, the cpu hotplug stuff should suffice I think, but then I don't
think you actually unplug the boot cpu.

What does kexec normally do to ensure hardware is left in a sane state?
--

From: Vivek Goyal
Date: Wednesday, December 1, 2010 - 9:23 am

Typically it calls device_shutdown() and sysdev_shutdown() from
kernel_restart_prepare() to shut down the devices.

It also calls machine_shutdown() which, depending on the architecture, can take
care of various things like stopping other cpus, shutting down the LAPIC,
disabling the IOAPIC, disabling the hpet, shutting down the IOMMU etc.
(native_machine_shutdown()).

Thanks
Vivek
--

From: Peter Zijlstra
Date: Wednesday, December 1, 2010 - 12:38 pm

So basically there's no sane generic reset callout?
--

From: Vivek Goyal
Date: Wednesday, December 1, 2010 - 12:46 pm

I think ->shutdown() calls are sane generic callouts, aren't they?

There seem to be a few exceptions for LAPIC, IOMMU and HPET and I am not
sure why they are not covered by shutdown calls. CCing Eric, he might
have more insight into it.

Thanks
Vivek
--

From: Peter Zijlstra
Date: Wednesday, December 1, 2010 - 12:49 pm

->shutdown looks like it's about to reset/halt the hardware, no point in
slowing down the regular shutdown/reboot path for something like this,

That's all arch specific, but even there I don't think the reset code
should live outside of kexec.
--

From: Vivek Goyal
Date: Wednesday, December 1, 2010 - 12:58 pm

I think we already call ->shutdown() in regular reboot path.

kernel_restart()
  kernel_restart_prepare()
    device_shutdown();
    sysdev_shutdown();

So it should not make lot of difference if perf subsystem/counters are

I would not know the history but I have heard stories that if you don't
shutdown the hardware over restart, BIOS might not be expecting it and
might get trumped.

Thanks
Vivek
--

From: Peter Zijlstra
Date: Wednesday, December 1, 2010 - 1:07 pm

Oh, but I'm not a device or sysdev thing, I'll never get something like

Never yet had a problem with that.
--

From: Eric W. Biederman
Date: Wednesday, December 1, 2010 - 2:48 pm

There is also the reboot notifier, if the NMI needs to be controlled

I haven't personally but I have certainly heard stories and seen
debugging sessions where some devices work or don't depending on the
order of running linux and windows on a machine, with soft reboots in
between.

Eric

--

From: Don Zickus
Date: Wednesday, December 1, 2010 - 10:23 pm

I tried reboot notifiers with the nmi_watchdog and achieved some success
(on a Westmere box, a P4 still failed).  Kdump is still screwed, but maybe
we don't care for now.

Here is the quick and dirty patch I used.

Cheers,
Don


diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 792a4ed..3455cf9 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -23,6 +23,7 @@
 #include <linux/notifier.h>
 #include <linux/module.h>
 #include <linux/sysctl.h>
+#include <linux/reboot.h>
 
 #include <asm/irq_regs.h>
 #include <linux/perf_event.h>
@@ -550,6 +551,18 @@ static struct notifier_block __cpuinitdata cpu_nfb = {
 	.notifier_call = cpu_callback
 };
 
+static int __cpuinit
+reboot_callback(struct notifier_block *nfb, unsigned long action, void *unused)
+{
+	watchdog_disable_all_cpus();
+
+	return notifier_from_errno(0);
+}
+
+static struct notifier_block __cpuinitdata reboot_nfb = {
+	.notifier_call = reboot_callback
+};
+
 void __init lockup_detector_init(void)
 {
 	void *cpu = (void *)(long)smp_processor_id();
@@ -563,6 +576,7 @@ void __init lockup_detector_init(void)
 
 	cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
 	register_cpu_notifier(&cpu_nfb);
+	register_reboot_notifier(&reboot_nfb);
 
 	return;
 }
--

From: Peter Zijlstra
Date: Thursday, December 2, 2010 - 12:34 am

We'd really want a perf_event.c callback there to do as the hot-unplug
code does and detach all running counters from the cpu.


--

From: Don Zickus
Date: Thursday, December 2, 2010 - 9:15 am

Ok, I moved the reboot notifier stuff from kernel/watchdog.c to
kernel/perf_event.c.  Things still worked fine from a kexec perspective.

Vivek suggested to me this morning that I should just blatantly disable the
perf counter during init when running my test.  Looking through the code I
don't think I can do this using disable_all because some routines look for
the active bit to be set and some arches have different disable registers
than others.  Thoughts?

Cheers,
Don
--

From: Peter Zijlstra
Date: Tuesday, December 7, 2010 - 4:30 pm

Nah, we should actively scan for that during the bring-up and kill
hw-perf when we find an enable bit set, some BIOSes actively use the
PMU, this is something that should be discouraged.

---
 arch/x86/kernel/cpu/perf_event.c |   30 +++++++++++++++++++++++++++---
 1 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 817d2b1..7f92833 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -375,15 +375,40 @@ static void release_pmc_hardware(void) {}
 static bool check_hw_exists(void)
 {
 	u64 val, val_new = 0;
-	int ret = 0;
+	int i, reg, ret = 0;
 
 	val = 0xabcdUL;
 	ret |= checking_wrmsrl(x86_pmu.perfctr, val);
 	ret |= rdmsrl_safe(x86_pmu.perfctr, &val_new);
-	if (ret || val != val_new)
+	if (ret || val != val_new) {
+		printk(KERN_CONT "Broken PMU hardware detected, software events only.\n");
 		return false;
+	}
+
+	/*
+	 * Check to see if the BIOS enabled any of the counters, if so
+	 * complain and bail.
+	 */
+	for (i = 0; i < x86_pmu.num_counters; i++) {
+		reg = x86_pmu.eventsel + i;
+		rdmsrl(reg, val);
+		if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
+			goto bios_fail;
+	}
+
+	for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
+		reg = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
+		rdmsrl(reg, val);
+		if (val & (0x03 << i*4))
+			goto bios_fail;
+	}
 
 	return true;
+
+bios_fail:
+	printk(KERN_CONT "Broken BIOS detected, software events only.\n");
+	printk(KERN_ERR FW_BUG "invalid MSR %x=%Lx\n", reg, val);
+	return false;
 }
 
 static void reserve_ds_buffers(void);
@@ -1379,7 +1404,6 @@ int __init init_hw_perf_events(void)
 
 	/* sanity check that the hardware exists or is emulated */
 	if (!check_hw_exists()) {
-		pr_cont("Broken PMU hardware detected, software events only.\n");
 		return 0;

Something like the below, preferably I'd key that off of SYS_KEXEC, but
looking through the existing notifiers adding a state requires ...
From: Don Zickus
Date: Wednesday, December 8, 2010 - 7:01 am

Ok, the reboot notifier addresses the kexec problem but doesn't fix it
though (I have to test to confirm that, comments below).  The bios check
should catch those situations (ironically I stumbled upon a machine with
this problem, so I will test your patch with it, though it only uses perf
counter 0).  The kdump problem will still exist, not sure if we care and
perhaps we should document in the changelog that we know kdump is still

I wonder if you should reverse these checks.  If the bios has the perf
counter enabled, there might be a high chance that it fails the first


Ok, so this shuts down the perf counters on cpu0, but the other cpus are
still running and will fail your new bios check, no?

Privately, I used the above wrapped with for_each_online_cpu(cpu) and it
worked fine for me.

Cheers,
Don

--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 7:20 am

Right, they usually only steal one or two counters, but the fact that

You mean even if we cure the kexec reboot notifier patch thing kdump is



Oh, so reboot doesn't actually stop the non-boot cpus? I was unsure of
that (see my XXX there), so yeah, if it doesn't then I guess the
for_each_possible_cpu() thing is the way out.



--

From: Vivek Goyal
Date: Wednesday, December 8, 2010 - 7:42 am

Yes. reboot notifier notifications are not sent in kdump path. In this
path we know kernel has crashed and we just try to do bare minimal things
to boot into second kernel. If some hardware is left in inconsistent
state we try to recover from that situation by resetting the device
when second kernel is booting.

Either the driver itself can detect that the device is in an inconsistent state
and reset it, or we pass the command line parameter "reset_devices" to the
second kernel to tell it explicitly that devices might be in a bad state and
should be reset during initialization. If we want to use these perf counters in
the kdump kernel, we shall have to do something similar.

Thanks
Vivek
--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 7:48 am

Right, so I'm perfectly fine with leaving the kdump kernel broken for
now and if people really do need hardware events we can try and reset
the hardware when we find that reset_devices command line parameter.

Not sure how that interacts with these broken BIOSes, but it's kdump so
it's mostly broken by design anyway ;-)


--

From: Vivek Goyal
Date: Wednesday, December 8, 2010 - 8:02 am

reset_devices was meant to be dual purpose so that it can handle broken
BIOSes also. So if BIOS is broken then one can pass "reset_devices" to 

Kdump has its share of problems, especially with the fact that
kernel/drivers find devices in a bad state and are not hardened enough
to deal with that. But on bare metal what's the better way of capturing
a kernel crash dump? Trying to do anything post crash in the kernel is
not very reliable either.

I think the way we fix the kernel for boot problems on newer hardware and for
broken BIOSes, we need to keep on fixing it in the kdump path also to make sure
new devices/drivers can cope with this scenario.

Thanks
Vivek 
--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 8:15 am

/me <3 RS-232

I haven't found anything better than that... 

And poking at the RS-232 requires less of the kernel to be functional
than booting into a new kernel (whose image might have been corrupted by
the dying kernel, etc..)
--

From: Vivek Goyal
Date: Wednesday, December 8, 2010 - 8:22 am

Serial is good for getting the oops out. But for the big vmcore? Secondly,
people want the flexibility of sending the vmcore over various targets
like over network to some remote server. Booting into second kernel opens
up all those options and now one can do intelligent filtering and send

The problem of the new kernel image being corrupted can be solved to a great
extent by write protecting that memory location.

So those who are happy with RS-232, they don't have to configure kdump.
Just connect serial console and get the oops message out.

Thanks
Vivek 
--

From: Eric W. Biederman
Date: Wednesday, December 8, 2010 - 2:16 pm

True.  But it can be a pain to operate RS-232 at production scale, or to
convince customers to hook up RS-232 just in case your released software

For debugging a reproducible failure RS-232 wins.  For everything else
there is kdump.  It sucks but it is at least fixable.

And really the kdump kernel should be running a minimalistic hardware
config so you only have to get the chunks of hardware you really care
about working.

As for corruption the kdump kernel lives in an area of memory that we
never DMA to in the primary kernel, and we check a sha256 hash before we
start booting the kdump kernel.  In general kdump fails safe. That is, if
it can't make things work it fails to boot and does nothing to your
system.  Definitely not perfect but if you don't have RS-232 it is the
best I have seen.

Eric


--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 7:59 am

Something like so..

---
Subject: perf, x86: Detect broken BIOSes
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Dec 08 15:56:23 CET 2010

Some BIOSes use PMU resources, this is a bug.

Try to detect this, warn about it, and further refuse to touch the
PMU ourselves.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c
+++ linux-2.6/arch/x86/kernel/cpu/perf_event.c
@@ -375,15 +375,51 @@ static void release_pmc_hardware(void) {
 static bool check_hw_exists(void)
 {
 	u64 val, val_new = 0;
-	int ret = 0;
+	int i, reg, ret = 0;
 
+	/*
+	 * Check to see if the BIOS enabled any of the counters, if so
+	 * complain and bail.
+	 */
+	for (i = 0; i < x86_pmu.num_counters; i++) {
+		reg = x86_pmu.eventsel + i;
+		ret = rdmsrl_safe(reg, &val);
+		if (ret)
+			goto msr_fail;
+		if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
+			goto bios_fail;
+	}
+
+	for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
+		reg = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
+		ret = rdmsrl_safe(reg, &val);
+		if (ret)
+			goto msr_fail;
+		if (val & (0x03 << i*4))
+			goto bios_fail;
+	}
+
+	/*
+	 * Now write a value and read it back to see if it matches,
+	 * this is needed to detect certain hardware emulators (qemu/kvm)
+	 * that don't trap on the MSR access and always return 0s.
+	 */
 	val = 0xabcdUL;
-	ret |= checking_wrmsrl(x86_pmu.perfctr, val);
+	ret = checking_wrmsrl(x86_pmu.perfctr, val);
 	ret |= rdmsrl_safe(x86_pmu.perfctr, &val_new);
 	if (ret || val != val_new)
-		return false;
+		goto msr_fail;
 
 	return true;
+
+bios_fail:
+	printk(KERN_CONT "Broken BIOS detected, software events only.\n");
+	printk(KERN_ERR FW_BUG "invalid MSR: %x=%Lx\n", reg, val);
+	return false;
+
+msr_fail:
+	printk(KERN_CONT "Broken PMU hardware detected, software events only.\n");
+	return false;
 }
 
 static ...
From: Yinghai Lu
Date: Wednesday, December 8, 2010 - 11:43 am

Can you add something like force_... on the command line to take over ownership of perf from the BIOS or the previous kernel?

Then we can still use perf etc. after we kexec from a RHEL or SLES kernel to a later kernel (from 2.6.37).

Thanks


--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 12:06 pm

The problem is, you cannot steal the thing from the BIOS, you'll trample
on its settings and the next time it runs it will simply re-instate it.

And aside from probing the EN bit on boot there is no way of determining
this.


I'm not sure why people would do that, but yeah I guess we can do
something like that.
--

From: Yinghai Lu
Date: Wednesday, December 8, 2010 - 12:20 pm

One more problem:
systems with linuxbios that have a kernel in flash as the bootloader may kexec to the final production kernel.

They may then need to update that embedded kernel to shut down perf....

Yinghai
--

From: Yinghai Lu
Date: Wednesday, December 8, 2010 - 12:05 pm

how about second case: kexec from RHEL 6 stock kernel to upstream kernel ?

Thanks

	Yinghai
--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 12:17 pm

It's impossible to distinguish between a BIOS having claimed a counter
and a previous kernel not having shut things down properly.

The best we can do is allow a force parameter and let the user keep all
pieces when he uses it.
--

From: Yinghai Lu
Date: Wednesday, December 8, 2010 - 12:20 pm

ok, thanks.
--

From: Don Zickus
Date: Wednesday, December 8, 2010 - 12:01 pm

My understanding is that you can't because the BIOS is actively using it
behind the scenes of the kernel (well during an SMI).  I have a machine
where I tried to force take it but it still stopped triggering interrupts.

Cheers,
Don
--

From: Don Zickus
Date: Wednesday, December 8, 2010 - 3:37 pm

This seems to work correctly on my Nehalem and broken bios machines during
boot and kexec.  As expected it fails during kdump.  My p4 box failed
during kexec for some reason.  But p4 has other issues.

Cheers,
--

From: Eric W. Biederman
Date: Wednesday, December 8, 2010 - 4:20 pm

Does the kdump kernel still boot?

It looks like it should I just want to double check.

Eric
--

From: Don Zickus
Date: Wednesday, December 8, 2010 - 9:34 pm

Yeah, sorry for not being clear.  It definitely boots and does its thing.
perf init (and thus nmi watchdog) fails with 'BIOS broken' because the perf
counters were not shut down prior to executing kdump.

Cheers,
Don
--

From: Don Zickus
Date: Thursday, December 9, 2010 - 1:20 pm

Getting closer...


	if (x86_pmu.perfctr_second_write)
		ret |= checking_wrmsrl(x86_pmu.perfctr, val);



Cheers,
Don
--

From: Cyrill Gorcunov
Date: Thursday, December 9, 2010 - 1:44 pm

On Thu, Dec 09, 2010 at 03:20:08PM -0500, Don Zickus wrote:
...

yeah, thanks! would you push a patch upstream?

  Cyrill
--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 7:33 am

Something like so then?

---
Subject: perf: Stop all counters on reboot
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Dec 08 15:29:02 CET 2010

Use the reboot notifier to detach all running counters on reboot, this
solves a problem with kexec where the new kernel doesn't expect
running counters (rightly so).

It will however decrease the coverage of the NMI watchdog. Making a
kexec specific reboot notifier callback would be best, however that
would require touching all notifier callback handlers as they are not
properly structured to deal with new state.

As a compromise, place the perf reboot notifier at the very last
position in the list.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -21,6 +21,7 @@
 #include <linux/dcache.h>
 #include <linux/percpu.h>
 #include <linux/ptrace.h>
+#include <linux/reboot.h>
 #include <linux/vmstat.h>
 #include <linux/vmalloc.h>
 #include <linux/hardirq.h>
@@ -6329,7 +6330,7 @@ static void __cpuinit perf_event_init_cp
 	mutex_unlock(&swhash->hlist_mutex);
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
+#if defined CONFIG_HOTPLUG_CPU || defined CONFIG_KEXEC
 static void perf_pmu_rotate_stop(struct pmu *pmu)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
@@ -6383,6 +6384,26 @@ static void perf_event_exit_cpu(int cpu)
 static inline void perf_event_exit_cpu(int cpu) { }
 #endif
 
+static int
+perf_reboot(struct notifier_block *notifier, unsigned long val, void *v)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		perf_event_exit_cpu(cpu);
+
+	return NOTIFY_OK;
+}
+
+/*
+ * Run the perf reboot notifier at the very last possible moment so that
+ * the generic watchdog code runs as long as possible.
+ */
+static struct notifier_block perf_reboot_notifier = {
+	.notifier_call = perf_reboot,
+	.priority = ...
From: Vivek Goyal
Date: Wednesday, December 8, 2010 - 7:39 am

Can't think why would somebody like to use performance counters in kdump
kernel. So that probably should not be a concern.

Vivek
--

From: Don Zickus
Date: Tuesday, December 7, 2010 - 2:16 pm

Ok, here is a simpler patch for now.

--------------------------------8<--------
From: Don Zickus <dzickus@redhat.com>
Date: Tue, 7 Dec 2010 16:06:59 -0500
Subject: [PATCH] perf:  Use event select bits for hardware check

The counter registers can continue to increment if left enabled
across a kexec or a kdump.  This makes the perf hardware check
accidentally return false when the hardware really does exist.

Change the check to use the first bits of event selection.  Those
bits should be safe as they are used to program the type of events
to use.  And more importantly, they won't increment across kexec/kdump.

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/cpu/perf_event.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 7b91396..7d869c0 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -377,10 +377,10 @@ static bool check_hw_exists(void)
 	u64 val, val_new = 0;
 	int ret = 0;
 
-	val = 0xabcdUL;
-	ret |= checking_wrmsrl(x86_pmu.perfctr, val);
-	ret |= rdmsrl_safe(x86_pmu.perfctr, &val_new);
-	if (ret || val != val_new)
+	val = 0xabUL;
+	ret |= checking_wrmsrl(x86_pmu.eventsel, val);
+	ret |= rdmsrl_safe(x86_pmu.eventsel, &val_new);
+	if (ret || val != (val_new & 0xFF))
 		return false;
 
 	return true;
-- 
1.7.3.2

--

From: Yinghai Lu
Date: Tuesday, December 7, 2010 - 5:26 pm

Thanks. it fixes the problem.

Yinghai
--

From: Peter Zijlstra
Date: Wednesday, December 8, 2010 - 3:39 am

Won't merge it though, I think it stinks..
--

From: Eric W. Biederman
Date: Wednesday, December 1, 2010 - 1:41 pm

No you don't!

Most BIOSen implement a board level reset there, but it isn't required.
Just doing a software only reinitialization is allowed, and on some
arches is the only thing you can do.

Speed during reboot is not a reason to avoid anything.  reboot
is not a fast path, and we are talking about things in human terms.

The only argument I have heard that holds the least amount of
sense is to keep what we do to a minimum, to increase the chances
that we can do a reboot even after a kernel oops.

All of that said.  What insane state are we leaving the hardware
in that we think it is going to be slow in human terms to remove?

Eric
--
