Re: [PATCH -v3] perf, x86: try to handle unknown nmis with running perfctrs

Previous thread: [PATCH v3] netfilter: save the hash of the tuple in the original direction for latter use by Changli Gao on Friday, August 20, 2010 - 7:53 am. (5 messages)

Next thread: 2.6.36-rc1 hangs during XFS barrier test for / by Torsten Kaiser on Friday, August 20, 2010 - 8:08 am. (3 messages)
From: Don Zickus
Date: Friday, August 20, 2010 - 8:05 am

I'll test tip later today to see if I can reproduce it.

Cheers,
Don

From: Ingo Molnar
Date: Friday, August 20, 2010 - 8:25 am

FYI, it does not trigger on an AMD box.

Thanks,

	Ingo
--

From: Ingo Molnar
Date: Monday, August 23, 2010 - 1:53 am

Ok, to not hold up the perf/urgent flow i zapped these two commits for 
the time being:

 4a31beb: perf, x86: Fix handle_irq return values
 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU

We can apply them if they take a form that dont introduce a different 
kind of (and more visible) regression.

Thanks,

	Ingo
--

From: Cyrill Gorcunov
Date: Tuesday, August 24, 2010 - 9:22 am

Btw, guys, I fail to see how new nmi_watchdog work, we have
default_do_nmi
  if (!(reason & 0xc0)) {
    if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) == == NOTIFY_STOP)
      return
    if (nmi_watchdog_tick(regs, reason))
      return

but perf_event_nmi_handler returns NOTIFY_STOP when watchdog is perf event
and nmi_watchdog_tick _never_ called, or (most probably) I miss something?

	-- Cyrill
--

From: Robert Richter
Date: Tuesday, August 24, 2010 - 10:09 am

The watchdog is disabled during profiling (perf and oprofile) by
calling disable_lapic_nmi_watchdog().

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Cyrill Gorcunov
Date: Tuesday, August 24, 2010 - 10:20 am

Huh? iirc Don have switched nmi watchdog to native perf subsystem, ie watchdog
uses PERF_COUNT_HW_CPU_CYCLES event, letme check...
 
	-- Cyrill
--

From: Cyrill Gorcunov
Date: Tuesday, August 24, 2010 - 10:21 am

False alarm, perf watchdog uses own handler. Sorry for noise ;)

	-- Cyrill
--

From: Robert Richter
Date: Tuesday, August 24, 2010 - 10:15 am

Ingo, are those commits in some branch to fetch them for testing? I
did not find them in tip/master.

Thanks,

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Cyrill Gorcunov
Date: Tuesday, August 24, 2010 - 10:28 am

Robert, they should be in -tip tree, I have

| commit 4a31bebe71ab2e4f6ecd6e5f9f2ac9f0ff38ff76
| Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Date:   Thu Aug 19 16:28:00 2010 +0200
|
|    perf, x86: Fix handle_irq return values
    
	-- Cyrill
--

From: Don Zickus
Date: Tuesday, August 24, 2010 - 11:46 am

Has anyone else seen kernel panics when enabling the LOCKUP_DETECTOR or
apply Robert's patch?  I keep getting a panic with unable to handle paging
request everytime either Robert's patch or the lockup detector uses
__get_cpu_var().  And only on 2.6.36 code.

Cheers,
Don
--

From: Cyrill Gorcunov
Date: Tuesday, August 24, 2010 - 11:54 am

I use 2.6.34 atm. Letme try 2.6.36 (which might require some time to recompile).

	-- Cyrill
--

From: Cyrill Gorcunov
Date: Tuesday, August 24, 2010 - 12:52 pm

On Tue, Aug 24, 2010 at 10:54:58PM +0400, Cyrill Gorcunov wrote:

Don, for me it fails with somehow unrelated page handling fault - in
reiserfs_evict_inode O_o Fails in __get_cpu_var I suspect might means
a problem either in per-cpu allocator itself or we screw pointer somehow.
Weird.

	-- Cyrill
--

From: Don Zickus
Date: Tuesday, August 24, 2010 - 1:27 pm

I just found out (with the help of the crash utility and Dave A.) that
Robert's percpu struct nmi clashes with the exception entry point .nmi.
I only see this problem in 2.6.36, so I am not sure what changed with
regards to compiler flags to confuse variables with text segments.

But renaming the percpu struct nmi to nmidon fixed the problem for me (I
am open to other suggestions :-) ).

Regarding your reiserfs, what was the variable's name?

Cheers,
Don
--

From: Cyrill Gorcunov
Date: Tuesday, August 24, 2010 - 1:40 pm

yeah, I suspect name clashes here but then I did grep over per-cpu variables
in whole kernel and didn't find match so I thought the assumption was wrong,

nmi_don_zickus ;) well, I think nmi_pmu or something like that

It seems to be different, it's pity that I had only 80x25 vga
mode and was unable to snap the whole log. But actually I didn't
even check precisely all .config options I had set since I was
more interested in early stage where per-cpu access should already
happen rather then real init'ed environmen. But I think I'll be
	-- Cyrill
--

From: Frederic Weisbecker
Date: Wednesday, August 25, 2010 - 4:52 pm

There was a missing "return" in the end of reiserfs_evict_inode.

It has been fixed there:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=f4ae2f...

--

From: Cyrill Gorcunov
Date: Thursday, August 26, 2010 - 2:11 am

On Thu, Aug 26, 2010 at 3:52 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:

Thanks for info, Frederic!
--

From: Robert Richter
Date: Wednesday, August 25, 2010 - 3:20 am

I was testing the patches also with CONFIG_LOCKUP_DETECTOR=y without

Also I was looking at the differences between .35 and tip/perf/urgent
and could not find a namespace collision.

Hmm...

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Don Zickus
Date: Thursday, August 26, 2010 - 2:14 pm

So this patch fixes it, though I haven't convince myself why (perhaps
babysitting my 4 month old isn't helping :-))

The code now enters the loop and reprocesses the new status which properly
increments handled to 2 and thus the new logic takes care of it.

Cheers,
Don


diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 4539b4b..d16ebd8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -738,6 +738,7 @@ again:
 
 	inc_irq_stat(apic_perf_irqs);
 	ack = status;
+	intel_pmu_ack_status(ack);
 
 	intel_pmu_lbr_read();
 
@@ -766,8 +767,6 @@ again:
 			x86_pmu_stop(event);
 	}
 
-	intel_pmu_ack_status(ack);
-
 	/*
 	 * Repeat if there is more work to be done:
 	 */
--

From: Robert Richter
Date: Friday, August 27, 2010 - 12:51 am

Yes, not immediately ack'ing the status was suspect to me too. Though
it must then be the same counter that retriggers. Or, it is a cpu
bug. You could add a debug print of the status register for the case
the loop is reentered, would be interesting...

Thank for fixing this.


-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Don Zickus
Date: Friday, August 27, 2010 - 6:39 am

It seems to be the same counter.  I wonder if the act of processing it
'intel_save_and_restart' cleared the status bit on the perfselX bit but
not in the global status register.  Thus it triggered again and we
accidentally cleared it when we ack'd later in the code.

Cheers,
Don
--

From: Robert Richter
Date: Friday, August 27, 2010 - 1:10 am

I would slightly change the patch:

There is no need for the ack variable anymore, you could directly work
with the status.

I would call intel_pmu_ack_status() as close as possible after the
intel_pmu_get_status(), which is after 'again:'.


-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Don Zickus
Date: Friday, August 27, 2010 - 6:44 am

Yeah, I can do that.  The other patch was just a proof of concept to see
what others thought.

What is funny is that this problem was masked by the
perf_event_nmi_handler swallowing all the nmis.  I wonder if we were
losing events as a result of this bug too because if you think about it,
we processed the first event, a second event came in and we accidentally
ack'd it, thus dropping it on the floor.  Now I wonder how the event was
ever reloaded, unless it was by accident because of how the scheduler
deals with perf counters (perf_start/stop all the time).

Cheers,
Don
--

From: Robert Richter
Date: Friday, August 27, 2010 - 7:05 am

Yes, this could be the case, but only for handled counters. So it
would be interesting to see for this case the status mask of the

The nmi might be queued be the cpu regardless of of the overflow
state.

I am wondering why this happens at all, because events are disabled by
wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0). Hmm, maybe this is exactly the
reason because the nmi could fire again after reenabling the counters.

Is there a reason for disabling all counters?


-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Don Zickus
Date: Friday, August 27, 2010 - 8:05 am

The status masks seem to be identical, 0x1 (and when I forced pmc0

Heh.  Not sure why it isn't working then.  Then again you shouldn't need

It would be a nice to have that way we wouldn't have to 'eat' all these
extra nmis.  But I guess it isn't working correctly.

Cheers,
Don
--

From: Robert Richter
Date: Friday, August 27, 2010 - 8:48 am

So this should also happen if only one counter is running?
Back-to-back nmis actually only occur then 2 different counters

What about the erratum mentioned in this thread before? We might
identify affected cpus and return handled=2 for them. This solution
will be still better than before. And, all other cpu models have the
nmi detection fixed. In a next step we could try to use a timer for
detection.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Don Zickus
Date: Friday, August 27, 2010 - 11:57 am

Here is another version of the patch, which uses Robert's suggestion of
removing the ack variable

Cheers,
Don 

--

From: Don Zickus <dzickus@redhat.com>
Date: Fri, 27 Aug 2010 14:43:03 -0400
Subject: [PATCH] [x86] perf: fix accidentally ack'ing a second event on intel perf counter

During testing of a patch to stop having the perf subsytem swallow nmis,
it was uncovered that Nehalem boxes were randomly getting unknown nmis
when using the perf tool.

Moving the ack'ing of the PMI closer to when we get the status allows
the hardware to properly re-set the PMU bit signaling another PMI was
triggered during the processing of the first PMI.  This allows the new
logic for dealing with the shortcomings of multiple PMIs to handle the
extra NMI by 'eat'ing it later.

Now one can wonder why are we getting a second PMI when we disable all
the PMUs in the begining of the NMI handler to prevent such a case, for
that I do not know.  But I know the fix below helps deal with this quirk.

Tested on multiple Nehalems where the problem was occuring.  With the
patch, the code now loops a second time to handle the second PMI (whereas
before it was not).

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/cpu/perf_event_intel.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 4539b4b..ee05c90 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -712,7 +712,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 	struct perf_sample_data data;
 	struct cpu_hw_events *cpuc;
 	int bit, loops;
-	u64 ack, status;
+	u64 status;
 	int handled = 0;
 
 	perf_sample_data_init(&data, 0);
@@ -729,6 +729,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 
 	loops = 0;
 again:
+	intel_pmu_ack_status(status);
 	if (++loops > 100) {
 		WARN_ONCE(1, "perfevents: irq loop stuck!\n");
 ...
From: Yinghai Lu
Date: Friday, August 27, 2010 - 12:00 pm

Can you resend all three updated patches?

Thanks

Yinghai
--

From: Robert Richter
Date: Friday, August 27, 2010 - 12:33 pm

Looks good to me. Thanks Don.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Robert Richter
Date: Wednesday, August 25, 2010 - 2:48 am

Ingo,

do you mean it does not trigger false positives on AMD? Both patches
applied on top of current tip/perf/urgent (c6db67c) are working on the
systems I have.

You might use the debug patch below for diagnostics.

-Robert

--

From 1bbb5aa64e96360529c34a593a072e1a84114f04 Mon Sep 17 00:00:00 2001
From: Robert Richter <robert.richter@amd.com>
Date: Wed, 11 Aug 2010 18:14:00 +0200
Subject: [PATCH] debug

Signed-off-by: Robert Richter <robert.richter@amd.com>
---
 arch/x86/kernel/cpu/perf_event.c |   54 ++++++++++++++++++++++++++++++++++++-
 1 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index dd2fceb..059ef09 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1203,10 +1203,43 @@ void perf_events_lapic_init(void)
 struct pmu_nmi_state {
 	unsigned int	marked;
 	int		handled;
+	u64		timestamp;
 };
 
 static DEFINE_PER_CPU(struct pmu_nmi_state, nmi);
 
+struct nmi_debug {
+	int		cpu;
+	unsigned int	this_nmi;
+	unsigned int	marked;
+	int		handled;
+	u64		timestamp;
+	u64		delta;
+};
+
+static DEFINE_PER_CPU(struct nmi_debug[16], nmi_debug);
+
+static void nmi_handler_debug(void)
+{
+	struct nmi_debug *debug;
+	int i;
+
+	if (!printk_ratelimit())
+		return;
+
+	for (i = 0; i < 16; i++) {
+		debug = &__get_cpu_var(nmi_debug)[i];
+		printk(KERN_EMERG
+		       "cpu #%d, nmi #%d, marked #%d, handled = %d, time = %llu, delta = %llu\n",
+		       debug->cpu,
+		       debug->this_nmi,
+		       debug->marked,
+		       debug->handled,
+		       debug->timestamp,
+		       debug->delta);
+	}
+}
+
 static int __kprobes
 perf_event_nmi_handler(struct notifier_block *self,
 			 unsigned long cmd, void *__args)
@@ -1214,6 +1247,8 @@ perf_event_nmi_handler(struct notifier_block *self,
 	struct die_args *args = __args;
 	unsigned int this_nmi;
 	int handled;
+	struct nmi_debug *debug;
+	u64 timestamp;
 
 	if ...
From: Ingo Molnar
Date: Wednesday, August 25, 2010 - 3:41 am

Correct, on AMD boxes i do not get this message during the first 'perf 

Before the two patches i did not get these messages at all, on any of 

Thanks, will try that and report back.

	Ingo
--

From: Ingo Molnar
Date: Wednesday, August 25, 2010 - 4:00 am

Here's a more detailed description of the regression introduced by:

  4a31beb: perf, x86: Fix handle_irq return values
  8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU

Booting into the debug kernel the system boots up fine - no NMI 
messages, as expected.

Then when i start 'perf top' for the first time i get the NMI message 
with this debug output:

 cpu #15, nmi #160, marked #0, handled = 1, time = 333392635730, delta = 11238255
 cpu #15, nmi #161, marked #0, handled = 1, time = 333403779380, delta = 11143650
 cpu #15, nmi #162, marked #0, handled = 1, time = 333415418497, delta = 11639117
 cpu #15, nmi #163, marked #0, handled = 1, time = 333415467084, delta = 48587
 cpu #15, nmi #164, marked #0, handled = 1, time = 333415501531, delta = 34447
 cpu #15, nmi #165, marked #0, handled = 1, time = 333459918106, delta = 44416575
 cpu #15, nmi #166, marked #0, handled = 0, time = 333459923167, delta = 1666
 cpu #15, nmi #151, marked #0, handled = 1, time = 332978597882, delta = 11447002
 cpu #15, nmi #152, marked #0, handled = 1, time = 332978657151, delta = 59269
 cpu #15, nmi #153, marked #0, handled = 1, time = 332978667847, delta = 10696
 cpu #15, nmi #154, marked #0, handled = 1, time = 333023125757, delta = 44457910
 cpu #15, nmi #155, marked #0, handled = 1, time = 333291980833, delta = 268855076
 cpu #15, nmi #156, marked #0, handled = 1, time = 333325663125, delta = 33682292
 cpu #15, nmi #157, marked #0, handled = 1, time = 333348216481, delta = 22553356
 cpu #15, nmi #158, marked #0, handled = 1, time = 333370168887, delta = 21952406
 cpu #15, nmi #159, marked #0, handled = 1, time = 333381397475, delta = 11228588
 Uhhuh. NMI received for unknown reason 00 on CPU 15.
 Do you have a strange power saving mode enabled?
 Dazed and confused, but trying to continue

When i start perf top for a second time, no messages are printed at all. 
The reason is that on one of the CPUs NMIs are 'stuck':

 NMI: 78164 67099 6342 [*] 65677 66119 63796 65395 ...
From: Don Zickus
Date: Wednesday, August 25, 2010 - 1:11 pm

So I found a Nehalem box that can reliably reproduce Ingo's problem using
something as simple 'perf top'.  But like above, I am noticing the
samething, an extra NMI(PMI??) that comes out of nowhere.

Looking at the data above the delta between nmis is very small compared to
the other nmis.  It almost suggests that this is an extra PMI.
Considering there is already two cpu errata discussing extra PMIs under
certain configurations, I wouldn't be surprised if this was a third.

Cheers,

--

From: Cyrill Gorcunov
Date: Wednesday, August 25, 2010 - 1:24 pm

On Wed, Aug 25, 2010 at 04:11:06PM -0400, Don Zickus wrote:

Oh. I'm not sure if it would be a good idea at all but maybe we could
use kind of Robert's idea about "pmu nmi relaxing time" ie some time
slice in which we treat nmi's as being from pmu, but not arbitrary number
but equal to the number of PMI turned off. Say we handle NMI and found
that 4 events are overflowed, we clear them, arm timer and wait for
3 unknow nmis to happen, if they are not happening during some time
period we clear this waitqueue, if they happen or partially happen
- we destroy the timer. Ie almost the same as Robert's idea but
without tsc? Just a thought.

	-- Cyrill
--

From: Don Zickus
Date: Wednesday, August 25, 2010 - 2:20 pm

The only problem is only one counter is overflowing in these cases, so we
would have to do it all the time, which may not be hard.  But I was
thinking of something similar.

For now, I am trying to force counter0 off, seeing that most of the perf
errata on nehalem have been on counter0.  Or maybe I can get 'perf top' to
use something other than counter0 by running 'perf record' first?

Cheers,
--

From: Cyrill Gorcunov
Date: Wednesday, August 25, 2010 - 2:36 pm

On Wed, Aug 25, 2010 at 05:20:37PM -0400, Don Zickus wrote:

Well, I never looked deep inside perf tool code itself. Perhaps the easiest
(if you want to disable counter0) is just disable it in kernel and see how
it goes, no?

	-- Cyrill
--

From: Robert Richter
Date: Thursday, August 26, 2010 - 2:00 am

This could also be a race in the counter handling code, or we do not
proper count the number of handled counters. Maybe 2 counters actually
fired but we only noticed one counter and then accidentially cleared
the 2nd without processing it.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Cyrill Gorcunov
Date: Thursday, August 26, 2010 - 2:18 am

On Thu, Aug 26, 2010 at 1:00 PM, Robert Richter <robert.richter@amd.com> wrote:

Any chance to get it tested on P4 machine since it has a bit
different design?

Cyrill
--

From: Don Zickus
Date: Thursday, August 26, 2010 - 7:31 am

Well P4 uses a different pmu irq handler, so I am not sure it will give us
much insight.  I haven't noticed this on an AMD or an intel i5 either.

Cheers,
Don
--

From: Don Zickus
Date: Thursday, August 26, 2010 - 8:22 am

Hmm, I take that back.  I guess I can reproduce this on my i5 that I had
using Ingo's config.

Working on Robert's assumption, I added code to perf_event_intel.c that
said if handled !=0 just add one to it (IOW always process handled as 0 or
 something >1).  That seems to working good and catches the nmis that Ingo
was seeing.

I'll keep looking for the race condition to better fix it.

Cheers,
Don
--

From: Cyrill Gorcunov
Date: Thursday, August 26, 2010 - 8:34 am

Sounds promising, mind to post new inter-diff? Ie what you have
changed from Robert's patch.

	-- Cyrill
--

From: Don Zickus
Date: Thursday, August 26, 2010 - 9:40 am

It was really in Peter's patch, just a stupid hack for now.

Cheers,
Don


diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 4539b4b..9e65a7b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -777,7 +777,9 @@ again:
 
 done:
 	intel_pmu_enable_all(0);
-	return handled;
+	if (!handled)
+		return handled;
+	return ++handled;
 }
 
 static struct event_constraint *
--

From: Cyrill Gorcunov
Date: Thursday, August 26, 2010 - 11:02 am

ok, it seems it just treat any unknown nmi as being came from PMU, no?

	-- Cyrill
--

From: Robert Richter
Date: Friday, August 27, 2010 - 12:57 am

Yes, this just throws away all unknown nmis after a perf nmi. It
disables unknown nmi detection on this cpu type.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Peter Zijlstra
Date: Friday, August 27, 2010 - 1:11 am

Wouldn't returning 2 be more sensible, then it would only eat a few
unknowns after each pmi? (Still assuming you return 0 when there really
was nothing to do)
--

From: Robert Richter
Date: Friday, August 27, 2010 - 1:31 am

Yes, this would be the best workaround for cpus where the detection
logic does not work properly. But I think Don found a solution
already.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Robert Richter
Date: Wednesday, August 25, 2010 - 4:02 am

Yes, because all were eaten by NOTIFY_STOP, and that's why we also
lose NMIs from other sources, like the nmi button. The patches try to
fix this.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--

From: Ingo Molnar
Date: Wednesday, August 25, 2010 - 4:19 am

I understand that, but obviously the patches need to fix the lack of 
messages without introducing new regressions: such as messages not seen 
before and CPUs with stuck NMIs.

Thanks,

	Ingo
--

From: Don Zickus
Date: Friday, August 20, 2010 - 4:31 pm

Sad to say, that won't happen.  Both my amd box and nehalem box have to
many issues with your master branch.

The amd box BUGs in perf_event_nmi_handler on the new code trying to run
'perf top'

arch/x86/kernel/cpu/perf_event.c::perf_event_nmi_handler:1250

((__get_cpu_var(nmi).marked == this_nmi) &&

The BUG is attached below.  I can't figure out why

And my Nehalem box won't even boot with the that kernel, not even to
console for some reason.  Then bisecting revealed that in 2.6.35 something
with LVM changed such that the kernel can't mount my RHEL-6 lvm
partitions.  So even if I did get that kernel booting it won't mount
disks.

I'll take this as a sign to quit for now.. and try again on Monday. :-)

Cheers,
Don

-----



amd-ma78gm-01.rhts.eng.bos.redhat.com login: BUG: unable to handle kernel
paging request at ffff87ff838a5200
IP: [<ffffffff814a6370>] perf_event_nmi_handler+0xd0/0xe0
PGD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/online
CPU 0 
Modules linked in: autofs4 sunrpc cpufreq_ondemand powernow_k8 freq_table
mperf ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport wmi
snd_hda_codec_atihdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore
snd_page_alloc pcspkr serio_raw edac_core edac_mce_amd sg i2c_piix4 r8169
mii ahci libahci shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif
firewire_ohci firewire_core crc_itu_t ata_generic pata_acpi pata_atiixp
floppy radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mod
[last unloaded: scsi_wait_scan]

Pid: 1865, comm: perf Not tainted 2.6.36-rc1tipperf-tip+ #28
GA-MA78GM-S2H/GA-MA78GM-S2H
RIP: 0010:[<ffffffff814a6370>]  [<ffffffff814a6370>]
perf_event_nmi_handler+0xd0/0xe0
RSP: 0018:ffff880002407e88  EFLAGS: 00010046
RAX: 0000000000000001 RBX: 000000000000000c RCX: ffffffff814a5200
RDX: ffffffff814a5200 RSI: 0000000000000001 RDI: ffff880002400000
RBP: ffff880002407e98 R08: 0000000000000001 ...
Previous thread: [PATCH v3] netfilter: save the hash of the tuple in the original direction for latter use by Changli Gao on Friday, August 20, 2010 - 7:53 am. (5 messages)

Next thread: 2.6.36-rc1 hangs during XFS barrier test for / by Torsten Kaiser on Friday, August 20, 2010 - 8:08 am. (3 messages)