Re: softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks

From: Ingo Molnar
Date: Wednesday, February 6, 2008 - 5:51 pm

* Andrew Morton <akpm@linux-foundation.org> wrote:


the way i do it in bisection is to do:

  mkdir patches
  git-log -1 -p ed50d6cbc394cd0966469d3 > patches/fix.patch
  echo fix.patch > patches/series

and then before testing a bisection point, i do a 'quilt push'. Before 
telling git-bisect about the quality of that bisection point (good/bad) 
i pop it off via 'quilt pop'.

this way the 'required fix' can be kept during the bisection, to find 
the secondary bug.
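
as a rough sketch, one step of the loop above could be wrapped in a small shell function. bisect_step, QUILT and TEST_CMD are made-up names for illustration only, and TEST_CMD stands in for whatever "build, boot and test" means on the box in question. the return values follow the usual git-bisect convention (0 = good, 1 = bad, 125 = cannot test this commit):

```shell
#!/bin/sh
# Sketch only: QUILT and TEST_CMD are hypothetical knobs, not something
# from the original workflow.  TEST_CMD must exit 0 when the kernel at
# this bisection point behaves and non-zero when it does not.
QUILT=${QUILT:-quilt}
TEST_CMD=${TEST_CMD:-./build-and-test.sh}

bisect_step() {
	# apply the required fix on top of the current bisection point
	$QUILT push || return 125	# fix does not apply: cannot test

	# test with the fix applied and remember the verdict
	if $TEST_CMD; then rc=0; else rc=1; fi

	# pop the fix off again before git is told anything, so that
	# the tree is clean when bisection moves to the next commit
	$QUILT pop || return 125

	return $rc
}
```

the caller then maps the result to 'git bisect good', 'git bisect bad' or 'git bisect skip'. the same function could presumably also serve as the body of a script handed to 'git bisect run', assuming the test can be fully automated.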


what should be the proper message?

my suspects, besides there being something wrong in the hung-tasks code 
of the softlockup watchdog, would be the cpu-hotplug commits, or some 
arch/x86 commit. (although we didn't really have anything specifically 
touching the reboot path)

does a stupid patch like the one below tell you more about what the 
other CPUs are doing during this hang? [32-bit only patch]

	Ingo

---
 arch/i386/kernel/nmi.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Index: linux/arch/i386/kernel/nmi.c
===================================================================
--- linux.orig/arch/x86/kernel/nmi_64.c
+++ linux/arch/x86/kernel/nmi_64.c
@@ -331,6 +331,14 @@ __kprobes int nmi_watchdog_tick(struct p
 	int touched = 0;
 	int cpu = smp_processor_id();
 	int rc=0;
+	static int count[NR_CPUS];
+
+	if (!count[cpu]) {
+		count[cpu] = nmi_hz;
+		printk("CPU#%d, tick\n", cpu);
+		show_regs(regs);
+	}
+	count[cpu]--;
 
 	/* check for other users first */
 	if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
--