This patchset uses HPET timers in MSI mode (when supported) and sets up per-CPU HPET timers. On platforms that support HPET MSI mode, this removes the dependency on the IRQ0 timer broadcast when the LAPIC timer stops in deep C-states.

On my test system with a dual-core CPU, the number of timer-related interrupts (HPET_MSI + IRQ0 + LAPIC) comes down from 180 to 95 over a period of 10s with these patches. This was measured on an idle system with tickless enabled.

Patches are against tip.
--
cool stuff! this is _really_ how a modern dynticks system should look on x86 - proper per-CPU hardware timers that are southbridge based.

There are a few routine checks this new code has to pass: we've got to see how widely this works and whether there are any bugs/quirks to take care of, so i created a separate feature topic for it: tip/timers/hpet-percpu.

This tip/timers/hpet-percpu feature topic tree is based on irq/sparseirq + timers/hpet + timers/urgent - which had some changes in the hpet area. I merged up the conflicts - please double check the result. I also did cleanups for a few style problems that were present in hpet.c.

I've merged it into tip/master as well and will run a few tests before pushing it out.

Ingo
--
it crashes two test systems; they fault on a NULL pointer in hpet init, with:

initcall print_all_ICs+0x0/0x520 returned 0 after 26 msecs
calling hpet_late_init+0x0/0x1c0
BUG: unable to handle kernel NULL pointer dereference at 000000000000008c
IP: [<ffffffff80d228be>] hpet_late_init+0xfe/0x1c0
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.27-rc5 #29725
RIP: 0010:[<ffffffff80d228be>] [<ffffffff80d228be>] hpet_late_init+0xfe/0x1c0
RSP: 0018:ffff88003fa07dd0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: ffffc20000000160 RSI: 0000000000000000 RDI: 0000000000000003
RBP: ffff88003fa07e90 R08: 0000000000000000 R09: ffff88003fa07dd0
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88003fa07dd0
R13: 0000000000000002 R14: ffffc20000000000 R15: 000000006f57e511
FS: 0000000000000000(0000) GS:ffffffff80cf6a80(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000000008c CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88003fa06000, task ffff88003fa08000)
Stack: 00000000fed00000 ffffc20000000000 0000000100000003 0000000800000002
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff80d227c0>] ? hpet_late_init+0x0/0x1c0
 [<ffffffff80209045>] do_one_initcall+0x45/0x190
 [<ffffffff80296f39>] ? register_irq_proc+0x19/0xe0
 [<ffffffff80d0d140>] ? early_idt_handler+0x0/0x73
 [<ffffffff80d0dabc>] kernel_init+0x14c/0x1b0
 [<ffffffff80942ac1>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8020dbd9>] child_rip+0xa/0x11
 [<ffffffff8020ceee>] ? restore_args+0x0/0x30
 [<ffffffff80d0d970>] ? kernel_init+0x0/0x1b0
 [<ffffffff8020dbcf>] ? child_rip+0x0/0x11
Code: 20 48 83 c1 01 48 39 f1 75 e3 ...
There was one code path, with CONFIG_PCI_MSI disabled, where we were accessing
hpet_devs without initialization. That resulted in the above crash. The change
below adds a check for hpet_devs.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
arch/x86/kernel/hpet.c | 28 +++++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)
Index: tip/arch/x86/kernel/hpet.c
===================================================================
--- tip.orig/arch/x86/kernel/hpet.c 2008-09-08 09:20:00.000000000 -0700
+++ tip/arch/x86/kernel/hpet.c 2008-09-08 09:44:23.000000000 -0700
@@ -124,6 +124,24 @@ EXPORT_SYMBOL_GPL(is_hpet_enabled);
  * timer 0 and timer 1 in case of RTC emulation.
  */
 #ifdef CONFIG_HPET
+static void hpet_reserve_msi_timers(struct hpet_data *hd)
+{
+	int i;
+
+	if (!hpet_devs)
+		return;
+
+	for (i = 0; i < hpet_num_timers; i++) {
+		struct hpet_dev *hdev = &hpet_devs[i];
+
+		if (!(hdev->flags & HPET_DEV_VALID))
+			continue;
+
+		hd->hd_irq[hdev->num] = hdev->irq;
+		hpet_reserve_timer(hd, hdev->num);
+	}
+}
+
 static void hpet_reserve_platform_timers(unsigned long id)
 {
 	struct hpet __iomem *hpet = hpet_virt_address;
@@ -156,15 +174,7 @@ static void hpet_reserve_platform_timers
 			Tn_INT_ROUTE_CNF_SHIFT;
 	}
-	for (i = 0; i < nrtimers; i++) {
-		struct hpet_dev *hdev = &hpet_devs[i];
-
-		if (!(hdev->flags & HPET_DEV_VALID))
-			continue;
-
-		hd.hd_irq[hdev->num] = hdev->irq;
-		hpet_reserve_timer(&hd, hdev->num);
-	}
+	hpet_reserve_msi_timers(&hd);
 	hpet_alloc(&hd);
--
applied to tip/timers/hpet-percpu, thanks Venki!

Ingo
--
