Question: do we disable all CPUs except 0 when doing ACPI power off?
Background:
I have a machine here dedicated to running MythTV.
It powers up to record, and then sets the RTC alarm for next time
and powers down again in between recordings.

It has an Intel Core2duo E6300 CPU, currently on an ICH8 motherboard.
Previously it was on a completely different (vendor,bios,...) ICH7 motherboard.

In both cases, "halt -p" sometimes fails to actually turn off the power,
which means that it later then fails to "turn on" to record again. Annoying.
This is a 32-bit kernel/runtime, with full ACPI (not APM) kernel support enabled.
So I'm wondering if it may be due to the old SMP-poweroff bogeyman?
For now, I've hardcoded a cpu_down(1) into the poweroff code,
and we'll see if that helps or is merely redundant.

But I do wonder where else to look for a cause?
Two different boards, vendors, BIOSs, same CPU chip. Same problem.
????
-
May be.
Same chipset, perchance?
Greetings,
Rafael
-
We used to.
It is absolutely mandatory -- else it confuses the BIOS on some boards
b/c it isn't expecting SMM to get entered from other than cpu0.
-Len
-
Can we use the CPU hotplug for that, like in the suspend/hibernation case?
Rafael
-
Well, so far it's working: about ten poweroffs since I patched it,
and no issues with any of them. Prior to that, it seemed like about
one in five poweroffs wouldn't (power off).

It'll take a lot more testing to confirm, though.
What can I call to determine if more than one CPU is enabled, anyway?
Here's the hack I'm using here, very situation (2 cores) specific,
and it still has some printk's leftover with a sleep so I have time
to read them before the lights go out. :)

--- old/arch/i386/kernel/reboot.c 2007-09-27 17:17:00.000000000 -0400
+++ linux/arch/i386/kernel/reboot.c 2007-09-27 17:15:35.000000000 -0400
@@ -393,8 +393,22 @@
 	.halt = native_machine_halt,
 };

+static void kill_cpu1(void)
+{
+	extern int cpu_down(unsigned int cpu);
+
+	printk(KERN_EMERG "kill_cpu1: was running on CPU%d\n", smp_processor_id());
+	/* Some bioses don't like being called from CPU != 0 */
+	set_cpus_allowed(current, cpumask_of_cpu(0));
+	printk(KERN_EMERG "kill_cpu1: now running on CPU%d\n", smp_processor_id());
+	cpu_down(1);
+	printk(KERN_EMERG "kill_cpu1: done\n");
+	msleep(1000);
+}
+
 void machine_power_off(void)
 {
+	(void)kill_cpu1();
 	machine_ops.power_off();
 }
-
Well, we have disable_nonboot_cpus() that we use for suspend and that is
supposed to be general.

The question is when to call it.
Greetings,
Rafael
-
How about from the obvious candidate: kernel/sys.c::kernel_power_off() ?
I'll rework my hack into a proper patch there,
repost it here, and test with it for another day or two.
-ml
-
We need to disable all CPUs other than the boot CPU (usually 0)
before attempting to power-off modern SMP machines.
This seems to fix the hang-on-poweroff issue
that one of my SMP boxes exhibits. More testing required.

Signed-off-by: Mark Lord <mlord@pobox.com>
---
--- linux/kernel/sys.c.orig 2007-09-13 09:49:11.000000000 -0400
+++ linux/kernel/sys.c 2007-09-28 09:48:54.000000000 -0400
@@ -32,6 +32,7 @@
 #include <linux/getcpu.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/seccomp.h>
+#include <linux/cpu.h>

 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -879,6 +880,7 @@
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
 	sysdev_shutdown();
+	disable_nonboot_cpus();
 	printk(KERN_EMERG "Power down.\n");
 	machine_power_off();
 }
-
Fixes my new toybox as well. Thanks for tracking it down before I had to
dig in.
-
Thanks. Here is the revised patch.
* * *
We need to disable all CPUs other than the boot CPU (usually 0)
before attempting to power-off modern SMP machines.
This fixes the hang-on-poweroff issue on my MythTV SMP box,
and also on Thomas Gleixner's new toybox.

Signed-off-by: Mark Lord <mlord@pobox.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
---
--- linux/kernel/sys.c.orig 2007-09-13 09:49:11.000000000 -0400
+++ linux/kernel/sys.c 2007-09-28 15:48:54.000000000 -0400
@@ -32,6 +32,7 @@
 #include <linux/getcpu.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/seccomp.h>
+#include <linux/cpu.h>

 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -878,6 +879,7 @@
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
+	disable_nonboot_cpus();
 	sysdev_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	machine_power_off();
-
Doesn't fix for me!
I have an Athlon x2 running on an Asus A8N-E mobo which has an NForce 4
chipset. I thought this patch would fix poweroff for me too, but it doesn't.

I'm seeing this on 2.6.23-rc8 with and without your patch. Here is what I get
on the console:

Shutdown: hdd
Shutdown: hda
System halted.

Nothing else pops up.
Without your patch, if I pressed Ctrl+Alt+Del after the "System halted."
message, I could then see these messages:

md: stopping all md devices.
md: md1 still in use.

But no reboot would take place. I have not tested this Ctrl+Alt+Del thing
with your patch, but I think it still behaves like that (not rebooting).

I'm attaching my .config.
Regards...
--
Manty/BestiaTester -> http://manty.net
-
I'd say your problem is more of a distro issue,
in that the method you are using to shutdown
is not actually requesting "poweroff".

That last message above ("System halted.") comes from kernel_halt(),
rather than the expected message ("Power down.") from kernel_power_off().

So, try using the "poweroff" command instead of "halt",
or try using "halt -p". If neither of those work,
then edit /etc/init.d/halt and hardcode the "-p" parameter
inside there onto the "halt" command line(s).

I had to do that frequently back in the Redhat/Fedora days.
I'm sure they have a nice GUI for it somewhere,
but at the time it was simpler to just edit the script.

Cheers
-
Well, it works OK with 2.6.22, powering off and saying so right before
powering off, with some references to ACPI. On 2.6.23-rc8, however, it doesn't
seem to get that far.

I have followed the poweroff of my distro (Debian unstable), and on getting
to the end of rc 0 it calls halt with options -d -f -i -p. So it does call
it with the -p you asked for. BTW, this halt comes from Debian's sysvinit
version 2.86.ds1-38.1 in case that matters, but as I said it is working ok

That init.d/halt is the one that is calling halt with -d -f -i -p already.
Regards...
--
Manty/BestiaTester -> http://manty.net
-
There was a bug in 2.6.23-rc8 that caused this to happen.
It's been fixed in the later -git kernels
(commits 2f3f22269bdf702311342c5d106dfdd7347d1c3e,
853298bc03ef65e3eb392f5d61265605214ee8fb).

Greetings,
Rafael
-
You are right, I have downloaded current head of git and seems to work ok.
Sorry for the lost time :-(
Regards...
--
Manty/BestiaTester -> http://manty.net
-
Mmm.. then I wonder why it is not actually getting into the kernel's poweroff function?
Can you boot into single user mode ( kernel command line parameter of S ),
and then remount your filesystems all r/o (ALT-SysRQ-u + ALT-SysRQ-s,
or do it all manually if you prefer).

Then manually try this command from the primary console (Ctrl-ALT-F1):
/sbin/halt -f -p
The machine *should* poweroff.
If not, then do the whole thing again with this command:

strace /sbin/halt -f -p
And see what the final syscall and its parameters are. Post them here if you can.
Cheers
-
I booted into single mode, then umounted all unneeded stuff and put / to ro.
I tried this, and I did the strace with -o to get the output on a remote
cifs fs. Here is the full trace:

execve("/sbin/halt", ["/sbin/halt", "-f", "-p"], [/* 16 vars */]) = 0
brk(0) = 0x804b000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f29000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=52448, ...}) = 0
mmap2(NULL, 52448, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f1c000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260a\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1335536, ...}) = 0
mmap2(NULL, 1340944, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7dd4000
mmap2(0xb7f16000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x142) = 0xb7f16000
mmap2(0xb7f19000, 9744, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f19000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7dd3000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7dd36b0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7f16000, 4096, PROT_READ) = 0
munmap(0xb7f1c000, 52448) = 0
geteuid32() = 0
chdir("/") = 0
open("/var/log/wtmp", O_WRONLY|O_APPEND) = -1 EROFS (Read-only file system)
sync() = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8)...
-
Mmm.. okay, user space is doing the right things.
So next is inside the kernel itself, at linux/kernel/sys.c :: sys_reboot(),
where we see this code:

	/* Instead of trying to make the power_off code look like
	 * halt when pm_power_off is not set do it the easy way.
	 */
	if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
		cmd = LINUX_REBOOT_CMD_HALT;

This converts a "poweroff" into a "reboot" if no machine dependent
power off function has been bound in (pm_power_off() is a function pointer).

So for this to work, I believe that either ACPI or APM has to have been
configured into the kernel (and the modules loaded). Your kernel .config
from earlier shows ACPI built-in to the kernel core, so it should be present.

Unless you booted with noacpi or some such parameter..
So let's have a look at the kernel boot logs,
and you could also try CONFIG_ACPI_DEBUG=y

Bizarre (and nothing to do with my patch).
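
As a rough illustration of the fallback in the sys_reboot() snippet quoted above, here is a small userspace sketch; it is not the kernel code. The two command values are the ones defined in <linux/reboot.h>. The function name effective_reboot_cmd and the have_pm_power_off flag are made up for this example; the flag stands in for "the pm_power_off function pointer is non-NULL". As the quoted code shows, the command is downgraded to a halt.

```c
#include <stdbool.h>

/* Command values as defined in <linux/reboot.h>. */
#define LINUX_REBOOT_CMD_HALT      0xCDEF0123u
#define LINUX_REBOOT_CMD_POWER_OFF 0x4321FEDCu

/*
 * Hypothetical helper (not a kernel function): returns the command that
 * would actually be carried out. have_pm_power_off models the check
 * "pm_power_off != NULL" from sys_reboot().
 */
unsigned int effective_reboot_cmd(unsigned int cmd, bool have_pm_power_off)
{
	/* No power-off handler registered: fall back to a plain halt. */
	if (cmd == LINUX_REBOOT_CMD_POWER_OFF && !have_pm_power_off)
		cmd = LINUX_REBOOT_CMD_HALT;
	return cmd;
}
```

With no handler bound, a power-off request comes back as LINUX_REBOOT_CMD_HALT, which would match a console that stops at "System halted." instead of printing "Power down.".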
-
Yes, it is indeed present: acpid is running and it detects my power button.
I believe this is normal. I have done a grep -i acpi on the dmesg; here is
the result (if you want the full dmesg, tell me):

BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
ACPI: RSDP 000F7560, 0014 (r0 Nvidia)
ACPI: RSDT 3FFF3040, 0030 (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
ACPI: FACP 3FFF30C0, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
ACPI: DSDT 3FFF3180, 65F2 (r1 NVIDIA AWRDACPI 1000 MSFT 100000E)
ACPI: FACS 3FFF0000, 0040
ACPI: MCFG 3FFF9880, 003C (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
ACPI: APIC 3FFF97C0, 007C (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
Nvidia board detected. Ignoring ACPI timer override.
If you got timer trouble try acpi_use_timer_override
ACPI: PM-Timer IO Port: 0x4008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: BIOS IRQ0 pin2 override ignored.
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Using ACPI (MADT) for SMP configuration information
ACPI: Core revision 20070126
tbxface-0598 [00] tb_load_namespace : ACPI Tables successfully acquired
evxfevnt-0091 [00] enable : Transition to ACPI mode successful
ACPI: bus type pci registered
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interru...
..
The output is missing a line like this, which should have been between the two above:
ACPI: (supports S0 S3 S4 S5)
The ACPI power-off function only gets bound into pm_power_off()
when that line shows S5 on it.

The only way that line can be missing, is if something disabled ACPI
after boot.

This patch (below) should find the culprit for you:
---
--- old/include/asm-i386/acpi.h 2007-09-28 18:09:14.000000000 -0400
+++ linux/include/asm-i386/acpi.h 2007-10-01 12:35:23.000000000 -0400
@@ -97,6 +97,7 @@
 extern int acpi_pci_disabled;
 static inline void disable_acpi(void)
 {
+	WARN_ON(1);
 	acpi_disabled = 1;
 	acpi_ht = 0;
 	acpi_pci_disabled = 1;
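
The log check described above (look for the "ACPI: (supports ...)" line and see whether it lists S5) could be sketched in userspace roughly like this. The helper name acpi_line_supports_s5 is hypothetical, and the line format is taken only from the example quoted in this thread; other kernel versions may print it differently.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if `line` is the ACPI sleep-state
 * summary line and it lists S5 (soft-off).
 */
bool acpi_line_supports_s5(const char *line)
{
	static const char prefix[] = "ACPI: (supports";
	const char *start = strstr(line, prefix);

	if (start == NULL)
		return false;
	/* Search for " S5" only in the text that follows the prefix. */
	return strstr(start + sizeof(prefix) - 1, " S5") != NULL;
}
```

Feeding it each line of the boot log and finding no match would suggest that the ACPI power-off handler was never bound.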
-
Duh.. fingers failed to follow brain: that converts a "poweroff" into a "halt",
-
Before sysdev_shutdown(), please.
Greetings,
Rafael
-
Damn, you're right. Missed that.
tglx
-
Okay, verified now. Prior to this patch, *both* CPUs were still up and running
when machine_power_off() got called, and there was no guarantee that CPU0 was
the one calling machine_power_off(). BUG.

The above patch guarantees that only the single boot CPU is running
and calling machine_power_off().

Hopefully this buries the SMP-power-off bogeyman for good!
Cheers
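
To make the before/after state described above concrete, here is a toy userspace model; it is not kernel code. All names prefixed with model_ are invented for this sketch, the two-CPU layout is an assumption matching the dual-core machines in this thread, and the real disable_nonboot_cpus() does considerably more work than flipping flags.

```c
#include <stdbool.h>

#define NR_CPUS 2                /* assumption: a dual-core box */

static bool cpu_online[NR_CPUS] = { true, true };
static int current_cpu = 1;      /* pretend the shutdown task started on CPU 1 */

/* Stand-in for disable_nonboot_cpus(): end up on CPU 0, offline the rest. */
void model_disable_nonboot_cpus(void)
{
	current_cpu = 0;
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		cpu_online[cpu] = false;
}

/* Counts the CPUs still marked online in the model. */
int model_num_online_cpus(void)
{
	int count = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			count++;
	return count;
}

/* True only if the power-off call would come from CPU 0 with no other CPU up. */
bool model_power_off_is_safe(void)
{
	return current_cpu == 0 && model_num_online_cpus() == 1;
}
```

In this model the check fails until model_disable_nonboot_cpus() has run, which mirrors why the call has to come before machine_power_off().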
-
Latest 2.6.23-rc-git. Same problem from time to time on 2.6.17, as well.
Dunno about in between those Revs., but it's much more common on the latest

Mmmm I originally didn't think so.
But actually one board is ICH8, the other ICH8R,
so yes, they use the same chipset.

Cheers
-
Oh, and two different power-supplies, too.
-ml
-
