Re: [PATCH] ACPI PM: Restore the 2.6.24 suspend ordering

Previous thread: Re: [PATCH 3/3] [2.6.25-rc5] Warn user about a BIOS bug in recent asus boards by Yinghai Lu on Saturday, March 29, 2008 - 5:43 pm. (2 messages)

Next thread: Send-Q on UDP socket growing steadily - why? by Deomid Ryabkov on Saturday, March 29, 2008 - 10:43 pm. (3 messages)
From: Rafael J. Wysocki
Date: Saturday, March 29, 2008 - 6:19 pm

Hi Len,

Please consider pushing the appended patch for 2.6.25.

It fixed the regression described at:
https://bugzilla.novell.com/show_bug.cgi?id=374217
http://bugzilla.kernel.org/show_bug.cgi?id=10340

details in the changelog.

Thanks,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>

Some time ago it turned out that our suspend code ordering broke
some NVidia-based systems that hung if _PTS was executed with one of
the PCI devices, specifically a USB controller, in a low power state.
Then, it was noticed that the suspend code ordering was not compliant
with ACPI 1.0, although it was compliant with ACPI 2.0 (and later),
and it was argued that the code had to be changed for that reason
(ref. http://bugzilla.kernel.org/show_bug.cgi?id=9528).  So we did,
but evidently we did wrong, because it's now turning out that some
systems have been broken by this change (refs.
http://bugzilla.kernel.org/show_bug.cgi?id=10340 ,
https://bugzilla.novell.com/show_bug.cgi?id=374217#c16).  [I said
at that time that something like this might happend, but the majority
of people involved thought that it was improbable due to the
necessity to preserve the compliance of hardware with ACPI 1.0.]
This actually is a quite serious regression from 2.6.24.

Moreover, the ACPI 1.0 ordering of suspend code introduced another
issue that I have only noticed recently.  Namely, if the suspend of
one of devices fails, the already suspended devices will be resumed
without executing _WAK before, which leads to problems on some
systems (for example, in such situations thermal management is
broken on my HP nx6325).  Consequently, it also breaks suspend
debugging on the affected systems.

Note also, that the requirement to execute _PTS before suspending
devices does not really make sense, because the device in question
may be put into a low power state at run time for a reason unrelated
to a system-wide suspend.

For the reasons outlined above, the change of the suspend ordering
should be reverted, ...
From: Pavel Machek
Date: Sunday, March 30, 2008 - 4:18 am

But this will break those few nvidia-based systems, no?

this may have been a good idea in -rc1 days, but we are in -rc7
now... and the patch is slightly big.

What about something like: (hand-edited patch, sorry)



 Index: linux-2.6/drivers/acpi/sleep/main.c
 ===================================================================
 --- linux-2.6.orig/drivers/acpi/sleep/main.c
 +++ linux-2.6/drivers/acpi/sleep/main.c
 @@ -26,21 +26,6 @@ u8 sleep_states[ACPI_S_STATE_COUNT];
  
  #ifdef CONFIG_PM_SLEEP
  static u32 acpi_target_sleep_state = ACPI_STATE_S0;
 static bool acpi_sleep_finish_wake_up;
 
- /*
-  * ACPI 2.0 and later want us to execute _PTS after suspending devices, so we
-  * allow the user to request that behavior by using the 'acpi_new_pts_ordering'
-  * kernel command line option that causes the following variable to be set.
-  */
 static bool new_pts_ordering = true;
 
 -static int __init acpi_new_pts_ordering(char *str)
 +static int __init acpi_old_pts_ordering(char *str)
 {
 	new_pts_ordering = false;
 	return 1;
 }
 -__setup("acpi_old_pts_ordering", acpi_old_pts_ordering);
 +__setup("acpi_new_pts_ordering", acpi_new_pts_ordering);
  #endif
 
  static int acpi_sleep_prepare(u32 acpi_state)
 Index: linux-2.6/Documentation/kernel-parameters.txt
 ===================================================================
 --- linux-2.6.orig/Documentation/kernel-parameters.txt
 +++ linux-2.6/Documentation/kernel-parameters.txt
 @@ -170,11 +170,6 @@ and is between 256 and 4096 characters. 
  	acpi_irq_isa=	[HW,ACPI] If irq_balance, mark listed IRQs used by ISA
  			Format: <irq>,<irq>...
  
 -	acpi_new_pts_ordering [HW,ACPI]
 +	acpi_old_pts_ordering [HW,ACPI]
 -			Enforce the ACPI 2.0 ordering of the _PTS control
 +			Enforce the ACPI 1.0 ordering of the _PTS control
 			method wrt putting devices into low power states
 -			default: pre ACPI 2.0 ordering of _PTS
 +			default: ACPI 2.0 ordering of _PTS
 
  	acpi_no_auto_ssdt	[HW,ACPI] Disable automatic loading ...
From: Rafael J. Wysocki
Date: Sunday, March 30, 2008 - 4:58 am

Well, I think that would be confusing.

The NVidia systems are broken anyway on 2.6.24.x, so we just don't fix them
rather than break them and there are more reasons to do what the patch does
(as pointed out in the changelog).  For example, your suggested patch doesn't
 fix the error paths/debugging breakage described in the changelog.

I think we _can_ do something about the failing NVidia systems in the 2.6.26

Thanks,
Rafael
--

From: Pavel Machek
Date: Sunday, March 30, 2008 - 5:28 am

Yes, but if we are putting them into lowpower state ourselves, we
should probably be doing that "by hand", without calling acpi
methods. _PTS may prepare something for acpi methods (which tell us



We could simply blacklist them, no?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Rafael J. Wysocki
Date: Sunday, March 30, 2008 - 6:15 am

I meant "the requirement to execute _PTS before suspending devices, because

Breaking systems between 2.6.24 and 2.6.25 is even worse, which is why
I've posted this patch.

IOW, we tried to fix systems that were broken with 2.6.24, but it didn't work,
because our "fix" broke systems that were OK with 2.6.24.  Solution: revert
the "fix" and go back to the design board.  That's all we can do so late in


Yes, but for this purpose we'll have to redesign the core so that everything
(including debugging and the error paths) works if _PTS is executed before
suspending devices.  _That_, however, is not a 2.6.25 thing.

Thanks,
Rafael
--

From: Pavel Machek
Date: Tuesday, April 1, 2008 - 1:45 am

Well, I agree that regression from 2.6.24 is worse, but it is
_slightly_ worse... -rcs are really expected to improve...

...plus it no longer looks like macbook regression is caused by _PTS

So we have solution that fixes 2.6.24 systems, makes system that
worked in 2.6.25-rc5 work with command line option, but gets error
handling wrong.

I guess we could use that?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Felix Möller
Date: Tuesday, April 1, 2008 - 7:38 am

I am the reporter from the original Novell Bug: 
https://bugzilla.novell.com/show_bug.cgi?id=374217

I just tried current git head (two hours ago) with the patch (the one 
from the beginning of this thread) from Rafael and without it. With the 
patch my MacBook does suspend without it does not.

HTH
Felix Möller
--

From: Rafael J. Wysocki
Date: Tuesday, April 1, 2008 - 12:55 pm

IMO we should not use that, because it's broken.  That's why I posted the
patch.

Thanks,
Rafael
--

From: Pavel Machek
Date: Wednesday, April 2, 2008 - 6:00 am

I understand that. World will not fall down one way or another, but
I'd slightly prefer the 'only change default' version. That's why I
objected to your patch :-).
							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

Previous thread: Re: [PATCH 3/3] [2.6.25-rc5] Warn user about a BIOS bug in recent asus boards by Yinghai Lu on Saturday, March 29, 2008 - 5:43 pm. (2 messages)

Next thread: