Re: PCIE ASPM support hangs my laptop pretty often

To: Shaohua Li <shaohua.li@...>
Cc: Greg KH <greg@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Tuesday, February 5, 2008 - 1:40 pm

I've patched my kernel with the PCIe ASPM patch, and after setting
echo powersave > /sys/module/pcie_aspm/parameters/policy

I started to experience random hangs of my laptop.
Hardware info:
ThinkPad X60s 1704-5UG
also tested on a friend's X60s 1702-F6U

Kernel is 2.6.24 + these patches:
tuxonice 3.0-rc5
thinkpad_acpi v0.19-20080107
tp_smapi 0.36
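
A minimal sketch for confirming which policy is active before and after the
write shown above. It assumes the policy file lists every policy on one line
with the active one in square brackets (for example
"default performance [powersave]"); that format and the helper names are
assumptions, not something stated in this thread.

```python
# Path taken from the echo command in the report above.
POLICY_PATH = "/sys/module/pcie_aspm/parameters/policy"


def parse_policy(text):
    """Split the policy line into (active, available).

    Assumed format: space-separated names, active one wrapped in [].
    Returns active=None if no name is bracketed.
    """
    names = text.split()
    available = [n.strip("[]") for n in names]
    active = next(
        (n.strip("[]") for n in names if n.startswith("[") and n.endswith("]")),
        None,
    )
    return active, available


def read_policy(path=POLICY_PATH):
    """Read and parse the policy file; needs a kernel built with ASPM support."""
    with open(path) as f:
        return parse_policy(f.read())
```

Calling read_policy() right after boot shows the policy the kernel started
with, which is useful to include in a hang report.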

--
Damjan Georgievski
--

To: Дамјан <penguinista@...>
Cc: Greg KH <greg@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>, Kok, Auke <auke-jan.h.kok@...>
Date: Tuesday, February 19, 2008 - 1:55 am

Hi,
Sorry for the long delay, I'm just back from vacation. Some devices or
chipsets don't work well with ASPM. This is one of the reasons why the
default policy of the patch follows the BIOS setting. Ideally drivers should
disable ASPM for specific devices; the patch provides an API
(pci_disable_link_state) for this too. As Auke suggested, you can use
the per-device interface to control individual links to see which device
is broken. If you find one, please report it to the driver maintainer and me,
so we can disable ASPM in the driver.

Thanks,
Shaohua

--

To: Дамјан Георгиевски <penguinista@...>
Cc: Shaohua Li <shaohua.li@...>, Greg KH <greg@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Tuesday, February 5, 2008 - 2:46 pm

On Tue, 5 Feb 2008 18:40:04 +0100

--
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
--

To: Arjan van de Ven <arjan@...>
Cc: Дамјан Георгиевски <penguinista@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Tuesday, February 5, 2008 - 2:51 pm

Well, the code shouldn't then cause a crash of the machine :)

thanks,

greg k-h
--

To: Greg KH <greg@...>
Cc: Arjan van de Ven <arjan@...>, Дамјан Георгиевски <penguinista@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Tuesday, February 5, 2008 - 4:58 pm

The user enabled it specifically (where it is disabled by default)

ASPM has been crashing e1000(e), which is why I've recently merged a patch to
disable L1 ASPM for the onboard 82573 NIC on those platforms.

this new infrastructure should work in the default configuration - enabling ASPM
where this system leaves it disabled is expected to give problems unless you know
what you are doing.

Auke
--

To: Kok, Auke <auke-jan.h.kok@...>
Cc: Greg KH <greg@...>, Arjan van de Ven <arjan@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Tuesday, February 5, 2008 - 8:05 pm

In my defense, the patch documentation didn't say it doesn't work with my
hardware, nor that it hangs the chipset :) and the promised 1.3 W surely
looked nice.

So, are there any benefits of ASPM if I have it in the kernel but it's set to
default? I got the impression that "default" means not much power savings?

--
Damjan Georgievski
Free Software Macedonia
--

To: Дамјан Георгиевски <penguinista@...>
Cc: Kok, Auke <auke-jan.h.kok@...>, Greg KH <greg@...>, Arjan van de Ven <arjan@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Tuesday, February 5, 2008 - 8:22 pm

did the Kconfig not come with a big fat (EXPERIMENTAL) ?

it actually depends for each device on the PCI-Express bus. Most PCI-E ports
support it but the device has the option of advertising enablement of that
capability or not.

both platform and each device on the pci-e bus are involved. some sata chipsets
work great with it, some that might not even advertise the capability... but it's
really hit and miss.

Your report is great of course, no doubt about it. I hope that people understand
that this feature can seriously break things at the bus level. It makes me feel a
lot better about the issues we had with some of our network cards and ASPM :)

once we get some feeling for how well ASPM works in the field for people, we
might have to blacklist certain platforms or devices.

you could (for instance) try to see which devices on your buses support ASPM and
work on per-device ASPM parameters (which is one of the things I suggested before)
so that we get an idea of which device is badly behaving with ASPM on your system.
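
A minimal sketch of that first step, assuming the `lspci -vv` output format
of pciutils, where each PCIe device has a "LnkCap:" line listing the ASPM
states it advertises and a "LnkCtl:" line showing what is currently enabled.
The function name and the returned dictionary layout are hypothetical.

```python
import re
import subprocess

# A device block starts with an unindented "bus:dev.fn description" line.
DEVICE_RE = re.compile(r"^([0-9a-fA-F:.]+) (.+)$")
# e.g. "LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency ..."
LNKCAP_RE = re.compile(r"LnkCap:.*?ASPM ([^,;]+)")
# e.g. "LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+"
LNKCTL_RE = re.compile(r"LnkCtl:\s*ASPM ([^;]+)")


def parse_aspm(lspci_text):
    """Map each device address to its advertised and current ASPM state."""
    devices = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            m = DEVICE_RE.match(line)
            current = m.group(1) if m else None
            if current:
                devices[current] = {
                    "name": m.group(2),
                    "supported": None,  # from LnkCap, None if not a PCIe device
                    "state": None,      # from LnkCtl
                }
            continue
        if current is None:
            continue
        cap = LNKCAP_RE.search(line)
        if cap:
            devices[current]["supported"] = cap.group(1).strip()
        ctl = LNKCTL_RE.search(line)
        if ctl:
            devices[current]["state"] = ctl.group(1).strip()
    return devices


if __name__ == "__main__":
    # Run as root, otherwise lspci may hide the capability list.
    out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout
    for addr, info in parse_aspm(out).items():
        if info["supported"]:
            print(addr, info["name"], "|", info["supported"], "|", info["state"])
```

Devices that list L0s or L1 as supported are the candidates to toggle one at
a time when hunting for the link that hangs.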

Cheers,

Auke

--

To: Kok, Auke <auke-jan.h.kok@...>
Cc: Дамјан Георгиевски <penguinista@...>, Greg KH <greg@...>, Arjan van de Ven <arjan@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Wednesday, February 6, 2008 - 9:00 am

(EXPERIMENTAL) is something different from (KNOWN BROKEN).

If we know about broken setups, we should probably be blacklisting
them.

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: Pavel Machek <pavel@...>
Cc: Kok, Auke <auke-jan.h.kok@...>, Дамјан Георгиевски <penguinista@...>, Greg KH <greg@...>, Arjan van de Ven <arjan@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Wednesday, February 6, 2008 - 1:25 pm

Well, the ASPM thing seems to break every single setup I've tested. So,
perhaps we should whitelist the working ones?

Rafael
--

To: Rafael J. Wysocki <rjw@...>
Cc: Pavel Machek <pavel@...>, Kok, Auke <auke-jan.h.kok@...>, Дамјан Георгиевски <penguinista@...>, Greg KH <greg@...>, Arjan van de Ven <arjan@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Wednesday, February 6, 2008 - 5:46 pm

Greg KH is reverting this patch altogether in mainline; maybe the original author
can accommodate some of the comments in the rewrite.

Auke

--

To: Kok, Auke <auke-jan.h.kok@...>
Cc: Rafael J. Wysocki <rjw@...>, Pavel Machek <pavel@...>, Дамјан Георгиевски <penguinista@...>, Arjan van de Ven <arjan@...>, Shaohua Li <shaohua.li@...>, lkml <linux-kernel@...>, linux-pci <linux-pci@...>
Date: Wednesday, February 6, 2008 - 5:58 pm

It's already reverted.

thanks,

greg k-h
--
