So yet another week, another -rc. This one should be the last one: we're
certainly not running out of regressions, but at the same time, at some
point I just have to pick some point, and on the whole the regressions
don't look _too_ scary. And -rc8 obviously does fix more of them.

Most of the changes since -rc7 are pretty small, and there aren't even a
whole lot of them. The shortlog (appended) is just a couple of pages, and
the diffstat is even smaller, but since the dirstat is a dense overview,
I'll just put that here instead:

4.6% arch/m32r/kernel/
5.7% arch/m32r/
9.5% arch/mips/pci/
10.4% arch/mips/
4.2% arch/x86/kernel/
4.4% arch/x86/
26.0% arch/
3.5% drivers/usb/storage/
10.4% drivers/usb/
3.6% drivers/watchdog/
23.8% drivers/
11.5% fs/xfs/
13.5% fs/
3.7% kernel/
9.8% net/9p/
10.6% net/
5.4% scripts/kconfig/
5.9% scripts/
7.4% sound/soc/codecs/
8.4% sound/soc/
10.1% sound/

and it's actually more spread out than usual. Arch and drivers are just
half of the patch even when combined.

Give it a try,
Linus
---
Adrian Bunk (5):
m32r: remove the unused NOHIGHMEM option
m32r: don't offer CONFIG_ISA
m32r: export empty_zero_page
m32r: export __ndelay
m32r/kernel/: cleanups

Adrian Hunter (2):
UBIFS: TNC / GC race fixes
UBIFS: remove incorrect assert

Akinobu Mita (2):
[WATCHDOG] ibmasr: remove unnecessary spin_unlock()
ibmasr: remove unnecessary spin_unlock()

Alan Cox (1):
pcmcia: Fix broken abuse of dev->driver_data

Alan Stern (2):
USB: unusual_devs addition for RockChip MP3 player
USB: revert recovery from transient errors

Alex Chiang (1):
[IA64] Ski simulator doesn't need check_sal_cache_flush

Alexander Beregalov (1):
UBIFS: fix printk format warnings

Alexander Duyck (1):
netdev: simple_tx_hash shouldn't hash inside fragments

Andrea Righi (1):
x86, oprofile: BUG schedu...
Hi....
Dealing with my Aspire One setup, I found this (so obvious I don't send a patch:)
arch/x86/kernel/cpu/mtrr/main.c:

static int __init disable_mtrr_cleanup_setup(char *str)
{
	if (enable_mtrr_cleanup != -1)
		enable_mtrr_cleanup = 0;
	return 0;
}
early_param("disable_mtrr_cleanup", disable_mtrr_cleanup_setup);

static int __init enable_mtrr_cleanup_setup(char *str)
{
	if (enable_mtrr_cleanup != -1)
		enable_mtrr_cleanup = 1;
	return 0;
}
early_param("enble_mtrr_cleanup", enable_mtrr_cleanup_setup);
             ^^^^^^

Nice ;)
--
J.A. Magallon <jamagallon()ono!com> \ Software is like sex:
\ It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
--
heh. Could you send a patch with a changelog please?
Ingo
--
These options are also named inconsistently with all other options.
The standard way to name a boolean option is "foo" versus "nofoo", in
this case, "mtrrcleanup" vs "nomtrrcleanup".

-hpa
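A minimal userspace sketch of the "foo" versus "nofoo" convention described above. The helper name parse_bool_option is hypothetical and is not a kernel API; it only illustrates how one option name can serve both the enabling and the disabling form, and why a misspelled token silently matches nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): match a command-line token
 * against a boolean option called NAME. "NAME" enables the option and
 * "noNAME" disables it. Returns true and stores the result in *value
 * when the token matches either form.
 */
static bool parse_bool_option(const char *token, const char *name, bool *value)
{
	assert(token != NULL && name != NULL && value != NULL);

	if (strcmp(token, name) == 0) {
		*value = true;
		return true;
	}
	if (strncmp(token, "no", 2) == 0 && strcmp(token + 2, name) == 0) {
		*value = false;
		return true;
	}
	/* Unknown or misspelled tokens never match and leave *value alone. */
	return false;
}
```

With this scheme, "mtrrcleanup" and "nomtrrcleanup" are both recognized from a single registered name, while a typo such as "enble_mtrr_cleanup" is simply ignored, which is how the original bug went unnoticed.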
--
ok, we could change it...
YH
--
If we're fixing a typo anyway I'd suggest so. We know we're not
breaking anyone's working setup...

-hpa
--
mtrr_cleanup and no_mtrr_cleanup?
YH
--
Dashes seem to be used more than underscores, so it probably should be
"mtrr-cleanup" and "nomtrr-cleanup" if you want a separator.

-hpa
--
I need to document mtrr_cleanup_debug too... change it to
mtrrcleanup_debug, just like initcall_debug?

YH
--
I would prefer "mtrr-cleanup-debug" if the main one is "mtrr-cleanup";
mixing dashes and underscores is a bit sick. Unfortunately we have had
very few attempts at consistency with command line options... some in
the early days were even StudlyCaps (yuck...)

-hpa
--
Here it goes... I hope it's right.

==================
Correct typo for 'enable_mtrr_cleanup' early boot param name.

Signed-off-by: J.A. Magallon <jamagallon@ono.com>

diff -p -up linux/arch/x86/kernel/cpu/mtrr/main.c.orig linux/arch/x86/kernel/cpu/mtrr/main.c
--- linux/arch/x86/kernel/cpu/mtrr/main.c.orig 2008-09-30 09:57:46.000000000 +0200
+++ linux/arch/x86/kernel/cpu/mtrr/main.c 2008-09-30 09:57:55.000000000 +0200
@@ -834,7 +834,7 @@ static int __init enable_mtrr_cleanup_se
 		enable_mtrr_cleanup = 1;
 	return 0;
 }
-early_param("enble_mtrr_cleanup", enable_mtrr_cleanup_setup);
+early_param("enable_mtrr_cleanup", enable_mtrr_cleanup_setup);
 
 struct var_mtrr_state {
 	unsigned long range_startk;
--
J.A. Magallon <jamagallon()ono!com> \ Software is like sex:
\ It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
--
applied to tip/x86/urgent, thanks!
Ingo
--
Ingo, why did you require a patch? Wasn't it really simpler and
easier for everyone to write it yourself? Since I am sure it was not only
a matter of laziness (really?), I am very curious to know the reason.

Thank you,
Domenico

-----[ Domenico Andreoli, aka cavok
--[ http://www.dandreoli.com/gpgkey.asc
---[ 3A0F 2F80 F79C 678A 8936 4FEE 0677 9033 A20E BC50
--
I see two things:
- preserve authorship of the code
- "laziness", as you call it, is the only way to scale for a maintainer.

Willy
--
yeah, correct. Also, i asked (not required) J.A. Magallón whether he
could send a patch - if he didn't (no time, etc.) i'd have fixed it
myself (crediting him in the changelog).

But it's also a general principle: maintainers don't 'own' the code in
any way and there should be no asymmetry in the ability to modify the
code. So if people are willing to fix bugs they notice, i prefer that
far more than me doing it.

Ingo
--
I think I got the lesson, although the asymmetry matter is still not that
clear to me. Anyway, I also know that when you talk about code you prefer
patches to plain English, so I expect you'd like others to do the same ;)

Thank you,
Domenico

-----[ Domenico Andreoli, aka cavok
--[ http://www.dandreoli.com/gpgkey.asc
---[ 3A0F 2F80 F79C 678A 8936 4FEE 0677 9033 A20E BC50
--
If 2.6.27 is released with e1000e driver corrupting EEPROM contents on
many systems out there, rendering the cards unusable for most of the
i-am-not-a-hacker users (and remember, even Dave Airlie bricked his laptop
completely to death, when trying to restore eeprom contents), well, I
personally find that very scary.

Intel is working with us on tracking down and resolving the issue, but
this is not going as well as one would like to see (one attempt, one card
with completely hosed EEPROM contents ... and restoring the contents is
not *that* trivial).

Intel has some patches to mitigate the symptoms (even though we still
don't know who is causing the breakage, but Xorg is the biggest suspect in
my eyes), but they are neither in your tree nor in any other maintainer's
queue yet, as far as I know.

--
Jiri Kosina
SUSE Labs
--
What's the magic to trigger it? I've got a laptop with that e1000e chip in
it, and am obviously running a recent kernel on it. Do people have a
handle on it? Is it actually verified to be kernel-related, and not
related to the X server etc?

Linus
--
So far it seems to be that you need 1) something close to xorg 7.4 and
2) 2.6.27-rcX kernel to trigger it. Not every system having e1000e is
affected.

Apparently it is some kind of race, as it usually takes multiple cycles to
trigger (on one of our testing machines this took three attempts to
trigger for the first time, and then after unbricking the machine and
restarting testing, the reproduction tests have been running for several
hours).

It always seems to happen when X is probing/initializing the graphics
card. So it really seems to be some badness in Xorg intel driver
initialization code, and kernel/hardware allows bad things to happen.

Last time I heard, our X developers are suspecting vbeinit initialization
code in Intel driver and are looking into it.

Also, we are going to release next opensuse/SLES beta with patches that
should mitigate the problem (Jesse has posted a new version of them), so
hopefully we will then receive some stacktraces from the users who are
able to trigger the problem more easily.

--
Jiri Kosina
SUSE Labs
--
And this e1000e must be ICH*, right? I.e. not a separate e1000e
chip/card?
--
Krzysztof Halasa
--
So far all the affected systems I am aware of were ICH.
--
Jiri Kosina
SUSE Labs
--
Ditto here, i.e. we have no similar reports on other parts.
-----Original Message-----
From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Jiri Kosina
Sent: Tuesday, September 30, 2008 7:11 AM
To: Krzysztof Halasa
Cc: Linus Torvalds; Linux Kernel Mailing List; Brandeburg, Jesse
Subject: Re: Linux 2.6.27-rc8

So far all the affected systems I am aware of were ICH.
--
Jiri Kosina
SUSE Labs
----
My current status mail was posted earlier today to lkml from this
address. Since then we've had a local reproduction and are going for
number two. The reproduction seems racy, i.e. it doesn't happen every
time, so we put it in a loop doing detect, check eeprom, detect, etc.,
and we'll see if it fails.

Reproduction seems to consistently be around X probing time, no firm
leads yet. As for Intel we have keithp and jbarnes as well as arjan,
auke, myself and a few others involved.

We have some patches to lock the nvm down, we'll be posting those
tonight and tomorrow. I also have some debug logic (and fixes) to help
prove that, as we think, it's not a race in e1000e.
--
Jesse
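The message does not say how the "check eeprom" step of that loop is done. One cheap way to detect corruption, sketched here under the assumption that the part follows the usual Intel e1000-family convention (the 16-bit words 0x00 through 0x3F of the NVM sum to 0xBABA), is to validate the checksum of a dumped image. The function and macro names are hypothetical, and this is a userspace illustration rather than the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NVM_CHECKSUM_WORDS 64      /* words 0x00..0x3F, assumed layout */
#define SKETCH_NVM_CHECKSUM_SUM   0xBABA  /* assumed e1000-family target sum */

/*
 * Hypothetical check: return true if the first 64 words of a dumped NVM
 * image add up (modulo 2^16) to the expected checksum value.
 */
static bool nvm_checksum_ok(const uint16_t *words, size_t count)
{
	uint16_t sum = 0;
	size_t i;

	assert(words != NULL);

	if (count < SKETCH_NVM_CHECKSUM_WORDS)
		return false;	/* image too short to validate */

	for (i = 0; i < SKETCH_NVM_CHECKSUM_WORDS; i++)
		sum = (uint16_t)(sum + words[i]);

	return sum == SKETCH_NVM_CHECKSUM_SUM;
}
```

A test loop could dump the image after every X start and stop at the first iteration where this returns false. A checksum only catches corruption that changes the sum, so comparing against a known-good dump would be a stricter check.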
--
Can we get the simple debug patches including the fixes which resulted
from them pushed upstream ASAP?

Thanks,
tglx
--
On Tue, Sep 30, 2008 at 11:56 AM, Linus Torvalds
If we had the magic we'd have fixed it by now. The current working
theory is that it's X server related. This hasn't been proven, though my
ATI GPU e1000e seems fine, so it may have some legs.

If it is X related then it's both a kernel + X server issue: the e1000e
driver opens the barn door, the X server drives the horses through it.

Of course until someone produces a way to fix the hw after it breaks,
reproducing this isn't something for the faint-hearted. I'm hoping my
laptop comes back today with a brand new motherboard in it.

Dave.
--
Are you sure? There was a Mandriva report about NVM corruption on an e100
too (that one apparently just caused PXE failure, the networking worked
fine).

So I wonder if it's _purely_ X-server-related, and the reason people blame
2.6.27-rc1 is just timing of some X update and then people just look at
the kernel because the 'network card failed' looks so kernel-related.

The reason I mention that is right now it looks like the distros are just
running around disabling the e1000e module, or perhaps downgrading it.
Which may not even work!

The discussions in some of the bug-trackers seem to be full of people who
have no actual information, but are perfectly willing to flail around
wildly saying obviously crazy things.

The Ubuntu people are some of the crazier ones (should I be surprised?),
but that one also has Ben Collins claiming they use the same e1000e driver
for the 2.6.26/27 kernels (from Intel's sf.net project). That may be bogus,
but if true it would indicate that it's possibly not so kernel-related, or
at least not so e1000e-driver-related.

Linus
--
That is very probably a completely separate issue, and should have been
fixed already by 78566fecb.

I think that not many people suspect a bug in e1000e directly.
Rather a combination of an X bug, the kernel allowing X to do bad things (for
example the missing check in drivers/pci/pci-sysfs.c:pci_mmap_resource()
looks particularly suspicious) and "bug-friendly" hardware behavior.

--
Jiri Kosina
SUSE Labs
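The message does not spell out which check is missing from pci_mmap_resource(). Assuming it is a range check on the requested mapping, one plausible form is sketched below as a userspace function. The name mmap_fits_resource, the parameter names and the 4 KiB page size are assumptions made for illustration, not the actual kernel code.

```c
#include <assert.h>	/* for callers that assert on the result */
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Hypothetical range check: return true only if a mapping of nr_pages
 * pages, starting pgoff pages into the resource, lies entirely inside a
 * resource that is res_len bytes long.
 */
static bool mmap_fits_resource(uint64_t res_len, uint64_t pgoff,
			       uint64_t nr_pages)
{
	const uint64_t page_mask = (1ULL << SKETCH_PAGE_SHIFT) - 1;
	/* Resource size in pages, rounded up, without overflowing. */
	uint64_t res_pages = (res_len >> SKETCH_PAGE_SHIFT) +
			     ((res_len & page_mask) != 0);

	if (nr_pages == 0 || pgoff >= res_pages)
		return false;

	/* Compare against the remaining pages so pgoff + nr_pages cannot wrap. */
	return nr_pages <= res_pages - pgoff;
}
```

For a 128 KiB BAR this accepts a mapping of 32 pages at offset 0 and rejects 33 pages, or 2 pages at page offset 31. Without a check of this kind, a userspace program with a wrong offset or length could map, and then write to, addresses beyond the device region it asked for.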
--
Likely not, you are mentioning a patch for e1000, while the Mandriva bug
report is about e100:
https://qa.mandriva.com/show_bug.cgi?id=44192

See you,
Eric
--
On Tue, 30 Sep 2008 09:58:56 +0200
Eric Piel <eric.piel@tremplin-utc.net> wrote:

| Jiri Kosina wrote:
| > On Mon, 29 Sep 2008, Linus Torvalds wrote:
| >
| >>> If it is X related then its both a kernel + X server issue, the e1000e
| >>> driver opens the barn door, the X server drives the horses through it.
| >> Are you sure? There was a Mandriva report about NVM corruption on an e100
| >> too (that one apparently just caused PXE failure, the networking worked
| >> fine).
| >
| > That is very probably completely separate issue, and should have been
| > fixed already by 78566fecb.
| Likely not, you are mentioning a patch for e1000, while the Mandriva bug
| report is about e100:
| https://qa.mandriva.com/show_bug.cgi?id=44192

Yes, also the reporter has said that he has got the problem with -rc7 and
this fix is available since -rc6.

Jiri, doesn't e100 need that fix as well?

Anyway, it is not clear to us whether this is a kernel problem. We
could not reproduce it here and the reporter is now checking his network.

--
Luiz Fernando N. Capitulino
--
He finished his checks and discovered that the e100 issue was in reality a
hardware problem in the switch being used. It started to have problems now,
coincidentally just as this e1000e issue was getting more attention. After
swapping the switch the problem stopped, so just a false alarm. I closed
https://qa.mandriva.com/show_bug.cgi?id=44192, which was the original report.

--
[]'s
Herton
--
On Mon, 29 Sep 2008 19:21:02 -0700 (PDT)
btw, we're also working on making some parts of the kernel more robust
against certain types of bugs; for example the ioremap checks and sysfs
resource checks. There's a set of checks and API changes we can do to
make it less likely that drivers end up doing bad stuff; but that's
obviously more for 2.6.28 than for .27

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
--
On Tue, Sep 30, 2008 at 12:21 PM, Linus Torvalds
Well, from a purely empirical standpoint, I've been running new X
against that laptop for a long time, and others have the same laptop,
so I think it's a problem with the e1000e driver putting the card into
a state which allows X to do bad things. I think X may be causing issues
on other hw, like e100 and some realtek. Also, when we say X I think it
looks like Intel driver interaction issues; as I said, I'm running the
same stuff on my ATI GPU laptop with e1000e and haven't had any problems.

But I'm leaving this up to Intel, I don't think HP will take it too
kindly if I keep returning my laptop.

Dave.
--
On Tue, 30 Sep 2008 11:59:58 +1000
we have a patch to save/restore now, in final testing stages
(obviously we want to be really careful with this)

Note that so far it seems to mostly hit with "new" distros, so both
new kernel and new X... ;(

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
--
Btw, the _real_ bug is clearly in the hardware design that allows you to
brick those things without apparently even having a lock bit.

I'm hoping Intel doesn't treat this as just a software bug. Some hw
designer should be thinking hard about which orifice they put their head
up in.

It used to be that you could fry some monitors by feeding them
out-of-range signals. The _monitors_ got fixed.

Linus
--
I am confident they will, because right now some more malicious virus
writers will be thinking 'whoopeee party time'.
--
The hardware has a lock bit, and we're trying to figure out why the BIOS
writers guide doesn't say to set it. Probably because of the MAC address,

We will post a patch to e1000e tomorrow that sets a lock bit that prevents
the registers memory mapped by 0:19.0 BAR1 from causing flash write
cycles.

The patches I've just posted don't quite do that yet.
--
Mostly. I think you can still do bad things to internal LCD's on at least
some laptops. Although I hope I'm wrong.

Linus
--
You still can in some cases. You can also erase many video card
firmwares, trash disks, brick DVD drives and the like fairly easily too
but you do tend to have to try to be evil in these cases, not just get an
address wrong.

Alan
--
Unless there is news that I missed, the E1000 bricking bug is still out
there. That is a particularly nasty one.
--
