Re: 2.6.25-mm1

From: Andrew Morton
Subject: 2.6.25-mm1
Date: Friday, April 18, 2008 - 1:47 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25/2.6.25-mm1/ 

- git-xfs is undropped because I finally got around to fixing its clashes
  with git-vfs.

- git-arm-master, git-sparc64 and perhaps others are dropped because they
  don't generate a clean pull.  They might be empty - I didn't check.

- git-kvm remains dropped due to clashes with git-s390 and perhaps git-x86.

- git-selinux is newly dropped due to memory corruption regressions.

- git-nfs is (perhaps permanently) dropped because its content is also in
  git-nfsd.

- git-drm remains reverted due to build failures

- Tomorrow I'll do the -mm merge plans email and I'll dump a couple hundred
  patches on tree maintainers (these have about a 15% yay-he-merged-it rate).

  Then I'm travelling for a poorly-timed week.  I return late in the merge
  window to find out if any of these patches still apply.



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs ...
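
The "around ten rebuilds and reboots" figure in the bisection note above follows from halving the candidate range at every step: ten halvings cover 2^10 = 1024 patches. The following is a minimal sketch of that search, not the procedure from bisecting-mm-trees.txt. The names bisect_first_bad and simulated_is_bad are hypothetical, and it assumes the tree with no patches applied is good, the full series is bad, and the bug stays present once introduced.

```c
#include <stddef.h>

/*
 * Hypothetical predicate: nonzero if a tree with the first `applied`
 * patches of the series shows the bug. In practice each call stands
 * for one rebuild and reboot.
 */
typedef int (*is_bad_fn)(size_t applied, void *ctx);

/*
 * Return the 1-based index of the first patch that introduces the bug.
 * Assumes 0 patches applied is good and all n patches applied is bad.
 * *tests (if non-NULL) receives the number of predicate calls made.
 */
size_t bisect_first_bad(size_t n, is_bad_fn is_bad, void *ctx, size_t *tests)
{
    size_t good = 0;   /* known good with this many patches applied */
    size_t bad = n;    /* known bad with this many patches applied */
    size_t count = 0;

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        count++;
        if (is_bad(mid, ctx))
            bad = mid;     /* culprit is at or before mid */
        else
            good = mid;    /* culprit is after mid */
    }
    if (tests)
        *tests = count;
    return bad;
}

/* Simulation only: ctx points at the index of the culprit patch. */
int simulated_is_bad(size_t applied, void *ctx)
{
    return applied >= *(size_t *)ctx;
}
```

With a simulated series of 1000 patches, bisect_first_bad(1000, simulated_is_bad, &culprit, &tests) needs at most ten predicate calls, which is why reporting the bug first is often cheaper than bisecting right away.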

From: Kamalesh Babulal
Date: Friday, April 18, 2008 - 4:26 am

Hi Andrew,

The 2.6.25-mm1 kernel allyesconfig build fails on powerpc:

drivers/edac/pasemi_edac.c: In function ‘pasemi_edac_init’:
drivers/edac/pasemi_edac.c:288: error: implicit declaration of function ‘opstate_init’
drivers/edac/pasemi_edac.c: In function ‘__check_edac_op_state’:
drivers/edac/pasemi_edac.c:304: error: ‘edac_op_state’ undeclared (first use in this function)
drivers/edac/pasemi_edac.c:304: error: (Each undeclared identifier is reported only once
drivers/edac/pasemi_edac.c:304: error: for each function it appears in.)
drivers/edac/pasemi_edac.c: At top level:
drivers/edac/pasemi_edac.c:304: error: ‘edac_op_state’ undeclared here (not in a function)
make[2]: *** [drivers/edac/pasemi_edac.o] Error 1
make[1]: *** [drivers/edac] Error 2
make: *** [drivers] Error 2

I have only build tested the patch.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
---
--- linux-2.6.25/drivers/edac/pasemi_edac.c	2008-04-18 16:18:27.000000000 +0530
+++ linux-2.6.25/drivers/edac/~pasemi_edac.c	2008-04-18 16:18:36.000000000 +0530
@@ -26,6 +26,7 @@
 #include <linux/pci.h>
 #include <linux/pci_ids.h>
 #include <linux/slab.h>
+#include <linux/edac.h>
 #include "edac_core.h"
 
 #define MODULE_NAME "pasemi_edac"
-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Reuben Farrelly
Date: Friday, April 18, 2008 - 6:02 am

The GCC stackprotector option is a no-go for me, and causes 100% repeatable 
fatal oopses on boot with my x86_64 box.

This is not new to 2.6.25-mm1 - but was also present in 2.6.24-rc8-mm2 
(2.6.24-rc8-mm1 was good, but this option didn't exist then).

It seems that enabling the stackprotector option:

tornado boot # diff -u config-2.6.25-mm1 config-2.6.25-mm1.old
--- config-2.6.25-mm1   2008-04-18 22:40:15.000000000 +1000
+++ config-2.6.25-mm1.old       2008-04-18 20:09:38.000000000 +1000
@@ -1,7 +1,7 @@
  #
  # Automatically generated make config: don't edit
  # Linux kernel version: 2.6.25-mm1
-# Fri Apr 18 22:25:04 2008
+# Fri Apr 18 19:57:17 2008
  #
  CONFIG_64BIT=y
  # CONFIG_X86_32 is not set
@@ -256,7 +256,8 @@
  CONFIG_X86_PAT=y
  # CONFIG_EFI is not set
  CONFIG_SECCOMP=y
-# CONFIG_CC_STACKPROTECTOR is not set
+CONFIG_CC_STACKPROTECTOR_ALL=y
+CONFIG_CC_STACKPROTECTOR=y
  # CONFIG_HZ_100 is not set
  # CONFIG_HZ_250 is not set
  CONFIG_HZ_300=y

is enough to prevent my system booting, viz:

input: Belkin Components Belkin OmniView KVM Switch as /devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.1/3-1.1:1.0/input/input2
input: USB HID v1.00 Keyboard [Belkin Components Belkin OmniView KVM Switch] on usb-0000:00:1d.1-1.1
input: Belkin Components Belkin OmniView KVM Switch as /devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.1/3-1.1:1.1/input/input3
input: USB HID v1.00 Mouse [Belkin Components Belkin OmniView KVM Switch] on usb-0000:00:1d.1-1.1
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
NET: Registered protocol family 17
Testing -fstack-protector-all feature
------------[ cut here ]------------
WARNING: at ™š:-2145164734 0x0()
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.25-mm1 #1

Call Trace:
  [<ffffffff802362a9>] warn_on_slowpath+0x67/0x98
  [<ffffffff802f31da>] ? ...

From: Ingo Molnar
Date: Friday, April 18, 2008 - 6:36 am

hm, does it boot up fine with the attached patch and stackprotector 
enabled? It appears that your system got to the self-test so 
stackprotector is working mostly - it's just that the self-test went 
wrong.

	Ingo

----------------->
Subject: x86: disable stackprotector selftest
From: Ingo Molnar <mingo@elte.hu>
Date: Fri Apr 18 15:21:49 CEST 2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/panic.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-x86.q/kernel/panic.c
===================================================================
--- linux-x86.q.orig/kernel/panic.c
+++ linux-x86.q/kernel/panic.c
@@ -394,5 +394,5 @@ void __stack_chk_fail(void)
 }
 EXPORT_SYMBOL(__stack_chk_fail);
 
-late_initcall(__stack_chk_test);
+/* late_initcall(__stack_chk_test); */
 #endif
--

From: Arjan van de Ven
Date: Friday, April 18, 2008 - 6:51 am

On Fri, 18 Apr 2008 15:36:45 +0200

yes it'll be interesting to see if this is due to the test triggering or due to
something else.

Reuben:
I assume your gcc is pretty vanilla and doesn't have weird patches in this area?

Can you (with the test enabled in the config/code) do a
make kernel/panic.s
and send me that file? (this allows me to see what code your gcc generated for the test)

-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Reuben Farrelly
Date: Friday, April 18, 2008 - 7:41 am

I think so.  Well, put it this way... I haven't made any changes to it, this is 
the standard/current gcc that has been in Gentoo Portage for the last while.

 > Can you (with the test enabled in the config/code) do a
 > make kernel/panic.s
 > and send me that file? (this allows me to see what code your gcc generated
 > for the test)

Done - posted up in the web directory along with the other files (saves the 
possible grief of MUA mangling).

I'm about to reboot to try Ingo's test also.

Reuben

--

From: Reuben Farrelly
Date: Friday, April 18, 2008 - 7:49 am

It boots up fine with that patch below and:

tornado boot # grep STACKPROTECT /boot/config-2.6.25-mm1-wip
CONFIG_CC_STACKPROTECTOR_ALL=y
CONFIG_CC_STACKPROTECTOR=y

In fact I'm running with it applied right now and it all seems good so far, so I 
guess that's confirmation that it is just the test itself which is problematic?

Reuben
--

From: Ingo Molnar
Date: Monday, April 21, 2008 - 8:06 am

yeah. Arjan - any new patches to try that might fix the bootup test?

	Ingo
--

From: Arjan van de Ven
Date: Monday, April 21, 2008 - 6:48 pm

On Mon, 21 Apr 2008 17:06:04 +0200

I've looked at the disassembly and compared it to mine, and the gcc is doing
something... rather unexpected.
The only thing I can think of is the patch below; it should make the test a ton
more robust...


From: Arjan van de Ven <arjan@linux.intel.com>
Subject: x86: be more conservative about the stack-protector test

This patch makes the stack-protector self-test more robust against
weird stack layouts; rather than assuming that a local variable is
laid out in a certain way, we first check this against the known
canary value (before we poison it).

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

diff --git a/kernel/panic.c b/kernel/panic.c
index c92c1e2..b4a6a05 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -351,7 +351,10 @@ static noinline void __stack_chk_test_func(void)
 	}
 #endif
 	barrier();
-	memset(&foo, 0, 2*sizeof(foo)); /* deliberate buffer overflow */
+	if (current->stack_canary == *(((unsigned long *)&foo)+1))
+		*(((unsigned long *)&foo)+1) = 0;
+	else
+		printk(KERN_ERR "No -ftack-protector canary found\n");
 	barrier();
 }
 
--
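
The idea behind the patch above (only corrupt the word after the local variable if that word really is the canary) can be modelled in userspace. This is a sketch under stated assumptions, not kernel code: the stack frame is represented as a plain array, and the names poison_canary_if_present, FOO and AFTER_FOO are hypothetical.

```c
#include <stdio.h>

/* Hypothetical model of a protected frame: the local variable followed
 * by the word where the compiler is expected to place the canary. */
enum { FOO = 0, AFTER_FOO = 1, FRAME_WORDS = 2 };

/*
 * Overwrite the word after `foo` only if it holds the expected canary.
 * Returns 1 if the canary was found and poisoned, 0 if the layout was
 * not what the test assumed (in which case nothing is modified).
 */
int poison_canary_if_present(unsigned long frame[FRAME_WORDS],
                             unsigned long canary)
{
    if (frame[AFTER_FOO] == canary) {
        frame[AFTER_FOO] = 0;   /* deliberate, targeted corruption */
        return 1;
    }
    fprintf(stderr, "no canary found after foo; frame left untouched\n");
    return 0;
}
```

Compared with the old memset() over 2*sizeof(foo), the check means an unexpected stack layout produces a message instead of overwriting whatever happens to sit next to the variable.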

From: Valdis.Kletnieks
Date: Monday, April 21, 2008 - 7:04 pm

On Mon, 21 Apr 2008 18:48:59 PDT, Arjan van de Ven said:


From: Ingo Molnar
Date: Tuesday, April 22, 2008 - 1:34 am

ok, i queued this up. (with the typo that Valdis noticed fixed)

but ... this bug needs to be figured out, not worked around.

	Ingo
--

From: Arjan van de Ven
Date: Tuesday, April 22, 2008 - 7:29 am

On Tue, 22 Apr 2008 10:34:08 +0200

well what I figured out was that the stack layout was "different".
Why/how I don't know, but being more robust against that is a good idea
in general.


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Randy Dunlap
Date: Friday, April 18, 2008 - 9:40 am

with CONFIG_BLOCK=n:

linux-2.6.25-mm1/drivers/base/core.c: In function 'device_to_dev_kobj':
linux-2.6.25-mm1/drivers/base/core.c:768: error: 'block_class' undeclared (first use in this function)
make[3]: *** [drivers/base/core.o] Error 1

---
~Randy
--

From: Greg KH
Date: Friday, April 18, 2008 - 9:56 am

Ah, more fun caused by Dan's /sys/dev/... patch.

Dan, this is causing a lot of problems, I'm going to drop it for now
until the build and run-time errors get resolved.

thanks,

greg k-h
--

From: Dan Williams
Date: Friday, April 18, 2008 - 11:38 am

Ok, I'll have an updated version with Kay's duplicate-entry fix and a

Regards,
Dan
--

From: Randy Dunlap
Date: Friday, April 18, 2008 - 9:45 am

with CONFIG_RT_MUTEXES=n:

In file included from linux-2.6.25-mm1/kernel/trace/trace.c:2432:
linux-2.6.25-mm1/kernel/trace/trace_selftest.c: In function 'trace_wakeup_test_thread':
linux-2.6.25-mm1/kernel/trace/trace_selftest.c:413: error: implicit declaration of function 'rt_mutex_setprio'
make[3]: *** [kernel/trace/trace.o] Error 1

---
~Randy
--

From: Valdis.Kletnieks
Date: Friday, April 18, 2008 - 1:14 pm

I may have reported this same one previously against an earlier -mm, or somebody
did, or something... ;)

x86_64 kernel, Core2 Duo T7200, Dell Latitude D820 laptop...

[    0.060388] ACPI: Core revision 20080321
[    0.070079] ------------[ cut here ]------------
[    0.070082] WARNING: at arch/x86/kernel/genapic_64.c:86 read_apic_id+0x41/0x7c()
[    0.070085] Modules linked in:
[    0.070089] Pid: 1, comm: swapper Not tainted 2.6.25-mm1 #1
[    0.070091]
[    0.070092] Call Trace:
[    0.070099]  [<ffffffff8023c702>] warn_on_slowpath+0x67/0xb7
[    0.070105]  [<ffffffff805d096e>] ? _spin_lock+0x25/0x54
[    0.070111]  [<ffffffff8021e399>] ? __cpus_weight+0x4b/0x68
[    0.070114]  [<ffffffff802245c1>] read_apic_id+0x41/0x7c
[    0.070119]  [<ffffffff807a0ded>] verify_local_APIC+0xb4/0x177
[    0.070123]  [<ffffffff805d4153>] ? sub_preempt_count+0x44/0x6e
[    0.070126]  [<ffffffff8079faf4>] native_smp_prepare_cpus+0x238/0x36a
[    0.070129]  [<ffffffff805d4153>] ? sub_preempt_count+0x44/0x6e
[    0.070134]  [<ffffffff80794712>] kernel_init+0x69/0x2a1
[    0.070137]  [<ffffffff805d0d42>] ? _spin_unlock_irq+0x43/0x62
[    0.070143]  [<ffffffff802366b7>] ? finish_task_switch+0x3e/0xb4
[    0.070147]  [<ffffffff8020d6e8>] child_rip+0xa/0x12
[    0.070150]  [<ffffffff8020cdd0>] ? restore_args+0x0/0x30
[    0.070154]  [<ffffffff807946a9>] ? kernel_init+0x0/0x2a1
[    0.070157]  [<ffffffff8020d6de>] ? child_rip+0x0/0x12
[    0.070159]
[    0.070166] ---[ end trace a7919e7f17c0a725 ]---
[    0.070169] ------------[ cut here ]------------
[    0.070171] WARNING: at arch/x86/kernel/genapic_64.c:86 read_apic_id+0x41/0x7c()
[    0.070174] Modules linked in:
[    0.070177] Pid: 1, comm: swapper Tainted: G        W 2.6.25-mm1 #1
[    0.070179]
[    0.070179] Call Trace:
[    0.070182]  [<ffffffff8023c702>] warn_on_slowpath+0x67/0xb7
[    0.070186]  [<ffffffff805d096e>] ? _spin_lock+0x25/0x54
[    0.070189]  [<ffffffff8021e399>] ? __cpus_weight+0x4b/0x68
[    0.070192]  ...

From: Alexey Dobriyan
Date: Friday, April 18, 2008 - 4:09 pm

At least the following files aren't removed by "make mrproper":

	Module.markers
	arch/x86/kernel/acpi/realmode/wakeup.lds
	crypto/.tmp_aes_generic.ver
	fs/.tmp_buffer.ver
	kernel/time/.tmp_timekeeping.ver

Noticed by "git-ls-files -o".

The *.ver files, I think, remain after an abruptly terminated build.

--

From: Joseph Fannin
Date: Friday, April 18, 2008 - 7:13 pm

I've been seeing the following backtraces since 2.6.25-rc8-mm1 -- at
least, since that's the earliest -mm I've built in a while.  I don't
get the same in mainline.

No idea who to CC:  I've sat on this report long enough.

I'm going to send a few different reports in separate mails, so I'll
put my dmesg and .config up on a server:

http://home.columbus.rr.com/jfannin3/dmesg.txt
http://home.columbus.rr.com/jfannin3/config-2.6.25-mm1.txt

[  451.915553] sysfs: duplicate filename 'pcspkr' can not be created
[  451.915731] ------------[ cut here ]------------
[  451.915851] WARNING: at fs/sysfs/dir.c:427 sysfs_add_one+0x85/0xe0()
[  451.915981] Modules linked in: snd_pcsp(+) ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_mpu401_uart snd_seq_dummy snd_seq_oss snd_seq_midi psmouse snd_rawmidi serio_raw snd_seq_midi_event snd_seq button i2c_viapro snd_timer snd_seq_device pcspkr i2c_core snd snd_page_alloc via686a shpchp pci_hotplug parport_pc parport via_agp agpgart soundcore evdev sg sr_mod cdrom sd_mod 8139cp aic7xxx  scsi_transport_spi scsi_mod 8139too mii uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod thermal processor fan fuse ext4dev mbcache jbd2 crc16
[  451.918960] Pid: 2740, comm: modprobe Tainted: G        W 2.6.25-mm1 #7
[  451.929271]  [<c0130fa9>] warn_on_slowpath+0x59/0x80
[  451.929500]  [<c0132400>] ? vprintk+0x2f0/0x4a0
[  451.929723]  [<c0356adc>] ? _spin_unlock+0x2c/0x50
[  451.929918]  [<c01c6a7a>] ? ifind+0x4a/0xa0
[  451.930126]  [<c0155216>] ? trace_hardirqs_on_caller+0x16/0x150
[  451.930334]  [<c015535b>] ? trace_hardirqs_on+0xb/0x10
[  451.930534]  [<c01325d0>] ? printk+0x20/0x30
[  451.930727]  [<c01fcc45>] sysfs_add_one+0x85/0xe0
[  451.930900]  [<c01fd89e>] create_dir+0x4e/0xb0
[  451.931064]  [<c01fd930>] sysfs_create_dir+0x30/0x50
[  451.931291]  [<c0356adc>] ? _spin_unlock+0x2c/0x50
[  451.931485]  [<c023dac6>] kobject_add_internal+0xb6/0x190
[  451.931656]  [<c023dc22>] ? ...

From: Andrew Morton
Date: Friday, April 18, 2008 - 8:02 pm

Yes, there have been lots of these lately.  I expect some of them _will_ go
into mainline and they'll then slowly get weeded out.

--

From: Dmitry Torokhov
Date: Friday, April 18, 2008 - 9:14 pm

It looks like it is coming from snd_pcsp module from alsa tree.


Cool things there:

+#ifdef CONFIG_DEBUG_PAGEALLOC
+	/* Well, CONFIG_DEBUG_PAGEALLOC makes the sound horrible. Lets alert */
+	printk(KERN_WARNING
+	       "PCSP: Warning, CONFIG_DEBUG_PAGEALLOC is enabled!\n"
+	       "You have to disable it if you want to use the PC-Speaker "
+	       "driver.\n"
+	       "Unless it is disabled, enjoy the horrible, distorted "
+	       "and crackling noise.\n");
+#endif

-- 
Dmitry
--

From: Andrew Morton
Date: Friday, April 18, 2008 - 9:29 pm

heh.

CONFIG_DEBUG_PAGEALLOC is a very heavy consumer of CPU cycles.  I'm not
surprised that it would whack what I presume to be a very latency-sensitive
driver.

--

From: Joseph Fannin
Date: Friday, April 18, 2008 - 11:33 pm

Um...

[jhf@Susa ~]$ uname -a
Linux Susa 2.6.25-mm1 #7 SMP PREEMPT Fri Apr 18 17:05:14 EDT 2008 i686
GNU/Linux
[jhf@Susa ~]$ zgrep PAGEALLOC /proc/config.gz
# CONFIG_DEBUG_PAGEALLOC is not set
[jhf@Susa ~]$

I thought that might have snuck in as =y, but it didn't.

--
Joseph Fannin
jfannin@gmail.com

--

From: Takashi Iwai
Date: Monday, April 21, 2008 - 4:07 am

[Added snd-pcsp author to Cc]

At Fri, 18 Apr 2008 21:29:34 -0700,

Seems that snd-pcsp registers as "pcspkr", which is identical to the name
used by the input pc-speaker driver.  Does the patch below fix the problem?


Takashi

---
diff -r e8f61dd0b153 sound/drivers/pcsp/pcsp.c
--- a/sound/drivers/pcsp/pcsp.c	Thu Apr 17 17:58:34 2008 +0200
+++ b/sound/drivers/pcsp/pcsp.c	Mon Apr 21 13:06:35 2008 +0200
@@ -21,7 +21,7 @@
 MODULE_DESCRIPTION("PC-Speaker driver");
 MODULE_LICENSE("GPL");
 MODULE_SUPPORTED_DEVICE("{{PC-Speaker, pcsp}}");
-MODULE_ALIAS("platform:pcspkr");
+MODULE_ALIAS("platform:snd_pcsp");
 
 static int index = SNDRV_DEFAULT_IDX1;	/* Index 0-MAX */
 static char *id = SNDRV_DEFAULT_STR1;	/* ID for this card */
@@ -214,7 +214,7 @@
 
 static struct platform_driver pcsp_platform_driver = {
 	.driver		= {
-		.name	= "pcspkr",
+		.name	= "snd_pcsp",
 		.owner	= THIS_MODULE,
 	},
 	.probe		= pcsp_probe,
--

From: Stas Sergeev
Date: Monday, April 21, 2008 - 10:44 am

Hello.

Actually it does not. The reason is that
then it fails to match the platform device,
which is created in arch/x86/kernel/pcspeaker.c,
with the name of "pcspkr".
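
The two constraints in play here (a driver name must be unique, yet it must equal the device name to bind) can be illustrated with a toy model. This is a sketch only and not the kernel's driver-core API: struct registry, register_driver and driver_matches are hypothetical names, and matching is assumed to be a plain string comparison.

```c
#include <string.h>

#define MAX_DRIVERS 8

/* Toy stand-in for the set of registered driver names. */
struct registry {
    const char *names[MAX_DRIVERS];
    int count;
};

/* Returns 0 on success, -1 if the name is already taken or the table
 * is full. A second "pcspkr" is rejected, much like the duplicate
 * sysfs entry reported earlier in the thread. */
int register_driver(struct registry *r, const char *name)
{
    for (int i = 0; i < r->count; i++)
        if (strcmp(r->names[i], name) == 0)
            return -1;
    if (r->count == MAX_DRIVERS)
        return -1;
    r->names[r->count++] = name;
    return 0;
}

/* Assumption: a driver binds to a device only when the names are equal. */
int driver_matches(const char *driver_name, const char *device_name)
{
    return strcmp(driver_name, device_name) == 0;
}
```

In this model, renaming the driver to "snd_pcsp" makes the registration succeed, but driver_matches("snd_pcsp", "pcspkr") is false, so the renamed driver never binds to the existing "pcspkr" device.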

But we already had the patch for that in an
alsa tree, it probably got forgotten. Here
it is:
http://hg-mirror.alsa-project.org/alsa-driver/raw-file/90eeee75052f/utils/patches/pcsp...

Also attaching it here.
It simply disables the pcspkr driver in
Kconfig. snd-pcsp has the copy of that
driver, so that only one driver would
drive the device.

Does that fix look good? (presumably acked
by Takashi, otherwise the patch wouldn't
be in an alsa tree)

---------------

- Prevent pcspkr driver from being built
together with snd-pcsp. snd-pcsp fully
supersedes pcspkr.
- Update CREDITS file. :)

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Acked-by: Takashi Iwai <tiwai@suse.de>

From: Takashi Iwai
Date: Tuesday, April 22, 2008 - 3:09 am

At Mon, 21 Apr 2008 21:44:33 +0400,

Hm, the hardcoded string is no good thing.
It should be defined in the common header if it's used in multiple
places.

I'm not 100% certain whether restricting this in Kconfig is the correct
fix.  Basically this doesn't stop building both drivers.  In theory,
we can switch them dynamically.

But, it's the easiest way to avoid unnecessary bugs right now, so

Err, no, this wasn't merged to sound git tree because apparently the

No, I gave no ACK yet.  The alsa-driver tree is our playground, and
the patch merged to that tree doesn't mean that I approved it for

I'd rather add a new line with a single "depends on SND_PCSP=n".
That way there is a clear difference in the kind of dependencies: one for
architectures and one driver-specific.


thanks,

Takashi
--

From: Stas Sergeev
Date: Tuesday, April 22, 2008 - 10:54 am

Hello.

That's simply because the old one
Done.

---
- Update CREDITS with the pc-speaker
driver authors.
- Prevent pcspkr from being built together
with snd-pcsp.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>

From: Takashi Iwai
Date: Wednesday, April 23, 2008 - 1:55 am

At Tue, 22 Apr 2008 21:54:46 +0400,

Thanks, applied to my git tree.
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6.git

Meanwhile, we need to add a similar dependency to snd-pcsp as well, no?


Takashi
--

From: Takashi Iwai
Date: Wednesday, April 23, 2008 - 7:14 am

At Wed, 23 Apr 2008 10:55:08 +0200,

Fixed on ALSA tree now.


Takashi
--

From: Stas Sergeev
Date: Monday, April 21, 2008 - 12:45 pm

Hi.


As a quick workaround, say N to INPUT_PCSPKR:
  │ Prompt: PC Speaker support
  │   Location:
  │     -> Device Drivers
  │       -> Input device support
  │         -> Generic input layer (needed for keyboard, mouse, ...) (INPUT
  │           -> Miscellaneous devices (INPUT_MISC [=y])
--

From: Takashi Iwai
Date: Monday, April 21, 2008 - 7:06 am

At Fri, 18 Apr 2008 21:29:34 -0700,

We can simply add a dependency to Kconfig if this really matters.


Takashi

---

diff -r e8f61dd0b153 sound/drivers/Kconfig
--- a/sound/drivers/Kconfig	Thu Apr 17 17:58:34 2008 +0200
+++ b/sound/drivers/Kconfig	Mon Apr 21 16:06:09 2008 +0200
@@ -7,6 +7,7 @@
 config SND_PCSP
 	tristate "Internal PC speaker support"
 	depends on X86_PC && HIGH_RES_TIMERS
+	depends on !DEBUG_PAGEALLOC
 	help
 	  If you don't have a sound card in your computer, you can include a
 	  driver for the PC speaker which allows it to act like a primitive
--

From: Stas Sergeev
Date: Monday, April 21, 2008 - 10:55 am

Hello.

I think this is a bit too heavy-handed.
That thing adds a lot of noise to the sound,
but it doesn't really prevent the driver from
working properly. And perhaps there are
other options with the same effect?
Also, that would motivate people to
optimize DEBUG_PAGEALLOC, while otherwise
they wouldn't know. :)

What is the problem with the warning
exactly? If it makes problems, we can
just remove it and update the help text
instead, for example.
--

From: Takashi Iwai
Date: Tuesday, April 22, 2008 - 3:13 am

At Mon, 21 Apr 2008 21:55:27 +0400,

It's no big problem but I just find the phrase "you have to
disable..." too strict.  It's just a sound-quality problem as you
mentioned above.

BTW, I noticed that the message includes a few line breaks.  You need
a proper KERN_ prefix for each line in such a case.  A fix patch would
be appreciated.


thanks,

Takashi
--
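
The point about needing a log-level prefix after every line break can be shown with a small userspace helper. This is a sketch, not kernel code: prefix_each_line is a hypothetical name, and the string "<4>" is used only as a stand-in for a KERN_WARNING-style prefix.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy `msg` into `out`, inserting `prefix` at the start of every line.
 * Returns the number of characters written (excluding the NUL), or -1
 * if `out` is too small to hold the result.
 */
int prefix_each_line(char *out, size_t out_size,
                     const char *prefix, const char *msg)
{
    size_t used = 0;
    size_t plen = strlen(prefix);
    int at_line_start = 1;

    for (const char *p = msg; *p; p++) {
        if (at_line_start) {
            if (used + plen >= out_size)
                return -1;
            memcpy(out + used, prefix, plen);
            used += plen;
            at_line_start = 0;
        }
        if (used + 1 >= out_size)
            return -1;
        out[used++] = *p;
        if (*p == '\n')
            at_line_start = 1;   /* next character begins a new line */
    }
    if (out_size == 0)
        return -1;
    out[used] = '\0';
    return (int)used;
}
```

Given the message "PCSP: line one\nline two\n" and the prefix "<4>", the helper yields "<4>PCSP: line one\n<4>line two\n". A single prefix at the very start of a multi-line string covers only the first line, which is the problem being pointed out.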

From: Dmitry Torokhov
Date: Tuesday, April 22, 2008 - 7:01 am

Hi, Takashi, Stas,


Just out of curiosity, what is the sound quality with the PC speaker?
Not to belittle Stas's effort, but how relevant is this driver for
mainline given that it is pretty much impossible nowadays to find a
motherboard without on-board sound?

-- 
Dmitry
--

From: Stas Sergeev
Date: Tuesday, April 22, 2008 - 9:42 am

Hello.

That depends on the speaker itself.
If it is large enough (not a piezo
tablet), then the sound probably
I know it is used because people are
mailing me about it. Mainly on servers
I think, as on a desktop its value is
really questionable. :)
The real problem is that the board
manufacturers put in piezo beepers
these days, and also nvidia already
excluded the necessary capabilities
from their NForce chipset. So in the
future it may indeed fall out of
use.
By the way, you may be surprised, but
I am still being asked by various people
to write an LPT DAC (Covox) sound driver
(the ancient OSS-based pc-speaker driver
also supported that).
I guess people just like to use all
the hardware they have, even if it is
not really very useful.
--

From: Stas Sergeev
Date: Tuesday, April 22, 2008 - 11:31 am

Hello.

Like the attached one?

From: Takashi Iwai
Date: Wednesday, April 23, 2008 - 1:49 am

At Tue, 22 Apr 2008 22:31:01 +0400,

Missing \n in the first line?


thanks,

Takashi
--

From: Takashi Iwai
Date: Wednesday, April 23, 2008 - 7:18 am

At Wed, 23 Apr 2008 10:49:00 +0200,

Fixed on ALSA tree now, too.


Takashi
--

From: Stas Sergeev
Date: Wednesday, April 23, 2008 - 1:02 pm

Hello.

This was intentional - wanted it to
print in a single line. You see a space
there for that reason.
Should the second KERN_WARNING be removed
I personally don't think so.
snd-pcsp has the exact copy of the pcspkr
code built-in, so I thought pcspkr can be
obsoleted in the future. From that point of
view, having snd-pcsp enabled and not even
seeing pcspkr in menuconfig is fine.
While the other way round (you occasionally enable
pcspkr and don't even see snd-pcsp then)
is not fine.
You mentioned earlier that you would like
to be able to swap those drivers dynamically,
but... what's the use? With such a dependency
added, many people will not even know about
Oh...
--

From: Takashi Iwai
Date: Thursday, April 24, 2008 - 2:40 am

At Thu, 24 Apr 2008 00:02:06 +0400,

Then you don't need KERN_WARNING there.
The problem is that you need KERN_* prefix again after the line break

No, I don't think input-pcspkr would be ever easily obsoleted by
snd-pcsp.  People definitely want a system without the sound subsystem

Think about distros.  They could distribute both modules if the modules
can replace each other.  Otherwise, snd-pcsp won't be enabled on most
distros, I guess.


Takashi
--

From: Stas Sergeev
Date: Thursday, April 24, 2008 - 8:51 pm

Hello.

If you don't enable sound, then you
can't enable snd-pcsp, and so the
pcspkr can be enabled.
But if you have enabled the sound,
what's the use in not seeing the
snd-pcsp in the config?
Because most people already had
pcspkr enabled I guess, this just
reduces the availability of snd-pcsp
by an order of magnitude. And the
reason for that, is... ?
--

From: Takashi Iwai
Date: Thursday, April 24, 2008 - 11:28 pm

At Fri, 25 Apr 2008 07:51:58 +0400,

Yeah, of course.  Don't mix up with the argument about Kconfig
dependency issues.  I just reacted against your statement that input

I actually removed the dependency from snd-pcsp Kconfig again since
this causes a dependency loop (at least, kbuild can't handle
properly). 

But anyway, as mentioned in my previous post, snd-pcsp wouldn't be
enabled in most distros' kernels because it cannot be built
together with the input-pcspkr driver, unfortunately...


Takashi
--

From: Stas Sergeev
Date: Friday, April 25, 2008 - 9:45 am

Hello.

Those are the different ones of course.
For me the dependency issue is a primary
problem of course, so the other one I
Yes, making them to play well together is
something to think about too. But I don't
see how it can affect the distributors.
If the distro doesn't have the sound enabled,
then it won't enable snd-pcsp no matter
what. If it does have the sound enabled,
then why would it ever want to use pcspkr
instead of snd-pcsp?
I see this issue mostly as a theoretical
one. If you see the real problems with
the current situation, please let me know.
--

From: Takashi Iwai
Date: Friday, April 25, 2008 - 9:51 am

At Fri, 25 Apr 2008 20:45:51 +0400,

The sound subsystem is enabled (loaded) only when necessary.
As long as there are many systems without the sound subsystem
(i.e. servers), snd-pcsp wouldn't be built (thus not provided even as
a module) for their kernels because it prevents input-pcspkr.


Takashi
--

From: Stas Sergeev
Date: Friday, April 25, 2008 - 10:25 am

Hello.

Is it really a problem for the server
admin to just tolerate a few extra modules
being loaded? I really don't know, I just
thought it is not.
Of course one may not like the fact
that after the kernel update, he gets a
few more modules in lsmod output; someone
may even call it bloat...
$ lsmod |wc -l
79
Doesn't look too small already. :)

Anyway, I guess we'll soon find out the
numbers, unless someone will come up with
the good way to make those drivers to
play better together and to not depend
on each other, which looks a bit difficult
from the first glance.
--

From: Takashi Iwai
Date: Friday, May 2, 2008 - 9:44 am

At Fri, 25 Apr 2008 18:51:04 +0200,

... and I completely missed the viewpoint of device allocation.
Yeah, that'll be a bit hackish.


Takashi
--

From: Stas Sergeev
Date: Friday, May 2, 2008 - 9:57 am

I'll appreciate if your replies
became a bit less cryptic... ;)
What will be a bit hackish?
--

From: Takashi Iwai
Date: Tuesday, May 6, 2008 - 3:20 am

At Fri, 02 May 2008 20:57:00 +0400,

[Oops, overlooked this follow-up]

One problem is that we cannot load two drivers to a single device
right now.  Even if you have input pcspkr and snd-pcsp modules, you
have to blacklist one of two modules so that udev loads the one
properly.  Because of this, snd-pcsp will be unlikely activated for
most systems as default.

To avoid this, you have a few choices:
a) implement a pcspkr-core driver, and build input-pcspkr and snd-pcsp on
   that core module
b) make snd-pcsp completely rely on input pcspkr, implemented as an
   add-on by adding a hook to each driver callback and event handler of
   pcspkr
c) implement snd-pcsp as another individual platform driver and add a
   hook to the event handler of pcspkr

The case (a) would make things more complicated and give less
solution.

In the case (b), the modification of pcspkr.c would be big, and
would be ugly.

The case (c) was my proposal.  But in this case, the driver will
become likely self consistent; it allocates its own device at init.

Anyway, there is no sexy way to auto-load snd-pcsp (partly because
that's the purpose -- avoid loading the sound subsystem unless really
necessary).  That's why I called it hackish.


Takashi
--

From: Stas Sergeev
Date: Tuesday, May 6, 2008 - 9:51 am

Hello.

I thought all (or most) alsa drivers
are allocating device on init, even
though this is explicitly discouraged
in the docs. So I was considering this
as a possible solution with the minimal
drawback.
But... as long as the autoloading by
default is not needed, and both drivers
can be at least built together, and not
too many distros have pcspkr built-in,
I thought the current solution - having
pcspkr as the default but letting the user
choose snd-pcsp - is not all that bad
either. I guess it costs only adding a single
alias into modprobe.conf to choose snd-pcsp.
And I also think _most_ distros do not
mind having the sound subsystem loaded
by default, but some certainly do. For
those that do, the user will have to add
an alias. For others - he may get snd-pcsp
right away. IMHO this is rather acceptable.
--

From: Dmitry Torokhov
Date: Friday, April 25, 2008 - 11:09 am

Hi Stas,


Given the average quality of the speakers in current boxes and the
abundance of on-board sound, I doubt any distribution would want to have
snd-pcsp enabled since there is a chance it may come up as the primary
(default) sound device. At least my experience with F[C]7 is that
juggling 3 sound cards is not very easy.

-- 
Dmitry
--

From: Stas Sergeev
Date: Friday, April 25, 2008 - 11:31 am

Hello.

I avoid that by adding
options snd-pcsp index=1
into /etc/modprobe.conf.
Maybe index=5 or alike would be
safer for those who have many cards.
--

From: Dmitry Torokhov
Date: Friday, April 25, 2008 - 11:37 am

Yes, it could be. But face it, the number of users wanting to play
music through the PC speaker is quite small and unlikely to increase in
the future.

-- 
Dmitry
--

From: Joseph Fannin
Date: Friday, April 18, 2008 - 7:25 pm

I've been seeing the following backtrace since (I think)
2.6.25-rc8-mm2.

I'm sending multiple reports vs. 2.6.25-mm1, so I'm putting the dmesg
and .config on a server:

http://home.columbus.rr.com/jfannin3/dmesg.txt
http://home.columbus.rr.com/jfannin3/config-2.6.25-mm1.txt

[  842.795144] hm, dftrace overflow: 265 changes (0 total) in 428 usecs
[  842.795182] ------------[ cut here ]------------
[  842.795192] WARNING: at kernel/trace/ftrace.c:658 ftraced+0x1a4/0x1b0()
[  842.795200] Modules linked in: af_packet rfcomm l2cap bluetooth ppdev ipv6 cpufreq_conservative cpufreq_stats cpufreq_userspace cpufreq_powersave video output wmi pci_slot container dock sbs sbshcbattery iptable_filter ip_tables x_tables ext2 ac lp loop snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_mpu401_uart snd_seq_dummy snd_seq_oss snd_seq_midi psmouse snd_rawmidi serio_raw snd_seq_midi_event snd_seq button i2c_viapro snd_timer snd_seq_device pcspkr i2c_core snd snd_page_alloc via686a shpchp pci_hotplug parport_pc parport via_agp agpgart soundcore evdev sg sr_mod cdrom sd_mod 8139cp aic7xxx scsi_transport_spi scsi_mod 8139too mii uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod thermal processor fan fuse ext4dev mbcache jbd2 crc16
[  842.795470] Pid: 13, comm: ftraced Tainted: G        W 2.6.25-mm1 #7
[  842.795497]  [<c0130fa9>] warn_on_slowpath+0x59/0x80
[  842.795541]  [<c013244f>] ? vprintk+0x33f/0x4a0
[  842.795589]  [<c0155216>] ? trace_hardirqs_on_caller+0x16/0x150
[  842.795622]  [<c0354eb0>] ? __mutex_lock_common+0x2b0/0x3c0
[  842.795667]  [<c0155216>] ? trace_hardirqs_on_caller+0x16/0x150
[  842.795688]  [<c015535b>] ? trace_hardirqs_on+0xb/0x10
[  842.795709]  [<c017e4d0>] ? __ftrace_update_code+0x0/0x110
[  842.795730]  [<c017e9f0>] ? ftraced+0x0/0x1b0
[  842.795746]  [<c01325d0>] ? printk+0x20/0x30
[  842.795764]  [<c017e9f0>] ? ftraced+0x0/0x1b0
[  842.795780]  [<c017eb94>] ftraced+0x1a4/0x1b0
[  ...
From: Andrew Morton
Date: Friday, April 18, 2008 - 8:08 pm

Seen plenty of them - I think Greg today dropped the offending patch(es).

[  451.915553] sysfs: duplicate filename 'pcspkr' can not be created


I haven't seen that one before.
--

From: Joseph Fannin
Date: Friday, April 18, 2008 - 8:10 pm

New, in 2.6.25-mm1 is a hang I'm seeing, just after the kernel prints:

"[    0.160375] NET: Registered protocol family 16"

The hang lasts about five minutes, and then boot continues.  Just
after that, a backtrace is printed; I don't know if it's related.  The
backtrace will follow.

This does not occur in mainline.  It seems it might be related to OLPC
support -- I enabled all those options -- but that's not good
behavior, and I see no warning of this in the help.

I'm sending a number of reports against 2.6.25-mm1, so I've put my
dmesg and .config on a server:

http://home.columbus.rr.com/jfannin3/dmesg.txt
http://home.columbus.rr.com/jfannin3/config-2.6.25-mm1.txt

[    0.160375] NET: Registered protocol family 16
[  400.782683] ------------[ cut here ]------------
[  400.782832] WARNING: at arch/x86/mm/ioremap.c:158 __ioremap_caller+0x27d/0x2e0()
[  400.783022] Modules linked in:
[  400.783169] Pid: 1, comm: swapper Not tainted 2.6.25-mm1 #7
[  400.783300]  [<c0130fa9>] warn_on_slowpath+0x59/0x80
[  400.783480]  [<c0106c2e>] ? profile_pc+0x3e/0x50
[  400.783682]  [<c01374ee>] ? irq_exit+0x4e/0xa0
[  400.783879]  [<c0115aec>] ? smp_apic_timer_interrupt+0x5c/0x90
[  400.784087]  [<c024314c>] ? trace_hardirqs_on_thunk+0xc/0x10
[  400.784298]  [<c01552cd>] ? trace_hardirqs_on_caller+0xcd/0x150
[  400.784506]  [<c024314c>] ? trace_hardirqs_on_thunk+0xc/0x10
[  400.784706]  [<c010416c>] ? restore_nocheck_notrace+0x0/0xe
[  400.784906]  [<c011d0e6>] ? page_is_ram+0xa6/0xd0
[  400.785059]  [<c011d4ed>] __ioremap_caller+0x27d/0x2e0
[  400.785221]  [<c03569d8>] ? _spin_unlock_irqrestore+0x48/0x80
[  400.785421]  [<c017f4cd>] ? ftrace_record_ip+0x7d/0x250
[  400.785621]  [<c0474801>] ? olpc_init+0x31/0x140
[  400.785817]  [<c011d59f>] ioremap_nocache+0x1f/0x30
[  400.785976]  [<c0474801>] ? olpc_init+0x31/0x140
[  400.786165]  [<c0474801>] olpc_init+0x31/0x140
[  400.786318]  [<c0464992>] kernel_init+0x142/0x2d0
[  400.786479]  [<c01552cd>] ? ...
From: Andrew Morton
Date: Friday, April 18, 2008 - 8:29 pm

Please add initcall_debug to the kernel boot command line - that should

<looks at this again>

That's

                WARN_ON_ONCE(is_ram);

the changelog for the patch which added that warning is information-free
and there's no code comment explaining what went wrong, which makes things
rather harder than they ought to be.

Yes it's due to the new OLPC code.  olpc_init() has

	romsig = ioremap(0xffffffc0, 16);

which we probably just shouldn't do at all unless we're running on the
OLPC hardware.  But we need to do this to find out if we're running on the OLPC
hardware!  Perhaps the warning should just be removed.
--

From: Andres Salomon
Date: Saturday, April 19, 2008 - 6:25 am

On Fri, 18 Apr 2008 20:29:25 -0700

Hm.  We could either protect that code with an:

if (!is_geode())
  return;

Or I could add the OpenFirmware patches which would allow us to get
rid of this code, and instead check for the existence of OFW using
that.

The former is quick and easy; the latter is (imo) nicer, so long as
people don't have problems w/ the OFW code.  :)


-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: Andres Salomon
Date: Saturday, April 19, 2008 - 10:39 am

This adds 32-bit support for calling into OFW from the kernel.  It's useful
for querying the firmware for misc hardware information, fetching the device
tree, etc.

There's potentially no reason why other platforms couldn't use this, but
currently OLPC is the main user of it.

This work was originally done by Mitch Bradley.

Signed-off-by: Andres Salomon <dilinger@debian.org>
---
 arch/x86/Kconfig          |    8 +++++
 arch/x86/kernel/Makefile  |    1 +
 arch/x86/kernel/head_32.S |   27 ++++++++++++++++
 arch/x86/kernel/ofw.c     |   75 +++++++++++++++++++++++++++++++++++++++++++++
 include/asm-x86/ofw.h     |   50 ++++++++++++++++++++++++++++++
 include/asm-x86/setup.h   |    1 +
 6 files changed, 162 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kernel/ofw.c
 create mode 100644 include/asm-x86/ofw.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3b9089b..ce56105 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -661,6 +661,14 @@ config I8K
 	  Say Y if you intend to run this kernel on a Dell Inspiron 8000.
 	  Say N otherwise.
 
+config OPEN_FIRMWARE
+	bool "Support for Open Firmware"
+	default y if OLPC
+	---help---
+	  This option adds support for the implementation of Open Firmware
+	  that is used on the OLPC XO laptop.
+	  If unsure, say N here.
+
 config X86_REBOOTFIXUPS
 	def_bool n
 	prompt "Enable X86 board specific fixups for reboot"
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9575754..d33600e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -54,6 +54,7 @@ obj-$(CONFIG_X86_TRAMPOLINE)	+= trampoline_$(BITS).o
 obj-$(CONFIG_X86_MPPARSE)	+= mpparse_$(BITS).o
 obj-$(CONFIG_X86_LOCAL_APIC)	+= apic_$(BITS).o nmi_$(BITS).o
 obj-$(CONFIG_X86_IO_APIC)	+= io_apic_$(BITS).o
+obj-$(CONFIG_OPEN_FIRMWARE)	+= ofw.o
 obj-$(CONFIG_X86_REBOOTFIXUPS)	+= reboot_fixups_32.o
 obj-$(CONFIG_KEXEC)		+= machine_kexec_$(BITS).o
 obj-$(CONFIG_KEXEC)		+= relocate_kernel_$(BITS).o crash.o
diff ...
From: Yinghai Lu
Date: Sunday, April 20, 2008 - 3:34 am

how about changing to ofw_32.c?

YH
--

From: H. Peter Anvin
Date: Sunday, April 20, 2008 - 5:07 am

Hm.  This interface seems more than a bit ad hoc.  In particular, I 
*really* don't like the swapper_pg_dir hack.

"There must be a better way."

	-hpa
--

From: Andres Salomon
Date: Sunday, April 20, 2008 - 10:59 am

On Sun, 20 Apr 2008 08:07:55 -0400

I'm certainly open to suggestions..  Otherwise, I'll poke around and
see if I can come up w/ something.



-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: Mitch Bradley
Date: Sunday, April 20, 2008 - 11:42 am

The x86 architecture doesn't make this problem easy.

The conventional solution is to have the BIOS operate in real mode.  
When the kernel calls into the BIOS, it has to do a grotesque dance that 
involves jumping through a chain of several segments of different 
flavors, thus gradually shutting down the multi-tiered address 
translation mechanism.  Then, if the BIOS is actually operating in 
protected mode (which is necessary if it is larger than 64K, as all 
modern BIOSes are), it has to perform the inverse process, do the 
requested work, then go back into real mode to return to the kernel.  
The net result is that a "call" into the BIOS involves:

a) Copying the arguments to a real-mode register shadow array
b) Saving all the registers - general ones and a few special ones too
c) Far call to a linear-mapped code segment with an execution address in 
the first 1M of memory
d) Switching to a different stack
e) Turning off page translation
f) Switching from protected mode to real mode (or in some cases, V86 
mode instead, which requires an additional Task State Segment dance to 
set the IO permission mask)
g) Switching to a real-mode interrupt descriptor table

h) Executing an INT instruction

i) Performing the inverse of a - g inside the BIOS

j) Doing the requested work

k) Performing a - g again to get back into real mode

l) Executing an "iret" instruction

m) Performing the inverse of a - g to return to normal operation

The machinery that you need to do all that is predictably complex - 
extra segment descriptors that are set up just-so, several little code 
fragments that must be at special addresses in the first meg, additional 
stacks, a real-mode interrupt table at a fixed address, and several data 
save arrays.  That machinery has to be in assembly language, spanning 
several different instruction set modes.

Compared to that, I think that sharing one or two page directory entries 
at the very top of the virtual address space is pretty clean and ...
From: H. Peter Anvin
Date: Sunday, April 20, 2008 - 12:12 pm

[long rant about the x86 architecture]

It would be more useful if you described what the actual defined entry 
conditions from OpenFirmware look like, including whether they are 
well-defined for all OF implementations or only for OLPC.

	-hpa
--

From: Mitch Bradley
Date: Sunday, April 20, 2008 - 8:39 pm

Fair enough...

To get the second subquestion out of the way:  At the present time, on 
the x86 architecture, "all OF implementations" and "OLPC" are 
effectively the same.  I am unaware of any other x86 OFW deployments in 
current use.  There have been some in the past, on bespoke systems such 
as Network Appliance servers and at least one settop box, but those have 
fallen by the wayside as those companies have shifted over to commodity 
PC hardware.  The current market status quo is that x86 boards are 
primarily designed for Windows, and thus must run legacy BIOS, with some 
recent migration to EFI, neither of which are open source in the strong 
sense.  While I would like to see more OFW penetration into the larger 
x86 market, I don't really expect it.  x86 motherboard manufacturing is 
becoming more and more difficult as signal speeds increase, leading to a 
decline in the number of manufacturers.  The existing manufacturers 
depend on Windows for sales volume and their internal procedures and 
working knowledge are based on legacy BIOS.

Once upon a time, we had an OFW "binding" document that stipulated the 
interface conditions, with the intention of making that "standard" 
across all OFW-on-x86 systems.  However, by the time OLPC came around, 
there were no other systems to consider, so I felt free to make some 
changes in the interface.  I ended up choosing an ABI that resulted in a 
simple (in the sense of not much code, and no complex state transitions) 
interface with 2.6 Linux kernels.

The interface defined below is not inherently OLPC-specific - it would 
be suitable for any ia32 system that used OFW.  (At a higher level, the 
set of OFW callback functions is architecture-neutral; in this message I 
am focusing on the very low-level details of the ia32 ABI)

The system conditions for the OFW to Linux kernel transition are as follows:

a) OFW can load the Linux kernel from either bzimage format or ELF 
format (either uncompressed or zlib-compressed.)  If the ...
From: Yinghai Lu
Date: Sunday, April 20, 2008 - 9:54 pm

so you are assuming that your uncompressed vmlinux uses less than 8M of space?

you are supposed to check the bzImage to get uncompressed vmlinux size.

YH
--

From: Mitch Bradley
Date: Monday, April 21, 2008 - 1:22 am

The 0x800000 ramdisk load address is an OLPC-specific firmware 
implementation detail that could easily be changed without affecting 
anything else. I probably shouldn't have mentioned it because it isn't 
really an integral part of the interface "contract".

I certainly hope that the OLPC kernel never gets anywhere near that 
size.  The OLPC hardware has limited configurability, so it's not 
plausible that the kernel would grow that large to include a huge kit of 
drivers.  If the kernel file becomes large as a result of including the 
initramfs in the same file, the 0x800000 ramdisk load address won't 
apply (because there won't be a separate load of the initramfs file), so 
the kernel could extend way past that boundary with no problems.

If we get to the point where we do need huge kernels on OLPC, we can 
release a firmware upgrade along with the new OS.  We have mechanisms 
for coordinating firmware and OS upgrades.

If a new customer for OFW on x86 appears, I'll remember to float the 
boundary above the bzImage uncompressed size (assuming that the bzimage 
--

From: H. Peter Anvin
Date: Monday, April 21, 2008 - 4:36 am

So let me see here... you want the virtual address range [0xffc00000, 
0xfff00000) to be reserved for OFW, and you are prohibiting the kernel 


I do not like it, simply because it amounts to "initialize this 
otherwise zero-initialized piece of data without making any kind of 
reservations and blindly hope nothing else overwrites it."

I'm also troubled with the assumption that the kernel doesn't use PAE. 
I realize that this is not an issue for OLPC, but it certainly makes 
this a less-than-generic solution.

Having mapped page table entries which are not under kernel control is a 
very serious problem for PAT - PAT requires, by hardware specification, 
the kernel to eliminate all potential aliases with different mappings.

One way to deal with this, of course, is to save the firmware-provided 
PGD and only use it for OFW calls.  On the other hand, perhaps a better 
question is to what extent it is needed at all.

Furthermore, since you're using a nonstandard OFW interface (not 
compliant with the x86 OFW binding document), all of this should be 
called something like OLPC_OFW to make it clear that it's the OLPC variant.

If I had designed this, I would probably have used an SMI; since you 
have control over the firmware you can do that.  SMI saves the entire 
machine state including all the modes, cleans them all up for you, and 
puts it all back together at RSM time.  It is slow, of course, but it 
completely decouples the firmware and the OS, which is why it's used.

	-hpa

--

From: H. Peter Anvin
Date: Monday, April 21, 2008 - 6:09 am

Okay, stepping back a few steps, it's pretty clear that most of my 
objections aren't really an issue for Geode/OLPC; however, I *really* 
don't want others to pick it up as being "the" Open Firmware interface.

Within those constraints it makes sense to set up the PDEs in 
swapper_pg_dir and let them propagate using the normal mechanisms.

** This is assuming that your OF interface does not rely on a 1:1 
mapping of low memory being present at the time it makes a call.  If it 
*does*, then a separate page directory needs to be maintained for the OF 
class. **


Thus, I'm willing to accept this with these changes:

- Please name things specific to the interface (as opposed to Open 
Firmware in general, like the device tree) olpc_ofw or olpcfw, to denote 
that this is an OLPC-specific interface.  Thus, 
CONFIG_OLPC_OPEN_FIRMWARE or something along those lines.

- Make it explicit in Kconfig that OLPC_OPEN_FIRMWARE conflicts with 
X86_PAE, 64BIT, or X86_PAT.

- Change VMALLOC_END in include/asm-x86/pgtable_32.h so the kernel will 
know to avoid this virtual memory range.

- Add a memory region to arch/x86/mm/dump_pagetables.c.

	-hpa
--

From: Jordan Crouse
Date: Monday, April 21, 2008 - 8:05 am

/me puts on his coreboot hat

This is off topic slightly, but let it be known that the coreboot project
considers OFW a very valid option for x86 platforms.  A kernel that
worked happily with OFW would greatly encourage people to adopt it in
lieu of other BIOS / firmware solutions.

I return you to your previously scheduled debate.

Jordan

--

From: H. Peter Anvin
Date: Monday, April 21, 2008 - 7:58 am

The interface they are proposing is definitely not suitable for upward 
extension, for the reasons already mentioned.  However, they have units 
in the field, and the amount of changes required to support another 
interface should be relatively minor.

Hence my insistence that we don't promote it as *the* OFW interface, but 
*a* OFW interface.

	-hpa
--

From: H. Peter Anvin
Date: Sunday, April 20, 2008 - 12:13 pm

It pretty much depends on what the invariants look like.  The 
normal/clean way of doing this kind of thing is via a fixmap entry 
and/or ioremap.

	-hpa
--

From: Mitch Bradley
Date: Sunday, April 20, 2008 - 8:09 pm

Is your suggestion to change the filename from "ofw.c" to "ofw_32.c"?  
That seems like a good idea to me.

--

From: Yinghai Lu
Date: Sunday, April 20, 2008 - 8:15 pm

Yes.

BTW, why does OLPC need an OFW runtime service?
Why not just put the info in RAM with some signature, so
kernel/util just needs to look at the table if needed?

YH
--

From: Mitch Bradley
Date: Sunday, April 20, 2008 - 9:05 pm

In SPARC land, at least on SunOS and Solaris, it was very convenient for 
debugging to interrupt the OS with Stop-A and use OFW to inspect the 
system state.  That was especially handy for live crash analysis.  Dumps 
are useful as far as they go, but they often fail to capture detailed 
I/O device state.

I was hoping to do that on x86 too.  So far we (OLPC) haven't 
implemented a sysrq hook to enter OFW, but I haven't given up hope yet.  
It doesn't cost much to leave OFW around, but once you decide to eject 
it, you can't easily get it back.

Apple made the early decision to eject OFW and just keep a device tree 
table.  That decision was probably due to several factors, including the 
rather lame state of Apple's first OFW implementation and the complexity 
of their OS startup process at the time (which included "trampolining" 
to a 68000 emulator to run their legacy code).  Once they went down that 
path, the die was cast, and the PowerPC community got used to the "OFW 
--

From: David Miller
Date: Sunday, April 20, 2008 - 9:26 pm

From: Mitch Bradley <wmb@firmworks.com>

In most current SPARC systems, OFW is not usable and is completely
forgotten right after bootup in order to accommodate LDOMs and CPU
hotplug.

It's a better idea, anyways, to develop more pervasive and usable
in-kernel debugger facilities.  Then it doesn't matter if you have
"cool" firmware or not. :-)

--

From: Yinghai Lu
Date: Sunday, April 20, 2008 - 9:50 pm

Geode is using SMI to simulate the PCI config space; I wonder whether that could be a problem.

Later, will you have a 64-bit runtime service for 64-bit platforms, like UEFI?

YH
--

From: Mitch Bradley
Date: Monday, April 21, 2008 - 1:03 am

On the current OLPC system, we don't use the SMI-based PCI config space 
simulator.  The code for that "VSA" module is only partially open 
sourced (some of it is open, and some of it is just not available).  The 
parts of it for which we do have source can only be compiled with an old 
proprietary toolchain that is no longer available.

Instead of using the SMI-based simulation, we have added a PCI 
configuration access method in the kernel that supplies the necessary 
information from a table.  The code for that hardware-specific access 
method is roughly 40 lines of code plus a few data tables.

In the past few weeks, I have developed a rather complete Open 
Firmware-based reimplementation of the SMI PCI config hardware 
emulator.   All-told, it requires over 1000 lines.  It remains to be 
seen whether the complicated version will ultimately be deployed.  
Personally, I find it distasteful to use a lot of code to make the 
hardware pretend that it is something other than what it really is, when 
a much smaller driver works just as well.  The SMI-based emulator is 
quite difficult to understand and maintain, because the Geode SMI 
handling mechanism is complex, incompletely documented, and suffers from 
many of the multiple-mode-switches problems as real-mode to 

Possibly.   64-bit systems are not a problem per se - there have been 
64-bit OFW implementations for 64-bit architectures like SPARC and Alpha 
dating back to a long time ago.  The main issue from my point of view is 
--

From: Andres Salomon
Date: Monday, April 21, 2008 - 7:24 am

On Sun, 20 Apr 2008 18:05:26 -1000

I'm not actually convinced that we *do* want to keep OFW resident in memory,
especially given the memory tricks we need to play.  I also don't actually
like the OFW interface that we.  The debugging aspect of it was a
compelling argument up until a week ago (when kernel debuggers started
finally finding their way into the kernel).

However, until we clean up the promfs stuff, there's no chance of getting


-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: David Woodhouse
Date: Monday, April 21, 2008 - 8:54 am

I don't actually think that the debugging aspect was _ever_ a compelling
argument. It might have made it theoretically possible for _Mitch_ to
debug kernel problems, should he be inclined to do so -- but for the
rest of us mere mortals it's just a PITA trying to keep OpenFirmware

I see no reason why we shouldn't be able to create a 'flattened'
device-tree during early boot, like the PowerPC kernel does. And use it
thereafter, having quiesced OpenFirmware. Haven't we already been
working on unifying this between SPARC and PowerPC kernels?

I definitely don't think we need to play these tricks to keep
OpenFirmware resident while the kernel is running. Take a look at your
second patch -- it's _all_ just lookups in the device-tree, and you're
inventing a new way to do it instead of using the existing one.

-- 
dwmw2

--

From: Andres Salomon
Date: Monday, April 21, 2008 - 10:03 am

On Mon, 21 Apr 2008 16:54:13 +0100

Quite simply, it's a lot more work (*and* we have to play nice w/
sparc and ppc).  I had intended to eventually do it, but first I wanted
to get this stuff in for 2.6.26 so that we could at least boot upstream
kernels on XOs.

I was also hoping to not get into this conversation, but alas.. too


-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: David Woodhouse
Date: Monday, April 21, 2008 - 12:18 pm

It's only more work because we did it the wrong way in the first place.
If only someone had pointed it out at the time... :)

For interaction with device-tree properties in generic code, you should
be using the functions defined in <linux/of.h>.

Creating the static device-tree before we quiesce OpenFirmware surely

Is it only the things in your second patch which need to be made to
work? One of them was already working, by grubbing around in the BIOS
directly -- so all we need is the board revision, isn't it? Can we get
that from the EC for now?

-- 
dwmw2

--

From: Andres Salomon
Date: Monday, April 21, 2008 - 12:46 pm

On Mon, 21 Apr 2008 20:18:11 +0100

Yes, and if only we had an infinite number of kernel hackers who had time


We're not adding a device tree right now, we're adding a method for
querying OFW for information.  Eventually that information should be
obtained from a device tree.  However, that's going to take additional time,
and I'd like to get rid of some of these patches that we've been carrying


Well, no, it wasn't already working; that's the reason this whole
thread started.  It was crashing someone's machine.  That's why the OFW
interface, as imperfect as it is, is an _improvement_.



-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: David Woodhouse
Date: Monday, April 21, 2008 - 1:25 pm

You're proposing a new interface between bootloader and kernel as a
temporary hack just to work around that until we fix it properly?

That seems like overkill to me. I'd just go for is_geode() as you
suggested, and maybe PCI configuration tricks to detect the lack of VSA
so we can be _fairly_ sure it's OLPC before we poke at it?

Or why not try '!page_is_ram(0xffffffc0 >> PAGE_SHIFT)' if it's just to
avoid that particular warning? :)

-- 
dwmw2

--

From: Andres Salomon
Date: Monday, April 21, 2008 - 2:02 pm

On Mon, 21 Apr 2008 21:25:17 +0100


Okay, does anyone have a problem with this?

    




The OFW sig check requires an ioremap that is dangerous on non-OLPC
systems.  Long term, we should be getting the signature from the
device tree (/openprom/model), but for right now just limit the
check to only run on a subset of Geode (GX2/LX) systems.

Signed-off-by: Andres Salomon <dilinger@debian.org>
---
 arch/x86/kernel/olpc.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/olpc.c b/arch/x86/kernel/olpc.c
index 11670be..3e66722 100644
--- a/arch/x86/kernel/olpc.c
+++ b/arch/x86/kernel/olpc.c
@@ -211,6 +211,10 @@ static int __init olpc_init(void)
 {
 	unsigned char *romsig;
 
+	/* The ioremap check is dangerous; limit what we run it on */
+	if (!is_geode() || geode_has_vsa2())
+		return 0;
+
 	spin_lock_init(&ec_lock);
 
 	romsig = ioremap(0xffffffc0, 16);
-- 
1.5.4.4


-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: Jordan Crouse
Date: Monday, April 21, 2008 - 2:17 pm

-- 
Jordan Crouse
Systems Software Development Engineer 
Advanced Micro Devices, Inc.

--

From: David Woodhouse
Date: Monday, April 21, 2008 - 2:17 pm

That looks saner to me for now.

Acked-By: David Woodhouse <dwmw2@infradead.org>

-- 
dwmw2

--

From: Andrew Morton
Date: Monday, April 28, 2008 - 8:06 pm

geode_has_vsa2() is a fairly expensive-looking function and afacit only
needs to be evaluated once per boot.  Perhaps we should cache it somewhere?

--

From: Andres Salomon
Date: Monday, April 28, 2008 - 10:32 pm

On Mon, 28 Apr 2008 20:06:51 -0700

How about this?






This moves geode_has_vsa2 into a .c file, caches the result we get from
the VSA virtual registers, and causes the function to no longer be inline.

Signed-off-by: Andres Salomon <dilinger@debian.org>
---
 arch/x86/kernel/geode_32.c |   19 +++++++++++++++++++
 include/asm-x86/geode.h    |   11 +----------
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/geode_32.c b/arch/x86/kernel/geode_32.c
index 9dad6ca..1cb8225 100644
--- a/arch/x86/kernel/geode_32.c
+++ b/arch/x86/kernel/geode_32.c
@@ -161,6 +161,25 @@ void geode_gpio_setup_event(unsigned int gpio, int pair, int pme)
 }
 EXPORT_SYMBOL_GPL(geode_gpio_setup_event);
 
+static int has_vsa2 = -1;
+
+int geode_has_vsa2(void)
+{
+	if (has_vsa2 == -1) {
+		/*
+		 * The VSA has virtual registers that we can query for a
+		 * signature.
+		 */
+		outw(VSA_VR_UNLOCK, VSA_VRC_INDEX);
+		outw(VSA_VR_SIGNATURE, VSA_VRC_INDEX);
+
+		has_vsa2 = (inw(VSA_VRC_DATA) == VSA_SIG);
+	}
+
+	return has_vsa2;
+}
+EXPORT_SYMBOL_GPL(geode_has_vsa2);
+
 static int __init geode_southbridge_init(void)
 {
 	if (!is_geode())
diff --git a/include/asm-x86/geode.h b/include/asm-x86/geode.h
index 7154dc4..8a53bc8 100644
--- a/include/asm-x86/geode.h
+++ b/include/asm-x86/geode.h
@@ -185,16 +185,7 @@ static inline int is_geode(void)
 	return (is_geode_gx() || is_geode_lx());
 }
 
-/*
- * The VSA has virtual registers that we can query for a signature.
- */
-static inline int geode_has_vsa2(void)
-{
-	outw(VSA_VR_UNLOCK, VSA_VRC_INDEX);
-	outw(VSA_VR_SIGNATURE, VSA_VRC_INDEX);
-
-	return (inw(VSA_VRC_DATA) == VSA_SIG);
-}
+extern int geode_has_vsa2(void);
 
 /* MFGPTs */
 
-- 
1.5.5

--

From: Andrew Morton
Date: Tuesday, April 29, 2008 - 1:35 pm

On Tue, 29 Apr 2008 01:32:13 -0400

Looks sane.  Although one wonders if it should be cached as one of the
standard x86 feature bit thingies which show up in /proc/cpuinfo's 'flags'

nit:

--- a/arch/x86/kernel/geode_32.c
+++ a/arch/x86/kernel/geode_32.c
@@ -161,10 +161,10 @@ void geode_gpio_setup_event(unsigned int
 }
 EXPORT_SYMBOL_GPL(geode_gpio_setup_event);
 
-static int has_vsa2 = -1;
-
 int geode_has_vsa2(void)
 {
+	static int has_vsa2 = -1;
+
 	if (has_vsa2 == -1) {
 		/*
 		 * The VSA has virtual registers that we can query for a

--

From: Andres Salomon
Date: Tuesday, April 29, 2008 - 1:57 pm

On Tue, 29 Apr 2008 13:35:12 -0700

The VSA lives in a weird place between hardware and BIOS.  I'm not
really sure whether it's appropriate for it to be an x86_cap_flags (it
hadn't occurred to me), but I think of it more as BIOS.  Jordan, what do

Looks good.



--

From: H. Peter Anvin
Date: Monday, April 21, 2008 - 9:57 am

If so, would this apply to OLPC as well?

	-hpa
--

From: David Woodhouse
Date: Monday, April 21, 2008 - 11:54 am

Yes. The 'second patch' to which I refer is the one which makes OLPC
platform code use the calls in OpenFirmware... all of them gratuitous.

-- 
dwmw2

--

From: Andres Salomon
Date: Saturday, April 19, 2008 - 10:39 am

Prior to including OFW kernel support, we had to work around the lack of
OFW.  Once OFW support is added, we can switch to using it.  This cleans
up some pre-OFW model detection and OFW signature detection.

Note: this should be a bit nicer to non-OLPC hardware.

Signed-off-by: Andres Salomon <dilinger@debian.org>
---
 arch/x86/kernel/olpc.c |   43 +++++++++++++++++++++++++++++--------------
 1 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/olpc.c b/arch/x86/kernel/olpc.c
index 11670be..3a05683 100644
--- a/arch/x86/kernel/olpc.c
+++ b/arch/x86/kernel/olpc.c
@@ -190,11 +190,11 @@ EXPORT_SYMBOL_GPL(olpc_ec_cmd);
 static void __init platform_detect(void)
 {
 	size_t propsize;
-	u32 rev;
+	uint32_t rev;
 
 	if (ofw("getprop", 4, 1, NULL, "board-revision-int", &rev, 4,
 			&propsize) || propsize != 4) {
-		printk(KERN_ERR "ofw: getprop call failed!\n");
+		printk(KERN_ERR "olpc:  ofw getprop call failed!\n");
 		rev = 0;
 	}
 	olpc_platform_info.boardrev = be32_to_cpu(rev);
@@ -207,26 +207,43 @@ static void __init platform_detect(void)
 }
 #endif
 
-static int __init olpc_init(void)
+static int __init ofw_detect(void)
 {
-	unsigned char *romsig;
+	size_t propsize;
+	char romsig[20];
+	ofw_phandle phandle;
 
-	spin_lock_init(&ec_lock);
+	/* Fetch /openprom/model */
+	if (ofw("finddevice", 1, 1, "/openprom", &phandle) || phandle == ~0)
+		return -ENODEV;
 
-	romsig = ioremap(0xffffffc0, 16);
-	if (!romsig)
-		return 0;
+	if (ofw("getprop", 4, 1, phandle, "model", &romsig, sizeof(romsig),
+			&propsize) || propsize < 7)
+		return -ENODEV;
 
+	/* String should look something like "CL1   Q2D08  Q2D" */
 	if (strncmp(romsig, "CL1   Q", 7))
-		goto unmap;
+		return -ENODEV;
 	if (strncmp(romsig+6, romsig+13, 3)) {
-		printk(KERN_INFO "OLPC BIOS signature looks invalid.  "
+		printk(KERN_INFO "olpc:  BIOS signature looks invalid.  "
 				"Assuming not OLPC\n");
-		goto unmap;
+		return -ENODEV;
 	}
 
-	printk(KERN_INFO ...
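
For reference, the two strncmp() tests in ofw_detect() above can be tried outside the kernel. The following is a minimal userspace sketch and not part of the patch: olpc_sig_ok() and its length argument are hypothetical, and the 16-byte minimum is an assumption added here so that the second comparison never reads past the end of the property (the patch itself only rejects propsize < 7).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): returns 1 if the model
 * property looks like an OLPC signature such as "CL1   Q2D08  Q2D",
 * 0 otherwise.
 */
int olpc_sig_ok(const char *sig, size_t len)
{
	/* Assumption: require bytes 0..15 so sig+13..sig+15 are valid. */
	if (len < 16)
		return 0;

	/* Same prefix test as the patch: "CL1", three spaces, then 'Q'. */
	if (strncmp(sig, "CL1   Q", 7))
		return 0;

	/* Same consistency test as the patch: bytes 6..8 match 13..15. */
	if (strncmp(sig + 6, sig + 13, 3))
		return 0;

	return 1;
}
```

With the example string from the patch comment, olpc_sig_ok("CL1   Q2D08  Q2D", 16) passes both tests, while a string whose trailing three bytes differ from bytes 6..8 is rejected.
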
From: Andrew Morton
Date: Saturday, April 19, 2008 - 10:38 am

Do both ;)

The quick-n-easy version sounds suitable for now.
--

From: Andres Salomon
Date: Saturday, April 19, 2008 - 10:50 am

On Sat, 19 Apr 2008 10:38:33 -0700

Heh, I already had sent the nicer version.  If people have some fundamental
problem w/ it, I can send the quick-n-easy version.


-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: Jordan Crouse
Date: Monday, April 21, 2008 - 7:56 am

I prefer the nicer version.  It is not a good policy IMHO to wrap OLPC
specific code with is_geode() and friends.  Even by Geode standards, we've
abused the code greatly for the benefit of the Geode, and few of those
abuses would translate very well even to the general Geode community.  I 
would prefer that we use the is_olpc() and #ifdef wrappers to ensure
that the code that is exclusively OLPC stays exclusively OLPC.

Thanks,
Jordan

-- 
Jordan Crouse
Systems Software Development Engineer 
Advanced Micro Devices, Inc.

--

From: Andres Salomon
Date: Monday, April 21, 2008 - 8:05 am

On Mon, 21 Apr 2008 08:56:19 -0600

Yeah, like I said; the nicer version is the _correct_ way to do things.  I
just fear that the OFW code isn't ready for merging (see hpa's concerns).

The code is already #ifdef'd (the original reporter had enabled
CONFIG_OLPC), and the code in question is what determines what is_olpc()
should return.  is_geode() is just to narrow the scope of what hardware
the check runs on.




-- 
Need a kernel or Debian developer?  Contact me, I'm looking for contracts.
--

From: Jordan Crouse
Date: Monday, April 21, 2008 - 8:12 am

My bad, I missed the key points.  This still is dangerous on a generic
Geode, but at least if they encounter the problem, we can loudly proclaim
"Don't do that".

Jordan

-- 
Jordan Crouse
Systems Software Development Engineer 
Advanced Micro Devices, Inc.

--

From: Arjan van de Ven
Date: Saturday, April 19, 2008 - 11:21 am

On Fri, 18 Apr 2008 20:29:25 -0700


calling ioremap() on something which COULD be ram is... REALLY nasty.
The kernel has to mark that page uncached, for all users and mappings of that memory.
A second hard case then is to find out when the last ioremap() user has
released that memory (since there's several cases where different parts of the same
4K page can be ioremapped) before it can map it cached again. The good news is that
until this olpc patch got in, there were no users of this capability....
Instead of outright forbidding it though we added a warn_on to find out if the
assumption of no users was correct... 
seems it caught some new code which is trying to do this here.

this code should probably be a lot more careful and check that
1) there is no actual kernel memory or something else at this region
   (what if there's some other device there? this code could blow up)
2) the machine won't triple fault or otherwise throw tantrums if
   this hardcoded value is accessed (not automatic on x86!!)
3) it only runs if there's a really high degree of confidence that this really is
   an OLPC device.
or maybe
4) get this address from some other table or system provided resource
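
Point 1 above, combined with the earlier remark that caching attributes apply to whole 4K pages, can be sketched as a range check against a memory map. This is a userspace illustration only: struct mem_region, range_touches_ram() and the 4096-byte page size are assumptions made for the example, not kernel interfaces, and a real check would consult whatever map the platform actually provides.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size */

/* Hypothetical description of one RAM region: [start, end). */
struct mem_region {
	uint64_t start;
	uint64_t end;
};

/*
 * Returns 1 if any page covered by [phys, phys + len) overlaps a RAM
 * region, 0 otherwise.  The range is widened to page boundaries first,
 * because the uncached attribute affects the whole page.
 * Assumption: phys + len does not wrap around the 64-bit space.
 */
int range_touches_ram(uint64_t phys, uint64_t len,
		      const struct mem_region *ram, size_t nr)
{
	uint64_t start = phys & ~(SKETCH_PAGE_SIZE - 1);
	uint64_t end = (phys + len + SKETCH_PAGE_SIZE - 1) &
		       ~(SKETCH_PAGE_SIZE - 1);
	size_t i;

	for (i = 0; i < nr; i++) {
		/* Two half-open ranges overlap if each starts before the other ends. */
		if (start < ram[i].end && ram[i].start < end)
			return 1;
	}
	return 0;
}
```

For the 16 bytes at 0xffffffc0 that the old olpc_init() mapped, the widened range is 0xfffff000..0x100000000, so RAM anywhere in that last page would be flagged even if it does not overlap the 16 bytes themselves.
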




-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Jiri Slaby
Date: Sunday, April 20, 2008 - 4:29 am

Hi, I'm not sure what caused this.

LANG=en strace -fo strace_gcc.txt  gcc -Wp,-MD,drivers/usb/class/.usblp.o.d 
-nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -D__KERNEL__ 
-Iinclude -Iinclude2 -I/home/l/latest/xxx/include -include 
include/linux/autoconf.h -I/home/l/latest/xxx/drivers/usb/class 
-Idrivers/usb/class -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
-fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -O2 
-fno-stack-protector -m64 -march=core2 -mno-red-zone -mcmodel=kernel 
-funit-at-a-time -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 
-DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare 
-fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow 
-I/home/l/latest/xxx/include/asm-x86/mach-default -Iinclude/asm-x86/mach-default 
-fno-omit-frame-pointer -fno-optimize-sibling-calls -g 
-Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" 
-D"KBUILD_BASENAME=KBUILD_STR(usblp)"  -D"KBUILD_MODNAME=KBUILD_STR(usblp)" 
/home/l/latest/xxx/drivers/usb/class/usblp.c -S -o usblp.s
/home/l/latest/xxx/drivers/usb/class/usblp.c: In function 'usblp_submit_read':
/home/l/latest/xxx/drivers/usb/class/usblp.c:977: internal compiler error: 
Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugs.opensuse.org/> for instructions.




strace_gcc.txt:
http://www.fi.muni.cz/~xslaby/sklad/strace_gcc.txt

preprocessor output available here:
http://www.fi.muni.cz/~xslaby/sklad/usblp.E

A reboot fixed it. It happened after a few suspend/resume cycles. The preprocessor
output is identical to the one produced after the reboot. Now, the strace looks like:
5341  mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x7f362e004000
5341  mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
0) = 0x7f362df04000
5341  brk(0x1964000)                    = 0x1964000
5341  brk(0x194c000)                    = 0x194c000
5341  ...
From: Jiri Slaby
Date: Monday, April 21, 2008 - 1:31 am

Hi,

$ ls /usr/share/man/cat3readlin
Segmentation fault

[the file doesn't exist.]
This is probably the same bug as in -rc8-mm2 I reported here:
http://www.opensubscriber.com/message/linux-kernel@vger.kernel.org/9008289.html

general protection fault: 0000 [1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/net/eth0/statistics/collisions
CPU 0
Modules linked in: test ipv6 tun bitrev arc4 ecb crypto_blkcipher cryptomgr 
crypto_algapi ath5k mac80211 crc32 sr_mod usbhid ohci1394 rtc_cmos hid rtc_core 
cfg80211 ieee1394 cdrom ehci_hcd rtc_lib ff_memless floppy evdev
Pid: 24838, comm: man Not tainted 2.6.25-mm1_64 #403
RIP: 0010:[<ffffffff802aca27>]  [<ffffffff802aca27>] __d_lookup+0x97/0x160
RSP: 0018:ffff8100337d1b98  EFLAGS: 00010206
RAX: 00f0000000000000 RBX: 00f0000000000000 RCX: 0000000000000012
RDX: ffff8100200830e0 RSI: ffff8100337d1ca8 RDI: ffff810079195708
RBP: ffff8100337d1bf8 R08: ffff8100337d1ca8 R09: 0000000000000000
R10: 000000000000013d R11: 0000000000000246 R12: ffff8100200830c8
R13: 00000000198eaed5 R14: ffff810079195708 R15: ffff8100337d1bc8
FS:  00007f447b5c06f0(0000) GS:ffffffff80664000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000001484f88 CR3: 000000005fac4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process man (pid: 24838, threadinfo ffff8100337d0000, task ffff810034418000)
Stack:  ffff8100337d1ca8 000000000000000b ffff810079195710 0000000b792561a0
  ffff81003136600f ffffffff802f9073 00f0000000000000 0000000000000001
  ffff8100337d1e48 ffff8100337d1e48 ffff8100337d1ca8 ffff8100337d1cb8
Call Trace:
  [<ffffffff802f9073>] ? ext3_lookup+0xc3/0x100
  [<ffffffff802a1e85>] do_lookup+0x35/0x220
  [<ffffffff802a22c2>] __link_path_walk+0x252/0x1010
  [<ffffffff802b20ba>] ? mntput_no_expire+0x2a/0x140
  [<ffffffff802a30ee>] path_walk+0x6e/0xe0
  [<ffffffff802a33b2>] do_path_lookup+0xa2/0x240
  ...
From: Al Viro
Date: Monday, April 21, 2008 - 2:06 am

On Mon, Apr 21, 2008 at 10:31:40AM +0200, Jiri Slaby wrote:

        hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
                struct qstr *qstr;

                if (dentry->d_name.hash != hash)
                        continue;

walking into node == (struct hlist_node *)0x00f0000000000000...
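
A note on why this shows up as a general protection fault rather than a page fault: assuming the usual 48-bit virtual addressing on x86-64, bits 63..47 of an address must all be equal, and 0x00f0000000000000 (the value in RAX and RBX in the oops) violates that. The sketch below is a userspace illustration; is_canonical48() is a hypothetical name, not a kernel function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if addr is canonical under 48-bit
 * virtual addressing, i.e. bits 63..47 are all zeros or all ones.
 */
int is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits that must agree */

	return top == 0 || top == 0x1ffff;
}
```

The kernel pointers elsewhere in the register dump (for example 0xffff8100337d1ca8) pass this test, while the bogus node pointer does not.
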

--

From: Jiri Slaby
Date: Monday, April 21, 2008 - 2:37 am

Yup, true. The last oops was in memcmp, a few lines below.

BTW, it's 100% reproducible once it has happened, but a reboot fixes it. Are there
any tests I should run (memtest, some printks stuck in somewhere)?
--

From: Al Viro
Date: Monday, April 21, 2008 - 2:45 am

Well, if the list has such a turd in it, you'll certainly hit it every time
you walk that list, so 100% reproducible is not surprising.

How well is it reproducible from fresh boot?
--

From: Jiri Slaby
Date: Monday, April 21, 2008 - 2:59 am

It takes a few days of suspend/resume cycles. This one was booted 12 hours ago, with
one suspend/resume so far. I will keep an eye on it and keep you informed.
--

From: Rafael J. Wysocki
Date: Monday, April 21, 2008 - 6:42 am

I think that's exactly the same problem I reported here:
http://lkml.org/lkml/2008/4/20/182
for 2.6.25-git2, so it hit the mainline and seems to be related to RCU.

Thanks,
Rafael
--

From: Matthew Wilcox
Date: Monday, April 21, 2008 - 10:23 am

Shall we see if we can catch it earlier?  I have no idea if this will
help ... I haven't even booted it on a test machine yet ;-)  If I got
something wrong, it'll BUG() pretty early.

diff --git a/include/linux/list.h b/include/linux/list.h
index 75ce2cb..238ca1e 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -724,10 +724,17 @@ static inline int hlist_empty(const struct hlist_head *h)
 	return !h->first;
 }
 
+#ifdef CONFIG_DEBUG_LIST
+extern void hlist_check(struct hlist_node *n);
+#else
+#define hlist_check(n)		do { } while (0)
+#endif
+
 static inline void __hlist_del(struct hlist_node *n)
 {
 	struct hlist_node *next = n->next;
 	struct hlist_node **pprev = n->pprev;
+	hlist_check(n);
 	*pprev = next;
 	if (next)
 		next->pprev = pprev;
@@ -785,6 +792,7 @@ static inline void hlist_replace_rcu(struct hlist_node *old,
 {
 	struct hlist_node *next = old->next;
 
+	hlist_check(old);
 	new->next = next;
 	new->pprev = old->pprev;
 	smp_wmb();
@@ -840,6 +848,7 @@ static inline void hlist_add_head_rcu(struct hlist_node *n,
 static inline void hlist_add_before(struct hlist_node *n,
 					struct hlist_node *next)
 {
+	hlist_check(next);
 	n->pprev = next->pprev;
 	n->next = next;
 	next->pprev = &n->next;
@@ -849,6 +858,7 @@ static inline void hlist_add_before(struct hlist_node *n,
 static inline void hlist_add_after(struct hlist_node *n,
 					struct hlist_node *next)
 {
+	hlist_check(next);
 	next->next = n->next;
 	n->next = next;
 	next->pprev = &n->next;
@@ -878,6 +888,7 @@ static inline void hlist_add_after(struct hlist_node *n,
 static inline void hlist_add_before_rcu(struct hlist_node *n,
 					struct hlist_node *next)
 {
+	hlist_check(next);
 	n->pprev = next->pprev;
 	n->next = next;
 	smp_wmb();
@@ -906,6 +917,7 @@ static inline void hlist_add_before_rcu(struct hlist_node *n,
 static inline void hlist_add_after_rcu(struct hlist_node *prev,
 				       struct hlist_node *n)
 {
+	hlist_check(prev);
 	n->next = prev->next;
 ...
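
The body of hlist_check() is not shown in the portion of the patch above. As a rough userspace sketch of what a CONFIG_DEBUG_LIST-style check could verify, the following tests that a node's neighbours point back at it. hlist_node_ok() is a hypothetical name and this is not the implementation from the patch. It can only compare pointers that are themselves dereferenceable, so a wild next pointer like the one in the oops would still fault inside the check, just at the point of list modification rather than during a later lookup.

```c
#include <stddef.h>

/* Same layout as the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Insert n at the front of the list headed by h. */
void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Hypothetical consistency check: returns 1 if the links around n
 * agree with each other, 0 if corruption is detected.
 */
int hlist_node_ok(struct hlist_node *n)
{
	/* Whatever points at us (head or previous node) must point at n. */
	if (!n->pprev || *n->pprev != n)
		return 0;

	/* Our successor, if any, must point back at our next field. */
	if (n->next && n->next->pprev != &n->next)
		return 0;

	return 1;
}
```
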