2.6.23-rc3-mm1: m32r defconfig compile error

From: Andrew Morton
Date: Wednesday, August 22, 2007 - 2:06 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc3/2.6.23-rc3-mm1/

- git-ixgbe.patch got dropped - git-net.patch destroyed it

- then git-net got dropped as it doesn't work

- the -mm import-to-git engine still isn't working



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.



Changes since 2.6.23-rc2-mm2:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-agpgart.patch
 git-audit-master.patch
 git-cpufreq.patch
 git-powerpc.patch
 git-dma.patch
 git-drm.patch
 git-dvb.patch
 git-hwmon.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ieee1394.patch
 ...
From: Randy Dunlap
Date: Wednesday, August 22, 2007 - 11:03 am

allyesconfig on x86_64 says:

kernel/unwind.c:1016:31: error: undefined identifier '__builtin_labs'
kernel/unwind.c:1232:25: error: undefined identifier '__builtin_labs'

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Andrew Morton
Date: Wednesday, August 22, 2007 - 11:32 am

On Wed, 22 Aug 2007 11:03:48 -0700

One wonders why x86_64-mm-unwinder.patch has an open-coded call to
__builtin_labs(), when include/linux/kernel.h:abs() should do a fine job.

And what's this stuff, anyway?

+typedef unsigned long uleb128_t;
+typedef   signed long sleb128_t;
+#define sleb128abs __builtin_labs

unsigned and signed little-endian 128-bit types?  Nope, they're 32-bit or
64-bit.   All very mysterious.
-

From: Andi Kleen
Date: Wednesday, August 22, 2007 - 12:38 pm

dwarf2 uses a magic compressing encoding for numbers that uses fewer bytes
for small numbers and more bytes for larger numbers. These are the base
types for this.

It's similar to fs/reiser4/dscale.h in your tree.
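
For readers unfamiliar with the encoding described above: it is LEB128
(Little Endian Base 128), which stores 7 payload bits per byte and uses the
top bit of each byte as a "more bytes follow" flag. What follows is a minimal
user-space sketch of decoding it, not the code from
x86_64-mm-unwinder.patch; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value; returns the number of bytes consumed. */
static size_t uleb128_decode(const uint8_t *p, uint64_t *out)
{
	uint64_t value = 0;
	unsigned shift = 0;
	size_t n = 0;
	uint8_t byte;

	assert(p != NULL && out != NULL);
	do {
		byte = p[n++];
		value |= (uint64_t)(byte & 0x7f) << shift;	/* low 7 bits are payload */
		shift += 7;
	} while ((byte & 0x80) && shift < 64);		/* top bit: more bytes follow */

	*out = value;
	return n;
}

/* Decode a signed LEB128 value; returns the number of bytes consumed. */
static size_t sleb128_decode(const uint8_t *p, int64_t *out)
{
	uint64_t value = 0;
	unsigned shift = 0;
	size_t n = 0;
	uint8_t byte;

	assert(p != NULL && out != NULL);
	do {
		byte = p[n++];
		value |= (uint64_t)(byte & 0x7f) << shift;
		shift += 7;
	} while ((byte & 0x80) && shift < 64);

	/* Sign-extend when bit 6 of the final byte is set. */
	if (shift < 64 && (byte & 0x40))
		value |= ~(uint64_t)0 << shift;

	*out = (int64_t)value;
	return n;
}
```

Small values fit in a single byte, which is why the in-memory base types
are plain longs rather than anything 128 bits wide.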

-Andi

-

From: Randy Dunlap
Date: Wednesday, August 22, 2007 - 12:17 pm

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Andi Kleen
Date: Wednesday, August 22, 2007 - 1:53 pm

Hmm I use the same compiler from SUSE10.2 and it works for me (with both
mm and only my tree applied) 

Ok mm fails with some errors in the wireless drivers but with 
wireless disabled it compiles.

When you compile a simple test program like

#include <stdio.h>
int main(void) { printf("%ld\n", __builtin_labs(-1)); return 0; }


Andrew, I actually checked that, and the abs() there is just abs(),
not labs(). So it wouldn't work on a 64-bit platform.

We could open-code it of course, but __builtin_labs really should be
there.
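
An open-coded version, which is mentioned above as an option, could look
like the sketch below. The helper name is hypothetical and this is not what
the patch does; it only illustrates a long-sized absolute value that
depends neither on __builtin_labs nor on an int-only abs().

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: absolute value of a long, returned as unsigned long.
 * The negation is done in unsigned arithmetic so that LONG_MIN does not
 * trigger signed overflow.
 */
static unsigned long long_abs_sketch(long v)
{
	return v < 0 ? 0UL - (unsigned long)v : (unsigned long)v;
}

/* Usage example. */
static void long_abs_selfcheck(void)
{
	assert(long_abs_sketch(-1L) == 1UL);
	assert(long_abs_sketch(LONG_MIN) == (unsigned long)LONG_MAX + 1UL);
}
```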

-Andi
-

From: Rafael J. Wysocki
Date: Wednesday, August 22, 2007 - 9:33 am

Apparently, the b43 driver is expecting another version of mac80211.

This patch fixes the compilation, but I'm not sure about the
functionality. ;-)

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/net/wireless/b43/main.c |    6 ++----
 drivers/net/wireless/b43/xmit.c |   10 ++++------
 2 files changed, 6 insertions(+), 10 deletions(-)

Index: linux-2.6.23-rc3-mm1/drivers/net/wireless/b43/main.c
===================================================================
--- linux-2.6.23-rc3-mm1.orig/drivers/net/wireless/b43/main.c
+++ linux-2.6.23-rc3-mm1/drivers/net/wireless/b43/main.c
@@ -1189,8 +1189,7 @@ static void b43_write_probe_resp_plcp(st
 
 	plcp.data = 0;
 	b43_generate_plcp_hdr(&plcp, size + FCS_LEN, rate);
-	dur = ieee80211_generic_frame_duration(dev->wl->hw,
-					       dev->wl->if_id, size,
+	dur = ieee80211_generic_frame_duration(dev->wl->hw, size,
 					       B43_RATE_TO_BASE100KBPS(rate));
 	/* Write PLCP in two parts and timing for packet transfer */
 	tmp = le32_to_cpu(plcp.data);
@@ -1246,8 +1245,7 @@ static u8 *b43_generate_probe_resp(struc
 	/* Set the frame control. */
 	hdr->frame_control = cpu_to_le16(IEEE80211_FTYPE_MGMT |
 					 IEEE80211_STYPE_PROBE_RESP);
-	dur = ieee80211_generic_frame_duration(dev->wl->hw,
-					       dev->wl->if_id, *dest_size,
+	dur = ieee80211_generic_frame_duration(dev->wl->hw, *dest_size,
 					       B43_RATE_TO_BASE100KBPS(rate));
 	hdr->duration_id = dur;
 
Index: linux-2.6.23-rc3-mm1/drivers/net/wireless/b43/xmit.c
===================================================================
--- linux-2.6.23-rc3-mm1.orig/drivers/net/wireless/b43/xmit.c
+++ linux-2.6.23-rc3-mm1/drivers/net/wireless/b43/xmit.c
@@ -220,7 +220,6 @@ static void generate_txhdr_fw4(struct b4
 	} else {
 		int fbrate_base100kbps = B43_RATE_TO_BASE100KBPS(rate_fb);
 		txhdr->dur_fb = ieee80211_generic_frame_duration(dev->wl->hw,
-								 dev->wl->if_id,
 								 fragment_len,
 								 fbrate_base100kbps);
 	}
@@ -311,16 ...
From: Michael Buesch
Date: Wednesday, August 22, 2007 - 2:56 pm

There seems to be a screwup somehow.
These mac80211 API functions were recently changed to include
the additional parameter. So it seems you carry an old version of mac80211.

-- 
Greetings Michael.
-

From: John W. Linville
Date: Wednesday, August 22, 2007 - 7:56 pm

I think this happened because Andrew dropped Dave M.'s net tree.
Since mac80211 has been getting merged through Dave M., crucial bits
are missing which then break the bits from wireless-dev.

Andrew, if you find that you need to drop git-net again then I'll be
happy to provide you with a wireless-dev patch that does not depend on
Dave's tree.  The mm-master branch in wireless-dev has dropped those
patches which have gone to Dave M. in the hopes of avoiding conflicts.
Dependencies are another matter... :-)

John
-- 
John W. Linville
linville@tuxdriver.com
-

From: Andrew Morton
Date: Thursday, August 23, 2007 - 12:07 am

Hopefully git-net is less wrecked than it was yesterday.  If things still
play up I'll have a go at bodging it up a bit, perhaps by disabling
netconsole.  (Although now I think about it, the netconsole bug was mainly
an ill-advised BUG_ON, fixable by using WARN_ON instead).



-

From: Mariusz Kozlowski
Date: Wednesday, August 22, 2007 - 10:26 am

Hello,

	Got that on my laptop:

------------------------
| Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                    double unlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                  initialize held:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 bad unlock order:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
  --------------------------------------------------------------------------
              recursive read-lock:             |  ok  |             |  ok  |
           recursive read-lock #2:             |  ok  |             |  ok  |
            mixed read-write-lock:             |  ok  |             |  ok  |
            mixed write-read-lock:             |  ok  |             |  ok  |
  --------------------------------------------------------------------------
     hard-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
     soft-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
     hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
     soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
       sirq-safe-A => hirqs-on/12:  ok  |  ok  |irq event stamp: 452
hardirqs last  enabled at (452): [<c026ff85>] irqsafe2A_rlock_12+0x8d/0xcc
hardirqs last disabled at (451): [<c0115ce4>] cpu_clock+0xe/0x49
softirqs last  enabled at (448): ...
From: Frederik Deweerdt
Date: Wednesday, August 22, 2007 - 2:27 pm

Hi Mariusz,

FWIW, reverting softlockup-use-cpu_clock-instead-of-sched_clock.patch
fixes the problem here.

Regards,
Frederik
-

From: Rafael J. Wysocki
Date: Wednesday, August 22, 2007 - 10:30 am

I get this during resume from suspend to RAM and during hibernation:

WARNING: at /home/rafael/src/mm/linux-2.6.23-rc3-mm1/arch/x86_64/kernel/smp.c:380 smp_call_function_single()

Call Trace:
 [<ffffffff8021a97d>] smp_call_function_single+0x52/0xff
 [<ffffffff8022e703>] task_rq_lock+0x3d/0x6f
 [<ffffffff80230fc0>] set_cpus_allowed+0xbf/0xcc
 [<ffffffff802141ab>] sc_freq_event+0x5f/0x63
 [<ffffffff80431c38>] notifier_call_chain+0x33/0x65
 [<ffffffff8024c11f>] __srcu_notifier_call_chain+0x4b/0x69
 [<ffffffff8024c14c>] srcu_notifier_call_chain+0xf/0x11
 [<ffffffff803bcfa4>] cpufreq_resume+0x131/0x157
 [<ffffffff8038151c>] __sysdev_resume+0x34/0x73
 [<ffffffff80381b76>] sysdev_resume+0x1f/0x61
 [<ffffffff803865e8>] device_power_up+0x9/0x10
 [<ffffffff80256620>] suspend_devices_and_enter+0xbf/0xf7
 [<ffffffff802567bb>] enter_state+0x163/0x1e5
 [<ffffffff802568e1>] state_store+0xa4/0xc2
 [<ffffffff802d7bc5>] subsys_attr_store+0x31/0x33
 [<ffffffff802d7e8d>] sysfs_write_file+0xe0/0x11c
 [<ffffffff80293b77>] vfs_write+0xc7/0x150
 [<ffffffff802940f8>] sys_write+0x47/0x70
 [<ffffffff8020bdce>] system_call+0x7e/0x83

Apparently, smp_call_function_single() is unhappy, because it's called with
interrupts disabled by sc_freq_event() executed (as a notifier) by
cpufreq_resume().  However, cpufreq_resume() is always run with one CPU on
line, so all this stuff should be handled differently.  Oh, dear.
-

From: Mariusz Kozlowski
Date: Wednesday, August 22, 2007 - 12:04 pm

Hello,

	Got that on imac g3.

  CC      kernel/kgdb.o
kernel/kgdb.c: In function 'kgdb_handle_exception':
kernel/kgdb.c:940: error: invalid lvalue in unary '&'
kernel/kgdb.c:940: warning: type defaults to 'int' in declaration of '_o_'
kernel/kgdb.c:940: error: invalid lvalue in unary '&'
kernel/kgdb.c:940: warning: type defaults to 'int' in declaration of '_n_'
kernel/kgdb.c:940: error: invalid lvalue in unary '&'
kernel/kgdb.c:940: error: invalid lvalue in unary '&'
kernel/kgdb.c:940: error: invalid lvalue in unary '&'
kernel/kgdb.c:940: warning: type defaults to 'int' in declaration of 'type name'
make[1]: *** [kernel/kgdb.o] Blad 1
make: *** [kernel] Blad 2

Regards,

	Mariusz


From: Andrew Morton
Date: Wednesday, August 22, 2007 - 12:47 pm

On Wed, 22 Aug 2007 21:04:28 +0200

I'm not surprised.

	while (cmpxchg(&atomic_read(&debugger_active), 0, (procid + 1)) != 0) {

a) cmpxchg isn't available on all architectures

b) we can't just go and take the address of atomic_read()'s return value!

c) that's pretty ugly-looking stuff anyway.
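
To illustrate point b): a compare-and-exchange has to operate on the atomic
object itself, not on the address of a value that was read out of it. The
sketch below uses C11 <stdatomic.h> as a user-space stand-in; in kernel code
one would presumably reach for atomic_cmpxchg() on the atomic_t instead. The
function name is invented, and this is not the fix that went into the kgdb
tree.

```c
#include <assert.h>
#include <stdatomic.h>

/* 0 means no CPU currently owns the debugger. */
static atomic_int debugger_active;

/*
 * Try to become the master CPU. Returns 1 if this caller won the race,
 * 0 if some other CPU already holds the debugger.
 */
static int try_claim_debugger(int procid)
{
	int expected = 0;

	assert(procid >= 0);
	/* The exchange is applied to the atomic variable, never to a copy. */
	return atomic_compare_exchange_strong(&debugger_active, &expected,
					      procid + 1);
}
```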
-

From: Jason Wessel
Date: Wednesday, August 22, 2007 - 3:44 pm

Against the tip of the kernel + kgdb patches this config builds.  I
wonder if it is the compiler, or if the macros for atomic_read or cmpxchg have
changed in the -mm tree.  Perhaps it is not relevant though if you
It was available for all the archs that the kgdb had been implemented on 
Perhaps yes, perhaps no I guess it depends on what actually gets 
generated...  In the past the intent of this was to guard for the race 
to be the master processor and looked like some attempt to do it 

Perhaps there is a cleaner way to do the same thing and avoid the
cmpxchg altogether.  I used the attached patch to eliminate the
cmpxchg operation.


Jason.
From: Andrew Morton
Date: Wednesday, August 22, 2007 - 4:53 pm

On Wed, 22 Aug 2007 17:44:12 -0500

eek.  We're in the process of hunting down and eliminating exactly this
construct.  There have been cases where the compiler cached the
atomic_read() result in a register, turning the above into an infinite
loop.

Plus we should never add power-burners like that into the kernel anyway. 
That loop should have a cpu_relax() in it.  Which will also fix the
compiler problem described above.

Thirdly, please always add a newline when coding statements like that:

	while (expr())
		;
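
A polling loop that follows the advice above might be sketched as follows.
This is a user-space approximation under stated assumptions: relax() is
only a compiler barrier written with GCC/Clang-style inline asm and stands
in for the kernel's cpu_relax(); the function names and the spin limit are
invented so that the example terminates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for cpu_relax(): a compiler barrier only (GCC/Clang syntax). */
static inline void relax(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/*
 * Poll *flag until it reads 0, giving up after max_spins unsuccessful
 * polls. Returns 1 if the flag was seen clear, 0 on give-up.
 */
static int wait_until_clear(atomic_int *flag, unsigned long max_spins)
{
	assert(flag != NULL);
	/* atomic_load() performs a fresh read on every pass. */
	while (atomic_load(flag) != 0) {
		if (max_spins-- == 0)
			return 0;
		relax();
	}
	return 1;
}
```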


-

From: Jason Wessel
Date: Wednesday, August 22, 2007 - 8:25 pm

The other instances I found of the same problem in the kgdb core are 
fixed too.

I merged all the changes into the for_mm branch in the kgdb git tree.

Thanks,
Jason.
-

From: Pete/Piet Delaney
Date: Wednesday, August 29, 2007 - 4:43 pm

Where is the kgdb git tree?
-

From: Pete/Piet Delaney
Date: Wednesday, August 29, 2007 - 5:05 pm

Trying:

git clone
http://master.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb.git
-

From: Pete/Piet Delaney
Date: Wednesday, August 29, 2007 - 6:19 pm

Why am I getting this when I do:

git clone
http://master.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb.git

----------------------------------------------------------------------------
error: Couldn't get
http://master.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb.git/refs/tags...
for tags/v2.6.11
The requested URL returned error: 404
error: Could not interpret tags/v2.6.11 as something to pull
rm: cannot remove directory
`/nethome/piet/Src/linux/git/jwessel/linux-2.6-kgdb/.git/clone-tmp':
Directory not empty
/nethome/piet/Src/linux/git/jwessel$
----------------------------------------------------------------------------

We are getting a problem with VMware where kernel text in the scheduler
is getting whacked: four null bytes are written into the code. Thought I'd use
the current linux-2.6-kgdb.git tree and possibly the CONFIG_DEBUG_RODATA
patch to make kernel text read-only:

 https://www.x86-64.org/pipermail/patches/2007-March/003666.html

I thought the kernel text was RO and gdb had to disable it to
insert a breakpoint.
-

From: Randy Dunlap
Date: Wednesday, August 29, 2007 - 6:38 pm

See the URLs at the top of
http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=summary


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Jason Wessel
Date: Wednesday, August 29, 2007 - 7:07 pm

I have only ever used:

git clone 
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb.git


Jason.
-

From: Jason Wessel
Date: Wednesday, August 29, 2007 - 7:13 pm

If you are going to make all the kernel text RO, then you are going to 
have to add some code to the kgdb write memory so as to unprotect a 
given page or all the breakpoint writes are going to fail.  
Alternatively you can use HW breakpoints.  But I have no idea whether your
VMware simulated HW emulates HW breakpoint registers or not.

Jason.
-

From: Mariusz Kozlowski
Date: Wednesday, August 22, 2007 - 12:16 pm

Hello,

	Got that on athlon x86_32:

  CC [M]  drivers/net/wireless/rt2x00mac.o
drivers/net/wireless/rt2x00mac.c: In function `rt2x00mac_tx_rts_cts':
drivers/net/wireless/rt2x00mac.c:61: warning: passing arg 2 of `ieee80211_ctstoself_get' makes pointer from integer without a cast
drivers/net/wireless/rt2x00mac.c:61: warning: passing arg 3 of `ieee80211_ctstoself_get' makes integer from pointer without a cast
drivers/net/wireless/rt2x00mac.c:61: warning: passing arg 4 of `ieee80211_ctstoself_get' makes pointer from integer without a cast
drivers/net/wireless/rt2x00mac.c:61: warning: passing arg 5 of `ieee80211_ctstoself_get' from incompatible pointer type
drivers/net/wireless/rt2x00mac.c:61: error: too many arguments to function `ieee80211_ctstoself_get'
drivers/net/wireless/rt2x00mac.c:65: warning: passing arg 2 of `ieee80211_rts_get' makes pointer from integer without a cast
drivers/net/wireless/rt2x00mac.c:65: warning: passing arg 3 of `ieee80211_rts_get' makes integer from pointer without a cast
drivers/net/wireless/rt2x00mac.c:65: warning: passing arg 4 of `ieee80211_rts_get' makes pointer from integer without a cast
drivers/net/wireless/rt2x00mac.c:65: warning: passing arg 5 of `ieee80211_rts_get' from incompatible pointer type
drivers/net/wireless/rt2x00mac.c:65: error: too many arguments to function `ieee80211_rts_get'
make[3]: *** [drivers/net/wireless/rt2x00mac.o] Error 1
make[2]: *** [drivers/net/wireless] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2

Regards,

	Mariusz



Linux localhost 2.6.23-rc3-mm1 #2 PREEMPT Wed Aug 22 19:45:30 CEST 2007 i686 AMD Athlon(tm) XP 1700+ AuthenticAMD GNU/Linux
 
Gnu C                  3.4.6
Gnu make               3.81
binutils               2.17
util-linux             2.12r
mount                  2.12r
module-init-tools      3.2.2
e2fsprogs              1.39
nfs-utils              1.0.6
Linux C Library        2.5
Dynamic linker (ldd)   2.5
Procps                 3.2.7
Net-tools              1.60
Kbd ...
From: Ivo van Doorn
Date: Wednesday, August 22, 2007 - 12:31 pm

This has been fixed for quite some time already.
John, I can't check this myself now, but which rt2x00
patches have gone into the -mm tree? Since I believe
the patch that changed ieee80211_ctstoself_get was
followed by a patch to fix rt2x00 within the same series...

Ivo
-

From: Mariusz Kozlowski
Date: Wednesday, August 22, 2007 - 12:54 pm

Ok. Thanks. What about this one?

  CC [M]  drivers/net/wireless/zd1211rw-mac80211/zd_mac.o
drivers/net/wireless/zd1211rw-mac80211/zd_mac.c: In function `zd_op_erp_ie_changed':
drivers/net/wireless/zd1211rw-mac80211/zd_mac.c:822: error: `IEEE80211_ERP_CHANGE_PREAMBLE' undeclared (first use in this function)
drivers/net/wireless/zd1211rw-mac80211/zd_mac.c:822: error: (Each undeclared identifier is reported only once
drivers/net/wireless/zd1211rw-mac80211/zd_mac.c:822: error: for each function it appears in.)
drivers/net/wireless/zd1211rw-mac80211/zd_mac.c: At top level:
drivers/net/wireless/zd1211rw-mac80211/zd_mac.c:844: error: unknown field `erp_ie_changed' specified in initializer
drivers/net/wireless/zd1211rw-mac80211/zd_mac.c:844: warning: initialization from incompatible pointer type
make[4]: *** [drivers/net/wireless/zd1211rw-mac80211/zd_mac.o] Error 1
make[3]: *** [drivers/net/wireless/zd1211rw-mac80211] Error 2
make[2]: *** [drivers/net/wireless] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2

Regards,

	Mariusz
-

From: Ivo van Doorn
Date: Wednesday, August 22, 2007 - 1:12 pm

I'm not a zd1211rw developer, but from a quick look at the patch series it seems
that the mac80211 version in -mm1 does not contain the patch
[PATCH 4/4] mac80211: implement ERP info change notifications
But it does contain the zd1211rw patch:
[PATCH] zd1211rw-mac80211: use correct preambles for RTS/CTS frames
Which depended on the above mentioned mac80211 patch.

On second thought about those rt2x00 compilation errors you reported:
the error is not caused by rt2x00 lagging behind mac80211 API changes. Rather,
the rt2x00 patches that follow the API changes are going upstream, while
the mac80211 API changes they depend on are not going anywhere.

It seems that mac80211 has not been updated in the -mm tree while the
drivers have been updated. This is causing the compilation errors for both
rt2x00 and zd1211rw.
I'll bet that if you try any other mac80211 driver similar issues will arise.

Ivo
-

From: Rafael J. Wysocki
Date: Wednesday, August 22, 2007 - 1:22 pm

Yup.  This also happens to the b43 driver, for example.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth
-

From: John W. Linville
Date: Wednesday, August 22, 2007 - 12:58 pm

Andrew had a lot of problems working out conflicts between wireless-dev
and net-2.6.24.  I have since taken steps to help with this, but I
think his pull was from before the wireless-dev rebase.  Hopefully the
next -mm will be better.

John
-- 
John W. Linville
linville@tuxdriver.com
-

From: Michal Piotrowski
Date: Wednesday, August 22, 2007 - 3:11 am

/home/devel/linux-mm/fs/xfs/xfs_bmap_btree.c: In function 'xfs_bmbt_set_allf':
/home/devel/linux-mm/fs/xfs/xfs_bmap_btree.c:2312: error: 'b' undeclared (first use in this function)
/home/devel/linux-mm/fs/xfs/xfs_bmap_btree.c:2312: error: (Each undeclared identifier is reported only once
/home/devel/linux-mm/fs/xfs/xfs_bmap_btree.c:2312: error: for each function it appears in.)
/home/devel/linux-mm/fs/xfs/xfs_bmap_btree.c: In function 'xfs_bmbt_disk_set_allf':
/home/devel/linux-mm/fs/xfs/xfs_bmap_btree.c:2372: error: 'b' undeclared (first use in this function)
make[3]: *** [fs/xfs/xfs_bmap_btree.o] Error 1
make[2]: *** [fs/xfs] Error 2
make[1]: *** [fs] Error 2
make: *** [_all] Error 2

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/

-

From: Michal Piotrowski
Date: Wednesday, August 22, 2007 - 3:27 am

Build fix.

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/

Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>

--- linux-mm-clean/fs/xfs/xfs_bmap_btree.c	2007-08-22 12:20:35.000000000 +0200
+++ linux-mm/fs/xfs/xfs_bmap_btree.c	2007-08-22 12:15:52.000000000 +0200
@@ -2309,7 +2309,7 @@ xfs_bmbt_set_allf(
 		((xfs_bmbt_rec_base_t)blockcount &
 		(xfs_bmbt_rec_base_t)XFS_MASK64LO(21));
 #else	/* !XFS_BIG_BLKNOS */
-	if (ISNULLSTARTBLOCK(b)) {
+	if (ISNULLSTARTBLOCK(startblock)) {
 		r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) |
 			((xfs_bmbt_rec_base_t)startoff << 9) |
 			 (xfs_bmbt_rec_base_t)XFS_MASK64LO(9);
@@ -2369,7 +2369,7 @@ xfs_bmbt_disk_set_allf(
 		 ((xfs_bmbt_rec_base_t)blockcount &
 		  (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)));
 #else	/* !XFS_BIG_BLKNOS */
-	if (ISNULLSTARTBLOCK(b)) {
+	if (ISNULLSTARTBLOCK(startblock)) {
 		r->l0 = cpu_to_be64(
 			((xfs_bmbt_rec_base_t)extent_flag << 63) |
 			 ((xfs_bmbt_rec_base_t)startoff << 9) |

-


Hi Michal,

Thanks for the patch.
This would be a problem for 32bit machines without large blocksize support
(i.e. in our xfs tests: !XFS_BIG_BLKNOS => (BITS_PER_LONG == 32 && !defined(CONFIG_LBD))),
which we obviously didn't do a build test for.

I'll check it into our local tree and push to the master branch for Andrew.
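
The reason the error went unnoticed is that the affected branch is only
compiled when XFS_BIG_BLKNOS is off. A minimal sketch of such a
compile-time switch, following the condition quoted above, is shown here.
Deriving BITS_PER_LONG from <limits.h> and the function names are
assumptions made for a user-space illustration; this is not XFS code.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Assumption: approximate the kernel's BITS_PER_LONG in user space. */
#if ULONG_MAX == 0xffffffffUL
#define BITS_PER_LONG 32
#else
#define BITS_PER_LONG 64
#endif

/* Condition as quoted in the thread; CONFIG_LBD is left undefined here. */
#if BITS_PER_LONG == 32 && !defined(CONFIG_LBD)
#define XFS_BIG_BLKNOS 0
#else
#define XFS_BIG_BLKNOS 1
#endif

/*
 * Reports which branch the preprocessor kept. The other branch is never
 * seen by the compiler, so a stale identifier in it is not diagnosed on
 * this configuration.
 */
static const char *blkno_branch(void)
{
#if XFS_BIG_BLKNOS
	return "big";
#else
	return "small";
#endif
}

/* Usage example. */
static void blkno_selfcheck(void)
{
	assert(strcmp(blkno_branch(), XFS_BIG_BLKNOS ? "big" : "small") == 0);
}
```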

--Tim


-

From: Kamalesh Babulal
Date: Wednesday, August 22, 2007 - 6:02 am

Hi Andrew,

The following kernel BUG was raised when I tried compiling and booting a ppc64
machine with the 2.6.23-rc3-mm1 kernel.

=================================================================

Freeing initrd memory: 908k freed
sysctl table check failed: /kernel .1 Writable sysctl directory
skb_over_panic: text:c0000000002bf840 len:139 put:29 head:c00000000ffe7400 data:c00000000ffe7400 tail:0x8b end:0x80 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:95!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=128 NUMA pSeries
Modules linked in:
NIP: c0000000003fd7c4 LR: c0000000003fd7c0 CTR: 80000000000f97dc
REGS: c0000000027f3850 TRAP: 0700   Not tainted  (2.6.23-rc3-mm1-autokern1)
MSR: 8000000000029032 <EE,ME,IR,DR>  CR: 24288024  XER: 00000010
TASK = c000000009fc0000[1] 'swapper' THREAD: c0000000027f0000 CPU: 0
GPR00: c0000000003fd7c0 c0000000027f3ad0 c000000000737710 0000000000000082
GPR04: 0000000000000001 0000000000000001 0000000000000000 c00000000062bb3c
GPR08: 0000000000000000 c00000000067b2e0 0000000000002100 c00000000077b110
GPR12: 0000000000004000 c000000000649f00 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000000802a0c0 c00000000052ea30 c0000000005392a8
GPR24: c000000009f6b908 c0000000006a4000 c0000000026e2000 0000000000000020
GPR28: c00000000ffe746e 0000000000000004 c0000000006f5340 c000000009f6a900
NIP [c0000000003fd7c4] .skb_over_panic+0x50/0x58
LR [c0000000003fd7c0] .skb_over_panic+0x4c/0x58
Call Trace:
[c0000000027f3ad0] [c0000000003fd7c0] .skb_over_panic+0x4c/0x58 (unreliable)
[c0000000027f3b60] [c0000000002bf848] .kobject_uevent_env+0x4f8/0x528
[c0000000027f3c80] [c00000000032512c] .device_add+0x2bc/0x730
[c0000000027f3d50] [c000000000022330] .vio_register_device_node+0x1a4/0x274
[c0000000027f3e00] [c0000000005d34a8] .vio_bus_init+0xa0/0xec
[c0000000027f3e80] ...
From: Andrew Morton
Date: Wednesday, August 22, 2007 - 8:50 am

gargh, sorry, that's probably due to my screwed up attempt to fix Kay's
screwed up
gregkh-driver-driver-core-change-add_uevent_var-to-use-a-struct.patch.

Kay sent an update patch but it didn't arrive in time.

Greg, if you haven't yet merged that, please do so asap?

So what _should_ this:

--- a/arch/powerpc/kernel/vio.c~fix-4-gregkh-driver-driver-core-change-add_uevent_var-to-use-a-struct
+++ a/arch/powerpc/kernel/vio.c
@@ -373,7 +373,7 @@ static int vio_hotplug(struct device *de
 	dn = dev->archdata.of_node;
 	if (!dn)
 		return -ENODEV;
-	cp = of_get_property(dn, "compatible", &length);
+	cp = of_get_property(dn, "compatible", &env->buflen);
 	if (!cp)
 		return -ENODEV;
 
_

have done?
-

From: Kay Sievers
Date: Wednesday, August 22, 2007 - 10:58 am

Does replacing "&length" with "NULL" work? That's what's in the updated
patch.

Thanks,
Kay

-

From: Balbir Singh
Date: Wednesday, August 22, 2007 - 12:04 pm

Hi, Kay,

replacing &length with NULL does not work for me. I get a message saying that
init terminated with signal 7.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-

From: Kay Sievers
Date: Wednesday, August 22, 2007 - 1:55 pm

Hi Balbir,
ugh, I can't see what's going wrong here.

Care to just "return 0" for the whole function, and try again? Just to
rule out that this is the cause of the problem.

Thanks,
Kay

-

From: Balbir Singh
Date: Wednesday, August 22, 2007 - 2:10 pm

Same here.. I went through the new add_uevent_var() code. The only change
I found was that instead of using env->envp[env->envp_idx] as an argument
to vsnprintf(), the code looks semantically the same. Even with those
changes, the assignment of env->envp[env->envp_idx++] to &env->buf[
env->buflen] makes the semantics look similar.
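
The bookkeeping described above can be pictured with a small user-space
model. This is a sketch under assumptions, not the kernel's
add_uevent_var(): the struct name, the array sizes and the return values
are invented, and only the envp, envp_idx, buf and buflen fields are taken
from the messages in this thread.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define ENV_NUM_VARS 32		/* hypothetical limits */
#define ENV_BUF_SIZE 2048

struct uevent_env_sketch {
	char *envp[ENV_NUM_VARS];	/* pointers into buf, NULL-terminated */
	int envp_idx;			/* next free slot in envp */
	char buf[ENV_BUF_SIZE];		/* all strings, stored back to back */
	int buflen;			/* bytes of buf already used */
};

/* Append one formatted "KEY=value" string; returns 0 on success, -1 if full. */
static int add_var_sketch(struct uevent_env_sketch *env, const char *fmt, ...)
{
	size_t room;
	va_list args;
	int len;

	assert(env != NULL && fmt != NULL);
	if (env->envp_idx >= ENV_NUM_VARS - 1)	/* keep the last slot NULL */
		return -1;

	room = sizeof(env->buf) - (size_t)env->buflen;
	va_start(args, fmt);
	len = vsnprintf(&env->buf[env->buflen], room, fmt, args);
	va_end(args);
	if (len < 0 || (size_t)len >= room)	/* error or truncated */
		return -1;

	/* The new variable starts where the used part of buf ended. */
	env->envp[env->envp_idx++] = &env->buf[env->buflen];
	env->buflen += len + 1;			/* account for the trailing NUL */
	return 0;
}
```

In a layout like this, buflen is the only record of how much of buf is in
use, so anything that overwrites it with an unrelated value (such as a
property length) would make later appends and the final message size
wrong. Whether that is what happened in vio_hotplug() is not established
by this sketch.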

I verified that the arguments to add_uevent_var() are sane. So at this
point, I am a little lost. I'll debug further and see if the socket


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-

From: Balbir Singh
Date: Thursday, August 23, 2007 - 11:59 am

Hi, Kay,

I just confirmed, your NULL fix looks correct. The "init got signal 7"
problem occurs even if add_uevent_var() is commented out. I suspect that


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-

From: Gabriel C
Date: Wednesday, August 22, 2007 - 6:33 am

...

  CC      arch/i386/boot/cpu.o
  CC      arch/i386/boot/cpucheck.o
WARNING: "div64_64" [net/netfilter/xt_connbytes.ko] has no CRC!
  CC      arch/i386/boot/edd.o
  AS      arch/i386/boot/header.o
  CC      arch/i386/boot/main.o

...

config : http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/config
build-log: http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/build-log

Regards,

Gabriel
-

From: Andrew Morton
Date: Wednesday, August 22, 2007 - 9:09 am

Yeah, I get that too.  I was hoping that someone who had a vague clue
-

From: Gabriel C
Date: Wednesday, August 22, 2007 - 10:01 am

Hmm.. I don't know ( added netdev to Cc ) I got one more :

...

WARNING: "div64_64" [net/ipv4/tcp_cubic.ko] has no CRC!

...

Btw when modprobing these the kernel gets tainted

...

[ 5498.536055] nf_conntrack version 0.5.0 (10240 buckets, 40960 max)
[ 5498.554844] xt_connbytes: no version for "div64_64" found: kernel tainted.

-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

cu
Adrian


<--  snip  -->


This patch makes the 64bit integers on 32bit architectures usable for
all C parsers that know about "long long".

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 include/asm-arm/types.h      |   10 +++++++---
 include/asm-avr32/types.h    |   10 +++++++---
 include/asm-blackfin/types.h |   11 +++++++----
 include/asm-cris/types.h     |   10 +++++++---
 include/asm-frv/types.h      |   10 +++++++---
 include/asm-h8300/types.h    |   10 +++++++---
 include/asm-i386/types.h     |   10 +++++++---
 include/asm-m32r/types.h     |   11 ++++++++---
 include/asm-m68k/types.h     |   10 +++++++---
 include/asm-mips/types.h     |   10 +++++++---
 include/asm-parisc/types.h   |   10 +++++++---
 include/asm-powerpc/types.h  |    9 ++++++---
 include/asm-s390/types.h     |    9 ++++++---
 include/asm-sh/types.h       |   10 +++++++---
 include/asm-sh64/types.h     |   10 +++++++---
 include/asm-v850/types.h     |   10 +++++++---
 include/asm-xtensa/types.h   |   10 +++++++---
 17 files changed, 118 insertions(+), 52 deletions(-)

4b6826d7a2f5b54a6a3b1cfa8cd40b1b27621be0 
diff --git a/include/asm-arm/types.h b/include/asm-arm/types.h
index 3141451..1dae25b 100644
--- a/include/asm-arm/types.h
+++ b/include/asm-arm/types.h
@@ -19,11 +19,15 @@ typedef unsigned short __u16;
 typedef __signed__ int __s32;
 typedef unsigned int __u32;
 
-#if defined(__GNUC__)
-__extension__ typedef __signed__ long long __s64;
-__extension__ typedef unsigned long long __u64;
+#if defined(__GNUC__) && defined(__STRICT_ANSI__)
+#define __extension_long_long __extension__
+#else
+#define __extension_long_long
 #endif
 
+__extension_long_long typedef __signed__ long long __s64;
+__extension_long_long typedef unsigned long long __u64;
+
 #endif /* __ASSEMBLY__ */
 
 /*
diff --git a/include/asm-avr32/types.h b/include/asm-avr32/types.h
index 8999a38..2c14f49 100644
--- a/include/asm-avr32/types.h
+++ b/include/asm-avr32/types.h
@@ -25,11 +25,15 @@ typedef ...
From: Mike Frysinger
Date: Monday, August 27, 2007 - 2:34 pm

ah, yet another attempt at this stuff

you probably need to update linux/types.h as well
-mike
-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:36 pm

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Mike Frysinger
Date: Monday, August 27, 2007 - 2:42 pm

just grep for __GNUC__ ...

#if defined(__GNUC__) && !defined(__STRICT_ANSI__)
typedef     __u64       uint64_t;
typedef     __u64       u_int64_t;
typedef     __s64       int64_t;
#endif

#if defined(__GNUC__) && !defined(__STRICT_ANSI__)
typedef __u64 __bitwise __le64;
typedef __u64 __bitwise __be64;
#endif

you've made available __u64 and __s64, but not the rest ...
-mike
-
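Mike's point can be illustrated with a small stand-alone sketch. It assumes the macro approach from the patch above; the demo_* names are hypothetical stand-ins for __s64/__u64 and the dependent typedefs, and whether the real linux/types.h should drop its guards is exactly what is being discussed here.

```c
/* Hypothetical stand-ins (demo_*) for the kernel's __s64/__u64 and friends. */
#if defined(__GNUC__) && defined(__STRICT_ANSI__)
#define demo_extension_long_long __extension__
#else
#define demo_extension_long_long
#endif

demo_extension_long_long typedef signed long long demo_s64;
demo_extension_long_long typedef unsigned long long demo_u64;

/* With the base types always defined, dependent typedefs need no guard. */
typedef demo_u64 demo_uint64_t;
typedef demo_s64 demo_int64_t;
typedef demo_u64 demo_le64;

/* Largest value representable in the 64-bit unsigned type. */
static demo_uint64_t demo_u64_max(void)
{
	return ~(demo_uint64_t)0;
}
```

If the guards around the dependent typedefs stayed in place, code built with __STRICT_ANSI__ defined would see the base types but not demo_uint64_t and the rest.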

From: Andrew Morton
Date: Tuesday, August 28, 2007 - 12:37 am

Given that this patch (hopefully) fixes a problem in the current net-2.6.24
tree, I'm inclined to slip it into mainline immediately.

But I'd like a better description, please.  Which "non-gcc parser" are we
talking about here?  Something under ./scripts/.  Well, please identify it,
and describe what the problem is, and how the proposed patch will address
it.

Let's cc Sam too, as I guess he's the guy whose code just broke.

Thanks.

-

From: Sam Ravnborg
Date: Tuesday, August 28, 2007 - 1:43 am

If my analysis is correct, genksyms fails to produce a CRC for div64_64 because
genksyms does not know the __extension__ keyword.
And this patch just papers over the real bug, which is in genksyms - right?

So we should fix the root cause here.

Googling, I did not find a good description of where __extension__ can be
used, so I fail to see where in the parse.y file I shall add the keyword.
I think __extension__ may be used both as part of an expression AND
as part of a typedef (as in this case), but I wonder whether its use is
limited to these two places.
I would like to have this sorted out so we do not end up with a half-baked
solution; the proposed patch is no good, as it just papers over the real bug.

	Sam
-

From: Michael Matz
Date: Tuesday, August 28, 2007 - 7:19 am

Hi,


The grammar rules involving __extension__ are these (the lhs stems from 
the standard directly):

   external-declaration:
     __extension__ external-declaration

   struct-declaration:
     __extension__ struct-declaration

   nested-declaration:
     __extension__ nested-declaration

   unary-operator: one of
     __extension__ __real__ __imag__

The first three allow putting __extension__ in front of any external or 
local declaration (including decls inside blocks, in C99), e.g.:

  {
    x = 1+3;
    __extension__ int y = 3;
    x += y;
  }

the last one defines __extension__ as a unary operator, which can be 
applied to all cast-expressions (which in turn are just unary 
expressions).  E.g.:

   x = 1 + __extension__ (2+3);

Note that the decls include the C99 nested-decls in for statements:

   for (__extension__ long long i = 0; ...)

Note further that there's a small ambiguity in parsing when just looking 
forward one token, namely between decl and expression, like in this 
example:

   { __extension__ int i;

vs.

   { __extension__ i + 2;

Here you can't decide if __extension__ introduces an expression or a decl.  
Probably doesn't matter for your parser.  Hope this helps.


Ciao,
Michael.
-
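The positions Michael lists can be tried out in a small translation unit. This is only a sketch, assuming GCC or a compatible compiler; the ext_* names are made up for illustration and do not come from the kernel or from genksyms.

```c
/* Hypothetical names (ext_*); requires GCC or a compatible compiler. */

/* external-declaration: __extension__ in front of a file-scope typedef */
__extension__ typedef long long ext_s64;

struct ext_pair {
	/* struct-declaration: __extension__ in front of a member */
	__extension__ long long wide;
	int narrow;
};

static ext_s64 ext_sum(int x)
{
	/* declaration inside a block */
	__extension__ long long y = 3;

	/* unary operator applied to a parenthesized expression */
	return x + __extension__ (y + 2);
}
```

A parser that only needs to handle the typedef case, as genksyms does for CRC generation, would only have to cover the first form.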

From: Randy Dunlap
Date: Tuesday, August 28, 2007 - 7:40 am

I found only one gcc manual page on __extension__:

http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Alternate-Keywords.html#Alternate-Keywords

(also found for other gcc versions)

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Adrian Bunk
Date: Tuesday, August 28, 2007 - 7:42 am

It fixes a bug exposed by a -mm only patch, not by the net tree.

It's about parsers like the Sun C compiler and the C parser shipped 
with genksyms.

We can fix the C parser shipped with genksyms, but we have nearly the 
same problem with userspace C parsers:

These are userspace headers, and we had a bug report that the Sun C 
compiler was not able to compile some userspace code.

cu
Adrian


-

From: Sam Ravnborg
Date: Tuesday, August 28, 2007 - 10:06 am

So it is about two bugs.
1) kbuild (genksyms) fails to generate CRC for some symbols
2) allow userspace to parse the header

As for 2 we already use sed to remove a lot of stuff in our headers
so why do we use another approach here?

As for 1 I will try to teach genksyms to accept __extension__ but
it seems less trivial than I expected (must be fooling myself somehow).

	Sam
-

From: Mike Frysinger
Date: Tuesday, August 28, 2007 - 10:42 am

the sed removes things permanently and is designed for scrubbing
things that are kernel-only ... in this case, these typedefs are not
kernel only, but exposed conditionally when the compiler/standard
allows for it
-mike
-

From: Adrian Bunk
Date: Tuesday, August 28, 2007 - 10:59 am

This time it's the other way round:


We anyway need a way to hide __extension__ from non-gcc userspace C 

cu
Adrian


-

From: Sam Ravnborg
Date: Tuesday, August 28, 2007 - 11:37 am

OK.
I have anyway added support for __extension__ in genksyms.
See below patch.

Note: To try this patch out do the following in a fresh tree (no generated files):
$ rm scripts/genksyms/*_shipped
$ apply patch
$ make GENERATE_PARSER=1 ...

In kbuild.git the _shipped files are updated but that would just be noise here.

From: Sam Ravnborg <sam@ravnborg.org>
Date: Tue, 28 Aug 2007 20:28:55 +0200
Subject: [PATCH] kbuild: __extension__ support in genksyms (fix unknown CRC warning)

Recently the __extension__ keyword has been introduced in the kernel.
Teach genksyms about this keyword so it can generate correct CRCs for
exported symbols that use a symbol marked __extension__.
For now only the typedef variant:

	__extension__ typedef ...

is supported.
Later we may add more variants as needed.

This patch contains the actual source file changes. The
following patch will hold modifications to the generated
files (*_shipped); the fix takes effect only after the
second patch is applied.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
---
 scripts/genksyms/keywords.gperf |    1 +
 scripts/genksyms/parse.y        |    5 ++++-
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/scripts/genksyms/keywords.gperf b/scripts/genksyms/keywords.gperf
index c75e0c8..5ef3733 100644
--- a/scripts/genksyms/keywords.gperf
+++ b/scripts/genksyms/keywords.gperf
@@ -11,6 +11,7 @@ __attribute, ATTRIBUTE_KEYW
 __attribute__, ATTRIBUTE_KEYW
 __const, CONST_KEYW
 __const__, CONST_KEYW
+__extension__, EXTENSION_KEYW
 __inline, INLINE_KEYW
 __inline__, INLINE_KEYW
 __signed, SIGNED_KEYW
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index ca04c94..408cdf8 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -61,6 +61,7 @@ remove_list(struct string_list **pb, struct string_list **pe)
 %token DOUBLE_KEYW
 %token ENUM_KEYW
 %token EXTERN_KEYW
+%token EXTENSION_KEYW
 %token FLOAT_KEYW
 %token INLINE_KEYW
 %token INT_KEYW
@@ -110,7 +111,9 @@ ...
From: Gabriel C
Date: Wednesday, August 22, 2007 - 8:30 am

Got it with a randconfig ( http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/r/randconfig-8 )

...

net/ipv4/fib_trie.c: In function 'trie_rebalance':
net/ipv4/fib_trie.c:969: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:971: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:977: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:980: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c: In function 'fib_insert_node':
net/ipv4/fib_trie.c:1034: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1034: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1034: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c: In function 'fn_trie_lookup':
net/ipv4/fib_trie.c:1498: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1502: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1502: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1503: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c: In function 'trie_leaf_remove':
net/ipv4/fib_trie.c:1539: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1539: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1539: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1554: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c: In function 'nextleaf':
net/ipv4/fib_trie.c:1706: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c:1743: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c: In function 'fib_trie_get_next':
net/ipv4/fib_trie.c:2046: error: lvalue required as unary '&' operand
net/ipv4/fib_trie.c: In function 'fib_trie_seq_show':
net/ipv4/fib_trie.c:2320: error: lvalue required as unary '&' operand
make[2]: *** [net/ipv4/fib_trie.o] Error 1
make[1]: *** [net/ipv4] Error 2
make: *** [net] Error 2
make: *** Waiting for unfinished jobs....

...
-

From: Adrian Bunk
Date: Wednesday, August 22, 2007 - 8:41 am

Side effect of the git-net removal, temporarily removing 
immunize-rcu_dereference-against-crazy-compiler-writers.patch should 
work around it.

cu
Adrian


-

From: Paul E. McKenney
Date: Wednesday, August 22, 2007 - 10:03 am

Alternatively, the following one-line patch to net/ipv4/fib_trie.c could
be used.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 fib_trie.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -urpNa -X dontdiff linux-2.6.23-rc3-mm1/net/ipv4/fib_trie.c linux-2.6.23-rc3-mm1.compile/net/ipv4/fib_trie.c
--- linux-2.6.23-rc3-mm1/net/ipv4/fib_trie.c	2007-08-22 09:20:33.000000000 -0700
+++ linux-2.6.23-rc3-mm1.compile/net/ipv4/fib_trie.c	2007-08-22 09:47:33.000000000 -0700
@@ -94,7 +94,7 @@ typedef unsigned int t_key;
 #define T_LEAF  1
 #define NODE_TYPE_MASK	0x1UL
 #define NODE_PARENT(node) \
-	((struct tnode *)rcu_dereference(((node)->parent & ~NODE_TYPE_MASK)))
+	((struct tnode *)(rcu_dereference((node)->parent) & ~NODE_TYPE_MASK))
 
 #define NODE_TYPE(node) ((node)->parent & NODE_TYPE_MASK)
 
-

From: Jarek Poplawski
Date: Sunday, August 26, 2007 - 11:36 pm

...

After first reading this thread I had the impression that it was about
the compiler's behavior, but now it seems to me that this patch is not an
alternative but a 'must be': the only proper way of calling
rcu_dereference (with a variable instead of an expression). Am I
right?

Regards,
Jarek P.
-

From: Paul E. McKenney
Date: Monday, August 27, 2007 - 9:23 am

Yes, rcu_dereference() does indeed need to be invoked on an lvalue.

							Thanx, Paul
-
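The lvalue requirement can be reproduced outside the kernel. The sketch below assumes a macro that takes the address of its argument, which is what the "lvalue required as unary '&' operand" errors above suggest; demo_dereference() is a hypothetical stand-in and not the kernel's rcu_dereference().

```c
/* Hypothetical stand-in: reads its argument through a volatile pointer,
 * so the argument must be an lvalue. Uses the GCC __typeof__ extension. */
#define demo_dereference(p) (*(volatile __typeof__(p) *)&(p))

#define DEMO_T_LEAF		1UL
#define DEMO_TYPE_MASK		0x1UL

struct demo_node {
	unsigned long parent;	/* parent pointer with the node type in bit 0 */
};

/*
 * Would not compile, because '&' is applied to the result of an expression:
 *   demo_dereference((node)->parent & ~DEMO_TYPE_MASK)
 * Dereference the field first, then mask the loaded value.
 */
#define DEMO_NODE_PARENT(node) \
	((struct demo_node *)(demo_dereference((node)->parent) & ~DEMO_TYPE_MASK))

#define DEMO_NODE_TYPE(node) ((node)->parent & DEMO_TYPE_MASK)
```

Masking after the load mirrors the one-line change to NODE_PARENT() posted earlier in this thread.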

From: Gabriel C
Date: Wednesday, August 22, 2007 - 9:32 am

Gabriel
-

From: Gabriel C
Date: Wednesday, August 22, 2007 - 9:15 am

CONFIG_SCSI_ADVANSYS=y && CONFIG_ISA=n results in :

...

drivers/built-in.o: In function `advansys_init':
advansys.c:(.init.text+0x38ea): undefined reference to `isa_register_driver'
advansys.c:(.init.text+0x38ff): undefined reference to `isa_register_driver'
advansys.c:(.init.text+0x3926): undefined reference to `isa_unregister_driver'
advansys.c:(.init.text+0x3930): undefined reference to `isa_unregister_driver'
drivers/built-in.o: In function `advansys_exit':
advansys.c:(.exit.text+0x340): undefined reference to `isa_unregister_driver'
advansys.c:(.exit.text+0x34a): undefined reference to `isa_unregister_driver'
make: *** [.tmp_vmlinux1] Error 1

...


I guess advansys_{init,exit} is missing some #ifdef's ..


config : http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/r/randconfig-9


Gabriel

-

From: Matthew Wilcox
Date: Wednesday, August 22, 2007 - 9:28 am

That's one conclusion.  I prefer to think that the ISA support should
behave the same as the PCI and EISA support:

----

When CONFIG_ISA is disabled, the isa_driver support will not be compiled
in.  Define stubs so that we don't get link-time errors.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>

diff --git a/include/linux/isa.h b/include/linux/isa.h
index 1b85533..b0270e3 100644
--- a/include/linux/isa.h
+++ b/include/linux/isa.h
@@ -22,7 +22,18 @@ struct isa_driver {
 
 #define to_isa_driver(x) container_of((x), struct isa_driver, driver)
 
+#ifdef CONFIG_ISA
 int isa_register_driver(struct isa_driver *, unsigned int);
 void isa_unregister_driver(struct isa_driver *);
+#else
+static inline int isa_register_driver(struct isa_driver *d, unsigned int i)
+{
+	return 0;
+}
+
+static inline void isa_unregister_driver(struct isa_driver *d)
+{
+}
+#endif
 
 #endif /* __LINUX_ISA_H */

-- 
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
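The stub approach can be sketched in isolation as follows. DEMO_CONFIG_BUS and the demo_* names are hypothetical and only mirror the shape of the isa.h change above; they are not kernel symbols.

```c
/* Hypothetical bus API; DEMO_CONFIG_BUS plays the role of CONFIG_ISA. */
struct demo_driver {
	const char *name;
};

#ifdef DEMO_CONFIG_BUS
/* Real implementations live in the bus code when it is compiled in. */
int demo_register_driver(struct demo_driver *drv, unsigned int ndev);
void demo_unregister_driver(struct demo_driver *drv);
#else
/* Bus support compiled out: stubs keep callers linking without #ifdefs. */
static inline int demo_register_driver(struct demo_driver *drv,
				       unsigned int ndev)
{
	(void)drv;
	(void)ndev;
	return 0;
}

static inline void demo_unregister_driver(struct demo_driver *drv)
{
	(void)drv;
}
#endif

/* A driver init path that needs no conditional compilation of its own. */
static int demo_init(void)
{
	static struct demo_driver drv = { "demo" };

	return demo_register_driver(&drv, 1);
}
```

The design choice is that the header, not each driver, absorbs the configuration difference, so a driver's init and exit functions stay free of #ifdefs.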

From: Gabriel C
Date: Wednesday, August 22, 2007 - 9:57 am

From: Gabriel C
Date: Wednesday, August 22, 2007 - 10:10 am

Got it with a randconfig ( http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/r/randconfig-9 )
( patch from http://lkml.org/lkml/2007/8/22/273 is needed too, or CONFIG_SCSI_ADVANSYS needs to be N)


...

ERROR: "slhc_init" [drivers/net/ppp_generic.ko] undefined!
ERROR: "slhc_remember" [drivers/net/ppp_generic.ko] undefined!
ERROR: "slhc_uncompress" [drivers/net/ppp_generic.ko] undefined!
ERROR: "slhc_free" [drivers/net/ppp_generic.ko] undefined!
ERROR: "slhc_compress" [drivers/net/ppp_generic.ko] undefined!
ERROR: "slhc_toss" [drivers/net/ppp_generic.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

...


Regards,

Gabriel
-

From: Mel Gorman
Date: Wednesday, August 22, 2007 - 10:17 am

08/22/07-07:01:07 building kernel - make bzImage
  CHK     include/linux/version.h
  UPD     include/linux/version.h
  CHK     include/linux/utsrelease.h
  UPD     include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-x86_64
  CC      arch/x86_64/kernel/asm-offsets.s
arch/x86_64/kernel/asm-offsets.c:1: error: -mpreferred-stack-boundary=3
is not between 4 and 12
make[1]: *** [arch/x86_64/kernel/asm-offsets.s] Error 1
make: *** [prepare0] Error 2
08/22/07-07:01:08 Build the kernel. Failed rc = 2
08/22/07-07:01:08 build: Building kernel... Failed rc = 1
Failed and terminated the run
08/22/07-07:01:08 command complete: (1) rc=126 (TEST ABORT)
 Fatal error, aborting autorun

config file at: http://test.kernel.org/abat/107411/build/dotconfig
gcc version is 3.4.4

This does not occur when using a cross-compiler gcc 3.4.0

-- 
Mel Gorman
-

From: Andrew Morton
Date: Wednesday, August 22, 2007 - 11:10 am

On Wed, 22 Aug 2007 18:17:38 +0100

x86_64-mm-less-stack-alignment.patch has

cflags-y += $(call cc-option,-mpreferred-stack-boundary=3)

So we _should_ have detected that gcc didn't like =3, so it
should not have been used.

I am suspecting a kbuild glitch: asm-offsets.c tends to be handled
in special ways (ie: it's usually the thing which blows up first)
so perhaps it is somehow avoiding the above does-gcc-support-this test.

Suitable cc's added ;)
-

From: Mel Gorman
Date: Thursday, August 23, 2007 - 4:39 am

Reverting the patch does allow the kernel to build and boot on that
machine.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-

From: Andy Whitcroft
Date: Thursday, August 23, 2007 - 5:03 am

On Wed, Aug 22, 2007 at 11:10:29AM -0700, Andrew Morton wrote:

It seems that this is a problem caused by the way we check for
compiler options in x86_64.  Each compiler flag is checked for
individually and if available added to cflags-y, later that is
added to CFLAGS.  However, this means that each flag is checked
in total isolation.  On x86_64 (on this compiler at least) the
-mpreferred-stack-boundary and -m{32,64} flags are actually mutually
dependent: the alignment constraints vary based on the word size.
This leads to the compile failure:

    # gcc -mpreferred-stack-boundary=3 -S -xc /dev/null  -o FOO
    # echo $?
    0
    # gcc -m64 -mpreferred-stack-boundary=3 -S -xc /dev/null  -o FOO
    /dev/null:1: error: -mpreferred-stack-boundary=3 is not between 4 and 12
    # echo $?
    1

In the main Makefile we always add each flag directly to CFLAGS
which means we check them all in combination, perhaps this is
prudent here also?  Either way I suspect that changing the -m64
check to add itself directly to CFLAGS will fix this for us.

-apw
-
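The difference between probing a flag in isolation and probing it together with the flags already chosen can be modelled in a few lines. This is a toy model only: demo_cc_accepts() hard-codes the behaviour seen in the transcript above and is an assumption, not gcc or kbuild.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy model of the compiler in the transcript (an assumption, not gcc):
 * -mpreferred-stack-boundary=3 is accepted alone but rejected with -m64.
 */
static bool demo_cc_accepts(const char *flags)
{
	bool m64 = strstr(flags, "-m64") != NULL;
	bool boundary3 = strstr(flags, "-mpreferred-stack-boundary=3") != NULL;

	return !(m64 && boundary3);
}

/* Probe a candidate flag together with the flags already chosen. */
static bool demo_probe(const char *chosen, const char *candidate)
{
	char cmdline[256];

	snprintf(cmdline, sizeof(cmdline), "%s %s", chosen, candidate);
	return demo_cc_accepts(cmdline);
}
```

Probing with an empty set of chosen flags corresponds to the isolated check that wrongly accepts the option; probing with "-m64" already chosen corresponds to checking the flags in combination.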

From: Andi Kleen
Date: Thursday, August 23, 2007 - 5:22 am

Ok that makes sense. Most people don't see it because they don't
need -m64. 

I fixed it up with 
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/cflags-probe
and then
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/less-stack-alignment
(replacement for the mm patch)

Can you test?

-Andi

-

From: Andy Whitcroft
Date: Thursday, August 23, 2007 - 5:34 am

Sure, will do that now and let you know.

-apw
-

From: Sam Ravnborg
Date: Thursday, August 23, 2007 - 5:28 am

Something like this then:
[PATCH] x86_64: fix preferred-stack-boundary check

gcc interprets the -mpreferred-stack-boundary flag differently
depending on whether the -m64 option is present, as seen in the following:
     # gcc -mpreferred-stack-boundary=3 -S -xc /dev/null  -o FOO
     # echo $?
     0
     # gcc -m64 -mpreferred-stack-boundary=3 -S -xc /dev/null  -o FOO
     /dev/null:1: error: -mpreferred-stack-boundary=3 is not between 4 and 12
     # echo $?
     1

Adding -m64 to CFLAGS lets cc-option do the right thing.

Thanks to Andy Whitcroft <apw@shadowen.org> for spotting the root cause.

Cc: Andy Whitcroft <apw@shadowen.org> 
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
---
diff --git a/arch/x86_64/Makefile b/arch/x86_64/Makefile
index 128561d..5402c0a 100644
--- a/arch/x86_64/Makefile
+++ b/arch/x86_64/Makefile
@@ -25,6 +25,8 @@ LDFLAGS		:= -m elf_x86_64
 OBJCOPYFLAGS	:= -O binary -R .note -R .comment -S
 LDFLAGS_vmlinux :=
 CHECKFLAGS      += -D__x86_64__ -m64
+AFLAGS          += -m64
+CFLAGS          += -m64
 
 cflags-y	:=
 cflags-kernel-y	:=
@@ -36,7 +38,6 @@ cflags-$(CONFIG_MCORE2) += \
 	$(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
 cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)
 
-cflags-y += -m64
 cflags-y += -mno-red-zone
 cflags-y += -mcmodel=kernel
 cflags-y += -pipe
@@ -69,7 +70,6 @@ cflags-$(CONFIG_CC_STACKPROTECTOR_ALL) += $(shell $(CONFIG_SHELL) $(srctree)/scr
 
 CFLAGS += $(cflags-y)
 CFLAGS_KERNEL += $(cflags-kernel-y)
-AFLAGS += -m64
 
 head-y := arch/x86_64/kernel/head.o arch/x86_64/kernel/head64.o arch/x86_64/kernel/init_task.o
 
-

From: Sam Ravnborg
Date: Thursday, August 23, 2007 - 7:24 am

OK - Andi decided to do this in a bit more invasive way but it looks OK.
So disregard my patch.

	Sam
-

From: Andi Kleen
Date: Thursday, August 23, 2007 - 5:07 am

The flag actually needs a recent gcc 4.3 snapshot (it's
a new feature the gcc developers added especially for the
kernel :), so if this didn't work it would fail on the vast 
majority of systems.

Somehow it doesn't? At least here it compiles fine.

I notice the final comma is missing. Mel, does it work
when you change the line to 

cflags-y += $(call cc-option,-mpreferred-stack-boundary=3,)  

If not please run
gcc -O2 -mpreferred-stack-boundary=2 -S -xc /dev/null -o x.o 
echo $?

What does the echo output?

-Andi

-

From: Mel Gorman
Date: Thursday, August 23, 2007 - 9:25 am

No, but that is hardly a surprise, as it's looking like -m64 is the way

Repeating really but;

elm3b6:~# gcc -O2 -mpreferred-stack-boundary=2 -S -xc /dev/null -o x.o ; echo $?
0
elm3b6:~# gcc -m64 -O2 -mpreferred-stack-boundary=2 -S -xc /dev/null -o x.o ; echo $?
/dev/null:1: error: -mpreferred-stack-boundary=2 is not between 4 and 12
1


-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-

From: Torsten Kaiser
Date: Wednesday, August 22, 2007 - 10:24 am

That patch is no longer in 2.6.23-rc3-mm1, my bootlog says:
[    0.000000] Unknown boot option `sata_nv.swncq=1': ignoring

I could not find this patch in any of the git trees I looked at, and its
removal mail from mm-commits said:
"This patch was dropped because Changes in Jeff's tree destroyed it."


I only found out about the swncq=1 command line option yesterday and
so tested it for only one day.
But I did not have any trouble with it, even though my drive was made by Maxtor.

The chipset:
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)

The drive:
Device Model:     MAXTOR STM3320820AS
Serial Number:    5QF2E698
Firmware Version: 3.AAE

As it worked for me, I hope that patch will be picked up by someone. :)

Torsten
-

From: Andrew Morton
Date: Wednesday, August 22, 2007 - 11:14 am

On Wed, 22 Aug 2007 19:24:39 +0200

This is a fairly regular occurrence in ata land: patches from maintainers
don't get merged, so I merge them for testing, then some fairly pointless
cleanup-style patch goes on a great tree-wide rampage thus destabilising or
simply destroying the more important, mysteriously-not-merged patch.

Nobody knows why this happens.

Peer and Kuan: can you please redo that patch against the current ata
development tree?

Thanks.

-

From: Michal Piotrowski
Date: Wednesday, August 22, 2007 - 7:19 am

allyesconfig

  RELOCS  arch/i386/boot/compressed/vmlinux.relocs
WARNING: Absolute relocations present
Offset     Info     Type     Sym.Value Sym.Name
c06018f3 02ee1f01   R_386_32 c14adad0  _sdata

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/
-

From: Andrew Morton
Date: Wednesday, August 22, 2007 - 9:17 am

Yeah, that's Greg's pestiferous
gregkh-driver-warn-when-statically-allocated-kobjects-are-used.patch acting
up.

I previously suggested that something like kallsyms_lookup() could be used
for this, but I was cruelly ignored.

-

From: Mariusz Kozlowski
Date: Wednesday, August 22, 2007 - 1:23 pm

Hello,

	This is from x86_32 with gcc 3.4.6:

  CC [M]  sound/pci/hda/hda_codec.o
sound/pci/hda/hda_codec.c: In function `snd_hda_codec_free':
sound/pci/hda/hda_codec.c:517: sorry, unimplemented: inlining failed in call to 'free_hda_cache': function body not available
sound/pci/hda/hda_codec.c:534: sorry, unimplemented: called from here
sound/pci/hda/hda_codec.c:517: sorry, unimplemented: inlining failed in call to 'free_hda_cache': function body not available
sound/pci/hda/hda_codec.c:535: sorry, unimplemented: called from here
make[3]: *** [sound/pci/hda/hda_codec.o] Error 1
make[2]: *** [sound/pci/hda] Error 2
make[1]: *** [sound/pci] Error 2
make: *** [sound] Error 2

Regards,

	Mariusz
-

From: Takashi Iwai
Date: Wednesday, August 22, 2007 - 2:07 pm

At Wed, 22 Aug 2007 22:23:03 +0200,

Since it doesn't happen with gcc-4.x, this looks like a gcc-3.x
specific problem.   Does the patch below fix it?


Takashi

diff -r db9001b20d29 pci/hda/hda_codec.c
--- a/pci/hda/hda_codec.c	Wed Aug 22 14:19:45 2007 +0200
+++ b/pci/hda/hda_codec.c	Wed Aug 22 23:06:00 2007 +0200
@@ -514,7 +514,7 @@ static int read_widget_caps(struct hda_c
 
 static void init_hda_cache(struct hda_cache_rec *cache,
 			   unsigned int record_size);
-static inline void free_hda_cache(struct hda_cache_rec *cache);
+static void free_hda_cache(struct hda_cache_rec *cache);
 
 /*
  * codec destructor
@@ -707,7 +707,7 @@ static void __devinit init_hda_cache(str
 	cache->record_size = record_size;
 }
 
-static inline void free_hda_cache(struct hda_cache_rec *cache)
+static void free_hda_cache(struct hda_cache_rec *cache)
 {
 	kfree(cache->buffer);
 }
-

From: Mariusz Kozlowski
Date: Wednesday, August 22, 2007 - 2:18 pm

Yes - it does.

Thanks,

-

From: Adrian Bunk
Date: Wednesday, August 22, 2007 - 2:44 pm

It happens because gcc doesn't see the whole file without 
unit-at-a-time and we disable unit-at-a-time with gcc 3.4 on i386 due 
to stack usage problems (and older GNU gcc versions don't support 
unit-at-a-time at all).

cu
Adrian


-
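The pattern behind Takashi's fix can be shown in a stand-alone sketch: a function that is declared before use and defined later in the file is declared without "inline", so a compiler that does not look at the whole file at once simply emits an ordinary call. The demo_* names are hypothetical and free() stands in for kfree().

```c
#include <stdlib.h>

struct demo_cache {
	void *buffer;
};

/* Forward declaration without 'inline': the body only appears further down. */
static void demo_free_cache(struct demo_cache *cache);

/* Caller placed before the definition, as in snd_hda_codec_free(). */
static void demo_codec_free(struct demo_cache *amp, struct demo_cache *cmd)
{
	demo_free_cache(amp);
	demo_free_cache(cmd);
}

static void demo_free_cache(struct demo_cache *cache)
{
	free(cache->buffer);
	cache->buffer = NULL;	/* added here so the effect can be observed */
}
```

An alternative with the same effect would be to move the definition above its first caller and keep "inline".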

From: Frederik Deweerdt
Date: Wednesday, August 22, 2007 - 1:25 pm

Hi Jeremy,

arch/i386/kernel/alternative.c:alternative_instructions() doesn't
check for noreplace-smp before setting capability bits and freeing the
__smp_locks section.

Every call to alternatives_smp_unlock() checks for noreplace-smp
beforehand, so remove the check from there.

Boot tested on i386 with UP+noreplace-smp (lguest) and SMP (real hardware)

Regards,
Frederik

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>

diff --git a/arch/i386/kernel/alternative.c b/arch/i386/kernel/alternative.c
index 9f4ac8b..7c5af80 100644
--- a/arch/i386/kernel/alternative.c
+++ b/arch/i386/kernel/alternative.c
@@ -221,9 +221,6 @@ static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end
 	u8 **ptr;
 	char insn[1];
 
-	if (noreplace_smp)
-		return;
-
 	add_nops(insn, 1);
 	for (ptr = start; ptr < end; ptr++) {
 		if (*ptr < text)
@@ -406,7 +403,7 @@ void __init alternative_instructions(void)
 #endif
 
 #ifdef CONFIG_SMP
-	if (smp_alt_once) {
+	if (smp_alt_once && !noreplace_smp) {
 		if (1 == num_possible_cpus()) {
 			printk(KERN_INFO "SMP alternatives: switching to UP code\n");
 			set_bit(X86_FEATURE_UP, boot_cpu_data.x86_capability);
-

From: Andrew Morton
Date: Thursday, August 23, 2007 - 2:50 pm

On Wed, 22 Aug 2007 22:25:51 +0200

umm, so?  What happens then?  What bug is being fixed here, and what are

You refer to rc3-mm1 and this is described as a "-mm patch" but it seems to
also be applicable to mainline?
-

From: Frederik Deweerdt
Date: Thursday, August 23, 2007 - 11:04 pm

That means that even when you specify noreplace_smp, some replacing
takes place anyway. One of the consequences, besides noreplace_smp not
working as expected, is that lguest crashes when you feed it an SMP kernel
Hmm yes, my bad.

Regards,
Frederik
-

From: Jeremy Fitzhardinge
Date: Thursday, August 23, 2007 - 11:46 pm

Hm.  Is smp_alt_once useful?

    J
-

From: Frederik Deweerdt
Date: Friday, August 24, 2007 - 1:22 am

[Added Gerd Hoffmann and Rusty Russell to cc]
It dies with:
[    0.131000] SMP alternatives: switching to UP code
lguest: bad stack page 0xc057a000

I added a dump_stack on the Host, which gives:
[124320.090946]  [<c01052f8>] dump_trace+0x65/0x1de
[124320.090956]  [<c010548b>] show_trace_log_lvl+0x1a/0x2f
[124320.090970]  [<c0105ea4>] show_trace+0x12/0x14
[124320.090975]  [<c0105fcd>] dump_stack+0x16/0x18
[124320.090980]  [<f888032c>] pin_page+0x5f/0xa3 [lg]
[124320.090993]  [<f8880654>] pin_stack_pages+0x3a/0x4a [lg]
[124320.091004]  [<f888007e>] guest_pagetable_clear_all+0x12/0x15 [lg]
[124320.091013]  [<f887f81a>] do_hcall+0xb1/0x1cb [lg]
[124320.091021]  [<f887fbbe>] do_hypercalls+0x28a/0x2a0 [lg]
[124320.091029]  [<f887f2a2>] run_guest+0x24/0x492 [lg]
[124320.091037]  [<f8881b48>] read+0x83/0x8f [lg]
[124320.091048]  [<c0175a77>] vfs_read+0x8e/0x117
[124320.091054]  [<c0175e99>] sys_read+0x3d/0x61
[124320.091059]  [<c0104166>] sysenter_past_esp+0x6b/0xb5
[124320.091065]  [<ffffe410>] 0xffffe410
[124320.091069]  =======================

Now, the "SMP alternatives: switching to UP code" message made me wonder
if it had anything to do with the alternatives, so I tried disabling
the switch by passing noreplace_smp...
... But the message was displayed anyway (and the smp_locks section
freed), because the check my patch adds is not made.
With the patch, I can boot lguest with an SMP kernel if I pass

I can't figure out what the use case is, debugging set aside,
but there are places (eg xen, __cpu_die) in the kernel calling
alternatives_smp_switch(1) at runtime.  Passing smp-alt-once will prevent
the switch.

Regards,
Frederik
-

From: Rusty Russell
Date: Saturday, August 25, 2007 - 5:07 am

How odd!  This means that the guest set the kernel to a stack which it
hadn't mapped writable (or perhaps not mapped at all).  I always run SMP
kernels, and that seems a very strange side effect of a patching
problem...

Nonetheless, I did have a previous problem with a bug in the patching
code which didn't show up native and did show up under lguest.

Can you send your config?  Do you need noreplace-smp even on 2.6.23-rc3,
or only 2.6.23-rc3-mm1?

Thanks,
Rusty.


-

From: Frederik Deweerdt
Date: Saturday, August 25, 2007 - 5:23 am

I had time to investigate this a little further; it appears that in fact
0xc057a000 is the beginning of the __smp_locks section.

The function call responsible for the crash is in alternative_instructions():
		free_init_pages("SMP alternatives",
				(unsigned long)__smp_locks,
				(unsigned long)__smp_locks_end);

Ie, if I comment this out, I can boot lguest without passing
noreplace_smp.

BTW, to make things clear: the patch I sent does _not_ fix the
lguest/alternatives problem. It just makes noreplace_smp functional
Here it is:
I'll try ASAP.

Thanks,
Frederik
-

From: Frederik Deweerdt
Date: Saturday, August 25, 2007 - 2:14 pm

Ok, tested: I need noreplace-smp + patch to make it work on mainline too.

Regards,
Frederik
-

From: Rusty Russell
Date: Monday, August 27, 2007 - 9:09 am

If the stack pointer is 0xc057a000, then the first stack page is at
0xc0579000 (the stack pointer is decremented before use).  Not
calculating this correctly caused guests with CONFIG_DEBUG_PAGEALLOC=y
to be killed with a "bad stack page" message: the initial kernel stack
immediately preceded the .smp_locks section, which
CONFIG_DEBUG_PAGEALLOC marks read-only when freeing.

Thanks to Frederik Deweerdt for the bug report!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r cb71c5b0bbb5 drivers/lguest/interrupts_and_traps.c
--- a/drivers/lguest/interrupts_and_traps.c	Sun Aug 26 10:31:53 2007 +1000
+++ b/drivers/lguest/interrupts_and_traps.c	Sun Aug 26 10:34:44 2007 +1000
@@ -270,8 +270,11 @@ void pin_stack_pages(struct lguest *lg)
 	/* Depending on the CONFIG_4KSTACKS option, the Guest can have one or
 	 * two pages of stack space. */
 	for (i = 0; i < lg->stack_pages; i++)
-		/* The stack grows *upwards*, hence the subtraction */
-		pin_page(lg, lg->esp1 - i * PAGE_SIZE);
+		/* The stack grows *upwards*, so the address we're given is the
+		 * start of the page after the kernel stack.  Subtract one to
+		 * get back onto the first stack page, and keep subtracting to
+		 * get to the rest of the stack pages. */
+		pin_page(lg, lg->esp1 - 1 - i * PAGE_SIZE);
 }
 
 /* Direct traps also mean that we need to know whenever the Guest wants to use


-
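The arithmetic in Rusty's description can be checked with a short sketch. It assumes 4 KiB pages and that the pinned page is the one containing the given address; demo_stack_page_old() and demo_stack_page_new() are hypothetical helpers, not lguest functions.

```c
#define DEMO_PAGE_SIZE	4096UL
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))

/* Start address of the page that contains addr. */
static unsigned long demo_page_of(unsigned long addr)
{
	return addr & DEMO_PAGE_MASK;
}

/* Old calculation: uses the stack pointer itself, which lies one byte
 * past the end of the first stack page. */
static unsigned long demo_stack_page_old(unsigned long esp, unsigned int i)
{
	return demo_page_of(esp - i * DEMO_PAGE_SIZE);
}

/* Fixed calculation: step back one byte onto the stack before paging. */
static unsigned long demo_stack_page_new(unsigned long esp, unsigned int i)
{
	return demo_page_of(esp - 1 - i * DEMO_PAGE_SIZE);
}
```

With a stack pointer of 0xc057a000, the old calculation lands on page 0xc057a000 (the start of the section after the stack), while the fixed one lands on 0xc0579000.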

From: Frederik Deweerdt
Date: Thursday, August 30, 2007 - 9:38 am

Hello Rusty,

I could only just try the patch, sorry for the delay. Although it allows
the boot process to progress a little further, lguest seems to like that
"section that was just freed" :)

Please note that:
- It could progress to "Freeing SMP alternatives: 13k freed", which is new.
  Indeed, your patch made the Host pin 0xc04d3000, which is the
  correct page.
- 0xc04d4000 is the __smp_locks section:
 $ objdump -h vmlinux
 [...]
 20 .data.init_task 00001000  c04d3000  004d3000  003d4000  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 21 .smp_locks    000036c8  c04d4000  004d4000  003d5000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 [...]

[    0.128503] SMP alternatives: switching to UP code
[    0.132846] Freeing SMP alternatives: 13k freed
[    0.135177] BUG: unable to handle kernel paging request at virtual address c04d4000
[    0.135417]  printing eip:
[    0.135505] c01051df
[    0.135564] *pde = 00005067
[    0.135645] *pte = 004d4000
[    0.135756] Oops: 0000 [#1]
[    0.135825] PREEMPT SMP DEBUG_PAGEALLOC
[    0.136039] Modules linked in:
[    0.136205] CPU:    0
[    0.136206] EIP:    0061:[<c01051df>]    Not tainted VLI
[    0.136207] EFLAGS: 00010097   (2.6.23-rc3 #5)
[    0.136665] EIP is at dump_trace+0x5f/0x97
[    0.136738] eax: c0614954   ebx: c04d3ffc   ecx: c0497b00   edx: c04ef641
[    0.136883] esi: c04d3000   edi: c04d3ffd   ebp: c04d3da0   esp: c04d3d90
[    0.137058] ds: 0069   es: 0069   fs: 00d8  gs: 0000  ss: 0069
[    0.137235] Process swapper (pid: 0, ti=c04d3000 task=c04953e0 task.ti=c04d3000)
[    0.137447] Stack: c0109d95 c0614954 c04953e0 00000000 c04d3db4 c010a1f1 c0497b00 c0614954
[    0.137831]        c0614954 c04d3dc4 c0140921 c0144252 c04959c8 c04d3dec c014272f c02eccf5
[    0.138119]        c04959c8 c0614938 c04d3dec 00000001 c04959c8 c0614938 c04953e0 c04d3e4c
[    0.138497] Call Trace:
[    0.138603]  [<c0105231>] show_trace_log_lvl+0x1a/0x2f
[    0.138798]  [<c01052e1>] show_stack_log_lvl+0x9b/0xa3
[   ...
From: Rusty Russell
Date: Thursday, August 30, 2007 - 3:12 pm

Yes, I got this too, then had to jump on a plane (and away from my test
box).

Turns out this actually isn't my bug (yay!).

See next patch...
Rusty.

-

From: Rusty Russell
Date: Thursday, August 30, 2007 - 3:14 pm

We don't care if ebp is on the stack, we care about ebp + 4.  Without
this, lguest (with CONFIG_DEBUG_LOCKDEP) can touch a page unmapped by
CONFIG_DEBUG_PAGEALLOC.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r b0b1ab8ecf48 arch/i386/kernel/traps.c
--- a/arch/i386/kernel/traps.c	Fri Aug 31 03:25:06 2007 +1000
+++ b/arch/i386/kernel/traps.c	Fri Aug 31 07:57:35 2007 +1000
@@ -100,7 +100,7 @@ print_context_stack(struct thread_info *tinfo,
 	unsigned long addr;
 
 #ifdef	CONFIG_FRAME_POINTER
-	while (valid_stack_ptr(tinfo, (void *)ebp)) {
+	while (valid_stack_ptr(tinfo, (void *)ebp + 4)) {
 		unsigned long new_ebp;
 		addr = *(unsigned long *)(ebp + 4);
 		ops->address(data, addr);


-

From: Linus Torvalds
Date: Thursday, August 30, 2007 - 9:44 pm

Hmm.. This *really* cannot happen with a normal kernel - it implies that 
the stack has crossed into an invalid page. 

Why is that allowed with lguest? What kind of code could validly *ever* 
come in here and cause problems?

I'm getting the nervous feeling that lguest is really doing things that 
shouldn't be done, or is using normal kernel functions in ways that they 
should not be used. 

In other words, yes, we load off "ebp+4", but I really don't see it being 
a valid situation where ebp itself isn't also a valid stack frame. The 
stack is not sized for "off-by-one" errors - we're supposed to always have 
plenty of stack space free, and if you care about "off-by-one", you're not 
just living on the edge, you're way beyond it!

IOW, please explain why/how lguest ever triggers a case where this would 
possibly matter!

		Linus
-

From: Rusty Russell
Date: Thursday, August 30, 2007 - 11:03 pm

AFAICT, a corrupt stack could lead us to touch a page which isn't
mapped.  If we assume the stack isn't corrupt, we don't have to do the

head.S pushes a "$0" on the stack to stop the unwinder, lguest doesn't.

Here's the lguest fix, but I still think the real fix posted previously
is more important.

Cheers,
Rusty.
===
lguest doesn't terminate stack, upsets unwinder

Copy head.S, which puts a 0 on the stack to terminate ebp-chasing
backtrace code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r 926e5cc964fd drivers/lguest/lguest_asm.S
--- a/drivers/lguest/lguest_asm.S	Fri Aug 31 08:02:08 2007 +1000
+++ b/drivers/lguest/lguest_asm.S	Fri Aug 31 16:01:25 2007 +1000
@@ -19,6 +19,8 @@
  	movl $(init_thread_union+THREAD_SIZE),%esp
 	movl %esi, %eax
 	addl $__PAGE_OFFSET, %eax
+	/* Fake value to stop backtraces with CONFIG_FRAME_POINTER */
+	pushl $0
 	jmp lguest_init
 
 /*G:055 We create a macro which puts the assembler code between lgstart_ and


-

From: Linus Torvalds
Date: Friday, August 31, 2007 - 12:51 am

The unwinder should stop when it sees an invalid frame pointer, and even 
without the push 0 I'd have expected it to be invalid.

But I suspect lguest triggers another thing: you actually make the stack 
start at the *very*top* of the stack area. Afaik, normal x86 does not. A 
normal x86 kernel will start off with a pt_regs[] setup, I think - ie the 
kernel stack is always set up so that it has the "return to user mode" 
information.

And *that* difference may be what triggers this for lguest, even though it 
can never trigger for a "real" kernel.

But your patch does improve the sanity checking of the frame pointer. That 
said, I think the following patch improves it more: does this also work 
for you? (Totally untested, but it looks like the RightThing(tm) to do)

		Linus

---
diff --git a/arch/i386/kernel/traps.c b/arch/i386/kernel/traps.c
index cfffe3d..b9998f3 100644
--- a/arch/i386/kernel/traps.c
+++ b/arch/i386/kernel/traps.c
@@ -100,10 +100,10 @@ asmlinkage void machine_check(void);
 int kstack_depth_to_print = 24;
 static unsigned int code_bytes = 64;
 
-static inline int valid_stack_ptr(struct thread_info *tinfo, void *p)
+static inline int valid_stack_ptr(struct thread_info *tinfo, void *p, unsigned size)
 {
 	return	p > (void *)tinfo &&
-		p < (void *)tinfo + THREAD_SIZE - 3;
+		p <= (void *)tinfo + THREAD_SIZE - size;
 }
 
 static inline unsigned long print_context_stack(struct thread_info *tinfo,
@@ -113,7 +113,7 @@ static inline unsigned long print_context_stack(struct thread_info *tinfo,
 	unsigned long addr;
 
 #ifdef	CONFIG_FRAME_POINTER
-	while (valid_stack_ptr(tinfo, (void *)ebp)) {
+	while (valid_stack_ptr(tinfo, (void *)ebp, 2*sizeof(unsigned long))) {
 		unsigned long new_ebp;
 		addr = *(unsigned long *)(ebp + 4);
 		ops->address(data, addr);
@@ -129,7 +129,7 @@ static inline unsigned long print_context_stack(struct thread_info *tinfo,
 		ebp = new_ebp;
 	}
 #else
-	while (valid_stack_ptr(tinfo, stack)) {
+	while ...
From: Rusty Russell
Date: Friday, August 31, 2007 - 10:37 am

This is only for the initial booting stack (init_thread_union); see
arch/i386/kernel/head.S:
	/* Set up the stack pointer */
	lss stack_start,%esp
	...
	pushl $0		# fake return address for unwinder
	...
.data
ENTRY(stack_start)
	.long init_thread_union+THREAD_SIZE
	.long __BOOT_DS

lguest_asm.S missed the pushl $0 (lguest doesn't boot via head.S.  I'd
like to change that for 2.6.24, but it involved perturbing that code so


*((unsigned long *)ebp + 1)?

Thanks,
Rusty.

-

From: Linus Torvalds
Date: Friday, August 31, 2007 - 11:24 am

Ok, we should fix that. We should just make it look like all other stack 
frames.

There is other code in the kernel that "knows" that all kernel stacks have 
the fields for the user stack return on it, namely the ptrace code etc. 
Now, the initial stack is hopefully never *accessed* by that kind of code, 

Well, we might as well then just make the code readable instead. IOW, how 
about this one, which just declares a structure that describes the stack 
frame thing? That just makes everything clearer, since we can then use 
"sizeof(that structure)" instead of using the magic "2*sizeof(unsigned 
long)".

		Linus

---
diff --git a/arch/i386/kernel/traps.c b/arch/i386/kernel/traps.c
index cfffe3d..47b0bef 100644
--- a/arch/i386/kernel/traps.c
+++ b/arch/i386/kernel/traps.c
@@ -100,36 +100,45 @@ asmlinkage void machine_check(void);
 int kstack_depth_to_print = 24;
 static unsigned int code_bytes = 64;
 
-static inline int valid_stack_ptr(struct thread_info *tinfo, void *p)
+static inline int valid_stack_ptr(struct thread_info *tinfo, void *p, unsigned size)
 {
 	return	p > (void *)tinfo &&
-		p < (void *)tinfo + THREAD_SIZE - 3;
+		p <= (void *)tinfo + THREAD_SIZE - size;
 }
 
+/* The form of the top of the frame on the stack */
+struct stack_frame {
+	struct stack_frame *next_frame;
+	unsigned long return_address;
+};
+
 static inline unsigned long print_context_stack(struct thread_info *tinfo,
 				unsigned long *stack, unsigned long ebp,
 				struct stacktrace_ops *ops, void *data)
 {
-	unsigned long addr;
-
 #ifdef	CONFIG_FRAME_POINTER
-	while (valid_stack_ptr(tinfo, (void *)ebp)) {
-		unsigned long new_ebp;
-		addr = *(unsigned long *)(ebp + 4);
+	struct stack_frame *frame = (struct stack_frame *)ebp;
+	while (valid_stack_ptr(tinfo, frame, sizeof(*frame))) {
+		struct stack_frame *next;
+		unsigned long addr;
+
+		addr = frame->return_address;
 		ops->address(data, addr);
 		/*
 		 * break out of recursive entries (such as
 		 * ...
From: Rusty Russell
Date: Tuesday, September 4, 2007 - 11:18 am

Much nicer, thanks.

Rusty.

-

From: Jeremy Fitzhardinge
Date: Thursday, August 23, 2007 - 4:16 pm

Does this fix a real problem?  Or is there just some redundancy? 
Wouldn't it be better to put the noreplace_smp test in one place?

    J
-

From: Frederik Deweerdt
Date: Thursday, August 23, 2007 - 11:06 pm

I agree, but I don't think it is doable (alt_smp_once comes to mind). I'll
double check however.

Thanks,
Frederik
-

From: Gabriel C
Date: Wednesday, August 22, 2007 - 4:30 pm

config : http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/r/randconfig-18

...

drivers/char/nozomi.c:2204: error: expected expression before '__attribute__'
make[2]: *** [drivers/char/nozomi.o] Error 1
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2
make: *** Waiting for unfinished jobs....


...
-

From: Randy Dunlap
Date: Wednesday, August 22, 2007 - 8:45 pm

__devexit should be __devexit_p


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Gabriel C
Date: Wednesday, August 22, 2007 - 4:34 pm

config : http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/r/randconfig-18

...

fs/xfs/xfs_bmap_btree.c: In function 'xfs_bmbt_set_allf':
fs/xfs/xfs_bmap_btree.c:2312: error: 'b' undeclared (first use in this function)
fs/xfs/xfs_bmap_btree.c:2312: error: (Each undeclared identifier is reported only once
fs/xfs/xfs_bmap_btree.c:2312: error: for each function it appears in.)
fs/xfs/xfs_bmap_btree.c: In function 'xfs_bmbt_disk_set_allf':
fs/xfs/xfs_bmap_btree.c:2372: error: 'b' undeclared (first use in this function)
make[2]: *** [fs/xfs/xfs_bmap_btree.o] Error 1
make[2]: *** Waiting for unfinished jobs....
  CC      fs/reiser4/safe_link.o
  CC      fs/reiser4/plugin/plugin.o
make[1]: *** [fs/xfs] Error 2
make[1]: *** Waiting for unfinished jobs....

...
-

From: Randy Dunlap
Date: Wednesday, August 22, 2007 - 8:47 pm

patch is here:  
http://lkml.org/lkml/2007/8/22/153

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Zan Lynx
Date: Wednesday, August 22, 2007 - 7:08 pm

2.6.23-rc3-mm1/

After installing this new wonder kernel on my AMD-64 laptop, I
discovered that Beagle wouldn't start.  While enjoying how fast my
system felt ( :) ) I also discovered that Evolution wouldn't start
because it was built with mono integration.

Can't live without email, so I poked at it and discovered that if I run
mono applications (including Evolution) with the legacy memory layout,
they work.

Like this: setarch x86_64 -L evolution

This didn't happen on -rc2-mm2, so I think somebody changed something.
Mono claims to mmap with the MAP_32BIT option.

In -rc3-mm1 strace shows mono's mmap like this:
mmap(NULL, 65536, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x7fa21f5cb000

It's got MAP_32BIT, but that's not a 32-bit address...
-- 
Zan Lynx <zlynx@acm.org>
From: Andrew Morton
Date: Wednesday, August 22, 2007 - 11:57 pm

Thanks, it helps.

I'm thinking unkind thoughts about pie-executable-randomization.patch.

Below is a patch which removes

pie-executable-randomization.patch
pie-executable-randomization-fix.patch
pie-executable-randomization-fix-2.patch

from 2.6.23-rc3-mm1.  'twould be great if you could see if that fixes
things, thanks.


 arch/ia64/ia32/binfmt_elf32.c |    2 
 arch/x86_64/mm/mmap.c         |  107 ++++----------------------------
 fs/binfmt_elf.c               |  107 ++++++--------------------------
 3 files changed, 38 insertions(+), 178 deletions(-)

diff -puN fs/binfmt_elf.c~revert-pie-executable-randomization fs/binfmt_elf.c
--- a/fs/binfmt_elf.c~revert-pie-executable-randomization
+++ a/fs/binfmt_elf.c
@@ -45,7 +45,7 @@
 
 static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs);
 static int load_elf_library(struct file *);
-static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int, unsigned long);
+static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int);
 
 /*
  * If we don't support core dumping, then supply a NULL so we
@@ -295,70 +295,33 @@ create_elf_tables(struct linux_binprm *b
 #ifndef elf_map
 
 static unsigned long elf_map(struct file *filep, unsigned long addr,
-		struct elf_phdr *eppnt, int prot, int type,
-		unsigned long total_size)
+		struct elf_phdr *eppnt, int prot, int type)
 {
 	unsigned long map_addr;
-	unsigned long size = eppnt->p_filesz + ELF_PAGEOFFSET(eppnt->p_vaddr);
-	unsigned long off = eppnt->p_offset - ELF_PAGEOFFSET(eppnt->p_vaddr);
-	addr = ELF_PAGESTART(addr);
-	size = ELF_PAGEALIGN(size);
+	unsigned long pageoffset = ELF_PAGEOFFSET(eppnt->p_vaddr);
 
+	down_write(&current->mm->mmap_sem);
 	/* mmap() will return -EINVAL if given a zero size, but a
 	 * segment with zero filesize is perfectly valid */
-	if (!size)
-		return addr;
-
-	down_write(&current->mm->mmap_sem);
-	/*
-	* total_size is the size of the ELF (interpreter) ...
From: Jiri Kosina
Date: Thursday, August 23, 2007 - 2:28 am

Hi Zan,

thanks for an excellent bugreport. Rather than throwing the whole 
pie-randomization and flexmmap support away, could you please test the 
patch below and let me know whether it fixes all your issues? Thanks.


From: Jiri Kosina <jkosina@suse.cz>

Handle MAP_32BIT flags properly in x86_64 flexmmap

We need to handle MAP_32BIT flags of mmap() properly for 64bit 
applications with flexible mmap layout.

This patch introduces x86_64-specific version of 
arch_get_unmapped_area_topdown() which differs from the generic one in 
handling of the MAP_32BIT flag -- when this flag is passed to mmap(), we 
stick back to the legacy layout for this particular mmap, which gives 
proper 32bit range.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>

diff --git a/arch/x86_64/kernel/sys_x86_64.c b/arch/x86_64/kernel/sys_x86_64.c
index 4770b7a..0e44d08 100644
--- a/arch/x86_64/kernel/sys_x86_64.c
+++ b/arch/x86_64/kernel/sys_x86_64.c
@@ -16,6 +16,7 @@
 #include <linux/file.h>
 #include <linux/utsname.h>
 #include <linux/personality.h>
+#include <linux/random.h>
 
 #include <asm/uaccess.h>
 #include <asm/ia32.h>
@@ -69,6 +70,7 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
 			   unsigned long *end)
 {
 	if (!test_thread_flag(TIF_IA32) && (flags & MAP_32BIT)) {
+		unsigned long new_begin;
 		/* This is usually used needed to map code in small
 		   model, so it needs to be in the first 31bit. Limit
 		   it to that.  This means we need to move the
@@ -78,6 +80,11 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
 		   of playground for now. -AK */ 
 		*begin = 0x40000000; 
 		*end = 0x80000000;		
+		if (current->flags & PF_RANDOMIZE) {
+			new_begin = randomize_range(*begin, *begin + 0x02000000, 0);
+			if (new_begin)
+				*begin = new_begin;
+		}
 	} else {
 		*begin = TASK_UNMAPPED_BASE;
 		*end = TASK_SIZE; 
@@ -147,6 +154,97 @@ full_search:
 	}
 }
 
+
+unsigned long
+arch_get_unmapped_area_topdown(struct file ...
From: Zan Lynx
Date: Thursday, August 23, 2007 - 10:32 am

[snip patch]

This does fix the mono problem.  Thank you.
-- 
Zan Lynx <zlynx@acm.org>
From: Andrew Morton
Date: Thursday, August 23, 2007 - 4:52 pm

On Thu, 23 Aug 2007 11:28:25 +0200 (CEST)

 arch/x86_64/kernel/sys_x86_64.c |   98 ++++++++++++++++++++++++++++++++++++++++
 include/asm-x86_64/pgtable.h    |    1 

well that's another hunk of code for us to maintain and to slow all our
computers down.

It is quite unobvious to me that the whole pie-randomization thing is
worth merging.  Why shouldn't we just drop the lot?


<looks at the changelog>

  This patch is using mmap()'s randomization functionality in such a way
  that it maps the main executable of (specially compiled/linked
  -pie/-fpie) ET_DYN binaries onto a random address (in cases in which
  mmap() is allowed to perform a randomization).

  The code has been extracted from Ingo's exec-shield patch
  http://people.redhat.com/mingo/exec-shield/

that certainly doesn't tell anyone why we should merge this code into Linux.
-

From: Jiri Kosina
Date: Thursday, August 23, 2007 - 5:09 pm

(some more CCs added)


Hi Andrew,

well, whenever it comes to address space layout randomization, there 
usually follows a huge debate whether it is needed or not, some people 
think it's useful and powerful security protection against 0day attacks, 
other people think that it's just fighting the bugs in userspace software 
in a wrong way.

Opinions differ, that's why there is a way to turn the VA space 
randomization completely off trivially.

We already have randomized stack, randomized mmap base, randomized vdso 
page in mainline kernel, but code and heap still stay on deterministic 
addresses. I think providing the possibility for users to have really full 
address space randomization (if they want to) is much better than 
providing the current slightly crippled state, when some parts of address 
space are randomized and some are not. Or do you think we should rather 
rip all the randomization off?

And it's almost certain to me that users want this functionality - look at 
major distros. They seem to have out-of-tree patches to provide this 
functionality to their users, IMHO.

Thanks,

-- 
Jiri Kosina
-

From: Arjan van de Ven
Date: Friday, August 24, 2007 - 9:17 am

On Fri, 24 Aug 2007 02:09:59 +0200 (CEST)

randomizing PIE's is as a whole worth getting right and in mainline.
That means that ONLY the PIE text should be randomized, not that mmap
should break ;)

Randomizing address space is very widely recognized as being part of a
whole set of things (and there's a lot of discussion about what that
whole set should be, each vendor will say their solution should be part
of that and that all others suck) that you need to do to make it a LOT
harder to get a general purpose exploit working. (It's not foolproof;
it's more comparable to a 4-tumbler number lock than it is to an iris
scan; yet even a tumbler number lock makes it harder to break into your
gym locker)
-

From: Gautham R Shenoy
Date: Thursday, August 23, 2007 - 4:24 am

Hi Andrew, 

2.6.23-rc3-mm1 renders my machine unresponsive during bootup.
I've been observing this problem in 2.6.22-rc2-mm2 as well.
The boot log, the cpuinfo and .config have been appended at the
end.

After embedding a few debug printk's in start_kernel, I noticed that
the last function to be executed was calibrate_delay()

Further, from the mm-bisect, the following patch turned out to be the
culprit:

x86_64-dynticks-disable-hpet_id_legsup-hpets.patch 
From: Andrew Morton <akpm@linux-foundation.org>

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/i386/kernel/hpet.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/i386/kernel/hpet.c~x86_64-dynticks-disable-hpet_id_legsup-hpets arch/i386/kernel/hpet.c
--- a/arch/i386/kernel/hpet.c~x86_64-dynticks-disable-hpet_id_legsup-hpets
+++ a/arch/i386/kernel/hpet.c
@@ -336,7 +336,7 @@ int __init hpet_enable(void)
 
 	clocksource_register(&clocksource_hpet);
 
-	if (id & HPET_ID_LEGSUP) {
+	if (0 && (id & HPET_ID_LEGSUP)) {
 		hpet_enable_int();
 		hpet_reserve_platform_timers(id);
 		/*

Any particular reason for this patch [There is no changelog :-)]? 
Without this patch the mm-kernel seems to behave just fine for me.

Thanks and Regards
gautham.

------------------------------------------------------------------------------
Linux version 2.6.23-rc3-mm1 (ego@llm43.in.ibm.com) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #18 SMP Thu Aug 23 15:54:03 IST 2007
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 000000000009a400 (usable)
 BIOS-e820: 000000000009a400 - 00000000000a0000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bffcba40 (usable)
 BIOS-e820: 00000000bffcba40 - 00000000bffcee00 (ACPI data)
 BIOS-e820: 00000000bffcee00 - 00000000c0000000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: ...
From: Andrew Morton
Date: Thursday, August 23, 2007 - 1:47 pm

On Thu, 23 Aug 2007 16:54:23 +0530

Oh damn.  That patch is required to prevent a boot-time div-by-zero
on my old nocona machine.

I have a new set of x86_64-dynticks patches from Thomas to look at
so I guess it's reset-and-start-again time on that front.
-

From: Thomas Gleixner
Date: Thursday, August 23, 2007 - 1:56 pm

That patch was necessary due to a bug in the hpet code, which has been
resolved for quite a while (hopefully). It was not a divide by zero, it

Yep. I dropped that patch in the series.

	tglx


-

From: Valdis.Kletnieks
Date: Thursday, August 23, 2007 - 6:33 am

OK, so I don't actually *use* the irDA on my laptop for much, but I figure if
I have the hardware, I should at least make sure the driver comes up.

23-rc3-mm1 causes massive spewage, apparently at least partially as a fall-out
of the sysctl rework.  Not sure if those caused the 'unknown symbol' issues or
not.  The 'cannot allocate memory' is somewhat odd too, I can't believe it was
*really* out of memory while still in /etc/rc5.d when I have 2G of ram...

kernel: [  247.804000] NET: Registered protocol family 23
kernel: [  247.804000] sysctl table check failed: /net/irda .3.412 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/discovery .3.412.1 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/devname .3.412.2 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/debug .3.412.3 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/fast_poll_increase .3.412.4 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/discovery_slots .3.412.5 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/discovery_timeout .3.412.6 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/slot_timeout .3.412.7 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/max_baud_rate .3.412.8 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/min_tx_turn_time .3.412.9 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/max_tx_data_size .3.412.10 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/max_tx_window .3.412.11 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/max_noreply_time .3.412.12 Unknown sysctl binary path
kernel: [  247.804000] sysctl table check failed: /net/irda/warn_noreply_time ...
From: Alexey Dobriyan
Date: Thursday, August 23, 2007 - 10:37 am

Eric is a bit sadistic with sysctls. If sysctl_check_table() fails
irda_sysctl_register() will think there is no memory.

As for unknown symbols, hell knows. Same as with multiple sysctl errors.
	...
-

From: Valdis.Kletnieks
Date: Thursday, August 23, 2007 - 11:45 am

CONFIG_IRDA=m

#
# IrDA protocols
#
CONFIG_IRLAN=m
CONFIG_IRNET=m
CONFIG_IRCOMM=m
CONFIG_IRDA_ULTRA=y

#
# IrDA options
#
CONFIG_IRDA_CACHE_LAST_LSAP=y
CONFIG_IRDA_FAST_RR=y
CONFIG_IRDA_DEBUG=y

#
# Infrared-port device drivers
#

#
# SIR device drivers
#
CONFIG_IRTTY_SIR=m


On my doesn't-complain -rc2-mm1 kernel:

% lsmod | grep -i ir
irnet                  21409  0 
ppp_generic            22177  1 irnet
irtty_sir               8321  0 
sir_dev                14473  1 irtty_sir
ircomm_tty             35345  0 
ircomm                 20425  1 ircomm_tty
irda                  188973  5 irnet,irtty_sir,sir_dev,ircomm_tty,ircomm
crc_ccitt               2817  1 irda

My guess is that irda fails to insmod because of the sysctl errors, so when
the other ir* modules try to load, they come up empty on symbols that irda
provides.
From: Andrew Morton
Date: Thursday, August 23, 2007 - 2:16 pm

On Thu, 23 Aug 2007 09:33:46 -0400

Cute.  Eric, can you please suggest what we should do here?

Yes, the ENOMEM is bogus.  But irda_sysctl_register() saw a NULL return
from register_sysctl_table() and simply has no clue why it failed, and is
forced to assume ENOMEM.  That's a design shortcoming in
register_sysctl_table(), which should have returned an ERR_PTR.  Doesn't
matter much.


-

From: Eric W. Biederman
Date: Thursday, August 23, 2007 - 8:11 pm

Grumble.

Ok.   This is a two sided bug.
The NET_IRDA define was not put in sysctl.h where it belongs so I
missed it when making the list of all existing binary sysctls.

So really I need to update the sysctl_check tables to have
the NET_IRDA numbers, because at least at first skim everything
looks ok on the binary side.

Patches to follow shortly.

Eric
-

From: Eric W. Biederman
Date: Thursday, August 23, 2007 - 8:46 pm

I should say something about the return value issue.

Currently the only time this matters is when someone messes up in
development, and if it isn't an out of memory error we get messages in
dmesg so it shouldn't be too hard to sort out.

I agree it is a bit of a shortcoming that we can only return NULL
and it might be worth changing that at some point.  Perhaps when
I introduce register_sysctl_path would be a good time.   Going
through all of the callers just to give a better return value when
they can't do anything about it anyway seems to be a lot of work
for a very minor improvement.

Eric
-

From: Eric W. Biederman
Date: Thursday, August 23, 2007 - 8:53 pm

Grumble.  These numbers should have been in sysctl.h from the
beginning if we ever expected anyone to use them.  Oh well put
them there now so we can find them and make maintenance easier.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 include/linux/sysctl.h |   20 ++++++++++++++++++++
 net/irda/irsysctl.c    |   34 ++++++++++++++--------------------
 2 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 88f0941..77c9ae2 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -238,6 +238,7 @@ enum
 	NET_LLC=18,
 	NET_NETFILTER=19,
 	NET_DCCP=20,
+	NET_IRDA=412,
 };
 
 /* /proc/sys/kernel/random */
@@ -795,6 +796,25 @@ enum {
 	NET_BRIDGE_NF_FILTER_PPPOE_TAGGED = 5,
 };
 
+/* proc/sys/net/irda */
+enum { 
+	NET_IRDA_DISCOVERY=1,
+	NET_IRDA_DEVNAME=2, 
+	NET_IRDA_DEBUG=3, 
+	NET_IRDA_FAST_POLL=4,
+	NET_IRDA_DISCOVERY_SLOTS=5,
+	NET_IRDA_DISCOVERY_TIMEOUT=6,
+	NET_IRDA_SLOT_TIMEOUT=7,
+	NET_IRDA_MAX_BAUD_RATE=8,
+	NET_IRDA_MIN_TX_TURN_TIME=9,
+	NET_IRDA_MAX_TX_DATA_SIZE=10,
+	NET_IRDA_MAX_TX_WINDOW=11,
+	NET_IRDA_MAX_NOREPLY_TIME=12,
+	NET_IRDA_WARN_NOREPLY_TIME=13,
+	NET_IRDA_LAP_KEEPALIVE_TIME=14,
+};
+
+
 /* CTL_FS names: */
 enum
 {
diff --git a/net/irda/irsysctl.c b/net/irda/irsysctl.c
index 957e04f..525343a 100644
--- a/net/irda/irsysctl.c
+++ b/net/irda/irsysctl.c
@@ -31,12 +31,6 @@
 #include <net/irda/irda.h>		/* irda_debug */
 #include <net/irda/irias_object.h>
 
-#define NET_IRDA 412 /* Random number */
-enum { DISCOVERY=1, DEVNAME, DEBUG, FAST_POLL, DISCOVERY_SLOTS,
-       DISCOVERY_TIMEOUT, SLOT_TIMEOUT, MAX_BAUD_RATE, MIN_TX_TURN_TIME,
-       MAX_TX_DATA_SIZE, MAX_TX_WINDOW, MAX_NOREPLY_TIME, WARN_NOREPLY_TIME,
-       LAP_KEEPALIVE_TIME };
-
 extern int  sysctl_discovery;
 extern int  sysctl_discovery_slots;
 extern int  sysctl_discovery_timeout;
@@ -94,7 +88,7 @@ static int do_devname(ctl_table *table, int write, struct file *filp,
 /* ...
From: Eric W. Biederman
Date: Thursday, August 23, 2007 - 8:55 pm

It turns out that the net/irda code didn't register any of
its binary paths in the global sysctl.h header file so
I missed them completely when making an authoritative list
of binary sysctl paths in the kernel.  So add them to
the list of valid binary sysctl paths.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 kernel/sysctl_check.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index d5e0337..aa5b6f6 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -702,6 +702,24 @@ static struct trans_ctl_table trans_net_dccp_table[] = {
 	{}
 };
 
+static struct trans_ctl_table trans_net_irda_table[] = {
+	{ NET_IRDA_DISCOVERY,		"discovery" },
+	{ NET_IRDA_DEVNAME,		"devname" },
+	{ NET_IRDA_DEBUG,		"debug" },
+	{ NET_IRDA_FAST_POLL,		"fast_poll_increase" },
+	{ NET_IRDA_DISCOVERY_SLOTS,	"discovery_slots" },
+	{ NET_IRDA_DISCOVERY_TIMEOUT,	"discovery_timeout" },
+	{ NET_IRDA_SLOT_TIMEOUT,	"slot_timeout" },
+	{ NET_IRDA_MAX_BAUD_RATE,	"max_baud_rate" },
+	{ NET_IRDA_MIN_TX_TURN_TIME,	"min_tx_turn_time" },
+	{ NET_IRDA_MAX_TX_DATA_SIZE,	"max_tx_data_size" },
+	{ NET_IRDA_MAX_TX_WINDOW,	"max_tx_window" },
+	{ NET_IRDA_MAX_NOREPLY_TIME,	"max_noreply_time" },
+	{ NET_IRDA_WARN_NOREPLY_TIME,	"warn_noreply_time" },
+	{ NET_IRDA_LAP_KEEPALIVE_TIME,	"lap_keepalive_time" },
+	{}
+};
+
 static struct trans_ctl_table trans_net_table[] = {
 	{ NET_CORE,		"core",		trans_net_core_table },
 	/* NET_ETHER not used */
@@ -723,6 +741,7 @@ static struct trans_ctl_table trans_net_table[] = {
 	{ NET_LLC,		"llc",		trans_net_llc_table },
 	{ NET_NETFILTER,	"netfilter",	trans_net_netfilter_table },
 	{ NET_DCCP,		"dccp",		trans_net_dccp_table },
+	{ NET_IRDA,		"irda",		trans_net_irda_table },
 	{ 2089,			"nf_conntrack_max" },
 	{}
 };
-- 
1.5.1.1.181.g2de0

-

From: Samuel Ortiz
Date: Sunday, August 26, 2007 - 3:03 pm

From: Valdis.Kletnieks
Date: Saturday, August 25, 2007 - 1:29 am

Applied both patches, and now all I get from irda at boot time is this:

[  292.062000] irda_init()
[  292.063000] NET: Registered protocol family 23
[  292.069000] IrCOMM protocol (Dag Brattli)
[  292.221000] PPP generic driver version 2.4.2

in other words, business as usual. Thanks.

Feel free to stick this on both patches:

Tested-By: Valdis Kletnieks <valdis.kletnieks@vt.edu>
From: Eric W. Biederman
Date: Saturday, August 25, 2007 - 5:57 am

Thanks.

It's good to have confirmation that my sysctl_check routine
didn't find something else wrong.

Eric
-

From: Valdis.Kletnieks
Date: Saturday, August 25, 2007 - 7:07 am

If I understand the code, anything it whinges about is either an outright bug
or it's a round of ammo already chambered. ;)

As far as "something else wrong", I'm still seeing these in -rc3-mm1, but
they've been reported before against -rc2-mm2, I think:

[    0.628000] sysctl table check failed: /kernel/ostype .1.1 Missing strategy
[    0.628000] sysctl table check failed: /kernel/osrelease .1.2 Missing strategy
[    0.628000] sysctl table check failed: /kernel/version .1.4 Missing strategy
[    0.628000] sysctl table check failed: /kernel/hostname .1.7 Missing strategy
[    0.628000] sysctl table check failed: /kernel/domainname .1.8 Missing strategy
[    0.628000] sysctl table check failed: /kernel/shmmax .1.34 Missing strategy
[    0.628000] sysctl table check failed: /kernel/shmall .1.41 Missing strategy
[    0.628000] sysctl table check failed: /kernel/shmmni .1.45 Missing strategy
[    0.628000] sysctl table check failed: /kernel/msgmax .1.35 Missing strategy
[    0.628000] sysctl table check failed: /kernel/msgmni .1.42 Missing strategy
[    0.628000] sysctl table check failed: /kernel/msgmnb .1.36 Missing strategy
[    0.628000] sysctl table check failed: /kernel/sem .1.43 Missing strategy

And this isn't on an allyesconfig or allmodconfig. There may well be sysctl
code I didn't hit - my /lib/modules/2.6.23-rc3-mm1 is only about 10M, and
the Fedora kernels are weighing in at about 75M of /lib/modules a pop.
From: Eric W. Biederman
Date: Saturday, August 25, 2007 - 10:59 am

Interesting.  No I haven't seen this one.  This appears to be one of
those silly little corner cases I failed to account for in my checks.
It looks like you don't have CONFIG_SYSCTL_SYSCALL defined, and it
appears utsname_syscall and ipcdata_syscall both become NULL pointers

Yes.  Thank you.  I figure as long as we are reasonably close people
we should catch most if not all things before this is merged into
Linus's tree.

Patch in a moment.

Eric
-

From: Valdis.Kletnieks
Date: Tuesday, August 28, 2007 - 11:40 am

Yep. Nothing I actually use needs SYSCTL_SYSCALL, so I turned it off to
see what breaks...
From: Eric W. Biederman
Date: Tuesday, August 28, 2007 - 2:06 pm

Other than glibc (which uses it to see if we are on an SMP system, and
has a fallback to /proc/sys) I only found 5 other application
binaries when I was looking hard.

Eric

-

From: Eric W. Biederman
Date: Saturday, August 25, 2007 - 11:03 am

Currently sysctl_check_table will complain if a strategy routine
is missing when we have sys_sysctl compiled out, or a proc_handler
is missing when we have procfs compiled out.  At least some
of the custom handlers actually expand to NULL when this is the
case so the warning is actually a problem.



So don't worry about missing strategy routines, or missing proc_handler
routines when they will never be called.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 kernel/sysctl_check.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index aa5b6f6..10dd744 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -1552,14 +1552,18 @@ int sysctl_check_table(struct ctl_table *table)
 						set_fail(&fail, table, "No max");
 				}
 			}
+#ifdef CONFIG_SYSCTL_SYSCALL
 			if (table->ctl_name && !table->strategy)
 				set_fail(&fail, table, "Missing strategy");
+#endif
 #if 0
 			if (!table->ctl_name && table->strategy)
 				set_fail(&fail, table, "Strategy without ctl_name");
 #endif
+#ifdef CONFIG_PROC_FS
 			if (table->procname && !table->proc_handler)
 				set_fail(&fail, table, "No proc_handler");
+#endif
 #if 0
 			if (!table->procname && table->proc_handler)
 				set_fail(&fail, table, "proc_handler without procname");
-- 
1.5.3.rc6.17.g1911
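
The effect of the two #ifdef guards above can be sketched in userspace. This is only an illustration, not the kernel code: struct demo_ctl_table, demo_check_entry(), demo_handler() and the DEMO_* switches are hypothetical stand-ins for struct ctl_table, sysctl_check_table(), CONFIG_SYSCTL_SYSCALL and CONFIG_PROC_FS.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct ctl_table (assumption:
 * only the four fields the two checks look at are modelled here). */
struct demo_ctl_table {
    int ctl_name;               /* binary sysctl number, 0 if none */
    const char *procname;       /* /proc/sys name, NULL if none */
    int (*strategy)(void);      /* handler used by sys_sysctl */
    int (*proc_handler)(void);  /* handler used by /proc/sys */
};

/* Hypothetical switches standing in for CONFIG_SYSCTL_SYSCALL and
 * CONFIG_PROC_FS; set one to 1 to enable the matching check. */
#ifndef DEMO_SYSCTL_SYSCALL
#define DEMO_SYSCTL_SYSCALL 0
#endif
#ifndef DEMO_PROC_FS
#define DEMO_PROC_FS 1
#endif

/* Count the complaints the check would raise for one table entry. */
static int demo_check_entry(const struct demo_ctl_table *t)
{
    int failures = 0;

#if DEMO_SYSCTL_SYSCALL
    if (t->ctl_name && !t->strategy)
        failures++;             /* "Missing strategy" */
#endif
#if DEMO_PROC_FS
    if (t->procname && !t->proc_handler)
        failures++;             /* "No proc_handler" */
#endif
    return failures;
}

/* Dummy handler so an entry can carry a non-NULL function pointer. */
static int demo_handler(void)
{
    return 0;
}
```

With DEMO_SYSCTL_SYSCALL left at 0, an entry such as /kernel/ostype with a NULL strategy but a valid proc_handler produces no complaint, which is the behaviour the patch is after.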

-

From: Valdis.Kletnieks
Date: Tuesday, August 28, 2007 - 11:44 am

I'm not seeing the false-positive msgs after applying this patch...

Tested-By: Valdis Kletnieks <valdis.kletnieks@vt.edu>
From: Samuel Ortiz
Date: Sunday, August 26, 2007 - 3:02 pm

Hi Eric,


-

From: Tilman Schmidt
Date: Friday, August 24, 2007 - 4:27 pm

3/2.6.23-rc3-mm1/

After applying Matthew Wilcox' patch to include/linux/isa.h this compiles
and boots on my Intel/openSUSE 10.2 test machine but throws out the
following messages I don't remember ever seeing with other kernels:

- on console early during boot, also in SuSE's /var/log/boot.msg:

your system time is not correct:
Wed Jul 13 13:15:31 UTC 1910
setting system time to:
Tue Jul 24 00:00:00 UTC 2007

- later, likewise on console and in /var/log/boot.msg:

FATAL: Error inserting processor (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko): Input/output error
WARNING: Error inserting processor (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko): Input/output error
FATAL: Error inserting thermal (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/thermal.ko): Unknown symbol in module, or unknown parameter (see dmesg)

- apparently corresponding to that, in dmesg:

<4>[    7.691865] thermal: Unknown symbol acpi_processor_set_thermal_limit

- from fsck during boot:

/dev/system/root: Superblock last mount time is in the future.  FIXED.
/dev/system/root: Superblock last write time is in the future.  FIXED.

- in /var/log/warn:

Aug 25 00:44:00 xenon powersaved[5356]: WARNING (CpufreqManagement:51) No capability cpufreq_control
Aug 25 00:44:00 xenon powersaved[5356]: WARNING (CpufreqManagement:51) No capability cpufreq_control

And the SUSE startup sequence displays "failed" for the acpid daemon.
So it seems there is some strangeness wrt system time and power
management. I don't have the time to bisect this right now, but
wanted to let you know anyway.

Apart from that, the kernel seems to work fine.

HTH

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)

From: Andrew Morton
Date: Friday, August 24, 2007 - 5:07 pm

On Sat, 25 Aug 2007 01:27:25 +0200


What architecture?

if x86_64 then perhaps something went wrong with the old x86_64 dynticks
leftovers which were in rc3-mm1.  I've just merged a shiny fresh new series
so perhaps things there got fixed.  Please retest next -mm.




Dunno, there're some significant-looking cpufreq changes in there, such as
cpufreq-allow-ondemand-and-conservative-cpufreq-governors-to-be-used-as-default.patch.
Maybe we went and chose a different governor for you?

Perhaps it would be helpful if you could do a 

	diff -u dmesg-2.6.23-rc3 dmesg-2.6.23-rc3-mm1


OK, thanks.
-

From: Pallipadi, Venkatesh
Date: Friday, August 24, 2007 - 5:13 pm

This is probably not related to cpufreq changes itself. Looks like it is

Not sure why this is failing though. Don't recall any significant
changes in processor.ko recently apart from CPUIDLE stuff.

Thanks,
Venki
-

From: Pallipadi, Venkatesh
Date: Friday, August 24, 2007 - 5:38 pm

This is indeed related to CPUIDLE.

Tilman: Can you configure CONFIG_CPU_IDLE in your config (under Power
Management option) and double check that the frequency part works after
that.

Andrew: Adam Belay sent a recent patchset on linux-pm and linux-acpi and
one of the patches of that addresses this issue (CPUIDLE: load ACPI
properly when CPUIDLE is disabled). Those patches should come to mm soon
through acpi-test.

Thanks,
Venki
-

From: Tilman Schmidt
Date: Saturday, August 25, 2007 - 4:26 pm

Strangely enough, I do not see that option in "make xconfig".
The "Power Management" subtree ends with "CPU Frequency scaling".
In "make menuconfig" the option is there, though.
After activating it, these two errors are indeed gone, and the
"thermal: Unknown symbol acpi_processor_set_thermal_limit" one
as well.

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)

From: Randy Dunlap
Date: Saturday, August 25, 2007 - 4:57 pm

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Tilman Schmidt
Date: Monday, August 27, 2007 - 6:35 am

I'm sorry but I cannot reproduce the phenomenon anymore. After setting
CONFIG_CPU_IDLE with "make menuconfig", when I ran "make xconfig" again
it showed the option too. Moving .config.old back to .config doesn't
make it disappear again. So it seems "make menuconfig" has eliminated
whatever caused this.

If you still want to have a look, both .config and .config.old are
available at http://gollum.phnxsoft.com/~ts/2.6.23-rc3-mm1/ .

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)

From: Dave Jones
Date: Friday, August 24, 2007 - 5:14 pm

On Fri, Aug 24, 2007 at 05:07:02PM -0700, Andrew Morton wrote:

 > > FATAL: Error inserting processor (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko): Input/output error
 > > WARNING: Error inserting processor (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko): Input/output error
 > > 
 > > Aug 25 00:44:00 xenon powersaved[5356]: WARNING (CpufreqManagement:51) No capability cpufreq_control
 > > Aug 25 00:44:00 xenon powersaved[5356]: WARNING (CpufreqManagement:51) No capability cpufreq_control
 > 
 > Dunno, there're some significant-looking cpufreq changes in there, such as
 > cpufreq-allow-ondemand-and-conservative-cpufreq-governors-to-be-used-as-default.patch.
 > Maybe we went and chose a different governor for you?
 
More likely, he was using a cpufreq driver that required acpi
functionality, and because processor.ko went boom, the house of cards
came tumbling down.

I long for the olde days when acpi changes didn't end up with
finger pointing at cpufreq.

	Dave

-- 
http://www.codemonkey.org.uk
-

From: john stultz
Date: Friday, August 24, 2007 - 5:21 pm

Hrmm. I'm not super familiar w/ SuSE's init scripts, but I'm guessing
that's the ntpdate call. And "Tuesday Jul 24th"? Sounds about a month


Does this show up before or after the above date stuff? 
Does the issue go away using an older kernel (I want to eliminate easy
stuff like CMOS batteries giving up)?

Also you're not using Linus' CMOS corrupting suspend/resume debugging
trick, right (I'm forgetting the CONFIG name).

thanks
-john


-

From: Tilman Schmidt
Date: Saturday, August 25, 2007 - 3:39 pm

Nope. The ntpdate call comes much later, and finally sets the system clock


After the "your system time is not correct" messages, and before the
regular "Try to get initial date and time via NTP" message accompanying

It does. Booting 2.6.23-rc3 after that, the system comes up with none

PM_TRACE? No. The entire PM_DEBUG branch is turned off.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)

From: Tilman Schmidt
Date: Friday, August 24, 2007 - 5:47 pm


The one allowing drivers/scsi/advansys.c to compile with CONFIG_ISA=n:

Date: 	Wed, 22 Aug 2007 10:28:02 -0600
From: Matthew Wilcox <matthew@wil.cx>
Subject: Re: drivers/scsi/advansys.c - ld error ( Re: 2.6.23-rc3-mm1 )
Message-ID: <20070822162802.GJ9163@parisc-linux.org>

When CONFIG_ISA is disabled, the isa_driver support will not be compiled

i386. The machine is a Pentium D 940 which would be x86_64 capable,


I hope the attached helps. I created it by taking /var/log/boot.msg
of the two systems, removing everything after "Kernel logging stopped",
editing out the printk timestamps and then running diff -u on them,
so it should be more or less the dmesg diff. I did not edit out any
of the differences because I'm lazy. (And also because I wasn't so sure
what would or wouldn't be interesting for you.)
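
The "editing out the printk timestamps" step described above could be automated along these lines. This is a sketch only: strip_printk_timestamp() is a hypothetical helper, and it assumes timestamps always have the "[seconds.fraction] " shape seen in this thread, optionally preceded by a "<N>" log level as in boot.msg.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy `line` into `out` (capacity `cap`), dropping a printk timestamp of
 * the form "[  292.062000] " if one follows the optional "<N>" log level.
 * Lines without a timestamp are copied unchanged. */
static void strip_printk_timestamp(const char *line, char *out, size_t cap)
{
    const char *p = line;
    size_t n = 0;

    if (cap == 0)
        return;

    /* Keep an optional syslog level prefix such as "<4>". */
    if (p[0] == '<' && isdigit((unsigned char)p[1]) && p[2] == '>') {
        while (n < 3 && n + 1 < cap) {
            out[n] = p[n];
            n++;
        }
        p += 3;
    }

    /* Recognise "[", spaces, digits, ".", digits, "]", optional space. */
    if (*p == '[') {
        const char *q = p + 1;

        while (*q == ' ')
            q++;
        if (isdigit((unsigned char)*q)) {
            while (isdigit((unsigned char)*q))
                q++;
            if (*q == '.' && isdigit((unsigned char)q[1])) {
                q++;
                while (isdigit((unsigned char)*q))
                    q++;
                if (*q == ']') {
                    q++;
                    if (*q == ' ')
                        q++;
                    p = q;      /* timestamp matched: skip it */
                }
            }
        }
    }

    /* Copy whatever remains, truncating if the buffer is too small. */
    while (*p != '\0' && n + 1 < cap)
        out[n++] = *p++;
    out[n] = '\0';
}
```

Running both logs through such a filter before diff -u keeps lines that differ only in their timestamps from showing up as changes. Bracketed prefixes that are not timestamps, such as "[drm]", are left alone.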

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)

Attachment: bootmsg.diff

--- /tmp/bootmsg-2.6.23-rc3	2007-08-25 02:25:54.000000000 +0200
+++ /tmp/bootmsg-2.6.23-rc3-mm1	2007-08-25 02:26:08.000000000 +0200
@@ -1,10 +1,10 @@
-Inspecting /boot/System.map-2.6.23-rc3-testing
-Loaded 27522 symbols from /boot/System.map-2.6.23-rc3-testing.
+Inspecting /boot/System.map-2.6.23-rc3-mm1-testing
+Loaded 28663 symbols from /boot/System.map-2.6.23-rc3-mm1-testing.
 Symbols match kernel version 2.6.23.
 No module symbols loaded - kernel modules not enabled.
 
 klogd 1.4.1, log source = ksyslog started.
-<5>Linux version 2.6.23-rc3-testing (ts@xenon) (gcc version 4.1.2 20061115 (prerelease) ...
From: Andrew Morton
Date: Friday, August 24, 2007 - 8:30 pm

I wonder if that was supposed to happen.  It's also happening in 2.6.23-rc3
base.


I don't see anything there which would cause you to lose the clock setting,
but there are obviously a few things going wrong in the time-management
area here.  Please explicitly retest this stuff as the code evolves and keep
us informed of the problems.

Thanks.
-

From: Dave Jones
Date: Friday, August 24, 2007 - 9:28 pm

On Fri, Aug 24, 2007 at 08:30:00PM -0700, Andrew Morton wrote:
 
 > >  <6>Linux agpgart interface v0.102
 > > +<6>rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
 > > +<4>rtc_cmos: probe of 00:03 failed with error -16
 > > +<6>agpgart: suspend/resume problematic: resume with 3D/DRI active may lockup X.Org
 > > +<4>on some chipset/BIOS combos (see DEBUG_AGP_PM in intel-agp.c)
 > >  <6>agpgart: Detected an Intel 965Q Chipset.
 > >  <6>agpgart: Unknown page table size, assuming 512KB
 > >  <6>agpgart: Detected 7676K stolen memory.
 > >  <6>agpgart: AGP aperture is 256M @ 0x40000000
 > > -<6>rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
 > > -<4>rtc_cmos: probe of 00:03 failed with error -16
 > 
 > I wonder if that was supposed to happen.  It's also happening in 2.6.23-rc3
 > base.
 
EBUSY. I've seen this happen when you have both CONFIG_RTC and
CONFIG_RTC_DRV_CMOS set.
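
For readers decoding "failed with error -16" by hand: probe failures are reported as negative errno values, and 16 is EBUSY in the Linux numbering. A small userspace sketch follows; describe_probe_error() is a hypothetical helper, and the exact strerror() text depends on the C library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format a readable description of a kernel-style negative errno value,
 * e.g. the -16 from "probe of 00:03 failed with error -16". */
static void describe_probe_error(int err, char *out, size_t cap)
{
    int code = err < 0 ? -err : err;

    /* strerror() supplies the C library's text for the errno value;
     * the EBUSY tag is added only when the code matches. */
    snprintf(out, cap, "error %d: %s%s", err, strerror(code),
             code == EBUSY ? " (EBUSY)" : "");
}
```

Called with -16 on Linux, the result carries the "(EBUSY)" tag, matching the diagnosis above that a second driver has already claimed the RTC resources.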

	Dave

-- 
http://www.codemonkey.org.uk
-

From: Paul Rolland
Date: Saturday, August 25, 2007 - 12:55 am

Hello,

On Sat, 25 Aug 2007 00:28:09 -0400

This one is becoming worth an entry in a FAQ, it pops up once every
month ;)
There was a discussion about preventing both being set at the same time
when configuring, but I don't remember how it ended... 

Paul


-

From: Tilman Schmidt
Date: Saturday, August 25, 2007 - 4:37 pm

I must have missed that discussion. I have:
CONFIG_RTC=y
CONFIG_RTC_DRV_CMOS=m
because both of these options claim in their help texts that you
should select them if you want to access the PC RTC.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)

From: Tilman Schmidt
Date: Wednesday, September 5, 2007 - 1:41 pm

With 2.6.23-rc4-mm1 this doesn't happen anymore,

This still happens identically with 2.6.23-rc4-mm1.

2.6.23-rc4-mm1 reverts to mainline behaviour here.
(ie. "busy" instead of "no address or irqs")

Gone in 2.6.23-rc4-mm1.

HTH
Tilman

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)

From: Jiri Slaby
Date: Sunday, August 26, 2007 - 6:04 am

Hi,

I've found a regression against 2.6.23-rc2-mm2. X server shutdown hard-freezes
the (untainted) kernel. Nothing on netconsole, X output follows:


X Window System Version 1.3.0
Release Date: 19 April 2007
X Protocol Version 11, Revision 0, Release 1.3
Build Operating System: Fedora Core 7 Red Hat, Inc.
Current Operating System: Linux bellona 2.6.23-rc3-mm1 #315 SMP Wed Aug 22
21:43:06 CEST 2007 i686
Build Date: 11 June 2007
Build ID: xorg-x11-server 1.3.0.0-9.fc7
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Module Loader present
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sun Aug 26 14:22:43 2007
(==) Using config file: "/etc/X11/xorg.conf"
(WW) RADEON: No matching Device section for instance (BusID PCI:1:0:1) found
(**) RADEON(0): RADEONPreInit
(II) Module already built-in
(II) Module already built-in
(II) Module already built-in
(**) RADEON(0): RADEONScreenInit f0000000 0
(**) RADEON(0): Map: 0xf0000000, 0x04000000
(**) RADEON(0): RADEONSave
(**) RADEON(0): RADEONSaveMode(0x8240870)
(**) RADEON(0): Read: 0x0000000c 0x00030065 0x00000000
(**) RADEON(0): Read: rd=12, fd=101, pd=3
(**) RADEON(0): RADEONSaveMode returns 0x8240870
(**) RADEON(0): DRI New memory map param
(**) RADEON(0): RADEONInitMemoryMap() :
(**) RADEON(0):   mem_size         : 0x04000000
(**) RADEON(0):   MC_FB_LOCATION   : 0xf3fff000
(**) RADEON(0):   MC_AGP_LOCATION  : 0xffffffc0
(**) RADEON(0): RADEONModeInit()
1280x1024     108.00  1280 1328 1440 1688  1024 1025 1028 1066 (24,32) +H +V
1280x1024     108.00  1280 1328 1440 1688  1024 1025 1028 1066 (24,32) +H +V
(**) RADEON(0): Pitch = 10485920 bytes (virtualX = 1280, displayWidth = 1280)
(**) RADEON(0): dc=10800, of=21600, fd=96, pd=2
(**) RADEON(0): RADEONInit returns ...
From: Jiri Slaby
Date: Tuesday, August 28, 2007 - 4:41 am

Did this go through to your boxes? Any progress, clue, idea?



-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Jiri Slaby
Date: Sunday, September 9, 2007 - 4:44 am

The intel driver on integrated i915 also causes this (NoAccel has no effect in this case).
Note that 2.6.23-rc4-mm1 is also affected by this behaviour.
I have a trace for you:
http://www.fi.muni.cz/~xslaby/sklad/panics/x-freeze.png
(this is all I'm able to grab so far)

regards,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Andrew Morton
Date: Sunday, September 9, 2007 - 5:47 am

afaict everything on that call trace is good.  I guess it's possible that
one of the higher-level loops has gone infinite (eg, the one in
agp_remove_controller()).

Are you able to get netconsole working, and run sysrq-P and sysrq-T
ten or so times, see if it's always stuck in the same place on the
same CPU?
-

From: Jiri Slaby
Date: Sunday, September 9, 2007 - 6:04 am

Removed gareth@valinux.com (dead e-mail)


--------------------------------^^^

sorry, both netconsole and my usb devices (including my keyboard -- no numlock

There is no loop in this function. Anyway, going to track this whole issue down.

thanks so far,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Jiri Slaby
Date: Sunday, September 9, 2007 - 7:08 am

Hm, I suspect Andi's x86_64-mm-cpa-clflush.patch or something like that. It
loops in flush_kernel_map in list_for_each_entry on the first CPU. The a->l list
is somehow corrupted I guess.

regards,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Andi Kleen
Date: Sunday, September 9, 2007 - 7:17 am

Does it still happen with the latest version 

ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/cpa-clflush

?

(you might need to replace the other cpa-* patches too because
they depend on each other) 

-Andi
-

From: Jiri Slaby
Date: Sunday, September 9, 2007 - 7:26 am

I think so :)
$ diff -u x86_64-mm-cpa-clflush.patch cpa-clflush |wc -l

And are there any changes against the -rc4-mm1 in those patches?

BTW it is reproducible for me on two different machines (i386-x86_64,
radeon-intel), don't you have the problem too?

regards,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Andi Kleen
Date: Sunday, September 9, 2007 - 7:33 am

No problems here with a radeon, no.

Does your CPU have clflush or not in /proc/cpuinfo?

-Andi
-

From: Jiri Slaby
Date: Sunday, September 9, 2007 - 7:35 am

yes:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Duo CPU     E6850  @ 3.00GHz
stepping        : 11
cpu MHz         : 2992.543
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr lahf_lm
bogomips        : 5991.99
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Duo CPU     E6850  @ 3.00GHz
stepping        : 11
cpu MHz         : 2992.543
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr lahf_lm
bogomips        : 5985.42
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
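
The clflush question above comes down to a whole-word match on the flags line. A sketch of such a check, assuming the caller has already extracted the text after "flags :" from /proc/cpuinfo; cpu_flags_contain() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated word in
 * `flags`, the text after "flags :" in /proc/cpuinfo. */
static bool cpu_flags_contain(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        /* A real match starts at the beginning or after whitespace... */
        bool starts_word = (p == flags) || p[-1] == ' ' ||
                           p[-1] == '\t' || p[-1] == '\n';
        /* ...and is followed by whitespace or the end of the string. */
        char after = p[len];
        bool ends_word = after == '\0' || after == ' ' ||
                         after == '\t' || after == '\n';

        if (starts_word && ends_word)
            return true;
        p++;    /* partial match (e.g. "tm" inside "tm2"): keep looking */
    }
    return false;
}
```

The word-boundary test matters because flag names overlap: a plain substring search for "tm" would also hit "tm2", and "sse3" would hit "ssse3".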

-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Jiri Slaby
Date: Sunday, September 9, 2007 - 7:43 am

BTW this is what my flush_kernel_map looks like:

static void flush_kernel_map(void *arg)
{
        struct flush_arg *a = (struct flush_arg *)arg;
        struct page *pg;
        unsigned int xx = 0;

        /* When clflush is available use it because it is
           much cheaper than WBINVD. */
        printk("%s: 1\n", __func__);
        if (a->full_flush || !cpu_has_clflush)
                asm volatile("wbinvd" ::: "memory");
        else list_for_each_entry(pg, &a->l, lru) {
                printk("%s: %10u 1a\n", __func__, xx++);
                if (PageFlush(pg))
                        clflush_cache_range(page_address(pg), PAGE_SIZE);
        }
        printk("%s: 2\n", __func__);
        __flush_tlb_all();
        printk("%s: 3\n", __func__);
}

It outputs 1a in the infinite loop with incrementing xx. But only in this case,
some global_flush_tlb are OK. e.g.:
Sep 10 01:39:19 localhost kernel: agpgart: Detected an Intel G33 Chipset.
Sep 10 01:39:19 localhost kernel: global_flush_tlb: 1
Sep 10 01:39:19 localhost kernel: flush_kernel_map: 1
Sep 10 01:39:19 localhost kernel: flush_kernel_map:          0 1a
Sep 10 01:39:19 localhost kernel: flush_kernel_map:          1 1a
Sep 10 01:39:19 localhost kernel: flush_kernel_map: 2
Sep 10 01:39:19 localhost kernel: flush_kernel_map: 3
Sep 10 01:39:19 localhost kernel: flush_kernel_map: 1
Sep 10 01:39:19 localhost kernel: flush_kernel_map:          0 1a
Sep 10 01:39:19 localhost kernel: flush_kernel_map:          1 1a
Sep 10 01:39:19 localhost kernel: flush_kernel_map: 2
Sep 10 01:39:19 localhost kernel: flush_kernel_map: 3
Sep 10 01:39:19 localhost kernel: global_flush_tlb: 2
Sep 10 01:39:19 localhost kernel: global_flush_tlb: 3
Sep 10 01:39:19 localhost kernel: agpgart: Detected 6140K stolen memory.

It seems that the list is broken only on X shutdown. How can the deferred-pages
list end up badly initialized when list_replace_init is called on it? Weird.
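
Why a damaged list shows up as an endless stream of "1a" lines: a list_for_each style walk stops only when the cursor gets back to the head, so a next pointer that leads into a cycle not containing the head never terminates. A userspace sketch with a bounded walk follows; struct demo_list and the demo_list_* helpers are hypothetical stand-ins for the kernel's struct list_head API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct demo_list {
    struct demo_list *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void demo_list_init(struct demo_list *head)
{
    head->next = head;
    head->prev = head;
}

/* Link `node` in just before `head`, i.e. at the tail of the list. */
static void demo_list_add_tail(struct demo_list *node, struct demo_list *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk the list as list_for_each would, but give up after `limit` nodes.
 * Returns true and stores the node count if the walk got back to `head`,
 * false if it did not (the list is longer than `limit` or is corrupted). */
static bool demo_list_walk(const struct demo_list *head, size_t limit,
                           size_t *count)
{
    const struct demo_list *pos;
    size_t n = 0;

    for (pos = head->next; pos != head; pos = pos->next) {
        if (n == limit)
            return false;
        n++;
    }
    *count = n;
    return true;
}
```

If the last node's next pointer is redirected to an earlier node instead of the head, demo_list_walk() reports failure once the limit is reached, where an unbounded loop would spin forever, as observed in flush_kernel_map.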

-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, ...
From: Andi Kleen
Date: Sunday, September 9, 2007 - 8:01 am

Can you stick a printk into change_page_attr to log in which order
it changes pages (including their addresses and the pgattr)? 

-Andi 
-

From: Jiri Slaby
Date: Sunday, September 9, 2007 - 8:49 am

printk("%s: %p %p %.16lx %d\n", __func__, page, page_address(page),
pgprot_val(prot), numpages);
http://www.fi.muni.cz/~xslaby/sklad/panics/x-freeze_chpa.png

What other info is needed?

-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Dave Airlie
Date: Tuesday, September 11, 2007 - 8:18 am

I'm seeing this on my 965gm chipset with Andi's clflush patches on x86 
32-bit, it looks like an interaction with the agp code which does a big 
bunch of change page attr to allocate the AGP aperture backed memory..

I think the code might have worked in a previous iteration on my 64-bit 
965G machine at home but I'm on the road and won't be back anytime soon..

I'll see what I can figure out from my laptop...

Dave.
-

From: Jiri Slaby
Date: Monday, September 17, 2007 - 4:09 am

Ok, here comes a BUG with trace:
set status page addr 0x00033000
list_add corruption. next->prev should be prev (ffffffff8068ae70), but was
ffffffff80697a50. (next=ffff81000117fbd0).
------------[ cut here ]------------
kernel BUG at /home/l/latest/xxx/lib/list_debug.c:27!
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:02.0/enable
CPU 0
Modules linked in: ipv6 floppy sr_mod rtc_cmos rtc_core cdrom ehci_hcd rtc_lib
usbhid
Pid: 1639, comm: X Not tainted 2.6.23-rc4-mm1_64 #23
RIP: 0010:[<ffffffff80332f49>]  [<ffffffff80332f49>] __list_add+0x39/0x60
RSP: 0018:ffff81000547bd48  EFLAGS: 00010296
RAX: 0000000000000079 RBX: ffff81000117a380 RCX: ffffffff8068b450
RDX: ffff81000317d6a0 RSI: 0000000000000001 RDI: ffffffff8068b420
RBP: ffff81000547bd48 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000006da2
R13: ffff810006c10d10 R14: ffff810006da2000 R15: 8000000000000163
FS:  00007f7a05258710(0000) GS:ffffffff806d1000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000a33040 CR3: 0000000004a43000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process X (pid: 1639, threadinfo ffff81000547a000, task ffff81000317d6a0)
Stack:  ffff81000547bd58 ffffffff80332f7c ffff81000547bdc8 ffffffff80225c56
 ffffffff80225cb5 8000000000000163 ffffffff806830e8 ffffffff806830a0
 ffffffff806830a0 ffff810006da2000 ffff81000547bda8 0000000000006da2
Call Trace:
 [<ffffffff80332f7c>] list_add+0xc/0x10
 [<ffffffff80225c56>] __change_page_attr+0x376/0x390
 [<ffffffff80225cb5>] change_page_attr_addr+0x45/0x140
 [<ffffffff80225d16>] change_page_attr_addr+0xa6/0x140
 [<ffffffff80225de3>] change_page_attr+0x33/0x40
 [<ffffffff80387b64>] agp_generic_destroy_page+0x44/0x70
 [<ffffffff80388645>] agp_free_memory+0x65/0xd0
 [<ffffffff80386d49>] agp_free_memory_wrap+0x39/0x60
 ...
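
The "list_add corruption" message in the report above comes from a consistency check made before a node is linked in: the two neighbours must still point at each other. A userspace sketch of that kind of check follows; struct dbg_list and dbg_list_add_between() are hypothetical names, and the sketch returns false where the kernel's debug code hits BUG().

```c
#include <stdbool.h>
#include <stdio.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct dbg_list {
    struct dbg_list *next, *prev;
};

/* Insert `node` between `prev` and `next`, first verifying that the two
 * neighbours still point at each other. Returns false and leaves the
 * list untouched when they do not. */
static bool dbg_list_add_between(struct dbg_list *node,
                                 struct dbg_list *prev,
                                 struct dbg_list *next)
{
    if (next->prev != prev) {
        fprintf(stderr,
                "list_add corruption. next->prev should be prev (%p), "
                "but was %p. (next=%p).\n",
                (void *)prev, (void *)next->prev, (void *)next);
        return false;
    }
    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be next (%p), "
                "but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return false;
    }

    /* Neighbours are consistent: link the new node in. */
    next->prev = node;
    node->next = next;
    node->prev = prev;
    prev->next = node;
    return true;
}
```

A check like this catches the damage at the next insertion rather than where it happened, so the trace points at the victim of the corruption and not necessarily at the code that caused it.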
From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

"struct menu_governor" needlessly again became global.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
cb33b296204127cf50df54b84b2d79e152fb924b 
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index f5a8865..8d3fdc5 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -117,7 +117,7 @@ static int menu_enable_device(struct cpuidle_device *dev)
 	return 0;
 }
 
-struct cpuidle_governor menu_governor = {
+static struct cpuidle_governor menu_governor = {
 	.name =		"menu",
 	.rating =	20,
 	.enable =	menu_enable_device,

-

From: Adam Belay
Date: Monday, August 27, 2007 - 3:32 pm

This is already fixed in the most recent ACPI CPUIDLE tree.

Thanks,
Adam


-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

parport_device_num() is no longer used.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 Documentation/parport-lowlevel.txt |   29 +++--------------------------
 drivers/parport/daisy.c            |   29 -----------------------------
 include/linux/parport.h            |    1 -
 3 files changed, 3 insertions(+), 56 deletions(-)

0066510df2b5d4972cfd6a4450af8b82c763adfd 
diff --git a/Documentation/parport-lowlevel.txt b/Documentation/parport-lowlevel.txt
index 8f23024..265fcdc 100644
--- a/Documentation/parport-lowlevel.txt
+++ b/Documentation/parport-lowlevel.txt
@@ -25,7 +25,6 @@ Global functions:
   parport_open
   parport_close
   parport_device_id
-  parport_device_num
   parport_device_coords
   parport_find_class
   parport_find_device
@@ -735,7 +734,7 @@ NULL is returned.
 
 SEE ALSO
 
-parport_register_device, parport_device_num
+parport_register_device
 
 parport_close - unregister device for particular device number
 -------------
@@ -787,29 +786,7 @@ Many devices have ill-formed IEEE 1284 Device IDs.
 
 SEE ALSO
 
-parport_find_class, parport_find_device, parport_device_num
-
-parport_device_num - convert device coordinates to device number
-------------------
-
-SYNOPSIS
-
-#include <linux/parport.h>
-
-int parport_device_num (int parport, int mux, int daisy);
-
-DESCRIPTION
-
-Convert between device coordinates (port, multiplexor, daisy chain
-address) and device number (zero-based).
-
-RETURN VALUE
-
-Device number, or -1 if no device at given coordinates.
-
-SEE ALSO
-
-parport_device_coords, parport_open, parport_device_id
+parport_find_class, parport_find_device
 
 parport_device_coords - convert device number to device coordinates
 ------------------
@@ -833,7 +810,7 @@ Zero on success, in which case the coordinates are (*parport, *mux,
 
 SEE ALSO
 
-parport_device_num, parport_open, parport_device_id
+parport_open, parport_device_id
 
 parport_find_class - find a device by its class
 ------------------
diff ...
From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

do_restart_poll() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
59cd2d11f5f0189973bb280c59262eb50984cb88 
diff --git a/fs/select.c b/fs/select.c
index 5a3ab01..3e515aa 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -711,7 +711,7 @@ out_fds:
 	return err;
 }
 
-long do_restart_poll(struct restart_block *restart_block)
+static long do_restart_poll(struct restart_block *restart_block)
 {
 	struct pollfd __user *ufds = (struct pollfd __user*)restart_block->arg0;
 	int nfds = restart_block->arg1;

-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

snd_ctl_elem_{read,write} no longer have any modular users.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 sound/core/control.c |    4 ----
 1 file changed, 4 deletions(-)

23e15051dde57c569e4c9aff1339aaf64185ea71 
diff --git a/sound/core/control.c b/sound/core/control.c
index 396e98e..6144d8a 100644
--- a/sound/core/control.c
+++ b/sound/core/control.c
@@ -716,8 +716,6 @@ int snd_ctl_elem_read(struct snd_card *card, struct snd_ctl_elem_value *control)
 	return result;
 }
 
-EXPORT_SYMBOL(snd_ctl_elem_read);
-
 static int snd_ctl_elem_read_user(struct snd_card *card,
 				  struct snd_ctl_elem_value __user *_control)
 {
@@ -781,8 +779,6 @@ int snd_ctl_elem_write(struct snd_card *card, struct snd_ctl_file *file,
 	return result;
 }
 
-EXPORT_SYMBOL(snd_ctl_elem_write);
-
 static int snd_ctl_elem_write_user(struct snd_ctl_file *file,
 				   struct snd_ctl_elem_value __user *_control)
 {

-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

sys_{open,read} can finally be unexported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 fs/open.c       |    1 -
 fs/read_write.c |    1 -
 2 files changed, 2 deletions(-)

6f6884f9ee675f2e804c6c58ca46337f9765dd0d 
diff --git a/fs/open.c b/fs/open.c
index 23f334d..c0814de 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1057,7 +1057,6 @@ asmlinkage long sys_open(const char __user *filename, int flags, int mode)
 	prevent_tail_call(ret);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(sys_open);
 
 asmlinkage long sys_openat(int dfd, const char __user *filename, int flags,
 			   int mode)
diff --git a/fs/read_write.c b/fs/read_write.c
index 507ddff..d913d1e 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -370,7 +370,6 @@ asmlinkage ssize_t sys_read(unsigned int fd, char __user * buf, size_t count)
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(sys_read);
 
 asmlinkage ssize_t sys_write(unsigned int fd, const char __user * buf, size_t count)
 {

-

From: Arjan van de Ven
Date: Monday, August 27, 2007 - 3:53 pm

On Mon, 27 Aug 2007 23:27:23 +0200

isn't sys_close in the same boat?
-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 4:17 pm

Still used in fs/binfmt_misc.c ...

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

<--  snip  -->

...
  AS      arch/m32r/kernel/entry.o
/home/bunk/linux/kernel-2.6/linux-2.6.23-rc3-mm1/arch/m32r/kernel/entry.S: Assembler messages:
/home/bunk/linux/kernel-2.6/linux-2.6.23-rc3-mm1/arch/m32r/kernel/entry.S:358: Error: bad instruction `addi r0,#(((((0)+(64))+(32))+(32)))'
make[2]: *** [arch/m32r/kernel/entry.o] Error 1

<--  snip  -->

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Hirokazu Takata
Date: Monday, August 27, 2007 - 8:50 pm

From: Adrian Bunk <bunk@kernel.org>
Subject: 2.6.23-rc3-mm1: m32r defconfig compile error

Hello, Adrian,

Thank you for the report.


M32700UT/OPSPUT Users,

Please apply this patch to build an m32r 2.6.23-rc3-mm1 kernel.

This patch has also been included in my m32r kernel development git repository:
 git://www.linux-m32r.org/git/takata/linux-2.6_dev.git linux-m32r

Thanks,

-- Takata


[PATCH 2.6.23-rc3-mm1] m32r: build fix of entry.S

This patch is required to fix build errors for the modification:
  m32r: Simplify ei_handler code
  commit f6c7546d53a4288501dcdd96d5297214697e7237

Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
---
 arch/m32r/kernel/entry.S |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/m32r/kernel/entry.S b/arch/m32r/kernel/entry.S
index c46cfaa..42b08bf 100644
--- a/arch/m32r/kernel/entry.S
+++ b/arch/m32r/kernel/entry.S
@@ -355,7 +355,7 @@ ENTRY(ei_handler)
 	lduh	r0, @(low(M32R_INT0ICU_ISTS),r0)	; bit10-6 : ISN
 	slli	r0, #21
 	srli	r0, #27				; ISN
-	addi	r0, #(M32R_INT0ICU_IRQ_BASE)
+	add3	r0, r0, #(M32R_INT0ICU_IRQ_BASE)
 	bra	check_end
 	.fillinsn
 4:
@@ -367,7 +367,7 @@ ENTRY(ei_handler)
 	lduh	r0, @(low(M32R_INT2ICU_ISTS),r0)	; bit10-6 : ISN
 	slli	r0, #21
 	srli	r0, #27				; ISN
-	addi	r0, #(M32R_INT2ICU_IRQ_BASE)
+	add3	r0, r0, #(M32R_INT2ICU_IRQ_BASE)
 	; bra	check_end
 	.fillinsn
 5:
-- 
1.5.2.4

--
Hirokazu Takata <takata@linux-m32r.org>
Linux/M32R Project:  http://www.linux-m32r.org/
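
A note on why the assembler rejects the original instruction: as the error message shows, the macro expands to ((0+64)+32)+32 = 128. On M32R, addi takes an 8-bit signed immediate (-128..127), so 128 does not fit, while add3 takes a 16-bit signed immediate. A minimal C sketch of that range check follows; fits_signed_imm is a hypothetical helper name used only for illustration, not code from the kernel or binutils.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: does `value` fit in a signed immediate field
 * that is `bits` bits wide? */
static bool fits_signed_imm(long value, int bits)
{
	assert(bits > 0 && bits < 32);

	long min = -(1L << (bits - 1));     /* e.g. -128 for 8 bits */
	long max = (1L << (bits - 1)) - 1;  /* e.g.  127 for 8 bits */

	return value >= min && value <= max;
}
```

With this helper, fits_signed_imm(128, 8) is false (the addi case) and fits_signed_imm(128, 16) is true (the add3 case).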
-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:27 pm

This patch removes the unused unwind exports.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 kernel/unwind.c |    4 ----
 1 file changed, 4 deletions(-)

844ccf670a8204df45b89407bb0e5867f03d0f71 
diff --git a/kernel/unwind.c b/kernel/unwind.c
index 8c267c7..adb1ebe 100644
--- a/kernel/unwind.c
+++ b/kernel/unwind.c
@@ -1243,7 +1243,6 @@ int unwind(struct unwind_frame_info *frame)
 #undef CASES
 #undef FRAME_REG
 }
-EXPORT_SYMBOL(unwind);
 
 int unwind_init_frame_info(struct unwind_frame_info *info,
                            struct task_struct *tsk,
@@ -1255,7 +1254,6 @@ int unwind_init_frame_info(struct unwind_frame_info *info,
 
 	return 0;
 }
-EXPORT_SYMBOL(unwind_init_frame_info);
 
 /*
  * Prepare to unwind a blocked task.
@@ -1269,7 +1267,6 @@ int unwind_init_blocked(struct unwind_frame_info *info,
 
 	return 0;
 }
-EXPORT_SYMBOL(unwind_init_blocked);
 
 /*
  * Prepare to unwind the currently running thread.
@@ -1284,5 +1281,4 @@ int unwind_init_running(struct unwind_frame_info *info,
 
 	return arch_unwind_init_running(info, callback, arg);
 }
-EXPORT_SYMBOL(unwind_init_running);
 

-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:28 pm

noautodma can now be unexported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
957dc7601c050cb14a7afc842db0c2d62aaf3509 
diff --git a/drivers/ide/ide.c b/drivers/ide/ide.c
index b3b5f00..5b09066 100644
--- a/drivers/ide/ide.c
+++ b/drivers/ide/ide.c
@@ -100,8 +100,6 @@ static int ide_scan_direction; /* THIS was formerly 2.2.x pci=reverse */
 
 int noautodma = 0;
 
-EXPORT_SYMBOL(noautodma);
-
 #ifdef CONFIG_BLK_DEV_IDEACPI
 int ide_noacpi = 0;
 int ide_noacpitfs = 1;

-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:28 pm

This patch fixes an obvious bug in mixdev_open_devices().

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
bb574366744163ff84609843ee43e84a39f57d5a 
diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index 2ad8633..4ca0ad3 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -461,7 +461,7 @@ static void mixdev_open_devices(void)
 
 	list_for_each_entry(mousedev, &mousedev_mix_list, mixdev_node) {
 		if (!mousedev->mixdev_open) {
-			if (mousedev_open_device(mousedev));
+			if (mousedev_open_device(mousedev))
 				continue;
 
 			mousedev->mixdev_open = 1;

-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:29 pm

This patch fixes an obvious bug in ivtvfb_release_buffers().

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
093bdc9ba94bffbec2ed44743418899771488832 
diff --git a/drivers/media/video/ivtv/ivtv-fb.c b/drivers/media/video/ivtv/ivtv-fb.c
index 0080765..8a344d5 100644
--- a/drivers/media/video/ivtv/ivtv-fb.c
+++ b/drivers/media/video/ivtv/ivtv-fb.c
@@ -1068,8 +1068,8 @@ static void ivtvfb_release_buffers (struct ivtv *itv)
 	struct osd_info *oi = itv->osd_info;
 
 	/* Release cmap */
-	if (oi->ivtvfb_info.cmap.len);
-	fb_dealloc_cmap(&oi->ivtvfb_info.cmap);
+	if (oi->ivtvfb_info.cmap.len)
+		fb_dealloc_cmap(&oi->ivtvfb_info.cmap);
 
 	/* Release pseudo palette */
 	if (oi->ivtvfb_info.pseudo_palette)

-

From: Hans Verkuil
Date: Monday, August 27, 2007 - 11:30 pm

Ouch. Thanks.

Mauro, I've added this patch to my v4l-dvb tree. Can you pull from it?

Thanks,



-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:29 pm

This patch fixes two obvious bugs.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 drivers/net/wireless/iwl-base.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

600ffdc11b25ac0aee15271d1b2ce99a367efa31 
diff --git a/drivers/net/wireless/iwl-base.c b/drivers/net/wireless/iwl-base.c
index b8293fe..f65c30e 100644
--- a/drivers/net/wireless/iwl-base.c
+++ b/drivers/net/wireless/iwl-base.c
@@ -343,7 +343,7 @@ int iwl_tx_queue_init(struct iwl_priv *priv,
 	 * command is very huge the system will not have two scan at the
 	 * same time */
 	len = sizeof(struct iwl_cmd) * slots_num;
-	if (txq_id == IWL_CMD_QUEUE_NUM);
+	if (txq_id == IWL_CMD_QUEUE_NUM)
 		len +=  IWL_MAX_SCAN_SIZE;
 	txq->cmd = pci_alloc_consistent(dev, len, &txq->dma_addr_cmd);
 	if (!txq->cmd)
@@ -390,7 +390,7 @@ void iwl_tx_queue_free(struct iwl_priv *priv, struct iwl_tx_queue *txq)
 		iwl_hw_txq_free_tfd(priv, txq);
 
 	len = sizeof(struct iwl_cmd) * q->n_window;
-	if (q->id == IWL_CMD_QUEUE_NUM);
+	if (q->id == IWL_CMD_QUEUE_NUM)
 		len += IWL_MAX_SCAN_SIZE;
 
 	pci_free_consistent(dev, len, txq->cmd, txq->dma_addr_cmd);
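
The mousedev, ivtv-fb and iwl-base fixes all remove the same mistake: in `if (cond);` the semicolon is an empty statement that becomes the entire body of the if, so the indented line that follows runs unconditionally. The sketch below spells out how the compiler reads the buggy form next to the fixed form. The function names and constant values are hypothetical, loosely modelled on the iwl hunk.

```c
#include <stddef.h>

#define CMD_QUEUE_NUM 4     /* hypothetical queue id */
#define MAX_SCAN_SIZE 1024  /* hypothetical extra bytes for the command queue */

/* How the compiler reads "if (txq_id == CMD_QUEUE_NUM);": the if has an
 * empty body, and the addition is a separate, unconditional statement. */
static size_t cmd_len_buggy(int txq_id, size_t base)
{
	size_t len = base;

	if (txq_id == CMD_QUEUE_NUM) {
		/* empty body: this is all the stray ';' leaves of the if */
	}
	len += MAX_SCAN_SIZE;

	return len;
}

/* Fixed form: the addition is the body of the if. */
static size_t cmd_len_fixed(int txq_id, size_t base)
{
	size_t len = base;

	if (txq_id == CMD_QUEUE_NUM)
		len += MAX_SCAN_SIZE;

	return len;
}
```

For any queue other than CMD_QUEUE_NUM the buggy version over-counts by MAX_SCAN_SIZE. In the iwl case that means a larger allocation than intended, applied consistently on both the alloc and free paths.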

-

From: Tomas Winkler
Date: Monday, August 27, 2007 - 3:34 pm

Shame on me.
-

From: Adrian Bunk
Date: Monday, August 27, 2007 - 2:29 pm

-maccumulate-outgoing-args on i386 count:
1 + 1 - 1 = 1

If the stack unwinder needs it, please enable it explicitly, and only when
the unwinder is enabled - we are talking about a 2.5% size increase
with defconfig.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Jiri Slaby
Date: Tuesday, August 28, 2007 - 4:32 am

I got this during gxine initialization of the ocko.tv live stream, with no CD
in the CD-ROM drives:

BUG: unable to handle kernel NULL pointer dereference at virtual address 0000005c
printing eip: f88fbe7a *pde = 00000000
Oops: 0000 [#1] SMP
Modules linked in: ath5k arc4 ecb blkcipher cryptomgr crypto_algapi
rc80211_simple mac80211 cfg80211 nls_cp437 vfat fat usb_storage tun ipv6 floppy
parport_pc parport ohci1394 ieee1394 usbhid sr_mod ehci_hcd cdrom ff_memless

Pid: 2809, comm: hald-addon-stor Not tainted (2.6.23-rc3-mm1 #315)
EIP: 0060:[<f88fbe7a>] EFLAGS: 00010246 CPU: 1
EIP is at sr_block_release+0xb/0x2c [sr_mod]
EAX: 00000000 EBX: 00000000 ECX: f88fbe6f EDX: 00000000
ESI: c21c36c0 EDI: c289a780 EBP: c3729f18 ESP: c3729f10
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process hald-addon-stor (pid: 2809, ti=c3729000 task=c1c2be40 task.ti=c3729000)
Stack: 00000000 c21c36c0 c3729f38 c018d7ad c21c36cc c1f9ff80 c21c3730 c21c36c0
       c2a6ada0 dcbb3f80 c3729f40 c018d7dc c3729f4c c018e103 00000010 c3729f74
       c016bc5f 00000000 00000000 c217fa80 c1f9ff80 c2a6ada0 dcbb3f80 c1cc6900
Call Trace:
 [<c0105022>] show_trace_log_lvl+0x1a/0x30
 [<c01050dd>] show_stack_log_lvl+0xa5/0xca
 [<c01051d2>] show_registers+0xd0/0x1c1
 [<c01053cd>] die+0x10a/0x24d
 [<c011afbe>] do_page_fault+0x496/0x608
 [<c03768e2>] error_code+0x72/0x78
 [<c018d7ad>] __blkdev_put+0x125/0x14a
 [<c018d7dc>] blkdev_put+0xa/0xc
 [<c018e103>] blkdev_close+0x29/0x2c
 [<c016bc5f>] __fput+0xa6/0x161
 [<c016bea4>] fput+0x22/0x3b
 [<c016960b>] filp_close+0x41/0x67
 [<c016a78c>] sys_close+0x60/0x9f
 [<c01040ce>] syscall_call+0x7/0xb
 =======================
Code: 0c 81 c3 4c 01 00 00 89 5c 24 08 89 44 24 04 c7 04 24 88 cd 8f f8 e8 99 84
82 c7 e9 04 fe ff ff 55 89 e5 56 53 8b 80 04 01 00 00 <8b> 40 5c 8b 70 3c 8d 46
18 e8 cf f6 fe ff 89 c3 85 c0 75 07 89
EIP: [<f88fbe7a>] sr_block_release+0xb/0x2c [sr_mod] SS:ESP 0068:c3729f10

regards,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Satyam Sharma
Date: Tuesday, August 28, 2007 - 8:08 am

Hi Jiri,



Yup, that's an old habit of hald-addon-storage ... doing open(2),
ioctl(2) and close(2) on the cdrom block device even when it's idle,

blkdev_put(bdev, ...)
	__blkdev_put(bdev, ...)
		sr_block_release(bdev->bd_inode, ...)
		(sees bdev->bd_inode->i_bdev->bd_disk == NULL)

Doesn't seem like an sr_block_release() (or sr_mod) issue to me at all; it
looks more like a weird race in the block_device code ... can you send
or put up a link to your .config?

If this is reproducible (I bet it isn't, though) you could try bisecting.


Satyam
-

From: Jiri Slaby
Date: Tuesday, August 28, 2007 - 8:21 am

-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-

From: Andrew Morton
Date: Tuesday, August 28, 2007 - 7:58 pm

Possibly due to remove-bdput-from-do_open-in-fs-block_devc.patch.

That patch is "wrong" and I think the problem which it attempts to address
actually lies in the cdrom code.  viro was taking a look at it but appears
to have recoiled in horror.  I'll drop
remove-bdput-from-do_open-in-fs-block_devc.patch so let's just watch out
for any reoccurrence, thanks.

-

From: Valdis.Kletnieks
Date: Wednesday, August 29, 2007 - 7:04 am

Sorry for not catching this one sooner, but AFAICT, Fedora didn't ship a glibc
that trips over this (2.6.90-12) until Saturday and I installed it yesterday.
-22-rc6-mm1 demonstrated the same issue as well.

The issue:  vdso and gettimeofday seem to be having a quarrel.

At boot, the system clock is just fine.   Right when it hits the 5-minute
uptime mark (and suspiciously close to the jiffie rollover), the date suddenly
shoots forward 4096 seconds.

Dumb test script:

#!/bin/bash
log="uptime.`uname -r`"
touch /root/$log
tail -f /root/$log &
while /bin/true;
do
        uptime >> /root/$log
        date >> /root/$log
        sleep 1
done

Excerpt from runtime:

 19:57:55 up 1 min,  0 users,  load average: 0.09, 0.05, 0.01
Tue Aug 28 19:57:55 EDT 2007
 19:57:56 up 1 min,  0 users,  load average: 0.09, 0.05, 0.01
Tue Aug 28 19:57:56 EDT 2007
 19:57:57 up 2 min,  0 users,  load average: 0.08, 0.05, 0.01
Tue Aug 28 19:57:57 EDT 2007
 19:57:58 up 2 min,  0 users,  load average: 0.08, 0.05, 0.01
...
 20:00:55 up 4 min,  0 users,  load average: 0.01, 0.03, 0.00
Tue Aug 28 20:00:55 EDT 2007
 20:00:56 up 4 min,  0 users,  load average: 0.01, 0.03, 0.00
Tue Aug 28 20:00:56 EDT 2007
 20:00:57 up 5 min,  0 users,  load average: 0.01, 0.03, 0.00
Tue Aug 28 21:09:15 EDT 2007
 20:00:58 up 5 min,  0 users,  load average: 0.01, 0.03, 0.00
Tue Aug 28 21:09:16 EDT 2007
 20:00:59 up 5 min,  0 users,  load average: 0.01, 0.03, 0.00
Tue Aug 28 21:09:17 EDT 2007

uptime keeps reporting the right time, date goes flying ahead.  Once we
get into this state, I can issue a 'date' command to set the *right* time,
and then it will immediately reset back.  Doing a 'touch foo; ls -l foo'
shows the correct time.

Booting with vdso=0 makes the time/date run normally.

Ideas?
From: Valdis.Kletnieks
Date: Wednesday, August 29, 2007 - 10:37 am

On Wed, 29 Aug 2007 10:04:33 EDT, Valdis.Kletnieks@vt.edu said:


This is also open as a Fedora bug:
https://bugzilla.redhat.com/show_bug.cgi?id=262481
From: Andrew Morton
Date: Wednesday, August 29, 2007 - 4:15 pm

So it's an interaction between the x86_64 vdso patches in Andi's tree and 
newer glibc, and we don't know which one is getting it wrong yet?

If I ever get another -mm out the door (have been without electricity for
several days) I'll drop the vdso changes until this is sorted out.
-

From: Ulrich Drepper
Date: Wednesday, August 29, 2007 - 7:46 pm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


glibc does nothing but call the code in the vdso.  We have a function
pointer variable which either has the old vsyscall value or the address
of the function in the vdso.  Everything else is identical.  Unless the
interface of the vdso function is different (which it shouldn't) I don't
think you can blame glibc.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFG1i+K2ijCOnn/RHQRArogAKC3zBeyOzqJRF+x2zj3fBg9iGLdyQCgx9Z3
dv3Izh65+kxKedza6RH3MHk=
=qEdC
-----END PGP SIGNATURE-----
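
The arrangement described in this message can be pictured as a single function pointer that is set once and then used for every call. This is only an illustrative sketch: all names below are hypothetical, the two back ends return fixed values instead of reading a clock, and none of it is glibc's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical time value; stands in for struct timeval. */
struct tv_sketch {
	long sec;
	long usec;
};

typedef int (*gettime_fn)(struct tv_sketch *tv);

/* Stand-ins for the two kernel-provided entry points. They fill in
 * fixed values so the selected path is easy to tell apart. */
static int vsyscall_gettime(struct tv_sketch *tv)
{
	tv->sec = 100;
	tv->usec = 0;
	return 0;
}

static int vdso_gettime(struct tv_sketch *tv)
{
	tv->sec = 200;
	tv->usec = 0;
	return 0;
}

/* The one variable that differs between the two configurations. */
static gettime_fn gettime_impl = vsyscall_gettime;

static void select_gettime(int have_vdso)
{
	gettime_impl = have_vdso ? vdso_gettime : vsyscall_gettime;
}

/* The caller-facing wrapper is identical in both cases. */
static int gettime_sketch(struct tv_sketch *tv)
{
	assert(gettime_impl != NULL);
	return gettime_impl(tv);
}
```

Under this picture the wrapper cannot behave differently unless the function behind the pointer does, which is the point being made above.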
-

From: Valdis.Kletnieks
Date: Thursday, August 30, 2007 - 7:08 am

Don't bother, I tested this last night against a vanilla 2.6.23-rc3 kernel
and it had the same issue as well.  So Andi's vdso patches in his tree and/or
the -mm kernel aren't to blame - it's in mainline as well.  And it's been
in for a while - I also hit it with a 2.6.22-rc6-mm1 kernel.

From: Chuck Ebbert
Date: Friday, August 31, 2007 - 2:21 pm

We also have:

  http://lkml.org/lkml/2007/7/29/376

(Time repeatedly jumps backwards ~4400 seconds.)
-

From: Chuck Ebbert
Date: Thursday, August 30, 2007 - 9:27 am

Problem is present in stock 2.6.23-rc too. Still don't know whether it is
the new glibc code or the vdso that's causing it, though.
-

From: Valdis.Kletnieks
Date: Saturday, September 8, 2007 - 5:24 pm

An update on this issue: another person and I have both reported on
the Red Hat bugzilla that it's a clocksource issue - if you are using the
hpet clocksource, the time warps, but booting with clocksource=acpi_pm works.

This ring any bells?

From: Andi Kleen
Date: Sunday, September 9, 2007 - 12:27 am

Does this patch fix it? 
-Andi

Add missing mask operation to vdso

vdso vgetns() didn't mask the time source offset calculation, which could
lead to time problems with 32bit HPET. Add the masking.

Thanks to Chuck Ebbert for tracking down.

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/arch/x86_64/vdso/vclock_gettime.c
===================================================================
--- linux.orig/arch/x86_64/vdso/vclock_gettime.c
+++ linux/arch/x86_64/vdso/vclock_gettime.c
@@ -34,10 +34,11 @@ static long vdso_fallback_gettime(long c
 
 static inline long vgetns(void)
 {
+	long v;
 	cycles_t (*vread)(void);
 	vread = gtod->clock.vread;
-	return ((vread() - gtod->clock.cycle_last) * gtod->clock.mult) >>
-		gtod->clock.shift;
+	v = (vread() - gtod->clock.cycle_last) & gtod->clock.mask;
+	return (v * gtod->clock.mult) >> gtod->clock.shift;
 }
 
 static noinline int do_realtime(struct timespec *ts)
-

From: Valdis.Kletnieks
Date: Monday, September 10, 2007 - 12:07 pm

Confirming that does indeed fix it - booted with hpet clocksource and vdso=1,
and the time didn't warp at the 5-minute mark.

Tested-By: Valdis Kletnieks <valdis.kletnieks@vt.edu>
From: Chuck Ebbert
Date: Thursday, August 30, 2007 - 9:30 am

Just found this duplicated code in 2.6.23-rc4, maybe it was supposed
to be something else? The second one was added in the x86_64 vdso patch.

arch/x86_64/kernel/vsyscall.c:

void update_vsyscall(struct timespec *wall_time, struct clocksource *clock)
{
        unsigned long flags;

        write_seqlock_irqsave(&vsyscall_gtod_data.lock, flags);
        /* copy vsyscall data */
        vsyscall_gtod_data.clock.vread = clock->vread;
        vsyscall_gtod_data.clock.cycle_last = clock->cycle_last;
        vsyscall_gtod_data.clock.mask = clock->mask;
        vsyscall_gtod_data.clock.mult = clock->mult;
        vsyscall_gtod_data.clock.shift = clock->shift;
        vsyscall_gtod_data.wall_time_sec = wall_time->tv_sec;
        vsyscall_gtod_data.wall_time_nsec = wall_time->tv_nsec;  <===
        vsyscall_gtod_data.sys_tz = sys_tz;
        vsyscall_gtod_data.wall_time_nsec = wall_time->tv_nsec;  <===
        vsyscall_gtod_data.wall_to_monotonic = wall_to_monotonic;
        write_sequnlock_irqrestore(&vsyscall_gtod_data.lock, flags);
}
-

From: Andi Kleen
Date: Saturday, September 1, 2007 - 3:07 am

Must have been a (harmless) merging mistake, but I bet gcc optimizes it out
anyways.

-Andi
-

From: Chuck Ebbert
Date: Friday, September 7, 2007 - 12:39 pm

I did find this after some digging:


In the vdso code:

static inline long vgetns(void)
{
        cycles_t (*vread)(void);
        vread = gtod->clock.vread;
        return ((vread() - gtod->clock.cycle_last) * gtod->clock.mult) >>
                gtod->clock.shift;
}


Looks like an open-coded version of this in the kernel timekeeping code:

static inline s64 __get_nsec_offset(void)
{
        cycle_t cycle_now, cycle_delta;
        s64 ns_offset;

        /* read clocksource: */
        cycle_now = clocksource_read(clock);

        /* calculate the delta since the last update_wall_time: */
        cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;

        /* convert to nanoseconds: */
        ns_offset = cyc2ns(clock, cycle_delta);

        return ns_offset;
}

But the vdso version isn't doing any masking. And the mask is different for
different clocksources, so it has to track the underlying kernel's clocksource
when it gets changed.
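
To make the effect of the missing mask concrete, here is a small user-space sketch of the two calculations. The struct and function names are hypothetical stand-ins for the clocksource fields quoted above, and it uses unsigned 64-bit arithmetic rather than the kernel's types. As a hedged aside, if the HPET ticks at the commonly seen 14.318 MHz (an assumption; the thread does not state the frequency), a 32-bit counter wraps about every 2^32 / 14318180 = 300 seconds, which would line up with the 5-minute mark in the report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the clocksource fields named in the thread. */
struct clock_sketch {
	uint64_t cycle_last; /* counter value at the last timekeeping update */
	uint64_t mask;       /* valid counter bits, e.g. 0xffffffff for a 32-bit HPET */
	uint32_t mult;       /* cycles-to-ns multiplier */
	uint32_t shift;      /* cycles-to-ns shift */
};

/* vdso-style calculation as quoted above: no mask applied. */
static uint64_t ns_unmasked(const struct clock_sketch *c, uint64_t now)
{
	assert(c->shift < 64);
	return ((now - c->cycle_last) * c->mult) >> c->shift;
}

/* Timekeeping-style calculation: mask the delta before scaling, so a
 * counter that wrapped since cycle_last still yields a small delta. */
static uint64_t ns_masked(const struct clock_sketch *c, uint64_t now)
{
	assert(c->shift < 64);
	uint64_t delta = (now - c->cycle_last) & c->mask;
	return (delta * c->mult) >> c->shift;
}
```

With cycle_last = 0xfffffff0 and a wrapped 32-bit reading of 0x10, the masked delta is 0x20 cycles, while the unmasked subtraction borrows into the upper 32 bits and produces an enormous value.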

-

From: Andi Kleen
Date: Saturday, September 8, 2007 - 1:57 am

vdso code needs to be all inlined because the vdso runs in ring 3 and cannot
access other kernel code.

It was open-coded to have more control over the code (vdso requirements

vdso effectively only supports TSC and HPET (the other clock sources are not accessible
from ring 3)

TSC doesn't need a mask, but many HPETs need a 32bit mask; good point.

Does adding the mask to vgetns make the clock problems go away?

-Andi

-

From: Valdis.Kletnieks
Date: Saturday, September 8, 2007 - 8:20 pm

Ahh.. that explains why acpi_pm clocksource doesn't trip over the problem....