Re: Linux 2.6.23

Previous thread: [RFC] [Patch] calgary iommu: Use the first kernel's tce tables in kdump by chandru on Tuesday, October 9, 2007 - 1:40 pm. (20 messages)

Next thread: (resubmitting) typo fixes for 2.6.23 by Matt LaPlante on Tuesday, October 9, 2007 - 3:13 pm. (5 messages)
From: Linus Torvalds
Subject: Linux 2.6.23
Date: Tuesday, October 9, 2007 - 1:54 pm

Finally.

Yeah, it got delayed, not because of any huge issues, but because of 
various bugfixes trickling in and causing me to reset my "release clock" 
all the time. But it's out there now, and hopefully better for the wait.

Not a whole lot of changes since -rc9, although there's a few updates to 
mips, sparc64 and blackfin in there.  Ignoring those arch updates, there's 
basically a number of mostly one-liners (mostly in drivers, but there's 
some networking fixes and some VFS/VM fixes there too).

Shortlog and diffstat appended (both relative to -rc9, of course - the 
full log from 2.6.22 is on kernel.org as usual).

I want this to be what people look at for a few days, but expect the x86 
merge to go ahead after that. So far, all indications are still that it's 
going to be all smooth sailing, but hey, those indicators seem to always 
say that, and only after the fact do people notice any problems ;)

		Linus

---
Akinobu Mita (1):
      [SPARC64]: check fork_idle() error

Al Viro (1):
      fix bogus reporting of signals by audit

Alexey Dobriyan (2):
      Move kasprintf.o to obj-y
      [ROSE]: Fix rose.ko oops on unload

Alexey Kuznetsov (1):
      [SFQ]: Remove artificial limitation for queue limit.

Andrew Morton (1):
      binfmt_flat: checkpatch fixing minimum support for the blackfin relocations

Anton Blanchard (2):
      [POWERPC] Fix xics set_affinity code
      Fix timer_stats printout of events/sec

Attila Kinali (1):
      Add manufacturer and card id of teltonica pcmcia modems

Ben Dooks (2):
      [ARM] 4597/2: OSIRIS: ensure CPLD0 is preserved after suspend
      [ARM] 4598/2: OSIRIS: Ensure we do not get nRSTOUT during suspend

Benjamin Herrenschmidt (1):
      Fix non-terminated PCI match table in PowerMac IDE

Bernd Schmidt (1):
      Binfmt_flat: Add minimum support for the Blackfin relocations

Brian Haley (1):
      [IPv6]: Fix ICMPv6 redirect handling with target multicast address

Bryan Wu (1):
      Blackfin arch: ...
From: Nicholas Miell
Date: Tuesday, October 9, 2007 - 11:12 pm

Does CFS still generate the following sysbench graphs with 2.6.23, or
did that get fixed?

http://people.freebsd.org/~kris/scaling/linux-pgsql.png
http://people.freebsd.org/~kris/scaling/linux-mysql.png

(There's also some interesting FreeBSD vs. Linux graphs in
http://people.freebsd.org/~kris/scaling/Scalability%20Update.pdf , but
AFAIK those comparisons are more indicative of glibc malloc performance
than Linux performance.)

-- 
Nicholas Miell <nmiell@comcast.net>

-

From: Ingo Molnar
Date: Wednesday, October 10, 2007 - 3:14 am

as far as my testsystem goes, v2.6.23 beats v2.6.22.9 in sysbench:

    http://redhat.com/~mingo/misc/sysbench.jpg

As you can see in the graph, v2.6.23 schedules much more consistently 
too. [ v2.6.22 has a small (but potentially statistically insignificant) 
edge at 4-6 clients, and CFS has a slightly better peak (which is 
statistically insignificant). ]

( Config is at http://redhat.com/~mingo/misc/config, system is Core2Duo
  1.83 GHz, mysql-5.0.45, glibc-2.6. Nothing fancy either in the config
  nor in the setup - everything is pretty close to the defaults. )

i'm aware of a 2.6.21 vs. 2.6.23 sysbench regression report, and it 
apparently got resolved after various changes to the test environment:

   http://jeffr-tech.livejournal.com/10103.html

 " [<CFS>] has virtually no dropoff and performs better under load than
   the default 2.6.21 scheduler. " (paraphrased)

(The new link you posted, just a few hours after the release of v2.6.23, 
has not been reported to lkml before AFAICS - when did you become aware 
of it? If you learned about it before v2.6.23 it might have been useful 
to report it to the v2.6.23 regression list.)

At a quick glance there are no .configs or other testing details at or 
around that URL that i could use to reproduce their result precisely, so 
at least a minimal bugreport would be nice.

In any case, here are a few general comments about sysbench numbers:

Sysbench is a pretty 'batched' workload: it benefits most from batchy 
scheduling: the client doing as much work as it can, then server doing 
as much work as it can - and so on. The longer the client can work the 
more cache-efficient the workload is. Any round-trip to the server due 
to pesky preemption only blows up the cache footprint of the workload 
and gives lower throughput.

This kind of workload would probably run best on DOS or Windows 3.11, 
with no preemptive scheduling done at all. In other words: run both 
mysqld and the client as SCHED_FIFO to get the best ...
From: Nicholas Miell
Date: Wednesday, October 10, 2007 - 6:20 pm

That's nice to know. Note that I'm not actually involved in any of these

According to my IRC logs, Jeffr pasted the URL at Oct 09 22:53:56 PDT.
He says he tried to contact you early in CFS's development, but got no

AFAICT, the configuration is described in
http://people.freebsd.org/~kris/scaling/mysql.html


-- 
Nicholas Miell <nmiell@comcast.net>

-

From: Zhang, Yanmin
Date: Wednesday, October 10, 2007 - 7:34 pm

I used the Fedora Core 8 Test2 distribution, so glibc-2.6.90-13 already fixed
the old malloc scalability issue. CPU is a 2.66 GHz quad core, 2 physical
I tested it in 2.6.22 and all 2.6.23-rc kernels. All 2.6.23-rc kernel has
Commandline to run testing:
#sysbench --test=oltp --mysql-user=root --mysql-db=mysql --max-time=120
Below is the PREEMPT config from my kernel config file.

CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
# CONFIG_NUMA is not set


-yanmin
-

From: Ingo Molnar
Date: Thursday, October 11, 2007 - 6:32 am

thanks for confirming this! I've updated glibc and mysql and now i can 
reproduce something similar. (I have a theory about the reason for this 
regression, and i'm working on a test-patch.)

	Ingo
-

From: Nick Piggin
Date: Thursday, October 11, 2007 - 2:16 am

;) I think you snipped the important bit:

"the peak is terrible but it has virtually no dropoff and performs
better under load than the default 2.6.21 scheduler." (verbatim)

The dropoff under load was due to trivially avoided mmap_sem
contention in the kernel and glibc (and not-very-scalable mysql
heap locking), rather than specifically anything the scheduler
was doing wrong, I think (when the scheduler chose to start
preempting threads holding locks, then performance would tank.
Exactly when that point was reached, and what happens afterwards
was probably just luck.)
-

From: Ingo Molnar
Date: Thursday, October 11, 2007 - 10:46 pm

hm, i understood that peak remark to be in reference to FreeBSD's 
scheduler (which the FreeBSD guys are primarily interested in 
obviously), not v2.6.21 - but i could be wrong.

In any case, there is indeed a regression with sysbench and a low number 
of threads, and it's being fixed. The peak got improved visibly in 
sched-devel:

  http://people.redhat.com/mingo/misc/sysbench-sched-devel.jpg

but there is still some peak regression left, i'm testing a patch for 
that.

	Ingo
-

From: Nick Piggin
Date: Thursday, October 11, 2007 - 7:15 am

I think the Linux peak has always been roughly as good as their
best FreeBSD ones (eg. http://people.freebsd.org/~jeff/sysbench.png).
Obviously in that graph, Linux sucks because of the malloc/mmap_sem
issue. It also shows what he is calling the terrible CFS peak, I
guess.

In my own tests, after that was fixed, Linux's peak got even a bit

OK good. Once that's fixed, we'll hopefully be competitive with
FreeBSD again in this test :)
-

From: Bill Davidsen
Date: Friday, October 12, 2007 - 5:21 am

There's one important bit missing from that graph, the 
2.6.23-SCHED_BATCH values. Without that we can't tell how much 
improvement is from sched-devel and how much from SCHED_BATCH. Clearly 
2.6.23 is better than 2.6.22.any in this test, the locking issues seem 
to dominate that difference to the point that nothing else would be 
informative.

This weekend I have to do some building of kernels for various machines, 
so I intend to run some builds SCHED_BATCH and some will just run. If I 
find anything interesting I'll report.
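
For anyone wanting to repeat that comparison, one way to put a build under SCHED_BATCH is a small wrapper around sched_setscheduler(). What follows is only a minimal sketch, assuming Linux with glibc; run_as_batch is a made-up helper name, not part of any library, and the fallback define is there only in case the header hides the constant:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* glibc exposes SCHED_BATCH only with this set */
#endif
#include <sched.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3      /* Linux policy number, in case the header hides it */
#endif

/* Hypothetical helper: move the calling process to SCHED_BATCH.
 * Returns 0 on success, -1 with errno set on failure. */
int run_as_batch(void)
{
	struct sched_param param;

	/* Non-realtime policies require a static priority of 0. */
	param.sched_priority = 0;

	/* A pid of 0 means "the calling process". */
	return sched_setscheduler(0, SCHED_BATCH, &param);
}
```

Calling run_as_batch() before fork()/exec() of the build command should be enough, since the scheduling policy is inherited by child processes.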

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

-

From: René Rebe
Date: Wednesday, October 10, 2007 - 12:44 am

Hi Linus et al.,

2.6.23 does not build with my usual .config on x86_64 and gcc-4.2.1:

In file included from fs/drop_caches.c:8:
include/linux/mm.h:1210: warning: 'struct super_block' declared inside parameter list
include/linux/mm.h:1210: warning: its scope is only this definition or declaration, which is probably not what you want
fs/drop_caches.c:17: error: conflicting types for 'drop_pagecache_sb'
include/linux/mm.h:1210: error: previous declaration of 'drop_pagecache_sb' was here
fs/drop_caches.c:28: error: conflicting types for 'drop_pagecache_sb'
include/linux/mm.h:1210: error: previous declaration of 'drop_pagecache_sb' was here

A little forward declaration fixes this:

--- linux-2.6.23/include/linux/mm.h.vanilla	2007-10-10 09:28:33.000000000 +0200
+++ linux-2.6.23/include/linux/mm.h	2007-10-10 09:30:23.000000000 +0200
@@ -1207,6 +1207,7 @@
 					void __user *, size_t *, loff_t *);
 unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 			unsigned long lru_pages);
+struct super_block;
 extern void drop_pagecache_sb(struct super_block *);
 void drop_pagecache(void);
 void drop_slab(void);

You will probably end up fixing it some other way, but as I do not know this
file inside out I just wanted to drop a note.
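
For readers unfamiliar with the warning above, here is a minimal standalone sketch of why the forward declaration helps. It is not kernel code: sb_is_null and the one-field struct super_block are made up for illustration. Without the "struct super_block;" line, a C compiler treats the struct named in the prototype as a new type whose scope ends with that prototype, so a later definition of the function conflicts with it.

```c
#include <stddef.h>

/* Forward declaration: introduces the tag at file scope as an incomplete
 * type, so every later mention of "struct super_block" means the same type. */
struct super_block;

/* With the tag already visible, this prototype no longer declares a
 * throwaway struct whose scope is limited to the parameter list. */
int sb_is_null(struct super_block *sb);

/* The full definition may appear later (in the kernel it lives in another
 * header); this one-field version is purely illustrative. */
struct super_block {
	int s_dev;
};

/* The definition matches the prototype because both refer to the same
 * file-scope type. */
int sb_is_null(struct super_block *sb)
{
	return sb == NULL;
}
```

Removing the forward declaration from this sketch should bring back the same "declared inside parameter list" warning and the "conflicting types" error with gcc.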

Yours,
  René Rebe


-- 
  René Rebe - ExactCODE GmbH - Europe, Germany, Berlin
  http://exactcode.de | http://t2-project.org | http://rene.rebe.name
-

From: Alexey Dobriyan
Date: Wednesday, October 10, 2007 - 1:37 am

You have some strange vanilla kernel. 2.6.23 doesn't have this prototype.
-

From: Michael Tokarev
Date: Wednesday, October 10, 2007 - 2:12 am

The same happens here as well.

-rw-rw-r--  1 mjt mjt 45488158 Oct  9 20:48 linux-2.6.23.tar.bz2
2cc2fd4d521dc5d7cfce0d8a9d1b3472  linux-2.6.23.tar.bz2

(timestamp is in UTC) Downloaded yesterday, 3 hours after the announcement,
from http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 .

/mjt
-

From: Alexey Dobriyan
Date: Wednesday, October 10, 2007 - 3:36 am

Strange. Same size, same md5, no super_block in mm.h, though
-

From: Jan Engelhardt
Date: Wednesday, October 10, 2007 - 3:53 am

Does someone still have the broken tarball?

There has not been any drop_pagecache_sb anytime between 2.6.23-rc1
and 2.6.23. drop_pagecache_sb reminds me of reiser4, too.
-

From: Michael Tokarev
Date: Wednesday, October 10, 2007 - 4:13 am

ghhrm.  That's nonsense.  I found where that struct super_block comes
from -- it's from unionfs patches for 2.6.22, which I forgot to
update for 2.6.23 (I just dropped the new kernel tarball into my
build directory together with other patches and ran the usual build
procedure).  It's definitely a false alarm - the tarball is
fine.

/mjt
-

From: Ingo Molnar
Date: Wednesday, October 10, 2007 - 12:14 pm

i know about 4 (low-impact, cornercase) build breakages for 2.6.23-final 
on x86:

- an uncommon embedded config combination: if CONFIG_EMBEDDED=y and
  CONFIG_BLOCK is unset. (a normally useless combination)

- an uncommon V4L config combination: mixed-modular-built-in driver V4L
  config variation. (CONFIG_VIDEO_SAA7146=y and CONFIG_VIDEO_BUF=m)

- an uncommon MTD config combination (normal systems do not need
  CONFIG_MTD configured)

- an uncommon CONFIG_USB_NET_CDC_SUBSET config combination (normal 
  systems should never hit that)

[ furthermore there are a few driver-firmware build options that break 
  and which are not correctly made dependent on !PREVENT_FIRMWARE_BUILD. 
  Again, this is not something one would normally configure. ]

your superblock build failure would be a new and so far unknown build 
breakage variant - please send the .config you used, and double-check 
that it's indeed a vanilla 2.6.23 tree.

	Ingo
-

From: Michael Tokarev
Date: Wednesday, October 10, 2007 - 12:26 pm

It's not a vanilla 2.6.23.  In vanilla 2.6.23 the lines it complains
about do not exist (struct super_block isn't mentioned in mm.h at all).
It's some external patch that used to work with 2.6.22 but needs to be
updated for 2.6.23 - in my case it was unionfs.

/mjt
-

From: Andi Kleen
Date: Wednesday, October 10, 2007 - 1:04 pm

It is not -- my 2.6.23 tree doesn't have the prototype that broke
the build for him.

-Andi
-

From: Krzysztof Halasa
Date: Wednesday, October 10, 2007 - 4:27 pm

Uncommon but far from useless - may be pure initramfs-based.
-- 
Krzysztof Halasa
-
