Linux 2.6.35

Previous thread: xiangyanh5 by steedley_wmpd on Sunday, August 1, 2010 - 4:49 pm. (1 message)

Next thread: [ANNOUNCE] mtdev v1.0.8 -- Multitouch Protocol Translation Library by Henrik Rydberg on Sunday, August 1, 2010 - 5:04 pm. (1 message)
From: Linus Torvalds
Subject: Linux 2.6.35
Date: Sunday, August 1, 2010 - 4:52 pm

So I said -rc6 would likely be the last -rc, and nothing happened to
change my mind. I'd always be happier if it had been an even quieter
week, but the appended Shortlog of changes since rc6 doesn't contain
anything earthshaking, and I don't think we'd have been any better off
by another rc, and waiting one more week. So 2.6.35 is out, go check
it out.

This may have been a fairly odd release cycle with my rather strict
-rc rules before -rc3, but on the whole I think I liked it, and it
seems to have worked out ok. I relaxed my extreme stance after getting
back from vacation, so the latter half of the rc series was more
normal. But even then I got the feeling that people were perhaps a bit
more aware of the whole "regression fixes only" model, which is all
good. It's a bit hard to judge, but there are some numbers to back it
up: in the 2.6.34 release, there were 3800 commits after -rc1, but in
the current 35 release cycle we had less than 2000.

Now, admittedly 34 was worse than average in that respect (3800
commits is a _lot_ of work after -rc1), but git history says that at
least going back to 2.6.24, we've never had less than 2000 commits
after -rc1 before now. They tend to be in the 2700-3200 commit range.
So I do think we really did have a lot less churn than usual
post-merge-window. And that's good.
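
The post-rc1 commit counts quoted above can be reproduced from git history. A minimal sketch in Python, assuming a local kernel clone whose tags follow the usual `v2.6.35-rc1` / `v2.6.35` naming. The helper names are hypothetical, and the message does not say whether the quoted figures include merge commits, so that is left as a flag:

```python
import subprocess


def post_rc1_range(release: str) -> str:
    """Return the revision range from the -rc1 tag up to the release tag."""
    return f"v{release}-rc1..v{release}"


def count_post_rc1_commits(repo: str, release: str, no_merges: bool = False) -> int:
    """Count commits reachable from the release tag but not from its -rc1 tag.

    `repo` is the path to a local clone (an assumption of this sketch).
    """
    cmd = ["git", "-C", repo, "rev-list", "--count"]
    if no_merges:
        # Leave merge commits out of the count.
        cmd.append("--no-merges")
    cmd.append(post_rc1_range(release))
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return int(result.stdout.strip())
```

Calling `count_post_rc1_commits("/path/to/linux", "2.6.35")` and then the same for `"2.6.34"` gives the two numbers to compare; the path is a placeholder.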

So I'd like to try to repeat the experiment for the next release
cycle, and be pretty hardnosed about taking patches and git pull
requests after the merge window closes.

Talking about the next merge window: Andrew Morton was pretty unhappy
with the stability of linux-next at least a couple of weeks ago. It's
what he bases his -mm trees on, and so an unstable linux-next makes it
hard for Andrew to get his work done. It also makes me worried,
because a lot of people seem to think that "it's been in linux-next
for several months" means that something can and should be merged. And
if linux-next ends up being really flaky, that clearly cannot be the
case.

So guys - please ...
From: Stephen Rothwell
Date: Sunday, August 1, 2010 - 5:32 pm

Hi all,


To that end, Nick, can you please submit that tree for inclusion in
linux-next in case there are some interactions with some of the other
stuff there?  (or send it all to Al, instead (or both), I guess.)

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
From: Nick Piggin
Date: Monday, August 2, 2010 - 1:14 am

Hi Stephen,

I will work something out with Al and try to have something in
linux-next ASAP.

--

From: Stephen Rothwell
Date: Monday, August 2, 2010 - 1:52 am

Hi Nick,


OK, great.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
From: Dave Chinner
Date: Sunday, August 1, 2010 - 7:33 pm

There hasn't been nearly enough review or testing of this patch
series yet.  Before a merge, it needs to be split up in smaller,
more digestible chunks for more comprehensive review, regression
testing and behavioural analysis.

There's probably only a handful of people who have done any testing
on the patchset so far, and given the widespread changes it needs a
lot more testing than this before we should consider merging any of
it.

I really want to see this move forward too, but it changes lots of
critical infrastructure in subtle ways and so, IMO, this is not
a patchset we should be gung-ho about.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
--

From: Linus Torvalds
Date: Sunday, August 1, 2010 - 7:50 pm

I dunno. We merge _way_ scarier things in the VM and the block layer,
for much less actual upside, and with less review.

The RCU pathname lookup has some rather impressive performance
upsides, and I agree that it would be good to get a lot of review and
testing, but the latter isn't going to happen without it being
mainlined, and the former is sadly lacking. The person I'd like most
to review it is Al, but anybody in the filesystem world should
basically see it as a #1 priority, because unlike all the masturbatory
patches like xstat() that add new functionality that nobody will
likely ever use, Nick's patchseries improves on the thing that
everybody uses heavily every day without even thinking about it.

Is it tough to review? Yes. It's core code, not just some random
addition that adds a new feature and doesn't impact any old code. But
that's also the thing that makes it meaningful, and makes me think it
should get merged _much_ more eagerly than most code we ever see.

                           Linus
--

From: Dave Chinner
Date: Sunday, August 1, 2010 - 10:58 pm

Scary stuff outside of direct VFS/FS interfaces is generally hidden
from me by my +6 Blinkers of Blissful Ignorance. I make the
assumption that the experts involved know the risks and have weighed


Agreed - I've actually looked at every patch, commented on some
of the more questionable things, got quoted by LWN for saying that
it "fell off the locking cliff", have run benchmarks on it and sent
patches fixing bugs back to Nick.

It's just really hard to digest it all in one lump and core VFS

I agree with you for the pure locking changes.

But for the bits that change writeback, LRU ordering and reclaim
calculations the benefits are not quite so obvious, nor is the
correctness of the code/behaviour quite so provably correct.  Maybe
I'm being a bit too paranoid, but generally it pays to be a bit
conservative as a filesystem developer because the cost of screwing
up can be pretty high...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
--

From: Nick Piggin
Date: Monday, August 2, 2010 - 12:55 am

BTW. it has in fact had quite a bit of testing in earlier form in the
-rt tree for a long time, and several fixes come from there. And good

I hate to say but I would like to see it mature for another release. It
should also clash a bit with Al's recent inode work that he'll want to
push.

What I can do is send some of the ground work patches this time around,
put the tree into linux-next, and put reviewers on notice.

I think it is all conceptually sound, but it will inevitably have some
bugs left to shake out, and things to be fixed on the review side. I
don't anticipate a problem that could not be fixed in the release cycle,
but I think aiming for post 2.6.36 is a bit fairer for vfs guys,
honestly. LSF is next week too, so most of them will be busy with travel

For filesystems developers, the dcache and inode locking changes
should be more or less just following simple steps as shown in the
patch series. If they're not abusing dcache_lock (and most except
autofs4 are not), then it should not be a big deal.

There are a couple of locking constraints changed at the API level,
but I didn't run into any problems there yet. It should be all
documented in Documentation/filesystems/* although I need to run a

Writeback shouldn't be changed. LRU ordering is changed for 2
reasons. Firstly, to make things per-zone instead of global. This
basically fits our whole reclaim model much better, although it
will inevitably cause some random little changes but I think it
is agreed this is a good thing (memory shortage in one zone or
node does not require global shrinkings, NUMA level parallelism
of reclaim.)

The other thing is converting the last few dcache refcounting, and
all of inode refcounting over to this "lazy LRU" model. This can
have a bigger impact, but it really reduces locking on the per-zone
lists, so it definitely helps speed and scalability of non-reclaim
fastpaths. I'm up for changing this if numbers show it hurts, it
would be rather easy to do, but in comparison to ...
From: Christoph Hellwig
Date: Monday, August 2, 2010 - 1:24 am

What I'm most concerned about is merging everything in one go.  It's a huge
series and I'd rather see it start going in in batches over multiple
kernel releases.

Things like the fs_struct spinlock and some other preparatory patches
should be very easy to do for 2.6.36.  Scaling the files and vfsmount
locks should also be easily doable, but we need to sort out the struct
file growth in the latter.  We really can't grow struct file by two
pointers as that would have devastating effects on various workloads.

What follows after that is the dcache_lock scaling which to me seems the
most immature bit of the series, and the one that showed by far the
most problems in -RT.  I'm very much dead set against merging that in
.36.  I'd much rather see the inode_lock scaling or the lockless path
walk going in before, but I haven't checked how complicated the
reordering would be.  The lockless path walk also is only rather
theoretically useful until we do ACL checks lockless as we're having
ACLs enabled pretty much everywhere at least in the distros.

The per-zone shrinkers are another thing that's not directly related,
I think they need a lot more discussion with the VM folks, and
integrating with Dave's work in that area.



--

From: KOSAKI Motohiro
Date: Monday, August 2, 2010 - 1:46 am

per-zone shrinkers don't cause so much impact to VM design except zone
reclaim feature. So, if FS folks think it's ok, I'm not against this at all.

btw, however, I haven't reviewed this patch series in detail yet, so
I might post some bug fixes later.



--

From: Christoph Hellwig
Date: Monday, August 2, 2010 - 2:05 am

From a quick look it seems like the inode_lock splitup can easily
be moved forward, and it would help us with doing some work on the
writeback side.  The problem is that it would need rebasing on top
of both the vfs and writeback (aka block) trees.

--

From: Nick Piggin
Date: Monday, August 2, 2010 - 3:07 am

inode_lock splitup is much simpler than dcache_lock, yes.

And I have to rebase it on the work currently queued for 2.6.35
anyway, so that's no problem. I can easily put it in front of
dcache_lock patches in the series (as I said, I've kept everything
independent and well split up).

I do want opinions on how to do the big-picture merge, though,
before I start moving things around. And obviously reviewing
each of the parts is more important at this point than exact
way to order the thing.

But even the inode_lock patches I am wary of merging in 2.6.36
without having much review or any linux-next / vfs-tree exposure.

--

From: Nick Piggin
Date: Monday, August 2, 2010 - 2:51 am

One problem is that to win much benefit, several different aspects
must be scaled. If not, then you end up with more locks *and* still
have bouncing global cachelines. And filesystems will go through
multiple releases where locking changes are in flux. This is what
I'm concerned about.

I definitely have tried to keep everything as conceptually separate
small chunks. But there is a real big-picture aspect that is required
to review it.

For example, you asked for just the locking split-up without any of
the per-hash-locks and per-cpu locks etc. That's fine for review, but
you cannot merge it because then you end up with N bouncing global
locks instead of 1. It also tends to be much uglier than a final
outcome because I have not applied any transformations to improve lock

Strictly, it is a filesystem corruption bug-fix for the tty layer
and nothing to do with tty scaling patches.

I don't have the patience at the moment to sort through tty layer
crap, but whoever is maintaining that should. I could possibly come
back and look at it some point, but given your half-working patch


I would much prefer not to re-order it before either of inode or
dcache scaling patches. It would introduce a lot of churn and
locking is significantly changed.

It probably should be possible, although we would still get path
walk contention on dcache_lock, vfsmount_lock, and requires inode-RCU
(making inodes more expensive without being offset by any benefits of
inode scaling), and requires changes to filesystem dcache and inode
APIs.

I could work on re-ordering it certainly, but only if it is decided
that we definitely don't want dcache-scale or inode-scale patch sets
in the foreseeable future. I think we definitely do want them, so I

True, it needs a last bit of work for permission checking. The 
conceptual idea and the bulk of the code I think is ready to review

Well I'm a VM folk :) Conceptually, there are no problems for MM
here. This is really the right way to drive reclaim from the ...
From: Andi Kleen
Date: Tuesday, August 3, 2010 - 1:18 am

We started some testing of the VFS series on larger systems and so
far it looks all very good and the performance improvements are impressive
(but of course new bottlenecks are being exposed then)

The only snag found so far was that an ACL enabled file system
disables all the nice path walk improvements, so right now you
need to remount with noacl. I'm hoping this can be fixed
before a mainline release, otherwise I suspect it would disable
the improvements for lots of people (common distributions default
to acl on)

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: Nick Piggin
Date: Tuesday, August 3, 2010 - 2:28 am

OK, vfs-scale-working branch now has commits to enable rcu-walk aware
d_revalidate, permission, and check_acl in the filesystems, and
implements a basic rcu-walk aware scheme for generic/posix acls and
implements that for tmpfs, ext2, btrfs. It just drops out of rcu-walk
if there is an ACL on a directory, or if it is not cached. I think
that's enough to be production ready now. Pushing rcu-walk awareness
down into acl checking code would not be hard.

I was under the impression that ACLs on directories are not that common,
so maybe this is as far as we need to go for now anyway.
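
The fallback rule described above (stay in rcu-walk unless a directory has an ACL or its ACL state is not cached) can be modelled in a few lines. This is only an illustrative Python sketch, not the kernel code: the `Dir` type, its fields and the "fallback" label are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Dir:
    """Toy stand-in for one directory on a lookup path (hypothetical model)."""
    name: str
    acl_cached: bool           # is the ACL state known without blocking?
    acl: Optional[List[str]]   # None means "no ACL"; only valid if acl_cached


def can_stay_in_rcu_walk(d: Dir) -> bool:
    """Apply the rule from the message: drop out if uncached or an ACL exists."""
    if not d.acl_cached:
        return False
    if d.acl is not None:
        return False
    return True


def walk(path: List[Dir]) -> Tuple[str, int]:
    """Return the mode the walk ends in and how many components were
    handled in rcu-walk before any fallback."""
    for index, d in enumerate(path):
        if not can_stay_in_rcu_walk(d):
            return ("fallback", index)
    return ("rcu-walk", len(path))
```

In this model a single directory carrying an ACL anywhere on the path costs the fast path for the rest of that lookup, which is why how common directory ACLs are matters to the discussion.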

It does need more commenting of the new methods and explaining how they
can and can't be used by filesystems. The tree is also getting messy
with incremental changes -- I'm avoiding rebasing it so people following
it can see response to reviews and issues that arise. Obviously it will
all get cleaned up and rebased properly onto a new branch before
anything is merged.

--

From: Andi Kleen
Date: Tuesday, August 3, 2010 - 2:49 am

Yes it shouldn't be common normally. I think the common case for distros
is just a few ACLs in /dev. Of course you never know for 
specific end user workloads.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: Henrique de Moraes Holschuh
Date: Tuesday, August 3, 2010 - 8:05 am

They are quite common on fileserver data areas, at least at the places where
I have worked.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
--

From: Thomas Backlund
Date: Monday, August 2, 2010 - 1:51 am

Hi,
(please cc me as I'm not subscribed)

updating from make 3.81 to 3.82 gets me this:

[thomas@tmb linux-2.6.35]$ cp arch/powerpc/configs/ppc64_defconfig .config
[thomas@tmb linux-2.6.35]$ LC_ALL=C make oldconfig ARCH=powerpc
/mnt/work/2.6.35/linux-2.6.35/arch/powerpc/Makefile:183: *** mixed implicit and normal rules.  Stop.

The lines are:

182:
183: $(BOOT_TARGETS): vmlinux
184:         $(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
185:

BOOT_TARGETS are defined on line 166 as:
BOOT_TARGETS = zImage zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%


Now it's not a regression in the kernel, as the same happens with the
2.6.34 tree too.

(btw, the host I'm syncing the defconfig with is a x86_64 machine)


Now, I don't know if this is "intended breakage" by the make update, or
if the Makefile needs to be updated....

Any ideas how to fix it?

--
Thomas

--

From: Sam Ravnborg
Date: Monday, August 2, 2010 - 11:28 am

This is in the category "intended breakage".
We had a similar issue in the top-level Makefile which Paul (IIRC)
helped me to fix a long time ago.

To fix powerpc I suggest something along these lines.
[Note: I did not test it - please do so.]

	Sam

[PATCH] powerpc: fix build with make 3.82

Thomas Backlund reported that the powerpc build broke with make 3.82.
It failed with the following message:

    arch/powerpc/Makefile:183: *** mixed implicit and normal rules.  Stop.

The fix is to avoid mixing non-wildcard and wildcard targets.

Reported-by: Thomas Backlund <tmb@mandriva.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
---
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 77cfe7a..ad88b21 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -163,9 +163,11 @@ drivers-$(CONFIG_OPROFILE)	+= arch/powerpc/oprofile/
 # Default to zImage, override when needed
 all: zImage
 
-BOOT_TARGETS = zImage zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
+# With make 3.82 we cannot mix normal and wildcard targets
+BOOT_TARGETS1 := zImage zImage.initrd uImage
+BOOT_TARGETS2 := zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
 
-PHONY += $(BOOT_TARGETS)
+PHONY += $(BOOT_TARGETS1) $(BOOT_TARGETS2)
 
 boot := arch/$(ARCH)/boot
 
@@ -180,7 +182,9 @@ relocs_check: arch/powerpc/relocs_check.pl vmlinux
 zImage: relocs_check
 endif
 
-$(BOOT_TARGETS): vmlinux
+$(BOOT_TARGETS1): vmlinux
+	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
+$(BOOT_TARGETS2): vmlinux
 	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
 
 bootwrapper_install %.dtb:
--

From: Sam Ravnborg
Date: Monday, August 2, 2010 - 1:51 pm

Obviously - dunno how I missed that.
Updated patch below.

I will do a proper submission after you
confirm that powerpc build is working with make 3.82.

	Sam

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 77cfe7a..ace7a3e 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -163,9 +163,11 @@ drivers-$(CONFIG_OPROFILE)	+= arch/powerpc/oprofile/
 # Default to zImage, override when needed
 all: zImage
 
-BOOT_TARGETS = zImage zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
+# With make 3.82 we cannot mix normal and wildcard targets
+BOOT_TARGETS1 := zImage zImage.initrd uImage
+BOOT_TARGETS2 := zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
 
-PHONY += $(BOOT_TARGETS)
+PHONY += $(BOOT_TARGETS1) $(BOOT_TARGETS2)
 
 boot := arch/$(ARCH)/boot
 
@@ -180,10 +182,16 @@ relocs_check: arch/powerpc/relocs_check.pl vmlinux
 zImage: relocs_check
 endif
 
-$(BOOT_TARGETS): vmlinux
+$(BOOT_TARGETS1): vmlinux
+	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
+$(BOOT_TARGETS2): vmlinux
+	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
+
+
+bootwrapper_install
 	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
 
-bootwrapper_install %.dtb:
+%.dtb:
 	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
 
 define archhelp
--

From: Andreas Schwab
Date: Monday, August 2, 2010 - 2:02 pm

Missing colon.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
--

From: Thomas Backlund
Date: Monday, August 2, 2010 - 2:03 pm

Yeah, that was an obvious fix, thanks!

One small typo fix below...
(a missing ':')

Otherwise it works here, so:



--
Thomas
--

From: Sam Ravnborg
Date: Monday, August 2, 2010 - 11:48 pm

Thanks.
I have sent a proper patch to Ben/Paul (powerpc maintainers).

	Sam
--

From: Thomas Backlund
Date: Monday, August 2, 2010 - 1:46 pm

Thanks, this seems to fix the first issue, but then I get the same error on the following line 190:

190: bootwrapper_install %.dtb:
191:        $(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)


--
Thomas
--

From: Paul Smith
Date: Saturday, August 7, 2010 - 10:56 am

The change is intentional.  Note, though, that this syntax was always
dodgy, even in previous versions of GNU make.

If you wrote it exactly as you did, where all the explicit targets come
first and all the implicit targets come second, then it seems to have
been interpreted correctly.

However, if you did it any other way (for example, put some explicit
targets after the first implicit target) then make would silently throw
away all the targets starting with the first implicit target.

Since the syntax used here wasn't ever described in the documentation,
rather than reworking it as a new feature I decided to follow the docs
and disallow it, and be verbose about the error.
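
The distinction make 3.82 enforces can be illustrated with a small model. The following Python sketch is not make's implementation: it assumes a target counts as a pattern (implicit) target whenever it contains `%`, ignores escaping, and uses hypothetical function names. It shows why the original `BOOT_TARGETS` list trips the check and why splitting it into two lists, as the patch does, avoids the error.

```python
from typing import List, Tuple


def is_pattern(target: str) -> bool:
    """Simplified test: treat any target containing '%' as a pattern target."""
    return "%" in target


def split_targets(targets: List[str]) -> Tuple[List[str], List[str]]:
    """Partition targets into (explicit, pattern), preserving order."""
    explicit = [t for t in targets if not is_pattern(t)]
    pattern = [t for t in targets if is_pattern(t)]
    return explicit, pattern


def check_rule(targets: List[str]) -> None:
    """Reject a rule whose target list mixes explicit and pattern targets."""
    explicit, pattern = split_targets(targets)
    if explicit and pattern:
        raise ValueError("mixed implicit and normal rules")


boot_targets = ("zImage zImage.initrd uImage zImage% dtbImage% "
                "treeImage.% cuImage.% simpleImage.%").split()
explicit_targets, pattern_targets = split_targets(boot_targets)
```

Here `check_rule(boot_targets)` raises, while `check_rule(explicit_targets)` and `check_rule(pattern_targets)` both pass, mirroring the `BOOT_TARGETS1` / `BOOT_TARGETS2` split.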

-- 
-------------------------------------------------------------------------------
 Paul D. Smith <psmith@gnu.org>          Find some GNU make tips at:
 http://www.gnu.org                      http://make.mad-scientist.net
 "Please remain calm...I may be mad, but I am a professional." --Mad Scientist

--
