So I said -rc6 would likely be the last -rc, and nothing happened to change my mind. I'd always be happier if it had been an even quieter week, but the appended Shortlog of changes since rc6 doesn't contain anything earthshaking, and I don't think we'd have been any better off by another rc, and waiting one more week.

So 2.6.35 is out, go check it out.

This may have been a fairly odd release cycle with my rather strict -rc rules before -rc3, but on the whole I think I liked it, and it seems to have worked out ok. I relaxed my extreme stance after getting back from vacation, so the latter half of the rc series was more normal. But even then I got the feeling that people were perhaps a bit more aware of the whole "regression fixes only" model, which is all good.

It's a bit hard to judge, but there are some numbers to back it up: in the 2.6.34 release, there were 3800 commits after -rc1, but in the current 35 release cycle we had less than 2000. Now, admittedly 34 was worse than average in that respect (3800 commits is a _lot_ of work after -rc1), but git history says that at least going back to 2.6.24, we've never had less than 2000 commits after -rc1 before now. They tend to be in the 2700-3200 commit range.

So I do think we really did have a lot less churn than usual post-merge-window. And that's good. So I'd like to try to repeat the experiment for the next release cycle, and be pretty hardnosed about taking patches and git pull requests after the merge window closes.

Talking about the next merge window: Andrew Morton was pretty unhappy with the stability of linux-next at least a couple of weeks ago. It's what he bases his -mm trees on, and so an unstable linux-next makes it hard for Andrew to get his work done. It also makes me worried, because a lot of people seem to think that "it's been in linux-next for several months" means that something can and should be merged. And if linux-next ends up being really flaky, that clearly cannot be the case.

So guys - please ...
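
The post-rc1 commit counts quoted in the announcement can be checked against a kernel clone. Below is a minimal sketch, assuming a local checkout that carries the usual `vX.Y.Z` and `vX.Y.Z-rc1` release tags. The helper names (`rc_range`, `rev_list_command`, `count_commits`) are made up for illustration, and the announcement does not say whether merge commits were included in its figures, so the numbers printed here need not match exactly.

```python
import subprocess
import sys


def rc_range(release):
    """Revision range covering everything after -rc1 up to the final tag."""
    return f"v{release}-rc1..v{release}"


def rev_list_command(repo, release):
    """Build the git command line; kept separate so it can be inspected."""
    return ["git", "-C", repo, "rev-list", "--count", rc_range(release)]


def count_commits(repo, release):
    """Run git and return the number of commits in the post-rc1 range."""
    result = subprocess.run(
        rev_list_command(repo, release),
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Usage: python count_rc.py /path/to/linux 2.6.34 2.6.35
    repo_path = sys.argv[1]
    for rel in sys.argv[2:]:
        print(rel, count_commits(repo_path, rel))
```

Adding `--no-merges` to the `git rev-list` arguments would restrict the count to non-merge commits, which is one plausible reason a local count could differ from the figures in the mail.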
Hi all, To that end, Nick, can you please submit that tree for inclusion in linux-next in case there are some interactions with some of the other stuff there? (or send it all to Al, instead (or both), I guess.) -- Cheers, Stephen Rothwell sfr@canb.auug.org.au http://www.canb.auug.org.au/~sfr/
Hi Nick, OK, great. -- Cheers, Stephen Rothwell sfr@canb.auug.org.au http://www.canb.auug.org.au/~sfr/
There hasn't been nearly enough review or testing of this patch series yet. Before a merge, it needs to be split up in smaller, more digestible chunks for more comprehensive review, regression testing and behavioural analysis. There's probably only a handful of people who have done any testing on the patchset so far, and given the widespread changes it needs a lot more testing than this before we should consider merging any of it. I really want to see this move forward too, but it changes lots of critical infrastructure in subtle ways and so, IMO, this is not a patchset we should be gung-ho about. Cheers, Dave. -- Dave Chinner david@fromorbit.com --
I dunno. We merge _way_ scarier things in the VM and the block layer,
for much less actual upside, and with less review.
The RCU pathname lookup has some rather impressive performance
upsides, and I agree that it would be good to get a lot of review and
testing, but the latter isn't going to happen without it being
mainlined, and the former is sadly lacking. The person I'd like most
to review it is Al, but anybody in the filesystem world should
basically see it as a #1 priority, because unlike all the masturbatory
patches like xstat() that add new functionality that nobody will
likely ever use, Nick's patchseries improves on the thing that
everybody uses heavily every day without even thinking about it.
Is it tough to review? Yes. It's core code, not just some random
addition that adds a new feature and doesn't impact any old code. But
that's also the thing that makes it meaningful, and makes me think it
should get merged _much_ more eagerly than most code we ever see.
Linus
--
Scary stuff outside of direct VFS/FS interfaces is generally hidden from me by my +6 Blinkers of Blissful Ignorance. I make the assumption that the experts involved know the risks and have weighed

Agreed - I've actually looked at every patch, commented on some of the more questionable things, got quoted by LWN for saying that it "fell off the locking cliff", have run benchmarks on it and sent patches fixing bugs back to Nick. It's just really hard to digest it all in one lump and core VFS

I agree with you for the pure locking changes. But for the bits that change writeback, LRU ordering and reclaim calculations the benefits are not quite so obvious, nor is the correctness of the code/behaviour quite so provably correct. Maybe I'm being a bit too paranoid, but generally it pays to be a bit conservative as a filesystem developer because the cost of screwing up can be pretty high...

Cheers, Dave. -- Dave Chinner david@fromorbit.com --
BTW. it has in fact had quite a bit of testing in earlier form in the -rt tree for a long time, and several fixes come from there. And good

I hate to say but I would like to see it mature for another release. It should also clash a bit with Al's recent inode work that he'll want to push.

What I can do is send some of the ground work patches this time around, put the tree into linux-next, and put reviewers on notice. I think it is all conceptually sound, but it will inevitably have some bugs left to shake out, and things to be fixed on the review side. I don't anticipate a problem that could not be fixed in the release cycle, but I think aiming for post 2.6.36 is a bit fairer for vfs guys, honestly. LSF is next week too, so most of them will be busy with travel

For filesystems developers, the dcache and inode locking changes should be more or less just following simple steps as shown in the patch series. If they're not abusing dcache_lock (and most except autofs4 are not), then it should not be a big deal. There are a couple of locking constraints changed at the API level, but I didn't run into any problems there yet. It should be all documented in Documentation/filesystems/* although I need to run a

Writeback shouldn't be changed.

LRU ordering is changed for 2 reasons. Firstly, to make things per-zone instead of global. This basically fits our whole reclaim model much better, although it will inevitably cause some random little changes but I think it is agreed this is a good thing (memory shortage in one zone or node does not require global shrinkings, NUMA level parallelism of reclaim.)

The other thing is converting the last few dcache refcounting, and all of inode refcounting over to this "lazy LRU" model. This can have a bigger impact, but it really reduces locking on the per-zone lists, so it definitely helps speed and scalability of non-reclaim fastpaths. I'm up for changing this if numbers show it hurts, it would be rather easy to do, but in comparison to ...
What I'm most concerned about is merging everything in one go. It's a huge series and I'd rather see it start going in in batches over multiple kernel releases. Things like the fs_struct spinlock and some other preparatory patches should be very easy to do for 2.6.36. Scaling the files and vfsmount locks should also be easily doable, but we need to sort out the struct file growth in the latter. We really can't grow struct file by two pointers as that would have devastating effects on various workloads. What follows after that is the dcache_lock scaling which to me seems the most immature bit of the series, and the one that showed by far the most problems in -RT. I'm very much dead set against merging that in .36. I'd much rather see the inode_lock scaling or the lockless path walk going in before, but I haven't checked how complicated the reordering would be. The lockless path walk also is only rather theoretically useful until we do ACL checks lockless as we're having ACLs enabled pretty much everywhere at least in the distros. The per-zone shrinkers are another thing that's not directly related, I think they need a lot more discussion with the VM folks, and integrating with Dave's work in that area. --
Per-zone shrinkers don't have much impact on the VM design, except for the zone reclaim feature. So if FS folks think it's ok, I'm not against this at all. BTW, I haven't reviewed this patch series in detail yet, so I might post some bug fixes later. --
From a quick look it seems like the inode_lock splitup can easily be moved forward, and it would help us with doing some work on the writeback side. The problem is that it would need rebasing on top of both the vfs and writeback (aka block) trees. --
inode_lock splitup is much simpler than dcache_lock, yes. And I have to rebase it on the work currently queued for 2.6.35 anyway, so that's no problem. I can easily put it in front of dcache_lock patches in the series (as I said, I've kept everything independent and well split up). I do want opinions on how to do the big-picture merge, though, before I start moving things around. And obviously reviewing each of the parts is more important at this point than the exact way to order the thing. But even the inode_lock patches I am wary of merging in 2.6.36 without having much review or any linux-next / vfs-tree exposure. --
One problem is that to win much benefit, several different aspects must be scaled. If not, then you end up with more locks *and* still have bouncing global cachelines. And filesystems will go through multiple releases where locking changes are in flux. This is what I'm concerned about.

I definitely have tried to keep everything as conceptually separate small chunks. But there is a real big-picture aspect that is required to review it. For example, you asked for just the locking split-up without any of the per-hash-locks and per-cpu locks etc. That's fine for review, but you cannot merge it because then you end up with N bouncing global locks instead of 1. It also tends to be much uglier than a final outcome because I have not applied any transformations to improve lock

Strictly, it is a filesystem corruption bug-fix for the tty layer and nothing to do with tty scaling patches. I don't have the patience at the moment to sort through tty layer crap, but whoever is maintaining that should. I could possibly come back and look at it at some point, but given your half-working patch I would much prefer not to re-order it before either of inode or dcache scaling patches. It would introduce a lot of churn and locking is significantly changed.

It probably should be possible, although we would still get path walk contention on dcache_lock, vfsmount_lock, and requires inode-RCU (making inodes more expensive without being offset by any benefits of inode scaling), and requires changes to filesystem dcache and inode APIs. I could work on re-ordering it certainly, but only if it is decided that we definitely don't want dcache-scale or inode-scale patch sets in the foreseeable future. I think we definitely do want them, so I

True, it needs a last bit of work for permission checking. The conceptual idea and the bulk of the code I think is ready to review

Well I'm a VM folk :) Conceptually, there are no problems for MM here. This is really the right way to drive reclaim from the ...
We started some testing of the VFS series on larger systems and so far it looks all very good and the performance improvements are impressive (but of course new bottlenecks are being exposed then).

The only snag found so far was that an ACL enabled file system disables all the nice path walk improvements, so right now you need to remount with noacl. I'm hoping this can be fixed before a mainline release, otherwise I suspect it would disable the improvements for lots of people (common distributions default to acl on).

-Andi -- ak@linux.intel.com -- Speaking for myself only. --
OK, vfs-scale-working branch now has commits to enable rcu-walk aware d_revalidate, permission, and check_acl in the filesystems, and implements a basic rcu-walk aware scheme for generic/posix acls and implements that for tmpfs, ext2, btrfs. It just drops out of rcu-walk if there is an ACL on a directory, or if it is not cached. I think that's enough to be production ready now. Pushing rcu-walk awareness down into acl checking code would not be hard. I was under the impression that ACLs on directories are not that common, so maybe this is as far as we need to go for now anyway. It does need more commenting of the new methods and explaining how they can and can't be used by filesystems. The tree is also getting messy with incremental changes -- I'm avoiding rebasing it so people following it can see response to reviews and issues that arise. Obviously it will all get cleaned up and rebased properly onto a new branch before anything is merged. --
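
The fallback rule described above (stay in rcu-walk only when the directory's ACL state is cached and no ACL is set) can be written out as a small decision function. This is a toy model of the rule as stated in the mail, not kernel code: the names `AclState` and `stays_in_rcu_walk` are hypothetical, and how the real implementation signals "drop out of rcu-walk" is not shown in this thread.

```python
from dataclasses import dataclass


@dataclass
class AclState:
    """Hypothetical view of a directory's ACL as seen during path walk."""
    cached: bool   # is the ACL information already in memory?
    has_acl: bool  # is an ACL actually set? (only meaningful if cached)


def stays_in_rcu_walk(acl):
    """Apply the rule from the mail to one directory.

    Returns True when the lockless walk may continue, False when the
    walk has to drop out of rcu-walk for this lookup.
    """
    if not acl.cached:
        # ACL not cached: the mail says to drop out of rcu-walk.
        return False
    if acl.has_acl:
        # An ACL is present on the directory: also drop out.
        return False
    # Cached and known to have no ACL: the fast path can continue.
    return True


if __name__ == "__main__":
    for state in (AclState(True, False), AclState(True, True), AclState(False, False)):
        print(state, "->", "rcu-walk" if stays_in_rcu_walk(state) else "drop out")
```

Under this rule only directories that actually carry an ACL, or whose ACL state is not yet cached, lose the fast path, which is why the frequency of ACLs on directories matters for how far the scheme needs to go.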
Yes it shouldn't be common normally. I think the common case for distros is just a few ACLs in /dev. Of course you never know for specific end user workloads. -Andi -- ak@linux.intel.com -- Speaking for myself only. --
They are quite common on fileserver data areas, at least on the places where I worked at. -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh --
Hi,

(please cc me as I'm not subscribed)

updating from make 3.81 to 3.82 gets me this:

[thomas@tmb linux-2.6.35]$ cp arch/powerpc/configs/ppc64_defconfig .config
[thomas@tmb linux-2.6.35]$ LC_ALL=C make oldconfig ARCH=powerpc
/mnt/work/2.6.35/linux-2.6.35/arch/powerpc/Makefile:183: *** mixed implicit and normal rules. Stop.

The lines are:

182:
183: $(BOOT_TARGETS): vmlinux
184: 	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
185:

BOOT_TARGETS are defined on line 166 as:

BOOT_TARGETS = zImage zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%

Now it's not a regression in the kernel as the same happens with the 2.6.34 tree too.
(btw, the host I'm syncing the defconfig with is a x86_64 machine)

Now, I don't know if this is "intended breakage" by the make update, or if the Makefile needs to be updated....

Any ideas how to fix ?

--
Thomas
--
This is in the category "intended breakage".
We had a similar issue in the top-level Makefile which Paul (IIRC)
helped me to fix a long time ago.
To fix powerpc I suggest something along these lines.
[Note: I did not test it - please do so.]
Sam
[PATCH] powerpc: fix build with make 3.82
Thomas Backlund reported that the powerpc build broke with make 3.82.
It failed with the following message:
arch/powerpc/Makefile:183: *** mixed implicit and normal rules. Stop.
The fix is to avoid mixing non-wildcard and wildcard targets.
Reported-by: Thomas Backlund <tmb@mandriva.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
---
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 77cfe7a..ad88b21 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -163,9 +163,11 @@ drivers-$(CONFIG_OPROFILE) += arch/powerpc/oprofile/
# Default to zImage, override when needed
all: zImage
-BOOT_TARGETS = zImage zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
+# With make 3.82 we cannot mix normal and wildcard targets
+BOOT_TARGETS1 := zImage zImage.initrd uImage
+BOOT_TARGETS2 := zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
-PHONY += $(BOOT_TARGETS)
+PHONY += $(BOOT_TARGETS1) $(BOOT_TARGETS2)
boot := arch/$(ARCH)/boot
@@ -180,7 +182,9 @@ relocs_check: arch/powerpc/relocs_check.pl vmlinux
zImage: relocs_check
endif
-$(BOOT_TARGETS): vmlinux
+$(BOOT_TARGETS1): vmlinux
+ $(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
+$(BOOT_TARGETS2): vmlinux
$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
bootwrapper_install %.dtb:
--
Obviously - dunno how I missed that.
Updated patch below.
I will do a proper submission after you confirm that powerpc build is working with make 3.82.
Sam
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 77cfe7a..ace7a3e 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -163,9 +163,11 @@ drivers-$(CONFIG_OPROFILE) += arch/powerpc/oprofile/
# Default to zImage, override when needed
all: zImage
-BOOT_TARGETS = zImage zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
+# With make 3.82 we cannot mix normal and wildcard targets
+BOOT_TARGETS1 := zImage zImage.initrd uImage
+BOOT_TARGETS2 := zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
-PHONY += $(BOOT_TARGETS)
+PHONY += $(BOOT_TARGETS1) $(BOOT_TARGETS2)
boot := arch/$(ARCH)/boot
@@ -180,10 +182,16 @@ relocs_check: arch/powerpc/relocs_check.pl vmlinux
zImage: relocs_check
endif
-$(BOOT_TARGETS): vmlinux
+$(BOOT_TARGETS1): vmlinux
+	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
+$(BOOT_TARGETS2): vmlinux
+	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
+
+
+bootwrapper_install
	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
-bootwrapper_install %.dtb:
+%.dtb:
	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
define archhelp
--
Missing colon. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." --
Yeah, that was an obvious fix, thanks! One small typo fix below... (a missing ':') Otherwise it works here, so: -- Thomas --
Thanks. I have sent a proper patch to Ben/Paul (powerpc maintainers). Sam --
Thanks, this seems to fix the first issue, but then I get the same error on the following line 190:

190: bootwrapper_install %.dtb:
191: 	$(Q)$(MAKE) ARCH=ppc64 $(build)=$(boot) $(patsubst %,$(boot)/%,$@)

--
Thomas
--
The change is intentional. Note, though, that this syntax was always dodgy, even in previous versions of GNU make. If you wrote it exactly as you did, where all the explicit targets come first and all the implicit targets come second, then it seems to have been interpreted correctly. However, if you did it any other way (for example, put some explicit targets after the first implicit target) then make would silently throw away all the targets starting with the first implicit target. Since the syntax used here wasn't ever described in the documentation, rather than reworking it as a new feature I decided to follow the docs and disallow it, and be verbose about the error. -- ------------------------------------------------------------------------------- Paul D. Smith <psmith@gnu.org> Find some GNU make tips at: http://www.gnu.org http://make.mad-scientist.net "Please remain calm...I may be mad, but I am a professional." --Mad Scientist --
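
The rule described above can be checked mechanically before upgrading make. Here is a minimal sketch, assuming a target list that has already been expanded to plain words. It treats any word containing `%` as an implicit (pattern) target, which is a simplification that ignores escaping and unexpanded variables. The function names are made up for illustration and are not part of GNU make or the kernel build system.

```python
def split_targets(target_list):
    """Split a whitespace-separated target list into (explicit, pattern)."""
    explicit, pattern = [], []
    for target in target_list.split():
        if "%" in target:
            pattern.append(target)
        else:
            explicit.append(target)
    return explicit, pattern


def is_mixed(target_list):
    """True if one rule would carry both explicit and pattern targets,
    the case reported as 'mixed implicit and normal rules'."""
    explicit, pattern = split_targets(target_list)
    return bool(explicit) and bool(pattern)


def explicit_all_first(target_list):
    """True if no explicit target follows a pattern target, the only
    ordering the mail says earlier versions seemed to handle."""
    seen_pattern = False
    for target in target_list.split():
        if "%" in target:
            seen_pattern = True
        elif seen_pattern:
            return False
    return True


if __name__ == "__main__":
    boot_targets = ("zImage zImage.initrd uImage zImage% dtbImage% "
                    "treeImage.% cuImage.% simpleImage.%")
    explicit, pattern = split_targets(boot_targets)
    print("mixed:", is_mixed(boot_targets))
    print("BOOT_TARGETS1 :=", " ".join(explicit))
    print("BOOT_TARGETS2 :=", " ".join(pattern))
```

Splitting a mixed list into the two groups and giving each group its own rule with the same recipe is the approach taken in the powerpc patch earlier in this thread.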
