It's been two weeks, so it's time to close the merge window. A 2.6.28-rc1 is out there, and it's hopefully all good.

The changes in -rc1 are (as usual) too many to really enumerate, with the bulk of them being - again as usual - drivers. In fact, that's doubly true now that we merged the drivers from the staging tree. The dirstat output makes that very obvious:

   3.3% arch/arm/
  14.2% arch/
   3.1% crypto/
   4.0% drivers/media/
   3.7% drivers/net/wireless/
  10.3% drivers/net/
   6.5% drivers/staging/me4000/
   8.5% drivers/staging/slicoss/
   4.8% drivers/staging/wlan-ng/
  29.7% drivers/staging/
  63.6% drivers/
   3.3% include/
   4.6% net/
   4.6% sound/

but some other statistics may be fun:

 - 7141 non-merge commits (and 419 merges)
 - average non-merge commit: 39 lines removed, 104 lines added (not counting renames)
 - About 880 individual authors
    - 340 of which had just one commit
    - while 183 authors had ten or more commits
 - Most screwed-up clock award goes to:

	Greg Kroah-Hartman

   for his commit 51b90540, which claims to be from April 9 - six years ago! And it's a fix to a driver that was merged this July!

   Way to go, Greg!

Anyway, have fun, please test it, and report any interesting anomalies you find.

		Linus
--
> - Most screwed-up clock award goes to:
>
> Greg Kroah-Hartman
>
> for his commit 51b90540, which claims to be from April 9 - six years
> ago! And it's a fix to a driver that was merged this July!

Maybe that commit really is from a time before printk supported the "%zd" format for size_t :)

 - R.
--
Heh :( Sorry about this, stupid "default date for my patch" script messed up. I suck at bash scripting at times... greg k-h --
It seems if you have a broken asm/ symlink in include/ (which happened as a result of the x86 header moves, for me) the kernel won't try to update it appropriately, and this breaks "make prepare".

$ make ARCH=x86_64 prepare
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  GEN     include/asm/asm-offsets.h
/bin/sh: include/asm/asm-offsets.h: No such file or directory
make[1]: *** [include/asm/asm-offsets.h] Error 1
make: *** [prepare0] Error 2

rm -f include/asm fixes it.

This was just from taking a 2.6.27 tree, git clean -d -f, git pull, make oldconfig. Might be a nice thing to fix?

-- 
Cheers,
Alistair.
--
This should reproduce it (whether or not it's a use-case we care about is another matter). First, make sure your include/asm symlink has been removed, then execute the following sequence:

git reset --hard v2.6.27 ; git clean -d -f
git status ("Nothing to commit")
cp /path/to/config .config
make oldconfig prepare
git clean -d -f ; git reset --hard
git status ("Nothing to commit")

Observe at this point that include/asm is valid and points to include/asm-x86, despite the clean and reset (I guess this file is being ignored).

Now:

git reset --hard v2.6.28-rc1 (Or whatever other method you might choose)
git clean -d -f (Removes include/asm-x86)

Observe at this point that include/asm is now invalid, and still points to the removed include/asm-x86 directory.

cp /path/to/config .config
make oldconfig prepare

Should fail at this point:

scripts/kconfig/conf -o arch/x86/Kconfig
#
# configuration written to .config
#
scripts/kconfig/conf -s arch/x86/Kconfig
  CHK     include/linux/version.h
  UPD     include/linux/version.h
  CHK     include/linux/utsrelease.h
  UPD     include/linux/utsrelease.h
  CC      kernel/bounds.s
  GEN     include/linux/bounds.h
  CC      arch/x86/kernel/asm-offsets.s
  GEN     include/asm/asm-offsets.h
/bin/sh: include/asm/asm-offsets.h: No such file or directory
make[2]: *** [include/asm/asm-offsets.h] Error 1
make[1]: *** [prepare0] Error 2
make: *** [prepare] Error 2

Can you confirm?

I checked out Makefile and I believe it occurs because the current checks only make sure a symlink exists, and if it does exist that its target matches up with the selected architecture. It doesn't actually check the destination of the symlink is valid. I'd suggest that it should do that too, and if the destination doesn't exist, re-write the symlink when it does "mkdir include/asm-x86" further down, but I'm not a kbuild expert.

-- 
Cheers,
Alistair.
--
Use this script for super-clean project-agnostic clean:

	$ cat ~/bin/git-mrproper
	#!/bin/sh
	git-ls-files -o --directory -z | xargs -0 rm -rf

I'd say nothing should be done here. Automatically changing the include/asm symlink for a different ARCH was unsupported because it is a "big" event, and the header move is an equally "big" event.
--
JFYI, that should be the same as:

	git clean -xdf

The -x makes it wipe out ignored files as well.

Björn
--
The problem is ignored files.

Yes, git claims everything is clean, but that's because it has been told to ignore certain files, and because it has been told to ignore them, it will not remove them (without the -x flag) in "git clean", nor will it mention them in "git status".

And yes, one of the ignored file patterns is

	include/asm-*/asm-offsets.h

which means that your "git clean -df" didn't *really* clean everything from the old include/asm-x86, and because it didn't clean it all it also wouldn't be able to remove the old stale directory - since it wasn't empty.

You can use "git clean -dfx" to force git to remove ignored files too. And "make distclean" should have done it too.

Now, _another_ part (and arguably the really core reason) of this problem is that our Makefile rules for the asm include directory are weak and unreliable in the presence of already-existing unexpected entries.

And it has caused problems before. For example, if you somehow made the symlink not be a symlink at all (by using "cp -LR" for example), or a symlink pointing to another architecture (changing architecture builds in the same tree without doing a "make clean" in between), you historically got really odd results.

In fact, it's broken in subtle ways before, to the point where we now have a special "check-symlink" target internally that checks that the symlink is correctly set up. Of course, it didn't check that you had some old stuff in include/asm-x86, it only checks for the _traditional_ problems we've had. Not some new odd one.

		Linus
--
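The ignored-file behaviour described above can be reproduced outside the kernel tree. The following is only a sketch in a throwaway repository: the directory layout and the junk.h file are made up for the demonstration, and the ignore pattern is placed in .git/info/exclude (rather than a .gitignore) purely so that the clean itself cannot delete it. The exact handling of directories that contain ignored files has varied between git versions, so treat the printed result as something to check rather than a guarantee.

```shell
#!/bin/sh
# Sketch: compare "git clean -d -f" with "git clean -d -f -x" when a
# directory holds an ignored build product. All paths are hypothetical.

demo_git_clean() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q . || exit 1
        mkdir -p .git/info include/asm-x86
        # Same pattern as the one quoted in the message above.
        echo 'include/asm-*/asm-offsets.h' >> .git/info/exclude
        : > include/asm-x86/asm-offsets.h   # ignored file
        : > include/asm-x86/junk.h          # ordinary untracked file

        git clean -q -d -f
        if [ -e include/asm-x86/asm-offsets.h ]; then
            echo "after -df: ignored file kept"
        else
            echo "after -df: ignored file removed"
        fi

        git clean -q -d -f -x
        if [ -e include/asm-x86 ]; then
            echo "after -dfx: directory still present"
        else
            echo "after -dfx: directory removed"
        fi
    )
    rm -rf "$repo"
}

demo_git_clean
```

If the ignored file is kept after the first clean, the old directory cannot go away either, which is the situation that leaves include/asm pointing at leftovers.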
From: Linus Torvalds <torvalds@linux-foundation.org>

I guess we could use separate "stamp" files to deal with this. Along with the generated file "foo" there is a "foo.stamp" file that is generated with "touch" after "foo" is built. Then "foo"'s update rule is whether the "foo.stamp" is out of date wrt. its dependencies.
--
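To make the stamp-file idea concrete, here is a minimal shell sketch of it, not a kbuild rule. The function names needs_rebuild and build_foo, and the use of "cat" as the build step, are placeholders invented for the illustration. The freshness check uses "find -newer" so that it does not depend on the non-POSIX "-nt" test operator.

```shell
#!/bin/sh
# Sketch of the stamp-file idea: "foo" is rebuilt only when "foo.stamp"
# is missing or older than one of its dependencies.

# needs_rebuild STAMP DEP...  -> exit 0 if a rebuild is required
needs_rebuild() {
    stamp=$1; shift
    [ -e "$stamp" ] || return 0              # never built before
    for dep in "$@"; do
        # find prints the dependency only if it is newer than the stamp
        if [ -n "$(find "$dep" -newer "$stamp")" ]; then
            return 0
        fi
    done
    return 1
}

# build_foo OUTPUT DEP...  -> regenerate OUTPUT from its dependencies
build_foo() {
    out=$1; shift
    if needs_rebuild "$out.stamp" "$@"; then
        cat "$@" > "$out"                    # stand-in for the real build step
        touch "$out.stamp"                   # record when the build finished
        echo "rebuilt $out"
    else
        echo "$out is up to date"
    fi
}
```

Calling build_foo twice in a row with an unchanged dependency should rebuild only the first time. Because the decision depends on the stamp and not on the generated file itself, an odd state of "foo" (for example a stale symlink) does not confuse the check.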
I remember I made an attempt at doing so a long time ago for the asm symlink. But why it failed for me I dunno.

We used this trick in many archs before, but as part of the header move to arch/$ARCH we have killed almost all uses of symlinks to reach certain files.

The asm symlink is only used by asm-offsets.h for most archs these days, and when I get around to it I will fix that too so we can kill it entirely. But first we need to move all archs' headers to arch/$ARCH. And we are getting there.

	Sam
--
I just checked, and make mrproper / make distclean deletes the symlink.

We do not cover the "asm symlink became a dir" problem. But when all archs have moved headers it is implicitly covered anyway.

	Sam
--
The following patch adds another special case where we delete stale symlinks. In my limited testing it fixes the issue - can you try to give it a spin?

	Sam

diff --git a/Makefile b/Makefile
index f6703f1..9dc7427 100644
--- a/Makefile
+++ b/Makefile
@@ -961,6 +961,7 @@ export CPPFLAGS_vmlinux.lds += -P -C -U$(ARCH)
 # The asm symlink changes when $(ARCH) changes.
 # Detect this and ask user to run make mrproper
+# If asm is a stale symlink (point to dir that does not exist) remove it
 define check-symlink
 	set -e; \
 	if [ -L include/asm ]; then \
@@ -970,6 +971,7 @@ define check-symlink
 			echo " set ARCH or save .config and run 'make mrproper' to fix it"; \
 			exit 1; \
 		fi; \
+		test -e $$asmlink || rm include/asm; \
 	fi
 endef
--
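For readers unfamiliar with the test the patch relies on, here is a standalone shell sketch of stale-symlink detection. It is not the Makefile code itself, and the helper name remove_if_stale is made up. The point is that "-L" is true for any symlink, while "-e" follows the link and is therefore false when the target is gone.

```shell
#!/bin/sh
# Sketch: remove a symlink only when its target no longer exists.

remove_if_stale() {
    link=$1
    # -L: path is a symlink; ! -e: following it leads nowhere
    if [ -L "$link" ] && [ ! -e "$link" ]; then
        rm -f "$link"
        echo "removed stale symlink $link"
    fi
}

# Small demonstration in a scratch directory (names are illustrative).
scratch=$(mktemp -d)
mkdir "$scratch/asm-x86"
ln -s asm-x86 "$scratch/asm"
remove_if_stale "$scratch/asm"    # target exists, link is left alone
rmdir "$scratch/asm-x86"
remove_if_stale "$scratch/asm"    # target is gone, link is removed
rm -rf "$scratch"
```

A link whose target exists but belongs to the wrong architecture is a different case, which the existing check-symlink logic already reports.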
This fails building on allnoconfig on at least x86-64 because forbid_dac used by arch/x86/kernel/pci-dma.c is defined off in drivers/pci/quirks.c, which isn't built if CONFIG_PCI isn't set. -- Mathematics is the supreme nostalgia of our time. --
(Also fails on x86-32) Bisection points to: 5b6985ce8ec7127b4d60ad450b64ca8b82748a3b intel-iommu: IA64 support -- Mathematics is the supreme nostalgia of our time. --
Yes, the fix patch was posted yesterday, and it has already been merged into the linux-next tree. In case you need it, I post it here again.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
arch/ia64/include/asm/iommu.h | 1 -
arch/ia64/kernel/pci-dma.c | 7 -------
arch/x86/include/asm/iommu.h | 1 -
arch/x86/kernel/pci-dma.c | 16 ++++++++++++++++
drivers/pci/pci.h | 0
drivers/pci/quirks.c | 14 --------------
6 files changed, 16 insertions(+), 23 deletions(-)
diff --git a/arch/ia64/include/asm/iommu.h b/arch/ia64/include/asm/iommu.h
index 5fb2bb9..0490794 100644
--- a/arch/ia64/include/asm/iommu.h
+++ b/arch/ia64/include/asm/iommu.h
@@ -11,6 +11,5 @@ extern int force_iommu, no_iommu;
extern int iommu_detected;
extern void iommu_dma_init(void);
extern void machvec_init(const char *name);
-extern int forbid_dac;
#endif
diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
index 10a75b5..031abbf 100644
--- a/arch/ia64/kernel/pci-dma.c
+++ b/arch/ia64/kernel/pci-dma.c
@@ -89,13 +89,6 @@ int iommu_dma_supported(struct device *dev, u64 mask)
{
struct dma_mapping_ops *ops = get_dma_ops(dev);
-#ifdef CONFIG_PCI
- if (mask > 0xffffffff && forbid_dac > 0) {
- dev_info(dev, "Disallowing DAC for device\n");
- return 0;
- }
-#endif
-
if (ops->dma_supported_op)
return ops->dma_supported_op(dev, mask);
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 1972266..1926248 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -9,6 +9,8 @@
#include <asm/calgary.h>
#include <asm/amd_iommu.h>
+static int forbid_dac __read_mostly;
+
struct dma_mapping_ops *dma_ops;
EXPORT_SYMBOL(dma_ops);
@@ -291,3 +293,17 @@ void pci_iommu_shutdown(void)
}
/* Must execute after PCI subsystem */
fs_initcall(pci_iommu_init);
+
+#ifdef CONFIG_PCI
+/* Many VIA bridges seem to corrupt data for DAC. Disable it here */
+
+static __devinit void ...

Hi,
I have a couple of back traces on my parisc. Config file follows.
Let me know if you need a bisection.
thanks,
Domenico
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
Backtrace:
[<00feca70>] do_ipt_get_ctl+0x358/0x498 [ip_tables]
[<1031fbcc>] nf_sockopt+0x1cc/0x204
[<1031fc24>] nf_getsockopt+0x20/0x2c
[<1032cadc>] ip_getsockopt+0xc0/0x100
[<102f9890>] sock_common_getsockopt+0x28/0x34
[<102f73c0>] sys_getsockopt+0x7c/0x104
[<101190c0>] syscall_exit+0x0/0x14
Kernel Fault: Code=15 regs=ee7c0400 (Addr=01307008)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001101111111100001111 Tainted: G W
r00-03 0006ff0f 00fec000 00feca70 ee6a6180
r04-07 00000001 00021140 01306000 1044c820
r08-11 ef1a2dc0 000008b0 00000001 ed00d080
r12-15 01306000 fb61abc0 0002078c fb61b450
r16-19 000201c8 00000054 000201c8 00000000
r20-23 00000000 ed786000 00000000 ee6a61b4
r24-27 00000000 01306000 ee6a6180 1041a020
r28-31 00000000 00000000 ee7c0400 01307000
sr00-03 00000000 00000000 00000000 000006b1
sr04-07 00000000 00000000 00000000 00000000
IASQ: 00000000 00000000 IAOQ: 00fec640 00fec644
IIR: 0ffc1290 ISR: 00000000 IOR: 01307008
CPU: 1 CR30: ee7c0000 CR31: 11111111
ORIG_R28: 00000001
IAOQ[0]: get_counters+0x54/0x12c [ip_tables]
IAOQ[1]: get_counters+0x58/0x12c [ip_tables]
RP(r2): do_ipt_get_ctl+0x358/0x498 [ip_tables]
Backtrace:
[<00feca70>] do_ipt_get_ctl+0x358/0x498 [ip_tables]
[<1031fbcc>] nf_sockopt+0x1cc/0x204
[<1031fc24>] nf_getsockopt+0x20/0x2c
[<1032cadc>] ip_getsockopt+0xc0/0x100
[<102f9890>] sock_common_getsockopt+0x28/0x34
[<102f73c0>] sys_getsockopt+0x7c/0x104
[<101190c0>] syscall_exit+0x0/0x14
Kernel panic - not syncing: Kernel Fault
Rebooting in 60 seconds..
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.28-rc1
# Fri Oct 24 14:14:51 2008
#
CONFIG_PARISC=y
CONFIG_MMU=y
CONFIG_STACK_GROWSUP=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# ...

I'm afraid it fails to boot here entirely. Gets as far as:

hpet0: at MMIO 0xfed00000, IRQs 2, 8, 31
hpet0: 3 32-bit timers, 25000000 Hz

And it sits there, instead of showing me it switched into high-resolution mode as expected:

Switched to high resolution mode on CPU 0
Switched to high resolution mode on CPU 4
Switched to high resolution mode on CPU 2
Switched to high resolution mode on CPU 3
Switched to high resolution mode on CPU 1
Switched to high resolution mode on CPU 5
Switched to high resolution mode on CPU 7
Switched to high resolution mode on CPU 6

I'll go and bisect now if it doesn't ring any bells. It's a dual quad-core Opteron 2354 on a Tyan n6650W (S2915-E), BIOS 2.07

I've attached .config; personally I'm suspecting commit 1f6d6e8ebe73ba9d9d4c693f7f6f50f661dbd6e4 as I booted a tree from Wednesday without issue. No accusation as I can't back that up yet. Any patches that you want me to try and revert first?

Regards,
Tony V.
On Fri, 24 Oct 2008 23:53:28 +0100

I suspect these are totally innocent; the reason I think this is that select/poll only get used once you hit userspace... and you're hanging way before that.

-- 
Arjan van de Ven
Intel Open Source Technology Centre
For development, discussion and tips for power savings, visit http://www.lesswatts.org
--
Entirely correct. It seems commit 4403b406d4369a275d483ece6ddee0088cc0d592 by Linus just fixed it for me. My boot hang is gone. Linux prometheus 2.6.28-rc1-00005-g23cf24c #1 SMP Sun Oct 26 13:12:55 GMT 2008 x86_64 Quad-Core AMD Opteron(tm) Processor 2354 AuthenticAMD GNU/Linux Regards, Tony V.
I first encountered this problem in SLES 11 Beta 2 but now I see it affects 2.6.28-rc1 too. On some ppc64 machines, NVRAM is being corrupted very early in boot (before console is initialised). The machine reboots and then fails to find yaboot, printing the error "PReP-BOOT: Unable to load PRep image". It's nowhere near as serious as the ftrace+e1000 problem as the machine is not bricked, but it's fairly scary looking: the machine cannot boot and the fix is non-obvious.

To "fix" the machine:

1. Go to OpenFirmware prompt
2. type dev nvram
3. type wipe-nvram

The machine will reboot, reconstruct the NVRAM using some magic, and yaboot works again, allowing an older kernel to be used. I bisected the problem down to this commit.

From 91a00302959545a9ae423e99732b1e46eb19e877 Mon Sep 17 00:00:00 2001
From: Paul Mackerras <paulus@samba.org>
Date: Wed, 8 Oct 2008 14:03:29 +0000
Subject: [PATCH] powerpc: Sync RPA note in zImage with kernel's RPA note

Commit 9b09c6d909dfd8de96b99b9b9c808b94b0a71614 ("powerpc: Change the default link address for pSeries zImage kernels") changed the real-base value in the CHRP note added by the addnote program from 12MB to 32MB to give more space for Open Firmware to load the zImage. (The real-base value says where we want OF to position itself in memory.)

However, this change was ineffective on most pSeries machines, because the RPA note added by addnote has the "ignore me" flag set to 1. This was intended to tell OF to ignore just the RPA note, but has the side effect of also making OF ignore the CHRP note (at least on most pSeries machines).

To solve this we have to set the "ignore me" flag to 0 in the RPA note. (We can't just omit the RPA note because that is equivalent to having an RPA note with default values, and the default values are not what we want.) However, then we have to make sure the values in the zImage's RPA note match up with the values that the kernel supplies later in prom_init.c with either the ...
Eek! Which ppc64 machines has this been seen on, and how were they being booted (netboot, yaboot, etc.)? Is it just the Powerstations with their SLOF-based firmware, or is it IBM pSeries machines as well? Paul. --
I'm pretty sure it was with pSeries machines. I saw reports of POWER5 being affected (p520 and p710). I believe one of them resolved the issue by upgrading firmware on the machine.

josh
--
This is true of a p720 (CHRP IBM,9124-720) that I was testing on. With upgraded firmware, the problem is gone. -- David Kleikamp IBM Linux Technology Center --
Yaboot in my case and I've heard it affected a DVD installation. I don't know for sure if it affects netboot but as I think it's something the

To be honest, I haven't been brave enough to try this on a Powerstation yet as I only have the one and I don't know if it's a) affected or b) fixable with the same workaround.

It was an IBM pSeries that was affected in my case and a few people have hit the problem on pSeries AFAIK. It's been pointed out that it can be "fixed" by upgrading the firmware but surely we can avoid breaking the machine in the first place?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
What changed in that commit was the contents of a couple of structures that the firmware looks at to see what the kernel wants from firmware. Specifically the change was to say that the kernel (or really the zImage wrapper) would like the firmware to be based at the 32MB point (which is what AIX uses) rather than 12MB (which was the default on older machines).

So, as I understand it, it's not anything the kernel is actively doing, it's how the firmware is reacting to what the kernel says it wants. And since we are requesting the same value as AIX (as far as I know) I'm really surprised it caused problems.

We can revert that commit, but I still need to solve the problem that the distros are facing, namely that their installer kernel + initramfs images are now bigger than 12MB and can't be loaded if the firmware is based at 12MB. That's why I really want to understand the problem in

Have you upgraded the firmware on the machine you saw this problem on? If not, would you be willing to run some tests for me?

Paul.
--
Same here, it sounds like an innocent change. While it is possible that AIX

Of course.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
As per an off-line suggestion, I was able to get past the NVRAM problem
using the following patch. The machine still fails to fully boot but it's
due to some modules problem and unrelated to this issue.
From 7e54016ce29eb80026d7ff9a8310cf9c3a7e17a9 Mon Sep 17 00:00:00 2001
From: Mel Gorman <mel@csn.ul.ie>
Date: Fri, 31 Oct 2008 17:12:46 +0000
Subject: [PATCH] Partial revert of 91a00302, set new_mem_def back to 0
On the suggestion of Paul Mackerras, I tried the following patch. It partially
reverts a change made by commit 91a00302 by setting new_mem_def back to 0.
Once applied, IBM pSeries with old firmware do not corrupt their NVRAM early
in boot.
I do not know why this change fixes the problem. A structure like this is
also in arch/powerpc/boot/addnote.c but it's not clear if it needs to be
similarly changed or not. Paul?
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
arch/powerpc/kernel/prom_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 23e0db2..d6c8128 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -719,7 +719,7 @@ static struct fake_elf {
.max_pft_size = 46, /* 2^46 bytes max PFT size */
.splpar = 1,
.min_load = ~0U,
- .new_mem_def = 1
+ .new_mem_def = 0
}
}
};
--
I do need to know whether it was the vmlinux or the zImage.pseries that you were loading with yaboot. That commit you identified affects the contents of an ELF note in the zImage.pseries that firmware looks at, as well as a structure in the kernel itself that gets passed as an argument to a call to firmware. If you were loading a vmlinux with yaboot when you saw the corruption occur then that narrows things down a bit. Paul. --
It's the vmlinux file I am seeing problems with. -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab --
Unless I missed something, I think it's narrowed already. When loaded from yaboot, there is no relevant difference between zImage and vmlinux here. I.e. yaboot parses the ELF header of the zImage itself and ignores the special notes anyway, so only the CAS firmware call is relevant in both cases, no?

Cheers,
Ben.
--
Good point. However, it would be the parse-elf-header firmware call, rather than the CAS firmware call, since 91a00302 modified the fake_elf structure (to make it consistent with the CAS structure) but not the CAS structure. Paul. --
