ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23/2.6.23-mm1/

- I've been largely avoiding applying anything since rc8-mm2 in an attempt to
  stabilise things for the 2.6.23 merge. But that didn't stop all the subsystem
  maintainers from going nuts, with the usual accuracy.

  We're up to a 37MB diff now, but it seems to be working a bit better.

Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the mm-commits
  mailing list.

  echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug. Instructions for this process are at

  http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the email
  Subject: in some manner to reflect the nature of the bug. Some developers
  filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.

Changes since 2.6.23-rc8-mm2:

 git-acpi.patch
 git-alsa.patch
 git-arm.patch
 git-audit-master.patch
 git-avr32.patch
 git-cifs.patch
 git-cpufreq.patch
 git-powerpc.pa...
Hello,
I'm seeing a reproducible oops on 2.6.23-mm1 when trying to run tcpdump
on the ppp0 interface. To reproduce it I simply type:
# tcpdump -i ppp0
and wait a few seconds. I captured two oopses with slightly different stack
traces, but EIP always points to packet_rcv():
(gdb) l* 0xc02d7d49
0xc02d7d49 is in packet_rcv (include/linux/netdevice.h:830).
825 static inline int dev_parse_header(const struct sk_buff *skb,
826 unsigned char *haddr)
827 {
828 const struct net_device *dev = skb->dev;
829
830 if (!dev->header_ops->parse)
831 return 0;
832 return dev->header_ops->parse(skb, haddr);
833 }
834
Please find pictures attached (sorry for the poor quality; I can provide better
ones tomorrow if needed):
http://tuxland.pl/misc/2.6.23-mm1/DSC00136.JPG
http://tuxland.pl/misc/2.6.23-mm1/DSC00142.JPG
Regards,
Mariusz
Can you please test the latest Linus kernel from ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/? All the networking changes that were in 2.6.23-mm1 are now in mainline, so if mainline is OK then the bug has presumably been fixed. Thanks. -
You're right. 2.6.23-git17 runs fine so the bug must have been fixed. Regards, Mariusz -
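For readers following the oops: the gdb listing shows dev_parse_header() dereferencing dev->header_ops without checking it first. The sketch below is a user-space model only, and it assumes (the thread does not confirm this) that the ppp device simply had no header_ops. The structures are stripped-down stand-ins rather than the kernel's definitions, and dev_parse_header_checked() and fake_eth_parse() are hypothetical names.

```c
#include <stddef.h>

/* Stripped-down stand-ins for the kernel structures (assumption: only the
 * fields needed for this illustration are modelled). */
struct sk_buff;

struct header_ops {
	int (*parse)(const struct sk_buff *skb, unsigned char *haddr);
};

struct net_device {
	const struct header_ops *header_ops;	/* may be NULL in this model */
};

struct sk_buff {
	const struct net_device *dev;
};

/* Hypothetical guarded variant: return 0 when the device has no
 * header_ops or no parse callback, instead of dereferencing NULL. */
static int dev_parse_header_checked(const struct sk_buff *skb,
				    unsigned char *haddr)
{
	const struct net_device *dev = skb->dev;

	if (!dev->header_ops || !dev->header_ops->parse)
		return 0;
	return dev->header_ops->parse(skb, haddr);
}

/* Example callback that fills in a 6-byte hardware address. */
static int fake_eth_parse(const struct sk_buff *skb, unsigned char *haddr)
{
	(void)skb;
	for (int i = 0; i < 6; i++)
		haddr[i] = (unsigned char)i;
	return 6;
}
```

With a device whose header_ops is NULL the guarded helper returns 0; with fake_eth_parse installed it returns the address length reported by the callback.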
Ok, now that it boots let's go for more. I cannot suspend if mysqld is running. mysql isn't actually doing anything useful anyway. This is the failed suspend tasks dump of mysql:
[ 0.000000] Linux version 2.6.23-mm1-1 (mattia@tadamune) (gcc version 4.2.1 (Debian 4.2.1-3)) #5 SMP PREEMPT Sun Oct 21 13:50:54 JST 2007
...
[ 271.736214] PM: Preparing system for mem sleep
[ 271.738185] Freezing user space processes ...
[ 291.918090] Freezing of tasks failed after 20.19 seconds (1 tasks refusing to freeze):
[ 291.918156] task PC stack pid father
...
[ 292.043105] =======================
[ 292.043175] mysqld_safe D c03d40c0 0 2393 1
[ 292.043343] c26b3eac 00000082 c03d0eb0 c03d40c0 c011a850 c011a843 c2626aa0 c2626bd4
[ 292.043803] c17fd0c0 00000000 c26b3e88 c26cc380 c26b3ea8 c011b83a c26b3ea0 00000000
[ 292.044322] 08104d08 00000000 00000000 08104d08 00000000 c26b3eb8 c0141de0 c26b3fb8
[ 292.044843] Call Trace:
[ 292.044969] [<c0141de0>] refrigerator+0xcf/0xdb
[ 292.045091] [<c012b4d2>] get_signal_to_deliver+0x33/0x414
[ 292.045214] [<c01034e8>] do_notify_resume+0x81/0x61e
[ 292.045335] [<c0103f06>] work_notifysig+0x13/0x19
[ 292.045456] =======================
[ 292.045524] mysqld D c03d40c0 0 2430 2393
[ 292.045692] c25d0eac 00000086 c03d0eb0 c03d40c0 c0119eb5 00000000 c1c98550 c1c98684
[ 292.046184] c18060c0 00000001 c25d0e88 c2603000 c25d0ea8 c011b83a c25d0ea0 00000000
[ 292.046705] 00000000 00000000 00000000 00000000 00000000 c25d0eb8 c0141de0 c25d0fb8
[ 292.047272] Call Trace:
[ 292.049112] [<c0141de0>] refrigerator+0xcf/0xdb
[ 292.049234] [<c012b4d2>] get_signal_to_deliver+0x33/0x414
[ 292.049357] [<c01034e8>] do_notify_resume+0x81/0x61e
[ 292.049477] [<c0103f06>] work_notifysig+0x13/0x19
[ 292.049598] =======================
[ 292.049666] mysqld D c03d40c0 0 2433 2393
[ 292.049834] ...
I believe this is known and rafael already has a fix somewhere. The "guilty" patch already hit mainline, not sure about the "fix" patch. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
The fix has not been merged yet, but
freezer-use-wait-queue-instead-of-busy-looping.patch has been dropped for
another reason.
The mysqld problem seems to have been caused by another patch, though, and the
fix is appended.
Greetings,
Rafael
---
From: Rafael J. Wysocki <rjw@sisk.pl>
Do not allow processes to clear their TIF_SIGPENDING if TIF_FREEZE is set,
so that they will not race with the freezer (like mysqld, for example).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
kernel/signal.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6.23-mm1/kernel/signal.c
===================================================================
--- linux-2.6.23-mm1.orig/kernel/signal.c
+++ linux-2.6.23-mm1/kernel/signal.c
@@ -124,7 +124,7 @@ void recalc_sigpending_and_wake(struct t
void recalc_sigpending(void)
{
- if (!recalc_sigpending_tsk(current))
+ if (!recalc_sigpending_tsk(current) && !freezing(current))
clear_thread_flag(TIF_SIGPENDING);
}
-great, that was the guilty patch in fact. -- -
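The effect of Rafael's one-line change can be modelled in user space. This is a toy sketch, not kernel code: it assumes that the freezer relies on TIF_SIGPENDING staying set so the task reaches the refrigerator through the signal-delivery path (which is what the call traces above suggest), and all names prefixed with model_ are hypothetical.

```c
#include <stdbool.h>

/* Toy model of the per-thread state involved (not the kernel's layout). */
struct task {
	bool sigpending;	/* stands in for TIF_SIGPENDING */
	bool freeze;		/* stands in for TIF_FREEZE / freezing(current) */
	bool has_signal;	/* a real signal is queued for the task */
};

/* Plays the role of recalc_sigpending_tsk(): set the flag and return
 * true when a signal is queued, otherwise leave the flag untouched. */
static bool model_recalc_sigpending_tsk(struct task *t)
{
	if (t->has_signal) {
		t->sigpending = true;
		return true;
	}
	return false;
}

/* Behaviour before the patch: the flag is cleared whenever no signal
 * is queued, even if a freeze request is outstanding. */
static void model_recalc_old(struct task *t)
{
	if (!model_recalc_sigpending_tsk(t))
		t->sigpending = false;
}

/* Behaviour after the patch: the flag survives while the task is
 * being frozen. */
static void model_recalc_new(struct task *t)
{
	if (!model_recalc_sigpending_tsk(t) && !t->freeze)
		t->sigpending = false;
}
```

In this model a task with freeze set and no queued signal loses its pending flag under model_recalc_old() but keeps it under model_recalc_new(), which is the race the patch description refers to.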
Hi Andrew,
The kernel build fails on the power box
  INSTALL vdso64.so
  INSTALL vdso32.so
  BOOTCC arch/powerpc/boot/inflate.o
arch/powerpc/boot/inflate.c:920:19: error: errno.h: No such file or directory
arch/powerpc/boot/inflate.c:921:18: error: slab.h: No such file or directory
arch/powerpc/boot/inflate.c:922:21: error: vmalloc.h: No such file or directory
arch/powerpc/boot/inflate.c: In function
This problem is fixed by d4faaecbcc6d9ea4f7c05f6de6af98e2336a4afb in Linus' tree. Paul. -
Hi Paul, Thanks, we tried it out over the 2.6.23-mm1 and the patch fixes the build failure. -- Thanks & Regards, Kamalesh Babulal, -
Hello !
While polling the contents of a cgroup task file, I caught the
following corruption. Is there a known race (and a fix) or should
I start digging ?
the program running in the cgroup is fork/exec intensive:
while (1) {
int i, s;
for (i = 0; i < count; i++)
if (fork() == 0)
execlp("/bin/true", "true", 0);
for (i = 0; i < count; i++)
wait(&s);
}
Thanks for any insights,
C.
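The fragment above can be turned into a compilable helper roughly as follows. This is a sketch under assumptions: the original loops forever and does not show how count is set, so the version here runs a single bounded round, and fork_exec_round() is a hypothetical name. It also terminates the execlp() argument list with a null pointer cast and makes the child exit if the exec fails, so a failed exec cannot fall back into the parent's loop.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One round of the loop body: fork `count` children that exec /bin/true,
 * then reap them. Returns the number of children actually reaped. */
static int fork_exec_round(int count)
{
	int started = 0, reaped = 0, status;

	for (int i = 0; i < count; i++) {
		pid_t pid = fork();

		if (pid == 0) {
			/* The variadic argument list must end with a null pointer. */
			execlp("/bin/true", "true", (char *)NULL);
			_exit(127);	/* exec failed: never return into the loop */
		}
		if (pid > 0)
			started++;
	}
	for (int i = 0; i < started; i++)
		if (wait(&status) > 0)
			reaped++;
	return reaped;
}
```

Calling fork_exec_round() repeatedly from a process placed in the cgroup approximates the fork/exec-intensive load described in the report.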
list_add corruption. next->prev should be prev (ffffffff80a3f338), but was 0000000000200200. (next=ffff810103dcbe90).
------------[ cut here ]------------
kernel BUG at /home/legoater/linux/2.6.23-mm1/lib/list_debug.c:27!
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/local_cpus
CPU 3
Modules linked in: ipt_REJECT iptable_filter autofs4 nfs lockd sunrpc tg3 sg joydev ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 2441, comm: bash Not tainted 2.6.23-mm1 #4
RIP: 0010:[<ffffffff80308cda>] [<ffffffff80308cda>] __list_add+0x27/0x5b
RSP: 0018:ffff810103d87dd8 EFLAGS: 00010296
RAX: 0000000000000079 RBX: ffff810105033040 RCX: 0000000000000079
RDX: ffff810103d960c0 RSI: 0000000000000001 RDI: 0000000000000096
RBP: ffff810103d87dd8 R08: 0000000000000002 R09: ffff810008123780
R10: 0000000000000000 R11: ffff810103d87a98 R12: 0000000000000000
R13: ffff810105033040 R14: ffff810104c11ac0 R15: 0000000000000000
FS: 00007f4e273556f0(0000) GS:ffff81010011a840(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006ca2f8 CR3: 0000000103d82000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 2441, threadinfo ffff810103d86000, task ffff810103d960c0)
last branch before last exception/interrupt
from [<ffffffff80235885>] printk+0x68/0x69
to [<ffffffff80308cda>] __list_add+0x...
This is a crash on
list_add(&child->cg_list, &child->cgroups->tasks);
in cgroup_post_fork(). So it looks like child->cgroups->tasks.next is
a deleted list element. But there are no places that modify that list
outside of write_lock(&css_set_lock) as far as I can see, so I'm a bit
confused as to what the problem could be. I'll try to reproduce this.
Paul
-Not a known race, no. Sorry, didn't have time to look at this yesterday since I was out of the office all day; I'll try to get a chance today. -
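Paul's reading of the report (next->prev holding 0000000000200200 means that next is a deleted element) follows from how list debugging poisons removed entries. The user-space sketch below imitates that check; it assumes the poison constants match the kernel's LIST_POISON1/LIST_POISON2 of that era, and checked_list_add() and poisoning_list_del() are hypothetical simplifications that return an error where the kernel would BUG().

```c
#include <stdbool.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Poison values written into a removed entry (assumed to mirror the
 * kernel's LIST_POISON1 and LIST_POISON2). */
#define POISON_NEXT ((struct list_head *)(uintptr_t)0x00100100)
#define POISON_PREV ((struct list_head *)(uintptr_t)0x00200200)

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert `new` between `prev` and `next`, but refuse (return false)
 * when the neighbours do not point at each other. */
static bool checked_list_add(struct list_head *new, struct list_head *prev,
			     struct list_head *next)
{
	if (next->prev != prev || prev->next != next)
		return false;
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
	return true;
}

/* Unlink `entry` and poison its pointers so later use is detectable. */
static void poisoning_list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = POISON_NEXT;
	entry->prev = POISON_PREV;
}
```

If some stale pointer still makes the list head's next refer to an already deleted entry, the consistency check sees POISON_PREV where it expects the head, which corresponds to the "next->prev should be prev ... but was 0000000000200200" message in the report.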
Hi Andrew,
The link failure seen in 2.6.23-rc8-mm2 (http://lkml.org/lkml/2007/9/30/2) when compiling the kernel with allyesconfig on the LPAR is still present in 2.6.23-mm1. The link failure is:
ld: arch/powerpc/kernel/head_64.o(.text+0x80c8): sibling call optimization to `.text.init.refok' does not allow automatic multiple TOCs; recompile with -mminimal-toc or -fno-optimize-sibling-calls, or make `.text.init.refok' extern
ld: arch/powerpc/kernel/head_64.o(.text+0x8160): sibling call optimization to `.text.init.refok' does not allow automatic multiple TOCs; recompile with -mminimal-toc or -fno-optimize-sibling-calls, or make `.text.init.refok' extern
ld: arch/powerpc/kernel/head_64.o(.text+0x81c4): sibling call optimization to `.text.init.refok' does not allow automatic multiple TOCs; recompile with -mminimal-toc or -fno-optimize-sibling-calls, or make `.text.init.refok' extern
ld: final link failed: Bad value
make: *** [.tmp_vmlinux1] Error 1
# gcc -v
Using built-in specs.
Target: powerpc64-suse-linux
Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib --libexecdir=/usr/lib --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.1.2 --enable-ssp --disable-libssp --disable-libgcj --with-slibdir=/lib --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --program-suffix=-4.1 --enable-version-specific-runtime-libs --without-system-libunwind --with-cpu=default32 --enable-secureplt --with-long-double-128 --host=powerpc64-suse-linux
Thread model: posix
gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)
ld -v
GNU ld version 2.17.50.0.5 20060927 (SUSE Linux)
Anything I can provide to help diagnose this?
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
-
Did we work out which patch is causing this? -
Hi Andrew, No, we have not yet worked out which patch is causing this. I will try a bisect to find the patch causing this issue. -- Thanks & Regards, Kamalesh Babulal, Linux Technology Center, IBM, ISTL. -
Hi Andrew, After bisecting, I found that git-net.patch is the cause of the link failure. -- Thanks & Regards, Kamalesh Babulal, Linux Technology Center, IBM, ISTL. -
On Sun, 21 Oct 2007 12:12:38 +0530 Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> [...] the link failure.
The actual cause is my patch to mark some things in head_64.S as init_refok. I have a test patch which I will tidy up and post soon. However, even with that fixed, I am running into a linker bug which Alan Modra is looking into.
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
/home is mounted with the following options:
/dev/mapper/vglinux1-lvhome on /home type reiserfs (rw,noatime,nodiratime,user_xattr)
I guess that beagled (the Beagle desktop search daemon) has populated
user xattrs on almost all files.
Now, when I delete a file, two BUGs occur and the system hangs. Here is
the stack for the first BUG (the second one is very similar):
[partially hand copied stack]
_fput
fput
reiserfs_delete_xattrs
reiserfs_delete_inode
generic_delete_inode
generic_drop_inode
iput
do_unlinkat
sys_unlink
sys_enter_past_esp
I reported a similar BUG in 2.6.22-rc8-mm2 (see
http://lkml.org/lkml/2007/9/27/235). Dave Hansen sent a patch for it, I
tested it and it was OK for 2.6.22-rc8-mm2.
I tried this patch on 2.6.23-mm1, and it fixed the BUGs here too.
----
From: Dave Hansen <haveblue@us.ibm.com>
The bug is caused by reiserfs creating a special 'struct file' with a
NULL vfsmount.
/* Opens a file pointer to the attribute associated with inode */
static struct file *open_xa_file(const struct inode *inode, const char *name,
int flags)
{
...
fp = dentry_open(xafile, NULL, O_RDWR);
/* dentry_open dputs the dentry if it fails */
As Christoph just said, this is somewhat of a bandaid. But, it
shouldn't hurt anything.
---
lxc-dave/fs/file_table.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN fs/open.c~fix-reiserfs-oops fs/open.c
diff -puN fs/file_table.c~fix-reiserfs-oops fs/file_table.c
--- lxc/fs/file_table.c~fix-reiserfs-oops 2007-09-27 13:32:20.000000000 -0700
+++ lxc-dave/fs/file_table.c 2007-09-27 13:33:11.000000000 -0700
@@ -236,7 +236,7 @@ void fastcall __fput(struct file *file)
 fops_put(file->f_op);
 if (file->f_mode & FMODE_WRITE) {
 put_write_access(inode);
- if (!special_file(inode->i_mode))
+ if (!special_file(inode->i_mode) && mnt)
 mnt_drop_write(mnt);
 }
 put_pid(file->f_owner.pid);
diff -puN include/li...
The delete path is a similar case to the one Dave fixed, also caused by
a NULL vfsmount passed to dentry_open, but through a different code path.
Untested fix for this problem below:
Index: linux-2.6.23-rc8/fs/reiserfs/xattr.c
===================================================================
--- linux-2.6.23-rc8.orig/fs/reiserfs/xattr.c 2007-09-30 14:13:46.000000000 +0200
+++ linux-2.6.23-rc8/fs/reiserfs/xattr.c 2007-09-30 14:18:30.000000000 +0200
@@ -207,9 +207,8 @@ static struct dentry *get_xa_file_dentry
* we're called with i_mutex held, so there are no worries about the directory
* changing underneath us.
*/
-static int __xattr_readdir(struct file *filp, void *dirent, filldir_t filldir)
+static int __xattr_readdir(struct inode *inode, void *dirent, filldir_t filldir)
{
- struct inode *inode = filp->f_path.dentry->d_inode;
struct cpu_key pos_key; /* key of current position in the directory (key of directory entry) */
INITIALIZE_PATH(path_to_entry);
struct buffer_head *bh;
@@ -352,24 +351,19 @@ static int __xattr_readdir(struct file *
* this is stolen from vfs_readdir
*
*/
-static
-int xattr_readdir(struct file *file, filldir_t filler, void *buf)
+static int xattr_readdir(struct inode *inode, filldir_t filler, void *buf)
{
- struct inode *inode = file->f_path.dentry->d_inode;
int res = -ENOTDIR;
- if (!file->f_op || !file->f_op->readdir)
- goto out;
+
mutex_lock_nested(&inode->i_mutex, I_MUTEX_XATTR);
-// down(&inode->i_zombie);
res = -ENOENT;
if (!IS_DEADDIR(inode)) {
lock_kernel();
- res = __xattr_readdir(file, buf, filler);
+ res = __xattr_readdir(inode, buf, filler);
unlock_kernel();
}
-// up(&inode->i_zombie);
mutex_unlock(&inode->i_mutex);
- out:
+
return res;
}
@@ -721,7 +715,6 @@ reiserfs_delete_xattrs_filler(void *buf,
/* This is called w/ inode->i_mutex downed */
int reiserfs_delete_xattrs(struct inode *inode)
{
- struct file ...
Does work fine, thanks. -
Here's a patch I worked up the other night that kills off struct file
completely from the xattr code. I've tested it locally.
After several posts and bug reports regarding interaction with the NULL
nameidata, here's a patch to clean up the mess with struct file in the
reiserfs xattr code.
As observed in several of the posts, there's really no need for struct file
to exist in the xattr code. It was really only passed around due to the
f_op->readdir() and a_ops->{prepare,commit}_write prototypes requiring it.
reiserfs_prepare_write() and reiserfs_commit_write() don't actually use
the struct file passed to it, and the xattr code uses a private version of
reiserfs_readdir() to enumerate the xattr directories.
I do have patches in my queue to convert the xattrs to use reiserfs_readdir(),
but I guess I'll just have to rework those.
This is pretty close to the patch by Dave Hansen for -mm, but I didn't
notice it until after I wrote this up.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
fs/reiserfs/xattr.c | 111 ++++++++++++++--------------------------------------
1 file changed, 31 insertions(+), 80 deletions(-)
--- a/fs/reiserfs/xattr.c 2007-08-27 14:03:39.000000000 -0400
+++ b/fs/reiserfs/xattr.c 2007-10-14 22:11:05.000000000 -0400
@@ -191,28 +191,11 @@ static struct dentry *get_xa_file_dentry
dput(xadir);
if (err)
xafile = ERR_PTR(err);
- return xafile;
-}
-
-/* Opens a file pointer to the attribute associated with inode */
-static struct file *open_xa_file(const struct inode *inode, const char *name,
- int flags)
-{
- struct dentry *xafile;
- struct file *fp;
-
- xafile = get_xa_file_dentry(inode, name, flags);
- if (IS_ERR(xafile))
- return ERR_PTR(PTR_ERR(xafile));
else if (!xafile->d_inode) {
dput(xafile);
- return ERR_PTR(-ENODATA);
+ xafile = ERR_PTR(-ENODATA);
}
-
- fp = dentry_open(xafile, NULL, O_RDWR);
- /* dentry_open dputs the dentry if it fails */
-
- return fp;
+ return xafile;
}
...
Sorry Jeff, your patch does not apply on 2.6.23-mm1. The 'struct file'
removal from the reiserfs_xattr_ functions is already in -mm
(make-reiserfs-stop-using-struct-file-for-internal.patch).
The patch from Dave that I was referring to is this one:
==== BEGIN =====
The bug is caused by reiserfs creating a special 'struct file' with a
NULL vfsmount.
/* Opens a file pointer to the attribute associated with inode */
static struct file *open_xa_file(const struct inode *inode, const char
*name,
int flags)
{
...
fp = dentry_open(xafile, NULL, O_RDWR);
/* dentry_open dputs the dentry if it fails */
As Christoph just said, this is somewhat of a bandaid. But, it
shouldn't hurt anything.
---
lxc-dave/fs/file_table.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN fs/open.c~fix-reiserfs-oops fs/open.c
diff -puN fs/file_table.c~fix-reiserfs-oops fs/file_table.c
--- lxc/fs/file_table.c~fix-reiserfs-oops 2007-09-27 13:32:20.000000000 -0700
+++ lxc-dave/fs/file_table.c 2007-09-27 13:33:11.000000000 -0700
@@ -236,7 +236,7 @@ void fastcall __fput(struct file *file)
fops_put(file->f_op);
if (file->f_mode & FMODE_WRITE) {
put_write_access(inode);
- if (!special_file(inode->i_mode))
+ if (!special_file(inode->i_mode) && mnt)
mnt_drop_write(mnt);
}
put_pid(file->f_owner.pid);
diff -puN include/linux/mount.h~fix-reiserfs-oops include/linux/mount.h
==== END ====
Dave sent it privately to me... I guess this "bandaid" is no longer
needed now, is it?
~~
laurent
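The "bandaid" quoted above amounts to skipping the mount write-count drop when the file has no vfsmount. A user-space model of just that decision is sketched below; the structures are simplified stand-ins, the boolean fields replace the real f_mode and i_mode tests, and every model_ name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel objects involved. */
struct vfsmount {
	int writers;			/* outstanding write references */
};

struct file {
	struct vfsmount *mnt;		/* NULL for the reiserfs-internal xattr file */
	bool opened_for_write;		/* stands in for f_mode & FMODE_WRITE */
	bool is_special;		/* stands in for special_file(i_mode) */
};

static void model_mnt_drop_write(struct vfsmount *mnt)
{
	mnt->writers--;
}

/* Models the write-side cleanup in __fput() with the extra NULL test.
 * Returns true when a mount write reference was dropped. */
static bool model_fput_write_side(struct file *file)
{
	struct vfsmount *mnt = file->mnt;

	if (!file->opened_for_write)
		return false;
	if (!file->is_special && mnt) {
		model_mnt_drop_write(mnt);
		return true;
	}
	return false;
}
```

In the model, a file opened for writing with mnt set to NULL is released without touching any mount, while an ordinary file drops its write reference as before.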
I'd guess not. This patch was actually against mainline. I should've
specified. I can work up one against -mm later today if it's needed.
-Jeff
--
Jeff Mahoney
SUSE Labs
-
We'll need to drop Dave's patch first. Andrew, can you drop it and put this one in instead? -
Looks like a merge of Dave's and my patch :) ACK from me, I don't care whether it's one or two patches. -
Yeah, it probably is. I did it from scratch since it was my mess, and the
patches I saw were against -mm. *shrug* Likewise, I don't care if it's one
or two.
-Jeff
--
Jeff Mahoney
SUSE Labs
-
Something seems to be amiss with CONFIG_LOCALVERSION handling.
I am routinely building with
CONFIG_LOCALVERSION="-testing"
CONFIG_LOCALVERSION_AUTO=y
My usual sequence of "make ; sudo make modules_install install"
has worked fine for all of 2.6.23{-rc?{,-mm?},}. For 2.6.23-mm1
it fails with:
ts@xenon:~/kernel/linux-2.6.23-mm1-work> sudo make modules_install install
root's password:
INSTALL arch/i386/crypto/aes-i586.ko
[...]
INSTALL sound/usb/usx2y/snd-usb-usx2y.ko
if [ -r System.map -a -x /sbin/depmod ]; then /sbin/depmod -ae -F System.map 2.6.23-mm1; fi
sh /home/ts/kernel/linux-2.6.23-mm1-work/arch/i386/boot/install.sh 2.6.23-mm1 arch/i386/boot/bzImage System.map "/boot"
Root device: /dev/system/root (mounted on / as ext3)
Module list: processor thermal ahci pata_marvell aic7xxx fan jbd ext3 dm_mod edd dm-mod dm-snapshot (xennet xenblk dm-mod dm-snapshot)
Kernel image: /boot/vmlinuz-2.6.23-mm1
Initrd image: /boot/initrd-2.6.23-mm1
No modules found for kernel 2.6.23-mm1-testing
ts@xenon:~/kernel/linux-2.6.23-mm1-work>
That is, both "make modules_install" and "make install" omit
the "-testing" suffix, "make modules_install" installing the
modules into /lib/modules/2.6.23-mm1 instead of
/lib/modules/2.6.23-mm1-testing, and "make install" passing
"2.6.23-mm1" without the "-testing" suffix to the install.sh
script, but mkinitrd suddenly rediscovers the real kernel
version string and consequently looks for modules in
/lib/modules/2.6.23-mm1-testing, so initrd creation fails.
Ideas?
--
Tilman Schmidt E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
I have investigated a bit more, and stumbled on this:
ts@xenon:~/kernel/linux-2.6.23-mm1-work> make include/config/kernel.release
ts@xenon:~/kernel/linux-2.6.23-mm1-work> cat include/config/kernel.release
2.6.23-mm1-testing
ts@xenon:~/kernel/linux-2.6.23-mm1-work> make
Using ARCH=i386 CROSS_COMPILE=
  CHK include/linux/version.h
  CHK include/linux/utsrelease.h
[...]
Kernel: arch/i386/boot/bzImage is ready (#1)
  Building modules, stage 2.
  MODPOST 1085 modules
ts@xenon:~/kernel/linux-2.6.23-mm1-work> cat include/config/kernel.release
2.6.23-mm1
ts@xenon:~/kernel/linux-2.6.23-mm1-work>
Hmmm. "Curiouser and curiouser", said Alice.
So the content of the file include/config/kernel.release generated
by "make" varies depending on whether I ask "make" to create just
that file, or an entire kernel!? That runs against everything I
ever learned about "make"!
My ability to comprehend the inner workings of Kbuild ends here.
I'll just skip this -mm release and wait for 2.6.24-rc1, hoping
it won't have the same problem.
--
Tilman Schmidt E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
2.6.24-rc1 is fine, so the issue can be closed. T. -
Thanks for reporting back, Sam -
Nope... I have just tried it out with the latest -linus tree and I see no bugs. Note that all kbuild fixes are in the latest -linus except for a few things that are postponed. I will keep it in mind but not pursue it further for now. Sam -
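The mismatch Tilman describes can be pictured with a small sketch. It assumes the usual Kbuild convention that the release string is the base version followed by the local version suffix, and that modules are looked up under /lib/modules/<release>. module_dir() is a hypothetical helper for illustration; it is not part of Kbuild, install.sh or mkinitrd.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build the module directory for a given base version and local
 * version suffix. Returns the length snprintf() reports. */
static int module_dir(char *buf, size_t len, const char *version,
		      const char *localversion)
{
	return snprintf(buf, len, "/lib/modules/%s%s", version, localversion);
}
```

With version "2.6.23-mm1", an empty suffix yields the directory that "make modules_install" wrote to in the report, while the "-testing" suffix yields the one mkinitrd searched; the two paths differ, so the modules are not found.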
Andrew Morton wrote:
Works a bit better, right :) At least it boots here, but I have a strange problem with it.
It seems 2.6.23-mm1 kills off java. Every program that needs java here no longer works and reports that 'my java' installation is incorrect.
I also noticed that firefox is acting weird, as is thunderbird. Gtk apps just randomly freeze and need to be killed with -11.
Running 'java -version' manually returns nothing, and 'java -jar some.jar' does nothing as well (not even an error or anything else).
(I've also tested Sun's java 1.5 and 1.6 and openjre as well, all with the same result.)
I only have a WARNING in my dmesg, but I don't think it is related to this:
Oct 13 01:44:52 lara [10722.146448] WARNING: at fs/namespace.c:586 __mntput()
Oct 13 01:44:52 lara [10722.146478] [<c0167cb2>] mntput_no_expire+0x5d/0xab
Oct 13 01:44:52 lara [10722.146503] [<c01683d1>] sys_umount+0x1f8/0x202
Oct 13 01:44:52 lara [10722.146511] [<c010f368>] check_pgt_cache+0x13/0x15
Oct 13 01:44:52 lara [10722.146529] [<c0158cd0>] sys_stat64+0xf/0x23
Oct 13 01:44:52 lara [10722.146549] [<c0147a9c>] remove_vma+0x31/0x36
Oct 13 01:44:52 lara [10722.146574] [<c010fbf6>] do_page_fault+0x180/0x4ea
Oct 13 01:44:52 lara [10722.146600] [<c01683e6>] sys_oldumount+0xb/0xe
Oct 13 01:44:52 lara [10722.146614] [<c010258e>] sysenter_past_esp+0x5f/0x85
Oct 13 01:44:52 lara [10722.146639] [<c02e0000>] xfrm_tmpl_resolve+0x2bd/0x37b
Oct 13 01:44:52 lara [10722.146656] =======================
I also noticed that some programs, like vlc, segfault:
vlc[20506]: segfault at 01950000 eip 01950000 esp b4876368 error 4
Booting 2.6.23 makes all of these go away. I don't have anything else in my logs.
Any idea which patches could cause these problems?
The config can be found here -> http://194.231.229.228/2.6.23-mm1-config
Regards,
Gabriel C -
what is vlc? -
Media player -> http://www.videolan.org/vlc/ -
Do you know any more about when this happened? Was it during a reboot, or after you unmounted some device or volume? Have you seen it again? Which filesystem(s) do you use? -- Dave -
Hi Andrew,
Another build failure, with the following message:
CC drivers/scsi/advansys.o
drivers/scsi/advansys.c:71:2: warning: #warning this driver is still not properly converted to the DMA API
drivers/scsi/advansys.c: In function ‘AdvBuildCarrierFreelist’:
drivers/scsi/advansys.c:6486: error: implicit declaration of function ‘virt_to_bus’
drivers/scsi/advansys.c: In function ‘AdvInitAsc3550Driver’:
drivers/scsi/advansys.c:6974: error: implicit declaration of function ‘bus_to_virt’
drivers/scsi/advansys.c:6974: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c:6994: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c: In function ‘AdvInitAsc38C0800Driver’:
drivers/scsi/advansys.c:7450: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c:7471: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c: In function ‘AdvInitAsc38C1600Driver’:
drivers/scsi/advansys.c:7939: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c:7963: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c: In function ‘adv_isr_callback’:
drivers/scsi/advansys.c:8175: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c: In function ‘AdvISR’:
drivers/scsi/advansys.c:8392: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c:8412: warning: cast to pointer from integer of different size
drivers/scsi/advansys.c: In function ‘AdvExeScsiQueue’:
drivers/scsi/advansys.c:10845: warning: cast to pointer from integer of different size
make[2]: *** [drivers/scsi/advansys.o] Error 1
make[1]: *** [drivers/scsi] Error 2
make: *** [drivers] Error 2
The functions virt_to_bus and bus_to_virt are only defined inside an
ifdef CONFIG_PPC32 block, so when I compile allyesconfig on a ppc64 box I
get this error. This patch removes the ifdef.
Signed-off-by : Kamalesh Babulal <kama...
especially ones like that ;) Matthew has proposed that advansys should be dependent upon CONFIG_VIRT_TO_BUS. I don't think anyone's done a patch yet though. (Actually, the code which you've altered there should probably be using CONFIG_VIRT_TO_BUS, too). -
Which is totally bogus, because virt_to_bus/bus_to_virt only work on systems without an IOMMU. Most if not all ppc64 systems have one or more IOMMUs. This patch is nacked. The correct fix is to make advansys depend on CONFIG_VIRT_TO_BUS, or alternatively fix advansys.c properly by making it use the interfaces described in Documentation/DMA-mapping.txt (or the equivalent scsi Definitely. Paul. -
If you look at the git logs, you'll notice there's some progress towards this. It's already the case for the narrow boards. I have a patch to rip it all out for the wide boards, but there's clearly a bug because it crashes my parisc machine. Works fine on x86 though. I can't work on it this week because I'm travelling and the parisc machine with remote power died on me last week. I think I already suggested a temporary CONFIG_VIRT_TO_BUS dependency to akpm last week. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -
Hi Andrew,
The build fails with following message
CC drivers/net/ibm_newemac/zmii.o
CC drivers/net/ibm_newemac/rgmii.o
drivers/net/ibm_newemac/rgmii.c: In function ‘rgmii_probe’:
drivers/net/ibm_newemac/rgmii.c:254: error: implicit declaration of
function ‘device_is_compatible’
make[3]: *** [drivers/net/ibm_newemac/rgmii.o] Error 1
make[2]: *** [drivers/net/ibm_newemac] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2
The function device_is_compatible does not exist; it seems to be called
instead of of_device_is_compatible. This patch replaces the call.
Signed-off-by : Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
---
--- linux-2.6.23/drivers/net/ibm_newemac/rgmii.c 2007-10-12 12:10:48.000000000 +0530
+++ linux-2.6.23/drivers/net/ibm_newemac/~rgmii.c 2007-10-12 14:37:21.000000000 +0530
@@ -251,7 +251,7 @@ static int __devinit rgmii_probe(struct
}
/* Check for RGMII type */
- if (device_is_compatible(ofdev->node, "ibm,rgmii-axon"))
+ if (of_device_is_compatible(ofdev->node, "ibm,rgmii-axon"))
dev->type = RGMII_AXON;
else
dev->type = RGMII_STANDARD;
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
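For context on what the renamed call checks: a device-tree "compatible" property is a sequence of NUL-terminated strings packed back to back, and the match succeeds if any of them equals the requested name. The sketch below is a simplified user-space illustration that uses an exact, case-sensitive comparison; the kernel's own matching rules may differ, and compat_list_contains() is a hypothetical name, not a kernel function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Scan a "compatible"-style property: `len` bytes holding one or more
 * NUL-terminated strings stored back to back. Returns true when one of
 * them is exactly `compat`. */
static bool compat_list_contains(const char *prop, size_t len,
				 const char *compat)
{
	size_t want = strlen(compat);
	size_t off = 0;

	while (off < len) {
		const char *s = prop + off;
		const char *end = memchr(s, '\0', len - off);
		/* Tolerate a final string without a terminator. */
		size_t n = end ? (size_t)(end - s) : len - off;

		if (n == want && memcmp(s, compat, n) == 0)
			return true;
		off += n + 1;
	}
	return false;
}
```

In rgmii_probe() the corresponding test decides between RGMII_AXON and RGMII_STANDARD depending on whether "ibm,rgmii-axon" appears in the node's list.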
-
On 12.10.2007 06:31, Andrew Morton wrote:
On Fri, 12 Oct 2007 22:38:25 +0200 ho hum. Maybe reiser4 needs updating for the git-block changes. I don't recall having seen a useful description of what's going on in git-block so some reverse-engineering might be needed. -
Reiser4: Drop 'size' argument from bio_endio and bi_end_io
This patch pushes into Reiser4 the changes introduced by
commit 6712ecf8f648118c3363c142196418f89a510b90:
As bi_end_io is only called once when the request is complete,
the 'size' argument is now redundant. Remove it.
Now there is no need for bio_endio to subtract the size completed
from bi_size. So don't do that either.
While we are at it, change bi_end_io to return void.
Please review.
Signed-Off-By: Laurent Riffard <laurent.riffard@free.fr>
---
fs/reiser4/flush_queue.c | 10 ++--------
fs/reiser4/page_cache.c | 24 ++++--------------------
fs/reiser4/status_flags.c | 7 +------
3 files changed, 7 insertions(+), 34 deletions(-)
Index: linux-2.6-mm/fs/reiser4/flush_queue.c
===================================================================
--- linux-2.6-mm.orig/fs/reiser4/flush_queue.c
+++ linux-2.6-mm/fs/reiser4/flush_queue.c
@@ -391,9 +391,8 @@ int atom_fq_parts_are_clean(txn_atom * a
}
#endif
/* Bio i/o completion routine for reiser4 write operations. */
-static int
-end_io_handler(struct bio *bio, unsigned int bytes_done UNUSED_ARG,
- int err)
+static void
+end_io_handler(struct bio *bio, int err)
{
int i;
int nr_errors = 0;
@@ -401,10 +400,6 @@ end_io_handler(struct bio *bio, unsigned
assert("zam-958", bio->bi_rw & WRITE);
- /* i/o op. is not fully completed */
- if (bio->bi_size != 0)
- return 1;
-
if (err == -EOPNOTSUPP)
set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
@@ -447,7 +442,6 @@ end_io_handler(struct bio *bio, unsigned
}
bio_put(bio);
- return 0;
}
/* Count I/O requests which will be submitted by @bio in given flush queues
Index: linux-2.6-mm/fs/reiser4/page_cache.c
===================================================================
--- linux-2.6-mm.orig/fs/reiser4/page_cache.c
+++ linux-2.6-mm/fs/reiser4/page_cache.c
@@ -320,18 +320,11 @@ reiser4_tree *reiser4_tree_by_page(const
mpage_end_io_r...Thanks! -
Looks correct to me. Acked-by: Jens Axboe <jens.axboe@oracle.com> -- Jens Axboe -
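For readers who have not followed the git-block change, the difference between the old and new completion-callback conventions can be sketched in plain C. This is only an illustration: struct mock_bio, old_end_io() and new_end_io() are hypothetical stand-ins written for this note, not the kernel's struct bio or the reiser4 handlers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; only what the sketch needs. */
struct mock_bio {
	unsigned int bi_size;	/* bytes still outstanding */
	int completions;	/* how often the handler did its real work */
	int last_err;		/* error code seen on completion */
};

/*
 * Old convention: the handler could be called for partial completions,
 * so it bails out until bi_size reaches zero, and it returns int.
 */
static int old_end_io(struct mock_bio *bio, unsigned int bytes_done, int err)
{
	(void)bytes_done;	/* unused, like UNUSED_ARG in the patch */
	if (bio->bi_size != 0)
		return 1;	/* i/o op. is not fully completed */
	bio->last_err = err;
	bio->completions++;
	return 0;
}

/*
 * New convention: called once, when the whole request is complete, so
 * there is no size argument, no bi_size check and no return value.
 */
static void new_end_io(struct mock_bio *bio, int err)
{
	assert(bio != NULL);
	bio->last_err = err;
	bio->completions++;
}
```

The three hunks of the reiser4 patch follow this shape: drop the bytes_done parameter, drop the early return on bi_size != 0, and change the return type to void.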
Hmm. I can add more data to this. My x86_64 laptop is running 2.6.23-mm1 with Reiser4 and does not experience problems. I am using a 64-bit kernel, libata (I think, whatever the SCSI-like PATA is called), and Reiser4. Both libata and Reiser4 are built-in, not modules. -- Zan Lynx <zlynx@acm.org>
Hi Andrew,
The build fails with the following error message.
CC arch/powerpc/sysdev/axonram.o
arch/powerpc/sysdev/axonram.c:120:34: error: macro "bio_io_error" passed 2 arguments, but takes just 1
arch/powerpc/sysdev/axonram.c: In function ‘axon_ram_make_request’:
arch/powerpc/sysdev/axonram.c:120: error: ‘bio_io_error’ undeclared (first use in this function)
arch/powerpc/sysdev/axonram.c:120: error: (Each undeclared identifier is reported only once
arch/powerpc/sysdev/axonram.c:120: error: for each function it appears in.)
arch/powerpc/sysdev/axonram.c:134: error: too many arguments to function ‘bio_endio’
make[1]: *** [arch/powerpc/sysdev/axonram.o] Error 1
make: *** [arch/powerpc/sysdev] Error 2
The patch fixes the build failure.
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
---
--- linux-2.6.23/arch/powerpc/sysdev/axonram.c 2007-10-12 12:58:14.000000000 +0530
+++ linux-2.6.23/arch/powerpc/sysdev/~axonram.c 2007-10-12 12:51:43.000000000 +0530
@@ -117,7 +117,7 @@ axon_ram_make_request(struct request_que
transfered = 0;
bio_for_each_segment(vec, bio, idx) {
if (unlikely(phys_mem + vec->bv_len > phys_end)) {
- bio_io_error(bio, bio->bi_size);
+ bio_io_error(bio);
rc = -ERANGE;
break;
}
@@ -131,7 +131,7 @@ axon_ram_make_request(struct request_que
phys_mem += vec->bv_len;
transfered += vec->bv_len;
}
- bio_endio(bio, transfered, 0);
+ bio_endio(bio, 0);
return rc;
}
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
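The shape of the calls after this fix can be modelled outside the kernel. The sketch below is an assumption-laden illustration: mock_bio, mock_bio_endio(), mock_bio_io_error() and mock_make_request() are hypothetical names, and the request function is deliberately simplified to a single bounds check rather than the per-segment loop in axonram.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio. */
struct mock_bio {
	int done;	/* set once the bio has been completed */
	int error;	/* completion status */
};

/* Two-argument completion call, mirroring the bio_endio(bio, 0) form. */
static void mock_bio_endio(struct mock_bio *bio, int error)
{
	assert(bio != NULL);
	bio->done = 1;
	bio->error = error;
}

/* One-argument error shorthand, mirroring the bio_io_error(bio) form. */
#define mock_bio_io_error(bio) mock_bio_endio((bio), -EIO)

/* Simplified model of the range check in axon_ram_make_request(). */
static int mock_make_request(struct mock_bio *bio, unsigned long start,
			     unsigned long len, unsigned long end)
{
	if (start + len > end) {
		mock_bio_io_error(bio);
		return -ERANGE;
	}
	mock_bio_endio(bio, 0);
	return 0;
}
```

Neither call carries a byte count any more, which is why the compiler rejected the old two-argument bio_io_error() and three-argument bio_endio() calls.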
Hey there!!

2.6.23-mm1 fails to boot here with this friendly oops:
http://oioio.altervista.org/linux/dsc01702.jpg
.config: http://oioio.altervista.org/linux/config-2.6.23-mm1-1

2.6.23-rc8-mm2 booted OK but had other problems I haven't reported yet (no s2ram with mysql running and some net WARNING). Let's see if .23-mm1 still has those first.

I'm adding Cc: linux-scsi

PS: I'll hardly be able to bisect in the next few days... :P
That looks like a Jens and Dave production to me. -
Yes, and it's been fixed: http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdif... See also: http://lkml.org/lkml/2007/10/13/174 Thanks, Shaggy -- David Kleikamp IBM Linux Technology Center -
Thanks, this fixes it.
On Thu, 11 Oct 2007 21:31:26 -0700 Between rc8-mm2 and 2.6.23-mm1, autofs stopped working in the -mm kernel. Instead of mounting my home directory, I get these messages in /var/log/messages: Oct 20 00:38:52 kenny automount[2293]: cache_readlock: mapent cache rwlock lock failed Oct 20 00:38:52 kenny automount[2293]: unexpected pthreads error: 11 at 65 in cache.c I am not sure if this is due to autofs changes or changes in some other code that was merged. If you can think of any suspicious change that I should test, please let me know. -- "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan -
On Mon, 22 Oct 2007 11:45:19 +0800 Not that I know. If I reboot the system into 2.6.23 or 2.6.23-git, things work just fine though. That makes me think the server is not I do not know if this an autofs issue or the result of something Nope, the only two lines that I found in the log are above... Nothing in dmesg either. -
I don't think anything changed in autofs in that period. I'd be suspecting the r-o-bind-mounts patches, but they didn't change much in that time either. Does current mainline work OK? If so, pretty much the only thing in that area left unmerged is r-o-bind-mounts and hch's exportfs stuff. -
On Fri, 19 Oct 2007 22:39:00 -0700 Yes, 2.6.23 mainline works fine. -
On Sat, 20 Oct 2007 01:54:04 -0400 Let me clarify: 2.6.23 vanilla works. I have not yet tried the latest 2.6.23+ git tree. -
On Sat, 20 Oct 2007 01:54:45 -0400 I just tried it. In the latest git tree, autofs still works. The regression is in -mm only. -
Andrew, Rik tracked it down to an interaction with futexes from the pid namespace code. I believe r/o bind mounts are innocent for now. -- Dave -
I noticed that the behavior of 32-bit binaries on x86_64 has changed in 2.6.23-mm1. Below is the pmap output taken after the process hits -ENOMEM (see attached program).

== on 2.6.23 ==
errno 12
3531: ./malloc
0000000000001000    6272K ----- [ anon ]
0000000000621000     100K r-x-- /lib/ld-2.5.so
000000000063a000       4K r---- /lib/ld-2.5.so
000000000063b000       4K rw--- /lib/ld-2.5.so
000000000063c000       8K ----- [ anon ]
000000000063e000    1244K r-x-- /lib/libc-2.5.so
0000000000775000       8K r---- /lib/libc-2.5.so
0000000000777000       4K rw--- /lib/libc-2.5.so
0000000000778000      12K rw--- [ anon ]
000000000077b000  123700K ----- [ anon ]
0000000008048000       4K r-x-- /home/kamezawa/malloc
0000000008049000       4K rw--- /home/kamezawa/malloc
000000000804a000 3929824K ----- [ anon ]
00000000f7f02000       8K rw--- [ anon ]
00000000f7f04000     100K ----- [ anon ]
00000000f7f1d000       4K rw--- [ anon ]
00000000f7f1e000  131812K ----- [ anon ]
00000000fffd7000      84K rw--- [ stack ]
00000000fffec000      72K ----- [ anon ]
00000000ffffe000       4K r-x-- [ anon ]
total 4193272K
==

== on 2.6.23-mm1 ==
errno 12
3504: ./malloc
0000000000621000     100K r-x-- /lib/ld-2.5.so
000000000063a000       4K r---- /lib/ld-2.5.so
000000000063b000       4K rw--- /lib/ld-2.5.so
000000000063e000    1244K r-x-- /lib/libc-2.5.so
0000000000775000       8K r---- /lib/libc-2.5.so
0000000000777000       4K rw--- /lib/libc-2.5.so
0000000000778000      12K rw--- [ anon ]
0000000008048000       4K r-x-- /home/kamezawa/malloc
0000000008049000       4K rw--- /home/kamezawa/malloc
0000000055555000       4K rw--- [ anon ]
0000000055556000     100K ----- [ anon ]
000000005556f000       8K rw--- [ anon ]
0000000055671000 2789016K ----- [ anon ]
00000000ffa17000      84K rw--- [ stack ]
00000000ffa2c000    5960K ----- [ anon ]
00000000ffffe000       4K r-x-- [ anon ]
total 2796560K
==

Maybe get_unmapped_area() had some change. Is th...
Hi, hm, I guess this is probably due to the pie-randomization patch, right? (Could you please try reverting it, to see whether things get back to normal?) Oh well, this causes more trouble than I had ever imagined ... I will look into it, thanks a lot for the report. Andrew, please drop this one again, I will fix it up. Thanks, -- Jiri Kosina -
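To put a number on the regression, the two pmap totals from the report (4193272K on 2.6.23, 2796560K on 2.6.23-mm1) can be compared directly. The helper names below are made up for this note; only the two totals come from the report.

```c
#include <assert.h>

/* Totals in K, copied from the two pmap listings in the report. */
#define PMAP_TOTAL_2_6_23_KB     4193272UL
#define PMAP_TOTAL_2_6_23_MM1_KB 2796560UL

/* Fraction of the old mapped total that is still reached on -mm1. */
static double mapped_fraction(unsigned long new_kb, unsigned long old_kb)
{
	assert(old_kb != 0);
	return (double)new_kb / (double)old_kb;
}

/* Amount of mappable address space lost, in K. */
static unsigned long lost_kb(unsigned long new_kb, unsigned long old_kb)
{
	assert(new_kb <= old_kb);
	return old_kb - new_kb;
}
```

Called with the two totals, mapped_fraction() gives roughly 0.667 and lost_kb() gives 1396712, so the 32-bit test program reaches only about two thirds of the address space it could map on 2.6.23.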
On Wed, 17 Oct 2007 11:10:23 +0200 (CEST)
Maybe this can be a fix.
Thanks,
-Kame
==
ia32 on x86_64 seems to be handled as it is.
arch/x86_64/mm/mmap.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
Index: devel-2.6.23-mm1/arch/x86_64/mm/mmap.c
===================================================================
--- devel-2.6.23-mm1.orig/arch/x86_64/mm/mmap.c
+++ devel-2.6.23-mm1/arch/x86_64/mm/mmap.c
@@ -54,13 +54,17 @@ static inline unsigned long mmap_base(vo
return TASK_SIZE - (gap & PAGE_MASK);
}
-static inline int mmap_is_legacy(void)
+static inline int mmap_is_32(void)
{
#ifdef CONFIG_IA32_EMULATION
if (test_thread_flag(TIF_IA32))
return 1;
#endif
+ return 0;
+}
+static inline int mmap_is_legacy(void)
+{
if (current->personality & ADDR_COMPAT_LAYOUT)
return 1;
@@ -89,7 +93,12 @@ void arch_pick_mmap_layout(struct mm_str
* Fall back to the standard layout if the personality
* bit is set, or if the expected stack growth is unlimited:
*/
- if (mmap_is_legacy()) {
+ if (mmap_is_32()) {
+#ifdef CONFIG_IA32_EMULATION
+ /* ia32_pick_mmap_layout has its own. */
+ return ia32_pick_mmap_layout(mm);
+#endif
+ } else if(mmap_is_legacy()) {
mm->mmap_base = TASK_UNMAPPED_BASE;
mm->get_unmapped_area = arch_get_unmapped_area;
mm->unmap_area = arch_unmap_area;
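The decision order this patch introduces can be modelled in user-space C. This is a sketch under stated assumptions: struct task_model and pick_layout() are hypothetical, the three booleans replace test_thread_flag(TIF_IA32), the ADDR_COMPAT_LAYOUT personality bit and the unlimited-stack check mentioned in the comment, and the final top-down branch is assumed because it is not visible in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mmap_layout {
	LAYOUT_IA32,	/* handed over to ia32_pick_mmap_layout() */
	LAYOUT_LEGACY,	/* bottom-up from TASK_UNMAPPED_BASE */
	LAYOUT_TOPDOWN	/* assumed default; not shown in the hunk */
};

/* Hypothetical per-task inputs to the layout decision. */
struct task_model {
	bool is_ia32;			/* models mmap_is_32() */
	bool addr_compat_layout;	/* personality bit is set */
	bool stack_unlimited;		/* stack growth is unlimited */
};

static enum mmap_layout pick_layout(const struct task_model *t)
{
	assert(t != NULL);
	/* The 32-bit check comes first, so ia32 tasks never fall into
	 * the 64-bit legacy branch even with the personality bit set. */
	if (t->is_ia32)
		return LAYOUT_IA32;
	if (t->addr_compat_layout || t->stack_unlimited)
		return LAYOUT_LEGACY;
	return LAYOUT_TOPDOWN;
}
```

The point of splitting mmap_is_32() out of mmap_is_legacy() is visible in the ordering: the ia32 case is answered before any of the 64-bit layout rules are consulted.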
Andrew,

below is a fixed version with the patch from Kamezawa Hiroyuki incorporated. It fixes the small regression Kamezawa found just at the time you sent the merge request for this patch to Linus -- that ia32 ELF binaries on x86_64 were able to allocate only about 2/3 of the memory they were able to allocate without this patch.

Apart from this fix, the patch is the same as it has been in the -mm tree for quite some time. It'd be great if it could make it for 2.6.24, if feasible.

Thanks.

From: Jiri Kosina <jkosina@suse.cz>
Subject: PIE executable randomization

This patch is using mmap()'s randomization functionality in such a way that it maps the main executable of (specially compiled/linked -pie/-fpie) ET_DYN binaries onto a random address (in cases in which mmap() is allowed to perform a randomization).

The code has been extracted from Ingo's exec-shield patch http://people.redhat.com/mingo/exec-shield/

[akpm@linux-foundation.org: fix used-uninitialsied warning]
[kamezawa.hiroyu@jp.fujitsu.com: fixed ia32 ELF on x86_64 handling]
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

diff --git a/arch/ia64/ia32/binfmt_elf32.c b/arch/ia64/ia32/binfmt_elf32.c
index f6ae3ec..3db699b 100644
--- a/arch/ia64/ia32/binfmt_elf32.c
+++ b/arch/ia64/ia32/binfmt_elf32.c
@@ -226,7 +226,7 @@ elf32_set_personality (void)
}
static unsigned long
-elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type)
+elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type, unsigned long unused)
{
unsigned long pgoff = (eppnt->p_vaddr) & ~IA32_PAGE_MASK;
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 907942e..95485e6 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -12,6 +12,7 @@
#include <linux/file.h>
#include <linux/utsname.h>
#include <linux/personality.h>
+#include <linux/random.h>
#include <asm/uaccess.h...
