ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1...
- the git-block tree remains dropped due to disagreement with the Vaio
- git-e1000new was withdrawn by the authors
- git-wireless is back. It is still a >3MB diff, and appears to compile.
- Is anyone testing the kgdb code in here?
Boilerplate:
- See the `hot-fixes' directory for any important updates to this patchset.
- To fetch an -mm tree using git, use (for example)
git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1
- -mm kernel commit activity can be reviewed by subscribing to the
mm-commits mailing list.
echo "subscribe mm-commits" | mail majordomo@vger.kernel.org
- If you hit a bug in -mm and it is not obvious which patch caused it, it is
most valuable if you can perform a bisection search to identify which patch
introduced the bug. Instructions for this process are at
http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
But beware that this process takes some time (around ten rebuilds and
reboots), so consider reporting the bug first and if we cannot immediately
identify the faulty patch, then perform the bisection search.
- When reporting bugs, please try to Cc: the relevant maintainer and mailing
list on any email.
- When reporting bugs in this kernel via email, please also rewrite the
email Subject: in some manner to reflect the nature of the bug. Some
developers filter by Subject: when looking for messages to read.
- Occasional snapshots of the -mm lineup are uploaded to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
the mm-commits list.

Changes since 2.6.23-rc1-mm1:
origin.patch
git-acpi.patch
git-alsa.patch
git-audit-master.patch
git-cifs.patch
git-dma.patch
git-drm.patch
git-dvb.patch
git-hwmon.patch
git-gfs2-nmw.patch
git-hi...
This patch makes the following needlessly global functions static:
- file/cryptcompress.c: __put_page_cluster()
- file/cryptcompress.c: put_hint_cluster()
- item/ctail.c: ctail_read_disk_cluster()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
fs/reiser4/plugin/cluster.h | 4 ----
fs/reiser4/plugin/file/cryptcompress.c | 8 ++++----
fs/reiser4/plugin/file/cryptcompress.h | 2 --
fs/reiser4/plugin/item/ctail.c | 5 +++--
4 files changed, 7 insertions(+), 12 deletions(-)

dcbd29fcda0d6143ab0ec5f91e347d6e540f27bd
diff --git a/fs/reiser4/plugin/cluster.h b/fs/reiser4/plugin/cluster.h
index 7856074..af7a305 100644
--- a/fs/reiser4/plugin/cluster.h
+++ b/fs/reiser4/plugin/cluster.h
@@ -326,8 +326,6 @@ int reiser4_deflate_cluster(struct cluster_handle *, struct inode *);
void truncate_complete_page_cluster(struct inode *inode, cloff_t start,
int even_cows);
void invalidate_hint_cluster(struct cluster_handle * clust);
-void put_hint_cluster(struct cluster_handle * clust, struct inode *inode,
- znode_lock_mode mode);
int get_disk_cluster_locked(struct cluster_handle * clust, struct inode * inode,
znode_lock_mode lock_mode);
void reset_cluster_params(struct cluster_handle * clust);
@@ -335,8 +333,6 @@ int set_cluster_by_page(struct cluster_handle * clust, struct page * page,
int count);
int prepare_page_cluster(struct inode *inode, struct cluster_handle * clust,
rw_op rw);
-void __put_page_cluster(int from, int to, struct page ** pages,
- struct inode * inode);
void put_page_cluster(struct cluster_handle * clust,
struct inode * inode, rw_op rw);
void put_cluster_handle(struct cluster_handle * clust);
diff --git a/fs/reiser4/plugin/file/cryptcompress.c b/fs/reiser4/plugin/file/cryptcompress.c
index f0f0bee..9724d64 100644
--- a/fs/reiser4/plugin/file/cryptcompress.c
+++ b/fs/reiser4/plugin/file/cryptcompress.c
@@ -1378,8 +1378,8 @@ static void truncate_page_cluster_range(struct inode * in...
pm3fb_init() needlessly became global.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
drivers/video/pm3fb.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

9d0fd36d6429c19ac4377c19172d021b3aaa52e8
diff --git a/drivers/video/pm3fb.c b/drivers/video/pm3fb.c
index 3f004e8..195bcdb 100644
--- a/drivers/video/pm3fb.c
+++ b/drivers/video/pm3fb.c
@@ -798,8 +798,6 @@ static void pm3fb_write_mode(struct fb_info *info)
/*
* hardware independent functions
*/
-int pm3fb_init(void);
-
static int pm3fb_check_var(struct fb_var_screeninfo *var, struct fb_info *info)
{
u32 lpitch;
@@ -1419,7 +1417,7 @@ static int __init pm3fb_setup(char *options)
}
#endif /* MODULE */

-int __init pm3fb_init(void)
+static int __init pm3fb_init(void)
{
/*
* For kernel boot options (in 'video=pm3fb:<options>' format)
-
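Both cleanups above follow the same pattern: a function that is only used inside one source file loses its prototype in the shared header and gains `static`. The following is a minimal userspace sketch of that pattern, not code from reiser4 or pm3fb; `scale_sample()` and `process_sample()` are hypothetical names chosen for illustration.

```c
/* Only process_sample() is meant to be visible outside this file. */

/* Internal helper (hypothetical): 'static' gives it internal linkage,
 * so it needs no prototype in a shared header and cannot clash with a
 * same-named symbol in another file. */
static int scale_sample(int raw)
{
	return raw * 2;
}

/* Public entry point (hypothetical): the only symbol a header would
 * still have to declare. */
int process_sample(int raw)
{
	return scale_sample(raw) + 1;
}
```

With the helper static, the compiler can also warn when it becomes unused, which is presumably how such needlessly global functions get spotted in the first place.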
CONFIG_MMC_ARMMMCI=m/y results in the following compile error:
<-- snip -->
...
CC [M] drivers/mmc/host/mmci.o
/home/bunk/linux/kernel-2.6/linux-2.6.23-rc1-mm2/drivers/mmc/host/mmci.c: In function 'mmci_request':
/home/bunk/linux/kernel-2.6/linux-2.6.23-rc1-mm2/drivers/mmc/host/mmci.c:398: error: implicit declaration of function 'mmc_end_request'
make[4]: *** [drivers/mmc/host/mmci.o] Error 1
<-- snip -->
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
On Wed, 8 Aug 2007 23:31:14 +0200
Thanks. That wasn't the only bug in there. Hopefully fixed now.
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
-
Hi,
I still get some nfs related locking bug.
I applied
linux-2.6.23-001-fix_rpciod_down_race.dif
linux-2.6.23-003-fix_locking_regression.dif
linux-2.6.23-004-fix_stateid_regression.dif

=============================================
[ INFO: possible recursive locking detected ]
2.6.23-rc1-mm2 #3
---------------------------------------------
events/0/5 is trying to acquire lock:
(events){--..}, at: [<c012ed90>] flush_workqueue+0x0/0x70

but task is already holding lock:
(events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0

other info that might help us debug this:
2 locks held by events/0/5:
#0: (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
#1: ((nfs_automount_task).work){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0

stack backtrace:
[<c0104fda>] show_trace_log_lvl+0x1a/0x30
[<c0105c02>] show_trace+0x12/0x20
[<c0105d15>] dump_stack+0x15/0x20
[<c013ee42>] __lock_acquire+0xc22/0x1030
[<c013f2b1>] lock_acquire+0x61/0x80
[<c012edd9>] flush_workqueue+0x49/0x70
[<c012ee0d>] flush_scheduled_work+0xd/0x10
[<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
[<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
[<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
[<c017b38d>] deactivate_super+0x7d/0xa0
[<c018f94b>] mntput_no_expire+0x4b/0x80
[<c018fd94>] expire_mount_list+0xe4/0x140
[<c0191219>] mark_mounts_for_expiry+0x99/0xb0
[<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
[<c012e61b>] run_workqueue+0x12b/0x1e0
[<c012f05b>] worker_thread+0x9b/0x100
[<c0131c72>] kthread+0x42/0x70
[<c0104c0f>] kernel_thread_helper+0x7/0x18
=======================

thanks
Marc
--
"The enemy uses unauthorized weapons."
Lord Arthur Ponsonby, "Falsehood in Wartime: Propaganda Lies of the First
World War", 1928
-
There is new debugging stuff in -mm: deadlockable usage of workqueue
primitives will now trigger lockdep warnings. See
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1...
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1...

I am suspecting that running flush_scheduled_work() from within run_workqueue()
isn't good.
-
I'll have a look at this. I suspect that most if not all of our calls to
run_workqueue()/flush_scheduled_work() can now be replaced by more
targeted calls to cancel_work_sync() and cancel_delayed_work_sync().

Trond
-
Yes, please, if possible.
To avoid a possible confusion: it is still OK if work->func() flushes
its own workqueue, so strictly speaking this trace is false positive,
but it would be very nice if we can get rid of this practice.

Oleg.
-
All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
for the flush_workqueue()/flush_scheduled_work() calls was to ensure
that the cancel_work()/cancel_delayed_work() calls preceding them have
completed. Nevertheless I've split the conversion into two patches,
since one touches only the NFS code, whereas the other touches the
SUNRPC client and server code.

The two patches have been tested, and appear to work...
Trond
this looks unsafe to me, the window is very small, but afaics this can
deadlock if called when nfs4_renew_state() has already started, but didn't
take ->cl_sem yet.

Can't we avoid taking clp->cl_sem here?
Btw, unless I missed something, the code without this patch looks incorrect
too: cancel_delayed_work() can fail if the timer expired, but the ->cl_renewd
didn't run yet. In that case nfs4_renew_state() can run and re-schedule itself
after flush_scheduled_work() returns.

Oleg.
-
Not really. We have removed the nfs_client from the public lists, and we
are guaranteed that there are no more active superblocks attached to it
so nothing can call the reclaimer routine (which is the only routine
Yes, I believe that we can, for the same reasons as above: the race with
No, that should not be possible. Again, see above: there are no active
superblocks, so clp->cl_superblocks is empty.

Cheers
Trond
-
Thanks for your explanation. Not that I was able to understand, nfs is a
black magic to me :)

But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled
at the time when nfs4_kill_renewd(), we can deadlock, no? Because
nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
Yes, thanks. I missed "goto out" in nfs4_renew_state().
Oleg.
-
They both take read locks, which means that they can take them
simultaneously. AFAICS, the deadlock can only occur if something manages
to insert a request for a write lock after nfs4_kill_renewd() takes its
read lock, but before nfs4_renew_state() takes its read lock:1) nfs4_kill_renewd() 2) nfs4_renew_state() 3) somebody else
------------------- ------------------ -------------
read lock
wait on (2) to complete
write lock <waits on (1)>
read lock <waits on (3),
because rw_semaphores
don't allow a read lock
request to jump a write
lock request>however as I explained earlier, the only process that can take a write
lock is the reclaimer daemon, but we _know_ that cannot be running (for
one thing, the reference count on nfs_client is zero, for the other,
there are no superblocks).

Cheers
Trond
-
Oleg.
-
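The three-party scenario in Trond's table can be walked through with a toy state model. This is only a sketch under stated assumptions: `struct toy_rwsem` and both helpers are hypothetical, nothing here blocks or sleeps, and the real rw_semaphore implementation is considerably more involved. The one rule taken from the thread is that a read request may not jump ahead of a queued write request.

```c
#include <stdbool.h>

/* Toy model of a "fair" reader/writer semaphore. It only records state
 * and answers "would this request have to wait?". Hypothetical names. */
struct toy_rwsem {
	int  readers;         /* read locks currently held */
	bool writer_waiting;  /* a write request is queued */
};

/* A new reader has to wait if a writer is already queued ahead of it. */
bool toy_read_would_block(const struct toy_rwsem *sem)
{
	return sem->writer_waiting;
}

/* A writer has to wait while any reader still holds the lock. */
bool toy_write_would_block(const struct toy_rwsem *sem)
{
	return sem->readers > 0;
}
```

Setting `readers = 1` for (1), then queuing the writer for (3), makes `toy_read_would_block()` return true for (2), while (1) is itself waiting for (2) to finish. That is the cycle in the table, and it needs the writer in the middle to exist at all.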
I just had a thought: we could get rid of this warning by using a
read-lock here. That way, flushing from within a work function (which
would be seen as read-after-read recursive lock) won't trigger this
warning. Patch below. This would, however, also get rid of any warnings
for run_workqueue recursion. Which again we may or may not want, the
code indicates that it should be allowed up to a depth of three.

However, the question whether we should allow flush_workqueue from
within a struct work is mainly an API policy issue; it doesn't hurt to
flush a workqueue from within a work, but it is probably nearer the
intent to use targeted cancel_work_sync() or such. OTOH, one could
imagine situations where multiple different work structs are on that
workqueue belonging to the same subsystem and then the general
flush_scheduled_work() call is the only way to guarantee nothing is
scheduled at a given point... I don't feel qualified to make the
decision for or against allowing this use of the API at this point.

Marc, do you have an easy way to trigger this warning? Could you verify
that it goes away with the patch below applied?

johannes
---
kernel/workqueue.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- wireless-dev.orig/kernel/workqueue.c 2007-08-06 08:11:23.297846657 +0200
+++ wireless-dev/kernel/workqueue.c 2007-08-06 08:19:54.727846657 +0200
@@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor

BUG_ON(get_wq_data(work) != cwq);
work_clear_pending(work);
- lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+ lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
f(work);
lock_release(&lockdep_map, 1, _THIS_IP_);
@@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
int cpu;

might_sleep();
- lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+ lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THI...
Hi,
just booting into X is enough.
I applied the patch, but now I get:
=================================
[ INFO: inconsistent lock state ]
2.6.23-rc1-mm2 #4
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
(rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
{softirq-on-W} state was registered at:
[<c013e870>] __lock_acquire+0x650/0x1030
[<c013f2b1>] lock_acquire+0x61/0x80
[<c02db9ac>] _spin_lock+0x2c/0x40
[<c01dc487>] _atomic_dec_and_lock+0x17/0x60
[<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
[<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc]
[<dced3fd4>] a0 [sunrpc]
[<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc]
[<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc]
[<dced65b3>] svc_register+0x93/0x160 [sunrpc]
[<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc]
[<dced7053>] svc_create+0x13/0x20 [sunrpc]
[<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs]
[<dcf48f36>] nfs_get_client+0x176/0x390 [nfs]
[<dcf49181>] nfs4_set_client+0x31/0x190 [nfs]
[<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs]
[<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs]
[<c017b444>] vfs_kern_mount+0x94/0x110
[<c0190a62>] do_mount+0x1f2/0x7d0
[<c01910a6>] sys_mount+0x66/0xa0
[<c0104046>] syscall_call+0x7/0xb
[<ffffffff>] 0xffffffff
irq event stamp: 5277830
hardirqs last enabled at (5277830): [<c017530a>] kmem_cache_free+0x8a/0xc0
hardirqs last disabled at (5277829): [<c01752d2>] kmem_cache_free+0x52/0xc0
softirqs last enabled at (5277798): [<c0124173>] __do_softirq+0xa3/0xc0
softirqs last disabled at (5277817): [<c01241d7>] do_softirq+0x47/0x50

other info that might help us debug this:
no locks held by swapper/0.

stack backtrace:
[<c0104fda>] show_trace_log_lvl+0x1a/0x30
[<c0105c02>] show_...
That is a different matter. I assume this patch should suffice to fix
the above problem.

Trond
yes - it does.
thanks.
Marc
--
"Our cause has a sacred nature."
Lord Arthur Ponsonby, "Falsehood in Wartime: Propaganda Lies of the First
World War", 1928
-
Interesting, but doesn't seem related to this at all. As Oleg just
pointed out this basically disabled checking for workqueue stuff so this
should be looked into by somebody familiar with the NFS code.

johannes
I am not sure, but currently I hope we can forbid this eventually, so I
But this makes ->lockdep_map meaningless? We always take wq->lockdep_map
for reading, now we can't detect deadlocks.

read_lock(A);
lock(B);

vs

lock(B);
read_lock(A);

is valid, kernel/lockdep.c should not complain.
No?
Oleg.
-
Ah, hmm. Good point, I guess you can always have multiple read locks.
Then we'd have to make a new parameter or such to get rid of the
recursive locking try message. But if you want to deprecate the API
anyway then this is a good way to find it.

johannes
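The trade-off discussed in this sub-thread can be sketched with a deliberately tiny model of a same-class recursion check. It is an illustration only and not how kernel/lockdep.c is implemented: `struct toy_task`, `toy_acquire()` and the integer class ids are all hypothetical, and only the "already holding this class" rule is modelled (no lock ordering, no IRQ state).

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

struct toy_held {
	int  lock_class;  /* hypothetical class id */
	bool read;        /* acquired for reading? */
};

struct toy_task {
	struct toy_held held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition and return true if it would be reported as
 * "possible recursive locking": the same class is already held and the
 * pair is not read-after-read.
 */
bool toy_acquire(struct toy_task *t, int lock_class, bool read)
{
	bool report = false;
	size_t i;

	for (i = 0; i < t->depth; i++) {
		if (t->held[i].lock_class == lock_class &&
		    !(read && t->held[i].read))
			report = true;
	}
	if (t->depth < TOY_MAX_HELD) {
		t->held[t->depth].lock_class = lock_class;
		t->held[t->depth].read = read;
		t->depth++;
	}
	return report;
}
```

In this model, taking class 1 twice for writing is reported, while taking it twice for reading is not. That mirrors why switching the workqueue annotation to a read acquisition silences the warning, and also why, as Oleg notes, it stops the annotation from catching anything.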
From: Rafael J. Wysocki <rjw@sisk.pl>
My test box crashes during suspend, while the nonboot CPUs are being disabled,
because sysfs_hash_and_remove() doesn't check if dir_sd passed to it is not
NULL. Fix it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
fs/sysfs/inode.c | 2 ++
1 file changed, 2 insertions(+)

Index: linux-2.6.23-rc1-mm2/fs/sysfs/inode.c
===================================================================
--- linux-2.6.23-rc1-mm2.orig/fs/sysfs/inode.c
+++ linux-2.6.23-rc1-mm2/fs/sysfs/inode.c
@@ -191,6 +191,8 @@ int sysfs_hash_and_remove(struct kobject
struct sysfs_addrm_cxt acxt;
struct sysfs_dirent **pos, *sd;

+ if (!dir_sd)
+ return -ENOENT;
sysfs_addrm_start(&acxt, dir_sd);
if (!sysfs_resolve_for_remove(kobj, &acxt.parent_sd))
goto addrm_finish;
-
It got broken when shadow support was added. The shadow support in -mm1
will be dropped and Eric is preparing a new version. So, this fix
probably won't be necessary from -mm2.

Thanks.
--
tejun
-
Agreed. That check is in my current development tree.
Eric
-
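The sysfs fix discussed above is an early-return guard: bail out with -ENOENT before touching a directory pointer that may be NULL. A minimal userspace sketch of the same shape follows; `struct toy_dir` and `toy_remove_entry()` are hypothetical stand-ins and do not reflect the real sysfs data structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical directory object, standing in for a sysfs dirent. */
struct toy_dir {
	int nr_entries;
};

/* Remove one entry; bail out early when the directory is already gone. */
int toy_remove_entry(struct toy_dir *dir)
{
	if (!dir)
		return -ENOENT;  /* nothing to remove from */
	if (dir->nr_entries == 0)
		return -ENOENT;  /* directory exists but is empty */
	dir->nr_entries--;
	return 0;
}
```

Callers that may race with teardown (here, CPU hot-unplug during suspend) then get an error code instead of a NULL dereference.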
This patch fixes the following section mismatch warnings for sound/pci/hda/*
...
WARNING: vmlinux.o(.text+0x28d5f7): Section mismatch: reference to .init.text:snd_hda_add_new_ctls (between 'alc_build_controls' and 'alc662_auto_set_output_and_unmute')
WARNING: vmlinux.o(.text+0x28d621): Section mismatch: reference to .init.text:snd_hda_create_spdif_in_ctls (between 'alc_build_controls' and 'alc662_auto_set_output_and_unmute')
WARNING: vmlinux.o(.text+0x28d63d): Section mismatch: reference to .init.text:snd_hda_create_spdif_out_ctls (between 'alc_build_controls' and 'alc662_auto_set_output_and_unmute')
WARNING: vmlinux.o(.text+0x2904ca): Section mismatch: reference to .init.text:snd_hda_parse_pin_def_config (between 'alc880_parse_auto_config' and 'alc882_gpio_mute')
WARNING: vmlinux.o(.text+0x290e5e): Section mismatch: reference to .init.text:snd_hda_check_board_config (between 'patch_alc268' and 'patch_alc662')
WARNING: vmlinux.o(.text+0x290e7c): Section mismatch: reference to .init.text:snd_hda_parse_pin_def_config (between 'patch_alc268' and 'patch_alc662')
WARNING: vmlinux.o(.text+0x291248): Section mismatch: reference to .init.text:snd_hda_check_board_config (between 'patch_alc662' and 'alc_mux_enum_info')
WARNING: vmlinux.o(.text+0x291315): Section mismatch: reference to .init.text:snd_hda_parse_pin_def_config (between 'patch_alc662' and 'alc_mux_enum_info')
WARNING: vmlinux.o(.text+0x2919ca): Section mismatch: reference to .init.text:snd_hda_check_board_config (between 'patch_alc880' and 'patch_alc260')
WARNING: vmlinux.o(.text+0x291b96): Section mismatch: reference to .init.text:snd_hda_check_board_config (between 'patch_alc260' and 'patch_alc882')
WARNING: vmlinux.o(.text+0x291c2c): Section mismatch: reference to .init.text:snd_hda_parse_pin_def_config (between 'patch_alc260' and 'patch_alc882')
WARNING: vmlinux.o(.text+0x292010): Section mismatch: reference to .init.text:snd_hda_check_board_config (between 'patch_alc882' and 'patch_alc883')
WARNING: vmlinux.o(.text+0x2922da): Section mis...
At Thu, 02 Aug 2007 15:11:59 +0200,
Hold on this. I have other changes in my tree with removal of
__devinit in relevant places.

Takashi
-
Can I ask you to test-compile with and without HOTPLUG before submitting?
This catches most cases of section mismatch.

Sam
-
At Thu, 2 Aug 2007 18:32:26 +0200,
Yep, checked that.
thanks,
Takashi
-
Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
---
--- linux-2.6.23-rc1-mm/MAINTAINERS.orig 2007-08-02 01:51:40.000000000 +0200
+++ linux-2.6.23-rc1-mm/MAINTAINERS 2007-08-02 01:52:17.000000000 +0200
@@ -672,7 +672,7 @@ S: Maintained
AUDIT SUBSYSTEM
P: David Woodhouse
M: dwmw2@infradead.org
-L: linux-audit@redhat.com
+L: linux-audit@redhat.com (subscribers-only)
W: http://people.redhat.com/sgrubb/audit/
T: git kernel.org:/pub/scm/linux/kernel/git/dwmw2/audit-2.6.git
S: Maintained
-
Ack, I sent this patch last week...
---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
...
kernel/auditsc.c: In function 'handle_one':
kernel/auditsc.c:1411: error: 'const struct inode' has no member named 'inotify_watches'
kernel/auditsc.c:1411: error: 'const struct inode' has no member named 'inotify_watches'
kernel/auditsc.c:1411: error: 'const struct inode' has no member named 'inotify_watches'
kernel/auditsc.c: In function 'handle_path':
kernel/auditsc.c:1452: error: 'struct inode' has no member named 'inotify_watches'
kernel/auditsc.c:1452: error: 'struct inode' has no member named 'inotify_watches'
kernel/auditsc.c:1452: error: 'struct inode' has no member named 'inotify_watches'
make[1]: *** [kernel/auditsc.o] Error 1
make: *** [kernel] Error 2
make: *** Waiting for unfinished jobs....
...
Got that with a randconfig ( http://194.231.229.228/MM/randconfig-auto-34 )
Regards,
Gabriel
-
Fixes one issue I had in -mm1 (I'm assuming somebody else spotting this one
And the other...
As an aside, it looks like bits&pieces of dynticks-for-x86_64 are in there.
In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in
there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing
in the x86_64 tree actually *sets* it. There's a few other dynticks-related
prep patches in there as well. Does this mean it's back to "coming soon to
a CPU near you" status? :)
On Wed, 01 Aug 2007 16:30:08 -0400
I've lost the plot on that stuff: I'm just leaving things as-is for now,
wait for Thomas to return from vacation so we can have another run at it.
-
For what it's worth: 2.6.22-rc6-mm1 with NO_HZ works for me on an AMD
SMP system without trouble.

Next try with 2.6.23-rc1-mm2 and SPARSEMEM:
Probably the same exception, but this time with Call Trace:
[ 0.000000] Bootmem setup node 0 0000000000000000-0000000080000000
[ 0.000000] Bootmem setup node 1 0000000080000000-0000000120000000
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0 -> 4096
[ 0.000000] DMA32 4096 -> 1048576
[ 0.000000] Normal 1048576 -> 1179648
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[4] active PFN ranges
[ 0.000000] 0: 0 -> 159
[ 0.000000] 0: 256 -> 524288
[ 0.000000] 1: 524288 -> 917488
[ 0.000000] 1: 1048576 -> 1179648
PANIC: early exception rip ffffffff807cddb5 error 2 cr2 ffffe20003000010
[ 0.000000]
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff807cddb5>] memmap_init_zone+0xb5/0x130
[ 0.000000] [<ffffffff807ce874>] init_currently_empty_zone+0x84/0x110
[ 0.000000] [<ffffffff807cec93>] free_area_init_node+0x393/0x3e0
[ 0.000000] [<ffffffff807cefea>] free_area_init_nodes+0x2da/0x320
[ 0.000000] [<ffffffff807c9c97>] paging_init+0x87/0x90
[ 0.000000] [<ffffffff807c0f85>] setup_arch+0x355/0x470
[ 0.000000] [<ffffffff807bc967>] start_kernel+0x57/0x330
[ 0.000000] [<ffffffff807bc12d>] _sinittext+0x12d/0x140
[ 0.000000]
[ 0.000000] RIP memmap_init_zone+0xb5/0x130

(gdb) list *0xffffffff807cddb5
0xffffffff807cddb5 is in memmap_init_zone (include/linux/list.h:32).
27 #define LIST_HEAD(name) \
28 struct list_head name = LIST_HEAD_INIT(name)
29
30 static inline void INIT_LIST_HEAD(struct list_head *list)
31 {
32 list->next = list;
33 list->prev = list;
34 }
35
36 /*

I will test more tomorrow...
Torsten
-
Well.... That doesn't make a whole pile of sense unless the memory map
Node 1 spans a region with a nice little hole in the middle of DMA32. In our
test machines, we wouldn't see a hole like this, at least that I can recall
so it would appear to work on some machines. On SPARSEMEM, sparse_init()
is responsible for allocating memmap for each section. In 2.6.22-rc6-mm1,
it allocated the memory if the section was *valid*. In 2.6.23-rc1-mm1,
it allocates the memory if the section is *present* due to the patch
sparsemem-record-when-a-section-has-a-valid-mem_map.patch[1]. Much later in
the init process, memmap is initialised based on spanned memory, not present
memory so initialisation will init memmap that resides in holes if a zone
spans that area in a node which is the case on this machine. I think this
is why it kablamos - it inits memmap that wasn't allocated because it's
not present and the surprise is that it doesn't blow up sooner. Please try
the patch below Torsten, thanks.

[1] yeah, I acked this patch and I had read through it. My bad if the
patch below does fix the problem

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc1-mm2-clean/mm/sparse.c linux-2.6.23-rc1-mm2-present_revert/mm/sparse.c
--- linux-2.6.23-rc1-mm2-clean/mm/sparse.c 2007-08-01 10:09:39.000000000 +0100
+++ linux-2.6.23-rc1-mm2-present_revert/mm/sparse.c 2007-08-02 00:27:00.000000000 +0100
@@ -483,7 +483,7 @@ void __init sparse_init(void)
unsigned long *usemap;

for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
- if (!present_section_nr(pnum))
+ if (!valid_section_nr(pnum))
continue;

map = sparse_early_mem_map_alloc(pnum);
-
This implies that &page->lru is invalid. Which implies that the memory
map is indeed not present. However, if we look at the code in detail we
have actually already updated several fields in the struct page.
Particularly we have already updated the flags, _count, and _mapcount.
It is when we touch lru that we blammo. All of the good entries are in
the first 24 bytes of the struct page, lru is in the 8th 64bit word, or
+64 bytes. Looking at the faulting address it is ffffe20003000010, ie
the fault is 16 bytes into a page. So the first three elements of this
struct page are in one PMD mapped page, and the lru the next.

As this has SPARSEMEM_VMEMMAP enabled that implies that the vmemmap has
not been filled out correctly. Looking at the x86_64 initialiser it
appears that we have the same bug that Kame-san reported against the
generic initialisers. At the end of this email is a proposed patch for
this, could you apply that to a clean 2.6.23-rc1-mm2 tree and give it
a test for me. I have boot tested this on our x86_64 boxes, but they
happen to be sized and laid out to not trip this bug.

Let me know if it fixes things up for you and I will push it upstream.
If this patch does not fix it could you please get us a boot log at
loglevel=8 of an unmodified 2.6.23-rc1-mm2 kernel, this should give
[...]

-apw
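The "16 bytes into a page" observation above is plain mask arithmetic on the faulting address from the panic line. A small sketch, assuming 4 KiB base pages and 2 MiB PMD mappings (both sizes are assumptions stated here, not taken from the log); `toy_offset_in()` is a hypothetical helper.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE (1ULL << 12)  /* 4 KiB base page (assumed) */
#define TOY_PMD_SIZE  (1ULL << 21)  /* 2 MiB PMD mapping (assumed) */

/* Offset of addr within a naturally aligned, power-of-two sized region. */
uint64_t toy_offset_in(uint64_t addr, uint64_t region_size)
{
	return addr & (region_size - 1);
}
```

For cr2 = 0xffffe20003000010 this gives 0x10 for both region sizes, so under these assumptions the faulting store lands 16 bytes past a 2 MiB boundary, which fits a struct page that straddles two PMD-mapped pages.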
=== 8< ===
vmemmap x86_64: ensure end of section memmap is initialised

Similar to the generic initialisers, the x86_64 vmemmap
initialisation may incorrectly skip the last page of a section if
the section start is not aligned to the page.

Where we have a section spanning the end of a PMD we will check the
start of the section at A populating it. We will then move on 1
PMD page to C and find ourselves beyond the end of the section which
ends at B we will complete without checking the second PMD page.

|             PMD             |             PMD             |
              |         SECTION         |
              A                         B   C

We should round ourselves to the end of the PMD...
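The skipped-PMD problem described in this changelog can be reproduced with two loops over a plain address range. This is a sketch under assumptions, not the actual patch (which is cut off above): the 2 MiB PMD size and both function names are hypothetical, and the functions only count how many PMD-sized pages a walk would visit.

```c
#include <stdint.h>

#define TOY_PMD_SIZE (1ULL << 21)  /* 2 MiB, assumed PMD mapping size */

/* Step by a fixed PMD size from an unaligned start: can stop too early. */
unsigned int toy_pmds_fixed_step(uint64_t start, uint64_t end)
{
	unsigned int visited = 0;
	uint64_t addr;

	for (addr = start; addr < end; addr += TOY_PMD_SIZE)
		visited++;
	return visited;
}

/* Step to the next PMD boundary instead, so the tail PMD is visited. */
unsigned int toy_pmds_rounded_step(uint64_t start, uint64_t end)
{
	unsigned int visited = 0;
	uint64_t addr = start;

	while (addr < end) {
		visited++;
		addr = (addr & ~(TOY_PMD_SIZE - 1)) + TOY_PMD_SIZE;
	}
	return visited;
}
```

With a section starting half-way into one PMD and ending a quarter of the way into the next, the fixed-step walk visits one PMD and the rounded walk visits two; for an aligned range both agree.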
That patch applied to 2.6.23-rc1-mm2 boots.
But I still get the MP-BIOS bug, now with an additional Call Trace:
[ 27.034907] ACPI: Core revision 20070126
[ 27.082090] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 27.132617] WARNING: at kernel/irq/resend.c:69 check_irq_resend()
[ 27.150837]
[ 27.150837] Call Trace:
[ 27.162621] [<ffffffff80261c4c>] check_irq_resend+0xbc/0xd0
[ 27.179558] [<ffffffff802617c0>] enable_irq+0xf0/0x100
[ 27.195177] [<ffffffff807c6984>] setup_IO_APIC+0x6c4/0x9a0
[ 27.211833] [<ffffffff80234e74>] set_cpus_allowed+0x64/0xc0
[ 27.228749] [<ffffffff807c4e14>] smp_prepare_cpus+0x434/0x460
[ 27.246183] [<ffffffff807bc627>] kernel_init+0x67/0x350
[ 27.262062] [<ffffffff8020cac8>] child_rip+0xa/0x12
[ 27.276928] [<ffffffff803d4f80>] acpi_ds_init_one_object+0x0/0x7c
[ 27.295425] [<ffffffff807bc5c0>] kernel_init+0x0/0x350
[ 27.311043] [<ffffffff8020cabe>] child_rip+0x0/0x12
[ 27.325881]
[ 27.463199] Using local APIC timer interrupts.
[ 27.514874] result 12500129
[ 27.523240] Detected 12.500 MHz APIC timer.

It no longer seems to matter if it was a warm or cold start.
Otherwise the system seems to be working normally.
Torsten
-
Complete bootlog, if you need more info about the memmaps...
[ 0.000000] Linux version 2.6.23-rc1-mm2 (root@treogen) (gcc
version 4.2.1 (Gentoo 4.2.1 p1.4)) #1 SMP Wed Aug 1 21:56:36 CEST 2007
[ 0.000000] Command line: earlyprintk=serial,ttyS0,38400
console=ttyS0,38400 console=tty1 crypt_root=/dev/md1
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 00000000dfff0000 (usable)
[ 0.000000] BIOS-e820: 00000000dfff0000 - 00000000dfffe000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000dfffe000 - 00000000e0000000 (ACPI NVS)
[ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
[ 0.000000] BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
[ 0.000000] BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
[ 0.000000] console [earlyser0] enabled
[ 0.000000] end_pfn_map = 1179648
kernel direct mapping tables up to 120000000 @ 8000-e000
[ 0.000000] DMI present.
[ 0.000000] ACPI: RSDP 000FB5E0, 0014 (r0 ACPIAM)
[ 0.000000] ACPI: RSDT DFFF0000, 003C (r1 A M I OEMRSDT 6000626
MSFT 97)
[ 0.000000] ACPI: FACP DFFF0200, 0084 (r2 A M I OEMFACP 6000626
MSFT 97)
[ 0.000000] ACPI: DSDT DFFF0450, 48E1 (r1 S0027 S0027000 0
INTL 20051117)
[ 0.000000] ACPI: FACS DFFFE000, 0040
[ 0.000000] ACPI: APIC DFFF0390, 0080 (r1 A M I OEMAPIC 6000626
MSFT 97)
[ 0.000000] ACPI: MCFG DFFF0410, 003C (r1 A M I OEMMCFG 6000626
MSFT 97)
[ 0.000000] ACPI: OEMB DFFFE040, 0060 (r1 A M I AMI_OEM 6000626
MSFT 97)
[ 0.000000] ACPI: SRAT DFFF4D40, 0110 (r1 AMD HAMMER 1
AMD 1)
[ 0.000000] ACPI: SSDT DFFF4E50, 04F0 (r1 A M I ACPI2PPC ...
On Wed, 1 Aug 2007 22:52:44 +0200
Thanks. Please send the .config?
-
Alan,
this does not work after a suspend-resume cycle, I get a "ACPI get
timing mode failed (AE 0x1001)" error.

$ dmesg | grep ata
...
scsi0 : pata_via
scsi1 : pata_via
ata1: PATA max UDMA/100 cmd 0x000101f0 ctl 0x000103f6 bmdma
0x0001b800 irq 14
ata2: PATA max UDMA/100 cmd 0x00010170 ctl 0x00010376 bmdma
0x0001b808 irq 15
ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
ata1.00: 78165360 sectors, multi 16: LBA
ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
ata1.01: 160086528 sectors, multi 16: LBA
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/100
ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL03, max UDMA/33
ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
ata2.00: configured for UDMA/33
ata2.01: configured for MWDMA2
ata1.00: Unable to set Link PM policy
ata1.01: Unable to set Link PM policy
ata2.00: Unable to set Link PM policy
ata2.01: Unable to set Link PM policy
...
[ suspend-to-disk/resume cycle happens here ]
...
ata1.00: Unable to set Link PM policy
ata1.01: Unable to set Link PM policy
ata2.00: Unable to set Link PM policy
ata2.01: Unable to set Link PM policy
ata1: ACPI get timing mode failed (AE 0x1001) <==========
ata1.00: limited to UDMA/33 due to 40-wire cable
ata1.01: limited to UDMA/33 due to 40-wire cable
ata1.00: configured for UDMA/33
ata1.01: configured for UDMA/33
ata2: ACPI get timing mode failed (AE 0x1001)
ata2.00: configured for UDMA/33
ata2.01: configured for MWDMA2

Anyway, long before 2.6.23-rc1-mm2, 80-wire cable detection was
already wrong after a suspend-resume cycle. So I cooked the
following patch 2 days ago.

It may be the wrong approach but it works for me.
--
pata_via: preserve cable detection bits in via_do_set_mode

via_cable_detect performs cable detection by checking bits in PCI
layer. But via_do_set_mode overwrites these bits. This behaviour
breaks cable detection after suspend/resume cycle.

So let's teach via_do_set_mode to preserve cable detection bits.
Signed-off-by: Laur...
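The changelog above describes a read-modify-write that must leave some register bits alone. The patch body is not shown here, so the following is only a generic sketch of that technique; `toy_merge_preserving()` is a hypothetical helper, and the mask is a parameter because the actual cable detection bit positions are not given in this thread.

```c
#include <stdint.h>

/*
 * Build the value to write back to a register so that the bits selected
 * by preserve_mask keep their current contents and all other bits take
 * the new setting. The real bit layout is not assumed here.
 */
uint32_t toy_merge_preserving(uint32_t current, uint32_t wanted,
			      uint32_t preserve_mask)
{
	return (current & preserve_mask) | (wanted & ~preserve_mask);
}
```

A mode-setting routine would read the register, pass the old value and the new timing value through such a merge, and write the result, so that whatever was latched in the preserved bits survives a later re-read after resume.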
...
WARNING: vmlinux.o(.text+0x8b9f): Section mismatch: reference to .init.text:cache_remove_shared_cpu_map (between 'cpuid4_cache_sysfs_exit' and 'unexpected_machine_check')
...
Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
---
--- linux-2.6.23-rc1-mm/arch/i386/kernel/cpu/intel_cacheinfo.c.orig 2007-08-01 17:10:27.000000000 +0200
+++ linux-2.6.23-rc1-mm/arch/i386/kernel/cpu/intel_cacheinfo.c 2007-08-01 17:11:37.000000000 +0200
@@ -681,7 +681,7 @@ static struct kobj_type ktype_percpu_ent
.sysfs_ops = &sysfs_ops,
};

-static void cpuid4_cache_sysfs_exit(unsigned int cpu)
+static void __cpuinit cpuid4_cache_sysfs_exit(unsigned int cpu)
{
kfree(cache_kobject[cpu]);
kfree(index_kobject[cpu]);
-
Hi,
move_msr_up() is used only on X86_64 and generates a warning on !X86_64
...
drivers/kvm/vmx.c:548: warning: 'move_msr_up' defined but not used
...
Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
---
PS: Btw Avi why do you think I'm mysterious ?:)
...git-kvm.patch: Noted by the mysterious Gabriel C.
...
--- linux-2.6.23-rc1-mm/drivers/kvm/vmx.c.orig 2007-08-01 15:56:41.000000000 +0200
+++ linux-2.6.23-rc1-mm/drivers/kvm/vmx.c 2007-08-01 15:58:24.000000000 +0200
@@ -544,6 +544,7 @@ static void vmx_inject_gp(struct kvm_vcp
/*
* Swap MSR entry in host/guest MSR entry array.
*/
+#ifdef CONFIG_X86_64
static void move_msr_up(struct vcpu_vmx *vmx, int from, int to)
{
struct kvm_msr_entry tmp;
@@ -555,6 +556,7 @@ static void move_msr_up(struct vcpu_vmx
vmx->host_msrs[to] = vmx->host_msrs[from];
vmx->host_msrs[from] = tmp;
}
+#endif

/*
* Set up the vmcs to automatically save and restore system
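The pattern used by the patch above, compiling a static helper only when its sole caller is also compiled, can be sketched outside the kernel as follows. The names `struct msr_entry`, `HAVE_MSR_REORDER` and `swap_msr_slots()` are hypothetical stand-ins; the real patch keys off CONFIG_X86_64 and operates on the vcpu's own arrays.

```c
#include <stddef.h>

/* Hypothetical stand-in for an MSR index/value pair. */
struct msr_entry {
	unsigned int index;
	unsigned long long data;
};

/* Hypothetical switch; the patch above uses CONFIG_X86_64 instead. */
#define HAVE_MSR_REORDER 1

#if HAVE_MSR_REORDER
/* Swap two slots in both arrays so guest and host entries stay aligned. */
static void swap_msr_slots(struct msr_entry *guest, struct msr_entry *host,
			   size_t from, size_t to)
{
	struct msr_entry tmp;

	tmp = guest[to];
	guest[to] = guest[from];
	guest[from] = tmp;

	tmp = host[to];
	host[to] = host[from];
	host[from] = tmp;
}
#endif
```

When the switch is off, the function is not compiled at all, so the compiler has nothing to flag as "defined but not used".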
-
Well, the C surname (maybe you're related to the programming language?)
and the email address.

--
error compiling committee.c: too many arguments to function
-
Getting this with a randconfig ( http://194.231.229.228/MM/randconfig-auto-10 )
...
drivers/scsi/advansys.c:794:2: warning: #warning this driver is still not properly converted to the DMA API
drivers/scsi/advansys.c: In function 'advansys_board_found':
drivers/scsi/advansys.c:17781: error: implicit declaration of function 'to_pci_dev'
drivers/scsi/advansys.c:17781: warning: pointer/integer type mismatch in conditional expression
drivers/scsi/advansys.c:17788: warning: unused variable 'pci_memory_address'
drivers/scsi/advansys.c:17781: warning: unused variable 'pdev'
make[2]: *** [drivers/scsi/advansys.o] Error 1
make[1]: *** [drivers/scsi] Error 2
make[1]: *** Waiting for unfinished jobs.......
-
Hi Gabriel,
Hope the following trivial patch helps.
[...]
This patch fixes the following compile error:
drivers/scsi/advansys.c: In function 'advansys_board_found':
drivers/scsi/advansys.c:17781: error: implicit declaration of function
'to_pci_dev'

Signed-off-by: Eugene Teo <eugeneteo@kernel.sg>
---
drivers/scsi/advansys.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c
index 79c0b6e..908f02b 100644
--- a/drivers/scsi/advansys.c
+++ b/drivers/scsi/advansys.c
@@ -774,6 +774,7 @@
#include <linux/stat.h>
#include <linux/spinlock.h>
#include <linux/dma-mapping.h>
+#include <linux/pci.h>

#include <asm/io.h>
#include <asm/system.h>
-
Or just remove the ifdefs around the include ... which is done in this
patch:

http://www.kernel.org/pub/linux/kernel/people/willy/advansys-2007-07-30/...
I'd be interested in seeing the results of the randconfig trials on the
driver with those 23 patches applied, but not particularly interested in
the intermediate result.

--
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
-
I can do that on weekend.
Are the patches meant for -mm or git head ?
-
They were developed against git head as of a few days ago. Here's a git
tree, if that's easier for you:

http://git.kernel.org/?p=linux/kernel/git/willy/advansys.git;a=shortlog
--
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
-
Ach nice, yes, it is a lot easier to work with git.
-
-
From: Heiko Carstens <heiko.carstens@de.ibm.com>
The slow-down-printk-during-boot patch depends on preset_lpj being
available. That's not the case for architectures that have their own
calibrate_delay() function.

kernel/sched.c:3840: undefined reference to `preset_lpj'
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Dave Jones <davej@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
lib/Kconfig.debug | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.22/lib/Kconfig.debug
===================================================================
--- linux-2.6.22.orig/lib/Kconfig.debug
+++ linux-2.6.22/lib/Kconfig.debug
@@ -436,7 +436,7 @@ config FORCED_INLINING

config BOOT_PRINTK_DELAY
bool "Delay each boot printk message by N milliseconds"
- depends on DEBUG_KERNEL && PRINTK
+ depends on DEBUG_KERNEL && PRINTK && GENERIC_CALIBRATE_DELAY
help
This build option allows you to read kernel boot messages
by inserting a short delay after each one. The delay is
-
---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
Andrew Morton wrote:
...allmodconfig on UML
...
In file included from drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c:48:
drivers/net/wireless/bcm43xx-mac80211/bcm43xx_pio.h: In function 'bcm43xx_pio_write':
drivers/net/wireless/bcm43xx-mac80211/bcm43xx_pio.h:97: error: implicit declaration of function 'mmiowb'
drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c: In function 'bcm43xx_init':
drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c:4051: warning: label 'err_dfs_exit' defined but not used
make[4]: *** [drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.o] Error 1
make[3]: *** [drivers/net/wireless/bcm43xx-mac80211] Error 2
make[2]: *** [drivers/net/wireless] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2
make: *** Waiting for unfinished jobs.......
Gabriel
-
(cc linux-wireless)
Probably Kconfig troubles again.
-
Maybe something from SSB ?
...
scripts/kconfig/conf -s arch/um/Kconfig
net/bluetooth/hidp/Kconfig:4:warning: 'select' used by config symbol 'BT_HIDP' refers to undefined symbol 'HID'
drivers/net/Kconfig:1456:warning: 'select' used by config symbol 'B44_PCI' refers to undefined symbol 'SSB_PCIHOST'
drivers/net/Kconfig:1457:warning: 'select' used by config symbol 'B44_PCI' refers to undefined symbol 'SSB_DRIVER_PCICORE'
drivers/net/Kconfig:1437:warning: 'select' used by config symbol 'B44' refers to undefined symbol 'SSB'
drivers/net/Kconfig:2112:warning: 'select' used by config symbol 'R8169' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:552:warning: 'select' used by config symbol 'RTL8187' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:637:warning: 'select' used by config symbol 'RT2X00_LIB_RFKILL' refers to undefined symbol 'INPUT_POLLDEV'
drivers/net/wireless/Kconfig:643:warning: 'select' used by config symbol 'RT2400PCI' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:662:warning: 'select' used by config symbol 'RT2500PCI' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:682:warning: 'select' used by config symbol 'RT61PCI' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:14:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCI' refers to undefined symbol 'SSB_PCIHOST'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:15:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCI' refers to undefined symbol 'SSB_DRIVER_PCICORE'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:28:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCMCIA' refers to undefined symbol 'SSB_PCMCIAHOST'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:48:warning: 'select' used by config symbol 'BCM43XX_MAC80211_DEBUG' refers to undefined symbol 'SSB_DEBUG'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:5:warning: 'select' used by config symbol 'BCM43XX_MAC80211' refers to un...
....
fs/unionfs/file.c:147: error: 'file_fsync' undeclared here (not in a function)
make[2]: *** [fs/unionfs/file.o] Error 1
make[1]: *** [fs/unionfs] Error 2
make: *** [fs] Error 2
make: *** Waiting for unfinished jobs.......
Config can be found there -> http://194.231.229.228/MM/config-auto-3
Regards,
Gabriel
-
This, I assume:
--- a/fs/unionfs/file.c~git-unionfs-fix-2
+++ a/fs/unionfs/file.c
@@ -17,6 +17,7 @@
*/

#include "union.h"
+#include <linux/buffer_head.h>

/*******************
* File Operations *
_

(and no, sorry, I will not be complicit in that
single-header-file-which-includes-the-whole-world junk).
-
-
Ouch. I had a fix for this, and it managed to get lost in the pile of
patches.

I'll fix it up and push fix to kernel.org.
Jeff.
--
I abhor a system designed for the "user", if that word is a coded pejorative
meaning "stupid and unsophisticated."
- Ken Thompson
-
Jeff, make sure you push my fix which will work even if CONFIG_BLOCK=n is
Erez.
-
From: Heiko Carstens <heiko.carstens@de.ibm.com>
drivers/ssb/Kconfig already has a depends on HAS_IOMEM which should
prevent SSB from being selected. But apparently this
doesn't matter at all if it gets selected from somewhere else.
So add an explicit depends on HAS_IOMEM to the Broadcom driver to
prevent selection on s390.

Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
drivers/net/Kconfig | 1 +
1 files changed, 1 insertion(+)

Index: linux-2.6.22/drivers/net/Kconfig
===================================================================
--- linux-2.6.22.orig/drivers/net/Kconfig
+++ linux-2.6.22/drivers/net/Kconfig
@@ -1434,6 +1434,7 @@ config APRICOT

config B44
tristate "Broadcom 440x/47xx ethernet support"
+ depends on HAS_IOMEM
select SSB
select MII
help
-
By the way.. wouldn't something like depends on NET_PCI or something
similar be more correct for this driver? Just wondering...
-
No, B44 does not depend on PCI. It does depend on the SSB bus.
(Of course the SSB PCI parts do depend on PCI)
-
Note to reviewers: this is only relevant to -mm and wireless-dev at
the moment, AFAIK...

John
--
John W. Linville
linville@tuxdriver.com
-
Hello,
I get this warning. Looking at the comment in kernel/irq/resend.c
it's harmless. Is it?

WARNING: at kernel/irq/resend.c:69 check_irq_resend()
[<c010456a>] show_trace_log_lvl+0x1a/0x30
[<c010508d>] show_trace+0x12/0x14
[<c01051e0>] dump_stack+0x15/0x17
[<c013b001>] check_irq_resend+0x91/0xa0
[<c013ab58>] enable_irq+0xb1/0xb3
[<c02d7b81>] ide_config_drive_speed+0xda/0x269
[<c02d391c>] ali15x3_tune_chipset+0xcc/0x161
[<c02dd367>] ide_tune_dma+0x43/0x4d
[<c02d29e0>] ali15x3_config_drive_for_dma+0xf/0x2a
[<c02dc89a>] ide_set_dma+0x11/0x40
[<c02d41cf>] set_using_dma+0x84/0xd1
[<c02d4dfe>] generic_ide_ioctl+0xb9/0x3fb
[<c02df7e4>] idedisk_ioctl+0x3f/0x10c
[<c023d779>] blkdev_driver_ioctl+0x55/0x5e
[<c023da3e>] blkdev_ioctl+0x2bc/0x83e
[<c017e5ca>] block_ioctl+0x1b/0x21
[<c01664c2>] do_ioctl+0x22/0x71
[<c0166566>] vfs_ioctl+0x55/0x28a
[<c01667ce>] sys_ioctl+0x33/0x51
[<c0103f32>] sysenter_past_esp+0x5f/0x85
=======================

Then reattaching a usb mouse caused this (only once)
usb 2-1: USB disconnect, address 2
BUG: atomic counter underflow at:
[<c010456a>] show_trace_log_lvl+0x1a/0x30
[<c010508d>] show_trace+0x12/0x14
[<c01051e0>] dump_stack+0x15/0x17
[<c01418cf>] __free_pages+0x50/0x52
[<c01418f0>] free_pages+0x1f/0x21
[<c010783d>] dma_free_coherent+0x43/0x9c
[<c0315067>] hcd_buffer_free+0x43/0x6a
[<c030b2b4>] usb_buffer_free+0x23/0x29
[<c0346db4>] hid_free_buffers+0x23/0x71
[<c0346eb2>] hid_disconnect+0xb0/0xc8
[<c0313676>] usb_unbind_interface+0x30/0x72
[<c02c6df0>] __device_release_driver+0x6a/0x92
[<c02c71c3>] device_release_driver+0x20/0x36
[<c02c6736>] bus_remove_device+0x62/0x85
[<c02c49f8>] device_del+0x16d/0x27c
[<c0310f25>] usb_disable_device+0x7a/0xe2
[<c030d0bc>] usb_disconnect+0x94/0xde...
Please send me the full output of:
gcc --version (or whatever your gcc is called)
ld --version
ld --help (I know no better way to get the supported binutils
targets, and the default target)

and the lparmap.s file. You might want to skip sending it
to the lists, it will be a bit big (and off-topic on most
of those lists, anyway).

Segher
-
Well ... it's 66kB. Not that bad. Please find it attached.
Needed gcc and ld info below.

Regards,
Mariusz
$ gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ld --version
GNU ld version 2.17 Debian GNU/Linux
Copyright 2005 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License. This program has absolutely no warranty.

$ ld --help
Usage: ld [options] file...
Options:
-a KEYWORD Shared library control for HP/UX compatibility
-A ARCH, --architecture ARCH
Set architecture
-b TARGET, --format TARGET Specify target for following input files
-c FILE, --mri-script FILE Read MRI format linker script
-d, -dc, -dp Force common symbols to be defined
-e ADDRESS, --entry ADDRESS Set start address
-E, --export-dynamic Export all dynamic symbols
-EB Link big-endian objects
-EL Link little-endian objects
-f SHLIB, --auxiliary SHLIB Auxiliary filter for shared object symbol table
-F SHLIB, --filter SHLIB Filter for shared object symbol table
-g Ignored
-G SIZE, --gpsize SIZE Small data size (if no size, same as --shared)
-h FILENAME, -soname FILENAME
Set internal name of shared library
-I PROGRAM, --dynamic-linker PROGRAM
Set PROGRAM as the dynamic linker to use
-l LIBNAME, --library LIBNAME
Search for library LIBNAME
-L DIRECTORY, --library-path DIRECTORY
Add DIRECTORY to library search path
--sysroot=<DIRECTORY> Override the default sysroot location
-m EMULAT...
Thanks.
It seems like things go wrong when lparmap.s is generated with
(DWARF) debug info; could you try building it (manually) with -g0
added on the end of the compile line, and see if head_64.o compiles
okay for you then? If so, I'll prepare a proper patch for it, I
have a similar one (also for lparmap!) in my queue already...

Segher
-
Ok it worked. I had to add -g0 to Makefile under arch/powerpc/kernel because
-g0 was added before -g and didn't have any effect when adding to Makefile
in top dir. But yes - it compiles now.

Thanks,
Mariusz
-
Great, I'll combine it with my other lparmap build patch then.
Thanks for the report and testing!

Segher
-
Can you see if the patch posted by Jiri fixes this or not?
thanks,
greg k-h
-
Do you really mean g3? If so it's a 32-bit kernel and it shouldn't be
Weird. Could you do make V=1 and send me the output?
Paul.
-
Yes it is iMac G3. More or less sth like this:
http://upload.wikimedia.org/wikipedia/commons/c/c0/IMac_Bondi_Blue.jpg

processor : 0
cpu : 740/750
temperature : 47-49 C (uncalibrated)
clock : 400MHz
revision : 2.2 (pvr 0008 0202)
bogomips : 796.67
machine : PowerMac2,1
motherboard : PowerMac2,1 MacRISC2 MacRISC Power Macintosh
detected as : 66 (iMac FireWire)
pmac flags : 00000005
L2 cache : 512K unified
memory : 256MB

Ok. Here it goes. The last screen. If you need all / more feel free to mail
me. Config is attached - please note that this is default allmodconfig.

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.machine_kexec.o.d -nostdinc -isystem /usr/lib/gcc/powerpc-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include
include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mcpu=power4 -mno-altivec -funit-at-a-time -mno-string -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(machine_kexec)" -D"KBUILD_MODNAME=KBUILD_STR(machine_kexec)" -c -o
arch/powerpc/kernel/.tmp_machine_kexec.o arch/powerpc/kernel/machine_kexec.c

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.crash.o.d -nostdinc -isystem /usr/lib/gcc/powerpc-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include
include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mcpu=power4 -mno-altivec -funit-at-a-time -mno-string -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(crash)" ...
It might be a bug nevertheless, there are more "issues" with
the interesting way lparmap.s is built and used.

Segher
-
Somehow your defconfig is targeting a PPC64 box:
CONFIG_PPC64=y
shouldn't be set if you want to build a kernel for a G3 imac.
- k
-
allyesconfig/allmodconfig select a 64-bit build always. Maybe
it shouldn't.

Segher
-
hm, someone is working on that, I think?
-
(CCs adjusted)
Mariusz,
I guess the patch below (which I have just added to my tree) fixes that,
right? Thanks.

diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
index 6e73934..0a1f2b5 100644
--- a/drivers/hid/usbhid/hid-core.c
+++ b/drivers/hid/usbhid/hid-core.c
@@ -877,9 +877,9 @@ fail:
usb_free_urb(usbhid->urbin);
usb_free_urb(usbhid->urbout);
usb_free_urb(usbhid->urbctrl);
+ hid_free_buffers(dev, hid);
kfree(usbhid);
fail_no_usbhid:
- hid_free_buffers(dev, hid);
hid_free_device(hid);

return NULL;
@@ -913,9 +913,9 @@ static void hid_disconnect(struct usb_interface *intf)
usb_free_urb(usbhid->urbin);
usb_free_urb(usbhid->urbctrl);
usb_free_urb(usbhid->urbout);
- kfree(usbhid);

hid_free_buffers(hid_to_usb_dev(hid), hid);
+ kfree(usbhid);
hid_free_device(hid);
}
-
Yes - that's correct. This patch fixes the bug. Thanks.
Mariusz
-
Does it also fix the "dma_pool_free" error?
Alan Stern
-
Yes - it does.
Regards,
Mariusz
-
I believe it should -- caused by calling usb_buffer_free() with bogus
dma_addr_t, as corresponding usbhid_device has been already kfree()d.

--
Jiri Kosina
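
The ordering problem described above can be illustrated with a small user-space sketch: anything reachable only through a structure has to be released before the structure itself. The names `struct dev_state`, `setup()`, `free_buffers()` and `teardown()` are hypothetical, and plain `malloc()`/`free()` stand in for the kernel allocators.

```c
#include <stdlib.h>

/* Hypothetical per-device state; 'buf' stands in for a DMA buffer. */
struct dev_state {
	char *buf;
};

/* Release the buffer that is reachable only through 'st'. */
static void free_buffers(struct dev_state *st)
{
	free(st->buf);
	st->buf = NULL;
}

/*
 * Correct order: release what the structure points to first, then the
 * structure itself. Reversing the two calls would make free_buffers()
 * read st->buf from memory that has already been freed.
 */
static void teardown(struct dev_state *st)
{
	free_buffers(st);
	free(st);
}

/* Allocate the state plus its buffer; returns NULL on failure. */
static struct dev_state *setup(size_t len)
{
	struct dev_state *st = malloc(sizeof(*st));

	if (!st)
		return NULL;
	st->buf = malloc(len);
	if (!st->buf) {
		free(st);
		return NULL;
	}
	return st;
}
```

The error path in `setup()` follows the same rule in reverse: it only frees what has actually been allocated at that point.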
-
yeah, harmless.
Ingo
-
Hi Andrew,
For which problem was this patch coded? Is it a potential fix to the
updatedb problem?

Is the patch effective without the filesystem dependent change you talk
about? (I use reiserfs)

I've been thinking about a test case for the updatedb problem:
1. Script or program that creates a large number of directories and zero
sized files. Same setup for everyone to have reproducible results.
2. Run updatedb on those.
3. Observe the effects (with vmstat, slabinfo and meminfo) before,
during and after the updatedb run.
4. Do something to trigger some reclaim like copying a large file.
5. See the effects.

What do you think? What would be the ideal test case for the problem in
your opinion?

Best regards,
- Eric
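
Step 1 of the proposed test case could look roughly like the sketch below, which builds a fixed, reproducible tree of directories filled with zero-sized files. The function name `populate_tree()`, the `dirNNNNNN/fileNNNNNN` naming scheme and the counts are assumptions made for illustration; nothing here comes from an existing test suite.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create 'ndirs' directories under 'root', each holding 'nfiles'
 * zero-sized files. Returns the number of files created, or -1 on the
 * first error. 'root' must already exist.
 */
static long populate_tree(const char *root, int ndirs, int nfiles)
{
	char path[4096];
	long created = 0;

	for (int d = 0; d < ndirs; d++) {
		snprintf(path, sizeof(path), "%s/dir%06d", root, d);
		if (mkdir(path, 0755) != 0)
			return -1;
		for (int f = 0; f < nfiles; f++) {
			int fd;

			snprintf(path, sizeof(path), "%s/dir%06d/file%06d",
				 root, d, f);
			fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
			if (fd < 0)
				return -1;
			close(fd);
			created++;
		}
	}
	return created;
}
```

Calling it with the same `root`, `ndirs` and `nfiles` on every machine gives everyone the same directory layout to run updatedb against.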
-
Good question. I think the current behaviour is just wrong. What the
That's one workload which is particularly susceptible to the problem which
that patch addresses, yes. But in my (brief) testing it didn't make much

Sounds good, yes.
Or you could do something more real-worldly like start up OO, firefox and
friends, then run /etc/cron.daily/everything and see what the
before-and-after effects are. The aggregate info we're looking for is
captured in /proc/meminfo: swapped, Mapped, Cached, Buffers.
-
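
For before-and-after comparisons of the /proc/meminfo figures mentioned above, a small parser is enough. The sketch below assumes the usual `Key:   value kB` line format; `meminfo_kb()` is a hypothetical helper. The file has no single "swapped" field, so deriving swap usage as SwapTotal minus SwapFree is an assumption about what was meant.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find "key:" at the start of a line in 'text' (formatted like
 * /proc/meminfo) and return its value in kB, or -1 if the key is
 * absent or has no numeric value.
 */
static long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			long val;

			if (sscanf(line + klen + 1, "%ld", &val) == 1)
				return val;
			return -1;
		}
		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Reading the live file into a buffer once before the run and once after, then calling `meminfo_kb()` for "Mapped", "Cached" and "Buffers" on each snapshot, gives the deltas to report.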
IMO it will be harder to come up with reproducible numbers; everyone's
desktop is different, as are their filesystem contents.

Anyway I will cook up something and post it. It might be useful for
others to understand the updatedb problem.

I intend to try only this specific patch not the full -mm, is there any
other patch I need to apply too?

- Eric
-
no, it is standalone.
-
Testing, yes. Succeeding, no. It's utterly hosed on SH in its present
condition at least. Presumably it's been tested on at least one platform
with some measure of success, but it's certainly not mine ;-)

I'll get you some patches that fix it up the rest of the way for SH
platforms in the next couple days. There's nothing too rough, though,
mostly serial driver fallout and changes in the current stub that aren't
reflected in the 'new' one.
-
Sh "not working" is a side effect of the fact that there is no one
maintaining the sh kgdb work. As an example, the sh-lite.patch which is
part of the kgdb git tree explicitly says that the sh-sci.c needs to be
"re-ported". There were many changes since the 2.6.17 code base that
KGDB was upreved from and the fact of the matter is that I have no sh
hardware, tool chain, or means to support it. I will happily merge in
the pieces to fix up the sh kgdb arch specifics. The same was true of
IA64 at first (meaning it did not work), but Bob Pico submitted some
further patches and IA64 should compile, build and work with kgdb.

In the development branch, the kgdb core is compiled and tested and should
work for the archs i386, x86_64, ppc, powerpc (32 & 64), mips (32 & 64)
and arm.

Jason.
-
does kgdb actually have a chance to get merged ? with the history of
it, i just assumed it was never going in, so we've been using our own
kgdb patch on Blackfin ... so the version *we have* works great :) but
if there's a chance of this actually going mainline, we can see about
testing that version as well ...
-mike
-
The generic code has a better chance of being merged if it actually works
at least and doesn't break every platform out there that has an existing
stub. It offers quite a bit of new functionality and does clean things up
a bit, so it would certainly be nice to get things to use that, rather
than having to duplicate all of this crap in the architectures. If it's
not going to be merged, everyone will of course continue using the
existing in-tree stubs (sh, ppc, etc.).

It's generally advantageous to get these things working on your
architecture _before_ things are merged however, as it's one less thing
to catch up on after the fact. It also helps to figure out if there are
issues with the current implementation by trying it out on your platform
in advance, it's a lot more work to push back against it once it's
already merged.

The fact that no one has bothered to even compile for the platforms the
generic kgdb stuff is ported to does seem to suggest that the kgdb folks
aren't terribly serious about getting it merged, though.
-
of course ... but if there isn't a serious chance of this being merged,
then it isn't in our (Blackfin's) interest to investigate it since our
current solution seems to be chugging along fine
-mike
-
I was hoping for a 2.6.24 merge. But I haven't actually looked at it yet.
Please, do so.
But runtime testing isn't actually the most important thing at this time -
if it doesn't work, well hey, we fix it, easy - we always have bugs. The
main emphasis right now should be on higher-level design/review/integration
stuff.
-
The current version is quite messy. I'd be much happier if we could
start with a light version that doesn't have all the intrusions to random
code outside the kgdb core.
-
I would disagree on at least one level. The KGDB tree is broken up into
incremental units each layer adding more functionality and or arch
specific pieces.

As an example, the KGDB core itself is:
http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=c...

If you can point to some specific examples vs a blanket statement "is
quite messy" perhaps I can explain what the changes are for and why they
are needed.

Jason.
-
