We should have one week between -rc releases, but I was gone for a week
over thanksgiving (as were some other kernel developers), so this one is a
bit late. It's been almost the rule rather than the exception, but I
promise I'll be better...

Anyway, there aren't a lot of exciting changes here, but there's still a
_lot_ more churn than I really hoped for at the -rc4 stage. Blackfin, MIPS
and Power do stand out in the diffstats, but ARM and x86 got some updates
too.

And we had some ACPI churn (processor throttling etc), along with various
driver updates: ATA, IDE, infiniband, SCSI, USB and network drivers.. And
on the filesystem side, cifs, NFS, ocfs2 and proc. Ugh. Too much.

In fact, the diff from -rc3 is almost 36,000 lines, and that's the smaller
git one with the renames shown as renames (not the ones I upload as
patches to kernel.org - those are done so that people with GNU patch and
other legacy patch programs can use the diffs). I'll blame the two-week
window for some of it, but even so, this is a bit disheartening. I'm
really hoping that we're slowing down and -rc5 won't be anywhere near that
large.

That said, none of the changes are really _exciting_ or really scary. And
we should have fixed a number of regressions, although more certainly
remain.

Linus
--
As usual, if anyone finds errors in http://kernelnewbies.org/Linux_2_6_24 ,
let me know or change it yourself.
--
Looks like we have a zero "cfs_rq->load.weight".

Ingo? Both sched_slice() and __sched_slice() do a divide by the runqueue
weight, and at least dequeue_task_fair() explicitly checks for that being
zero, so clearly zero is a possible value. Hmm?

Linus
--
yeah, i can reproduce this crash too.

The problem is on SMP: if sched_rr_get_interval() gets a task from an
otherwise idle runqueue, then rq->load.weight is 0. Normally
sched_slice() is only used on a busy runqueue. So the correct fixup site
is not in sched_slice() but in sys_sched_rr_get_interval() - i'm working
on the right fix, i hope to be able to send a pull request in a few
minutes.

Ingo
--
the problem is on UP too - if there are no SCHED_OTHER tasks. I've
tested the fix and it solves the problem for various combinations of
crash.c. I've updated sched.git, please pull it from:

   git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git

It has another commit besides this fix. Thanks,

Ingo
------------------>
Ingo Molnar (2):
      sched: fix crash in sys_sched_rr_get_interval()
      sched: default to more agressive yield for SCHED_BATCH tasks

 sched.c      |   14 +++++++++-----
 sched_fair.c |    7 ++++---
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 59ff6b1..b062856 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4850,17 +4850,21 @@ long sys_sched_rr_get_interval(pid_t pid, struct timespec __user *interval)
 	if (retval)
 		goto out_unlock;
 
-	if (p->policy == SCHED_FIFO)
-		time_slice = 0;
-	else if (p->policy == SCHED_RR)
+	/*
+	 * Time slice is 0 for SCHED_FIFO tasks and for SCHED_OTHER
+	 * tasks that are on an otherwise idle runqueue:
+	 */
+	time_slice = 0;
+	if (p->policy == SCHED_RR) {
 		time_slice = DEF_TIMESLICE;
-	else {
+	} else {
 		struct sched_entity *se = &p->se;
 		unsigned long flags;
 		struct rq *rq;
 
 		rq = task_rq_lock(p, &flags);
-		time_slice = NS_TO_JIFFIES(sched_slice(cfs_rq_of(se), se));
+		if (rq->cfs.load.weight)
+			time_slice = NS_TO_JIFFIES(sched_slice(&rq->cfs, se));
 		task_rq_unlock(rq, &flags);
 	}
 	read_unlock(&tasklist_lock);
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 37bb265..c33f0ce 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -799,8 +799,9 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int sleep)
  */
 static void yield_task_fair(struct rq *rq)
 {
-	struct cfs_rq *cfs_rq = task_cfs_rq(rq->curr);
-	struct sched_entity *rightmost, *se = &rq->curr->se;
+	struct task_struct *curr = rq->curr;
+...
Can you make up something that I can apply for 2.6.23-stable? or is
this not an issue on that tree?

thanks,

greg k-h
--
On Tue, 4 Dec 2007 10:28:51 -0800
Greg KH <greg@kroah.com> wrote:

| On Tue, Dec 04, 2007 at 05:18:27PM +0100, Ingo Molnar wrote:
| >
| > * Ingo Molnar <mingo@elte.hu> wrote:
| >
| > > The problem is on SMP: if sched_rr_get_interval() gets a task from an
| > > otherwise idle runqueue, then rq->load.weight is 0. Normally
| > > sched_slice() is only used on a busy runqueue. So the correct fixup
| > > site is not in sched_slice() but in sys_sched_rr_get_interval() - i'm
| > > working on the right fix, i hope to be able to send a pull request in
| > > a few minutes.
| >
| > the problem is on UP too - if there are no SCHED_OTHER tasks. I've
| > tested the fix and it solves the problem for various combinations of
| > crash.c. I've updated sched.git, please pull it from:
| >
| > git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git
| >
| > It has another commit besides this fix. Thanks,
|
| Can you make up something that I can apply for 2.6.23-stable? or is
| this not an issue on that tree?

FWIW I couldn't reproduce the problem with 2.6.23.9. sched_slice()
is quite different on that kernel and _maybe_ it will never divide
by zero.

My original report on vendor-sec was wrong. I said that 2.6.23.9
had the same bug, but it turns out the kernel I tested had Ingo's
CFS backport patch applied. I didn't know that; I thought it was a
vanilla kernel.

Btw, I think it's important to release a new CFS backport patch
because some distros may be using it (the Mandriva stable kernel is
using the CFS backport patch, but we haven't updated to the latest
version yet).

--
Luiz Fernando N. Capitulino
--
no, this is due to a fairly recent commit, so 2.6.23 should not be
affected. (We cleaned up sched_rr_interval() in one of the 2.6.24

this should only affect the v24 CFS version - i've updated the v24
backport patches. sched_rr_interval() is almost never used, and it's
basically never used for SCHED_OTHER tasks.

Ingo
--
On Tue, 4 Dec 2007 17:18:27 +0100
Ingo Molnar <mingo@elte.hu> wrote:

|
| * Ingo Molnar <mingo@elte.hu> wrote:
|
| > The problem is on SMP: if sched_rr_get_interval() gets a task from an
| > otherwise idle runqueue, then rq->load.weight is 0. Normally
| > sched_slice() is only used on a busy runqueue. So the correct fixup
| > site is not in sched_slice() but in sys_sched_rr_get_interval() - i'm
| > working on the right fix, i hope to be able to send a pull request in
| > a few minutes.
|
| the problem is on UP too - if there are no SCHED_OTHER tasks. I've
| tested the fix and it solves the problem for various combinations of
| crash.c. I've updated sched.git, please pull it from:
|
| git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git
|
| It has another commit besides this fix. Thanks,

Yes, I tested the 'sched: fix crash in sys_sched_rr_get_interval()'
one and it really fixes the problem.

Thanks a lot Ingo.
--
Luiz Fernando N. Capitulino
--
On Tue, 4 Dec 2007 17:00:05 +0100
Ingo Molnar <mingo@elte.hu> wrote:

|
| * Linus Torvalds <torvalds@linux-foundation.org> wrote:
|
| >
| >
| > On Tue, 4 Dec 2007, Luiz Fernando N. Capitulino wrote:
| > >
| > > sched_rr_get_interval(1, NULL);
| >
| > Looks like we have a zero "cfs_rq->load.weight".
| >
| > Ingo? Both sched_slice() and __sched_slice() do a divide by the
| > runqueue weight, and at least dequeue_task_fair() explicitly checks
| > for that being zero, so clearly zero is a possible value. Hmm?
|
| yeah, i can reproduce this crash too.
|
| The problem is on SMP: if sched_rr_get_interval() gets a task from an
| otherwise idle runqueue, then rq->load.weight is 0. Normally
| sched_slice() is only used on a busy runqueue. So the correct fixup site
| is not in sched_slice() but in sys_sched_rr_get_interval() - i'm working
| on the right fix, i hope to be able to send a pull request in a few
| minutes.

Ingo, I can reproduce this w/o SMP support as well.

(Also, the backtrace I sent was reproduced on a UP machine with an
SMP kernel).

--
Luiz Fernando N. Capitulino
--
hm, if you run this as an RT task, right? Or can you trigger it via pure
SCHED_OTHER tasks as well? Below is my candidate fix.

Ingo
--------------->
Subject: sched: fix crash in sys_sched_rr_get_interval()
From: Ingo Molnar <mingo@elte.hu>

Luiz Fernando N. Capitulino reported that sched_rr_get_interval()
crashes for SCHED_OTHER tasks that are on an idle runqueue.

The fix is to return a 0 timeslice for tasks that are on an idle
runqueue. (and which are not running, obviously)

Reported-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4850,17 +4850,21 @@ long sys_sched_rr_get_interval(pid_t pid
 	if (retval)
 		goto out_unlock;
 
-	if (p->policy == SCHED_FIFO)
-		time_slice = 0;
-	else if (p->policy == SCHED_RR)
+	/*
+	 * Time slice is 0 for SCHED_FIFO tasks and for SCHED_OTHER
+	 * tasks that are on an otherwise idle runqueue:
+	 */
+	time_slice = 0;
+	if (p->policy == SCHED_RR) {
 		time_slice = DEF_TIMESLICE;
-	else {
+	} else {
 		struct sched_entity *se = &p->se;
 		unsigned long flags;
 		struct rq *rq;
 
 		rq = task_rq_lock(p, &flags);
-		time_slice = NS_TO_JIFFIES(sched_slice(cfs_rq_of(se), se));
+		if (rq->cfs.load.weight)
+			time_slice = NS_TO_JIFFIES(sched_slice(&rq->cfs, se));
 		task_rq_unlock(rq, &flags);
 	}
 	read_unlock(&tasklist_lock);
--
Any reason for this:
mode change 100644 => 100755 drivers/net/chelsio/cxgb2.c
mode change 100644 => 100755 drivers/net/chelsio/pm3393.c
mode change 100644 => 100755 drivers/net/chelsio/sge.c
mode change 100644 => 100755 drivers/net/chelsio/sge.h

Nicolas
--
As repeatedly mentioned on the list :) it is a mistake.
Jeff
--
Hi,
The patch "ctc: make use of alloc_netdev()" (commit 1c1478859017452a1179dbbdf7b9eb5b48438746)
introduces the following build failure:

CC [M] drivers/s390/net/fsm.o
CC [M] drivers/s390/net/smsgiucv.o
CC [M] drivers/s390/net/ctcmain.o
drivers/s390/net/ctcmain.c: In function `ctc_init_netdevice':
drivers/s390/net/ctcmain.c:2805: error: implicit declaration of function `SET_MODULE_OWNER'
make[2]: *** [drivers/s390/net/ctcmain.o] Error 1
make[1]: *** [drivers/s390/net] Error 2
make: *** [drivers/s390] Error 2

--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--
the patch below should fix this.
Ingo
------------>
Subject: drivers/s390/net/ctcmain.c: fix build bug
From: Ingo Molnar <mingo@elte.hu>

SET_MODULE_OWNER() is obsolete.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 drivers/s390/net/ctcmain.c |    1 -
 1 file changed, 1 deletion(-)

Index: linux/drivers/s390/net/ctcmain.c
===================================================================
--- linux.orig/drivers/s390/net/ctcmain.c
+++ linux/drivers/s390/net/ctcmain.c
@@ -2802,7 +2802,6 @@ void ctc_init_netdevice(struct net_devic
 	dev->type = ARPHRD_SLIP;
 	dev->tx_queue_len = 100;
 	dev->flags = IFF_POINTOPOINT | IFF_NOARP;
-	SET_MODULE_OWNER(dev);
 }
--
Hi Uschi,
that last patch reverted commit 10d024c1b2fd58af8362670d7d6e5ae52fc33353.
That needs to get re-added.

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
