Re: [PATCH 2.6.27-rc5 re-resubmit] Fix itimer/many thread hang.

Previous thread: Re: ConfigFS + Target Mode Engine API discussion by Nicholas A. Bellinger on Friday, September 12, 2008 - 9:24 am. (2 messages)

Next thread: Partition check considered as error is breaking mounting in 2.6.27 by Herton Ronaldo Krzesinski on Friday, September 12, 2008 - 9:56 am. (4 messages)
From: Frank Mayhar
Date: Friday, September 12, 2008 - 9:54 am

This is a resubmission of the posix timer rework patch, posted a few days ago.
This addresses Oleg Nesterov's comments, removing the RCU stuff from the patch
and un-inlining the thread_group_cputime() function for SMP.  I left the
function itself as an inline defined in sched.h but the SMP version just calls
thread_group_cputime_smp() which is defined in posix-cpu-timers.c.  (The UP
version was a one-liner; I left it as an inline.)

The original README follows (to keep it together with the patch itself):


Overview

This patch reworks the handling of POSIX CPU timers, including the
ITIMER_PROF, ITIMER_VIRT timers and rlimit handling.  It was put together
with the help of Roland McGrath, the owner and original writer of this code.

The problem we ran into, and the reason for this rework, has to do with using
a profiling timer in a process with a large number of threads.  It appears
that the performance of the old implementation of run_posix_cpu_timers() was
at least O(n*3) (where "n" is the number of threads in a process) or worse.
Everything is fine with an increasing number of threads until the time taken
for that routine to run becomes the same as or greater than the tick time, at
which point things degrade rather quickly.

This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF."


Code Changes

This rework corrects the implementation of run_posix_cpu_timers() to make it
run in constant time for a particular machine.  (Performance may vary between
one machine and another depending upon whether the kernel is built as single-
or multiprocessor and, in the latter case, depending upon the number of
running processors.)  To do this, at each tick we now update fields in
signal_struct as well as task_struct.  The run_posix_cpu_timers() function
uses those fields to make its decisions.

We define a new structure, "task_cputime," to contain user, system and
scheduler times and use these in appropriate places:

struct task_cputime {
	cputime_t utime;
	cputime_t ...
From: Andrew Morton
Date: Friday, September 12, 2008 - 1:27 pm

On Fri, 12 Sep 2008 09:54:39 -0700

It took a bit of massaging to make it all apply and compile on
linux-next, but it looks like I got it landed OK.

From: Frank Mayhar <fmayhar@google.com>

Overview

This patch reworks the handling of POSIX CPU timers, including the
ITIMER_PROF, ITIMER_VIRT timers and rlimit handling.  It was put together
with the help of Roland McGrath, the owner and original writer of this
code.

The problem we ran into, and the reason for this rework, has to do with
using a profiling timer in a process with a large number of threads.  It
appears that the performance of the old implementation of
run_posix_cpu_timers() was at least O(n*3) (where "n" is the number of
threads in a process) or worse.  Everything is fine with an increasing
number of threads until the time taken for that routine to run becomes the
same as or greater than the tick time, at which point things degrade
rather quickly.

This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9906, "Weird
hang with NPTL and SIGPROF."

Code Changes

This rework corrects the implementation of run_posix_cpu_timers() to make
it run in constant time for a particular machine.  (Performance may vary
between one machine and another depending upon whether the kernel is built
as single- or multiprocessor and, in the latter case, depending upon the
number of running processors.) To do this, at each tick we now update
fields in signal_struct as well as task_struct.  The
run_posix_cpu_timers() function uses those fields to make its decisions.

We define a new structure, "task_cputime," to contain user, system and
scheduler times and use these in appropriate places:

struct task_cputime {
	cputime_t utime;
	cputime_t stime;
	unsigned long long sum_exec_runtime;
};

This is included in the structure "thread_group_cputime," which is a new
substructure of signal_struct and which varies for uniprocessor versus
multiprocessor kernels.  For uniprocessor kernels, it uses "task_cputime"
as a simple ...
From: Ingo Molnar
Date: Sunday, September 14, 2008 - 8:06 am

i've applied it to tip/timers/posixtimers for more testing, thanks 
Frank.

	Ingo
--

From: Ingo Molnar
Date: Sunday, September 14, 2008 - 8:07 am

applied the small build fix below to tip/timers/posixtimers.

	Ingo

------------>
From 430b5294bd72c085c730e1e4b86580f164d976bf Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Sun, 14 Sep 2008 16:33:01 +0200
Subject: [PATCH] timers: fix itimer/many thread hang, fix
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

fix:

 kernel/fork.c:843: error: ‘struct signal_struct’ has no member named ‘sum_sched_runtime’
 kernel/irq/handle.c:117: warning: ‘sparse_irq_lock’ defined but not used

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/fork.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index a8ac2ef..1181b9a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -834,7 +834,6 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	sig->min_flt = sig->maj_flt = sig->cmin_flt = sig->cmaj_flt = 0;
 	sig->inblock = sig->oublock = sig->cinblock = sig->coublock = 0;
 	task_io_accounting_init(&sig->ioac);
-	sig->sum_sched_runtime = 0;
 	INIT_LIST_HEAD(&sig->cpu_timers[0]);
 	INIT_LIST_HEAD(&sig->cpu_timers[1]);
 	INIT_LIST_HEAD(&sig->cpu_timers[2]);
--

From: Ingo Molnar
Date: Sunday, September 14, 2008 - 8:09 am

the patch wasnt built on UP i guess - see the fix below.

the wider question is, shouldnt the UP case be just the same as the SMP 
case? Especially the type assymetry struct thread_group_cputime looks 
ugly. (and assymetries like that tend to be a constant source of 
breakage like the one below.)

	Ingo

-------------->
From 0a8eaa4f9b58759595a1bfe13a1295fdc25ba026 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Sun, 14 Sep 2008 17:03:52 +0200
Subject: [PATCH] timers: fix itimer/many thread hang, fix #2
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

fix the UP build:

In file included from arch/x86/kernel/asm-offsets_32.c:9,
                 from arch/x86/kernel/asm-offsets.c:3:
include/linux/sched.h: In function ‘thread_group_cputime_clone_thread’:
include/linux/sched.h:2272: warning: no return statement in function returning non-void
include/linux/sched.h: In function ‘thread_group_cputime_account_user’:
include/linux/sched.h:2284: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h:2284: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h: In function ‘thread_group_cputime_account_system’:
include/linux/sched.h:2291: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h:2291: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h: In function ‘thread_group_cputime_account_exec_runtime’:
include/linux/sched.h:2298: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
distcc[14501] ERROR: compile arch/x86/kernel/asm-offsets.c on a/30 failed
make[1]: *** [arch/x86/kernel/asm-offsets.s] Error 1

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/sched.h |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 26d7a5f..ed355f0 100644
--- a/include/linux/sched.h
+++ ...
From: Frank Mayhar
Date: Monday, September 15, 2008 - 11:09 am

Yeah, I've been focusing so much on getting the SMP case right that I've

I'm not overly fond of this one, either; I did it at Roland's suggestion
(it's all _his_ fault, yeah, _that's_ the ticket! :-); his opinion IIRC
was that the UP case will perform better without the extra pointer
dereferences.  I agree that it's a potential source of pain such as the
one you point out.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Ingo Molnar
Date: Tuesday, September 16, 2008 - 1:41 am

i dont know ... 

lets try that simplification as a delta patch, ok? Please check the 
before/after size of 'vmlinux' in a 'make defconfig' [with SMP disabled 
after make defconfig] UP build. If there's visible size difference then 
Roland's point holds.

	Ingo
--

From: Frank Mayhar
Date: Wednesday, September 17, 2008 - 12:03 pm

I finally finished this.  Yeah, there's a visible size difference,
although it's small:

bobble ~/build/linux-2.6>size vmlinux
   text    data     bss     dec     hex filename
6417840  824160  578416 7820416  775480 vmlinux

bobble ~/build/up-simplify-tree>size vmlinux
   text    data     bss     dec     hex filename
6417999  824160  578416 7820575  77551f vmlinux

For a difference of 159 bytes.  Not big, but nonzero.  The delta patch
(relative to a tree containing all suggested changes _except_ this
simplification) follows:

--- ../linux-2.6/include/linux/sched.h	2008-09-16 14:39:26.000000000 -0700
+++ ./include/linux/sched.h	2008-09-17 10:28:33.000000000 -0700
@@ -454,15 +454,9 @@ struct task_cputime {
  * This structure contains the version of task_cputime, above, that is
  * used for thread group CPU clock calculations.
  */
-#ifdef CONFIG_SMP
 struct thread_group_cputime {
 	struct task_cputime *totals;
 };
-#else
-struct thread_group_cputime {
-	struct task_cputime totals;
-};
-#endif
 
 /*
  * NOTE! "signal_struct" does not have it's own
@@ -2124,10 +2118,8 @@ static inline int spin_needbreak(spinloc
 /*
  * Thread group CPU time accounting.
  */
-#ifdef CONFIG_SMP
 
-extern int thread_group_cputime_alloc_smp(struct task_struct *);
-extern void thread_group_cputime_smp(struct task_struct *, struct task_cputime *);
+extern int thread_group_cputime_alloc(struct task_struct *);
 
 static inline void thread_group_cputime_init(struct signal_struct *sig)
 {
@@ -2138,9 +2130,13 @@ static inline int thread_group_cputime_c
 {
 	if (curr->signal->cputime.totals)
 		return 0;
-	return thread_group_cputime_alloc_smp(curr);
+	return thread_group_cputime_alloc(curr);
 }
 
+#ifdef CONFIG_SMP
+
+extern void thread_group_cputime_smp(struct task_struct *, struct task_cputime *);
+
 static inline void thread_group_cputime_free(struct signal_struct *sig)
 {
 	free_percpu(sig->cputime.totals);
@@ -2161,31 +2157,15 @@ static inline void thread_group_cputime(
 
 ...
From: Roland McGrath
Date: Wednesday, September 17, 2008 - 12:13 pm

I had been thinking not so much about code size, but about the unnecessary
indirection and allocation for the UP case.  I'm happy to defer to Ingo on
whether to worry about that.


Thanks,
Roland
--

From: Frank Mayhar
Date: Wednesday, September 17, 2008 - 1:12 pm

Yeah, I was just responding to his request for a code-size check.  I'm
okay either way, although making the data structures consistent between
the two cases does simplify the code a bit.

The next patch is waiting on Ingo's verdict. :-)
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Ingo Molnar
Date: Thursday, September 18, 2008 - 3:23 am

i'd not worry about those 160 bytes - getting this stuff to work fine is 
far more important. Details around threading seem to be one of the 
slowest converging technological details of Linux.


why is there _any_ assymetry needed between UP and SMP? These days we 
just write straight code for SMP, and UP is just a single-CPU 
special-case of it. _Sometimes_ if it's really worth it we do some UP 
special cases but it's the exception, not the rule.

	Ingo
--

From: Frank Mayhar
Date: Thursday, September 18, 2008 - 6:50 am

As far as I know, I still need to handle the SMP per_cpu allocate/free
differently from the UP kmalloc.  I'll check again, though; if that's no
longer the case, I'll fix it.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Frank Mayhar
Date: Monday, September 22, 2008 - 1:22 pm

This is the second resubmission of the posix timer rework patch, posted
a few days ago.  This includes the changes from the previous
resubmittion, which addressed Oleg Nesterov's comments, removing the RCU
stuff from the patch and un-inlining the thread_group_cputime() function
for SMP.  In addition, per Ingo Molnar it simplifies the UP code,
consolidating much of it with the SMP version and depending on
lower-level SMP/UP handling to take care of the differences.  It also
cleans up some UP compile errors, moves the scheduler stats-related
macros into kernel/sched_stats.h, cleans up a merge error in
kernel/fork.c and has a few other minor fixes and cleanups as suggested
by Oleg and Ingo.  Thanks for the review, guys.

The original README follows (to keep it together with the patch itself):

Overview

This patch reworks the handling of POSIX CPU timers, including the
ITIMER_PROF, ITIMER_VIRT timers and rlimit handling.  It was put together
with the help of Roland McGrath, the owner and original writer of this code.

The problem we ran into, and the reason for this rework, has to do with using
a profiling timer in a process with a large number of threads.  It appears
that the performance of the old implementation of run_posix_cpu_timers() was
at least O(n*3) (where "n" is the number of threads in a process) or worse.
Everything is fine with an increasing number of threads until the time taken
for that routine to run becomes the same as or greater than the tick time, at
which point things degrade rather quickly.

This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF."


Code Changes

This rework corrects the implementation of run_posix_cpu_timers() to make it
run in constant time for a particular machine.  (Performance may vary between
one machine and another depending upon whether the kernel is built as single-
or multiprocessor and, in the latter case, depending upon the number of
running processors.)  To do this, at each tick we now update fields in
signal_struct ...
From: Ingo Molnar
Date: Tuesday, September 23, 2008 - 4:40 am

below is the delta patch i've applied to tip/timers/posixtimers - might 
be easier to review for those who've seen v1 and who'd like to check the 
changes.

i've also merged v2 into tip/master and started testing it.

	Ingo

---------------->
From bb34d92f643086d546b49cef680f6f305ed84414 Mon Sep 17 00:00:00 2001
From: Frank Mayhar <fmayhar@google.com>
Date: Fri, 12 Sep 2008 09:54:39 -0700
Subject: [PATCH] timers: fix itimer/many thread hang, v2

This is the second resubmission of the posix timer rework patch, posted
a few days ago.

This includes the changes from the previous resubmittion, which addressed
Oleg Nesterov's comments, removing the RCU stuff from the patch and
un-inlining the thread_group_cputime() function for SMP.

In addition, per Ingo Molnar it simplifies the UP code, consolidating much
of it with the SMP version and depending on lower-level SMP/UP handling to
take care of the differences.

It also cleans up some UP compile errors, moves the scheduler stats-related
macros into kernel/sched_stats.h, cleans up a merge error in
kernel/fork.c and has a few other minor fixes and cleanups as suggested
by Oleg and Ingo. Thanks for the review, guys.

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/kernel_stat.h |    1 +
 include/linux/sched.h       |  183 +-----------------------------------------
 kernel/fork.c               |    5 +-
 kernel/posix-cpu-timers.c   |  155 +++++++++++++++---------------------
 kernel/sched.c              |   47 ++----------
 kernel/sched_stats.h        |  136 ++++++++++++++++++++++++++++++++
 6 files changed, 215 insertions(+), 312 deletions(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index cf9f40a..cac3750 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -52,6 +52,7 @@ ...
From: Ingo Molnar
Date: Tuesday, September 23, 2008 - 5:52 am

-tip testing found a boot crash on a 32-bit x86 test system, and i've 
bisected it down to this commit. I've attached the .config. The box is 
hard to debug and the crash happens very early so there's no log - but 
it was reproducible so you should be able to trigger it by doing this 
in the tip tree:

  git checkout 788222bf3d6d7ee08ce308f078c8774eb896c59e

[ that is the -tip tree just before the revert commit. ]

... and use the attached .config to build a test kernel and to reproduce 
the crash. (it doesnt matter if it wont fully boot on your testbox - the 
crash happen very early)

here is how you can access the tree:

  http://people.redhat.com/mingo/tip.git/README

i also found a boot hang on a 64-bit x86 test system - that too is 
solved with this revert. I've attached that config too.

The common theme in both configs is that they are !SMP - and many 
updates in v2 relate to !SMP details, so i'd suspect a bug in that area.

( and thanks to working with delta patches i was able to fall back to 
  the v1+fixes state in tip/timers/posix-timers, instead of having to 
  drop/revert the whole work. So please update this stuff with delta 
  patches in the future. )

	Ingo
From: Oleg Nesterov
Date: Tuesday, September 23, 2008 - 6:59 am

Confused. Unless I misread the patch, tgtimes->totals can be NULL
on UP too?

Do we really need the special version for !CONFIG_SMP ? per_cpu_ptr()
just returns the pointer. Yes, get_cpu() disables preemption, but
perhaps this is tolerable?


(of course, this can't explain the crash)

Oleg.

--

From: Frank Mayhar
Date: Tuesday, September 23, 2008 - 9:09 am

Yep.  I've had about four different irons in the fire over the last few
days and I think because of that I ended up giving this less attention
than I should have.  I'll beat on this some more and reroll the patch in
the next day or so.

I'll also provide the delta patch as Ingo requested.

Sorry, folks.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Frank Mayhar
Date: Tuesday, September 23, 2008 - 3:56 pm

Yeah.  I totally messed up on this, too many things going on at once.
I've eliminated the SMP/UP split in sched_stats.h and pulled it all into

I'll be investigating this in the morning (I'm at home today and without
a test box).
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Frank Mayhar
Date: Wednesday, September 24, 2008 - 2:23 pm

A couple of days ago I re-resubmitted the itimer/many thread hang.
Unfortunately, I had too many things going on at once and screwed it up,
not verifying the UP version and entirely missing a set of cleanups that
I had intended to include.  This rectifies that screwup.

At Ingo's request, this is an incremental patch, relative to the code in
his -tip tree.

These changes have been minimally tested on both UP and SMP
environments.  They simplify the UP code, consolidating pretty much all
of it with the SMP version and depending on lower-level SMP/UP handling
to take care of the differences.  It also cleans up some UP compile
errors, moves the scheduler stats-related macros into
kernel/sched_stats.h, cleans up a merge error in kernel/fork.c and has a
few other minor fixes and cleanups as suggested by Oleg and Ingo.
Again, thanks for the reviews, guys.

I'm dropping the original README; if you want it it's in the archives at
least three times.

Signed-off-by: Frank Mayhar <fmayhar@google.com>

 include/linux/kernel_stat.h |    1 +
 include/linux/sched.h       |  182 +------------------------------------------
 kernel/fork.c               |    5 +-
 kernel/posix-cpu-timers.c   |  155 +++++++++++++++---------------------
 kernel/sched.c              |   46 ++----------
 kernel/sched_stats.h        |   86 ++++++++++++++++++++
 6 files changed, 163 insertions(+), 312 deletions(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 21249d8..a97caac 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -70,6 +70,7 @@ static inline unsigned int kstat_irqs(unsigned int irq)
 	return sum;
 }
 
+extern unsigned long long task_delta_exec(struct task_struct *);
 extern void account_user_time(struct task_struct *, cputime_t);
 extern void account_user_time_scaled(struct task_struct *, cputime_t);
 extern void account_system_time(struct task_struct *, int, cputime_t);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ...
From: Ingo Molnar
Date: Saturday, September 27, 2008 - 11:08 am

[Empty message]
From: Ingo Molnar
Date: Monday, September 29, 2008 - 11:33 pm

hmmm ... where do we get 'rq' from?

in v3 you did this:

-       rq = task_rq_lock(p, &flags);

which removed the deadlock but left us with a random uninitialized rq 
variable ...

the right solution for the bug would have been to unlock it. 
Miraculously we didnt actually crash anywhere visibly, found it by 
reviewing the code. I thought this code gets excercised quite 
frequently. The commit below fixes it.

Could you please functionality-test latest tip/master:

  http://people.redhat.com/mingo/tip.git/README

with your testcase that excercises these codepaths heavily?

Thanks,

	Ingo
--

From: Frank Mayhar
Date: Tuesday, September 30, 2008 - 9:36 am

You know, I just knew I had a brain around here _somewhere_.

Weirdly, this didn't fall over on any of my testing.  The gods know what
it was actually doing, though.

I'm picking up tip/master right now and will test your fix for this
shortly.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Frank Mayhar
Date: Wednesday, October 1, 2008 - 9:20 am

Okay, I've put it through its paces and everything appears to be working
well with Ingo's fix.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Ingo Molnar
Date: Thursday, October 2, 2008 - 2:43 am

great, thanks for checking!

it has not caused any new problems in -tip testing either, and that 
means a couple of thousand test builds and test boots so far in the past 
week, so we've got the usual codepaths covered pretty well. With your 
test of the less common codepaths we should be pretty good in general.

	Ingo
--

From: Ingo Molnar
Date: Sunday, September 14, 2008 - 8:12 am

a couple of fixlets below, pointed out by scripts/checkpatch.pl.

	Ingo

---------->
From 5ce73a4a5a4893a1aa4cdeed1b1a5a6de42c43b6 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Sun, 14 Sep 2008 17:11:46 +0200
Subject: [PATCH] timers: fix itimer/many thread hang, cleanups

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/sched.h     |    2 +-
 kernel/posix-cpu-timers.c |    6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ed355f0..7ce8d4e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -430,7 +430,7 @@ struct pacct_struct {
  * @utime:		time spent in user mode, in &cputime_t units
  * @stime:		time spent in kernel mode, in &cputime_t units
  * @sum_exec_runtime:	total time spent on the CPU, in nanoseconds
- * 
+ *
  * This structure groups together three kinds of CPU time that are
  * tracked for threads and thread groups.  Most things considering
  * CPU time want to group these counts together and treat all three
diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index dba1c33..9a7ea04 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -94,7 +94,7 @@ void update_rlimit_cpu(unsigned long rlim_new)
 
 	cputime = secs_to_cputime(rlim_new);
 	if (cputime_eq(current->signal->it_prof_expires, cputime_zero) ||
-            cputime_lt(current->signal->it_prof_expires, cputime)) {
+	    cputime_lt(current->signal->it_prof_expires, cputime)) {
 		spin_lock_irq(&current->sighand->siglock);
 		set_process_cpu_timer(current, CPUCLOCK_PROF, &cputime, NULL);
 		spin_unlock_irq(&current->sighand->siglock);
@@ -1372,9 +1372,9 @@ void run_posix_cpu_timers(struct task_struct *tsk)
 	 * tsk->signal is non-NULL; this probably can't happen but cover the
 	 * possibility anyway.
 	 */
-	if (unlikely(!sig) || !fastpath_timer_check(tsk, sig)) {
+	if (unlikely(!sig) || !fastpath_timer_check(tsk, sig))
 		return;
-	}
+
 ...
From: Ingo Molnar
Date: Sunday, September 14, 2008 - 8:14 am

please uninline these functions.

	Ingo
--

From: Roland McGrath
Date: Sunday, September 14, 2008 - 12:31 pm

> please uninline these functions.

They are short and have only one caller.


Thanks,
Roland
--

From: Ingo Molnar
Date: Sunday, September 14, 2008 - 11:41 pm

then they should be defined in the .c file which uses them.

really, sched.h is large enough already and has lots of unnecessary 
stuff in it.

	Ingo
--

From: Frank Mayhar
Date: Monday, September 15, 2008 - 10:59 am

Well, they are actually used in sched.h, in the inline routines
account_group_user_time() and friends, which are in turn used by
routines in sched.c.  The point here is that the
thread_group_cputime_account_xxx() routines are defined differently
depending upon whether we're building for UP or SMP and the

Agreed, but Roland and I were trying to make life easier for folks who
have to maintain SMP and UP versions of this stuff, keeping it all in
one place rather than scattering it about.

If you insist, I'll un-inline the routines but this will mean moving all
of this (thread_group_cputime_account_xxx() and account_group_xxx()
routines) to sched.c.  Maybe that makes sense.  It's your call; I'll
hold off on the changes (and resubmitting the patch) while I wait for
you to let me know what you prefer.  Thanks.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

--

From: Ingo Molnar
Date: Tuesday, September 16, 2008 - 1:39 am

it's already in tip/sched/posix-cpu-timers, so please send a delta patch 
against tip/master:

  http://people.redhat.com/mingo/tip.git/README

i'd suggest to move all the scheduler-internal accounting functions into 
kernel/sched_stats.h. AFAICS there's no scheduler-external user of these 
APIs. The most widely used external APIs is thread_group_cputime_smp(), 
and that is out of line already. (except on UP where it's trivial)

	Ingo
--

Previous thread: Re: ConfigFS + Target Mode Engine API discussion by Nicholas A. Bellinger on Friday, September 12, 2008 - 9:24 am. (2 messages)

Next thread: Partition check considered as error is breaking mounting in 2.6.27 by Herton Ronaldo Krzesinski on Friday, September 12, 2008 - 9:56 am. (4 messages)