Re: [Bugme-new] [Bug 9906] New: Weird hang with NPTL and SIGPROF.

To: Roland McGrath <roland@...>
Cc: <parag.warudkar@...>, Alejandro Riveira <ariveira@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, Ingo Molnar <mingo@...>, Thomas Gleixner <tglx@...>, Jakub Jelinek <jakub@...>
Date: Tuesday, March 4, 2008 - 3:52 pm

Put this on the patch but I'm emailing it as well.

On Mon, 2008-03-03 at 23:00 -0800, Roland McGrath wrote:

You're quite welcome.


Well, the iron is getting bigger, too, so it's beginning to be feasible
to run _lots_ of threads.


My first patch did essentially what you outlined above, incrementing
shared utime/stime/sched_time fields, except that they were in the
task_struct of the group leader rather than in the signal_struct.  It's
not clear to me exactly how the signal_struct is shared, whether it is
shared among all threads or if each has its own version.

So each timer routine had something like:

	/* If we're part of a thread group, add our time to the leader. */
	if (p->group_leader != NULL)
		p->group_leader->threads_sched_time += tmp;

and check_process_timers() had

	/* Times for the whole thread group are held by the group leader. */
	utime = cputime_add(utime, tsk->group_leader->threads_utime);
	stime = cputime_add(stime, tsk->group_leader->threads_stime);
	sched_time += tsk->group_leader->threads_sched_time;

Of course, this alone is insufficient.  It speeds things up a tiny bit
but not nearly enough.
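
A minimal user-space sketch of the shared-field idea, not the kernel patch itself. The struct task type, next_thread link, and all helper names here are hypothetical stand-ins; only group_leader and the threads_* fields come from the fragments above. It is single-threaded, so it ignores the locking that real tick-time accounting would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for task_struct; only the fields discussed
 * above are modeled. */
struct task {
	struct task *group_leader;	/* the leader points to itself */
	struct task *next_thread;	/* simple list of group members */
	uint64_t utime;			/* this thread's own user time */
	uint64_t threads_utime;		/* group total, used in the leader only */
};

/* Tick-time accounting: charge the thread and the leader's shared total. */
static void account_user_tick(struct task *p, uint64_t delta)
{
	p->utime += delta;
	if (p->group_leader != NULL)
		p->group_leader->threads_utime += delta;
}

/* Per-check walk over every thread: cost grows with the thread count. */
static uint64_t group_utime_by_walk(const struct task *leader)
{
	uint64_t sum = 0;

	for (const struct task *t = leader; t != NULL; t = t->next_thread)
		sum += t->utime;
	return sum;
}

/* Shared-field read: one load, independent of the thread count. */
static uint64_t group_utime_shared(const struct task *tsk)
{
	return tsk->group_leader->threads_utime;
}

/* Builds a three-thread group, charges some ticks, and returns 1 if
 * both methods report the same group total. */
static int demo_totals_agree(void)
{
	struct task t[3] = {{0}};

	for (int i = 0; i < 3; i++) {
		t[i].group_leader = &t[0];
		t[i].next_thread = (i < 2) ? &t[i + 1] : NULL;
	}
	account_user_tick(&t[0], 5);
	account_user_tick(&t[1], 7);
	account_user_tick(&t[2], 11);
	return group_utime_by_walk(&t[0]) == 23 &&
	       group_utime_shared(&t[2]) == 23;
}
```

The shared read replaces only the summing walk; it does nothing about the timer-list and rebalancing work, which is why this change alone helps so little.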

The other issue has to do with the rest of the processing in
run_posix_cpu_timers(), walking the timer lists and walking the whole
thread group (again) to rebalance expiry times.  My second patch moved
all that work to a workqueue, but only if there were more than 100
threads in the process.  This basically papered over the problem by
moving the processing out of interrupt context and into a kernel thread.  It's
still insufficient, though, because it takes just as long and will get
backed up just as badly on large numbers of threads.  This was made
clear in a test I ran yesterday where I generated some 200,000 threads.
The work queue was unreasonably large, as you might expect.
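
Why deferral alone cannot keep up can be shown with a toy model. This is a sketch under assumed numbers, not a measurement: the function names, the per-tick cost of one unit per thread, and the worker capacity are all hypothetical. Only the 100-thread threshold comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define DEFER_THRESHOLD 100	/* defer only above this many threads */

/* Decide whether timer processing runs inline or is handed to a worker. */
static int should_defer(unsigned int nthreads)
{
	return nthreads > DEFER_THRESHOLD;
}

/* Toy model: each tick queues one item costing `nthreads` units (one
 * per thread to walk), and the worker retires at most `capacity` units
 * per tick.  Returns the units still pending after `ticks` ticks. */
static uint64_t backlog_after(unsigned int nthreads, uint64_t capacity,
			      unsigned int ticks)
{
	uint64_t pending = 0;

	for (unsigned int i = 0; i < ticks; i++) {
		pending += nthreads;
		pending -= (pending < capacity) ? pending : capacity;
	}
	return pending;
}
```

In this model the backlog stays at zero while the per-tick cost fits within the worker's capacity and grows linearly once it does not. Moving the work to a kernel thread changes where the time is spent, not how much of it is needed.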

I am looking for a way to do everything that needs to be done in fewer
operations, but unfortunately I'm not familiar enough with the
SIGPROF/SIGVTALRM semantics or with the details of the Linux
implementation to know where it is safe to consolidate things.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.

