Re: posix-cpu-timers revamp

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Roland McGrath
Date: Sunday, March 30, 2008 - 10:44 pm

> Well, if it's acceptable, for a first cut (and the patch I'll submit),

That's fine.


For clarity, please never mention that identifier without indicating which
struct you're talking about.  signal_struct.it_*_expires has never been
overloaded.  signal_struct.it_prof_expires is the ITIMER_PROF setting;
signal_struct.it_virt_expires is the ITIMER_VIRTUAL setting; there is no
signal_struct.it_sched_expires field.  task_struct.it_*_expires has never
been overloaded.  task_struct.it_prof_expires is the next value of
(task_struct.utime + task_struct.stime) at which run_posix_cpu_timers()
needs to check for work to do.


It is not "rolled into the itimer handling".  
run_posix_cpu_timers() handles three separate features:

1. ITIMER_VIRTUAL, ITIMER_PROF itimers (setitimer/getitimer)
2. POSIX timers CPU timers (timer_* calls)
3. CPU time rlimits (RLIMIT_CPU for process-wide, RLIMIT_RTTIME for each thread)

The poorly-named task_struct.it_*_expires fields serve a single purpose:
to optimize run_posix_cpu_timers().  task_struct.it_prof_expires is the
minimum of the values at which any of those three features need attention.


I don't see the point of adding this field at all.  It serves solely to
cache the secs_to_cputime calculation on the RLIMIT_CPU rlim_cur value,
which is just a division.


This does not make sense.  There is no need for any new state at all in
the UP case, just reorganizing what is already there.

The existing signal_struct fields utime, stime, and sum_sched_runtime are
no longer needed.  These accumulate the times of dead threads in the group
(see __exit_signal in exit.c) solely so cpu_clock_sample_group can add
them in.  Keeping both those old fields and the dynamically allocated
per-CPU counters is wrong.  You will count double all the threads that
died since the struct was allocated (i.e. since the first timer was set).

For the UP case, just replace these with a single struct thread_group_cputime
in signal_struct, and increment it directly on every tick.  __exit_signal
never touches it.

For the SMP case, you need a bit of complication.  When there are no
expirations (none of the three features in use on a process-wide CPU
clock) or only one live thread, then you don't need to allocate the
per-CPU counters.  But you need one or the other kind of state as soon
as a thread dies while others live, or there are multiple threads while
any process-wide expiration is set.  There are several options for how
to reconcile the dead-threads tracking with the started-on-demand
per-CPU counters.

You did not follow what I had in mind for abstracting the code.
Here is what I think will make it easiest to work through all these
angles without rewriting much of the code for each variant.

Define:

	struct task_cputime {
		cputime_t utime;
		cputime_t stime;
		unsigned long long sched_runtime;
	};

and then a second type to use in signal_struct.  
You can clean up task_struct by replacing its it_*_expires fields with one:
	
	struct task_cputime cputime_expires;

(That is, overload cputime_expires.stime for the utime+stime expiration
time.  Even with that kludge, I think it's cleaner to use this struct
for all these places.)

In signal_struct there are no conditionals, it's just:

	struct thread_group_cputime cputime;

The variants provide struct thread_group_cputime and the functions to go
with it.  However many sorts we need can be different big #if blocks
keeping all the related code together.

The UP version just does:

	struct thread_group_cputime {
		struct task_cputime totals;
	};

	static inline void thread_group_cputime(struct signal_struct *sig,
						struct task_cputime *cputime)
	{
		*cputime = sig->cputime;
	}

	static inline void account_group_user_time(struct task_struct *task,
						   cputime_t cputime)
	{
		struct task_cputime *times = &task->signal->cputime.totals;
		times->utime = cputime_add(times->utime, cputime);
	}

	static inline void account_group_system_time(struct task_struct *task,
						     cputime_t cputime)
	{
		struct task_cputime *times = &task->signal->cputime.totals;
		times->stime = cputime_add(times->stime, cputime);
	}

	static inline void account_group_exec_runtime(struct task_struct *task,
						      unsigned long long ns)
	{
		struct task_cputime *times = &task->signal->cputime.totals;
		times->sched_runtime += ns;
	}

Finally, you could consider adding another field to signal_struct:

	struct task_cputime cputime_expires;

This would be a cache, for each of the three process CPU clocks, of the
earliest expiration from any of the three features.  Each of setitimer,
timer_settime, setrlimit, and implicit timer/itimer reloading, would
recalculate the minimum of the head of the cpu_timers list, the itimer
(it_*_expires), and the rlimit.  The reason to do this is that the
common case in run_posix_cpu_timers() stays almost as cheap as it is now.
It also makes a nice parallel with the per-thread expiration cache.
i.e.:

	static int task_cputime_expired(const struct task_cputime *sample,
					const struct task_cputime *expires)
	{
		if (!cputime_eq(expires->utime, cputime_zero) &&
		    cputime_ge(sample->utime, expires->utime))
			return 1;
		if (!cputime_eq(expires->stime, cputime_zero) &&
		    cputime_ge(cputime_add(sample->utime, sample->stime),
			       expires->stime))
			return 1;
		if (expires->sched_runtime != 0 &&
		    sample->sched_runtime >= expires->sched_runtime)
			return 1;
		return 0;
	}

	...

		struct signal_struct *sig = task->signal;
		struct task_cputime task_sample = {
			.utime = task->utime,
			.stime = task->stime,
			.sched_runtime = task->se.sum_exec_runtime
		};
		struct task_cputime group_sample;
		thread_group_cputime(sig, &group_sample);

		if (!task_cputime_expired(&task_sample,
					  &task->cputime_expired) &&
		    !task_cputime_expired(&group_sample,
					  &sig->cputime_expired))
			return 0;

	...



Thanks,
Roland
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
Re: [Bugme-new] [Bug 9906] New: Weird hang with NPTL and S ..., Alejandro Riveira , (Thu Feb 7, 8:22 am)
Re: [Bugme-new] [Bug 9906] New: Weird hang with NPTL and S ..., Alejandro Riveira , (Thu Feb 7, 8:54 am)
posix-cpu-timers revamp, Roland McGrath, (Tue Mar 11, 12:50 am)
Re: posix-cpu-timers revamp, Frank Mayhar, (Tue Mar 11, 2:05 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Tue Mar 11, 2:35 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Thu Mar 13, 5:37 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Fri Mar 21, 12:18 am)
Re: posix-cpu-timers revamp, Frank Mayhar, (Fri Mar 21, 10:57 am)
Re: posix-cpu-timers revamp, Frank Mayhar, (Fri Mar 21, 1:40 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Sat Mar 22, 2:58 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Mon Mar 24, 10:34 am)
Re: posix-cpu-timers revamp, Frank Mayhar, (Mon Mar 24, 3:43 pm)
[PATCH 2.6.25-rc6] Fix itimer/many thread hang., Frank Mayhar, (Thu Mar 27, 5:52 pm)
Re: [PATCH 2.6.25-rc6] Fix itimer/many thread hang., Ingo Molnar, (Fri Mar 28, 3:28 am)
Re: posix-cpu-timers revamp, Roland McGrath, (Sun Mar 30, 10:44 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Mon Mar 31, 1:24 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Tue Apr 1, 7:07 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Wed Apr 2, 9:34 am)
Re: posix-cpu-timers revamp, Frank Mayhar, (Wed Apr 2, 10:42 am)
Re: posix-cpu-timers revamp, Frank Mayhar, (Wed Apr 2, 11:42 am)
Re: posix-cpu-timers revamp, Roland McGrath, (Wed Apr 2, 12:48 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Wed Apr 2, 1:34 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Wed Apr 2, 2:42 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Thu Apr 3, 5:53 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Fri Apr 4, 4:17 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Sat Apr 5, 10:26 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Mon Apr 7, 1:08 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Mon Apr 7, 2:31 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Mon Apr 7, 3:02 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Tue Apr 8, 2:27 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Tue Apr 8, 2:52 pm)
Re: posix-cpu-timers revamp, Roland McGrath, (Tue Apr 8, 3:49 pm)
Re: posix-cpu-timers revamp, Frank Mayhar, (Wed Apr 9, 9:29 am)