Re: RFC: A revised timerfd API

Previous thread: VGA text console display problem with kernel 2.6.23-rc5/6 by ben soo on Tuesday, September 18, 2007 - 3:26 am. (4 messages)

Next thread: Re: [GIT PATCH] USB autosuspend fixes for 2.6.23-rc6 by Hans de Goede on Monday, September 17, 2007 - 8:56 am. (1 message)
To: Davide Libenzi <davidel@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>
Date: Tuesday, September 18, 2007 - 3:27 am

After my earlier mail (full thread here:
http://thread.gmane.org/gmane.linux.kernel/574430/focus=579368 )
it seems that some people agree we should give a bit more
thought to how a final timerfd interface should look.

Davide's original API and the limitations that I see in it,
are described in the message cited above. In that message,
I proposed three alternatives (and I'm going to add a fourth
now) that provide "get-while-set" and "non-destructive-get"
functionality. Before trying to implement anything I'd like to
get input on the various possible designs.

The four designs are:

a) A multiplexing timerfd() system call.
b) Creating three syscalls analogous to the POSIX timers API (i.e.,
timerfd_create/timerfd_settime/timerfd_gettime).
c) Creating a simplified timerfd() system call that is integrated
with the POSIX timers API.
d) Extending the POSIX timers API to support the timerfd concept.

My order of preference for the implementations is currently
something like (in descending order): d, b, c, a.

The details follow:

====> a) Add an argument (a multiplexing timerfd() system call)

In an earlier mail (http://thread.gmane.org/gmane.linux.kernel/559193 ),
I proposed adding a further argument to timerfd(): old_utmr, which could
be used to return the time remaining until expiry for an existing timer.
The proposed semantics would allow get and get-while-setting
functionality.

Advantages:
1. Provides the desired get and get-while-setting functionality.
2. It's a simple interface (a single system call).

Disadvantage:
Jon Corbet pointed out
(http://thread.gmane.org/gmane.linux.kernel/559193/focus=570709 )
that this interface was starting to look like a multiplexing syscall,
because there is no case where all of the arguments are used (see
the use-case descriptions in the earlier mail).

I'm inclined to agree with Jon; therefore one of the remaining
solutions may be preferable.

====> b) Create a timerfd interface analogous to POSIX timers

Crea...

To: Michael Kerrisk <mtk-manpages@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>
Date: Tuesday, September 18, 2007 - 12:51 pm

If you really want to shoot yourself in your foot, I'd pick bullet B.
Bullet A makes me sea-sick, and bullets C and D, well, let's leave POSIX
APIs being *POSIX* APIs.
Once you remove all the "ifs" and "elses" that resulted from your previous
bullet A multiplexing implementation, timerfd_gettime and timerfd_settime
should result in being pretty slick.
I still think we could have survived w/out all this done inside the
kernel though.

- Davide

-

To: Davide Libenzi <davidel@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, David Härdeman <david@...>
Date: Saturday, September 22, 2007 - 9:12 am

Davide, Andrew, Linus, et al.

At the start of this thread
(http://thread.gmane.org/gmane.linux.kernel/581115 ), I proposed 4
alternatives to Davide's original timerfd API. Based on the feedback in
that thread (and one or two earlier comments):

Let's dismiss option (a), since it is an unlovely multiplexing interface.

Option (b) seems viable. The most notable concern was from Thomas
Gleixner, that we might end up duplicating code from the POSIX timers API
within the timerfd API -- some eventual refactoring might mitigate this
problem.

Option (c) seems overly complex. In addition, David Härdeman pointed out
that option (c) (and, I realised afterwards, option (d)) require the
userland programmer to maintain a mapping between timerfd file descriptors
and POSIX timer IDs. Thomas Gleixner proposed an API that: attempts to
avoid that problem; mixes features of options (c) and (d); and probably
helps avoid redundancy of kernel code between the timerfd system and the
POSIX timers system. I'll flesh out that API now as I understand it:

====> e) Integrate timerfd() with the POSIX timers API in such a way that
the POSIX timers API understands timerfd file descriptors.

Under the POSIX timers API, a new timer is created using:

int timer_create(clockid_t clockid, struct sigevent *evp,
timer_t *timerid);

When making this call, we would set evp->sigev_notify to a new flag
value SIGEV_TIMERFD, to inform the system that this timer will deliver
notification via a timerfd file descriptor.

We would then have a timerfd() call that returns a file descriptor
for the newly created 'timerid':

fd = timerfd(timer_t timerid);

(A variant here would be to have timer_create() directly return a file
descriptor when SIGEV_TIMERFD is specified, although this breaks the
traditional semantics that timer_create() only returns 0 on success.)

We could then use the POSIX timers API to operate on the timer
(start it / modify it / fetch timer value):

int timer_settime(timer_t t...

To: Michael Kerrisk <mtk-manpages@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <david@...>
Date: Saturday, September 22, 2007 - 5:07 pm

I guess I better do, otherwise you'll continue to stress me ;)

int timerfd_create(int clockid);
int timerfd_settime(int ufd, int flags,
const struct itimerspec *utmr,
struct itimerspec *otmr);
int timerfd_gettime(int ufd, struct itimerspec *otmr);

Patch below. Builds, not tested yet (you need to remove the "broken"
status from CONFIG_TIMERFD in case you want to test - and plug the new
syscall to arch/xxx).
May that work for you?
Thomas-san, hrtimer_try_to_cancel() does not touch ->expires and I assume
it'll never do, granted?

- Davide

---
fs/compat.c | 32 ++++++++--
fs/timerfd.c | 144 +++++++++++++++++++++++++++++------------------
include/linux/compat.h | 7 +-
include/linux/syscalls.h | 7 +-
4 files changed, 126 insertions(+), 64 deletions(-)

Index: linux-2.6.mod/fs/timerfd.c
===================================================================
--- linux-2.6.mod.orig/fs/timerfd.c 2007-09-22 12:22:19.000000000 -0700
+++ linux-2.6.mod/fs/timerfd.c 2007-09-22 13:21:21.000000000 -0700
@@ -23,6 +23,7 @@

struct timerfd_ctx {
struct hrtimer tmr;
+ int clockid;
ktime_t tintv;
wait_queue_head_t wqh;
int expired;
@@ -46,7 +47,7 @@
return HRTIMER_NORESTART;
}

-static void timerfd_setup(struct timerfd_ctx *ctx, int clockid, int flags,
+static void timerfd_setup(struct timerfd_ctx *ctx, int flags,
const struct itimerspec *ktmr)
{
enum hrtimer_mode htmode;
@@ -58,7 +59,7 @@
texp = timespec_to_ktime(ktmr->it_value);
ctx->expired = 0;
ctx->tintv = timespec_to_ktime(ktmr->it_interval);
- hrtimer_init(&ctx->tmr, clockid, htmode);
+ hrtimer_init(&ctx->tmr, ctx->clockid, htmode);
ctx->tmr.expires = texp;
ctx->tmr.function = timerfd_tmrproc;
if (texp.tv64 != 0)
@@ -150,76 +151,109 @@
.read = timerfd_read,
};

-asmlinkage long sys_timerfd(int ufd, int clockid, int flags,
- const struct itimerspec ...

To: Davide Libenzi <davidel@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, David Härdeman <david@...>
Date: Sunday, September 23, 2007 - 1:33 pm

Hi Davide,

I applied this patch against 2.6.23-rc7, and wired up the syscalls as shown
in the definitions below. When I ran the program below, my system
immediately froze. Can you try it on your system, please?

Cheers,

Michael

/* Link with -lrt */

#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>
#include <time.h>
#if defined(__i386__)
#define __NR_timerfd_create 325
#define __NR_timerfd_settime 326
#define __NR_timerfd_gettime 327
#endif

static int
timerfd_create(int clockid)
{
return syscall(__NR_timerfd_create, clockid);
}

static int
timerfd_settime(int ufd, int flags, struct itimerspec *utmr,
struct itimerspec *outmr)
{
return syscall(__NR_timerfd_settime, ufd, flags, utmr, outmr);
}

static int
timerfd_gettime(int ufd, struct itimerspec *outmr)
{
return syscall(__NR_timerfd_gettime, ufd, outmr);
}

/*
static int
timerfd(int ufd, int clockid, int flags, struct itimerspec *utmr,
struct itimerspec *outmr)
{
return syscall(__NR_timerfd, ufd, clockid, flags, utmr, outmr);
}

*/

/*
*/
#define TFD_TIMER_ABSTIME (1 << 0)

////////////////////////////////////////////////////////////

// #include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h> /* Definition of uint32_t */

#define die(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)

static void
print_elapsed_time(void)
{
static struct timespec start;
struct timespec curr;
static int first_call = 1;
int secs, nsecs;

if (first_call) {
first_call = 0;
if (clock_gettime...

To: Michael Kerrisk <mtk-manpages@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <david@...>
Date: Sunday, September 23, 2007 - 2:33 pm

There's an hrtimer_init() missing in timerfd_create(). I'll refactor the
patch.

- Davide

-

To: Michael Kerrisk <mtk-manpages@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <david@...>
Date: Sunday, September 23, 2007 - 2:41 pm

There's the case of a timerfd_gettime return status when the timerfd has
not been set yet (ie, soon after a timerfd_create), to handle.
Current way is to return an (itimerspec) { 0, 0 }. Ok?

- Davide

-

To: Davide Libenzi <davidel@...>
Cc: Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <david@...>
Date: Sunday, September 23, 2007 - 3:03 pm

Seems reasonable. In the analogous situation, the POSIX timers API
returns a structure containing all zeros, at least on Linux.
-

To: Davide Libenzi <davidel@...>
Cc: Michael Kerrisk <mtk-manpages@...>, Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, David <david@...>
Date: Saturday, September 22, 2007 - 5:26 pm

Davide-san, I have no intention to change that, but remember there is
this file "Documentation/stable_api_nonsense.txt" :)

tglx

-

To: Thomas Gleixner <tglx@...>
Cc: Michael Kerrisk <mtk-manpages@...>, Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <david@...>
Date: Saturday, September 22, 2007 - 7:21 pm

Heh, I guess that'll work then ;)

- Davide

-

To: Michael Kerrisk <mtk-manpages@...>
Cc: Davide Libenzi <davidel@...>, Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, David <david@...>
Date: Saturday, September 22, 2007 - 1:10 pm

Michael,

It should be possible to use the timerfd syscalls as wrappers for the
posix timer implementation and add the discussed SIGEV_TIMERFD only
internally in the kernel to signal the posix timer code new delivery
mechanism.

tglx

-

To: <linux-kernel@...>
Date: Saturday, September 22, 2007 - 10:32 am

Maybe it is possible to reimplement the POSIX API in usermode using the
kernel's FD implementation? (and drop the posix support from kernel)

Gruss
Bernd
-

To: Bernd Eckenfels <ecki@...>
Cc: <linux-kernel@...>, Ulrich Drepper <drepper@...>, <geoff@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, David Härdeman <david@...>, <mtk-manpages@...>
Date: Saturday, September 22, 2007 - 12:07 pm

Hello Bernd,

Please don't trim the CC list when replying! I nearly did not see
your reply, and others will have missed it also.

It's a clever idea... Without thinking on it too long, I'm not sure
whether or not there might be some details which would make this

However we couldn't drop POSIX support from the kernel, because that
would break the ABI.

Cheers,

Michael
-

To: Michael Kerrisk <mtk.linux.lists@...>
Cc: Bernd Eckenfels <ecki@...>, <linux-kernel@...>, Ulrich Drepper <drepper@...>, <geoff@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <mtk-manpages@...>
Date: Saturday, September 22, 2007 - 7:37 pm

It seems to be a dangerous idea. It has the potential of breaking
userspace applications that rely on POSIX timers not creating fd's.

Imagine code like this:

/* Close stdin, stdout, stderr */
close(0);
close(1);
close(2);

/* Oh, a timer would be nice */
timer_create(x, y, z);

/* Create new stdin, stdout, stderr */
fd = open("/dev/null", flags);
dup(fd);
dup(fd);

Unless timer_create does some magic to avoid using the lowest available
fd, this would suddenly break as the timerfd would be fd 0.

--
David Härdeman
-

To: Michael Kerrisk <mtk.linux.lists@...>
Cc: Bernd Eckenfels <ecki@...>, <linux-kernel@...>, Ulrich Drepper <drepper@...>, <geoff@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, David <david@...>, <mtk-manpages@...>
Date: Saturday, September 22, 2007 - 1:05 pm

You'd need to be quite masochistic to start such a project. The POSIX timer
API consists mostly of corner cases and I doubt that you get them even
halfway under control in a pure user space implementation.

It would be a rather huge performance penalty as well. You need at least

True. So there is no point in reinventing the wheel.

tglx

-

To: Michael Kerrisk <mtk-manpages@...>
Cc: Davide Libenzi <davidel@...>, Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>
Date: Tuesday, September 18, 2007 - 3:30 am

[Resend, because I got one email address wrong in the earlier send]

To: Michael Kerrisk <mtk-manpages@...>
Cc: Davide Libenzi <davidel@...>, Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <Lee.Schermerhorn@...>
Date: Tuesday, September 18, 2007 - 5:10 am

Michael,

I'm not scared by the 3 system calls. I rather fear that we end up

The main problem here is, that there is no way to tell the posix timer
code that the delivery of the timer is through the file descriptor and
not via the usual posix timer mechanisms. We need something like the

What happens on close(fd) ? Is the posix timer automatically destroyed ?
Is the file descriptor invalidated when the timer is destroyed via
timer_delete(timer_id) ? The automatic file descriptor creation is a bit
ugly.

I'd rather see a combination of c) and d) as a solution:

Notify the posix timer code that the timer delivery is done via the file
descriptor mechanism (SIGEV_TIMERFD).

Use a new syscall to open a file descriptor on that timer.

When the file descriptor is closed the timer is not destroyed, but
delivery disabled (analogous to the SIGEV_NONE case), so you can reopen
and reactivate it later on.

This way we have it nicely integrated into the posix timer code and keep
the existing semantics of posix timers intact.

We need to think about the open file descriptor in the timer_delete()
case as well, but this should be not too hard to sort out.

tglx

-

To: Thomas Gleixner <tglx@...>
Cc: <Lee.Schermerhorn@...>, <torvalds@...>, <vda.linux@...>, <rdunlap@...>, <corbet@...>, <hch@...>, <akpm@...>, <linux-kernel@...>, <geoff@...>, <drepper@...>, <davidel@...>, David Härdeman <david@...>
Date: Tuesday, September 18, 2007 - 5:30 am

Fair enough. I mainly tried to do things that way to minimize

Yes. Perhaps some refactoring might be required, if we went

Well, I left it kind of open whether the expiration
notification might be delivered via both the traditional
mechanism, and via the timerfd. But I realize that all

This seems like a workable idea also. But note David Härdeman's
critique of options c & d: the existence of a coupled timerfd
and a timerid means that the application must maintain a mapping
between the two, so that after an epoll call (for example) that
says the timerfd is ready, the timer can be manipulated using
the corresponding timerid. This isn't IMO a fatal flaw, but
it does make the API a little more clumsy.

Cheers,

Michael
--
Michael Kerrisk
maintainer of Linux man pages Sections 2, 3, 4, 5, and 7

Want to help with man page maintenance?
Grab the latest tarball at
http://www.kernel.org/pub/linux/docs/manpages ,
read the HOWTOHELP file and grep the source
files for 'FIXME'.

-

To: Michael Kerrisk <mtk-manpages@...>
Cc: <Lee.Schermerhorn@...>, <torvalds@...>, <vda.linux@...>, <rdunlap@...>, <corbet@...>, <hch@...>, <akpm@...>, <linux-kernel@...>, <geoff@...>, <drepper@...>, <davidel@...>, David <david@...>
Date: Tuesday, September 18, 2007 - 5:42 am

Hmm, we might do something like:

timer_gettime(fd | POSIX_TIMER_FD, .....);

So the kernel looks up the fd in order to figure out the timer_id, which
needs to be referenced in filep->private_data anyway.

tglx

-

To: Thomas Gleixner <tglx@...>
Cc: <david@...>, <davidel@...>, <drepper@...>, <geoff@...>, <linux-kernel@...>, <akpm@...>, <hch@...>, <corbet@...>, <rdunlap@...>, <vda.linux@...>, <torvalds@...>, <Lee.Schermerhorn@...>
Date: Tuesday, September 18, 2007 - 7:08 am

And you'd need similar for timer_settime() and, perhaps,
timer_getoverrun(). But it seems slightly ugly, in the same way that
my idea in option (d) of returning a file descriptor from
timer_create() seems slightly ugly. (And can we guarantee that
the [timerid] space is distinct from the [fd|POSIX_TIMER_FD] space?)

Cheers,

Michael

-

To: Michael Kerrisk <mtk-manpages@...>
Cc: <david@...>, <davidel@...>, <drepper@...>, <geoff@...>, <linux-kernel@...>, <akpm@...>, <hch@...>, <corbet@...>, <rdunlap@...>, <vda.linux@...>, <torvalds@...>, <Lee.Schermerhorn@...>
Date: Tuesday, September 18, 2007 - 7:30 am

If we use the most significant bit for POSIX_TIMER_FD, we should be
fine.
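
As a rough illustration of the tagging idea, the sketch below reserves the
most significant bit of an unsigned int as the marker. POSIX_TIMER_FD has
no defined value anywhere in this thread, so the value here is purely
hypothetical, as are the helper names. The scheme only works under the
assumption that ordinary timer IDs never have that bit set.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical marker: the most significant bit of an unsigned int. */
#define POSIX_TIMER_FD (1u << (sizeof(unsigned int) * CHAR_BIT - 1))

/* Combine a file descriptor with the marker bit. */
static unsigned int tag_fd(int fd)
{
    assert(fd >= 0);    /* valid descriptors never use the top bit */
    return (unsigned int)fd | POSIX_TIMER_FD;
}

/* Nonzero if 'v' carries the marker, i.e. it names a descriptor rather
 * than a plain timer ID. */
static int is_tagged_fd(unsigned int v)
{
    return (v & POSIX_TIMER_FD) != 0;
}

/* Strip the marker and recover the descriptor number. */
static int untag_fd(unsigned int v)
{
    return (int)(v & ~POSIX_TIMER_FD);
}
```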

tglx

-

To: Thomas Gleixner <tglx@...>
Cc: Michael Kerrisk <mtk-manpages@...>, <davidel@...>, <drepper@...>, <geoff@...>, <linux-kernel@...>, <akpm@...>, <hch@...>, <corbet@...>, <rdunlap@...>, <vda.linux@...>, <torvalds@...>, <lee.schermerhorn@...>
Date: Tuesday, September 18, 2007 - 9:13 am

I think alternative b) - three new syscalls, sounds better.

The only negatives so far are that it adds more syscalls and that it might
require code duplication with posix timers. The syscall numbers argument
seemed not to be very important and the code duplication should be fixable
by refactoring the code so that more is shared between the two systems (I
assume).

Overloading file descriptors with flags looks ugly. Is there any other
syscall which does that?

--
David Härdeman
(sorry Thomas for the dupe, I missed replying to all on the first msg).

-

To: David Härdeman <david@...>
Cc: Thomas Gleixner <tglx@...>, <davidel@...>, <drepper@...>, <geoff@...>, <linux-kernel@...>, <akpm@...>, <hch@...>, <corbet@...>, <rdunlap@...>, <vda.linux@...>, <torvalds@...>, <lee.schermerhorn@...>
Date: Saturday, September 22, 2007 - 9:03 am

AFAIK there is no other syscall that does that. I agree that it's not very
pretty.

Cheers,

Michael

-

To: Michael Kerrisk <mtk-manpages@...>
Cc: Davide Libenzi <davidel@...>, Ulrich Drepper <drepper@...>, <geoff@...>, lkml <linux-kernel@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, Christoph Hellwig <hch@...>, Jonathan Corbet <corbet@...>, Randy Dunlap <rdunlap@...>, <vda.linux@...>, Linus Torvalds <torvalds@...>, Lee Schermerhorn <lee.schermerhorn@...>
Date: Tuesday, September 18, 2007 - 4:05 am

Wouldn't this remove some of the usefulness of the timerfd?

For example, if a timerfd is one of the fd's that is returned by a
epoll_wait syscall, you manually need to do the mapping between the
timerfd and the timerid in order to be able to modify the timer.

The advantage of solution b) above is that the fd is everything that is
needed to work with the timer. With solution c) you have to keep two
references to the same timer around and use one of them depending on what
you want to do with the timer.

Also, if the timerfd is close():d, does that remove the underlying timer
(invalidate the timerid) as well?

--
David Härdeman

-

To: "David Härdeman" <david@...>
Cc: <lee.schermerhorn@...>, <torvalds@...>, <vda.linux@...>, <rdunlap@...>, <corbet@...>, <hch@...>, <tglx@...>, <akpm@...>, <linux-kernel@...>, <geoff@...>, <drepper@...>, <davidel@...>
Date: Tuesday, September 18, 2007 - 5:01 am

Hello David,

You're right, that makes the interface more clumsy. +1 for
the disadvantages. And of course solution (d) also suffers

Yes, true. Solution (b) would also be relatively easier (for me)

My gut feeling would be to say that closing the timerfd would not
remove the underlying timer (so the timerid would remain valid).
One could even do things like recreating a file descriptor
for the timer using another timerfd() call.

But now that raises the question: what are the semantics if
timerfd() is called more than once on the same timerid?
Perhaps a read() from any one of them (destructively)
reads the expiration count, as though one had read from a
dup()ed file descriptor. All in all, solution (c)
starts to look overly complex, and maybe suffers from
various dirty corners in the API. (Solution (d) feels
slightly better, because the creation of the file descriptor
and the timerid is integrated into a single call, and because
it integrates with an existing API, but
it still has the limitation you describe above.)

Cheers,

Michael
-

To: Michael Kerrisk <mtk-manpages@...>
Cc: "David Härdeman" <david@...>, <lee.schermerhorn@...>, <torvalds@...>, <vda.linux@...>, <rdunlap@...>, <corbet@...>, <hch@...>, <akpm@...>, <linux-kernel@...>, <geoff@...>, <drepper@...>, <davidel@...>
Date: Tuesday, September 18, 2007 - 5:27 am

I don't think it is a big problem to have several open file descriptors
on a single posix timer without having destructive reads, we just need
to store the event count per file descriptor in file->private_data. We
solved this in the UIO code already and it works perfectly fine.

tglx

-
