[resend, since LKML didn't like last send.] Randy, Thanks for the input. I will try to send a revised patch after I get back from holiday. Some comments below. Davide -- ping! Can you please offer your comments about this change, and also thoughts on Jon's and my comments about a more radical API change later in this thread. Cheers, Michael Cheers, Michael Okay -- will do for a future revision of this patch. In the meantime, the most relevant earlier mail thread is this: http://article.gmane.org/gmane.linux.kernel/559193 The following are also of some relevance: http://article.gmane.org/gmane.linux.kernel/556043 Yes, that was just a marker to myself for comments that should eventually -
IMO the complexity of the resulting API (and resulting patch), and the ABI change, is not justified by the added value. - Davide -
Neither of the proposed APIs (either my multiplexed version of timerfd() or Jon's/my idea of using three system calls (like POSIX timers), or the notion of timerfd() integrated with POSIX timers) is more complicated than the existing POSIX timers API. The ABI change doesn't really matter, since timerfd() was broken in 2.6.22 anyway. Both previous APIs provided the features I have described provide: * the ability to fetch the old timer value when applying a new setting * the ability to non-destructively fetch the amount of time remaining on a timer. This is clearly useful for timers -- but you have not explained why you think this is not necessary for timerfd timers. Please -- let's do timerfd() better. Either three syscalls like: timerfd_create() timerfd_settime() timer_gettime() (the analogs of timer_create(), timer_settime(), timer_gettime()). Or (if possible, and even better) timerfd() integrated with POSIX timers. Then we have a good API for the coming decades. I'm prepared to help out with patches (for what my help is worth ;-)). Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages/ read the HOWTOHELP file and grep the source files for 'FIXME'. -
<wakes up> I'd have thought that the existing stuff would be near-useless without the capabilities which you describe? -
Not useless, but certainly less functional than it can/should be (and with not too much effort on the kernel implementation side). Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
Useless like it'd be a motorcycle w/out a cup-holder :) Seriously, the ability to get the previous values from "something" could have a meaning if this something is a shared global resource (like signals for example). In the timerfd case this makes little sense, since you can create as many timerfd as you like and you do not need to share a single one by changing/restoring the original context. On top of that, the cup-holder addition would cost in terms of API clarity (or in terms of two additional system calls in the other case), and in terms of kernel code footprint. Costs that IMO are not balanced by the added values. - Davide -
However, one can have multipe POSIX timers, just as you can have multiple timerfd timers; nevertheless POSIX timers provide I agree my proposed API is not as clean as it could be, that's why Or better still, have timerfd() integrated with POSIX tiemrs (if this is feasible). This givesus a simple API, exactly one new syscall, and all of the functionality of the existing POSIX Not sure what your concern is here. The total amount of new code for all of these options is pretty small. Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
The fact that POSIX defined a certain API in a given way, does not automatically mean that every other API has to look exactly like that. fs/compat.c | 34 ++++++++-- fs/timerfd.c | 147 +++++++++++++++++++++++++++++++++++++++-------- include/linux/compat.h | 3 include/linux/syscalls.h | 3 4 files changed, 153 insertions(+), 34 deletions(-) And the API definition becomes pretty messy. The other way is to add new system calls. 120+ lines of code more of new system calls wouldn't even be a problem in itself, if the added value was there. IMO, as I already said, the added value does not justify them. - Davide -
Davide, As I think about this more, I see more problems with your argument. timerfd needs the ability to get and get-while-setting just as much as the earlier APIs. Consider a library that creates a timerfd file descriptor that is handed off to an application: that library may want to modify the timer settings without having to create a new file descriptor (the app mey not be able to be told about the new fd). Your argument just doesn't hold, AFAICS. Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
Such hypotethical library, in case it really wanted to offer such functionality, could simply return an handle instead of the raw fd, and take care of all that stuff in userspace. Again, mimicking POSIX APIs doesn't always take you in the right place. - Davide -
Did I miss something? Is it not the case that as soon as the library returns a handle, rather than an fd, then the whole advantage of timerfd() (being able to select/poll/epoll on POSIX may goof in places, but in general it is the result of many smart people thinking about how to design/standardize APIs. So the onus is on us to explain why they got this point wrong. And it is not merely POSIX that did things things in the way I've described: so did the earlier setitimer()/getitimer(). Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
<wakes up even more> Seems that we need to pay some more attention to this, as time is pressing and we should get this stuff finalised in 2.6.23. Which means that most of us have some catching-up to do. Michael, could you please refresh our memories with a brief, from-scratch summary of what the current interface is, followed by a summary of what you believe to be the shortcomings to be? -
Why? The handle would simply be a little struct where the timerfd fd is stored, and a XXX_getfd() would return it. So my point is, I doubt such functionalities are really needed, and I also argue that the kernel is the best place for such wrapper code to go. - Davide -
So what happens if one thread (via the library) wants modify a timer's settings at the same timer as another thread is select()ing on it? The first thread can't do this by creating a new timerfd timer, since it wants to affect the select() in the other thread? Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
It can be done w/out any problems. The select thread will be notified whenever the new timer setting expires. - Davide -
We are going in circles here. I think you are missing my point.
Consider the following
[[
Thread A: calls library function which creates a timerfd file
descriptor.
Thread B: calls select() on the timerfd file descriptor.
Thread A: calls library function which wants to:
a) modify timer settings, and retrieve copy of current timer
settings, and later
b) restore old timer settings.
]]
This seems a quite reasonable use-case to me, and the existing
interface simply can't support it.
Cheers,
Michael
--
Michael Kerrisk
maintainer of Linux man pages Sections 2, 3, 4, 5, and 7
Want to help with man page maintenance?
Grab the latest tarball at
http://www.kernel.org/pub/linux/docs/manpages ,
read the HOWTOHELP file and grep the source
files for 'FIXME'.
-
"Quite reasonable"? :)
I honestly doubt it, but anyway. Modulo error checking:
struct tfd {
int fd, clockid;
struct itimerspec ts;
};
struct tfd *tfd_create(int clockid, int flags, const struct itimerspec *ts) {
struct tfd *th;
th = malloc(sizeof(*th));
th->clockid = clockid;
th->ts = *ts;
th->fd = timerfd(-1, clockid, flags, ts);
return th;
}
void tfd_close(struct tfd *th) {
close(th->fd);
free(th);
}
int tfd_getfd(const struct tfd *th) {
return th->fd;
}
int tfd_gettime(const struct tfd *th, int *clockid, struct itimerspec *ts) {
*clockid = th->clockid;
*ts = th->ts;
return 0;
}
int tfd_settime(struct tfd *th, int clockid, int flags,
const struct itimerspec *ts) {
th->fd = timerfd(th->fd, clockid, flags, ts);
th->clockid = clockid;
th->ts = *ts;
return 0;
}
Wrap the get/set with a mutex in case you plan to shoot yourself in a foot
by doing get/set from multiple threads ;)
So, once again:
- I sincerly doubt the above is common usage/design patters for timerfds
* timerfds are not a common global resource, ala signals, that requires
get+set+restore pattern - you can have many of them set to different
times
- Those IMO *very* special use cases can be handled in userspace with few
lines of code, *if* really needed
- Davide
-
^ (By the way, I assume that back there, there was a missing "not", You are asserting this in the face of two previous APIs designed by people who (at least in the case of POSIX timers) probably I'm not sure what problem this code is supposed to solve. It This function is *not at all* equivalent to the "get" functionality of the previous APIs. The "get" functionality of POSIX timers (for example) returns a structure that contains the timer interval and the *time until the next expiration of the timer* (not the initial timer string, as your code above does). So come up with a reliable, race-free way of doing that in userspace. Then make it work for both CLOCK_MONOTONIC and CLOCK_REALTIME timers. (You'll certainly need to be making some additional system calls, by the way: at least some See my comments above in this message. You may doubt it, but No! In the example I described the library is able to create and control exactly *one* timerfd file descriptor. It wants to hand that fd back to the application and then perform arbitrary manipulations of the timer's settings. Meanwhile the application needs to (repeatedly) monitor that one file descriptor in a select()/poll()/epoll() (and so the library can't just arbitrarily You have not demonstrated this ;-). Your userspace code does not solve the problem. Best regards, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
No, I don't think I will. Since 1) it's so trivial it's not even challenging 2) you know it can be done (I assume) 3) it solves a non-case made up to justify an API/kernel-code bloating. But you don't need *my* signoff. Cook a working patch, make a case for it, and gather support and signoffs. I won't be acking it because, as I said many times, I think it's useless bloating. - Davide -
Well, it seems to me that the proposed library wrapper code could start to get rather verbose by the time one adds in Sorry -- I was probably speaking too rhetorically. In fact, it's not completely clear to me that it can (always) be done. For example, the answer depends on knowing the clockid. But what if the file descriptor has been handed off to some code that doesn't know the clock type (i.e., isn't accessing the fd via your suggested library)? Cheers, -
I think at least ability to read remaining time from a timerfd is needed. -- vda -
Andrew,
I'll break this up into parts:
1. the existing timerfd interface
2. timerfd limitations
3. possible solutions
a) Add an argument
b) Create an interface similar to POSIX timers
c) Integrate timerfd with POSIX timers
Cheers,
Michael
1: the existing timerfd interface
=================================
In 2.6.22, Davide added timerfd() with the following interface:
returned_fd = timerfd(int fd, int clockid, int flags,
struct itimerspec *utimer);
If fd is -1, a new timer is created and started. The syscall
returns a file descriptor for the timer. 'utimer' specifies
the initial expiration and interval of the timer.
'clockid' is CLOCK_REALTIME or CLOCK_REALTIME. The 'utimer'
value is relative, unless TFD_TIMER_ABSTIME is specified in
'flags', in which case the initial expiration is specified
absolutely.
If 'fd' is not -1, then the call modifies the existing timer
referred to by the file descriptor 'fd'. The 'clockid', 'flags',
and 'utimer' can all be modified. The return value is 'fd'.
The key feature of timerfd() is that the caller can use
select/poll/epoll to wait on traditional file descriptors and
one or more timers.
read() from a timerfd file descriptor (should) return a 4-byte
integer that is the number of timer expirations since the last
read. (If no expiration has so far occurred, read() will block.)
IMPORTANT POINT: as implemented in 2.6.22, timerfd was broken:
only a single byte of info was returned by read(). I regard
this as a virtue: it gives us something closer to a blank slate
for fixing the problems described below; furthermore,
arguably at this point we could buy ourselves time by
pulling timerfd() from 2.6.23, and taking more time to get
things right in 2.6.24.
(More details on timerfd() can be found here:
http://lwn.net/Articles/245533/)
2. timerfd limitations
======================
Unix has two older timer interfaces:
* setitimer/getitimer and
* POSIX timers ...Sure. If you're implementing a timeout and you want to reset it, you might indeed want to know how close the old one was to expiring. Davide's proposal sounds like an awkward workaround for missing functionality. I don't think we'll have this settled and coded in time for 2.6.23. So I think the prudent thing to do is to push this back to 2.6.24 and not offer sys_timerfd() in 2.6.23. -
In the other thread I commented that the userspace solution starts to look pretty complex, and I doubt that it can I think this would be wise. I'd like to talk with Davide some more about the possibilities. Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
Did you want a patch to remove the syscall number for now, or will you do that? Cheers, Michael -- Michael Kerrisk maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 Want to help with man page maintenance? Grab the latest tarball at http://www.kernel.org/pub/linux/docs/manpages , read the HOWTOHELP file and grep the source files for 'FIXME'. -
