A new syscall is introduced that allows tuning of a POSIX clock. The syscall is implemented for four architectures: arm, blackfin, powerpc, and x86. The new syscall, clock_adjtime, takes two parameters, a frequency adjustment in parts per billion, and a pointer to a struct timespec containing the clock offset. If the pointer is NULL, a frequency adjustment is performed. Otherwise, the clock offset is immediately corrected by skipping to the new time value. In addtion, the patch provides way to unregister a posix clock. This function is need to support posix clocks implemented as modules. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> --- arch/arm/include/asm/unistd.h | 1 + arch/arm/kernel/calls.S | 1 + arch/blackfin/include/asm/unistd.h | 3 +- arch/blackfin/mach-common/entry.S | 1 + arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/asm/unistd.h | 3 +- arch/x86/ia32/ia32entry.S | 1 + arch/x86/include/asm/unistd_32.h | 3 +- arch/x86/include/asm/unistd_64.h | 2 + arch/x86/kernel/syscall_table_32.S | 1 + include/linux/posix-timers.h | 5 ++++ include/linux/syscalls.h | 3 ++ kernel/compat.c | 20 ++++++++++++++++++ kernel/posix-cpu-timers.c | 5 ++++ kernel/posix-timers.c | 38 ++++++++++++++++++++++++++++++++++++ 15 files changed, 85 insertions(+), 3 deletions(-) diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h index dd2bf53..6bea0b7 100644 --- a/arch/arm/include/asm/unistd.h +++ b/arch/arm/include/asm/unistd.h @@ -392,6 +392,7 @@ #define __NR_rt_tgsigqueueinfo (__NR_SYSCALL_BASE+363) #define __NR_perf_event_open (__NR_SYSCALL_BASE+364) #define __NR_recvmmsg (__NR_SYSCALL_BASE+365) +#define __NR_clock_adjtime (__NR_SYSCALL_BASE+366) /* * The following SWIs are ARM private. diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S index 37ae301..8a22fdd ...
FYI, this is going to hit a conflict as i'm about to push out an update to wire up the new 2.6.36 syscalls -mike --
Thanks for the "heads up." At this point, the patch is meant just to generate discussion and feedback. Thanks, Richard --
It could be, but the logic turns out the same either way. The semantics of the call is, if 'tp' is NULL, then adjust the frequency --
It looks well-implemented, and seems to be a reasonable extension to the clock API. I'm looking forward to your ptp patches on top of this to see how it all fits together. For new syscalls, it's best to take linux-api on Cc. I also added This part should probably be a separate patch, and you need to add some form of serialization here to avoid races between the clock EOPNOTSUPP is specific to sockets, better use -EINVAL here. So we already return -EOPNOTSUPP in some cases? The man page does not document this. It would be possible to add locks here to serialize unregistration of a clock against dereferencing members of posix_clocks[], but that would cause noticable overhead. A better alternative might be to make it an RCU-protected array of pointers, and use a rcu_assign_pointer/rcu_syncronize/kfree or call_rcu sequence in unregister_posix_clock. Or you just live with not being able to unload this module. Arnd --
ENOTTY is the usual errno for "inappropriate ioctl for device". Due to the way this patch has been chopped up, I can't tell if that's what is intended here. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
It's for the CLOCK_* syscall family, which I think is different enough from an ioctl that ENOTTY makes no sense. The documented return values of timer_create() are EAGAIN, EINVAL and ENOMEM. Arnd --
EOPNOTSUPP is also called ENOTSUP in userland. ENOTSUP is the appropriate POSIX errno code for a situation such as a clock type that cannot be used in a certain call (like setting when you can only read it, etc.). Thanks, Roland --
As I mentioned in the previous mail, I agree the new functionality (adjusting the time by an offset instantaneously) is useful, but I'd prefer it be done initially within the existing adjtimex() interface. Then if the posix-time clock_id multiplexing version of adjtimex is found to be necessary, the new syscall should be introduced, using the same API (not all clock_ids need to support all the adjtimex modes, but the new interface should be sufficient for NTPd to use). There are some other conceptual issues this new syscall introduces: 1) While clock_adjtimex(CLOCK_REALTIME,...) would be equivalent to adjtimex(), would clock_adjtimex(CLOCK_MONOTONIC,...) make sense? Given CLOCK_MONOTONIC and CLOCK_REALTIME are both based off the same notion of time, but offset from each other, any adjustment to one clock would be reflected in the other. However, the API would make it seem like they could be adjusted independently. 2) The same issue in #1 exists for CLOCK_REALTIME/MONOTONIC_COARSE variants. 3) Freq steering for MONOTONIC_RAW would defeat the purpose of the clock_id. 4) Does adjustments to CPU_TIME clock_ids make sense? I'm guessing "no" is the right call to all of the above, but am interested if others see it differently. thanks -john --
Would the new syscall need to take a struct timex? If so, I think it not worth the effort of adding a syscall. Instead, You could adjust the frequency of either one. As a side effect, the other clock would also be adjusted. You can only change the time offset on CLOCK_REALTIME, and that would If I understand correctly, MONOTONIC_RAW is just access to the Don't think so. Thanks, Richard --
Personally I'd add the new clock_adjtime interface, since it parallels the gettimeofday/clock_gettime() interface levels. Trying to multiplex This in most ways makes the most sense to me, since if CLOCK_REALTIME is But yes, this is another possibly valid interpretation. I don't prefer this one, but that doesn't make it invalid. And so with the new interface, and the possibility of multiple non-synced clocks, there are Not exactly. Its abstracted out a step. MONOTONIC_RAW was added as there were other applications that were trying to get to raw hardware counters through various means. Unfortunately that caused portability issues. So CLOCK_MONOTONIC_RAW allows a constant freq nanosecond representation of a hardware counter. This is similar to what I'm hoping to find here with CLOCK_PTP. Is there a step out that makes this interface similarly abstracted out and easier to understand from a userland perspective? (This is the similar to Alan's critique that it needs to not be PTP specific). Additionally, I'm trying to make sure that having multiple unsynced clocks accessible from the same top-level interface isn't going to become a headache down the road API wise. If we abstract CLOCK_PTP out, doesn't it in effect just be CLOCK_REALTIME_BUTDIFFERENT? thanks -john --
