Based upon high resolution timer and idea borrowed from semaphore,
nano semaphore is created.
Nano semaphore provides finer time resolution depending on system
configuration and capabilities.
Nano semaphore is not to replace semaphore, but used in application
environments where nano seconds are required.
Three methods, nano_semaphore_try_down, nano_semaphore_down and
nano_semaphore_up are implemented in a header file, and there is no
corresponding C file since nano semaphore is not complex.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
---
--- a/include/linux/nano_semaphore.h 1970-01-01 08:00:00.000000000 +0800
+++ b/include/linux/nano_semaphore.h 2010-12-26 12:38:36.000000000 +0800
@@ -0,0 +1,263 @@
+/*
+ * linux/nano_semaphore.h
+ *
+ * Definition and implementation of nano semaphore
+ *
+ * Nano semaphore provides finer time resolution depending on system
+ * configuration and capabilities.
+ *
+ * Nano semaphore could be used in parallel with semaphore.
+ *
+ * Started-by: Hillf Danton
+ *
+ * Credits:
+ * ideas are borrowed from semaphore and high resolution timer
+ *
+ * Distributed under the terms of GPL v2
+ */
+
+#ifndef __NANO_SEMAPHORE_H_
+#define __NANO_SEMAPHORE_H_
+
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/ktime.h>
+#include <linux/sched.h>
+#include <linux/hrtimer.h>
+
+/* must be initialized before use */
+struct nano_semaphore {
+ struct list_head chair;
+ struct task_struct *holder;
+ spinlock_t lock;
+};
+
+#define NANO_SEMAPHORE_INITIALIZER(name) \
+{ \
+ .chair = LIST_HEAD_INIT((name).chair), \
+ .holder = NULL, \
+ .lock = __SPIN_LOCK_UNLOCKED((name).lock), \
+}
+
+
+#define DEFINE_NANO_SEMAPHORE(name) \
+ struct nano_semaphore name = NANO_SEMAPHORE_INITIALIZER(name)
+
+
+static inline void nano_semaphore_init(struct nano_semaphore *s)
+{
+ INIT_LIST_HEAD(&s->chair);
+ s->holder = NULL;
+ spin_lock_init(&s->lock);
+}
+
+/*
+ * Helper functions about nano ...Above description tells how its done, what it is but its not clear why we should use it. Can I know why should we use it or its benefits? thanks, --
The outstanding benefit looks that nano semaphore could be used in cases where callers want to wait not only more than one jiffy, but far less than one jiffy also. thanks --
But, how do we know that resources are going to be available within one jiffy or far less than one jiffy? We we be deterministic? thanks, rakib --
This is a really hard question. Though I could not answer as fine as Ingo Molnar could, the deterministic has to be faced by most contentions for resources in kernel, simply because deadlock could occur in spin lock for instance. On the other hand, this question explores a byproduct advantage, not seriously considered before, of nano semaphore, that the caller could learn that there is something out of track if she waited over 100 microseconds in 5 consequent steps, 20 microseconds a step, if she think the resource should be available in 60 microseconds. And in nano semaphore method is available for callers to select waiting over one second, one millisecond, one microsecond, one nanosecond, depending on what the underlying system could offer. After waiting, the caller is free to determine what to do next if the resource is not available. Here is another sample, if taxi will not come in two minutes, I could either give up shopping downtown, or wait another three minutes, depending on what will happen two minutes later. thanks Hillf --
Why should caller think such a way, that resource will be available in How do we know that, what the underlying system will offer? Its an NP-type problem. We cannot determine what will happen on a systems context. When a caller need resource, without that resource is it possible to accomplish its job? If its not, then the ultimate way to But what, if your mother told you not to come home until you done shopping? :) thanks, --
There are methods in kernel, say watchdog, to detect cases that are out of track. The estimation about the availability of resource is utilized also in dispatching commands to scsi disk by registering timer. In nano semaphore, though the methods not implemented, the waiter is able to work out which task is holding it over 60us in This is another hard question. And Ingo could say a few words about This is not hard. When kmallocing 4k, if NULL returned, I will try half, Another way looks to return -EBUSY directly, like aborting what is I will info her every two minutes that taxi still not available, and go home after 15 infos. And lets return to nano semaphore, it is designed, unlike semaphore, not to loop until released by its holder. Another difference is that the caller is free to select the time period in nano seconds to wait. Should anything else be added? thanks, Hillf --
with lock held Several of the functions above could easily be modified to use kernel-doc Please use /** --- ~Randy *** Remember to use Documentation/SubmitChecklist when testing your code *** desserts: http://www.xenotime.net/linux/recipes/ --
Based upon high resolution timer and idea borrowed from semaphore,
nano semaphore is created.
Nano semaphore provides finer time resolution depending on system
configuration and capabilities.
Nano semaphore is not to replace semaphore, but used in application
environments where nano seconds are required, rather than jiffy.
Three methods, nano_semaphore_try_down, nano_semaphore_down and
nano_semaphore_up are implemented in a header file, and there is no
corresponding C file since nano semaphore is not complex.
The benefit looks that, like timer and hrtimer, mutex and rtmutex,
file systems with and without journal, spin lock and rw_lock,
kernel is enriched with new element.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
---
--- a/include/linux/nano_semaphore.h 1970-01-01 08:00:00.000000000 +0800
+++ b/include/linux/nano_semaphore.h 2010-12-29 21:50:24.000000000 +0800
@@ -0,0 +1,300 @@
+/*
+ * linux/nano_semaphore.h
+ *
+ * Definition and implementation of nano semaphore
+ *
+ * Nano semaphore provides finer time resolution depending on system
+ * configuration and capabilities.
+ *
+ * Nano semaphore could be used in parallel with semaphore.
+ *
+ * Started-by: Hillf Danton
+ *
+ * Credits:
+ * ideas are borrowed from semaphore and high resolution timer
+ *
+ * Distributed under the terms of GPL v2
+ */
+
+#ifndef __NANO_SEMAPHORE_H_
+#define __NANO_SEMAPHORE_H_
+
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/ktime.h>
+#include <linux/sched.h>
+#include <linux/hrtimer.h>
+
+/**
+ * struct nano_semaphore - definition of nano semaphore
+ * @chair: the list head for pending waiters
+ * @holder: the current holder
+ * @lock: spin lock to serialize accesses in parallel
+ *
+ * must be initialized before usage.
+ */
+struct nano_semaphore {
+ struct list_head chair;
+ struct task_struct *holder;
+ spinlock_t lock;
+};
+
+#define NANO_SEMAPHORE_INITIALIZER(name) \
+{ \
+ .chair = ...There are very few users of real semaphores today, and we're trying to get rid of them. It's not clear what your requirements are, since you have not posted any new users of this, but instead of adding more locking primitives, I would recommend changing one of the existing ones (mutex, semaphore, rwsem) to have nanosecond timeouts instead of jiffies. The easiest way would certainly be to change the three users of down_timeout() to use nanoseconds. Arnd --
I want to add new way to do semaphore with finer time resolution, without considering much about its potential usage. I will try the three instances of down_timeout() soon. Thanks Hillf --
We for sure don't want new semaphores, or new semaphore usage in the kernel .. It should also be noted that the rtmutex (kernel/rtmutex.c) already has this capability. Although I don't think you can use an rtmutex from inside the kernel. If you really want this you should look into the rtmutex, and the regular mutex API's . Daniel -- Sent by an consultant of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. --
Yes. I once even tried unifying the semaphore and rwsem implementation, I wasn't aware we had already grown another one ;-) AFAICT, you can only use it inside of the kernel, but it's very specific and I wouldn't recommend using it unless a regular mutex cannot be used for some reason. The only user besides the futex If Hillf relies on counting semaphores, that won't work, but very few such users exist in code outside of textbooks. Arnd --
Would you please, Daniel, explain why there are so my file systems under
the fs directory? Would you think the ext file system is better than others?
And why there are in kernel spin lock, read/write lock, mutex, rw_mutex,
rtmutx, and semaphore, timer and hrtimer?
It looks hard to change rwsem, almost impossible, since it is based upon
But greping "struct semaphore" include/linux and fs dirs may tell us
Though capable in rtmutex, why mutex should no longer stay in Kernel?
However mutex could be changed based on hrtimer if needed for some reason.
Thanks
Hillf
---
--- a/kernel/mutex.c 2010-11-01 19:54:12.000000000 +0800
+++ b/kernel/mutex.c 2010-12-29 22:35:40.000000000 +0800
@@ -23,6 +23,7 @@
#include <linux/spinlock.h>
#include <linux/interrupt.h>
#include <linux/debug_locks.h>
+#include <linux/hrtimer.h>
/*
* In the DEBUG case we are using the "NULL fastpath" for mutexes,
@@ -248,7 +249,11 @@ __mutex_lock_common(struct mutex *lock,
/* didnt get the lock, go to sleep: */
spin_unlock_mutex(&lock->wait_lock, flags);
preempt_enable_no_resched();
- schedule();
+ do {
+ /* sleep 10,000 nanoseconds per loop */
+ ktime_t kt = ktime_set(0, 10000);
+ schedule_hrtimeout(&kt, HRTIMER_MODE_REL);
+ } while (0);
preempt_disable();
spin_lock_mutex(&lock->wait_lock, flags);
}
--
The problem with semaphores is that people use them in ways that are not very nice, and not very efficient.. Since they are so flexible they can be used in all sorts of ways, many of which are not clean. This is why, if you read the kernel history, most semaphore have been removed from the kernel and replaced with much nicer and cleaner mutexes. Daniel -- Sent by an consultant of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. --
Thanks for sharing the knowledge about rtmutx and semaphore. Hillf --
Most of the file systems are for compatibility with other operating systems. The ones that duplicate Linux-only functionality are there in order to provide backwards-compatibility with existing users. We can't remove them in the There are more of these, and they partly exist because it has been hard These two are subtly different, timers are optimized for not expiring, while That could be changed using the C implementation everywhere, There are three classes of semaphore users today: 1. Those that initialize the semaphore to >1, guarding access to a resource that has multiple users: acpi/osl, mthca, mlx4, megaraid, comedi/vmk80xx, udlfb, usblc, usb-skeleton, blizzard, hwa742, and 9p. 2. Those that use the semaphore as some sort of completion, or a combination of completion and mutex. 3. Those that can and should be converted to mutex: most of the staging drivers, plus some more. IMHO it would be nice to separate the first two classes in some way, so we can make the counting semaphores stricter and apply the same rules as Doing this would be extremely inefficient, because now the mutex wait function would wake up very frequently instead of just once when the mutex has been released by another thread. Arnd --
Is it waked up less than jeffy? If not, checking more frequently in the endless loop could help receive signal, other than that is extremely meaningless, as the holder of mutex is not ready to release. As to timeout, it is another story, in which waiter is able to determine how many jiffies or nanoseconds are acceptable if waiting is necessary. Thanks --
It should normally only wake up once, at exactly the time that the other thread releases the mutex. What exactly happens depends on the relative priorities of the two lock holders and whether all CPUs are busy already. If there is nothing else to do and the blocking thread is made runnable by giving up the mutex, it will run immediately. Even if we keep the current thread running, the jiffie timer is rather meaningless and Actually, the timeout is the time after which the waiter gives up and does something else because it no longer expects to get the lock. This is very rarely needed. Arnd --
But thats spinlock, not semaphore, right? Also, your example does not use the API right -- according to description, 42 is correct 'semaphore acquired' reply... -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
As I understand, the example, though could be totally removed, does
not show the recommended usage of nano semaphore, but expresses the
difference from semaphore. The recommended however looks
nano_semaphore_down(s, 0);
do_home_work();
nano_semaphore_up(s);
if you like to wait as long as possible.
Cheers
Hillf
--
