Re: [PATCH 0/3] refcounting improvements in sysfs.

Previous thread: [PATCH 1/3] tracing: Reduce overhead of module tracepoints by Li Zefan on Tuesday, March 23, 2010 - 7:57 pm. (15 messages)

Next thread: [PATCH 1/2] sis-agp: Remove SIS 760, handled by amd64-agp by Ben Hutchings on Tuesday, March 23, 2010 - 8:33 pm. (7 messages)
From: NeilBrown
Date: Tuesday, March 23, 2010 - 8:20 pm

This series tidies up the refcount of sysfs_dirents in sysfs,
using kref where appropriate and a new karef for s_active.
This achieves significant code simplification, especially the first
patch.

This is in part inspired by http://lwn.net/Articles/336224/ :-)

NeilBrown

---

NeilBrown (3):
      sysfs: simplify handling for s_active refcount
      sysfs: make s_count a kref
      kref: create karef and use for sysfs_dirent->s_active


 fs/sysfs/dir.c       |   74 ++++++++++++++++++++------------------------------
 fs/sysfs/mount.c     |    3 +-
 fs/sysfs/sysfs.h     |   19 +++++--------
 include/linux/kref.h |   38 ++++++++++++++++++++++++++
 lib/kref.c           |   13 +++++++++
 5 files changed, 90 insertions(+), 57 deletions(-)

-- 
Signature

--

From: NeilBrown
Date: Tuesday, March 23, 2010 - 8:20 pm

s_active counts the number of active references to a 'sysfs_direct'.
When we wish to deactivate a sysfs_direct, we subtract a large
number for the refcount so it will always appear negative.  When
it is negative, new references will not be taken.
After that subtraction, we wait for all the active references to
drain away.

The subtraction of the large number contains exactly the same
information as the setting of the flag SYSFS_FLAG_REMOVED.
(We know this as we already assert that SYSFS_FLAG_REMOVED is set
before adding the large-negative-bias).
So doing both is pointless.

By starting s_active with a value of 1, not 0 (as is typical of
reference counts) and using atomic_inc_not_zero, we can significantly
simplify the code while keeping exactly the same functionality.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/sysfs/dir.c   |   60 ++++++++++++++----------------------------------------
 fs/sysfs/sysfs.h |    2 --
 2 files changed, 16 insertions(+), 46 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 5907178..76a2d10 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -95,26 +95,13 @@ static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
  */
 struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
 {
-	if (unlikely(!sd))
+	if (likely(sd)
+	    && (sd->s_flags & SYSFS_FLAG_REMOVED) == 0
+	    && atomic_inc_not_zero(&sd->s_active)) {
+		rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
+		return sd;
+	} else
 		return NULL;
-
-	while (1) {
-		int v, t;
-
-		v = atomic_read(&sd->s_active);
-		if (unlikely(v < 0))
-			return NULL;
-
-		t = atomic_cmpxchg(&sd->s_active, v, v + 1);
-		if (likely(t == v)) {
-			rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
-			return sd;
-		}
-		if (t < 0)
-			return NULL;
-
-		cpu_relax();
-	}
 }
 
 /**
@@ -126,34 +113,26 @@ struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
  */
 void sysfs_put_active(struct sysfs_dirent *sd)
 {
-	struct completion *cmpl;
-	int v;
-
 ...
From: Eric W. Biederman
Date: Thursday, March 25, 2010 - 9:24 pm

Overall your logic appears correct but in detail this patch scares me.

sd->s_flags is protected by the sysfs_mutex, and you aren't
taking it when you read it.  So in general I don't see the new check
if (sd->s_flags & SYSFS_FLAG_REMOVED) == 0 providing any guarantee of
progress whatsoever with user space applications repeated reading from
a sysfs file when that sysfs file is being removed.  They could easily
have the sd->s_flags value cached and never see the new value, given a
crazy enough cache architecture.

So as attractive as this patch is I don't think it is correct.

Eric 
--

From: Neil Brown
Date: Thursday, March 25, 2010 - 10:32 pm

On Thu, 25 Mar 2010 21:24:43 -0700

As you say, this is only a liveness issue.  The atomic_inc_not_zero
guarantees that we don't take a new reference after the last one is gone.
The test on SYSFS_FLAG_REMOVED is only there to ensure that the count does
eventually get to zero.
There could only be a problem here if the change to s_flags was not
propagated to all CPUs in some reasonably small time.

I'm not expert on these things but it was my understanding that interesting
cache architectures could arbitrarily re-order accesses, but does not delay
them indefinitely.
Inserting barriers could possibly make this more predictable, but that would
just delay certain loads/stores until a known state was reached - it would
not make the data visible at another CPU any faster (or would it?).

So unless there is no cache-coherency protocol at all, I think that
SYSFS_FLAG_REMOVED will be seen promptly and that s_active will drop to zero

I'm pleased you find it attractive - I certainly think the
"atomic_inc_not_zero" is much more readable than the code it replaces.
Hopefully if there are really problems (maybe I've fundamentally
misunderstood caches) they can be easily resolved (a couple of memory
barriers at worst?).

Thanks for the review,
NeilBrown
--

From: Tejun Heo
Date: Thursday, March 25, 2010 - 10:42 pm

Hello, Neil.


Oh, no, please don't do that.  That will inject a whole lot more of
complexity into the mix.  Now, it's only a problem of logical
complexity.  If you add memory barriers there, it not only complicates
the problem itself beyond recognition but also reduces problem
reproducibility (especially because x86/64 are relatively strongly
ordered) and thus test coverage significantly.  Now the refcounting
can be understood by most people who put some time into it but if you
put memory barriers into it, only Oleg would be able to identify
problems.  :-P

If you really want to kill the bias and in an easy readable way,
please put it inside a struct w/ dedicated spinlock but if you are
gonna do that it might as well be better to simply implement the bias
anti-pattern correctly there once.

Thanks.

-- 
tejun
--

From: Eric W. Biederman
Date: Friday, March 26, 2010 - 12:53 am

Having this handled correctly is a key part of the abstraction
implemented with sysfs_get_active sysfs_put_active and
sysfs_deactivate.  It is a requirement that userspace can not cuause

This is my problem with your patch.  You have replaced something that
used an existing abstraction properly, with something that does not
use any abstraction at all and we are reduced to talking about cpu
memory ordering requirements.

Furthermore this code you are changing is not used raw but it is the
building blocks for a well defined abstraction.  The users of
sysfs_get_active sysfs_put_active and sysfs_deactivate don't have to
know the details of how they are implemented to think about and
reason about it accurately.

I am very much in favor of a general primitive that we can use for
blocking ref counts.  We have them in sysfs, proc, sysctl, and a few
other places with similar requirements.  Each location has a different
implementation, but essentially the same semantics.

What is really needed is something with the semantics of:
read_trylock
read_unlock
write_lock() perhaps synchronize?

With multiple reads in side the read side critical section, and 
biased so that the read side is very cheap, and the data structure
size is very cheap.  We have an ugly but otherwise near optimal
implementation in sysfs.  srcu is close the data structure
costs too much to use on every sysfs dirent.

Improving and getting it right is very attractive.  The code that
exists today is correct and signals that something fishy is going
on so there is no point in doing something less than right.

Eric

--

From: Neil Brown
Date: Sunday, March 28, 2010 - 9:43 pm

On Fri, 26 Mar 2010 00:53:48 -0700

We are not talking about memory ordering requirements.  There are no memory
ordering requirements introduced in the patch.  It really doesn't matter if
the setting of SYSFS_FLAG_REMOVED is seen before or after anything else.
All that matters is that it does get seen, and it will be seen long before
you could possibly describe the situation as 'denial of service'.

However if we do consider memory ordering guarantees we can describe a clear
limit to the possibly delay between SYSFS_FLAG_REMOVED being set, and being
seen.  The atomic_inc_not_zero serves as a memory barrier in exactly the same
way that the current code requires atomic_dec_return.  So while

	if (likely(sd)
	    && (sd->s_flags & SYSFS_FLAG_REMOVED) == 0
	    && atomic_inc_not_zero(&sd->s_active)) {

could possibly gain a reference even 'after' SYS_FLAG_REMOVED as been set,
a second call to this on the same processor will see SYSFS_FLAG_REMOVED.
So at the absolute most, we could see NCPUS active references gained and
dropped after SYSFS_FLAG_REMOVED was set - a clear limit which is all we need.
I'm still not sure we even need to argue in terms of memory barriers to be
sure the code is correct, but it seems they are sufficient to give a simple
proof.
The sysfs code already has a dependency on implicit memory barriers to ensure
that ->s_sibling points to a completion.  My change does not significantly

True.  However it is raw in the sense that if some other developer wanted
similar functionality in their own code they might copy the low level stuff
as they wouldn't be able to use the sys_* routines directly.  Copying

Yes - I think the synergy between locks an refcounts is really interesting...

As you say, simply using a general abstraction which provides these functions
would incur a space cost that you don't want.  You avoid this cost by
re-using s_sibling to store a reference to a completion.
Finding a primitive that allows such optimisations is an ...
From: Neil Brown
Date: Monday, March 29, 2010 - 12:47 am

On Mon, 29 Mar 2010 15:43:25 +1100

It just occurred to me that this 'proof' isn't quite complete in itself.  I
need to also show that there is a suitable memory barrier after
SYSFS_FLAG_REMOVED is set.  There is as it is always set under sysfs_mutex,
so the mutex_unlock provides a barrier.
So after sysfs_mutex is unlocked, it is conceivable that each CPU could grant
one active reference against the sysfs_dirent before SYSFS_FLAG_REMOVED was

NeilBrown
--

From: NeilBrown
Date: Tuesday, March 23, 2010 - 8:20 pm

s_count in sysfs behaves exactly like a kref, so change it to
be one.
This requires adding a KREF_INIT macro to kref.h

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/sysfs/dir.c       |   12 +++++++++---
 fs/sysfs/mount.c     |    2 +-
 fs/sysfs/sysfs.h     |   15 +++++++--------
 include/linux/kref.h |    1 +
 4 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 76a2d10..63790ac 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -177,8 +177,14 @@ static void sysfs_free_ino(ino_t ino)
 	spin_unlock(&sysfs_ino_lock);
 }
 
-void release_sysfs_dirent(struct sysfs_dirent * sd)
+static void no_recurse(struct kref *kref)
 {
+}
+void release_sysfs_dirent(struct kref *count)
+{
+	struct sysfs_dirent *sd = container_of(count,
+					       struct sysfs_dirent,
+					       s_count);
 	struct sysfs_dirent *parent_sd;
 
  repeat:
@@ -199,7 +205,7 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 	kmem_cache_free(sysfs_dir_cachep, sd);
 
 	sd = parent_sd;
-	if (sd && atomic_dec_and_test(&sd->s_count))
+	if (sd && kref_put(&sd->s_count, no_recurse))
 		goto repeat;
 }
 
@@ -289,7 +295,7 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 	if (sysfs_alloc_ino(&sd->s_ino))
 		goto err_out2;
 
-	atomic_set(&sd->s_count, 1);
+	kref_init(&sd->s_count);
 	atomic_set(&sd->s_active, 1);
 
 	sd->s_name = name;
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 0cb1088..07bff03 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -33,7 +33,7 @@ static const struct super_operations sysfs_ops = {
 
 struct sysfs_dirent sysfs_root = {
 	.s_name		= "",
-	.s_count	= ATOMIC_INIT(1),
+	.s_count	= KREF_INIT,
 	.s_flags	= SYSFS_DIR,
 	.s_mode		= S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO,
 	.s_ino		= 1,
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 6a2a60e..f003a88 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -49,7 +49,7 @@ struct sysfs_inode_attrs {
  * requires ...
From: Eric W. Biederman
Date: Thursday, March 25, 2010 - 9:29 pm

Except where it doesn't.  The whole no_recurse thing is definitely
not idiomatic kref usage.  I could find only 3 instances of someone even
looking at the return value.  I would argue based on that the return
value of kref_put should be removed.

KREF_INIT if we want it should be added in a separate patch.

kref should be kept for the stupid simple cases where so we don't have
to think about refcounting, and just know it works.  kref should not
be where we are getting clever, and it this feels like getting clever
to me.

It isn't like the atomic primitives are any worse than kref.

Eric
--

From: NeilBrown
Date: Tuesday, March 23, 2010 - 8:20 pm

->s_active is almost a kref, but needs atomic_inc_not_zero which
is not generally appropriate for a kref, as you can only access a
kreffed object if you hold a reference, and in that case the counter
cannot possibly be zero.

So introduce 'karef' - a counter of active references.  An active
reference is separate from an existential reference and normally
implies one.
For a karef, get_not_zero have be appropriate providing a separate
existential reference is held.

Then change sysfs_dirent->s_active to be the first (and so-far only)
karef. super_block->s_active is another candidate to be a karef.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/sysfs/dir.c       |   18 ++++++++++++------
 fs/sysfs/mount.c     |    1 +
 fs/sysfs/sysfs.h     |    2 +-
 include/linux/kref.h |   37 +++++++++++++++++++++++++++++++++++++
 lib/kref.c           |   13 +++++++++++++
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 63790ac..f0e2303 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -97,13 +97,22 @@ struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
 {
 	if (likely(sd)
 	    && (sd->s_flags & SYSFS_FLAG_REMOVED) == 0
-	    && atomic_inc_not_zero(&sd->s_active)) {
+	    && karef_get_not_zero(&sd->s_active)) {
 		rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
 		return sd;
 	} else
 		return NULL;
 }
 
+
+static void sysfs_release(struct karef *k)
+{
+	struct sysfs_dirent *sd = container_of(k, struct sysfs_dirent, s_active);
+	struct completion *cmpl = (void*)sd->s_sibling;
+
+	complete(cmpl);
+}
+	
 /**
  *	sysfs_put_active - put an active reference to sysfs_dirent
  *	@sd: sysfs_dirent to put an active reference to
@@ -117,10 +126,7 @@ void sysfs_put_active(struct sysfs_dirent *sd)
 		return;
 
 	rwsem_release(&sd->dep_map, 1, _RET_IP_);
-	if (atomic_dec_and_test(&sd->s_active)) {
-		struct completion *cmpl = (void*)sd->s_sibling;
-		complete(cmpl);
-	}
+	karef_put(&sd->s_active, ...
From: Eric W. Biederman
Date: Thursday, March 25, 2010 - 9:50 pm

If this actually captured the semantics of active references I would find
this interesting.  But you currently miss the inconvenient but crucial
part of preventing new references after deletion, and you miss the
lockdep bits.

Given that with the completion we act enough like a lock we can cause
deadlocks I would expect a generic version for dummies (aka a kref flavor)
to include lockdep annotations from the start.

Lock ordering issues combined with ref counts are rare but they really
suck, are hard to find (without lockdep) and hard to reason about.

The logical semantics are:

read_trylock() sysfs_get_active
read_unlock() sysfs_put_active
write_lock() sysfs_deactivate

Maybe your karef stuff is good for something but it models the sysfs
usage poorly.

Eric
--

From: Eric W. Biederman
Date: Thursday, March 25, 2010 - 8:10 pm

Neil I would appreciate if you would cc' Tejun and myself on
these kind of patches as we wrote the code and the best canidates
to review it.

Thanks,
Eric
--

From: Neil Brown
Date: Thursday, March 25, 2010 - 8:28 pm

On Thu, 25 Mar 2010 20:10:15 -0700

Yes, you are right of course.
I just pulled names out of MAINTAINERS figuring the maintainer would know
what best to do with the patch.

NeilBrown
--

From: Tejun Heo
Date: Thursday, March 25, 2010 - 9:49 pm

Hello,


Nice article.  In general, yeap, I agree it would be nice to have a
working reference count abstraction.  However, kref along with kobject
is a good example of obscurity by abstraction anti pattern.  :-)

kobject API incorrectly suggests that it deals with the last put
problem.  There still are large number of code paths which do the
following,

  if (!(kob = kobject_get(kobj)))
	return;

I believe (or at least hope) the actual problem cases are mostly fixed
now but there still are a lot of misconceptions around how stuff built
on kref/kobject is synchronized and they sometimes lead to race
conditions buried deep under several layers of abstractions and it
becomes very hard to see those race conditions when they are buried
deep.

If you want to kill refcounts w/ bias based off switch, please put it
inside an abstraction which at least synchronizes itself properly.
Open coding w/ bias at least warns you that there is some complex
stuff going on and you need to trade carefully.  Putting the switch on
a separate flag - people often forget how bits in a flag field are
synchronized - and the rest of refcount in a nice looking kref bundle
is very likely to lead to subtle race conditions which are *very*
difficult to notice.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Thursday, March 25, 2010 - 10:10 pm

One thing to add.


I think one of the reasons why k* hasn't really worked as well as it
was orginally imagined to do is the way we - the kernel programmers -
think and work.  We add abstractions when something is functionally
necessary, so in a lot of cases the functional requirements become the
implementation and the communication among us (ie. serves as implied
documentation).  The k* stuff detracts from this principle.  Those
abstractions are there for the purpose of abstracting and our usual
mindset becomes very susceptible to misinterpretations as no easily
identifiable functional requirements are there - we either end up
imaginging something up or waste time frustrated trying to figure out
why the hell that abstraction is there.

This actualy is a very generic problem.  When a LOT of people are
trying to work together sharing a lot of infrastructures, it is very
deterimental to impose certain paradigm upon them.  People can easily
agree upon functional necessities but one guy's wildest, most
ambitious paradigmatic vision looks like a complete bull to another
gal.

So, let's keep the abstractions to the just necessary level and
communicate at the functional layer.  In this case, mount and sysfs
shares the requirement for a refcount w/ a kill switch.  I'm not sure
it warrants common abstraction at this stage but if the *function* can
be wrapped nicely along with lockdep annotations and all, why not?

Thanks.

-- 
tejun
--

From: Neil Brown
Date: Thursday, March 25, 2010 - 11:02 pm

On Fri, 26 Mar 2010 13:49:36 +0900

I'm not at all sure that opinion would be universal....

refcounting is something that it is quite easy to get wrong.  There are
several slightly different models for refcounting and if you don't have a
clear understanding of the different use cases it is easy to get confused
about exactly what model is being used and so use a refcount wrongly.
kref certainly doesn't cover all models for refcounting but it does cover one
fairly common one very well and I think that it's use bring clarity rather
than obscurity.
Of course if it is used for a refcount which should really follow a different

kobject_get *always* returns exactly the argument that was passed to it.
(kref_get doesn't have a return value.)

I don't see how the code above has any bearing on the last-put problem, which

I agree that there probably misconceptions about how kref works and they are
probably based on a lack of appreciation of the subtle differences in
flavours of refcounts.  Hence my desire to create and document different

The only other use of a BIAS that I am aware of is in struct super_block, and
Al Viro recently removed that in his bleeding edge tree (two days before I
sent him a patch to do the same thing:-)

It is dangerous to build too much into an abstraction else you will find that
no-one uses it as it is too specific.

The s_count and s_active in struct super_block are very similar to s_count
and s_active in struct sysfs_dirent, however they are also quite different.
super_block uses a non-atomic s_count (because a spinlock is always held
anyway) and has a separate way of preventing new s_active references (s_root
becomes NULL).  The only real similarity is that they both have an 'active'
refcount that *can* become zero and still be visible, which is different to a
kref but still a model worth encapsulating (I think) in karef.

BTW I'd be perfectly happy if the first patch was taken and subsequent ones
not.  I think they are a good idea, but I'm happy to ...
From: Tejun Heo
Date: Thursday, March 25, 2010 - 11:32 pm

Hello, Neil.


I don't know.  After spending some time with k* and device model, I
grew a pretty strong dislike for abstractions without clear functional
necessities.  kref being much simpler than kobject, the abuse is much
less severe but there have been kref misuses (in kobject and SCSI
midlayer) which could have been avoided or at least easily located if
they simply had used atomic_t instead of dreaming up some mystical

Oh, I should have been more explicit.  It's not directly related to
kref but just something that always comes to my mind when thinking
about k* abstractions.  The above bogus condition checks used to be
used quite widely.  The programmer for some reason believed the last
kobject_put() somehow will magically make future kobject_get()s return
NULL, which of course doesn't make any sense but hey the
implementation is buried kilometers deep, the API and other usages
seem to suggest that and it's easy to imagine something up when you're
tired.

As an another example, please take a look at the kref API.

  int kref_put(struct kref *kref, void (*release) (struct kref *kref));

The function takes @release callback but under which context is it
called?  If you look at the source code, it's called in-line which
isn't clear from the API at all (why the hell take a callback if
you're gonna call it in-line?) and there have been *several* subtle
bugs which could have been avoided or at least would have been caught
much easier if it were not for that meaningless separate release
callback.  It's just too easy to forget about the executing context
when people write and review stuff over function boundaries.

 void put_something(something)
 {
	 if (kref_put(&something->kref))
		 do something which might dead lock;
 }

is way easier to avoid bugs and review than

 void really_kill_something(struct kref *kref)
 {
	 struct my_something *something = container_of(...);

	 do something which might dead lock;
 }

 void put_something(something)
 {
	 ...
From: Neil Brown
Date: Sunday, March 28, 2010 - 10:10 pm

On Fri, 26 Mar 2010 15:32:51 +0900

Hi Tejun,

this strikes me as really valuable experience that it would be great to
share.   While I generally like kref and see value in at least some of
kobject I don't for a moment suppose they are perfect.  Fixing them requires
a good understanding of the problems they cause.  If you have that knowledge,
it would be a great resource for anyone wanting to 'fix' kobject.

This would argue that having a return value from kobject_get violates Rusty's
law that interfaces should be hard to misuse.
We could probably change that - it is only used 19 times in the current
kernel.
It probably doesn't help that Documentation/kobject.txt includes the text:

    A successful call to kobject_get() will increment the kobject's reference
    counter and return the pointer to the kobject.


I'm not immediately convinced by this, though maybe I haven't seen enough
examples yet.

The deadlocks that I have come across would not have been any more obvious in
either of the above - they were caused because sysfs_remove deadlocks when
called from inside a sysfs attribute action...

Also, while this re-write is possible for kref uses it isn't really possible
in kobject as the 'final_put' function must be included in the ktype (though
maybe you don't like that either).

What would be really helpful here is a survey of what sorts of things are
actually done in final_put functions so would could create some guidelines
about how to write the release functions.

Thanks,

--

From: Tejun Heo
Date: Tuesday, March 30, 2010 - 8:20 pm

Hello, Neil.


I'm not sure I have enough to say to write something up about. :-) One
thing I've been hoping to do is to hide the k* abstraction behind
device model abstraction so that device model API users only have to
deal with device model abstractions - devices and drivers.  They are
what the driver writers need and have pretty good tangible concrete
concept about.  I think my opinions can be compressed into

* Add abstractions which serve concrete functional necessities as
  doing so makes communication among peer programmers much easier by
  reducing confusion.

* Don't get too attached to software engineering theories.  They're
  not bad in themselves but often end up labeling only certain small
  subset of all the things which need to be considered, with the
  adverse side effect of emphasizing those named ones out of

Yeap, the return value is only for convenience and I think it's more
harmful overall, but then again I suppose most people got used to it
now, which is another important factor to consider.  Given that there
are not many users of the return value, I agree that removing it would

Ah, right, I probably was mixing kobject and kref release functions
when talking about the problems I've seen.  I still don't like the
kref API and have chosen atomic_t over it in several cases.  To me,
the separate ->release seems way too heavy compared to the
functionality the abstraction actually provides and hinders writing

For kobject, I'm not sure what should be done other than hiding it as
much as possible behind device model abstractions.  For kref, I think
it would be great if it can be made simpler and dumber so that it only
clarifies the intention of the usage instead of forcing separate
->release.  If fancier reference helper is needed, capability to defer
release to a separate context and handle module symbol dereference
problem automatically would be nice provided they can be done in clean
manner, but these are just something I've been vaguely hoping for ...
Previous thread: [PATCH 1/3] tracing: Reduce overhead of module tracepoints by Li Zefan on Tuesday, March 23, 2010 - 7:57 pm. (15 messages)

Next thread: [PATCH 1/2] sis-agp: Remove SIS 760, handled by amd64-agp by Ben Hutchings on Tuesday, March 23, 2010 - 8:33 pm. (7 messages)