Re: [PATCH] net: add destructor for skb data.

Previous thread: [PATCH 1/2] net: make struct tun_struct private to tun.c by Rusty Russell on Saturday, April 5, 2008 - 4:53 am. (6 messages)

Next thread: [PATCH RFC 1/5] vringfd syscall by Rusty Russell on Saturday, April 5, 2008 - 5:02 am. (24 messages)
From: Rusty Russell
Date: Saturday, April 5, 2008 - 4:56 am

If we want to notify something when an skb is truly finished (such as
for tun vringfd support), we need a destructor on the data.  We don't
need to add other fields, since we can just allocate extra room at the
end.

(I wonder if we could *reduce* the shinfo allocation where no frags are needed?)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 include/linux/skbuff.h |   29 ++++++++++++++++++++++++++---
 net/core/skbuff.c      |   12 +++++++++---
 2 files changed, 35 insertions(+), 6 deletions(-)

diff -r 77871c14566e include/linux/skbuff.h
--- a/include/linux/skbuff.h	Fri Mar 28 13:41:36 2008 +1100
+++ b/include/linux/skbuff.h	Mon Mar 31 23:01:58 2008 +1000
@@ -148,6 +148,7 @@ struct skb_shared_info {
 	__be32          ip6_frag_id;
 	struct sk_buff	*frag_list;
 	skb_frag_t	frags[MAX_SKB_FRAGS];
+	void		(*destructor)(struct skb_shared_info *);
 };
 
 /* We divide dataref into two halves.  The higher 16 bits hold references
@@ -344,17 +346,18 @@ extern void kfree_skb(struct sk_buff *sk
 extern void kfree_skb(struct sk_buff *skb);
 extern void	       __kfree_skb(struct sk_buff *skb);
 extern struct sk_buff *__alloc_skb(unsigned int size,
-				   gfp_t priority, int fclone, int node);
+				   gfp_t priority, int fclone, unsigned extra,
+				   int node);
 static inline struct sk_buff *alloc_skb(unsigned int size,
 					gfp_t priority)
 {
-	return __alloc_skb(size, priority, 0, -1);
+	return __alloc_skb(size, priority, 0, 0, -1);
 }
 
 static inline struct sk_buff *alloc_skb_fclone(unsigned int size,
 					       gfp_t priority)
 {
-	return __alloc_skb(size, priority, 1, -1);
+	return __alloc_skb(size, priority, 1, 0, -1);
 }
 
 extern struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src);
diff -r 77871c14566e net/core/skbuff.c
--- a/net/core/skbuff.c	Fri Mar 28 13:41:36 2008 +1100
+++ b/net/core/skbuff.c	Mon Mar 31 23:01:58 2008 +1000
@@ -169,6 +169,7 @@ EXPORT_SYMBOL(skb_truesize_bug);
  *	@gfp_mask: allocation mask
  *	@fclone: allocate ...
From: Evgeniy Polyakov
Date: Saturday, April 5, 2008 - 6:06 am

Hi.


There is skb->destructor already, for your case of vringfd it can be
used safely afaics.

-- 
	Evgeniy Polyakov
--

From: Rusty Russell
Date: Saturday, April 5, 2008 - 8:20 pm

Hi Evgeniy,

    I don't think so.  For a start, the skb destructor is called while the skb 
is still in the socket queue (ie. the data is still live).  Secondly, the 
original skb can be freed while clones still reference the data.

Cheers,
Rusty.
--

From: Evgeniy Polyakov
Date: Sunday, April 6, 2008 - 2:20 am

Hi Rusty.


That is what it is for - to remove data from any queues and free it.
One can check if skb was cloned and do not perform some steps, instead
call old destructor. Destructor for the last clone will cleanup whatever
is needed. Thoughts?

Actually I'm not that opposed agains additional destructor, I just want
to bring attention to this topic, since this is second time some steps
are going to be setup for the destruction time, so I want a clear
solution :)

-- 
	Evgeniy Polyakov
--

From: Evgeniy Polyakov
Date: Sunday, April 6, 2008 - 6:32 am

Hi.


Actually the question is who is allowed to set that callback?
Essentially what I want is to get the same notifications as you propose
in patch 4 but for socket layer. AFAICS this will not collide with
tun/tap skbs.

-- 
	Evgeniy Polyakov
--

From: Rusty Russell
Date: Sunday, April 6, 2008 - 2:10 pm

The old destructor is in some other skb, you'd have to carry it around.  
And skb_orphan() calls the destructor early deliberately.

The current skb destructor is for the sk_buff, not the data.  It's clearest to 
keep them separate.

Cheers,
Rusty.
--

Previous thread: [PATCH 1/2] net: make struct tun_struct private to tun.c by Rusty Russell on Saturday, April 5, 2008 - 4:53 am. (6 messages)

Next thread: [PATCH RFC 1/5] vringfd syscall by Rusty Russell on Saturday, April 5, 2008 - 5:02 am. (24 messages)