[34-longterm 197/260] ip: fix truesize mismatch in ip fragmentation

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Paul Gortmaker
Date: Sunday, January 2, 2011 - 12:18 am

From: Eric Dumazet <eric.dumazet@gmail.com>

commit 3d13008e7345fa7a79d8f6438150dc15d6ba6e9d upstream.

Special care should be taken when slow path is hit in ip_fragment() :

When walking through frags, we transfert truesize ownership from skb to
frags. Then if we hit a slow_path condition, we must undo this or risk
uncharging frags->truesize twice, and in the end, having negative socket
sk_wmem_alloc counter, or even freeing socket sooner than expected.

Many thanks to Nick Bowler, who provided a very clean bug report and
test program.

Thanks to Jarek for reviewing my first patch and providing a V2

While Nick bisection pointed to commit 2b85a34e911 (net: No more
expensive sock_hold()/sock_put() on each tx), underlying bug is older
(2.6.12-rc5)

A side effect is to extend work done in commit b2722b1c3a893e
(ip_fragment: also adjust skb->truesize for packets not owned by a
socket) to ipv6 as well.

Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
Tested-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 net/ipv4/ip_output.c  |   19 +++++++++++++------
 net/ipv6/ip6_output.c |   18 +++++++++++++-----
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index d1bcc9f..e8a6860 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -479,9 +479,8 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 	 * we can switch to copy when see the first bad fragment.
 	 */
 	if (skb_has_frags(skb)) {
-		struct sk_buff *frag;
+		struct sk_buff *frag, *frag2;
 		int first_len = skb_pagelen(skb);
-		int truesizes = 0;
 
 		if (first_len - hlen > mtu ||
 		    ((first_len - hlen) & 7) ||
@@ -494,18 +493,18 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 			if (frag->len > mtu ||
 			    ((frag->len & 7) && frag->next) ||
 			    skb_headroom(frag) < hlen)
-			    goto slow_path;
+				goto slow_path_clean;
 
 			/* Partially cloned skb? */
 			if (skb_shared(frag))
-				goto slow_path;
+				goto slow_path_clean;
 
 			BUG_ON(frag->sk);
 			if (skb->sk) {
 				frag->sk = skb->sk;
 				frag->destructor = sock_wfree;
 			}
-			truesizes += frag->truesize;
+			skb->truesize -= frag->truesize;
 		}
 
 		/* Everything is OK. Generate! */
@@ -515,7 +514,6 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 		frag = skb_shinfo(skb)->frag_list;
 		skb_frag_list_init(skb);
 		skb->data_len = first_len - skb_headlen(skb);
-		skb->truesize -= truesizes;
 		skb->len = first_len;
 		iph->tot_len = htons(first_len);
 		iph->frag_off = htons(IP_MF);
@@ -567,6 +565,15 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 		}
 		IP_INC_STATS(dev_net(dev), IPSTATS_MIB_FRAGFAILS);
 		return err;
+
+slow_path_clean:
+		skb_walk_frags(skb, frag2) {
+			if (frag2 == frag)
+				break;
+			frag2->sk = NULL;
+			frag2->destructor = NULL;
+			skb->truesize += frag2->truesize;
+		}
 	}
 
 slow_path:
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 75d5ef8..60daecc 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -646,7 +646,7 @@ static int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 
 	if (skb_has_frags(skb)) {
 		int first_len = skb_pagelen(skb);
-		int truesizes = 0;
+		struct sk_buff *frag2;
 
 		if (first_len - hlen > mtu ||
 		    ((first_len - hlen) & 7) ||
@@ -658,18 +658,18 @@ static int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 			if (frag->len > mtu ||
 			    ((frag->len & 7) && frag->next) ||
 			    skb_headroom(frag) < hlen)
-			    goto slow_path;
+				goto slow_path_clean;
 
 			/* Partially cloned skb? */
 			if (skb_shared(frag))
-				goto slow_path;
+				goto slow_path_clean;
 
 			BUG_ON(frag->sk);
 			if (skb->sk) {
 				frag->sk = skb->sk;
 				frag->destructor = sock_wfree;
-				truesizes += frag->truesize;
 			}
+			skb->truesize -= frag->truesize;
 		}
 
 		err = 0;
@@ -700,7 +700,6 @@ static int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 
 		first_len = skb_pagelen(skb);
 		skb->data_len = first_len - skb_headlen(skb);
-		skb->truesize -= truesizes;
 		skb->len = first_len;
 		ipv6_hdr(skb)->payload_len = htons(first_len -
 						   sizeof(struct ipv6hdr));
@@ -763,6 +762,15 @@ static int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 			      IPSTATS_MIB_FRAGFAILS);
 		dst_release(&rt->u.dst);
 		return err;
+
+slow_path_clean:
+		skb_walk_frags(skb, frag2) {
+			if (frag2 == frag)
+				break;
+			frag2->sk = NULL;
+			frag2->destructor = NULL;
+			skb->truesize += frag2->truesize;
+		}
 	}
 
 slow_path:
-- 
1.7.3.3

--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[34-longterm 000/260] v2.6.34.8 longterm review, Paul Gortmaker, (Sun Jan 2, 12:14 am)
[34-longterm 003/260] ath5k: drop warning on jumbo frames, Paul Gortmaker, (Sun Jan 2, 12:14 am)
[34-longterm 019/260] ext4: Show journal_checksum option, Paul Gortmaker, (Sun Jan 2, 12:15 am)
[34-longterm 025/260] ext4: Fix compat EXT4_IOC_ADD_GROUP, Paul Gortmaker, (Sun Jan 2, 12:15 am)
[34-longterm 028/260] ext4: fix freeze deadlock under IO, Paul Gortmaker, (Sun Jan 2, 12:15 am)
[34-longterm 030/260] xen: handle events as edge-triggered, Paul Gortmaker, (Sun Jan 2, 12:15 am)
[34-longterm 045/260] USB: ehci-ppc-of: problems in unwind, Paul Gortmaker, (Sun Jan 2, 12:15 am)
[34-longterm 047/260] USB: CP210x Add new device ID, Paul Gortmaker, (Sun Jan 2, 12:15 am)
[34-longterm 065/260] irda: off by one, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 089/260] sched: Optimize task_rq_lock(), Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 090/260] sched: Fix nr_uninterruptible count, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 093/260] sched: Fix select_idle_sibling(), Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 098/260] arm: fix really nasty sigreturn bug, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 106/260] drm/i915: Prevent double dpms on, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 110/260] gro: fix different skb headrooms, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 111/260] gro: Re-fix different skb headrooms, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 115/260] tcp: fix three tcp sysctls tuning, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 118/260] rds: fix a leak of kernel memory, Paul Gortmaker, (Sun Jan 2, 12:16 am)
[34-longterm 126/260] Staging: vt6655: fix buffer overflow, Paul Gortmaker, (Sun Jan 2, 12:17 am)
[34-longterm 134/260] percpu: fix pcpu_last_unit_cpu, Paul Gortmaker, (Sun Jan 2, 12:17 am)
[34-longterm 136/260] inotify: send IN_UNMOUNT events, Paul Gortmaker, (Sun Jan 2, 12:17 am)
[34-longterm 139/260] fix siglock, Paul Gortmaker, (Sun Jan 2, 12:17 am)
[34-longterm 145/260] AT91: change dma resource index, Paul Gortmaker, (Sun Jan 2, 12:17 am)
[34-longterm 154/260] inotify: fix inotify oneshot support, Paul Gortmaker, (Sun Jan 2, 12:17 am)
[34-longterm 158/260] alpha: Fix printk format errors, Paul Gortmaker, (Sun Jan 2, 12:17 am)
[34-longterm 188/260] atl1: fix resume, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 190/260] De-pessimize rds_page_copy_user, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 192/260] xfrm4: strip ECN bits from tos field, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 193/260] tcp: Fix &gt;4GB writes on 64-bit., Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 197/260] ip: fix truesize mismatch in ip frag ..., Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 199/260] tcp: Fix race in tcp_poll, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 200/260] netxen: dont set skb-&gt;truesize, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 203/260] skge: add quirk to limit DMA, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 208/260] b44: fix carrier detection on bind, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 224/260] bluetooth: Fix missing NULL check, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 236/260] KVM: x86: Fix SVM VMCB reset, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 240/260] p54usb: fix off-by-one on !CONFIG_PM, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 241/260] p54usb: add five more USBIDs, Paul Gortmaker, (Sun Jan 2, 12:18 am)
[34-longterm 256/260] libsas: fix NCQ mixing with non-NCQ, Paul Gortmaker, (Sun Jan 2, 12:19 am)
[34-longterm 257/260] gdth: integer overflow in ioctl, Paul Gortmaker, (Sun Jan 2, 12:19 am)
[34-longterm 258/260] Fix race when removing SCSI devices, Paul Gortmaker, (Sun Jan 2, 12:19 am)
Re: [34-longterm 000/260] v2.6.34.8 longterm review, Paul Gortmaker, (Sun Jan 2, 3:46 am)
Re: [34-longterm 000/260] v2.6.34.8 longterm review, Jiri Slaby, (Mon Jan 3, 3:41 am)
Re: [34-longterm 000/260] v2.6.34.8 longterm review, Paul Gortmaker, (Tue Jan 4, 12:11 pm)