Re: [PATCH] ipv4/ipv6: check hop limit field on input

From: Nicolas Dichtel
Date: Monday, June 1, 2009 - 8:13 am

Hi,

When the network stack receives a packet, it does not check the value of the
ttl/hop limit field. The RFCs indicate that a router must drop the packet if
this field is 0.


Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

From: Florian Westphal
Date: Monday, June 1, 2009 - 9:19 am

What's wrong with the checks in ip(6)_forward?
--

From: Nicolas Dichtel
Date: Monday, June 1, 2009 - 9:49 am

Those checks are on forward, not on input. A router must not process such a
packet. For example, if you ping the router with the ttl set to 0, you will
receive a reply.

Nicolas
--

From: Florian Westphal
Date: Monday, June 1, 2009 - 10:13 am

Ah.  That makes more sense.
However, I'd argue that this is sane behaviour.

The datagram did reach its intended destination and the TTL did not
"exceed in transit" (if it had, the datagram would not have been
received).  Why discard an otherwise perfectly legal packet?
--

From: Nicolas Dichtel
Date: Tuesday, June 2, 2009 - 2:30 am

Because the RFCs require this:

RFC792 Page 6:
   If the gateway processing a datagram finds the time to live field
   is zero it must discard the datagram.  The gateway may also notify
   the source host via the time exceeded message.

RFC4443 Section 3.3:
    If a router receives a packet with a Hop Limit of zero, or if a
    router decrements a packet's Hop Limit to zero, it MUST discard the
    packet and originate an ICMPv6 Time Exceeded message with Code 0 to
    the source of the packet.

Nicolas

--

From: Eric Dumazet
Date: Monday, June 1, 2009 - 11:43 am

You seem to mix requirements for routers and hosts. ttl processing
is relevant for a gateway only, not for a host.

(terminology : gateway / host in rfc 792)

I would say: who sent this ttl=0 packet in the first place?

ping -t 0 host
ping: can't set unicast time-to-live: Invalid argument

So Linux is not able to do that, unless using tricks of course, or patching IP_TTL

BTW, sending ttl=0 packets to my cisco host (also a router but not relevant)
is ok, it replies to these packets...

I wonder why Linux forbids sending ttl=0 packets, time to read again all these RFCs :)


--

From: Brian Haley
Date: Monday, June 1, 2009 - 11:55 am

'ping6 -t 0 host' does work however.  The problem I see is that if you ping a system,
if it's a host it will respond, if it's a router it won't - the RFCs don't
explicitly state the host should drop the packet.  I don't know if that difference
in behavior is desired.  Do we know how any other OSes behave?

-Brian
--

From: John Dykstra
Date: Monday, June 1, 2009 - 6:54 pm

There are two cases--an echo request to an address assigned to a
router's interface, and to an address _beyond_ the router on another
link.

Any given interface on a router can have forwarding dynamically enabled
or disabled.  I don't remember prescribed echo request or hop limit
behavior changing depending on the forwarding enable, so it seems that
if you ping an address assigned to a router's interface, the router is
expected to follow the (apparently unwritten) host rules.  

Echo requests forwarded by a router should obviously have the hop limit

FWIW, the random BSD flavors I have on hand all check hop limit when
forwarding, but not when processing local ingress traffic.

Also FWIW, as I remember, the TAHI tests only check hop limit behavior
on forwarded traffic.

Nicolas, what's driving your patch?  Are you trying to align slow path
behavior with one of the 6WIND fast path implementations?

  --  John

--

From: David Miller
Date: Monday, June 1, 2009 - 7:02 pm

From: John Dykstra <john.dykstra1@gmail.com>

And this is the behavior that makes the most sense to me.

The local system is "accounted for" in the hop limit by the previous
hop system.  No other behavior makes any sense.

And I even remember there are applications that use multicast and
a hop limit of zero explicitly to keep application traffic only on
the local subnet.  So any change like that proposed could break
things.

--

From: John Dykstra
Date: Tuesday, June 2, 2009 - 2:22 am

Are you thinking of multicast apps that explicitly set a hop limit, or Multicast
Listener Discovery?  The hop limit specified for MLDv2 messages is one,
not zero.

  --  John

--

From: David Miller
Date: Tuesday, June 2, 2009 - 2:32 am

From: John Dykstra <john.dykstra1@gmail.com>

I am very much thinking of multicast apps that explicitly set a hop
limit.

The proof is in the pudding.  As mentioned in other parts of this
thread, we explicitly DO NOT allow a zero normal socket TTL option
but we DO allow a zero multicast TTL socket option to be set.
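
The difference between the two socket options can be probed from user space. The following is a minimal sketch, not taken from the thread: the helper name try_set_ttl is hypothetical, and the results depend on the kernel it runs on. It only uses the standard socket(), setsockopt() and close() calls with the IP_TTL and IP_MULTICAST_TTL options.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration: try to set a TTL-type option
 * (IP_TTL or IP_MULTICAST_TTL) on a fresh UDP socket.
 * Returns 0 on success, or the errno value on failure.
 */
int try_set_ttl(int optname, int ttl)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int err = 0;

    if (fd < 0)
        return errno;

    /* Save errno before close(), which could overwrite it. */
    if (setsockopt(fd, IPPROTO_IP, optname, &ttl, sizeof(ttl)) < 0)
        err = errno;

    close(fd);
    return err;
}
```

On a kernel that carries the IP_TTL range check discussed in this thread, try_set_ttl(IP_TTL, 0) would be expected to return EINVAL, while try_set_ttl(IP_MULTICAST_TTL, 0) would be expected to return 0. Other systems may behave differently.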
--

From: Nicolas Dichtel
Date: Tuesday, June 2, 2009 - 2:35 am

No. I'm just checking RFC conformance ;-)


--

From: Nicolas Dichtel
Date: Tuesday, June 2, 2009 - 2:30 am

I've asked the IETF mailing list about the host case. The response was:
"process as normal."

--

From: Nicolas Dichtel
Date: Tuesday, June 2, 2009 - 2:30 am

That's why I test the forwarding value in my patch. If forwarding is 
I find a statement which tells us to drop the packet, but maybe I'm missing 
something.


Nicolas
--

From: David Miller
Date: Monday, June 1, 2009 - 7:04 pm

From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>

It only must do this when executing the forwarding function.  It's an
egress check, not an ingress one.

I'm not applying this patch, it can even break some applications
out there that use a TTL of zero intentionally to keep traffic
only on a local subnet.
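
The distinction drawn here (check the TTL when forwarding, not on local input) can be sketched as a small standalone model. This is an illustration only, not kernel code: the enum, its values and the function handle_packet are hypothetical names, and the sketch assumes the forwarding path drops any packet whose TTL is 0 or 1 on arrival, since decrementing it would leave nothing for the next hop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts for this sketch; these are not kernel symbols. */
enum verdict {
    DELIVER_LOCAL,    /* addressed to this system: pass to upper layers */
    FORWARD,          /* sent on to the next hop with a decremented TTL */
    DROP_TTL_EXPIRED  /* discarded; a Time Exceeded message may follow */
};

/*
 * Decide what to do with a received packet.
 * for_us: the destination address is assigned to this system.
 * ttl:    the TTL / hop limit field; decremented only when forwarding.
 */
enum verdict handle_packet(bool for_us, uint8_t *ttl)
{
    assert(ttl != NULL);

    /* Local input: the previous hop already accounted for this system,
     * so the TTL value is not examined, even if it is 0. */
    if (for_us)
        return DELIVER_LOCAL;

    /* Forwarding: a TTL of 0 on arrival, or one that would reach 0
     * after the decrement, means the packet is dropped. */
    if (*ttl <= 1)
        return DROP_TTL_EXPIRED;

    (*ttl)--;
    return FORWARD;
}
```

With this model, a ttl=0 echo request addressed to one of the router's own interfaces is delivered locally, while the same packet addressed beyond the router is dropped.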
--

From: Eric Dumazet
Date: Monday, June 1, 2009 - 10:31 pm

I wonder if we then should allow setting ttl to zero. I had to patch
my kernel to allow ping to do this...

I'll check RFC when time permits.

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index e2d1f87..efe2797 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -558,7 +558,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 	case IP_TTL:
 		if (optlen<1)
 			goto e_inval;
-		if (val != -1 && (val < 1 || val>255))
+		if (val != -1 && (val < 0 || val>255))
 			goto e_inval;
 		inet->uc_ttl = val;
 		break;
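
The effect of the one-character change can be restated as two standalone predicates. This is a sketch for illustration: the function names are hypothetical, and each body is simply the negation of the condition that jumps to e_inval in the diff, before and after the change.

```c
#include <stdbool.h>

/* Range accepted by the check before the change:
 * -1, or any value from 1 to 255. */
bool ip_ttl_valid_before(int val)
{
    return val == -1 || (val >= 1 && val <= 255);
}

/* Range accepted by the check after the change:
 * -1, or any value from 0 to 255. */
bool ip_ttl_valid_after(int val)
{
    return val == -1 || (val >= 0 && val <= 255);
}
```

The only input on which the two predicates differ is 0, which is the value ping needs for the ttl=0 test described above.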


--

From: David Miller
Date: Monday, June 1, 2009 - 10:43 pm

From: Eric Dumazet <dada1@cosmosbay.com>

Eric, notice how I mentioned in my other reply to this thread
"multicast" applications, which use mc_ttl which we allow to be set to
zero.
--

From: Nicolas Dichtel
Date: Tuesday, June 2, 2009 - 2:36 am

In my understanding, it can be on input too:
RFC4443 Section 3.3:
    If a router receives a packet with a Hop Limit of zero, or if a
    router decrements a packet's Hop Limit to zero, it MUST discard the
    packet and originate an ICMPv6 Time Exceeded message with Code 0 to
    the source of the packet.

OK ok. John sends good arguments ;-)


Nicolas

--

From: David Miller
Date: Tuesday, June 2, 2009 - 2:37 am

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

I think this was unintentional.

Usage of zero TTL by multicast applications is very well established.

Running such an application on a router should work as well.
--
