Re: [PATCH v2] bonding: send IPv6 neighbor advertisement on failover

Previous thread: [PATCH]: ppp: Use skb_queue_walk() in ppp_mp_insert(). by David Miller on Thursday, October 9, 2008 - 4:40 pm. (1 message)

Next thread: drivers/net/enic/vnic_cq.c by Andrew Morton on Thursday, October 9, 2008 - 9:12 pm. (12 messages)
From: Brian Haley
Date: Thursday, October 9, 2008 - 5:52 pm

Updated to address Vlad's comment about storing the link-local IPv6 
address in the bond/vlan structure: I no longer overwrite the existing 
address with a newer one.  Also added the missing locking for the inet6_dev.

I don't see any way to address David Stevens' comment since there is 
currently no NA tunable in the IPv6 code.

---

This patch adds better IPv6 failover support for bonding devices,
especially when in active-backup mode and there are only IPv6 addresses
configured, as reported by Alex Sidorenko.

- Creates a new file, drivers/net/bonding/bond_ipv6.c, for the
   IPv6-specific routines.  Both regular bonds and VLANs over bonds
   are supported.

- Adds a new tunable, num_unsol_na, to limit the number of unsolicited
   IPv6 Neighbor Advertisements that are sent on a failover event.
   Default is 1.

- Creates two new IPv6 neighbor discovery functions:

   ndisc_build_skb()
   ndisc_send_skb()

   These were required to support VLANs: we have to be able to add
   the VLAN id to the skb, and ndisc_send_na() and friends shouldn't
   be asked to do this.  These two routines are basically
   __ndisc_send() split into two pieces, in a slightly different order.

- Updates Documentation/networking/bonding.txt and bumps the rev of bond
   support to 3.4.0.

On failover, this new code will generate one packet:

- An unsolicited IPv6 Neighbor Advertisement, which helps the switch
   learn that the address has moved to the new slave.

Testing has shown that sending just the NA results in pretty good
behavior when in active-backup mode; I saw no lost ping packets, for example.
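
For readers unfamiliar with the packet in question, here is a rough
userspace sketch of what an unsolicited NA body looks like.  It is not
the code from the patch: build_unsol_na() and NA_WITH_TLLA_LEN are
hypothetical names, the struct and constant names come from glibc's
<netinet/icmp6.h>, and the choice to set the Override flag (with
Solicited clear) is an assumption based on RFC 4861.  Such a message
would normally be sent to the all-nodes multicast address ff02::1; the
checksum is left at zero here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>
#include <netinet/icmp6.h>

#define ETH_ADDR_LEN 6

/* NA header + target address (24 bytes) + one 8-byte TLLA option. */
#define NA_WITH_TLLA_LEN \
	(sizeof(struct nd_neighbor_advert) + sizeof(struct nd_opt_hdr) + ETH_ADDR_LEN)

/*
 * Hypothetical helper: fill buf with an unsolicited Neighbor
 * Advertisement for `target`, announcing `mac` as its link-layer
 * address.  Returns the message length.  The ICMPv6 checksum is left
 * at zero (it needs the IPv6 pseudo-header, which is out of scope).
 */
size_t build_unsol_na(uint8_t *buf, size_t buflen,
		      const struct in6_addr *target,
		      const uint8_t mac[ETH_ADDR_LEN])
{
	struct nd_neighbor_advert na;
	struct nd_opt_hdr opt;

	assert(buflen >= NA_WITH_TLLA_LEN);

	memset(&na, 0, sizeof(na));
	na.nd_na_type = ND_NEIGHBOR_ADVERT;		/* ICMPv6 type 136 */
	na.nd_na_code = 0;
	/* Assumption: Solicited = 0 (nobody asked), Override = 1. */
	na.nd_na_flags_reserved = ND_NA_FLAG_OVERRIDE;
	na.nd_na_target = *target;

	opt.nd_opt_type = ND_OPT_TARGET_LINKADDR;	/* option type 2 */
	opt.nd_opt_len = 1;				/* in units of 8 octets */

	memcpy(buf, &na, sizeof(na));
	memcpy(buf + sizeof(na), &opt, sizeof(opt));
	memcpy(buf + sizeof(na) + sizeof(opt), mac, ETH_ADDR_LEN);

	return NA_WITH_TLLA_LEN;
}
```

With num_unsol_na set above 1, the same message would simply be sent
that many times on a failover event.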

-Brian

Signed-off-by: Brian Haley <brian.haley@hp.com>
---

From: David Stevens
Date: Thursday, October 9, 2008 - 7:23 pm

Brian,
        How do you feel about doubling up with dad_transmits for that
part?
        At least the switch/cache updating portion is the same
in both cases.

                                                        +-DLS

--

From: Brian Haley
Date: Friday, October 10, 2008 - 7:34 am

I don't really want to since this is bonding-specific behavior, and 
we're not performing DAD for the address.  This is just another sysfs 

I don't think they are.  In the DAD case we're probing for another node 
that might have the address configured, in the bond failover case we 
want to update the switch quickly so we don't drop packets.

-Brian
--

From: David Stevens
Date: Friday, October 10, 2008 - 8:03 am

I think they really are the same case, and doing DAD
would solve the problem just as well. But I can hold my nose a
little bit and live with it. :-) Getting the problem solved is
more important than the details.

                                                        +-DLS

--

From: Jay Vosburgh
Date: Friday, October 10, 2008 - 8:53 am

If I'm reading things correctly, DAD sends neighbor
solicitations, and we're sending neighbor advertisements, and not
running the DAD logic.

	As a semi-related question, what does IPv6 do if it receives a
gratuitous NA, and finds a duplicate?

	I agree that doing DAD would update the switches, peers, etc,
but, if I'm reading the IPv6 code correctly, the delay between probes is
one second (nd_tbl.retrans_time), and it looks like there's an initial
delay of up to 1 second as well (in addrconf_dad_kick, the
rtr_solicit_delay).  For failover purposes, we want to issue the
gratuitous ARP or NA packets immediately with a minimal delay between
probes.
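
	To put rough numbers on that, here is a small sketch of the
worst-case timing as described above.  The struct and function names
are hypothetical, the values are taken from this reading of the code
(1 second for each delay), and dad_transmits = 1 is assumed as the
default; it is a model of the description, not of the kernel source.

```c
#include <assert.h>

/* Timing parameters in milliseconds, as read from the description above. */
struct dad_timing {
	unsigned int initial_delay_max_ms;	/* upper bound of the random initial delay */
	unsigned int retrans_ms;		/* wait after each probe */
	unsigned int dad_transmits;		/* number of NS probes sent */
};

/* Worst-case time before the first neighbor solicitation leaves the host. */
unsigned int dad_first_probe_worst_ms(const struct dad_timing *t)
{
	return t->initial_delay_max_ms;
}

/*
 * Worst-case time until DAD finishes: the initial delay, then one
 * retransmit interval after each probe.
 */
unsigned int dad_complete_worst_ms(const struct dad_timing *t)
{
	assert(t->dad_transmits > 0);
	return t->initial_delay_max_ms + t->dad_transmits * t->retrans_ms;
}
```

	With 1000 ms for both delays and a single probe, the first
packet may be held back by up to a second and the whole procedure can
take up to two, whereas an NA sent directly on failover has no such
built-in wait.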

	Is my understanding of the DAD behavior correct?

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
--

From: Brian Haley
Date: Friday, October 10, 2008 - 9:04 am

If a node has an IPv6 neighbor entry and receives an unsolicited NA it 
will change its state to stale, forcing a re-lookup on the next 

Right, and since a bond failover event should be rare, I think sending 
the NA immediately is OK.  All those random delays are meant to cover 
the case, for example, when a router sends an advertisement with a new 
prefix - you don't want everyone that receives it to do DAD immediately 
since it could overwhelm the network.

-Brian
--

From: Vlad Yasevich
Date: Friday, October 10, 2008 - 9:29 am

^^^^^^^^^^^^

You probably meant to say "solicited".  Unsolicited NAs can only change
the state to STALE.

Also, the re-lookup will happen on a delay after the transmit.
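
For reference, the receive-side rules being discussed can be sketched
roughly as below.  This is a simplified model following my reading of
RFC 4861, section 7.2.5, not the kernel's neighbour code: the type and
function names are hypothetical, the IsRouter flag is ignored, and
link-layer addresses are assumed to be 6-byte Ethernet addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum nud_state { INCOMPLETE, REACHABLE, STALE, DELAY, PROBE };

struct neigh_entry {
	enum nud_state state;
	unsigned char lladdr[6];
};

/*
 * Apply a received NA to an existing cache entry.  `tlla` is the
 * Target Link-Layer Address option, or NULL if the NA carried none.
 */
void neigh_rx_na(struct neigh_entry *n, bool solicited, bool override,
		 const unsigned char *tlla)
{
	bool differs = tlla != NULL && memcmp(n->lladdr, tlla, 6) != 0;

	if (n->state == INCOMPLETE) {
		if (tlla == NULL)
			return;		/* nothing to learn, discard */
		memcpy(n->lladdr, tlla, 6);
		n->state = solicited ? REACHABLE : STALE;
		return;
	}

	if (!override && differs) {
		/* Conflicting address without Override: do not update it. */
		if (n->state == REACHABLE)
			n->state = STALE;
		return;
	}

	if (differs)
		memcpy(n->lladdr, tlla, 6);

	if (solicited)
		n->state = REACHABLE;	/* only a solicited NA confirms reachability */
	else if (differs)
		n->state = STALE;	/* unsolicited NA with a new address */
	/* otherwise the state is left unchanged */
}
```

In this model an unsolicited NA never moves an entry to REACHABLE; at
most it installs the new link-layer address and marks the entry STALE.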


--

From: Sridhar Samudrala
Date: Friday, October 10, 2008 - 9:56 am

I think Jay was asking about the case where the target address in the NA 
matches the address of the receiving interface. The RFCs don't describe
how to handle such a case and leave it to the implementation. Linux logs
a warning message and ignores such NA.

Thanks


--

From: Vlad Yasevich
Date: Friday, October 10, 2008 - 10:15 am

Yes, but in this case, you have duplicate addresses configured.  This
can happen when subnets merge and isn't really related to bonding driver.

In such a case, if we do an NS, it will trigger an NA and we'll end up
logging such a warning.  If we do an NA, the other end, if it's Linux, will
log this warning, or deal with it in its own manner.

So, it's really a draw.


--

From: Vlad Yasevich
Date: Friday, October 10, 2008 - 8:27 am

Looks good to me.  This provides a nice base that we can build on.

-vlad
--

From: Brian Haley
Date: Monday, October 27, 2008 - 1:07 pm

Hi Jay,

Did you ever get a chance to look at this patch?  Now that net-next is 
open I'd like to resubmit it.

Thanks,

-Brian

--

From: Jay Vosburgh
Date: Monday, October 27, 2008 - 5:24 pm

I'm good with it; I'm merging it with a couple of other feature
things and I'll post it with that set in a day or two when things open
(unless I missed a mail, I don't think net-next is open as I type this).

	-J

---
--
