Adaptive Load Balancing on 2.4.23

Submitted by vofka
on December 12, 2003 - 11:05am

I'm trying to use the bonding driver in kernel version 2.4.23 to aggregate two NICs. The switches don't support Round-Robin, XOR or 802.3ad mode, so I am trying to use Adaptive Load Balancing (ALB) mode. Unfortunately, I'm running into a few snags!

I'm using two 3Com 3c905C NICs. Both NICs work fine on their own. To get things working, I am doing:

    # modprobe bonding mode=6
    # ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
    # ifenslave bond0 eth0 eth1

The first two commands work OK, and the bonding interface comes up. When I try to enslave the two NICs, ifenslave responds with:

    SIOCBONDENSLAVE: Operation not supported.

And the following entries are recorded in /var/log/messages:

    bonding: Error: alb_set_slave_mac_addr: dev->set_mac_address of dev eth0 failed!
    ALB mode requires that the base driver support setting the hw address also when the
    network devices interface is open

(and the same for eth1).
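
To see at a glance which slaves were rejected, the log lines above can be filtered by device name. This is only a sketch: the function name failed_slaves is made up for illustration, and the pattern assumes the exact message format quoted above.

```shell
# failed_slaves LOGFILE: print each device named in an
# alb_set_slave_mac_addr failure line, once per device.
# (Hypothetical helper; the pattern matches the message quoted above.)
failed_slaves() {
    sed -n 's/.*alb_set_slave_mac_addr: dev->set_mac_address of dev \([^ ]*\) failed.*/\1/p' "$1" \
        | sort -u
}

# Usage: failed_slaves /var/log/messages
```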

I've looked in the code for the bonding driver, and I can see exactly where it is failing, but I don't know nearly enough about kernel drivers and modules to trace the problem back to its root (which appears to be in the 3c59x driver).

I've looked for a list of devices which are compatible with ALB in the bonding driver, and have come up empty (Google was not my friend!)

I have a few quick questions:

* Does anyone have any good pointers to a compatibility list between network drivers and the bonding driver?

* Can someone give me any pointers about which bits of the code in each driver to look at so that I can make such a list for myself?

* Does anyone know of any patches for the 3c59x driver which will allow the bonding driver to change the MAC address while the interface is open? (Is this even possible in the hardware, or is it purely a driver issue?)

Any pointers to more information, or advice on which bits of the code to look at would be greatly appreciated. I'm happy to apply patches, or even to have a stab at writing one myself if I can work out where to start!!!

drivers..

on December 15, 2003 - 3:30pm


    cd drivers/net
    find . -name \*.c -or -name \*.h | xargs grep set_mac_address
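
To turn that search into a rough candidate list, a variant along these lines prints only the file names. It is a sketch under assumptions: list_mac_drivers is a made-up name, and a hit only means the file mentions set_mac_address, not that the driver accepts the change while the interface is open, so each hit still needs to be read by hand.

```shell
# list_mac_drivers DIR: print the .c and .h files under DIR (default: the
# current directory) that mention set_mac_address, one per line, sorted.
# (Hypothetical helper; a match is a candidate, not proof of ALB support.)
list_mac_drivers() {
    find "${1:-.}" \( -name '*.c' -o -name '*.h' \) \
        -exec grep -l 'set_mac_address' {} + | sort
}

# Usage, from the top of the kernel tree: list_mac_drivers drivers/net
```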

set_mac_address function bindings...

on December 16, 2003 - 11:40am

Thanks...

That helped me track things a little closer to the source. It seems that the 3c59x driver has no binding in the "net_device" structure for the "set_mac_address" function.

I've looked at Donald Becker's 3c59x site at http://www.scyld.com/network/vortex.html, but this version of the code also has no bindings for the set_mac_address function.

It's looking like my only options are going to be:

* Get new network switches

* Use different NICs (such as the Intel EtherExpress e100s)

* Get my head buried in the kernel code and write a patch

Maybe I'll have a look at 2.6.0-test11 to see if there is support lurking in there somewhere. Unfortunately, from what I recall, the 2.6.x series still has no support for the Software RAID features on the HighPoint HPT374 IDE RAID Controller (the motherboard is an ABIT AT7-MAX board, with 8x80GB HDDs on the HPT374, and an 8GB boot HDD!)

Resolved...

on December 17, 2003 - 6:38am

Well, following my original post, I posted a similar request to the LKML and got no response. I then followed up by sending the following message to Donald Becker at: vortex -at- scyld.com:

Hi,

I've got this address as a contact address from the top of the latest BK
commit of the 3c59x.c driver in the Linux Kernel.

I'm not 100% sure this is the correct place to post my query, but I've
already posted to the LKML to no luck.  If this is not the right place,
I would greatly appreciate if you could let me know where is the correct
place to post this query...

Basically, I am trying to use two 3c905C (Tornado) NIC's with the 2.4.23
bonding driver.  This driver needs the base NIC driver to support
changing the card MAC address whilst the card is open.  It appears that
the 3c59x Driver does not support this operation.

I have looked at the 3c59x code, and the Bonding driver code, as well as
the code for drivers which do support this operation (such as the Intel
EtherExpress e100) to see if I can glean any useful information. 
Unfortunately, I'm not as familiar with the kernel code as I would like
to be, so I'm a little in the dark.

I have found that cards which do support changing the MAC address of the
card while the device is open have a function binding to the
set_mac_address function in the net_device structure for the card.  What
I can't work out is whether it would be possible to write an appropriate
function for the 3c59x driver.

I've also looked at other versions of the driver at scyld.com, and also
at 3Com's own 3c90x driver to see if I can get any useful information
there - or to see if there is an implementation of the set_mac_address
function that I can adapt to the version of the driver included with the
stock 2.4.23 kernel, also to no luck.

Is it possible to change the MAC address of these cards whilst the
device is open?  If not, is it simply a driver limitation, or is there a
hardware limitation which prevents the change?  Finally, if it is just a
driver limitation, are there any good resources on the kernel drivers
that I can look at that would help me start to write a patch to enable
this functionality?

Any help or advice would be greatly appreciated.  In the meantime, my
searches through Google and other kernel sites continues...

I very promptly received this reply:

On 16 Dec 2003, Mike Insch wrote:

> Basically, I am trying to use two 3c905C (Tornado) NIC's with the 2.4.23
> bonding driver.  This driver needs the base NIC driver to support
> changing the card MAC address whilst the card is open.

This is deeply flawed: there are inherent race and packet drop
problems.
Most NICs do not allow changing the driver MAC address when the receiver
is active because of the problems.

> It appears that the 3c59x Driver does not support this operation.
> I have looked at the 3c59x code, and the Bonding driver code, as well as
> the code for drivers which do support this operation (such as the Intel
> EtherExpress e100) to see if I can glean any useful information. 

The e100 is a good example of why this is silly.  The chip changes the
MAC address using the same command queue as transmitting a packet, which
means that the operation might be delayed for a significant period of
time.  To know when the command is executed, the driver must stop the
transmit queue, wait until the queued packets are sent, send the
'CmdIASetup' update command, and then resume normal operation.  And
while the command is running, the receiver is disabled, so the NIC is
silently dropping packets.  Doh!  It's exactly as bad as down-change-up,
except that the driver is now more complex.

So, it seems at the moment that a software solution to my particular problem is not entirely the right way to go. Time to get management to upgrade our switches!

Thanks to Donald for a fast and extremely informative reply :)
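
For reference, the "down-change-up" sequence Donald mentions can be sketched with ifconfig. This is a minimal illustration, not something from his reply: down_change_up is a made-up helper name, the address in the example is a placeholder, and the function only prints the commands so it can be tried without root. Packets arriving while the interface is down are lost, which is the point of his comparison.

```shell
# down_change_up IFACE MAC: print the three ifconfig commands that take
# the interface down, set a new hardware address, and bring it back up.
# (Hypothetical helper; it prints rather than executes. Pipe the output
# to "sh" as root to actually run the commands.)
down_change_up() {
    iface="$1"
    mac="$2"
    printf 'ifconfig %s down\n' "$iface"
    printf 'ifconfig %s hw ether %s\n' "$iface" "$mac"
    printf 'ifconfig %s up\n' "$iface"
}

# Example with a placeholder address:
# down_change_up eth0 02:00:00:00:00:01
```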
