On Mon, Mar 16, 2009 at 2:01 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
Eric, based on your inability to recreate this, I tried on some other
hardware I had lying around that has an AMD chipset built-in NIC.
I could not recreate the problem on that hardware. I'm starting to
think this is an e1000 problem. Both the e1000 and e1000e drivers
use the following logic when updating the multicast hash table:
    /* clear the old settings from the multicast hash table */
    for (i = 0; i < mta_reg_count; i++) {
        E1000_WRITE_REG_ARRAY(hw, MTA, i, 0);
        E1000_WRITE_FLUSH();
    }

    /* load any remaining addresses into the hash table */
    for (; mc_ptr; mc_ptr = mc_ptr->next) {
        hash_value = e1000_hash_mc_addr(hw, mc_ptr->da_addr);
        e1000_mta_set(hw, hash_value);
    }
There's clearly a window where the NIC doesn't have the multicast
addresses loaded. This may just be broken-as-designed. If anyone
else happens to have some e1000 hardware and wants to see if you
can recreate this, I'd be curious.
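
To make the window concrete, here is a small user-space model of the
two loops above. It is only a sketch: the register count, the
toy_hash() function, and every name in it are assumptions made for
illustration, not the driver's or the hardware's real behaviour. It
counts how many intermediate register states reject an address that
was already joined, first for clear-then-load and then for one possible
alternative (build the table in memory, then write each register once).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MTA_REGS 128                /* model value: 128 x 32 filter bits */

struct mac { uint8_t b[6]; };
struct mta_model { uint32_t reg[MTA_REGS]; };   /* stands in for the NIC */

/* Toy 12-bit hash; NOT the hash the e1000 hardware actually uses. */
static uint32_t toy_hash(const struct mac *m)
{
    return (((uint32_t)m->b[4] << 8) | m->b[5]) & 0xFFF;
}

/* Set the filter bit for `hash` in a register array. */
static void mta_set(uint32_t *regs, uint32_t hash)
{
    assert(hash < MTA_REGS * 32);
    regs[hash >> 5] |= (uint32_t)1 << (hash & 31);
}

/* Would the modelled filter pass a frame addressed to `m` right now? */
static bool mta_accepts(const struct mta_model *hw, const struct mac *m)
{
    uint32_t h = toy_hash(m);
    return (hw->reg[h >> 5] >> (h & 31)) & 1;
}

/* Clear-then-load, shaped like the quoted loops. Returns the number of
 * intermediate register states in which `probe` is rejected. */
static int update_clear_then_load(struct mta_model *hw,
                                  const struct mac *list, size_t n,
                                  const struct mac *probe)
{
    int gaps = 0;

    for (size_t i = 0; i < MTA_REGS; i++) {
        hw->reg[i] = 0;
        gaps += !mta_accepts(hw, probe);
    }
    for (size_t i = 0; i < n; i++) {
        mta_set(hw->reg, toy_hash(&list[i]));
        gaps += !mta_accepts(hw, probe);
    }
    return gaps;
}

/* Alternative sketch: compute the new table in a shadow copy, then
 * overwrite each register exactly once with its final value. */
static int update_via_shadow(struct mta_model *hw,
                             const struct mac *list, size_t n,
                             const struct mac *probe)
{
    uint32_t shadow[MTA_REGS] = {0};
    int gaps = 0;

    for (size_t i = 0; i < n; i++)
        mta_set(shadow, toy_hash(&list[i]));
    for (size_t i = 0; i < MTA_REGS; i++) {
        hw->reg[i] = shadow[i];
        gaps += !mta_accepts(hw, probe);
    }
    return gaps;
}
```

With one group already loaded and a second being added, the first
function reports a non-zero gap count for the existing group and the
second reports zero. Whether real hardware drops frames during those
states depends on timing the model does not capture.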
Some other notes just FYI...
- RcvbufErrors in /proc/net/snmp doesn't get incremented when this happens
- there are no messages in dmesg
- frames get dropped when the program calls exit() and all the sockets
  get closed (and multicast joins dropped) as well as when the
  ADD_MEMBERSHIPs happen
- The problem happens even when adding a sleep(1) in between each of the
  ADD_MEMBERSHIP calls.
--
Dave B