Matthew Kent
2008-Dec-04 00:20 UTC
[Bridge] received packet with own address as source address
Trying to determine if I have a misconfiguration, a misunderstanding, or have stumbled on a bug.

Have a CentOS 5.2 server with 2 e1000e NICs in a balance-xor bond, check this out:

Test 1 - arping the bond device @ 172.16.0.117 from a second machine
---
root@foo [/root]# arping -c 1 -b -I eth0 172.16.0.117
ARPING 172.16.0.117 from 172.16.0.116 eth0
Unicast reply from 172.16.0.117 [00:15:17:70:A3:88] 0.607ms
Unicast reply from 172.16.0.117 [00:15:17:70:A3:88] 0.648ms
Sent 1 probes (1 broadcast(s))
Received 2 response(s)

Nothing logged to the kernel ring buffer on the server. All looks good and happy.

Test 2 - now add a bridge @ 172.16.0.117 with the bond as a port:
---
root@foo [/root]# arping -c 1 -b -I eth0 172.16.0.117
ARPING 172.16.0.117 from 172.16.0.116 eth0
Unicast reply from 172.16.0.117 [00:15:17:70:A3:88] 0.603ms
Unicast reply from 172.16.0.117 [00:15:17:70:A3:88] 0.642ms
Sent 1 probes (1 broadcast(s))
Received 2 response(s)

Looks good, traffic moves, etc., but now the following gets logged on the server for every broadcast packet:

[26489.040112] bond0: received packet with own address as source address

Capturing the traffic for both tests and comparing, they look identical, but the bridge insists on emitting the warning-level message from br_fdb_update() in net/bridge/br_fdb.c.

The printk warning, even though it is rate limited, seems to add a noticeable lag. Disabling the warning with a patch made things happy again.

A couple of questions from this for the more knowledgeable folks:

1) Are the two ARP replies from a balance-xor bond normal?
2) Should the bridge handle this, or should I just not be tying an IP to the bridge in balance-xor mode?

Thanks for your time!
--
Matthew Kent \ SA \ bravenet.com
Jay Vosburgh
2008-Dec-04 00:46 UTC
[Bridge] [Bonding-devel] received packet with own address as source address
Matthew Kent <matt at bravenet.com> wrote:

>Trying to determine if I have a misconfiguration, misunderstanding or
>have stumbled on a bug.
>
>Have a CentOS 5.2 server with 2 e1000e nics in a balance-xor bond, check
>this out:

I suspect your switch is misconfigured. The balance-xor mode is nominally "Etherchannel compatible", and the switch ports connected to a balance-xor bond should be in Etherchannel mode ("Trunking", etc., but not "LACP" or "802.3ad").

If the switch doesn't know the ports are aggregated, it may very well send broadcasts received on one port back out the other port, which may be the cause of what you're seeing. The switch might also whine about flapping of the MAC address.

If I set up bonding here with the switch unconfigured for Etherchannel, I see the same behavior as this:

ARPING 172.16.0.117 from 172.16.0.116 eth0
Unicast reply from 172.16.0.117 [00:15:17:70:A3:88] 0.607ms
Unicast reply from 172.16.0.117 [00:15:17:70:A3:88] 0.648ms
Sent 1 probes (1 broadcast(s))
Received 2 response(s)

Specifically the "Received 2 response(s)" part. This happens because the bond receives one copy of the packet on each port, and responds to each.

After I configure the switch ports correctly for Etherchannel, there is only 1 response.

Without the switch configuration, IPv6 addrconf also complains that a duplicate address was detected.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, fubar at us.ibm.com
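
The flooding behavior described in the reply above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not real switch firmware or kernel code: the function name flood, the port names client, bond_a and bond_b, and the lag parameter are all hypothetical. It assumes a switch floods a broadcast out of every port except the one it arrived on, and that when ports are configured as one aggregated group the switch sends a single copy into the group and never sends a frame back into the group it came from.

```python
# Toy model of broadcast flooding on a switch (illustrative only; all names
# here are hypothetical and do not correspond to any real switch or kernel API).

def flood(ingress, ports, lag=frozenset()):
    """Return the list of ports a broadcast frame is sent out of.

    ingress: port the frame arrived on
    ports:   all ports on the switch
    lag:     ports the switch treats as one aggregated link
             (empty when the switch is not configured for aggregation)
    """
    egress = []
    # A frame that arrived on a group member is never sent back into that group.
    lag_served = ingress in lag
    for port in ports:
        if port == ingress:
            continue
        if port in lag:
            if lag_served:
                continue
            lag_served = True  # only one member of the group gets a copy
        egress.append(port)
    return egress


PORTS = ["client", "bond_a", "bond_b"]
BOND = frozenset({"bond_a", "bond_b"})

# ARP broadcast sent by the second machine.
arp_unconfigured = flood("client", PORTS)        # both bond ports get a copy
arp_aggregated = flood("client", PORTS, BOND)    # one bond port gets a copy

# Broadcast sent by the server itself out of one bond port.
own_unconfigured = flood("bond_a", PORTS)        # comes back in on bond_b
own_aggregated = flood("bond_a", PORTS, BOND)    # never returns to the bond

print(arp_unconfigured, arp_aggregated, own_unconfigured, own_aggregated)
```

In this model the unconfigured switch delivers the ARP request to both bond ports, which matches the two replies seen by arping, and it also reflects the server's own broadcasts back in on the other bond port, which is the kind of frame that would carry the server's own MAC as its source address.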