Adrian P
2017-Dec-16 18:14 UTC
[Bridge] linux bridge does not forward arp reply back packets in a vmware vm
On Sat, Dec 16, 2017 at 7:35 PM, Stephen Hemminger <stephen at networkplumber.org> wrote:
> On Sat, 16 Dec 2017 16:19:03 +0200
> Adrian P <adrian27oradea at gmail.com> wrote:
>
>> Investigating this further, I have noticed that the mac address of the
>> eth0 interface from the cirros VM instance does not appear in the
>> bridge forwarding table, and this explains why everything starts
>> working only when I set ageing time to 0, because in that case all
>> packets are flooded on all ports and the bridge behaves like a hub.
>>
>> So now the question is: why the bridge does not learn the mac address
>> of the eth0 interface from the cirros VM instance? I am able to see
>> the arp request (ARP, Request who-has 10.20.21.1 tell 10.20.21.233)
>> going out from the cirros VM instance on tap interface, so the bridge
>> should learn the mac address and add it to the forwarding table.
>>
>> The reply back to the arp request (Reply 10.20.21.1 is-at
>> 00:17:08:c4:52:80) does not reach the cirros VM instance anymore, and
>> now I know why: there is no mac address in the forwarding table, so
>> the bridge does not know on which port to send the arp reply back.
>>
>> This happens with tap interfaces only. I can see many mac addresses
>> associated with "physical" interface ens160 (that is interface number
>> 1) in the forwarding table, but in case of the tap interfaces, there
>> are only two entries, and both entries shows the mac address of the
>> tap interfaces only:
>
> VMWare does ARP spoofing maybe it consumes the ARP

The flow is like this:

vmware <--> ens160 <--> bridge <--> tap if <--> eth0

So my understanding is that when the arp request goes out on eth0, through the tap interface, the bridge should learn the eth0 mac address and add it to the forwarding table. But this does not happen, so when the arp reply comes back, the bridge does not know on which interface to send it, and this is why nothing works.

I have another bare metal environment, without vmware, and I still have this issue.

I suspect this is a bug related to the bridge and the tap interface, since I see many mac addresses learned by the bridge on the ens160 interface... or maybe some tap / bridge setting? I have no clue. Where should I report such an issue?
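
The learning behaviour described in this message can be sketched as a small Python model. This is only an illustration of the general learning-bridge rule (learn the source MAC on the ingress port, forward to the known port, flood when the destination is unknown), not the kernel's implementation. The class name `LearningBridge` and its methods are hypothetical, and only ports 1 (ens160) and 3 (tap) are modelled. The last step assumes, purely for illustration, that a frame carrying the VM's source MAC shows up on port 1; the thread does not establish that this is what actually happens.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningBridge:
    """Toy model of a MAC-learning bridge (hypothetical, not kernel code)."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # forwarding table: MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the set of ports it is sent out on."""
        # Learning: the source MAC is (re)assigned to the ingress port.
        self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        # Broadcast or unknown destination: flood to every other port.
        if dst_mac == BROADCAST or out_port is None:
            return self.ports - {in_port}
        # Destination known on the ingress port itself: nothing to forward.
        if out_port == in_port:
            return set()
        return {out_port}


VM_MAC = "fa:16:3e:9a:04:95"   # eth0 of the cirros instance
GW_MAC = "00:17:08:c4:52:80"   # MAC seen in the ARP reply for 10.20.21.1

bridge = LearningBridge(ports=[1, 3])

# Expected case: the ARP request enters on the tap port and is flooded.
flooded = bridge.receive(3, VM_MAC, BROADCAST)
# The reply enters on ens160 and is forwarded to the tap port.
reply_out = bridge.receive(1, GW_MAC, VM_MAC)

# Assumed scenario: a frame with the VM's source MAC arrives on port 1,
# so the table entry moves to port 1 and the next reply is not forwarded.
bridge.receive(1, VM_MAC, BROADCAST)
dropped = bridge.receive(1, GW_MAC, VM_MAC)
```

In this model a reply is only delivered to the tap port while the VM's MAC is recorded against port 3; once the entry points at port 1, the reply arriving on port 1 has nowhere to go.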
Adrian P
2017-Dec-16 20:01 UTC
[Bridge] linux bridge does not forward arp reply back packets in a vmware vm
On Sat, Dec 16, 2017 at 8:14 PM, Adrian P <adrian27oradea at gmail.com> wrote:
> On Sat, Dec 16, 2017 at 7:35 PM, Stephen Hemminger
> <stephen at networkplumber.org> wrote:
>> On Sat, 16 Dec 2017 16:19:03 +0200
>> Adrian P <adrian27oradea at gmail.com> wrote:
>>
>>> Investigating this further, I have noticed that the mac address of the
>>> eth0 interface from the cirros VM instance does not appear in the
>>> bridge forwarding table, and this explains why everything starts
>>> working only when I set ageing time to 0, because in that case all
>>> packets are flooded on all ports and the bridge behaves like a hub.
>>>
>>> So now the question is: why the bridge does not learn the mac address
>>> of the eth0 interface from the cirros VM instance? I am able to see
>>> the arp request (ARP, Request who-has 10.20.21.1 tell 10.20.21.233)
>>> going out from the cirros VM instance on tap interface, so the bridge
>>> should learn the mac address and add it to the forwarding table.
>>>
>>> The reply back to the arp request (Reply 10.20.21.1 is-at
>>> 00:17:08:c4:52:80) does not reach the cirros VM instance anymore, and
>>> now I know why: there is no mac address in the forwarding table, so
>>> the bridge does not know on which port to send the arp reply back.
>>>
>>> This happens with tap interfaces only. I can see many mac addresses
>>> associated with "physical" interface ens160 (that is interface number
>>> 1) in the forwarding table, but in case of the tap interfaces, there
>>> are only two entries, and both entries shows the mac address of the
>>> tap interfaces only:
>>
>> VMWare does ARP spoofing maybe it consumes the ARP
>
> The flow is like this:
>
> vmware <--> ens160 <--> bridge <--> tap if <--> eth0
>
> So my understanding is that when the arp request goes out on the eth0,
> through the tap interface, the bridge should learn the eth0 mac
> address and add it to the forwarding table. But this does not happen,
> so when the arp reply back comes back, the bridge does not know on
> which interface to send the arp reply, and this is why nothing works.
>
> I have another bare metal environment, without vmware, and I still
> have this issue.
>
> I suspect this is a bug related to the bridge and the tap interface,
> since I see many mac address learned by the bridge on the ens160
> interface... or maybe some tap / bridge setting? I have no clue. Where
> should I report such issue?

Further investigation reveals something strange: when the communication starts with an arp request (which happens almost all the time), the bridge wrongly assigns the eth0 mac address to port 1, instead of port 3.

Flow again:

default gw --- vmware --- [ ens160 bridge tap ] --- eth0

On my bridge, ens160 is port 1, and the tap interface is port 3. The eth0 mac address is fa:16:3e:9a:04:95.

What I have found is that in the forwarding table, the bridge wrongly assigns the eth0 mac address to port 1, which is the ens160 interface, instead of assigning it to port 3, which is the tap interface. This happens only if the arp table in the cirros VM instance does not contain the mac address of the destination I am pinging (the default gw in this case), so the cirros VM sends an arp request.

See below the eth0 mac address wrongly assigned in the forwarding table to port 1:

# brctl showmacs brq025a9a94-58 | grep fa:16:3e:9a:04:95
  1     fa:16:3e:9a:04:95       no                 0.67

However, if I manually add the mac address of the destination IP I am pinging to the arp table of the cirros VM instance, so that no arp request is sent and only icmp packets go out, then the bridge correctly assigns the eth0 mac address to port 3, which is the tap interface, and everything starts working fine.

See below the eth0 mac address correctly assigned in the forwarding table to port 3:

# brctl showmacs brq025a9a94-58 | grep fa:16:3e:9a:04:95
  3     fa:16:3e:9a:04:95       no                 9.26
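
For anyone repeating this check, the port lookup can be scripted. The following is a minimal Python sketch that parses text in the four-column `brctl showmacs` layout shown in this message (port number, MAC address, "is local" flag, ageing timer) and reports which port a given MAC is recorded against. The function names `parse_showmacs` and `ports_for_mac` are hypothetical helpers, and the sketch works on captured text rather than calling `brctl` itself.

```python
def parse_showmacs(text):
    """Parse `brctl showmacs`-style text into a list of entry dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly four fields and start with a port number;
        # the header line and blank lines are skipped.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port, mac, is_local, age = fields
        entries.append({
            "port": int(port),
            "mac": mac.lower(),
            "local": is_local == "yes",
            "age": float(age),
        })
    return entries


def ports_for_mac(text, mac):
    """Return the sorted list of ports on which `mac` appears."""
    wanted = mac.lower()
    return sorted({e["port"] for e in parse_showmacs(text) if e["mac"] == wanted})


# The two outputs quoted in this message.
BROKEN = "  1     fa:16:3e:9a:04:95       no                 0.67\n"
WORKING = "  3     fa:16:3e:9a:04:95       no                 9.26\n"

EXPECTED_TAP_PORT = 3  # the tap interface is port 3 on this bridge

for label, output in (("broken", BROKEN), ("working", WORKING)):
    ports = ports_for_mac(output, "fa:16:3e:9a:04:95")
    status = "ok" if ports == [EXPECTED_TAP_PORT] else "unexpected port"
    print(label, ports, status)
```

Feeding it the output captured right after the VM sends an arp request, and again after a static arp entry is added, gives a quick way to see whether the entry sits on the tap port or on ens160.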