On Fri, Jan 08, 2016 at 02:52:45PM -0700, John Nielsen wrote:
> Hi all-
>
> I'm trying to troubleshoot a problem on a machine running recent
> 10-STABLE. The machine has two physical interfaces and hosts a number
> of services, including a bhyve VM (FreeBSD 10.2-RELEASE) acting as a
> network appliance. The VM has three interfaces: external,
> internal-trusted and internal-guest. Each VM interface is plumbed to a
> TAP device on the host which in turn is a member of a bridge. Here is
> the current (working) setup:
>
> External <--------> Host    <-> Host    <-> Host  <-> VM
> port                re0         bridge2     tap21     vtnet1
>
> Switch <-> Host <-> Host    <-> Host    <-> Host  <-> VM
> port       em0      em0.2       bridge0     tap20     vtnet0
>             ^
>             \-----> Host    <-> Host    <-> Host  <-> VM
>                     em0.103     bridge1     tap22     vtnet2
>
> Since there is not much external traffic, most of the bandwidth
> potential of re0 is wasted while em0 is sometimes busy. So I'd like to
> move to a LAGG setup, as below:
>
> External  Trusted  Untrusted
> VLAN 99   VLAN 2   VLAN 103
>    |        |        |
>     \       |       /
>     /---------------\    /------> Host      <--> Host    <-> Host  <-> VM
>     |    switch     |    |        lagg0.99       bridge2     tap21     vtnet1
>     \---------------/    |
>        |     |           |  /---> Host      <--> Host    <-> Host  <-> VM
>        |     v           |  |     lagg0.2        bridge0     tap20     vtnet0
>        |    Host         v  v
>         \   re0 <------> Host <-> Host      <--> Host    <-> Host  <-> VM
>          \               lagg0    lagg0.103      bridge1     tap22     vtnet2
>           \-> Host        ^
>               em0 <-------/
>
> So in other words, plugging the external port into the switch,
> creating a new "external" VLAN, adding both em0 and re0 into a new LAGG
> and creating VLAN child interfaces off of that.
>
> I tried the new setup today and it worked except that the VM no longer
> received ARP replies from the external network. Using tcpdump on the
> host's lagg0.99, I saw the ARP request from the VM go out and an ARP
> reply come back, but that's as far as it went. I did not see the ARP
> reply on the host's bridge2 or tap21 interfaces, and the VM never
> received it.
>
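
One way to narrow this down is to watch for the ARP exchange at every hop
in turn. A rough sketch, assuming the interface names from your LAGG
diagram and that tcpdump is run as root on the host; the arp_capture_cmd
helper is made up for illustration and only prints the commands:

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command that shows ARP frames on
# one interface, with link-level headers (-e) and no name lookups (-n).
arp_capture_cmd() {
    printf 'tcpdump -n -e -i %s arp\n' "$1"
}

# Hops between the switch and the VM (names assumed from the diagram).
for ifn in lagg0 lagg0.99 bridge2 tap21; do
    arp_capture_cmd "$ifn"
done
```

Run each printed command in its own terminal while the VM sends ARP
requests. The first hop where the reply no longer shows up is where it
is being dropped.
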
> I didn't make any changes on the VM, and all I changed on the host was
> the networking via /etc/rc.conf. The host does run ipfw but I verified
> that none of the rules reference any stale interface names. I have also
> previously disabled all firewalling of bridged packets:
> net.link.bridge.pfil_onlyip=0
> net.link.bridge.pfil_member=0
> net.link.bridge.pfil_bridge=0
>
> I also verified that "ifconfig bridge2 addr" contained the MAC
> addresses of both the VM and the external device on the correct ports.
>
> So in the LAGG setup, why aren't the ARP replies going across bridge2
> to the VM? Any ideas on how to narrow down the cause appreciated.

Did you try ng_bridge?
I was bitten by ARP problems with if_bridge and switched to ng_bridge.

In /etc/rc.conf:
network_interfaces="lo0 vr0 ath0 ngeth0"

# cat /etc/start_if.ngeth0
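#!/bin/sh
# Load the netgraph node types used below.
kldload -v ng_eiface ng_ether ng_bridge
# Create the virtual Ethernet interface ngeth0 and set its MAC address.
ngctl mkpeer . eiface hook ether
ifconfig ngeth0 ether 00:40:63:c1:87:02
# Attach a bridge node to ngeth0 on link0 and name it br0.
ngctl mkpeer ngeth0: bridge ether link0
ngctl name ngeth0:ether br0
# Add wlan0 on link1: promiscuous, source MACs left untouched.
ngctl connect wlan0: br0: lower link1
ngctl msg wlan0: setpromisc 1
ngctl msg wlan0: setautosrc 0
# Add vr0 on link2 with the same settings.
ngctl connect vr0: br0: lower link2
ngctl msg vr0: setpromisc 1
ngctl msg vr0: setautosrc 0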
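
To check that the netgraph bridge came up as intended, something along
these lines may help. It is only a sketch: check_ng_bridge and the NGCTL
override are names I made up here, and it assumes the bridge node is
called br0 with links 0 to 2 as in the script above.

```shell
#!/bin/sh
# NGCTL can be overridden (e.g. NGCTL=echo) to preview the commands
# without touching netgraph.
NGCTL=${NGCTL:-ngctl}

# Hypothetical helper: inspect one ng_bridge node.
# usage: check_ng_bridge <node-name> <link-number>...
check_ng_bridge() {
    br=$1
    shift
    # Node type and the hooks currently connected to it.
    "$NGCTL" show "${br}:"
    # Per-link packet and error counters.
    for link in "$@"; do
        "$NGCTL" msg "${br}:" getstats "$link"
    done
    # MAC addresses the bridge has learned, and on which link.
    "$NGCTL" msg "${br}:" gettable
}
```

For the setup above that would be "check_ng_bridge br0 0 1 2", run as
root. If the VM's MAC address turns up in the table on the expected
link, the bridge is at least learning it.
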