Hi all-
I'm trying to troubleshoot a problem on a machine running recent 10-STABLE.
The machine has two physical interfaces and hosts a number of services,
including a bhyve VM (FreeBSD 10.2-RELEASE) acting as a network appliance. The
VM has three interfaces: external, internal-trusted and internal-guest. Each VM
interface is plumbed to a TAP device on the host which in turn is a member of a
bridge. Here is the current (working) setup:
External <--------> Host <-> Host <-> Host <-> VM
port                re0      bridge2  tap21    vtnet1

Switch <-> Host <-> Host <-> Host <-> Host <-> VM
port       em0      em0.2    bridge0  tap20    vtnet0
            ^
            \-----> Host <-> Host <-> Host <-> VM
                    em0.103  bridge1  tap22    vtnet2
Since there is not much external traffic, most of the bandwidth potential of re0
is wasted while em0 is sometimes busy. So I'd like to move to a LAGG setup,
as below:
External  Trusted  Untrusted
VLAN 99   VLAN 2   VLAN 103
   |         |        |
    \        |       /
     /---------------\ /------> Host <--> Host <-> Host <-> VM
     |    switch     | |        lagg0.99  bridge2  tap21    vtnet1
     \---------------/ |
      |     |          |  /---> Host <--> Host <-> Host <-> VM
      |     v          |  |     lagg0.2   bridge0  tap20    vtnet0
      |    Host        v  v
      \    re0 <-----> Host <-> Host <--> Host <-> Host <-> VM
       \               lagg0    lagg0.103 bridge1  tap22    vtnet2
        \-> Host        ^
            em0 <------/
In other words: plug the external port into the switch, create a new
"external" VLAN, add both em0 and re0 to a new LAGG, and create VLAN child
interfaces off of that.
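
For reference, the planned layout might look roughly like this in /etc/rc.conf. This is only a sketch, not the configuration actually used: the interface names and VLAN tags come from the diagram above, but the lagg protocol (lacp here) is an assumption, and the ifconfig_lagg0_99 style of variable name assumes rc(8) maps the dot in lagg0.99 to an underscore.

```shell
# /etc/rc.conf fragment (sketch). "laggproto lacp" is an assumption;
# the protocol actually used is not stated in this thread.
cloned_interfaces="lagg0 bridge0 bridge1 bridge2 tap20 tap21 tap22"
ifconfig_em0="up"
ifconfig_re0="up"
ifconfig_lagg0="laggproto lacp laggport em0 laggport re0 up"

# Numeric entries create lagg0.2, lagg0.99 and lagg0.103.
vlans_lagg0="2 99 103"
ifconfig_lagg0_2="up"
ifconfig_lagg0_99="up"
ifconfig_lagg0_103="up"

# One bridge per VLAN, each joined to the matching tap device of the VM.
ifconfig_bridge0="addm lagg0.2 addm tap20 up"
ifconfig_bridge1="addm lagg0.103 addm tap22 up"
ifconfig_bridge2="addm lagg0.99 addm tap21 up"
```

Note that the VLAN children take their MAC address from lagg0, which in turn takes it from one of its ports.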
I tried the new setup today and it worked except that the VM no longer received
ARP replies from the external network. Using tcpdump on the host's lagg0.99,
I saw the ARP request from the VM go out and an ARP reply come back, but
that's as far as it went. I did not see the ARP reply on the host's
bridge2 or tap21 interfaces, and the VM never received it.
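
A hop-by-hop capture like the one described can be sketched as follows. The helper name arp_capture_cmd is made up for illustration; it only prints the tcpdump command for each host-side interface (each capture blocks, so they would be run in separate terminals). The -e flag is worth including because it shows the Ethernet source and destination addresses of each ARP frame.

```shell
#!/bin/sh
# Host-side interfaces on the external path, as named in the message above.
hops="lagg0.99 bridge2 tap21"

# arp_capture_cmd is a hypothetical helper: it prints, but does not run,
# the capture command for one interface.
arp_capture_cmd() {
    # -n: no name resolution, -e: print link-level (MAC) headers
    printf 'tcpdump -n -e -i %s arp\n' "$1"
}

for ifn in $hops; do
    arp_capture_cmd "$ifn"
done
```

Comparing the destination MAC of the reply seen on lagg0.99 with the MAC addresses of the host's own interfaces is one way to narrow down where the frame stops.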
I didn't make any changes on the VM, and all I changed on the host was the
networking via /etc/rc.conf. The host does run ipfw but I verified that none of
the rules reference any stale interface names. I have also previously disabled
all firewalling of bridged packets:
net.link.bridge.pfil_onlyip=0
net.link.bridge.pfil_member=0
net.link.bridge.pfil_bridge=0
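
To double-check that these values are really in effect on the running system, something along these lines could be used. It is a minimal sketch: nonzero_knobs is a made-up helper name, and it assumes the usual "name: value" output format of sysctl(8).

```shell
#!/bin/sh
# nonzero_knobs (hypothetical helper) reads "name: value" lines on stdin
# and reports every entry whose value is not 0.
nonzero_knobs() {
    awk -F': ' 'NF == 2 && $2 != 0 { print $1 " is " $2 }'
}

# On the host, with if_bridge loaded:
#   sysctl net.link.bridge | grep pfil | nonzero_knobs
# No output means every pfil knob under net.link.bridge is 0.
```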
I also verified that "ifconfig bridge2 addr" contained the MAC
addresses of both the VM and the external device on the correct ports.
So in the LAGG setup, why aren't the ARP replies going across bridge2 to the
VM? Any ideas on how to narrow down the cause would be appreciated.
Thanks!
-John Nielsen
> On Jan 8, 2016, at 2:52 PM, John Nielsen <lists at jnielsen.net> wrote:
>
> I'm trying to troubleshoot a problem on a machine running recent 10-STABLE. ... So in other words, plugging the external port into the switch, creating a new "external" VLAN, adding both em0 and re0 into a new LAGG and creating VLAN child interfaces off of that.
>
> ...
>
> So in the LAGG setup, why aren't the ARP replies going across bridge2 to the VM?

For the archives: this turned out to be operator error in the form of a MAC address conflict between the lagg0 interface on the host and the vtnet1 interface in the VM.

JN
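
A conflict of this kind can be spotted by listing the MAC address of every interface on the path and looking for repeats. The sketch below assumes the pairs have already been collected by hand; dup_macs is a made-up helper name and the addresses are placeholders, not the ones from this setup.

```shell
#!/bin/sh
# dup_macs (hypothetical helper) reads "name mac" pairs on stdin and prints
# every MAC that appears on more than one interface. Case is ignored.
dup_macs() {
    awk '{ mac = tolower($2); names[mac] = names[mac] " " $1; n[mac]++ }
         END { for (m in n) if (n[m] > 1) print m ":" names[m] }'
}

# Placeholder data. On the host, a pair can be produced with e.g.
#   ifconfig lagg0 | awk '/ether/ { print "lagg0", $2 }'
# and likewise for vtnet1 inside the VM.
printf '%s\n' \
    'lagg0 02:00:00:00:00:01' \
    'vtnet1 02:00:00:00:00:01' \
    'tap21 02:00:00:00:00:02' | dup_macs
```

With the placeholder input this prints `02:00:00:00:00:01: lagg0 vtnet1`. The symptom fits such a conflict: lagg0.99 carries the same MAC as lagg0, and a bridge that sees a frame addressed to one of its own member interfaces would be expected to hand it to the host rather than forward it to tap21.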
On Fri, Jan 08, 2016 at 02:52:45PM -0700, John Nielsen wrote:
> ...
>
> So in the LAGG setup, why aren't the ARP replies going across bridge2 to the
> VM? Any ideas on how to narrow down the cause appreciated.

Did you try ng_bridge? I was also plagued by ARP problems with if_bridge and
switched to ng_bridge:

network_interfaces="lo0 vr0 ath0 ngeth0"

cat /etc/start_if.ngeth0
#!/bin/sh
kldload -v ng_eiface ng_ether ng_bridge
# Create the virtual Ethernet interface ngeth0 and give it a fixed MAC.
ngctl mkpeer . eiface hook ether
ifconfig ngeth0 ether 00:40:63:c1:87:02
# Attach a new ng_bridge node to ngeth0 and name it br0.
ngctl mkpeer ngeth0: bridge ether link0
ngctl name ngeth0:ether br0
# Hook wlan0 into the bridge: promiscuous, source MAC left untouched.
ngctl connect wlan0: br0: lower link1
ngctl msg wlan0: setpromisc 1
ngctl msg wlan0: setautosrc 0
# Same for vr0.
ngctl connect vr0: br0: lower link2
ngctl msg vr0: setpromisc 1
ngctl msg vr0: setautosrc 0