About two weeks ago, I upgraded from the latest 11-stable to the latest 12-stable. Since then, I periodically see network throughput come to a near standstill.

This FreeBSD machine is an ESXi VM with two interfaces, and it acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses the default 1500.

Besides seeing massive packet loss and huge latency (~200 ms for on-LAN ping times), I know the problem has occurred because my lldpd reports:

Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0

And if I turn on ipfw verbose messages, I see tons of:

Jul 26 16:02:23 namale kernel: ipfw: pullup failed

This leads me to believe packets are being corrupted on ingress. I've applied all the recent iflib changes, but the problem persists. What causes it, I don't know.

The only thing that changed (and yes, it's a big one) is that I upgraded to 12-stable. The rest of the network infrastructure and topology has remained the same. This did not happen at all in 11-stable.

I'm open to suggestions.

Thanks.

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc

27.07.2020 5:16, Joe Clarke wrote:
> About two weeks ago, I upgraded from the latest 11-stable to the latest 12-stable. After that, I periodically see the network throughput come to a near standstill. This FreeBSD machine is an ESXi VM with two interfaces. It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses the default 1500.
>
> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN ping times), I know the problem has occurred because my lldpd reports:
>
> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>
> And if I turn on ipfw verbose messages, I see tons of:
>
> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>
> This leads me to believe packets are being corrupted on ingress. I've applied all the recent iflib changes, but the problem persists. What causes it, I don't know.
>
> The only thing that changed (and yes, it's a big one) is I upgraded to 12-stable. Meaning, the rest of the network infra and topology has remained the same. This did not happen at all in 11-stable.
>
> I'm open to suggestions.

First, try:

    ifconfig $ifname -rxcsum -txcsum
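
For reference, the suggestion above can be sketched as a small script that applies the same flags to each interface in turn. This is only an illustration, not something posted in the thread: the interface names vmx0 and vmx1 come from the original report, while the function name disable_csum_offload and the IFACES and DRY_RUN variables are made up for this sketch. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: disable RX/TX checksum offload on a list of interfaces.
# IFACES and DRY_RUN are hypothetical knobs used only in this example.
IFACES="${IFACES:-vmx0 vmx1}"   # interface names taken from the report
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands only, 0 = run them

disable_csum_offload() {
    for ifname in "$@"; do
        # Same flags as suggested in the reply: -rxcsum -txcsum
        cmd="ifconfig $ifname -rxcsum -txcsum"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            # Requires root; changes take effect immediately but do not
            # persist across reboots.
            $cmd
        fi
    done
}

# Word splitting of $IFACES is intentional here.
disable_csum_offload $IFACES
```

Running it with DRY_RUN=0 as root would apply the change. Whether disabling offload actually helps in this setup is exactly what the test is meant to find out.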

On Sun, Jul 26, 2020 at 06:16:07PM -0400, Joe Clarke wrote:
> About two weeks ago, I upgraded from the latest 11-stable to the latest 12-stable. After that, I periodically see the network throughput come to a near standstill. This FreeBSD machine is an ESXi VM with two interfaces. It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses the default 1500.
>
> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN ping times), I know the problem has occurred because my lldpd reports:
>
> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>
> And if I turn on ipfw verbose messages, I see tons of:
>
> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>
> This leads me to believe packets are being corrupted on ingress. I've applied all the recent iflib changes, but the problem persists. What causes it, I don't know.
>
> The only thing that changed (and yes, it's a big one) is I upgraded to 12-stable. Meaning, the rest of the network infra and topology has remained the same. This did not happen at all in 11-stable.
>
> I'm open to suggestions.

There are some fixes for vmx that are not present in stable/12 (yet). I did a merge of a number of outstanding revisions. Would you be able to test the patch? I haven't observed any problems with it on a host using igb, but I have no ability to test vmx at the moment.

https://people.freebsd.org/~markj/patches/iflib-stable12.diff