> On Jul 27, 2020, at 15:41, Joe Clarke <jclarke at marcuscom.com> wrote:
>
>
>
>> On Jul 27, 2020, at 15:01, Mark Johnston <markj at freebsd.org> wrote:
>>
>> On Sun, Jul 26, 2020 at 06:16:07PM -0400, Joe Clarke wrote:
>>> About two weeks ago, I upgraded from the latest 11-stable to the
>>> latest 12-stable. After that, I periodically see the network
>>> throughput come to a near standstill. This FreeBSD machine is an
>>> ESXi VM with two interfaces. It acts as a router. It uses vmxnet3
>>> interfaces for both LAN and WAN. It runs ipfw with in-kernel NAT.
>>> The LAN side uses a bridge with vmx0 and a tap0 L2 VPN interface.
>>> My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses the
>>> default 1500.
>>>
>>> Besides seeing massive packet loss and huge latency (~ 200 ms for
>>> on-LAN ping times), I know the problem has occurred because my lldpd
>>> reports:
>>>
>>> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>>>
>>> And if I turn on ipfw verbose messages, I see tons of:
>>>
>>> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>>>
>>> This leads me to believe packets are being corrupted on ingress.
>>> I've applied all the recent iflib changes, but the problem persists.
>>> What causes it, I don't know.
>>>
>>> The only thing that changed (and yes, it's a big one) is I upgraded
>>> to 12-stable. Meaning, the rest of the network infra and topology has
>>> remained the same. This did not happen at all in 11-stable.
>>>
>>> I'm open to suggestions.
>>
>> There are some fixes for vmx not present in stable/12 (yet). I did a
>> merge of a number of outstanding revisions. Would you be able to test
>> the patch? I haven't observed any problems with it on a host using
>> igb, but I have no ability to test vmx at the moment.
>
> I'm down to test anything. I did notice quite a few vmxnet3 changes
> around performance that appealed to me. I tried a few of them on my
> last kernel. That took much longer to exhibit the problem, but
> eventually did.
>
> I can tell you I don't have all of these patches in, though. I'll
> build with this diff and start running it now. I'll let you know how
> it goes.
So it's been just over a week of runtime with this full patch set. I have seen
no further issues with ingress packet "truncation", and performance has been
what I expect. I'm going to keep running, but I think this seems like a good
set to MFC.
Thanks again for your help.
Joe
---
PGP Key : http://www.marcuscom.com/pgp.asc