Jose G. Juanino
2019-Nov-05 21:44 UTC
High Send-Q values in netstat after upgrading to 12.1-RELEASE: system is unusable
Hi all,

today I upgraded from 12.0-RELEASE-p10 to 12.1-RELEASE via freebsd-update. The host is a VMware guest with a vmx network interface. Under 12.0-RELEASE the system ran perfectly and reliably for months, with no issues.

I have boot environments enabled (thank god), so I am able to switch between the 12.0 and 12.1 versions.

Right after booting the new 12.1 release, I noticed that Apache no longer served pages at the previous speed; quite the opposite: any web operation from clients hangs.

After a quick inspection on the FreeBSD 12.1 server, I see that the problem shows up in the Send-Q column of netstat. While in 12.0 Send-Q is zero or close to zero on almost every socket, in 12.1 I get the following (snipped output):

    $ netstat -x -4
    Active Internet connections
    Proto Recv-Q Send-Q Local Address      Foreign Address    R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX   rexmt persist    keep    2msl  delack rcvtime
    tcp4       0  31848 XXXXXXXXXXXXX.http YYYYYYYYYYYY.57220      0     12      0     12  66350  33175      1   2048      0  48128 530800 265400   62.90    0.00 7069.49    0.00    0.00    1.09
    tcp4       0     36 XXXXXXXXXXXXX.ssh  YYYYYYYYYYYY.57189      0      1      0      0  66350  33175      1   2048      0    256 530800 265400    0.28    0.00 6836.19    0.00    0.00    0.00

As you can see, Send-Q is huge. Beyond this finding, though, I have to admit that I am stuck. I have no idea whether these high Send-Q values are related to the vmx driver, to the upgrade itself, or to something in the hypervisor.

Please help me troubleshoot this further. As I said before, my boot environments let me boot cleanly into either version.
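For readers who want to quantify "huge" on their own host, the following is a minimal Python sketch that compares each socket's Send-Q with its S-HIWA column (assumed here to be the send-buffer high-water mark). The sample rows are copied from the output above; the function names and the 0.9 threshold are hypothetical choices for illustration, not part of any FreeBSD tool.

```python
# Rows copied from the `netstat -x -4` output in the post.
SAMPLE = """\
Active Internet connections
tcp4 0 31848 XXXXXXXXXXXXX.http YYYYYYYYYYYY.57220 0 12 0 12 66350 33175 1 2048 0 48128 530800 265400 62.90 0.00 7069.49 0.00 0.00 1.09
tcp4 0 36 XXXXXXXXXXXXX.ssh YYYYYYYYYYYY.57189 0 1 0 0 66350 33175 1 2048 0 256 530800 265400 0.28 0.00 6836.19 0.00 0.00 0.00
"""

# 0-based field positions in a data row, following the header in the post.
SEND_Q, LOCAL, S_HIWA = 2, 3, 10
EXPECTED_FIELDS = 23


def send_queue_fill(text):
    """Return (local address, Send-Q, S-HIWA, fill ratio) for each TCP row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the banner, the header, and anything that is not a full TCP row.
        if len(fields) != EXPECTED_FIELDS or not fields[0].startswith("tcp"):
            continue
        send_q = int(fields[SEND_Q])
        hiwat = int(fields[S_HIWA])
        ratio = send_q / hiwat if hiwat else 0.0
        rows.append((fields[LOCAL], send_q, hiwat, ratio))
    return rows


def nearly_full(text, threshold=0.9):
    """Local addresses whose Send-Q is at or above `threshold` of S-HIWA."""
    return [addr for addr, _, _, ratio in send_queue_fill(text) if ratio >= threshold]


for addr, send_q, hiwat, ratio in send_queue_fill(SAMPLE):
    print(f"{addr}: Send-Q {send_q} of {hiwat} ({ratio:.0%})")
```

With the rows above, the http socket has 31848 of 33175 bytes queued, about 96% of the high-water mark, while the ssh socket is essentially empty. That is consistent with data piling up in the send buffer rather than leaving the host.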
Another possibly relevant fact is the difference in the netstat -m and ifconfig outputs.

In 12.0, netstat -m outputs:

    268/1517/1785 mbufs in use (current/cache/total)
    260/766/1026/250998 mbuf clusters in use (current/cache/total/max)
    260/499 mbuf+clusters out of packet secondary zone in use (current/cache)
    0/2/2/125499 4k (page size) jumbo clusters in use (current/cache/total/max)
    0/0/0/37184 9k jumbo clusters in use (current/cache/total/max)
    0/0/0/20916 16k jumbo clusters in use (current/cache/total/max)
    587K/1919K/2506K bytes allocated to network (current/cache/total)

while in 12.1 it outputs:

    572/1213/1785 mbufs in use (current/cache/total)
    515/515/1030/250998 mbuf clusters in use (current/cache/total/max)
    4/502 mbuf+clusters out of packet secondary zone in use (current/cache)
    24/2/26/125499 4k (page size) jumbo clusters in use (current/cache/total/max)
    0/0/0/37184 9k jumbo clusters in use (current/cache/total/max)
    0/0/0/20916 16k jumbo clusters in use (current/cache/total/max)
    1269K/1341K/2610K bytes allocated to network (current/cache/total)

In 12.0, ifconfig vmx0 outputs (IP and MAC addresses obfuscated):

    vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
            ether TTTTTTTTTTTTTTTTT
            inet DD.DD.DD.DD netmask 0xffffff00 broadcast BB.BB.BB.BB
            media: Ethernet autoselect
            status: active
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

while in 12.1 it outputs (note that JUMBO_MTU now appears among the options):

    vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
            ether TTTTTTTTTTTTTTTTT
            inet DDDDDDDDDDDD netmask 0xffffff00 broadcast DDDDDDDDDDDDD
            media: Ethernet autoselect
            status: active
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

Thanks in advance, regards
--
Jose G. Juanino
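The two options= lines are long enough that differences are easy to miss by eye. As a rough sketch, the capability names can be compared as sets in Python; the two strings are copied from the ifconfig output above, and the helper name `capabilities` is a hypothetical one chosen for this example.

```python
import re

# options= lines copied from the ifconfig vmx0 output in the post.
OPTS_12_0 = ("options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,"
             "TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>")
OPTS_12_1 = ("options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,"
             "VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>")


def capabilities(line):
    """Extract the capability names listed between < and > on an options= line."""
    match = re.search(r"options=[0-9a-f]+<([^>]*)>", line)
    return set(match.group(1).split(",")) if match else set()


added = capabilities(OPTS_12_1) - capabilities(OPTS_12_0)
removed = capabilities(OPTS_12_0) - capabilities(OPTS_12_1)

print("added in 12.1:  ", sorted(added))
print("removed in 12.1:", sorted(removed))
```

For the output quoted above, this reports JUMBO_MTU and VLAN_HWTSO as newly listed in 12.1 and nothing as removed. TSO4 and TSO6 are advertised in both releases.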
Jose G. Juanino
2019-Nov-05 23:37 UTC
High Send-Q values in netstat after upgrading to 12.1-RELEASE: system is unusable
On Tuesday, November 05 at 22:44:10 CET, Jose G. Juanino wrote:
> Hi all,
>
> today I upgraded from 12.0-RELEASE-p10 to 12.1-RELEASE via
> freebsd-update. The host is a VMware guest with a vmx network interface.
> Under 12.0-RELEASE the system ran perfectly and reliably for months,
> with no issues.
>
> I have boot environments enabled (thank god), so I am able to
> switch between the 12.0 and 12.1 versions.
>
> Right after booting the new 12.1 release, I noticed that Apache no longer
> served pages at the previous speed; quite the opposite: any web
> operation from clients hangs.
>
> After a quick inspection on the FreeBSD 12.1 server, I see that the
> problem shows up in the Send-Q column of netstat. While in 12.0 Send-Q is
> zero or close to zero on almost every socket, in 12.1 I get the
> following (snipped output):
>
> [ .... ]
>

Hi again,

after some further research, I have noticed that in 12.0 the counter dev.vmx.0.txq0.hstats.tso_packets keeps increasing (especially during large transfers), whereas in 12.1 it stays at zero and never increases.

As a workaround, I have disabled TCP segmentation offload (TSO) in 12.1 by setting the sysctl net.inet.tcp.tso=0, and the host again performs almost as well as it did under 12.0.

I hope this helps others suffering from the same issue.

Regards
--
Jose G. Juanino
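To check whether a host shows the same symptom, the counter can be sampled twice and the two readings compared. The following Python sketch does that by calling `sysctl -n`; the OID is the one named above, while `read_sysctl`, `counter_delta`, and the five-second interval are hypothetical names and values chosen for illustration. The reader and sleep functions are parameters only so that the logic can be exercised away from the affected machine.

```python
import subprocess
import time


def read_sysctl(name):
    """Read one integer sysctl value via `sysctl -n NAME`."""
    result = subprocess.run(
        ["sysctl", "-n", name],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def counter_delta(name, interval=5.0, reader=read_sysctl, sleep=time.sleep):
    """Return how much the counter `name` grew over `interval` seconds."""
    first = reader(name)
    sleep(interval)
    second = reader(name)
    return second - first


# On the affected host, run during a large transfer:
#   print(counter_delta("dev.vmx.0.txq0.hstats.tso_packets"))
```

Going by the observation above, a delta that stays at zero during a large outbound transfer while TSO is enabled would match the 12.1 behaviour, and a growing value would match 12.0.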