Pete French
2019-Sep-29 20:11 UTC
Running iperf3 as a server drops all connections to a machine
This is odd - I have six FreeBSD boxes on the local ether here. Five are HP Microservers, one is an AMD Ryzen. They are all running 12.1 r352847 built yesterday - and it's one compilation I did which is rsynced out to all the machines, so an identical build.

I was running iperf3 -s on one machine and iperf3 -c on another to measure TCP/IP rates, as I am experimenting with the RACK version of the stack and using CUBIC as a congestion control algorithm. All goes well between the Intel boxes, and if I use the Ryzen box as a client to the Intel boxes. But if I run the iperf3 server on the Ryzen box and use one of the Intel boxes as a client, then after a few seconds all the TCP connections on the Ryzen box are dropped. There is nothing in dmesg, no indication that I can find of anything going wrong, and I can immediately ssh back in.

On the client side it looks like this:

[webadmin@dogbert ~]$ iperf3 -c dilbert
Connecting to host dilbert, port 5201
[  5] local 10.64.50.50 port 41134 connected to 10.64.50.6 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   112 MBytes   938 Mbits/sec    1   78.2 KBytes
[  5]   1.00-2.00   sec   112 MBytes   936 Mbits/sec    2   54.8 KBytes
[  5]   2.00-3.02   sec  14.7 MBytes   122 Mbits/sec    3   1.43 KBytes
[  5]   3.02-4.00   sec  0.00 Bytes  0.00 bits/sec    2   1.43 KBytes
[  5]   4.00-5.02   sec  0.00 Bytes  0.00 bits/sec    1   1.43 KBytes
[  5]   5.02-6.02   sec  0.00 Bytes  0.00 bits/sec    1   1.43 KBytes
[  5]   5.02-6.02   sec  0.00 Bytes  0.00 bits/sec    1   1.43 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-6.02   sec   238 MBytes   332 Mbits/sec   10             sender
[  5]   0.00-6.02   sec  0.00 Bytes  0.00 bits/sec                  receiver
iperf3: error - the server has terminated
[webadmin@dogbert ~]$

...and on the server side, this...

[petefrench@dilbert ~]$ iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.64.50.50, port 41133
[  5] local 10.64.50.6 port 5201 connected to 10.64.50.50 port 41134
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   105 MBytes   883 Mbits/sec
[  5]   1.00-2.00   sec   112 MBytes   936 Mbits/sec
[  5]   2.00-3.01   sec  21.3 MBytes   177 Mbits/sec
Connection to dilbert.ingresso.co.uk closed by remote host.
Connection to dilbert.ingresso.co.uk closed.

This only happens on the Ryzen machine. It happens whether I am using the RACK stack or the default one, and whether I am using CUBIC or the default too. The system is built with CPUTYPE set to 'core2' and some modules skipped, but apart from that nothing special.

It's a bit worrying to me that this can happen, especially in userland.

Any opinions or things people would like me to check ? Indeed, can anyone reproduce this ?

cheers,

-pete.
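When trying to reproduce this repeatedly, it may help to pull the stall point out of a saved client log mechanically rather than by eye. The following is a minimal sketch, assuming the default human-readable iperf3 output format seen in this message; the function name first_stall is hypothetical and not part of iperf3.

```shell
#!/bin/sh
# Sketch only: print the first per-interval row of an iperf3 client log
# whose bitrate is zero. first_stall is a hypothetical helper name.
# Reads the files given as arguments, or standard input if none are given.
first_stall() {
    awk '
        /^- - -/ { exit }    # the summary section follows; stop here
        {
            for (i = 2; i < NF; i++) {
                # An interval field such as 3.02-4.00 followed by "sec".
                if ($i ~ /^[0-9.]+-[0-9.]+$/ && $(i + 1) == "sec") {
                    for (j = i + 2; j <= NF; j++) {
                        # The bitrate value sits just before the bits/sec unit.
                        if ($j ~ /bits\/sec$/ && $(j - 1) + 0 == 0) {
                            print $i
                            found = 1
                            exit
                        }
                    }
                }
            }
        }
        END { exit !found }  # non-zero exit status when no stall was seen
    ' "$@"
}
```

Called as `first_stall client.log` on a log like the client output in this message, it would print the interval 3.02-4.00; the exit status is non-zero when no zero-rate interval is found.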
Eugene Grosbein
2019-Sep-29 20:30 UTC
Running iperf3 as a server drops all connections to a machine
30.09.2019 3:11, Pete French wrote:

> It's a bit worrying to me that this can happen, especially in userland.
>
> Any opinions or things people would like me to check ?

netstat -p tcp -ss
tcpdump -i $interface -npvs0 icmp or 'tcp[tcpflags] & (tcp-rst) != 0'
Pete French
2019-Oct-03 10:49 UTC
Running iperf3 as a server drops all connections to a machine
> > Any opinions or things people would like me to check ?
>
> netstat -p tcp -ss
> tcpdump -i $interface -npvs0 icmp or 'tcp[tcpflags] & (tcp-rst) != 0'

I haven't had a chance to look at this for a couple of days, but I thought I would give it more testing now (doing the above). I updated to the latest -STABLE, as I always do before checking things, and the problem has now gone away. There's nothing I can see in the commits over the last few days which touches this - the only thing I can see is the changes to ixgbe, and mine is igb so it can't be that.

The test I did at the time showed the network rate slowing to zero, but staying up if I didn't use CUBIC - the disconnect only happened using CUBIC, if that's a useful data point.

I will try and find time to go back to the older kernel and see if it still does it. If not then I guess I can say it was a hardware fault, which seems to have fixed itself, but it didn't feel like it at the time.

So, that's the update - I will follow up to this thread if I can get it to happen again, and will make the suggested tests. (I assume you mean to run that inside 'screen' on the server side, yes?)

thanks,

-pete.
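Because the SSH session itself is dropped when the problem occurs, the two suggested commands need to keep running without a controlling terminal. screen is one way to do that; detaching them with nohup and logging to files is another. What follows is a minimal sketch, assuming a POSIX sh on the server. The function names (rst_filter, start_capture, stop_capture), the output directory, and the file names are hypothetical and do not come from the thread.

```shell
#!/bin/sh
# Sketch only: run the diagnostics suggested in the thread so that they
# survive the SSH session being dropped. Function names and file names
# are illustrative assumptions.

# tcpdump filter from the thread: ICMP, or any TCP segment with RST set.
rst_filter() {
    printf '%s\n' "icmp or tcp[tcpflags] & (tcp-rst) != 0"
}

# start_capture IFACE OUTDIR
# Snapshot the TCP counters, then start tcpdump in the background.
start_capture() {
    iface=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    netstat -p tcp -ss > "$outdir/netstat.before" 2>&1
    # -l line-buffers stdout so output reaches the log promptly;
    # nohup plus the redirections detach tcpdump from the terminal.
    nohup tcpdump -l -i "$iface" -npvs0 "$(rst_filter)" \
        > "$outdir/tcpdump.log" 2>&1 < /dev/null &
    echo $! > "$outdir/tcpdump.pid"
}

# stop_capture OUTDIR
# Stop tcpdump and snapshot the TCP counters again for comparison.
stop_capture() {
    outdir=$1
    kill "$(cat "$outdir/tcpdump.pid")" 2>/dev/null
    netstat -p tcp -ss > "$outdir/netstat.after" 2>&1
}
```

A possible session, with igb0 as an assumed interface name: run `start_capture igb0 /var/tmp/iperf-debug` as root, start `iperf3 -s`, reproduce the drop from a client, log back in, run `stop_capture /var/tmp/iperf-debug`, and then compare netstat.before with netstat.after alongside tcpdump.log.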