Hello,

I'm trying the 0.9.11-pre12 version of the GPLPV drivers. They seem to work
nicely, but are annoyingly slow on an SMP Windows 2003 virtual machine:

[root@xen02 ~]# ttcp -t -s tsm-neu.bl-group.physik.uni-muenchen.de
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001
ttcp-t: sockbufsize=1048576, # tcp -> tsm-neu.bl-group.physik.uni-muenchen.de #
ttcp-t: connect
ttcp-t: 16777216 bytes in 0.368 real seconds = 44494.293 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.184, calls/sec = 5561.787
ttcp-t: 0.000user 0.120sys 0:00real 33% 0i+0d 0maxrss 0+3pf 25+25csw
[root@xen02 ~]# ttcp -r -s
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001
ttcp-r: sockbufsize=1048576, # tcp #
ttcp-r: accept from 192.168.180.199
ttcp-r: 819200 bytes in 15.175 real seconds = 52.719 KB/sec +++
ttcp-r: 240 I/O calls, msec/call = 64.745, calls/sec = 15.816
ttcp-r: 0.000user 0.004sys 0:15real 0% 0i+0d 0maxrss 0+2pf 296+0csw
[root@xen02 ~]#

It looks like it waits too long for packets to be sent.

Sincerely,
Klaus
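[For reference, the test above uses the classic ttcp tool; a minimal sketch
of the two directions as run from the dom0 side, where the hostname is a
placeholder for the domU's address and -s tells ttcp to generate and discard
its own data instead of using stdin/stdout:

  # dom0 -> domU: start a receiver (ttcp -r -s) on the domU, then transmit:
  ttcp -t -s windows-domu.example.org
  # domU -> dom0: receive on dom0, then transmit from the domU:
  ttcp -r -s
]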
> Hello,
>
> I'm trying the 0.9.11-pre12 version of the GPLPV drivers. They seem to
> work nicely, but are annoyingly slow on an SMP Windows 2003 virtual
> machine:
>
> [root@xen02 ~]# ttcp -t -s tsm-neu.bl-group.physik.uni-muenchen.de
> [...]
> ttcp-t: 16777216 bytes in 0.368 real seconds = 44494.293 KB/sec +++
> [root@xen02 ~]# ttcp -r -s
> [...]
> ttcp-r: 819200 bytes in 15.175 real seconds = 52.719 KB/sec +++
> [...]
>
> It looks like it waits too long for packets to be sent.

In testing from Dom0 to DomU with vcpus=4 I get marginally worse
performance than with vcpus=1. Can you tell me as much as you can about
the above setup so I can attempt to reproduce it?

James
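[As background, the vcpus count James mentions is set in the domain's
configuration; a minimal sketch, assuming a typical xm-style HVM config
file where the name, memory, and bridge values are illustrative
placeholders:

  # /etc/xen/win2003 (excerpt, illustrative values only)
  name   = "win2003"
  memory = 4096
  vcpus  = 4                      # virtual CPUs exposed to the guest
  vif    = [ 'bridge=xenbr0' ]    # backed by vifX.0 in dom0
]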
> > > It looks like it waits too long for packets to be sent.
> >
> > In testing from Dom0 to DomU with vcpus=4 I get marginally worse
> > performance than with vcpus=1. Can you tell me as much as you can
> > about the above setup so I can attempt to reproduce it?

Actually... I wish to change my statement :)

When I use iperf with -P4, I get the following for the 4 connections:

[ ID]  Interval       Transfer     Bandwidth
[1800]  0.0-10.0 sec  1.19 GBytes  1.02 Gbits/sec
[1844]  0.0-10.0 sec  1.13 GBytes   967 Mbits/sec
[1748]  0.0-14.0 sec  13.5 MBytes  8.05 Mbits/sec
[1836]  0.0-15.9 sec   391 MBytes   207 Mbits/sec
[SUM]   0.0-15.9 sec  2.71 GBytes  1.47 Gbits/sec

The test is only supposed to run for 10 seconds, so the fact that it takes
an additional 4-6 seconds for two of the connections to finish up means
that maybe things aren't as they should be. Sometimes iperf runs all the
connections at full speed (giving me around 2.4 Gbit/s), but sometimes
some of them stall and I get much worse performance.

Something to investigate, I suppose.

James
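[For anyone reproducing this, the parallel test corresponds to something
like the following sketch, where the hostname is a placeholder; -P sets
the number of parallel connections and -t the nominal test duration in
seconds, 10 being the default:

  # on the receiving end:
  iperf -s
  # on the sending end, four parallel streams:
  iperf -c domu.example.org -P 4 -t 10
]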
Hi James,

>>> It looks like it waits too long for packets to be sent.
>>
>> In testing from Dom0 to DomU with vcpus=4 I get marginally worse
>> performance than with vcpus=1. Can you tell me as much as you can
>> about the above setup so I can attempt to reproduce it?

The domU is a Windows 2003 x86_64 Server with all updates. I tested it
with both vcpu=1 and vcpu=2; the domU has 4 GByte of memory.

The dom0 runs on a Dell PowerEdge 6800 with four dual-core 2.6 GHz Xeons
and 16 GByte of memory, connected to a SAN.

Here are my tests. From dom0 to domU with a dual test:

[root@xen02 ~]# iperf -c tsm-neu.bl-group.physik.uni-muenchen.de -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to tsm-neu.bl-group.physik.uni-muenchen.de, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  5] local 192.168.160.91 port 48790 connected with 192.168.180.199 port 5001
[  4] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1120
[  5]  0.0-10.0 sec   339 MBytes   284 Mbits/sec
[  4]  0.0-10.0 sec   344 KBytes   281 Kbits/sec

As you can see, the direction to the domU is fast, but the other way is a
factor of 1000 slower.

Now the client on the domU:

[root@xen02 ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  4] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1121
[  4]  0.0-10.6 sec   264 KBytes   204 Kbits/sec
[root@xen02 ~]#

Now with -d on the client:

[root@xen02 ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  4] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1122
------------------------------------------------------------
Client connecting to 192.168.180.199, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  6] local 192.168.160.91 port 36322 connected with 192.168.180.199 port 5001
[  6]  0.0-10.0 sec   340 MBytes   285 Mbits/sec
[  4]  0.0-10.3 sec   632 KBytes   503 Kbits/sec
[root@xen02 ~]#

Sincerely,
Klaus
> [  5]  0.0-10.0 sec   339 MBytes   284 Mbits/sec
> [  4]  0.0-10.0 sec   344 KBytes   281 Kbits/sec
>
> As you can see, the direction to the domU is fast, but the other way is
> a factor of 1000 slower.

Yeah. That certainly is sucky performance :|

Is that the same with and without SMP? (I've just gotten back from a
holiday and I have a cold, so forgive me if you've already answered that
question.)

Can you turn off Large Send Offload in the Advanced properties of the
network adapter in Device Manager? If that makes a big difference, can you
do a test in the direction of most suckiness, and capture port 5001
traffic on the vifX.0 interface in Dom0 and on the xennet network
interface in the DomU? Then send them both to me, or give me a link to
somewhere I can download them from. 344 KBytes of packets isn't much
(although I'd expect it to be a bit more without -d), but you can reduce
it again if you use the -t option to cut the test time to a few seconds,
and then gzip it (iperf data gzips beautifully).

Thanks

James
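[On the Dom0 side, the capture James asks for might look like the sketch
below, assuming tcpdump is available and with vifX.0 replaced by the
actual backend interface of the domU; on the Windows side a capture tool
such as Wireshark would be needed instead:

  # capture full packets on the domU's backend interface in dom0:
  tcpdump -i vifX.0 -s 0 -w domu-vif.pcap tcp port 5001
  # in another shell, run a shortened test against the domU:
  iperf -c 192.168.180.199 -t 3
  # stop tcpdump with Ctrl-C, then compress the capture for mailing:
  gzip domu-vif.pcap
]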
Quoting "James Harper" <james.harper@bendigoit.com.au>:>> [ 5] 0.0-10.0 sec 339 MBytes 284 Mbits/sec >> [ 4] 0.0-10.0 sec 344 KBytes 281 Kbits/sec >> >> As you can see the direction to the domU is fast, but the other way is a >> factor of 1000 slower. > > Yeah. That certainly is sucky performance :| > > Is that the same with and without SMP (I''ve just gotten back from a holiday > and I have a cold so forgive me if you''ve already answered that question)I had now the time to test some things. All test now with pre13. First a non SMP guest on another host. That''s fairly fast: [root@xen03 ~]# iperf -c mws-neu.bl-group.physik.uni-muenchen.de ------------------------------------------------------------ Client connecting to mws-neu.bl-group.physik.uni-muenchen.de, TCP port 5001 TCP window size: 1.00 MByte (default) ------------------------------------------------------------ [ 3] local 192.168.160.92 port 40278 connected with 192.168.180.149 port 5001 [ 3] 0.0-10.0 sec 254 MBytes 213 Mbits/sec [root@xen03 ~]# virt-manager & [1] 23662 [root@xen03 ~]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 1.00 MByte (default) ------------------------------------------------------------ [ 4] local 192.168.160.92 port 5001 connected with 192.168.180.149 port 1065 [ 4] 0.0-10.2 sec 285 MBytes 235 Mbits/sec [root@xen03 ~]# iperf -c mws-neu.bl-group.physik.uni-muenchen.de ------------------------------------------------------------ Client connecting to mws-neu.bl-group.physik.uni-muenchen.de, TCP port 5001 TCP window size: 1.00 MByte (default) ------------------------------------------------------------ [ 3] local 192.168.160.92 port 36917 connected with 192.168.180.149 port 5001 [ 3] 0.0-10.0 sec 641 MBytes 538 Mbits/sec [root@xen03 ~]# [root@xen03 ~]# iperf -c mws-neu.bl-group.physik.uni-muenchen.de -P 4 write2 failed: Connection reset by peer ------------------------------------------------------------ Client connecting to mws-neu.bl-group.physik.uni-muenchen.de, TCP port 5001 TCP window size: 1.00 MByte (default) ------------------------------------------------------------ [ 6] local 192.168.160.92 port 60092 connected with 192.168.180.149 port 5001 [ 6] 0.0- 0.0 sec 1.00 MBytes 1.91 Gbits/sec [ 3] local 192.168.160.92 port 60089 connected with 192.168.180.149 port 5001 [ 4] local 192.168.160.92 port 60090 connected with 192.168.180.149 port 5001 [ 5] local 192.168.160.92 port 60091 connected with 192.168.180.149 port 5001 [ 4] 0.0-10.0 sec 310 MBytes 260 Mbits/sec [ 3] 0.0-10.0 sec 237 MBytes 199 Mbits/sec [ 5] 0.0-10.1 sec 241 MBytes 201 Mbits/sec [SUM] 0.0-10.1 sec 790 MBytes 659 Mbits/sec [root@xen03 ~]# [root@xen03 ~]# iperf -s -P 4 ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 1.00 MByte (default) ------------------------------------------------------------ [ 4] local 192.168.160.92 port 5001 connected with 192.168.180.149 port 1071 [ 5] local 192.168.160.92 port 5001 connected with 192.168.180.149 port 1072 [ 6] local 192.168.160.92 port 5001 connected with 192.168.180.149 port 1073 [ 7] local 192.168.160.92 port 5001 connected with 192.168.180.149 port 1074 [ 4] 0.0-10.0 sec 178 MBytes 149 Mbits/sec [ 5] 0.0-10.0 sec 183 MBytes 153 Mbits/sec [ 7] 0.0-10.0 sec 173 MBytes 145 Mbits/sec [ 6] 0.0-10.0 sec 173 MBytes 145 Mbits/sec [SUM] 0.0-10.0 sec 707 MBytes 592 Mbits/sec [root@xen03 ~]# Ok, linux PVM Guest are much faster ( between 1 - 2 GBit /s), and that even 
with the copying receive path.

> Can you turn off Large Send Offload in the Advanced properties of the
> network adapter in Device Manager?

Yep, that's it. The first dom0 on which I tested the SMP machine has TCP
segmentation offload disabled (probably by the BIOS):

[root@xen02 ~]# ethtool -k peth0
Offload parameters for peth0:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
[root@xen02 ~]#

When I start the SMP guest on another machine with TSO enabled, I get fast
transfers (around 200-250 Mbit/s). I then switched off Large Send Offload
in the Windows driver and started the domU again on the original host (the
other one has faster CPUs, so the numbers are not comparable), and voila,
I get fast transfers there as well:

[root@xen02 ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  4] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1077
[  4]  0.0-10.0 sec   198 MBytes   166 Mbits/sec
[  5] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1078
[  5]  0.0-10.0 sec   195 MBytes   163 Mbits/sec
[root@xen02 ~]# iperf -s -P 4
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  4] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1079
[  5] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1080
[  6] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1081
[  7] local 192.168.160.91 port 5001 connected with 192.168.180.199 port 1082
[  5]  0.0-10.0 sec   111 MBytes  92.7 Mbits/sec
[  6]  0.0-10.0 sec   110 MBytes  91.8 Mbits/sec
[  7]  0.0-10.0 sec   109 MBytes  90.9 Mbits/sec
[  4]  0.0-10.0 sec   109 MBytes  91.4 Mbits/sec
[SUM]  0.0-10.0 sec   438 MBytes   366 Mbits/sec
[root@xen02 ~]#

So, as a conclusion: it is very important to match the guest's Large Send
Offload setting to the TSO setting of the dom0.

Sincerely,
Klaus

--
Klaus Steinberger           Beschleunigerlaboratorium
Phone: (+49 89)289 14287    Am Coulombwall 6, D-85748 Garching, Germany
FAX:   (+49 89)289 14280    EMail: Klaus.Steinberger@Physik.Uni-Muenchen.DE
URL: http://www.physik.uni-muenchen.de/~Klaus.Steinberger/
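[In practice, the dom0 side of that matching can be checked and adjusted
with ethtool; peth0 is as in the output above, and the physical interface
name will differ on other setups:

  # check the current offload settings:
  ethtool -k peth0
  # enable (or disable) TCP segmentation offload so it matches the
  # guest's Large Send Offload setting:
  ethtool -K peth0 tso on    # or: ethtool -K peth0 tso off
]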