Santos, Jose Renato G (Jose Renato Santos)
2005-Apr-06 00:37 UTC
RE: [Xen-devel] MPI benchmark performance gap between native linux and domU
Xuehai,

Thanks for posting your new results. It seems that your problem is not the same as the one we encountered.

I believe your problem is due to higher network latency in Xen. Your formula for computing throughput uses the inverse of the round-trip latency (if I understood it correctly), which probably means that your application is sensitive to the round-trip latency. Your latency measurements show a higher value for domU, and this is the reason for the lower throughput. I am not sure, but it is possible that network interrupts or event notifications in the inter-domain channel are being coalesced and causing longer latency. Keir, do event notifications get coalesced in the inter-domain I/O channel for networking?

Renato

>> -----Original Message-----
>> From: xuehai zhang [mailto:hai@cs.uchicago.edu]
>> Sent: Tuesday, April 05, 2005 3:23 PM
>> To: Santos, Jose Renato G (Jose Renato Santos); m+Ian.Pratt@cl.cam.ac.uk
>> Cc: Xen-devel@lists.xensource.com; Aravind Menon; Turner, Yoshio; G John Janakiraman
>> Subject: Re: [Xen-devel] MPI benchmark performance gap between native linux and domU
>>
>> Hi Ian and Jose,
>>
>> Based on your suggestions, I did two more experiments: one (tagged "domU-B" in the tables below)
>> changes the TCP advertised window setting of domU to -2 (the default is 2); the other (tagged
>> "dom0") repeats the experiment in dom0, with only dom0 running. The tables below contain the
>> results from these two new experiments plus the two from my previous email (tagged "native-linux"
>> and "domU-A").
>>
>> I have the following observations from the results:
>>
>> 1. Decreasing the TCP window scaling ("domU-B") does not help performance; it actually slows
>> things down slightly compared with "domU-A".
>>
>> 2. In general, the performance of running the experiments in dom0 ("dom0" column) is very close
>> to, though slightly below, the performance on native Linux ("native-linux" column). In some cases
>> it even outperforms native Linux, for example the throughput at a 64 KB message size and the
>> latencies at message sizes of 1, 2, 4, and 8 bytes.
>>
>> 3. The performance gap between domU and dom0 is large, similar to the gap between domU and
>> native Linux.
>>
>> BTW, each reported data point in the tables below is the average over 10 runs of the same
>> experiment. I forgot to mention that in the experiments using user domains, the 8 domUs form a
>> private network and each domU is assigned a private IP address (for example, 192.168.254.X).
>>
>> Xuehai
>>
>> *********************************
>> *SendRecv Throughput(Mbytes/sec)*
>> *********************************
>>
>> Msg Size(bytes)   native-linux        dom0      domU-A      domU-B
>>               0              0        0.00           0        0.00
>>               1              0        0.01           0        0.00
>>               2              0        0.01           0        0.00
>>               4              0        0.03           0        0.00
>>               8           0.04        0.05        0.01        0.01
>>              16           0.16        0.11        0.01        0.01
>>              32           0.34        0.21        0.02        0.02
>>              64           0.65        0.42        0.04        0.04
>>             128           1.17        0.79        0.09        0.10
>>             256           2.15        1.44        0.59        0.58
>>             512            3.4        2.39        1.23        1.22
>>            1024           5.29        3.79        2.57        2.50
>>            2048           7.68        5.30         3.5        3.44
>>            4096           10.7        8.51        4.96        5.23
>>            8192          13.35       11.06        7.07        6.00
>>           16384           14.9       13.60        3.77        4.62
>>           32768           9.85       11.13        3.68        4.34
>>           65536           5.06        9.06        3.02        3.14
>>          131072           7.91        7.61        4.94        5.04
>>          262144           7.85        7.65        5.25        5.29
>>          524288           7.93        7.77        6.11        5.40
>>         1048576           7.85        7.82         6.5        5.62
>>         2097152           8.18        7.35        5.44        5.32
>>         4194304           7.55        6.88        4.93        4.92
>>
>> *********************************
>> *    SendRecv Latency(usec)     *
>> *********************************
>>
>> Msg Size(bytes)   native-linux        dom0      domU-A      domU-B
>>               0         1979.6     1920.83     3010.96     3246.71
>>               1        1724.16      397.27     3218.88     3219.63
>>               2        1669.65      297.58      3185.3     3298.86
>>               4        1637.26      285.27     3055.67     3222.34
>>               8         406.77      282.78     2966.17     3001.24
>>              16         185.76      283.87     2777.89     2761.90
>>              32         181.06      284.75     2791.06     2798.77
>>              64         189.12      293.93     2940.82     3043.55
>>             128         210.51      310.47      2716.3     2495.83
>>             256         227.36      338.13      843.94      853.86
>>             512         287.28      408.14      796.71      805.51
>>            1024         368.72      515.59      758.19      786.67
>>            2048         508.65      737.12     1144.24     1150.66
>>            4096         730.59      917.97     1612.66     1516.35
>>            8192        1170.22     1411.94     2471.65     2650.17
>>           16384        2096.86     2297.19     8300.18     6857.13
>>           32768        6340.45     5619.56    17017.99    14392.36
>>           65536       24640.78    13787.31     41264.5    39871.19
>>          131072       31709.09    32797.52    50608.97    49533.68
>>          262144       63680.67    65174.67    94918.13    94157.30
>>          524288       125531.7   128116.73   162168.47   189307.05
>>         1048576      251566.94   252257.55   321451.02   361714.44
>>         2097152      477431.32   527432.60      707981   728504.38
>>         4194304      997768.35  1108898.61  1503987.61  1534795.56

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
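A note for readers reproducing the "domU-B" run quoted above: it changes the TCP advertised-window setting in domU from its default of 2 to -2. Assuming this refers to the Linux sysctl net.ipv4.tcp_adv_win_scale (the thread does not name the exact knob, so the path below is an assumption), a minimal sketch of applying the change inside the guest, in Python, would be:

    # Sketch only: assumes the "TCP advertised window" setting discussed above is
    # the net.ipv4.tcp_adv_win_scale sysctl. Must be run as root inside the domU.
    PARAM = "/proc/sys/net/ipv4/tcp_adv_win_scale"

    with open(PARAM, "w") as f:      # apply the "domU-B" value
        f.write("-2\n")

    with open(PARAM) as f:           # read it back to confirm the kernel accepted it
        print(PARAM, "=", f.read().strip())

The same change is more commonly made with sysctl or an /etc/sysctl.conf entry; the point is only that it is a one-line guest-side setting rather than a Xen configuration option.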
xuehai zhang
2005-Apr-06 04:24 UTC
Re: [Xen-devel] MPI benchmark performance gap between native linux and domU
Jose,

Thank you for your help in diagnosing the problem! I agree with you that the problem is due to the network latency. The throughput reported by the SendRecv benchmark is in fact computed directly from the latency, using the following formula (where #_of_messages is 2, message_size is in bytes, and latency is in microseconds):

   throughput = ((#_of_messages * message_size) / 2^20) / (latency / 10^6)

So the performance gap really comes from the higher latency in domU. It is true that PMB's SendRecv benchmark is sensitive to the round-trip latency. I would very much like to hear Keir's comments on the behavior of event notifications in the inter-domain I/O channel for networking.

BTW, as I stated in my previous emails, besides the SendRecv benchmark I also have results for the other 11 PMB benchmarks on both native Linux and domU. The following are the PingPing results (between 2 nodes) from my experiments. As you can see, the performance gap is not as big as for SendRecv, and the numbers are quite close in several test cases. Part of the reason might be that only two nodes are used and that only the one-way latency enters the calculation of the latency and throughput values.

Best,
Xuehai

P.S. Each reported data point in the following tables is the average over 10 runs of the same experiment, as for SendRecv.

PingPing Throughput (MB/sec)

Msg-size(bytes)   #repetitions   native-linux        domU
              0           1000           0.00        0.00
              1           1000           0.01        0.00
              2           1000           0.01        0.01
              4           1000           0.02        0.01
              8           1000           0.04        0.02
             16           1000           0.09        0.04
             32           1000           0.17        0.09
             64           1000           0.33        0.17
            128           1000           0.65        0.33
            256           1000           1.19        0.62
            512           1000           1.95        1.06
           1024           1000           2.80        1.73
           2048           1000           3.74        2.52
           4096           1000           5.38        3.77
           8192           1000           6.49        4.79
          16384           1000           7.45        4.97
          32768           1000           6.74        5.27
          65536            640           5.89        3.07
         131072            320           5.27        3.11
         262144            160           5.09        3.88
         524288             80           5.00        4.84
        1048576             40           4.95        4.91
        2097152             20           4.94        4.89
        4194304             10           4.93        4.92

PingPing Latency/Startup (usec)

Msg-size(bytes)   #repetitions   native-linux        domU
              0           1000         172.78      342.89
              1           1000         176.12      346.23
              2           1000         173.48      344.20
              4           1000         177.05      346.15
              8           1000         177.54      343.56
             16           1000         178.71      346.47
             32           1000         176.71      351.25
             64           1000         183.83      359.41
            128           1000         188.09      371.94
            256           1000         204.64      393.79
            512           1000         250.63      462.45
           1024           1000         349.20      565.03
           2048           1000         521.56      773.63
           4096           1000         726.62     1036.23
           8192           1000        1204.54     1630.43
          16384           1000        2097.42     3143.95
          32768           1000        4633.77     5930.04
          65536            640       10604.54    20335.55
         131072            320       23717.61    40174.68
         262144            160       49146.14    64505.20
         524288             80       99962.09   103390.30
        1048576             40      202000.30   203478.00
        2097152             20      404857.10   408950.55
        4194304             10      812047.60   813135.50

Santos, Jose Renato G (Jose Renato Santos) wrote:
> Xuehai,
>
> Thanks for posting your new results. It seems that your problem is not
> the same as the one we encountered.
>
> I believe your problem is due to higher network latency in Xen. Your
> formula for computing throughput uses the inverse of the round-trip
> latency (if I understood it correctly), which probably means that your
> application is sensitive to the round-trip latency. Your latency
> measurements show a higher value for domU, and this is the reason for
> the lower throughput. I am not sure, but it is possible that network
> interrupts or event notifications in the inter-domain channel are being
> coalesced and causing longer latency. Keir, do event notifications get
> coalesced in the inter-domain I/O channel for networking?
>
> Renato
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
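As a quick sanity check on the formula quoted above (an editorial sketch; the helper name is illustrative), the reported throughputs can be recomputed from the latency tables in this thread, which confirms Renato's point that throughput here is essentially the inverse of the measured latency scaled by the message size:

    # Recompute PMB throughput (MB/s) from the reported latency (usec) with
    # throughput = ((n_msgs * msg_size) / 2**20) / (latency / 10**6).
    def throughput_mb_per_s(msg_size_bytes, latency_usec, n_msgs):
        return (n_msgs * msg_size_bytes / 2**20) / (latency_usec / 10**6)

    # SendRecv exchanges 2 messages per iteration (1024-byte rows from the tables above):
    print(throughput_mb_per_s(1024, 368.72, n_msgs=2))   # ~5.30; table reports 5.29 (native-linux)
    print(throughput_mb_per_s(1024, 758.19, n_msgs=2))   # ~2.58; table reports 2.57 (domU-A)

    # PingPing is one-way, i.e. 1 message per iteration (1024-byte row):
    print(throughput_mb_per_s(1024, 349.20, n_msgs=1))   # ~2.80; table reports 2.80 (native-linux)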