xuehai zhang
2005-Apr-04 22:43 UTC
[Xen-devel] MPI benchmark performance gap between native linux and domU
Hi all,

I did the following experiments to explore MPI application execution performance both on native Linux machines and inside unprivileged Xen user domains. I use 8 machines with identical hardware configurations (498.756 MHz dual CPU, 512MB memory, on a 10MB/sec LAN) and the Pallas MPI Benchmarks (PMB).

Experiment 1: I boot all 8 nodes with native Linux (nosmp, kernel 2.4.29) and use all of them for the PMB tests.

Experiment 2: I boot all 8 nodes with Xen running and start a single user domain (2.6.10 port, using a file-backed VBD) on each node with 360MB memory. Then I run the same PMB tests among these 8 user domains.

The experiment results show that running the same MPI benchmark in user domains usually gives worse (sometimes much worse) performance than on native Linux machines. The results for the PMB SendRecv benchmark from both experiments are below (Table 1 and Table 2 report throughput and latency respectively). As you may notice, SendRecv achieves 14.9 MB/sec throughput on native Linux machines but reaches a maximum of only 7.07 MB/sec when running inside user domains. The latency results also show a big gap.

Clearly, there is a difference between the memory available to the native Linux machines in Experiment 1 (512MB) and to the user domains in Experiment 2 (360MB; it cannot go higher because dom0 was started with 128MB memory). However, I don't think that is the main cause of the performance gap, because the tested message sizes are much smaller than both memory sizes.

I would appreciate your help if you have had a similar experience and are willing to share your insights. BTW, if you are not familiar with the PMB SendRecv benchmark, you can find a detailed explanation at http://people.cs.uchicago.edu/~hai/PMB-MPI1.pdf (see section 4.3.1).

Thanks in advance for your help.

Xuehai

P.S.

Table 1: SendRecv throughput (MB/sec) performance

Message_Size(bytes)  Experiment_1  Experiment_2
0                    0             0
1                    0             0
2                    0             0
4                    0             0
8                    0.04          0.01
16                   0.16          0.01
32                   0.34          0.02
64                   0.65          0.04
128                  1.17          0.09
256                  2.15          0.59
512                  3.4           1.23
1K                   5.29          2.57
2K                   7.68          3.5
4K                   10.7          4.96
8K                   13.35         7.07
16K                  14.9          3.77
32K                  9.85          3.68
64K                  5.06          3.02
128K                 7.91          4.94
256K                 7.85          5.25
512K                 7.93          6.11
1M                   7.85          6.5
2M                   8.18          5.44
4M                   7.55          4.93

Table 2: SendRecv latency (microsec) performance

Message_Size(bytes)  Experiment_1  Experiment_2
0                    1979.6        3010.96
1                    1724.16       3218.88
2                    1669.65       3185.3
4                    1637.26       3055.67
8                    406.77        2966.17
16                   185.76        2777.89
32                   181.06        2791.06
64                   189.12        2940.82
128                  210.51        2716.3
256                  227.36        843.94
512                  287.28        796.71
1K                   368.72        758.19
2K                   508.65        1144.24
4K                   730.59        1612.66
8K                   1170.22       2471.65
16K                  2096.86       8300.18
32K                  6340.45       17017.99
64K                  24640.78      41264.5
128K                 31709.09      50608.97
256K                 63680.67      94918.13
512K                 125531.7      162168.47
1M                   251566.94     321451.02
2M                   477431.32     707981
4M                   997768.35     1503987.61
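For readers not familiar with the SendRecv pattern being measured, the sketch below shows the kind of loop PMB times. This is an illustration under my own assumptions, not the PMB source: the message size, iteration count, and output format are arbitrary. Each rank exchanges a message with its neighbours in a periodic chain via MPI_Sendrecv, so roughly 2 * message_size bytes pass through every node per iteration. The tables above are consistent with that definition, e.g. 2 * 16KB / 2096.86 usec is about 14.9 MB/sec, which is also why the latency column reads naturally in microseconds.

/*
 * Minimal sketch (not the PMB source) of the pattern the SendRecv
 * benchmark times: every rank sends to its right neighbour while
 * receiving from its left neighbour in a periodic chain, so each
 * iteration moves 2 * MSG_SIZE bytes through every node.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_SIZE (16 * 1024)   /* 16K, the peak point in Table 1 */
#define ITERS    1000

int main(int argc, char **argv)
{
    int rank, size, right, left, i;
    double t0, t;
    char *sbuf, *rbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    sbuf = malloc(MSG_SIZE);
    rbuf = malloc(MSG_SIZE);
    memset(sbuf, 1, MSG_SIZE);

    right = (rank + 1) % size;          /* neighbours in a ring */
    left  = (rank - 1 + size) % size;

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < ITERS; i++) {
        MPI_Sendrecv(sbuf, MSG_SIZE, MPI_BYTE, right, 0,
                     rbuf, MSG_SIZE, MPI_BYTE, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    t = (MPI_Wtime() - t0) / ITERS;     /* seconds per Sendrecv */

    if (rank == 0)
        printf("latency %.2f usec, throughput %.2f MB/sec\n",
               t * 1e6, (2.0 * MSG_SIZE / t) / (1024.0 * 1024.0));

    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched across the 8 nodes (for example with mpirun and a machine file), a loop like this produces one latency/throughput pair comparable to a single row of the tables above.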
Nivedita Singhvi
2005-Apr-04 23:37 UTC
Re: [Xen-devel] MPI benchmark performance gap between native linux and domU
xuehai zhang wrote:
> Experiment 1: I boot all 8 nodes with native linux (nosmp, kernel
> 2.4.29) and use all

> Experiment 2: I boot all 8 nodes with Xen running and start a single
> user domain (port 2.6.10, using file-backed VBD) on each node with
> 360MB memory. Then

What do you get when you compare 2.4.29 native Linux against 2.6.10 native Linux, without Xen involved at all?

thanks,
Nivedita
xuehai zhang
2005-Apr-05 04:49 UTC
Re: [Xen-devel] MPI benchmark performance gap between native linux and domU
Nivedita Singhvi wrote:
> xuehai zhang wrote:
>
>> Experiment 1: I boot all 8 nodes with native linux (nosmp, kernel
>> 2.4.29) and use all
>
>> Experiment 2: I boot all 8 nodes with Xen running and start a single
>> user domain (port 2.6.10, using file-backed VBD) on each node with
>> 360MB memory. Then
>
> What do you get when you compare 2.4.29 native Linux
> against 2.6.10 native Linux, without Xen involved at
> all?

2.6.10 is not running as native Linux here; it is the domU kernel (Xen is running on the machine).

Xuehai