Hi Keir/Jan,

Recently I got a chance to access a big machine (2T mem/160 cpus) and I tested
your patch: http://xenbits.xen.org/hg/xen-unstable.hg/rev/177fdda0be56

Attached is the result.

Test environment old:

# xm info
host                   : ovs-3f-9e-04
release                : 2.6.39-300.17.1.el5uek
version                : #1 SMP Fri Oct 19 11:30:08 PDT 2012
machine                : x86_64
nr_cpus                : 160
nr_nodes               : 8
cores_per_socket       : 10
threads_per_core       : 2
cpu_mhz                : 2394
hw_caps                : bfebfbff:2c100800:00000000:00003f40:02bee3ff:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 2097142
free_memory            : 2040108
free_cpus              : 0
xen_major              : 4
xen_minor              : 1
xen_extra              : .3OVM
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : dom0_mem=31390M no-bootscrub
cc_compiler            : gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)
cc_compile_by          : mockbuild
cc_compile_domain      : us.oracle.com
cc_compile_date        : Fri Oct 19 21:34:08 PDT 2012
xend_config_format     : 4

# uname -a
Linux ovs-3f-9e-04 2.6.39-300.17.1.el5uek #1 SMP Fri Oct 19 11:30:08 PDT 2012 x86_64 x86_64 x86_64 GNU/Linux

# cat /boot/grub/grub.conf
...
kernel /xen.gz dom0_mem=31390M no-bootscrub dom0_vcpus_pin dom0_max_vcpus=32

Test environment new: old env + cs 26056

Test script: test-vm-memory-allocation.sh (attached)

My conclusion from the test:

- HVM create time is greatly reduced.
- PVM create time is increased dramatically for 4G, 8G, 16G, 32G, 64G, 128G.
- HVM/PVM destroy time is not affected.
- If most of our customers are using PVM, I think this patch is bad, because
  most VM memory should be under 128G.
- If they are using HVM, then this patch is great.

Questions for discussion:

- Did you get the same result?
- It seems this result is not ideal. We may need to improve it.

Please note: I may not have access to the same machine for a while.

Thanks,

Zhigang
>>> On 12.11.12 at 16:01, Zhigang Wang <zhigang.x.wang@oracle.com> wrote:
> My conclusion from the test:
>
> - HVM create time is greatly reduced.
> - PVM create time is increased dramatically for 4G, 8G, 16G, 32G, 64G, 128G.
> - HVM/PVM destroy time is not affected.
> - If most of our customers are using PVM, I think this patch is bad, because
>   most VM memory should be under 128G.
> - If they are using HVM, then this patch is great.
>
> Questions for discussion:
>
> - Did you get the same result?
> - It seems this result is not ideal. We may need to improve it.

We'd first of all need to understand how this rather odd behavior can be
explained. In order to have a better comparison basis, did you also do this
for traditional PV? Or maybe I misunderstand what PVM stands for, and am
mixing it up with PVH? You certainly agree that the two curves for what you
call PVM have quite an unusual relationship.

Jan
On 11/12/2012 10:17 AM, Jan Beulich wrote:
>>>> On 12.11.12 at 16:01, Zhigang Wang <zhigang.x.wang@oracle.com> wrote:
>> My conclusion from the test:
>>
>> - HVM create time is greatly reduced.
>> - PVM create time is increased dramatically for 4G, 8G, 16G, 32G, 64G, 128G.
>> - HVM/PVM destroy time is not affected.
>> - If most of our customers are using PVM, I think this patch is bad, because
>>   most VM memory should be under 128G.
>> - If they are using HVM, then this patch is great.
>>
>> Questions for discussion:
>>
>> - Did you get the same result?
>> - It seems this result is not ideal. We may need to improve it.
> We'd first of all need to understand how this rather odd behavior
> can be explained. In order to have a better comparison basis, did
> you also do this for traditional PV? Or maybe I misunderstand
> what PVM stands for, and am mixing it up with PVH? You certainly
> agree that the two curves for what you call PVM have quite
> an unusual relationship.
>
Let me attach the HVM and PV guest configuration files.

Actually I use xm create -p to create the VM and destroy it immediately, so
the guest kernel doesn't matter. Please see the test script for details.

Thanks,

Zhigang
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Subject: Re: VM memory allocation speed with cs 26056
>
> >>> On 12.11.12 at 16:01, Zhigang Wang <zhigang.x.wang@oracle.com> wrote:
> > My conclusion from the test:
> >
> > - HVM create time is greatly reduced.
> > - PVM create time is increased dramatically for 4G, 8G, 16G, 32G, 64G, 128G.
> > - HVM/PVM destroy time is not affected.
> > - If most of our customers are using PVM, I think this patch is bad, because
> >   most VM memory should be under 128G.
> > - If they are using HVM, then this patch is great.
> >
> > Questions for discussion:
> >
> > - Did you get the same result?
> > - It seems this result is not ideal. We may need to improve it.
>
> We'd first of all need to understand how this rather odd behavior
> can be explained. In order to have a better comparison basis, did
> you also do this for traditional PV? Or maybe I misunderstand
> what PVM stands for, and am mixing it up with PVH? You certainly
> agree that the two curves for what you call PVM have quite
> an unusual relationship.

("PVM" is unfortunately often used within Oracle and means the same as "PV".
"PVM" == paravirtualized virtual machine.)

One significant difference is that a PV domain always allocates memory one 4K
page at a time and the patch improves allocation performance only for
larger-order allocations. A reasonable hypothesis is that the patch reduces
performance on long sequences of 4K pages, though this doesn't explain the
curve of the PV_create measurements at 256G and above.

With a one-line* hypervisor patch in alloc_heap_pages, one can change HVM
allocation so that all larger allocations are rejected. It would be very
interesting to see if that would result in an HVM create curve similar to the
PV create curve.

* change "unlikely(order > MAX_ORDER)" to "order > 0"
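For the record, the footnote corresponds to a change of this shape in
alloc_heap_pages() (xen/common/page_alloc.c). Only the quoted condition comes
from the note above; the surrounding lines are an assumption about what the
check looks like. The point is purely diagnostic: with the change, every
multi-page request fails, so an HVM build degrades to the same 4K-at-a-time
allocation pattern a PV build uses.

    /* Sketch only -- context assumed, condition quoted from the note above. */

    /* Existing sanity check near the top of alloc_heap_pages(): */
    if ( unlikely(order > MAX_ORDER) )
        return NULL;

    /* For the experiment, reject every request larger than a single page,
     * forcing HVM domain building down the 4K-at-a-time path: */
    if ( order > 0 )
        return NULL;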
On 12/11/2012 15:01, "Zhigang Wang" <zhigang.x.wang@oracle.com> wrote:

> Hi Keir/Jan,
>
> Recently I got a chance to access a big machine (2T mem/160 cpus) and I tested
> your patch: http://xenbits.xen.org/hg/xen-unstable.hg/rev/177fdda0be56
>
> Attached is the result.

The PVM result is weird: there is a small-ish slowdown for small domains,
becoming a very large %age slowdown as domain memory increases, and then
turning into a *speedup* as the memory size gets very large indeed.

What are the error bars like on these measurements, I wonder? One thing we
could do to allow PV guests doing 4k-at-a-time allocations through
alloc_heap_pages() to benefit from the TLB-flush improvements is pull the
filtering-and-flush out into populate_physmap() and increase_reservation().
This is listed as a todo in the original patch (26056).

To be honest I don't know why the original patch would make PV domain
creation slower, and certainly not by a varying %age depending on domain
memory size!

 -- Keir

> Test environment old:
>
> # xm info
> host                   : ovs-3f-9e-04
> release                : 2.6.39-300.17.1.el5uek
> version                : #1 SMP Fri Oct 19 11:30:08 PDT 2012
> machine                : x86_64
> nr_cpus                : 160
> nr_nodes               : 8
> cores_per_socket       : 10
> threads_per_core       : 2
> cpu_mhz                : 2394
> hw_caps                : bfebfbff:2c100800:00000000:00003f40:02bee3ff:00000000:00000001:00000000
> virt_caps              : hvm hvm_directio
> total_memory           : 2097142
> free_memory            : 2040108
> free_cpus              : 0
> xen_major              : 4
> xen_minor              : 1
> xen_extra              : .3OVM
> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler          : credit
> xen_pagesize           : 4096
> platform_params        : virt_start=0xffff800000000000
> xen_changeset          : unavailable
> xen_commandline        : dom0_mem=31390M no-bootscrub
> cc_compiler            : gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)
> cc_compile_by          : mockbuild
> cc_compile_domain      : us.oracle.com
> cc_compile_date        : Fri Oct 19 21:34:08 PDT 2012
> xend_config_format     : 4
>
> # uname -a
> Linux ovs-3f-9e-04 2.6.39-300.17.1.el5uek #1 SMP Fri Oct 19 11:30:08 PDT
> 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> # cat /boot/grub/grub.conf
> ...
> kernel /xen.gz dom0_mem=31390M no-bootscrub dom0_vcpus_pin
> dom0_max_vcpus=32
>
> Test environment new: old env + cs 26056
>
> Test script: test-vm-memory-allocation.sh (attached)
>
> My conclusion from the test:
>
> - HVM create time is greatly reduced.
> - PVM create time is increased dramatically for 4G, 8G, 16G, 32G, 64G, 128G.
> - HVM/PVM destroy time is not affected.
> - If most of our customers are using PVM, I think this patch is bad, because
>   most VM memory should be under 128G.
> - If they are using HVM, then this patch is great.
>
> Questions for discussion:
>
> - Did you get the same result?
> - It seems this result is not ideal. We may need to improve it.
>
> Please note: I may not have access to the same machine for a while.
>
> Thanks,
>
> Zhigang
>
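To make that to-do concrete: alloc_heap_pages() currently works out, per
allocation, which CPUs may still hold stale TLB entries for the pages it hands
out (based on each page's tlbflush_timestamp) and flushes them immediately, so
a PV build pays that cost once per 4K page. Pulling it out means the batching
callers accumulate a single CPU mask across the whole extent loop and flush
once. The following is only a rough illustration of that idea, not changeset
26056 or a real follow-up patch; the field and helper names (need_tlbflush,
tlbflush_timestamp, tlbflush_filter(), flush_tlb_mask()) follow Xen's existing
TLB-flush machinery, but the exact signatures and surrounding structure are
assumptions.

    /* Rough sketch only.  In alloc_heap_pages(): stop flushing, and instead
     * report which CPUs would need a flush for the pages being handed out. */
    static void collect_stale_tlb_cpus(struct page_info *pg, unsigned int order,
                                       cpumask_t *need_flush)
    {
        unsigned int i;
        cpumask_t mask;

        for ( i = 0; i < (1u << order); i++ )
        {
            if ( !pg[i].u.free.need_tlbflush )
                continue;
            /* Start from all online CPUs and drop those whose TLBs have
             * provably been flushed since the page was freed. */
            cpumask_copy(&mask, &cpu_online_map);
            tlbflush_filter(mask, pg[i].tlbflush_timestamp);
            cpumask_or(need_flush, need_flush, &mask);
        }
    }

    /* In populate_physmap()/increase_reservation(): one flush per batch of
     * extents instead of one flush per 4K allocation. */
    cpumask_t need_flush;

    cpumask_clear(&need_flush);
    for ( i = a->nr_done; i < a->nr_extents; i++ )
    {
        struct page_info *page =
            alloc_domheap_pages(d, a->extent_order, a->memflags);

        if ( page == NULL )
            break;
        collect_stale_tlb_cpus(page, a->extent_order, &need_flush);
        /* ...existing guest_physmap/extent bookkeeping unchanged... */
    }
    if ( !cpumask_empty(&need_flush) )
        flush_tlb_mask(&need_flush);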
On 11/12/2012 01:25 PM, Keir Fraser wrote:
> On 12/11/2012 15:01, "Zhigang Wang" <zhigang.x.wang@oracle.com> wrote:
>
>> Hi Keir/Jan,
>>
>> Recently I got a chance to access a big machine (2T mem/160 cpus) and I tested
>> your patch: http://xenbits.xen.org/hg/xen-unstable.hg/rev/177fdda0be56
>>
>> Attached is the result.
> The PVM result is weird: there is a small-ish slowdown for small domains,
> becoming a very large %age slowdown as domain memory increases, and then
> turning into a *speedup* as the memory size gets very large indeed.
>
> What are the error bars like on these measurements, I wonder? One thing we
> could do to allow PV guests doing 4k-at-a-time allocations through
> alloc_heap_pages() to benefit from the TLB-flush improvements is pull the
> filtering-and-flush out into populate_physmap() and increase_reservation().
> This is listed as a todo in the original patch (26056).
>
> To be honest I don't know why the original patch would make PV domain
> creation slower, and certainly not by a varying %age depending on domain
> memory size!
>
> -- Keir
I did it a second time. It seems the result (attached) is promising.

I think the strange result is due to the order of testing:
start_physical_machine -> test_hvm -> test_pvm.

This time, I did: start_physical_machine -> test_pvm -> test_hvm.

You can see the pvm memory allocation speed is not affected by your patch
this time. So I believe this patch is excellent now.

Thanks,

Zhigang
> From: Zhigang Wang
> Subject: Re: [Xen-devel] VM memory allocation speed with cs 26056
>
> On 11/12/2012 01:25 PM, Keir Fraser wrote:
> > On 12/11/2012 15:01, "Zhigang Wang" <zhigang.x.wang@oracle.com> wrote:
> >
> >> Hi Keir/Jan,
> >>
> >> Recently I got a chance to access a big machine (2T mem/160 cpus) and I tested
> >> your patch: http://xenbits.xen.org/hg/xen-unstable.hg/rev/177fdda0be56
> >>
> >> Attached is the result.
> > The PVM result is weird: there is a small-ish slowdown for small domains,
> > becoming a very large %age slowdown as domain memory increases, and then
> > turning into a *speedup* as the memory size gets very large indeed.
> >
> > What are the error bars like on these measurements, I wonder? One thing we
> > could do to allow PV guests doing 4k-at-a-time allocations through
> > alloc_heap_pages() to benefit from the TLB-flush improvements is pull the
> > filtering-and-flush out into populate_physmap() and increase_reservation().
> > This is listed as a todo in the original patch (26056).
> >
> > To be honest I don't know why the original patch would make PV domain
> > creation slower, and certainly not by a varying %age depending on domain
> > memory size!
> >
> > -- Keir
> I did it a second time. It seems the result (attached) is promising.
>
> I think the strange result is due to the order of testing:
> start_physical_machine -> test_hvm -> test_pvm.
>
> This time, I did: start_physical_machine -> test_pvm -> test_hvm.
>
> You can see the pvm memory allocation speed is not affected by your patch this time.
>
> So I believe this patch is excellent now.

I don't know about Keir's opinion, but to me the performance dependency on
ordering is even more bizarre and still calls into question the acceptability
of the patch. Customers aren't going to like being told that, for best
results, they should launch all PV domains before their HVM domains.

Is scrubbing still/ever done lazily? That might explain some of the weirdness.

Dan
On 13/11/2012 16:13, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

>> I did it a second time. It seems the result (attached) is promising.
>>
>> I think the strange result is due to the order of testing:
>> start_physical_machine -> test_hvm -> test_pvm.
>>
>> This time, I did: start_physical_machine -> test_pvm -> test_hvm.
>>
>> You can see the pvm memory allocation speed is not affected by your patch
>> this time.
>>
>> So I believe this patch is excellent now.
>
> I don't know about Keir's opinion, but to me the performance dependency
> on ordering is even more bizarre and still calls into question the
> acceptability of the patch. Customers aren't going to like being
> told that, for best results, they should launch all PV domains before
> their HVM domains.

Yes, it's very odd. I would want to see more runs to be convinced that
nothing else weird is going on and affecting the results. If it is something
to do with the movement of the TLB-flush filtering logic, it may just be a
tiny difference being amplified by the vast number of times alloc_heap_pages()
gets called. We should be able to actually *improve* PV build performance by
pulling the flush filtering out into populate_physmap and
increase_reservation.

> Is scrubbing still/ever done lazily? That might explain some of
> the weirdness.

No lazy scrubbing any more.

 -- Keir
On 11/13/2012 10:46 AM, Zhigang Wang wrote:
> On 11/12/2012 01:25 PM, Keir Fraser wrote:
>> On 12/11/2012 15:01, "Zhigang Wang" <zhigang.x.wang@oracle.com> wrote:
>>
>>> Hi Keir/Jan,
>>>
>>> Recently I got a chance to access a big machine (2T mem/160 cpus) and I tested
>>> your patch: http://xenbits.xen.org/hg/xen-unstable.hg/rev/177fdda0be56
>>>
>>> Attached is the result.
>> The PVM result is weird: there is a small-ish slowdown for small domains,
>> becoming a very large %age slowdown as domain memory increases, and then
>> turning into a *speedup* as the memory size gets very large indeed.
>>
>> What are the error bars like on these measurements, I wonder? One thing we
>> could do to allow PV guests doing 4k-at-a-time allocations through
>> alloc_heap_pages() to benefit from the TLB-flush improvements is pull the
>> filtering-and-flush out into populate_physmap() and increase_reservation().
>> This is listed as a todo in the original patch (26056).
>>
>> To be honest I don't know why the original patch would make PV domain
>> creation slower, and certainly not by a varying %age depending on domain
>> memory size!
>>
>> -- Keir
> I did it a second time. It seems the result (attached) is promising.
>
> I think the strange result is due to the order of testing:
> start_physical_machine -> test_hvm -> test_pvm.
>
> This time, I did: start_physical_machine -> test_pvm -> test_hvm.
>
> You can see the pvm memory allocation speed is not affected by your patch this time.
>
> So I believe this patch is excellent now.
I got another chance to run the test without (old) and with (new) cs 26056,
with the order: start_physical_machine -> test_hvm -> test_pvm.

It seems PV guest memory allocation is not affected by this patch, although
it makes a big difference if testing with the order:
start_physical_machine -> test_pvm -> test_hvm.

Thanks for the patch.

Zhigang