Hello Jeremy + Konrad + all kernel devs,

while it is great to see that vanilla dom0 seems to work, the performance
breakdown compared to xenified 2.6.34 (from opensuse) is huge. Here are my
benchmarks:

Xen 4.1.1, Xeon E5620, Supermicro X8DTi-F, 12 GB RAM, dom0 2 VCPUs

time emerge apache:

        3.2.12-dom0   3.3.4-dom0   2.6.34.10-dom0
real    1m0.560s      0m59.971s    0m47.689s
user    0m40.939s     0m40.619s    0m41.355s
sys     0m18.865s     0m18.305s    0m11.441s

time make -j4 (3.2.12 linux compile):

        3.2.12-dom0   3.3.4-dom0   2.6.34.10-dom0
real    5m8.793s      5m4.888s     4m20.576s
user    8m1.746s      7m59.726s    7m10.375s
sys     1m39.010s     1m32.994s    0m56.304s

Regards Andreas
On 02/05/12 14:11, Andreas Kinzler wrote:
> Hello Jeremy + Konrad + all kernel devs,
>
> while it is great to see that vanilla dom0 seems to work, the
> performance breakdown compared to xenified 2.6.34 (from opensuse) is
> huge. Here are my benchmarks:
>
> Xen 4.1.1, Xeon E5620, Supermicro X8DTi-F, 12 GB RAM, dom0 2 VCPUs
>
> time emerge apache:
>
>         3.2.12-dom0   3.3.4-dom0   2.6.34.10-dom0
> real    1m0.560s      0m59.971s    0m47.689s
> user    0m40.939s     0m40.619s    0m41.355s
> sys     0m18.865s     0m18.305s    0m11.441s

Can you apply 7eb7ce4d2e8991aff4ecb71a81949a907ca755ac "xen: correctly
check for pending events when restoring irq flags"[1] and see how much
it helps?

This patch is in 3.4-rc5 and is queued for 3.3.5.

David

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=7eb7ce4d2e8991aff4ecb71a81949a907ca755ac
On Wed, 2012-05-02 at 14:11 +0100, Andreas Kinzler wrote:
> Hello Jeremy + Konrad + all kernel devs,
>
> while it is great to see that vanilla dom0 seems to work, the
> performance breakdown compared to xenified 2.6.34 (from opensuse) is
> huge. Here are my benchmarks:

There were a couple of performance fixes posted recently. One of them
was "xen: correctly check for pending events when restoring irq flags"
from David Vrabel, which is now in mainline as 7eb7ce4d2e89 and marked
for stable backport. The other was something to do with blk i/o
performance from Stefano Stabellini, for which I don't have a handy
reference or status (hopefully Konrad does though).

I think those will undoubtedly help, although I think performance tuning
of the upstream dom0 kernel is still something we need to do more of in
the short term.

> Xen 4.1.1, Xeon E5620, Supermicro X8DTi-F, 12 GB RAM, dom0 2 VCPUs
>
> time emerge apache:
>
>         3.2.12-dom0   3.3.4-dom0   2.6.34.10-dom0
> real    1m0.560s      0m59.971s    0m47.689s
> user    0m40.939s     0m40.619s    0m41.355s
> sys     0m18.865s     0m18.305s    0m11.441s
>
> time make -j4 (3.2.12 linux compile):
>
>         3.2.12-dom0   3.3.4-dom0   2.6.34.10-dom0
> real    5m8.793s      5m4.888s     4m20.576s
> user    8m1.746s      7m59.726s    7m10.375s
> sys     1m39.010s     1m32.994s    0m56.304s

Did you happen to also compare 3.2.12 and 2.6.34.10 running these
workloads natively?

Ian.
On 02.05.2012 15:30, David Vrabel wrote:
> Can you apply 7eb7ce4d2e8991aff4ecb71a81949a907ca755ac "xen: correctly
> check for pending events when restoring irq flags"[1] and see how much
> it helps?

There is some minor improvement - but it is still far away from xenified
2.6.34.10.

time emerge apache:

        3.2.12-dom0   3.3.4-dom0   (w. patch)    2.6.34.10-dom0
real    1m0.560s      0m59.971s    (0m58.029s)   0m47.689s
user    0m40.939s     0m40.619s    (0m40.291s)   0m41.355s
sys     0m18.865s     0m18.305s    (0m16.837s)   0m11.441s

time make -j4 (3.2.12 linux compile):

        3.2.12-dom0   3.3.4-dom0   (w. patch)    2.6.34.10-dom0
real    5m8.793s      5m4.888s     (5m1.408s)    4m20.576s
user    8m1.746s      7m59.726s    (7m57.534s)   7m10.375s
sys     1m39.010s     1m32.994s    (1m29.518s)   0m56.304s

Regards Andreas
On Wed, May 2, 2012 at 6:01 PM, Andreas Kinzler <ml-xen-devel@hfp.de> wrote:
> On 02.05.2012 15:30, David Vrabel wrote:
>>
>> Can you apply 7eb7ce4d2e8991aff4ecb71a81949a907ca755ac "xen: correctly
>> check for pending events when restoring irq flags"[1] and see how much
>> it helps?
>
>
> There is some minor improvement - but it is still far away from xenified
> 2.6.34.10.

Just FYI, the reason Ian suggested making the same comparison for
native is that the performance of linux overall on bare-metal has also
suffered since 2.6.34. It's likely that a non-trivial amount of the
performance regression is due to moving from 2.6.34 to 3.{2,3}, over
and above whatever regressions may have happened when moving from
xenified to pvops.

-George
On 02.05.2012 15:31, Ian Campbell wrote:
> Did you happen to also compare 3.2.12 and 2.6.34.10 running these
> workloads natively?

Yes, I tested against 2.6.32.x. Differences exist, but are minor.

time emerge apache:

        2.6.32.36     3.2.12       3.3.4
real    0m31.419s     0m34.770s    0m35.210s
user    0m45.479s     0m38.994s    0m39.750s
sys     0m6.488s      0m4.584s     0m4.928s

make -j4:

        2.6.32.36     3.2.12       3.3.4
real    2m3.531s      2m4.423s     2m2.348s
user    7m45.817s     7m21.456s    7m18.291s
sys     0m35.194s     0m28.758s    0m28.974s

Regards Andreas
On 03.05.2012 10:32, George Dunlap wrote:
>>> Can you apply 7eb7ce4d2e8991aff4ecb71a81949a907ca755ac "xen: correctly
>>> check for pending events when restoring irq flags"[1] and see how much
>>> it helps?
>> There is some minor improvement - but it is still far away from xenified
>> 2.6.34.10.
> Just FYI, the reason Ian suggested making the same comparison for
> native is that the performance of linux overall on bare-metal has also
> suffered since 2.6.34. It's likely that a non-trivial amount of the
> performance regression is due to moving from 2.6.34 to 3.{2,3}, over
> and above whatever regressions may have happened when moving from
> xenified to pvops.

I took his suggestion seriously, and I had in fact already performed
these tests (see my other post). Unfortunately, the minor loss on
bare-metal and the huge loss from xenified 2.6.34 to pvops 3.x show that
the problem is clearly with the Xen changes and not the bare-metal
changes.

Regards Andreas
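For readers who want to put numbers on this comparison, here is a minimal sketch (not part of the original thread) that converts the `time` values posted in this thread into seconds and computes the relative change. The helper names `to_seconds` and `pct_change` are made up for illustration; the figures are the `emerge apache` results from the earlier posts. Note that the native baseline is 2.6.32.36 rather than 2.6.34.10, so the two comparisons are not strictly like for like.

```python
def to_seconds(t: str) -> float:
    """Convert a shell `time` value such as '1m0.560s' to seconds."""
    minutes, rest = t.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


def pct_change(old: str, new: str) -> float:
    """Percentage change from `old` to `new` (positive means slower)."""
    o, n = to_seconds(old), to_seconds(new)
    return (n - o) / o * 100.0


# emerge apache, figures copied from the thread: (older kernel, 3.2.12)
dom0 = {
    "real": ("0m47.689s", "1m0.560s"),   # 2.6.34.10-dom0 -> 3.2.12-dom0
    "sys": ("0m11.441s", "0m18.865s"),
}
native = {
    "real": ("0m31.419s", "0m34.770s"),  # 2.6.32.36 -> 3.2.12, bare metal
    "sys": ("0m6.488s", "0m4.584s"),
}

for label, rows in (("dom0", dom0), ("native", native)):
    for metric, (old, new) in rows.items():
        print(f"{label:6s} {metric:4s} {pct_change(old, new):+6.1f}%")
```

On these figures, sys time in dom0 grows by about 65% and real time by about 27% going from the xenified kernel to 3.2.12, while on bare metal real time grows by about 11% and sys time actually drops.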
On Thu, 2012-05-03 at 11:43 +0100, Andreas Kinzler wrote:
> On 02.05.2012 15:31, Ian Campbell wrote:
> > Did you happen to also compare 3.2.12 and 2.6.34.10 running these
> > workloads natively?
>
> Yes, I tested against 2.6.32.x. Differences exist, but are minor.

Good to know, thanks for testing.

The other potential Xen perf thing which just occurred to me is the
ACPI power management stuff which the xen-acpi-processor patches in
3.4-rcN are fixing. These are necessary to enable things like turbo mode,
so they have a pretty large perf impact.

Are you able to try the latest 3.4-rc kernel?

I'm not sure if backports to the kernels you are running exist or not,
Konrad?

Ian.
On Wed, 2 May 2012, Andreas Kinzler wrote:
> Hello Jeremy + Konrad + all kernel devs,
>
> while it is great to see that vanilla dom0 seems to work, the
> performance breakdown compared to xenified 2.6.34 (from opensuse) is
> huge. Here are my benchmarks:
>

Thanks for running benchmarks!

> Xen 4.1.1, Xeon E5620, Supermicro X8DTi-F, 12 GB RAM, dom0 2 VCPUs
>
> time emerge apache:
>
>         3.2.12-dom0   3.3.4-dom0   2.6.34.10-dom0
> real    1m0.560s      0m59.971s    0m47.689s
> user    0m40.939s     0m40.619s    0m41.355s
> sys     0m18.865s     0m18.305s    0m11.441s
>
> time make -j4 (3.2.12 linux compile):
>
>         3.2.12-dom0   3.3.4-dom0   2.6.34.10-dom0
> real    5m8.793s      5m4.888s     4m20.576s
> user    8m1.746s      7m59.726s    7m10.375s
> sys     1m39.010s     1m32.994s    0m56.304s

Just for clarity, you are running this in dom0 (not in a VM), correct?

If you are running the test in a VM, is it a PV or an HVM guest? What is
the guest kernel version? What is the vcpu and memory configuration?
And finally, what are you using as the disk image (file or LVM)?
On Wed, 2 May 2012, Ian Campbell wrote:
> On Wed, 2012-05-02 at 14:11 +0100, Andreas Kinzler wrote:
> > Hello Jeremy + Konrad + all kernel devs,
> >
> > while it is great to see that vanilla dom0 seems to work, the
> > performance breakdown compared to xenified 2.6.34 (from opensuse) is
> > huge. Here are my benchmarks:
>
> There were a couple of performance fixes posted recently, one of them
> was "xen: correctly check for pending events when restoring irq flags"
> from David Vrabel, which is now in mainline as 7eb7ce4d2e89 and marked
> for stable backport. The other was something to do with blk i/o
> performance from Stefano Stabellini which I don't have a handy reference
> to or status on (hopefully Konrad does though).

It is this one:

http://marc.info/?l=xen-devel&m=133526478318742&w=2

but it is only relevant if you are running the benchmark in a VM, with
the disk image stored in a file.
On Thu, May 03, 2012 at 11:51:57AM +0100, Ian Campbell wrote:
> On Thu, 2012-05-03 at 11:43 +0100, Andreas Kinzler wrote:
> > On 02.05.2012 15:31, Ian Campbell wrote:
> > > Did you happen to also compare 3.2.12 and 2.6.34.10 running these
> > > workloads natively?
> >
> > Yes, I tested against 2.6.32.x. Differences exist, but are minor.
>
> Good to know, thanks for testing
>
> The other potential Xen perf thing which just occurred to me is the
> ACPI power management stuff which the xen-acpi-processor patches in
> 3.4-rcN are fixing. These are necessary to enable things like turbo mode
> so have a pretty large perf impact.
>
> Are you able to try the latest 3.4-rc kernel?
>
> I'm not sure if backports to the kernels you are running exist or not,
> Konrad?

No. But they should be easy to cherry-pick. However, I would suggest
trying v3.4-rc6 first and seeing if that makes a difference.

>
> Ian.
On 03.05.2012 12:51, Ian Campbell wrote:
> On Thu, 2012-05-03 at 11:43 +0100, Andreas Kinzler wrote:
>> On 02.05.2012 15:31, Ian Campbell wrote:
>>> Did you happen to also compare 3.2.12 and 2.6.34.10 running these
>>> workloads natively?
>> Yes, I tested against 2.6.32.x. Differences exist, but are minor.
>
> Good to know, thanks for testing
>
> The other potential Xen perf thing which just occurred to me is the
> ACPI power management stuff which the xen-acpi-processor patches in
> 3.4-rcN are fixing. These are necessary to enable things like turbo mode
> so have a pretty large perf impact.
>
> Are you able to try the latest 3.4-rc kernel?

Yes, meanwhile I tried 3.4-rc7. There is some improvement, but it is
still a good bit away from 2.6.34 xenified:

time emerge apache:

        3.2.12-dom0   3.3.4-dom0   (w. patch)    3.4.0-rc7    2.6.34.10-dom0
real    1m0.560s      0m59.971s    (0m58.029s)   0m55.000s    0m47.689s
user    0m40.939s     0m40.619s    (0m40.291s)   0m37.846s    0m41.355s
sys     0m18.865s     0m18.305s    (0m16.837s)   0m16.041s    0m11.441s

time make -j4 (3.2.12 linux compile):

        3.2.12-dom0   3.3.4-dom0   (w. patch)    3.4.0-rc7    2.6.34.10-dom0
real    5m8.793s      5m4.888s     (5m1.408s)    4m48.839s    4m20.576s
user    8m1.746s      7m59.726s    (7m57.534s)   7m40.129s    7m10.375s
sys     1m39.010s     1m32.994s    (1m29.518s)   1m20.993s    0m56.304s

Regards Andreas
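One way to read the final table is to ask how much of the gap between 3.2.12-dom0 and the xenified 2.6.34.10-dom0 is recovered by 3.4.0-rc7. The sketch below is an illustration only and not part of the original thread; `secs` and `gap_closed` are hypothetical helper names, and the inputs are the `make -j4` figures from the post above.

```python
import re


def secs(t: str) -> float:
    """Parse a shell `time` value like '4m48.839s' into seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unrecognised time value: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def gap_closed(baseline: str, candidate: str, target: str) -> float:
    """Share (in %) of the baseline-to-target gap that the candidate recovers."""
    b, c, t = secs(baseline), secs(candidate), secs(target)
    return (b - c) / (b - t) * 100.0


# make -j4: (3.2.12-dom0, 3.4.0-rc7, 2.6.34.10-dom0), copied from the thread
results = {
    "real": ("5m8.793s", "4m48.839s", "4m20.576s"),
    "sys": ("1m39.010s", "1m20.993s", "0m56.304s"),
}

for metric, (baseline, candidate, target) in results.items():
    share = gap_closed(baseline, candidate, target)
    print(f"{metric}: 3.4.0-rc7 recovers {share:.1f}% of the gap")
```

For the kernel compile this works out to a little over 40% of the gap for both real and sys time, which is consistent with the description above: some improvement, but still well short of the xenified kernel.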