Hi,

I'm trying to perform some simple benchmarks for an HVM domain and I'm
wondering if there is a set of guidelines available for configuration
tweaking etc. VMware gives a set of guidelines along the lines of
"pre-allocate disks, don't use page sharing, etc." and I'm curious whether
there is an analogous document/resource for Xen. The main problem I am
having is high variance in my benchmark runs, specifically SPECCPU2006
workloads in an HVM domain. Sometimes I get fast runs of a given workload
once or twice, but invariably at least one of three consecutive runs is
much worse than the best overall time. Below are some details of my
specific problem. I'm most interested in test repeatability, but maximizing
performance would be great too. Thanks, and sorry if I missed this info on
this or other forums.

Best,
nick

Details:

When running SPECCPU2006 in an HVM domain, I experience high variance in
test times. Here are some specific results to consider (times given for
three consecutive runs of each workload, in seconds).

Xen, first test:
403.gcc:        2290, 2290, 2290  <-- low variance, poor performance
456.hmmer:      3530, 3530, 3530  <-- low variance, poor performance
458.sjeng:      1340, 2640, 2630  <-- high variance
462.libquantum: 3340, 3820, 3810  <-- high variance, poor performance

Xen, second test:
403.gcc:        1540, 2780, 2770  <-- high variance
456.hmmer:      1770, 3580, 3590  <-- high variance
458.sjeng:      1360, 1360, 2680  <-- high variance
462.libquantum: 2300, 3130, 3900  <-- high variance

Bare hardware, for comparison:
403.gcc:         767,  775,  784
456.hmmer:      1660, 1670, 1640
458.sjeng:      1190, 1220, 1220
462.libquantum: 2130, 2080, 2030

HW:
- IBM/Lenovo T60
- Core Duo T2500 2.00GHz, VT-x and multicore enabled in BIOS
- 1.5GB physical memory

Xen:
- Similar results for both the 3.0.2-2 and 3.1.0 stable releases

Dom0:
- Debian Lenny, default 2.6.18-xen kernel build

HVM domain:
- 1200MB RAM
- 1 vcpu
- file-backed disk image
- Red Hat 7.3 with the default 2.4.18-3 (non-SMP) kernel (yes, this is
  old, but I'm stuck with this distro for other reasons)
- SPECCPU2006, base run

In general, I'm using standard benchmarking best practices, i.e.,
disabling unnecessary services in host and guest, etc. Perhaps this is
related to VCPU scheduling? Should I try pinning a VCPU? Please let me
know if I can provide more details. Any help or insight would be much
appreciated!
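(For reference, an HVM config matching the details above would look
roughly like the following -- a sketch only, with the domain name, image
path, and bridge name all hypothetical:)

    # /etc/xen/hvm1.cfg -- hypothetical name and paths
    kernel       = '/usr/lib/xen/boot/hvmloader'
    builder      = 'hvm'
    device_model = '/usr/lib/xen/bin/qemu-dm'
    name         = 'hvm1'
    memory       = 1200
    vcpus        = 1
    disk         = [ 'file:/var/xen/rh73.img,hda,w' ]
    vif          = [ 'type=ioemu, bridge=xenbr0' ]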
On 14/8/07 11:37, "Nick L. Petroni Jr." <npetroni@cs.umd.edu> wrote:

> In general, I'm using standard benchmarking best practices, i.e.,
> disabling unnecessary services in host and guest, etc. Perhaps this is
> related to VCPU scheduling? Should I try pinning a VCPU? Please let me
> know if I can provide more details. Any help or insight would be much
> appreciated!

It should be possible -- difficult not to, in fact -- to get almost-native
scores on SPECINT benchmarks from within an HVM guest. There's no I/O or
system activity at all -- it's just measuring raw CPU speed.

The most likely culprits are scheduling problems or time problems in the
HVM guest.

To discount scheduling issues, it's probably worth pinning your HVM VCPU
to a single physical CPU (and setting the affinity of dom0 so that it
*doesn't* run on that physical CPU) and seeing if that helps.

For time issues, you can time your SPECINT runs with a stopwatch. Or
perhaps you can come up with some more automatable means, but you should
aim to take before/after timestamps from *outside* the HVM guest, since
you're trying to ascertain whether the HVM timekeeping is screwed on your
system.

 -- Keir
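(For concreteness, the pinning Keir suggests could be done with xm along
these lines -- a sketch, assuming the guest has domain ID 1 and that dom0
may have two VCPUs on this dual-core box:)

    xm vcpu-pin 0 0 0    # pin dom0's VCPU 0 to physical CPU 0
    xm vcpu-pin 0 1 0    # ...and dom0's VCPU 1, if present, to CPU 0 too
    xm vcpu-pin 1 0 1    # pin the HVM guest's VCPU 0 to physical CPU 1
    xm vcpu-list         # confirm the affinities took effect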
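(And for outside-the-guest timing, one automatable possibility is a
dom0-side wrapper -- a minimal sketch, assuming the guest runs sshd and is
reachable at a hypothetical address, with placeholder runspec arguments:)

    #!/bin/sh
    # Time a run using dom0's clock, not the guest's possibly-skewed one.
    START=$(date +%s)
    ssh root@192.168.1.100 "runspec --config=mytest.cfg 403.gcc"
    END=$(date +%s)
    echo "elapsed (dom0 wall clock): $((END - START)) s"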
Thanks for the fast reply.

> It should be possible -- difficult not to, in fact -- to get
> almost-native scores on SPECINT benchmarks from within an HVM guest.
> There's no I/O or system activity at all -- it's just measuring raw CPU
> speed.

Yeah, this was my thought as well. My VMware numbers showed some
degradation relative to native, which might be attributable to time issues
(and maybe the fact that it's Workstation, not ESX), but they were far
more consistent across runs.

> The most likely culprits are scheduling problems or time problems in the
> HVM guest.
>
> To discount scheduling issues, it's probably worth pinning your HVM VCPU
> to a single physical CPU (and setting the affinity of dom0 so that it
> *doesn't* run on that physical CPU) and seeing if that helps.

OK, I'll try this. I may also try disabling multi-core to see what a
single-CPU configuration does. I'll let you know how it turns out.

> For time issues, you can time your SPECINT runs with a stopwatch. Or
> perhaps you can come up with some more automatable means, but you should
> aim to take before/after timestamps from *outside* the HVM guest, since
> you're trying to ascertain whether the HVM timekeeping is screwed on
> your system.

I suspect there are some time issues here, but they are definitely not the
primary culprit. I did use a stopwatch for some of my tests, and the tests
with more "bad" runs took up to a couple of hours longer in real clock
time. One problem is that it's more difficult to report numbers with
external times. I've seen recommendations to use ping timestamps etc. to
an external machine, but I'm mostly concerned with relative degradation
after I add some workload, so I'm hoping the timing issue will affect all
of my tests similarly and be less of an issue. First I need to get at
least a consistent run without my workload, though.

Thanks again for your help,
nick
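(Incidentally, the ping-timestamp idea can be kept fairly simple -- a
sketch, with hypothetical hostnames, assuming tcpdump is available on the
external machine:)

    # On the external machine: log arrival times of pings from the guest.
    tcpdump -tttt -n icmp and src host hvm-guest.example.com

    # In the guest: bracket the benchmark with single pings as markers.
    ping -c 1 timehost.example.com      # "start" marker
    runspec --config=mytest.cfg 403.gcc
    ping -c 1 timehost.example.com      # "end" marker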
Hi,

> It should be possible -- difficult not to, in fact -- to get
> almost-native scores on SPECINT benchmarks from within an HVM guest.
> There's no I/O or system activity at all -- it's just measuring raw CPU
> speed.
>
> The most likely culprits are scheduling problems or time problems in the
> HVM guest.
>
> To discount scheduling issues, it's probably worth pinning your HVM VCPU
> to a single physical CPU (and setting the affinity of dom0 so that it
> *doesn't* run on that physical CPU) and seeing if that helps.

I've run some additional tests, this time with the following settings:

    xm vcpu-pin 0 0 0
    xm vcpu-pin 1 0 1

The overall numbers were better. That is, the best and worst times were,
in general, faster than before. However, the variance is still high. Here
are my results for my Red Hat 7.3 HVM guest:

http://www.cs.umd.edu/~npetroni/xen_cpu_results/CINT2006.001.html

(NOTE: I forgot to update the configuration description before running, so
it says this is Xen 3.0.2-2. It's actually 3.1.0.)

I only ran four of the workloads (hence the "Invalid Run" wallpaper) and
experienced the same trend as before -- after a few workloads, the numbers
get worse. To be clear, the benchmark runs the workloads in column order,
not row order. So the test goes: gcc, hmmer, sjeng, libquantum, gcc,
hmmer, sjeng, ...

I thought this could be a guest scheduler issue of some sort, so I re-ran
with a vanilla Fedora Core 6 (SELinux etc. disabled) HVM domain. Here are
those results:

http://www.cs.umd.edu/~npetroni/xen_cpu_results/CINT2006.003.html

The trend, and some of the numbers, are nearly identical. After some time,
the system just appears to degrade. I'm a little stumped at this point,
but I'm out of time to keep tracking down the issue, for this week anyway.
In the meantime, I'm going to try running each workload separately with
reboots in between, so I can at least get an idea of peak performance for
each.

Take care and thanks,
nick
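(A dom0-side harness for that one-workload-per-boot plan might look
roughly like this -- a sketch, where the config path, guest address, and
runspec invocation are all hypothetical, and the fixed sleep is a crude
stand-in for proper boot detection:)

    #!/bin/sh
    # Boot a fresh guest for each workload, run it once, then shut down.
    for bench in 403.gcc 456.hmmer 458.sjeng 462.libquantum; do
        xm create /etc/xen/hvm1.cfg
        sleep 180                   # crude wait for the guest to boot
        ssh root@192.168.1.100 "runspec --config=mytest.cfg $bench"
        xm shutdown -w hvm1         # -w blocks until the domain is down
    done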