Hi guys,

Cheers!! I got the PV in HVM container prototype working with a single VCPU
(pinned to a cpu). Basically, I create a VMX container just like for an HVM
guest (with some differences that I'll share soon when I clean up the code).
The PV guest starts in protected mode with the usual entry point
startup_xen().

0. The guest kernel runs in ring 0, CS:0x10.

1. I use xen for all page-table management, just like a PV guest. So at
   present all faults go to xen, and when fixup_page_fault() fails, they
   are injected into the container for the guest to handle.

2. The guest manages the GDT, LDT, and TR in the container.

3. The guest installs the trap table in the vmx container instead of
   calling do_set_trap_table().

4. Events/INTs are delivered via HVMIRQ_callback_vector.

5. MSR_GS_BASE is managed by the guest in the container itself.

6. Currently I'm managing cr4 in the container, but going to xen for cr0.
   I need to revisit that.

7. Currently VPID is disabled; I need to figure it out and revisit.

8. Currently VM_ENTRY_LOAD_GUEST_PAT is disabled; I need to look at that.

These are the salient points I can think of at the moment. Next, I am going
to run LMBench and figure out the gains. After that, make sure SMP works and
things are stable, and look at any enhancements. I need to look at a couple
of unrelated bugs at the moment, but hope to return to this very soon.

thanks,
Mukesh

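Point 4 in the list above is what lets events reach a guest that no longer
uses the classic PV upcall. A minimal guest-side sketch of that registration,
using the public HVMOP_set_param interface (the vector number and the helper
name are illustrative, not taken from Mukesh's patch):

#include <xen/interface/xen.h>
#include <xen/interface/hvm/hvm_op.h>
#include <xen/interface/hvm/params.h>
#include <asm/xen/hypercall.h>

#define EVTCHN_CALLBACK_VECTOR 0xf3	/* example vector, chosen by the guest */

static int register_hvm_callback_vector(void)
{
	struct xen_hvm_param xhp = {
		.domid = DOMID_SELF,
		.index = HVM_PARAM_CALLBACK_IRQ,
		/* val[63:56] = 2 means "deliver via vector"; val[7:0] = vector */
		.value = (2ULL << 56) | EVTCHN_CALLBACK_VECTOR,
	};

	/*
	 * After this, Xen injects EVTCHN_CALLBACK_VECTOR into the container
	 * whenever an event channel is pending; the guest's IDT entry for
	 * that vector then runs the usual event dispatch loop.
	 */
	return HYPERVISOR_hvm_op(HVMOP_set_param, &xhp);
}
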
On 27/06/2011 20:24, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> Hi guys,
>
> Cheers!! I got the PV in HVM container prototype working with a single
> VCPU (pinned to a cpu). Basically, I create a VMX container just like for
> an HVM guest (with some differences that I'll share soon when I clean up
> the code). The PV guest starts in protected mode with the usual entry
> point startup_xen().
>
> 0. The guest kernel runs in ring 0, CS:0x10.
>
> 1. I use xen for all page-table management, just like a PV guest. So at
>    present all faults go to xen, and when fixup_page_fault() fails, they
>    are injected into the container for the guest to handle.
>
> 2. The guest manages the GDT, LDT, and TR in the container.
>
> 3. The guest installs the trap table in the vmx container instead of
>    calling do_set_trap_table().

To be clear, you intend for this to work with unmodified PV guests, right?
All of this translation can easily be done in Xen, avoiding multiple paths
needed in the guest kernel (not really tenable for upstreaming).

 -- Keir

On Mon, 27 Jun 2011 20:36:18 +0100
Keir Fraser <keir.xen@gmail.com> wrote:

> To be clear, you intend for this to work with unmodified PV guests,
> right? All of this translation can easily be done in Xen, avoiding
> multiple paths needed in the guest kernel (not really tenable for
> upstreaming).
>
>  -- Keir

Hi Keir,

Actually, I modified the PVops guest. The changes in the pvops are
minimal and mostly confined to xen specific files. So I think it has
a fair shot of being upstreamed; at least, it's worth a shot. I will run
them by Jeremy/Konrad and get their opinions.

thanks
Mukesh

On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

>> To be clear, you intend for this to work with unmodified PV guests,
>> right? All of this translation can easily be done in Xen, avoiding
>> multiple paths needed in the guest kernel (not really tenable for
>> upstreaming).
>>
>>  -- Keir
>
> Hi Keir,
>
> Actually, I modified the PVops guest. The changes in the pvops are
> minimal and mostly confined to xen specific files. So I think it has
> a fair shot of being upstreamed; at least, it's worth a shot. I will run
> them by Jeremy/Konrad and get their opinions.

Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
not sure that adding explicitly HVM-PV guests as well isn't just a bloody
mess.

 -- Keir

On Tue, 2011-06-28 at 08:46 +0100, Keir Fraser wrote:
> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
> mess.

Ideally this container could be used to accelerate existing 64 bit
guests (e.g. older distros running classic-Xen) unmodified (or at least
only with latent bugs fixed) too.

Getting something working with a modified guest seems like a useful
first step (to get to a working baseline) but I'm not sure it should be
the end goal.

Ian.

On Mon, 2011-06-27 at 20:24 +0100, Mukesh Rathor wrote:
> Hi guys,
>
> Cheers!! I got the PV in HVM container prototype working with a single
> VCPU (pinned to a cpu). Basically, I create a VMX container just like for
> an HVM guest (with some differences that I'll share soon when I clean up
> the code). The PV guest starts in protected mode with the usual entry
> point startup_xen().

Great stuff! I've been eagerly awaiting this functionality ;-)

Do you have any timeline for when you think you might post the code?

I presume you managed to avoid bouncing through the hypervisor for
syscalls?

Cheers,
Ian.

On 28/06/2011 09:30, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:

> On Tue, 2011-06-28 at 08:46 +0100, Keir Fraser wrote:
>> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
>> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
>> mess.
>
> Ideally this container could be used to accelerate existing 64 bit
> guests (e.g. older distros running classic-Xen) unmodified (or at least
> only with latent bugs fixed) too.

There was a question mark over whether unmodified PV guests would tolerate
running in ring 0, rather than entirely in ring 3. I believe we're confident
it should work, and thus supporting classic-Xen guests should certainly be
the aim.

> Getting something working with a modified guest seems like a useful
> first step (to get to a working baseline) but I'm not sure it should be
> the end goal.

I certainly don't think we should commit such a thing without careful
thought.

 -- Keir

On Tue, 2011-06-28 at 09:35 +0100, Keir Fraser wrote:
> On 28/06/2011 09:30, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:
>
>> Ideally this container could be used to accelerate existing 64 bit
>> guests (e.g. older distros running classic-Xen) unmodified (or at least
>> only with latent bugs fixed) too.
>
> There was a question mark over whether unmodified PV guests would tolerate
> running in ring 0, rather than entirely in ring 3. I believe we're confident
> it should work, and thus supporting classic-Xen guests should certainly be
> the aim.

A guest which does XENFEAT_supervisor_mode_kernel (and perhaps one or two
other XENFEATs) should work, but that was the primary source of the latent
bugs I was thinking of...

In particular the pvops kernel probably doesn't do all the right things for
XENFEAT_supervisor_mode_kernel, since it has never been run that way, but it
also doesn't advertise it via XEN_ELFNOTE_FEATURES, so we can at least detect
when it is safe to enable the container from the builder side.

>> Getting something working with a modified guest seems like a useful
>> first step (to get to a working baseline) but I'm not sure it should be
>> the end goal.
>
> I certainly don't think we should commit such a thing without careful
> thought.

Absolutely.

Ian.

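For context, XENFEAT_supervisor_mode_kernel is an ordinary Xen feature bit,
so besides the ELF-note advertisement Ian mentions, a kernel can also check
at run time whether it has been given a ring-0 ("supervisor mode")
environment. A minimal sketch, modelled on what Linux's xen_setup_features()
does with the XENVER_get_features interface (the helper name is
illustrative):

#include <linux/types.h>
#include <xen/interface/version.h>
#include <xen/interface/features.h>
#include <asm/xen/hypercall.h>

static bool running_with_supervisor_mode_kernel(void)
{
	/* Feature bits 0..31 live in submap index 0. */
	struct xen_feature_info fi = { .submap_idx = 0 };

	if (HYPERVISOR_xen_version(XENVER_get_features, &fi) < 0)
		return false;

	/* Set when Xen runs this kernel in ring 0, e.g. inside the container. */
	return fi.submap & (1U << XENFEAT_supervisor_mode_kernel);
}
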
On Tue, 28 Jun 2011, Keir Fraser wrote:
> On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
>> Actually, I modified the PVops guest. The changes in the pvops are
>> minimal and mostly confined to xen specific files. So I think it has
>> a fair shot of being upstreamed; at least, it's worth a shot. I will run
>> them by Jeremy/Konrad and get their opinions.
>
> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
> mess.

I very much agree on this point.

However it could still be useful at the very least to run a 64-bit hvm
dom0 (assuming there is a significant performance improvement in doing
so, compared to a traditional 64-bit dom0).

On Tue, 2011-06-28 at 11:46 +0100, Stefano Stabellini wrote:
> On Tue, 28 Jun 2011, Keir Fraser wrote:
>> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
>> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
>> mess.
>
> I very much agree on this point.
>
> However it could still be useful at the very least to run a 64-bit hvm
> dom0 (assuming there is a significant performance improvement in doing
> so, compared to a traditional 64-bit dom0).

That case is no different to the guest case in this respect, so we should
still be aiming for not needing to modify the kernel. We certainly don't
want a new special-case HVM-PV for dom0 only!

Ian.

On Tue, 28 Jun 2011 09:31:57 +0100
Ian Campbell <Ian.Campbell@citrix.com> wrote:

> Great stuff! I've been eagerly awaiting this functionality ;-)
>
> Do you have any timeline for when you think you might post the code?
>
> I presume you managed to avoid bouncing through the hypervisor for
> syscalls?

Yup, that was the primary goal. DB benchmarks suffered quite a bit because
of syscall overhead.

thanks,
Mukesh

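The reason the bounce goes away: with the kernel in ring 0 inside the VMX
container, it can own the SYSCALL entry MSRs itself, the way native Linux's
syscall_init() does, instead of registering its entry points with Xen and
paying a hypervisor round trip on every system call. A rough sketch of the
native-style setup (system_call is the 2.6.39-era 64-bit entry symbol; the
wrapper function itself is illustrative):

#include <linux/types.h>
#include <asm/msr.h>
#include <asm/segment.h>
#include <asm/processor-flags.h>

extern void system_call(void);	/* arch/x86/kernel/entry_64.S */

static void setup_native_syscall_entry(void)
{
	/* STAR: CS/SS selector bases used by SYSCALL/SYSRET. */
	wrmsrl(MSR_STAR, ((u64)__USER32_CS << 48) | ((u64)__KERNEL_CS << 32));

	/* LSTAR: 64-bit SYSCALL target, i.e. the guest's own entry point. */
	wrmsrl(MSR_LSTAR, (unsigned long)system_call);

	/* RFLAGS bits cleared on syscall entry (traps and interrupts off). */
	wrmsrl(MSR_SYSCALL_MASK, X86_EFLAGS_TF | X86_EFLAGS_DF | X86_EFLAGS_IF);
}
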
On Tue, 28 Jun 2011 08:46:08 +0100
Keir Fraser <keir.xen@gmail.com> wrote:

> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM
> guests. I'm not sure that adding explicitly HVM-PV guests as well
> isn't just a bloody mess.

Could we perhaps define a HYBRID type that will have characteristics like:
it runs in an HVM container, it doesn't use EPT, it uses the HVM callback,
etc.? We can then modify it without defining any new types in future, say
if we find it works better with EPT under certain circumstances, etc. What
do you think?

thanks,
Mukesh

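A purely hypothetical sketch of the kind of "characteristics" set being
suggested here: rather than another hard-wired guest type, the toolstack
could pass a bitmap describing which hardware and PV pieces the domain uses,
so the mix can change later without new types. None of these names exist in
Xen; they are illustration only.

#include <stdint.h>

#define XEN_HYBRID_HVM_CONTAINER  (1u << 0)  /* runs inside a VMX/SVM container   */
#define XEN_HYBRID_USES_EPT       (1u << 1)  /* HAP (EPT/NPT) instead of PV MMU   */
#define XEN_HYBRID_HVM_CALLBACK   (1u << 2)  /* events via HVMIRQ_callback_vector */
#define XEN_HYBRID_NATIVE_SYSCALL (1u << 3)  /* guest owns the SYSCALL MSRs       */

struct xen_hybrid_config {
	uint32_t features;	/* OR of the XEN_HYBRID_* bits above */
};

/* Example: today's prototype, i.e. container + HVM callback, PV page tables. */
static const struct xen_hybrid_config prototype_cfg = {
	.features = XEN_HYBRID_HVM_CONTAINER | XEN_HYBRID_HVM_CALLBACK |
		    XEN_HYBRID_NATIVE_SYSCALL,
};
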
On 28/06/2011 19:32, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

>> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM
>> guests. I'm not sure that adding explicitly HVM-PV guests as well
>> isn't just a bloody mess.
>
> Could we perhaps define a HYBRID type that will have characteristics like:
> it runs in an HVM container, it doesn't use EPT, it uses the HVM callback,
> etc.? We can then modify it without defining any new types in future, say
> if we find it works better with EPT under certain circumstances, etc. What
> do you think?

Yes, I don't mind the idea of some HVM extensions for performance, and that
will probably become increasingly important. I just think we should support
unmodified PV as a baseline, with best performance possible (i.e., the basic
HVM container approach should support it).

 -- Keir

On Mon, 27 Jun 2011 12:24:04 -0700
Mukesh Rathor <mukesh.rathor@oracle.com> wrote:

> Hi guys,
>
> Cheers!! I got the PV in HVM container prototype working with a single
> VCPU (pinned to a cpu). Basically, I create a VMX container just like for
> an HVM guest (with some differences that I'll share soon when I clean up
> the code). The PV guest starts in protected mode with the usual entry
> point startup_xen().
>
> 0. The guest kernel runs in ring 0, CS:0x10.

JFYI.. as expected, running in ring 0 and not bouncing syscalls thru xen,
syscalls do very well. fork/execs are slow, probably because VPIDs are
turned off right now. I'm trying to figure VPIDs out, and hopefully that
would help. BTW, don't compare to anything else; both kernels below are
unoptimized debug kernels.

LMbench:

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host      OS            Mhz  null null open selct   sig  sig  fork exec   sh
                             call I/O  stat clos    TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ------ ---- ---- ---- ---- ----
STOCK     Linux 2.6.39+ 2771 0.68 0.91 2.13 4.45  4.251 0.82 3.87 433. 1134 3145
HYBRID    Linux 2.6.39m 2745 0.13 0.22 0.88 2.04  3.287 0.28 1.11 526. 1393 3923

thanks,
Mukesh

> JFYI.. as expected, running in ring 0 and not bouncing syscalls thru xen,
> syscalls do very well. fork/execs are slow, probably because VPIDs are
> turned off right now. I'm trying to figure VPIDs out, and hopefully that
> would help. BTW, don't compare to anything else; both kernels below are
> unoptimized debug kernels.

JFYI again, I seem to have caught up with pure PV on almost all with some
optimizations:

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host      OS            Mhz  null null open selct   sig  sig  fork exec   sh
                             call I/O  stat clos    TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ------ ---- ---- ---- ---- ----
STOCK:    Linux 2.6.39+ 2771 0.68 0.91 2.13 4.45  4.251 0.82 3.87 433. 1134 3145
N4        Linux 2.6.39m 2745 0.13 0.21 0.86 2.03  3.279 0.28 1.18 479. 1275 3502
N5        Linux 2.6.39m 2752 0.13 0.21 0.91 2.07  3.284 0.28 1.14 439. 1168 3155

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host      OS            2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
STOCK:    Linux 2.6.39+ 5.800 6.2400 6.8700 6.6700 8.4600 7.13000 8.63000
N4        Linux 2.6.39m 6.420 6.9300 8.0100 7.2600 8.7600 7.97000 9.25000
N5        Linux 2.6.39m 6.650 7.0000 7.8400 7.3900 8.8000 7.90000 9.06000

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host      OS            2p/0K  Pipe  AF    UDP  RPC/  TCP  RPC/  TCP
                        ctxsw        UNIX       UDP        TCP   conn
--------- ------------- ----- ----- ----- ---- ----- ----- ----- ----
STOCK:    Linux 2.6.39+ 5.800  18.9  22.3 28.7  32.8  34.9  44.6 89.8
N4        Linux 2.6.39m 6.420  17.1  18.1 26.9  28.7  34.2  40.1 76.3
N5        Linux 2.6.39m 6.650  18.1  17.7 24.4  33.4  33.9  40.7 76.7

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host      OS            0K File       10K File      Mmap    Prot  Page
                        Create Delete Create Delete Latency Fault Fault
--------- ------------- ------ ------ ------ ------ ------- ----- -------
STOCK:    Linux 2.6.39+                              3264.0 0.828 3.00000
N4        Linux 2.6.39m                              3990.0 1.351 4.00000
N5        Linux 2.6.39m                              3362.0 0.235 4.00000

where the only difference between N4 and N5 is that in N5 I've enabled
vmexits only for page faults on write protection, ie, err code 0x3.

I'm trying to figure out how the vtlb implementation relates to SDM 28.3.5.
It seems in xen, vtlb is mostly for shadows, glancing at the code, which I
am not worrying about for now (I've totally ignored migration for now).
Any thoughts anybody?

Also, at present I am not using vtsc; is it worth looking into? Some of
the tsc stuff makes my head spin just like the shadow code does :)...

thanks,
Mukesh

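For reference, the N5 setting above maps onto two VMCS fields: with bit 14
(#PF) set in the exception bitmap, a page fault only causes a vmexit when
(error code & mask) == match, so programming both fields to 0x3 (present +
write) restricts exits to write-protection faults. A minimal sketch using
the field names from Xen's vmcs.h (the function is illustrative, not the
actual patch, and a real version would OR into the existing exception
bitmap):

/* Xen-tree headers; exact locations are an assumption of this sketch. */
#include <asm/processor.h>      /* TRAP_page_fault, PFEC_* */
#include <asm/hvm/vmx/vmcs.h>   /* VMCS field encodings */
#include <asm/hvm/vmx/vmx.h>    /* __vmwrite() */

static void vmx_restrict_pf_exits_to_write_faults(void)
{
    /* Intercept #PF (bit 14 of the exception bitmap)... */
    __vmwrite(EXCEPTION_BITMAP, 1U << TRAP_page_fault);

    /* ...but only when the fault is a write to a present mapping. */
    __vmwrite(PAGE_FAULT_ERROR_CODE_MASK,  PFEC_page_present | PFEC_write_access);
    __vmwrite(PAGE_FAULT_ERROR_CODE_MATCH, PFEC_page_present | PFEC_write_access);
}
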
On 09/07/2011 02:53, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> where the only difference between N4 and N5 is that in N5 I've enabled
> vmexits only for page faults on write protection, ie, err code 0x3.
>
> I'm trying to figure out how the vtlb implementation relates to SDM 28.3.5.
> It seems in xen, vtlb is mostly for shadows, glancing at the code, which I
> am not worrying about for now (I've totally ignored migration for now).
> Any thoughts anybody?

You don't have to understand it very much. Trapping on write faults from
supervisor mode only is fine for normal operation, and you'll have to fault
on everything during live migration (since shadow page tables are built up
via read & write demand faults).

> Also, at present I am not using vtsc; is it worth looking into? Some of
> the tsc stuff makes my head spin just like the shadow code does :)...

You have to understand that even less. For pure PV, CR4.TSD gets set
appropriately for the VTSC mode. You can hook off that, or duplicate that,
to enable/disable RDTSC exiting instead. You don't have to actually *do* any
vtsc work, as no doubt you jump out of your wrapper code into the proper PV
paths for actually getting any real hypervisor work done (like vtsc).

 -- Keir

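The CR4.TSD hook Keir describes maps onto the CPU-based VM-execution control
for RDTSC exiting. A simplified sketch, modelled on the rdtsc-exiting toggle
that Xen's VMX code provides (vmcs_enter/exit and error handling omitted;
treat the exact field and flag names as assumptions from Xen's vmcs.h):

#include <xen/sched.h>          /* struct vcpu */
#include <asm/hvm/vmx/vmcs.h>   /* CPU_BASED_* flags, exec_control */
#include <asm/hvm/vmx/vmx.h>    /* __vmwrite() */

static void hybrid_set_rdtsc_exiting(struct vcpu *v, bool_t enable)
{
    u32 ctl = v->arch.hvm_vmx.exec_control;

    /* Mirror the PV vtsc decision: trap RDTSC only when vtsc is in use. */
    if ( enable )
        ctl |= CPU_BASED_RDTSC_EXITING;
    else
        ctl &= ~CPU_BASED_RDTSC_EXITING;

    v->arch.hvm_vmx.exec_control = ctl;
    __vmwrite(CPU_BASED_VM_EXEC_CONTROL, ctl);
}
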
Hi folks,

Well, I did some benchmarking and found interesting results. The following
runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted with
maxcpus=2 and the entire RAM. All guests were started with 1 vcpu and 2GB
RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB and
1 cpu. The HVM guest has EPT enabled. HT is on.

So, unless the NUMA'ness interfered with results (using some memory on the
remote socket), it appears HVM does very well. To the point that it seems a
hybrid is not going to be worth it. I am currently running tests on a single
socket system just to be sure.

I am attaching my diffs in case any one wants to see what I did. I used
xen 4.0.2 and linux 2.6.39.

thanks,
Mukesh

              L M B E N C H  3 . 0   S U M M A R Y

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host      OS            Mhz  null null open slct  sig  sig  fork exec   sh
                             call I/O  stat clos  TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
PV        Linux 2.6.39f 2639 0.65 0.88 2.14 4.59 3.77 0.79 3.62 535. 1294 3308
Hybrid    Linux 2.6.39f 2639 0.13 0.21 0.89 1.96 3.08 0.24 1.10 529. 1294 3246
HVM       Linux 2.6.39f 2639 0.12 0.21 0.64 1.76 3.04 0.24 3.37 113. 354. 1324
Baremetal Linux 2.6.39+ 2649 0.13 0.23 0.74 1.93 3.46 0.28 1.58 127. 386. 1434

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host      OS            intgr  intgr  intgr  intgr  intgr
                          bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
PV        Linux 2.6.39f 0.3800 0.0100 0.1700 9.1000 9.0400
Hybrid    Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0300
HVM       Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0600
Baremetal Linux 2.6.39+ 0.3800 0.0100 0.1700 9.0600 8.9800

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host      OS            float  float  float  float
                          add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
PV        Linux 2.6.39f 1.1300 1.5200 5.6200 5.2900
Hybrid    Linux 2.6.39f 1.1300 1.5200 5.6300 5.2900
HVM       Linux 2.6.39f 1.1400 1.5200 5.6300 5.3000
Baremetal Linux 2.6.39+ 1.1300 1.5100 5.6000 5.2700

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host      OS            double double double double
                           add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
PV        Linux 2.6.39f 1.1300 1.9000 8.6400 8.3200
Hybrid    Linux 2.6.39f 1.1400 1.9000 8.6600 8.3200
HVM       Linux 2.6.39f 1.1400 1.9000 8.6600 8.3300
Baremetal Linux 2.6.39+ 1.1300 1.8900 8.6100 8.2800

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host      OS            2p/0K  2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw   ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
PV        Linux 2.6.39f 5.2800 5.7600 6.3600 6.3200 7.3600 6.69000 7.46000
Hybrid    Linux 2.6.39f 4.9200 4.9300 5.2200 5.7600 6.9600 6.12000 7.31000
HVM       Linux 2.6.39f 1.3100 1.2200 1.6200 1.9200 3.2600 2.23000 3.48000
Baremetal Linux 2.6.39+ 1.5500 1.4100 2.0600 2.2500 3.3900 2.44000 3.38000

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host      OS            2p/0K  Pipe  AF    UDP   RPC/  TCP   RPC/  TCP
                        ctxsw        UNIX        UDP         TCP   conn
--------- ------------- ----- ----- ----- ----- ----- ----- ----- ----
PV        Linux 2.6.39f 5.280  16.6  21.3  25.9  33.7  34.7  41.8  87.
Hybrid    Linux 2.6.39f 4.920  11.2  14.4  19.6  26.1  27.5  32.9  71.
HVM       Linux 2.6.39f 1.310 4.416  6.15 9.386  14.8  15.8  20.1  45.
Baremetal Linux 2.6.39+ 1.550 4.625  7.34  14.3  19.8  21.4  26.4  66.

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host      OS            0K File       10K File      Mmap    Prot  Page    100fd
                        Create Delete Create Delete Latency Fault Fault   selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
PV        Linux 2.6.39f                              24.0K  0.746 3.55870 2.184
Hybrid    Linux 2.6.39f                              24.6K  0.238 4.00100 1.480
HVM       Linux 2.6.39f                              4716.0 0.202 0.96600 1.468
Baremetal Linux 2.6.39+                              6898.0 0.325 0.93610 1.620

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host      OS            Pipe  AF   TCP  File   Mmap   Bcopy  Bcopy  Mem   Mem
                              UNIX      reread reread (libc) (hand) read  write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
PV        Linux 2.6.39f 1661 2081 1041 3293.3 5528.3 3106.6 2800.0 4472 5633.
Hybrid    Linux 2.6.39f 1974 2450 1183 3481.5 5529.6 3114.9 2786.6 4470 5672.
HVM       Linux 2.6.39f 3232 2929 1622 3541.3 5527.5 3077.1 2765.6 4453 5634.
Baremetal Linux 2.6.39+ 3320 2800 1666 3523.6 5578.9 3147.0 2841.6 4541 5752.

Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host      OS            Mhz   L1 $    L2 $    Main mem  Rand mem  Guesses
--------- ------------- ----  ------  ------  --------  --------  -------
PV        Linux 2.6.39f 2639  1.5160  5.9170  29.7       97.5
Hybrid    Linux 2.6.39f 2639  1.5170  7.5000  29.7       97.4
HVM       Linux 2.6.39f 2639  1.5190  4.0210  29.8      105.4
Baremetal Linux 2.6.39+ 2649  1.5090  3.8370  29.2       78.0

On Thu, 28 Jul 2011, Mukesh Rathor wrote:
> Hi folks,
>
> Well, I did some benchmarking and found interesting results. The following
> runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted with
> maxcpus=2 and the entire RAM. All guests were started with 1 vcpu and 2GB
> RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB and
> 1 cpu. The HVM guest has EPT enabled. HT is on.
>
> So, unless the NUMA'ness interfered with results (using some memory on the
> remote socket), it appears HVM does very well. To the point that it seems a
> hybrid is not going to be worth it. I am currently running tests on a single
> socket system just to be sure.

The high level benchmarks I run to compare PV and PV on HVM guests show
a very similar scenario.

It is still worth having HYBRID guests (running with EPT?) in order to
support dom0 in an HVM container one day not too far from now.

On Wed, Jul 27, 2011 at 06:58:28PM -0700, Mukesh Rathor wrote:
> Hi folks,
>
> Well, I did some benchmarking and found interesting results. The following
> runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted with
> maxcpus=2 and the entire RAM. All guests were started with 1 vcpu and 2GB
> RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB and
> 1 cpu. The HVM guest has EPT enabled. HT is on.

Is this PVonHVM? Or is it real HVM without _any_ PV enablement? Ah, the
.config tells me it is PVonHVM - so IRQ callbacks and timers are PV,
actually.

> So, unless the NUMA'ness interfered with results (using some memory on the
> remote socket), it appears HVM does very well. To the point that it seems a
> hybrid is not going to be worth it. I am currently running tests on a single
> socket system just to be sure.

The xm has some NUMA capability while xl does not. Did you use xm or xl to
run this?

> I am attaching my diffs in case any one wants to see what I did. I used
> xen 4.0.2 and linux 2.6.39.

Wow. That is a surprisingly compact set of changes to the Linux kernel.
Good job.

> Processor, Processes - times in microseconds - smaller is better
> ------------------------------------------------------------------------------
> Host      OS            Mhz  null null open slct  sig  sig  fork exec   sh
>                              call I/O  stat clos  TCP  inst hndl proc proc proc
> --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> PV        Linux 2.6.39f 2639 0.65 0.88 2.14 4.59 3.77 0.79 3.62 535. 1294 3308
> Hybrid    Linux 2.6.39f 2639 0.13 0.21 0.89 1.96 3.08 0.24 1.10 529. 1294 3246

Hm, so it follows baremetal until fork/exec/sh, at which point it is as bad
as PV.

> HVM       Linux 2.6.39f 2639 0.12 0.21 0.64 1.76 3.04 0.24 3.37 113. 354. 1324

<blinks> So HVM is better than baremetal?

> Baremetal Linux 2.6.39+ 2649 0.13 0.23 0.74 1.93 3.46 0.28 1.58 127. 386. 1434
>
> Context switching - times in microseconds - smaller is better
> -------------------------------------------------------------------------
> Host      OS            2p/0K  2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
>                         ctxsw   ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
> --------- ------------- ------ ------ ------ ------ ------ ------- -------
> PV        Linux 2.6.39f 5.2800 5.7600 6.3600 6.3200 7.3600 6.69000 7.46000
> Hybrid    Linux 2.6.39f 4.9200 4.9300 5.2200 5.7600 6.9600 6.12000 7.31000

So the diff between PV and Hybrid looks to be 8%.. and then a ~50% difference
between Hybrid and baremetal. So syscall is only causing an 8% drop in
performance - what is the other 42%?

> HVM       Linux 2.6.39f 1.3100 1.2200 1.6200 1.9200 3.2600 2.23000 3.48000

This is really bizarre. HVM kicks baremetal butt?

> Baremetal Linux 2.6.39+ 1.5500 1.4100 2.0600 2.2500 3.3900 2.44000 3.38000
>
> File & VM system latencies in microseconds - smaller is better
> -------------------------------------------------------------------------------
> Host      OS            0K File       10K File      Mmap    Prot  Page    100fd
>                         Create Delete Create Delete Latency Fault Fault   selct
> --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
> PV        Linux 2.6.39f                              24.0K  0.746 3.55870 2.184
> Hybrid    Linux 2.6.39f                              24.6K  0.238 4.00100 1.480

Could the mmap and the pagetable creations be the fault (ha! a pun!) of the
sucky performance? Perhaps running with autotranslate pagetables would
eliminate this? Is the mmap doing small little 4K runs or something much
bigger?

> HVM       Linux 2.6.39f                              4716.0 0.202 0.96600 1.468
> Baremetal Linux 2.6.39+                              6898.0 0.325 0.93610 1.620
>
> Memory latencies in nanoseconds - smaller is better
> (WARNING - may not be correct, check graphs)
> ------------------------------------------------------------------------------
> Host      OS            Mhz   L1 $    L2 $    Main mem  Rand mem  Guesses
> --------- ------------- ----  ------  ------  --------  --------  -------
> PV        Linux 2.6.39f 2639  1.5160  5.9170  29.7       97.5
> Hybrid    Linux 2.6.39f 2639  1.5170  7.5000  29.7       97.4
> HVM       Linux 2.6.39f 2639  1.5190  4.0210  29.8      105.4
> Baremetal Linux 2.6.39+ 2649  1.5090  3.8370  29.2       78.0

OK, so once you have access to the memory, using it under PV is actually OK.

On Thu, Jul 28, 2011 at 12:34:42PM +0100, Stefano Stabellini wrote:
> On Thu, 28 Jul 2011, Mukesh Rathor wrote:
> > So, unless the NUMA'ness interfered with results (using some memory on the
> > remote socket), it appears HVM does very well. To the point that it seems a
> > hybrid is not going to be worth it. I am currently running tests on a
> > single socket system just to be sure.
>
> The high level benchmarks I run to compare PV and PV on HVM guests show
> a very similar scenario.
>
> It is still worth having HYBRID guests (running with EPT?) in order to
> support dom0 in an HVM container one day not too far from now.

I am just wondering how much dom0 cares about this? I mean, if you use
blkback, netback, etc. - they are all in the kernel. The device drivers
are also in the kernel.

Based on Mukesh's results the PVonHVM work you did really paid itself off
for guests. The numbers are even better than baremetal.. which I am a bit
surprised by - maybe the collecting of the data (gettimeofday) is not the
best?

On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 28, 2011 at 12:34:42PM +0100, Stefano Stabellini wrote:
> > It is still worth having HYBRID guests (running with EPT?) in order to
> > support dom0 in an HVM container one day not too far from now.
>
> I am just wondering how much dom0 cares about this? I mean, if you use
> blkback, netback, etc. - they are all in the kernel. The device drivers
> are also in the kernel.

There are always going to be some userspace processes, even with stubdoms.

Besides, if we have an HVM dom0, we can enable
XENFEAT_auto_translated_physmap and EPT and have the same level of
performance as a PV on HVM guest. Moreover, since we wouldn't be using
the mmu pvops anymore, we could drop them completely: that would greatly
simplify the Xen maintenance in the Linux kernel as well as gain back
some love from the x86 maintainers :)

The way I see it, normal Linux guests would be PV on HVM guests, but we
still need to do something about dom0.

This work would make dom0 exactly like PV on HVM guests apart from the boot
sequence: dom0 would still boot from xen_start_kernel, everything else would
be pretty much the same.

I would ask you to run some benchmarks using
XENFEAT_auto_translated_physmap, but I am afraid it bitrotted over the
years, so it would need some work to get it working.

> > I am just wondering how much dom0 cares about this? I mean, if you use
> > blkback, netback, etc. - they are all in the kernel. The device drivers
> > are also in the kernel.
>
> There are always going to be some userspace processes, even with
> stubdoms.

Stubdomains? Linux HVMs now have PVonHVM - and for Windows there are a
multitude of PV drivers available? But sure, there are some processes -
like snort or other packet-filtering userland software.

> Besides, if we have an HVM dom0, we can enable
> XENFEAT_auto_translated_physmap and EPT and have the same level of
> performance as a PV on HVM guest. Moreover, since we wouldn't be using
> the mmu pvops anymore, we could drop them completely: that would greatly

Sure. It also means you MUST have an IOMMU in the box.

> simplify the Xen maintenance in the Linux kernel as well as gain back
> some love from the x86 maintainers :)
>
> The way I see it, normal Linux guests would be PV on HVM guests, but we
> still need to do something about dom0.
> This work would make dom0 exactly like PV on HVM guests apart from
> the boot sequence: dom0 would still boot from xen_start_kernel,
> everything else would be pretty much the same.

Ah, so not HVM exactly (you would only use the EPT/NPT/RVI/HAP for
pagetables).. and PV for startup, spinlock, timers, debug, CPU, and
backends. Though sticking in the HVM container in PV that Mukesh made work
would also benefit.

Or just come back to the idea of "real" HVM device driver domains and have
the PV dom0 be a light one loading the rest. But the setup of it is just so
complex.. And the PV dom0 needs to deal with the PCI backend xenstore, and
be able to comprehend ACPI _PRT... and then launch the "device driver"
dom0, which in its simplest form would have all of the devices passed in to
it.

So four payloads: PV dom0, PV dom0 initrd, HVM dom0, HVM dom0 initrd :-)
Ok, that is too cumbersome. Maybe ingest the PV dom0+initrd in the Xen
hypervisor binary.. I should stop here.

On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> > Besides, if we have an HVM dom0, we can enable
> > XENFEAT_auto_translated_physmap and EPT and have the same level of
> > performance as a PV on HVM guest. Moreover, since we wouldn't be using
> > the mmu pvops anymore, we could drop them completely: that would greatly
>
> Sure. It also means you MUST have an IOMMU in the box.

Why? We can still remap interrupts into event channels.
Maybe you mean VMX?

> Ah, so not HVM exactly (you would only use the EPT/NPT/RVI/HAP for
> pagetables).. and PV for startup, spinlock, timers, debug, CPU, and
> backends. Though sticking in the HVM container in PV that Mukesh made work
> would also benefit.

Yes for startup, spinlock, timers and backends. I would use HVM for cpu
operations too (no need for pv_cpu_ops.write_gdt_entry anymore, for
example).

> Or just come back to the idea of "real" HVM device driver domains and have
> the PV dom0 be a light one loading the rest. But the setup of it is just so
> complex.. And the PV dom0 needs to deal with the PCI backend xenstore, and
> be able to comprehend ACPI _PRT... and then launch the "device driver"
> dom0, which in its simplest form would have all of the devices passed in to
> it.
>
> So four payloads: PV dom0, PV dom0 initrd, HVM dom0, HVM dom0 initrd :-)
> Ok, that is too cumbersome. Maybe ingest the PV dom0+initrd in the Xen
> hypervisor binary.. I should stop here.

The goal of splitting up dom0 into multiple management domains is surely a
worthy goal, no matter if the domains are PV or HVM or PV on HVM, but yeah,
the setup is hard. I hope that we'll be able to simplify it in the near
future, maybe after the switchover to the new qemu and seabios is completed.

On Fri, Jul 29, 2011 at 07:00:07PM +0100, Stefano Stabellini wrote:
> On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> > Sure. It also means you MUST have an IOMMU in the box.
>
> Why?

For HVM dom0s.. But I think when you say HVM here, you mean using PV with
the hypervisor's code that is used for managing page-tables - EPT/NPT/HAP.

So PV+HAP = Stefano's HVM :-)

> We can still remap interrupts into event channels.
> Maybe you mean VMX?
>
> Yes for startup, spinlock, timers and backends. I would use HVM for cpu
> operations too (no need for pv_cpu_ops.write_gdt_entry anymore, for
> example).

OK, so an SVM/VMX setup is required.

On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> For HVM dom0s.. But I think when you say HVM here, you mean using PV with
> the hypervisor's code that is used for managing page-tables - EPT/NPT/HAP.
>
> So PV+HAP = Stefano's HVM :-)

:-)

> > Yes for startup, spinlock, timers and backends. I would use HVM for cpu
> > operations too (no need for pv_cpu_ops.write_gdt_entry anymore, for
> > example).
>
> OK, so an SVM/VMX setup is required.

Yes, I would actually require SVM/VMX for performance and to simplify the
setup and maintenance of the code in general.

On Thu, 2011-07-28 at 12:34 +0100, Stefano Stabellini wrote:
> On Thu, 28 Jul 2011, Mukesh Rathor wrote:
> > So, unless the NUMA'ness interfered with results (using some memory on the
> > remote socket), it appears HVM does very well. To the point that it seems a
> > hybrid is not going to be worth it. I am currently running tests on a
> > single socket system just to be sure.
>
> The high level benchmarks I run to compare PV and PV on HVM guests show
> a very similar scenario.
>
> It is still worth having HYBRID guests (running with EPT?) in order to
> support dom0 in an HVM container one day not too far from now.

I think it is also worth bearing in mind that once we have basic support for
HYBRID we can begin looking at/measuring which hardware features offer
advantages to PV guests and enhancing the PV interfaces for use by HYBRID
guests etc. (i.e. make things truly hybrid PV+Hardware and not just
contained PV).

Also there are arguments to be made for HYBRID over PVHVM in terms of ease
of manageability (i.e. a lot of folks like the dom0-supplied kernel idiom
which PV enables), avoiding the need for a virtualised BIOS and emulated
boot paths; HYBRID can potentially give a best of both in the trade off
between standard-PV vs. HVM/PVHVM while also not needing a QEMU process for
each guest (which helps scalability and so on), etc. I think HYBRID is
worthwhile even if it is basically on par with PVHVM for some workloads.

Ian.

On 08/09/2011 01:54 AM, Ian Campbell wrote:
> Also there are arguments to be made for HYBRID over PVHVM in terms of ease
> of manageability (i.e. a lot of folks like the dom0-supplied kernel idiom
> which PV enables), avoiding the need for a virtualised BIOS and emulated
> boot paths; HYBRID can potentially give a best of both in the trade off
> between standard-PV vs. HVM/PVHVM while also not needing a QEMU process for
> each guest (which helps scalability and so on), etc. I think HYBRID is
> worthwhile even if it is basically on par with PVHVM for some workloads.

And it's amazing how much stuff goes away when you can set CONFIG_PCI=n...

    J

Alright, got hybrid-with-EPT numbers in now from my prototype; it needs
some perf work.. Attaching the diffs from my prototype. Linux: 2.6.39.
Xen 4.0.2.

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host      OS            Mhz  null null open slct  sig  sig  fork exec   sh
                             call I/O  stat clos  TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
PV        Linux 2.6.39f 2639 0.65 0.88 2.14 4.59 3.77 0.79 3.62 535. 1294 3308
Hybrid    Linux 2.6.39f 2639 0.13 0.21 0.89 1.96 3.08 0.24 1.10 529. 1294 3246
HVM       Linux 2.6.39f 2639 0.12 0.21 0.64 1.76 3.04 0.24 3.37 113. 354. 1324
Baremetal Linux 2.6.39+ 2649 0.13 0.23 0.74 1.93 3.46 0.28 1.58 127. 386. 1434
HYB-EPT   Linux 2.6.39f 2639 0.13 0.21 0.68 1.95 3.04 0.25 3.09 145. 452. 1542

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host      OS            intgr  intgr  intgr  intgr  intgr
                          bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
PV        Linux 2.6.39f 0.3800 0.0100 0.1700 9.1000 9.0400
Hybrid    Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0300
HVM       Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0600
Baremetal Linux 2.6.39+ 0.3800 0.0100 0.1700 9.0600 8.9800
HYB-EPT   Linux 2.6.39f 0.3800 0.0100 0.1700 9.1200 9.0500

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host      OS            float  float  float  float
                          add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
PV        Linux 2.6.39f 1.1300 1.5200 5.6200 5.2900
Hybrid    Linux 2.6.39f 1.1300 1.5200 5.6300 5.2900
HVM       Linux 2.6.39f 1.1400 1.5200 5.6300 5.3000
Baremetal Linux 2.6.39+ 1.1300 1.5100 5.6000 5.2700
HYB-EPT   Linux 2.6.39f 1.1400 1.5200 5.6300 5.3000

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host      OS            double double double double
                           add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
PV        Linux 2.6.39f 1.1300 1.9000 8.6400 8.3200
Hybrid    Linux 2.6.39f 1.1400 1.9000 8.6600 8.3200
HVM       Linux 2.6.39f 1.1400 1.9000 8.6600 8.3300
Baremetal Linux 2.6.39+ 1.1300 1.8900 8.6100 8.2800
HYB-EPT   Linux 2.6.39f 1.1400 1.9000 8.6600 8.3300

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host      OS            2p/0K  2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw   ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
PV        Linux 2.6.39f 5.2800 5.7600 6.3600 6.3200 7.3600 6.69000 7.46000
Hybrid    Linux 2.6.39f 4.9200 4.9300 5.2200 5.7600 6.9600 6.12000 7.31000
HVM       Linux 2.6.39f 1.3100 1.2200 1.6200 1.9200 3.2600 2.23000 3.48000
Baremetal Linux 2.6.39+ 1.5500 1.4100 2.0600 2.2500 3.3900 2.44000 3.38000
HYB-EPT   Linux 2.6.39f 3.2000 3.6100 4.1700 4.3600 6.1200 4.81000 6.20000

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host      OS            2p/0K  Pipe  AF    UDP   RPC/  TCP   RPC/  TCP
                        ctxsw        UNIX        UDP         TCP   conn
--------- ------------- ----- ----- ----- ----- ----- ----- ----- ----
PV        Linux 2.6.39f 5.280  16.6  21.3  25.9  33.7  34.7  41.8  87.
Hybrid    Linux 2.6.39f 4.920  11.2  14.4  19.6  26.1  27.5  32.9  71.
HVM       Linux 2.6.39f 1.310 4.416  6.15 9.386  14.8  15.8  20.1  45.
Baremetal Linux 2.6.39+ 1.550 4.625  7.34  14.3  19.8  21.4  26.4  66.
HYB-EPT   Linux 2.6.39f 3.200 8.669  15.3  17.5  23.5  25.1  30.4  66.

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host      OS            0K File       10K File      Mmap    Prot  Page    100fd
                        Create Delete Create Delete Latency Fault Fault   selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
PV        Linux 2.6.39f                              24.0K  0.746 3.55870 2.184
Hybrid    Linux 2.6.39f                              24.6K  0.238 4.00100 1.480
HVM       Linux 2.6.39f                              4716.0 0.202 0.96600 1.468
Baremetal Linux 2.6.39+                              6898.0 0.325 0.93610 1.620
HYB-EPT   Linux 2.6.39f                              5321.0 0.347 1.19510 1.480

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host      OS            Pipe  AF   TCP  File   Mmap   Bcopy  Bcopy  Mem   Mem
                              UNIX      reread reread (libc) (hand) read  write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
PV        Linux 2.6.39f 1661 2081 1041 3293.3 5528.3 3106.6 2800.0 4472 5633.
Hybrid    Linux 2.6.39f 1974 2450 1183 3481.5 5529.6 3114.9 2786.6 4470 5672.
HVM       Linux 2.6.39f 3232 2929 1622 3541.3 5527.5 3077.1 2765.6 4453 5634.
Baremetal Linux 2.6.39+ 3320 2800 1666 3523.6 5578.9 3147.0 2841.6 4541 5752.
HYB-EPT   Linux 2.6.39f 2104 2480 1231 3451.5 5503.4 3067.7 2751.0 4438 5636.

Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host      OS            Mhz   L1 $    L2 $    Main mem  Rand mem  Guesses
--------- ------------- ----  ------  ------  --------  --------  -------
PV        Linux 2.6.39f 2639  1.5160  5.9170  29.7       97.5
Hybrid    Linux 2.6.39f 2639  1.5170  7.5000  29.7       97.4
HVM       Linux 2.6.39f 2639  1.5190  4.0210  29.8      105.4
Baremetal Linux 2.6.39+ 2649  1.5090  3.8370  29.2       78.0
HYB-EPT   Linux 2.6.39f 2639  1.5180  4.0060  29.9      109.9

thanks,
Mukesh

On Wed, 27 Jul 2011 18:58:28 -0700
Mukesh Rathor <mukesh.rathor@oracle.com> wrote:

> Hi folks,
>
> Well, I did some benchmarking and found interesting results. The following
> runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted with
> maxcpus=2 and the entire RAM. All guests were started with 1 vcpu and 2GB
> RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB and
> 1 cpu. The HVM guest has EPT enabled. HT is on.

On Thu, 17 Nov 2011, Mukesh Rathor wrote:
> Alright, got hybrid with EPT numbers in now from my prototype, it needs
> some perf work..

Is HVM a PV on HVM guest or a pure HVM guest (no CONFIG_XEN)?

> Processor, Processes - times in microseconds - smaller is better
> ------------------------------------------------------------------------------
> Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
>                              call  I/O stat clos TCP  inst hndl proc proc proc
> --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> PV        Linux 2.6.39f 2639 0.65 0.88 2.14 4.59 3.77 0.79 3.62 535. 1294 3308
> Hybrid    Linux 2.6.39f 2639 0.13 0.21 0.89 1.96 3.08 0.24 1.10 529. 1294 3246
> HVM       Linux 2.6.39f 2639 0.12 0.21 0.64 1.76 3.04 0.24 3.37 113. 354. 1324
> Baremetal Linux 2.6.39+ 2649 0.13 0.23 0.74 1.93 3.46 0.28 1.58 127. 386. 1434
> HYB-EPT   Linux 2.6.39f 2639 0.13 0.21 0.68 1.95 3.04 0.25 3.09 145. 452. 1542

good, hybrid == HVM in this test

[...]

> Context switching - times in microseconds - smaller is better
> -------------------------------------------------------------------------
> Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
>                          ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
> --------- ------------- ------ ------ ------ ------ ------ ------- -------
> PV        Linux 2.6.39f 5.2800 5.7600 6.3600 6.3200 7.3600 6.69000 7.46000
> Hybrid    Linux 2.6.39f 4.9200 4.9300 5.2200 5.7600 6.9600 6.12000 7.31000
> HVM       Linux 2.6.39f 1.3100 1.2200 1.6200 1.9200 3.2600 2.23000 3.48000
> Baremetal Linux 2.6.39+ 1.5500 1.4100 2.0600 2.2500 3.3900 2.44000 3.38000
> HYB-EPT   Linux 2.6.39f 3.2000 3.6100 4.1700 4.3600 6.1200 4.81000 6.20000

How is it possible that the HYB-EPT numbers here are so much worse
than HVM? Shouldn't they be the same as in the other tests?

> *Local* Communication latencies in microseconds - smaller is better
> ---------------------------------------------------------------------
> Host                 OS 2p/0K  Pipe    AF   UDP  RPC/   TCP  RPC/  TCP
>                         ctxsw        UNIX        UDP          TCP conn
> --------- ------------- ----- ----- ----- ----- ----- ----- ----- ----
> PV        Linux 2.6.39f 5.280  16.6  21.3  25.9  33.7  34.7  41.8  87.
> Hybrid    Linux 2.6.39f 4.920  11.2  14.4  19.6  26.1  27.5  32.9  71.
> HVM       Linux 2.6.39f 1.310 4.416  6.15 9.386  14.8  15.8  20.1  45.
> Baremetal Linux 2.6.39+ 1.550 4.625  7.34  14.3  19.8  21.4  26.4  66.
> HYB-EPT   Linux 2.6.39f 3.200 8.669  15.3  17.5  23.5  25.1  30.4  66.
>
> *Local* Communication bandwidths in MB/s - bigger is better
> -----------------------------------------------------------------------------
> Host                 OS Pipe   AF   TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
>                             UNIX       reread reread (libc) (hand) read write
> --------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
> PV        Linux 2.6.39f 1661 2081 1041 3293.3 5528.3 3106.6 2800.0 4472 5633.
> Hybrid    Linux 2.6.39f 1974 2450 1183 3481.5 5529.6 3114.9 2786.6 4470 5672.
> HVM       Linux 2.6.39f 3232 2929 1622 3541.3 5527.5 3077.1 2765.6 4453 5634.
> Baremetal Linux 2.6.39+ 3320 2800 1666 3523.6 5578.9 3147.0 2841.6 4541 5752.
> HYB-EPT   Linux 2.6.39f 2104 2480 1231 3451.5 5503.4 3067.7 2751.0 4438 5636.

same on these two tests

> Attaching the diffs from my prototype. Linux: 2.6.39.
> Xen 4.0.2.

lin.diff:

> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index e3c6a06..53ceae0 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -110,7 +110,7 @@ struct shared_info *HYPERVISOR_shared_info = (void *)&xen_dummy_shared_info;
>   *
>   * 0: not available, 1: available
>   */
> -static int have_vcpu_info_placement = 1;
> +static int have_vcpu_info_placement = 0;
>
>  static void clamp_max_cpus(void)
>  {
> @@ -195,6 +195,13 @@ static void __init xen_banner(void)
>  	printk(KERN_INFO "Xen version: %d.%d%s%s\n",
>  	       version >> 16, version & 0xffff, extra.extraversion,
>  	       xen_feature(XENFEAT_mmu_pt_update_preserve_ad) ? " (preserve-AD)" : "");
> +
> +	if (xen_hybrid_domain()) {
> +		printk(KERN_INFO "MUK: is MUK HYBRID domain....");
> +		if (xen_feature(XENFEAT_auto_translated_physmap))
> +			printk(KERN_INFO "with EPT...");
> +		printk(KERN_INFO "\n");
> +	}
>  }
>
>  static __read_mostly unsigned int cpuid_leaf1_edx_mask = ~0;
> @@ -222,8 +229,10 @@ static void xen_cpuid(unsigned int *ax, unsigned int *bx,
>  		maskebx = 0;
>  		break;
>  	}
> -
> -	asm(XEN_EMULATE_PREFIX "cpuid"
> +	if (xen_hybrid_domain()) {
> +		native_cpuid(ax, bx, cx, dx);
> +	} else
> +		asm(XEN_EMULATE_PREFIX "cpuid"
>  		: "=a" (*ax),
>  		  "=b" (*bx),
>  		  "=c" (*cx),
> @@ -244,6 +253,7 @@ static __init void xen_init_cpuid_mask(void)
>  		  ~((1 << X86_FEATURE_MCE)  |  /* disable MCE */
>  		    (1 << X86_FEATURE_MCA)  |  /* disable MCA */
>  		    (1 << X86_FEATURE_MTRR) |  /* disable MTRR */
> +		    (1 << X86_FEATURE_PSE)  |  /* disable 2M pages */
>  		    (1 << X86_FEATURE_ACC));   /* thermal monitoring */
>
>  	if (!xen_initial_domain())
> @@ -393,6 +403,10 @@ static void xen_load_gdt(const struct desc_ptr *dtr)
>  			make_lowmem_page_readonly(virt);
>  	}
>
> +	if (xen_hybrid_domain()) {
> +		native_load_gdt(dtr);
> +		return;
> +	}
>  	if (HYPERVISOR_set_gdt(frames, size / sizeof(struct desc_struct)))
>  		BUG();
>  }
> @@ -431,6 +445,10 @@ static __init void xen_load_gdt_boot(const struct desc_ptr *dtr)
>  		frames[f] = mfn;
>  	}
>
> +	if (xen_hybrid_domain()) {
> +		native_load_gdt(dtr);
> +		return;
> +	}
>  	if (HYPERVISOR_set_gdt(frames, size / sizeof(struct desc_struct)))
>  		BUG();
>  }
> @@ -849,9 +867,11 @@ void xen_setup_shared_info(void)
>
>  		HYPERVISOR_shared_info =
>  			(struct shared_info *)fix_to_virt(FIX_PARAVIRT_BOOTMAP);
> -	} else
> +	} else {
>  		HYPERVISOR_shared_info =
>  			(struct shared_info *)__va(xen_start_info->shared_info);
> +		return;
> +	}
>
>  #ifndef CONFIG_SMP
>  	/* In UP this is as good a place as any to set up shared info */
> @@ -944,6 +964,71 @@ static const struct pv_init_ops xen_init_ops __initdata = {
>  	.patch = xen_patch,
>  };
>
> +extern void native_iret(void);
> +extern void native_irq_enable_sysexit(void);
> +extern void native_usergs_sysret32(void);
> +extern void native_usergs_sysret64(void);
> +
> +static const struct pv_cpu_ops xen_hybrid_cpu_ops __initdata = {
> +	.cpuid = xen_cpuid,
> +	.set_debugreg = xen_set_debugreg,
> +	.get_debugreg = xen_get_debugreg,
> +
> +	.clts = xen_clts,
> +
> +	.read_cr0 = xen_read_cr0,
> +	.write_cr0 = xen_write_cr0,
> +
> +	.read_cr4 = native_read_cr4,
> +	.read_cr4_safe = native_read_cr4_safe,
> +	.write_cr4 = native_write_cr4,
> +
> +	.wbinvd = native_wbinvd,
> +
> +	.read_msr = native_read_msr_safe,
> +	.write_msr = native_write_msr_safe,
> +	.read_tsc = native_read_tsc,
> +	.read_pmc = native_read_pmc,
> +
> +	.iret = native_iret,
> +	.irq_enable_sysexit = native_irq_enable_sysexit,
> +#ifdef CONFIG_X86_64
> +	.usergs_sysret32 = native_usergs_sysret32,
> +	.usergs_sysret64 = native_usergs_sysret64,
> +#endif
> +
> +	.load_tr_desc = native_load_tr_desc,
> +	.set_ldt = native_set_ldt,
> +	.load_gdt = native_load_gdt,
> +	.load_idt = native_load_idt,
> +	.load_tls = native_load_tls,
> +#ifdef CONFIG_X86_64
> +	.load_gs_index = native_load_gs_index,
> +#endif
> +
> +	.alloc_ldt = paravirt_nop,
> +	.free_ldt = paravirt_nop,
> +
> +	.store_gdt = native_store_gdt,
> +	.store_idt = native_store_idt,
> +	.store_tr = native_store_tr,
> +
> +	.write_ldt_entry = native_write_ldt_entry,
> +	.write_gdt_entry = native_write_gdt_entry,
> +	.write_idt_entry = native_write_idt_entry,
> +	.load_sp0 = native_load_sp0,
> +
> +	.set_iopl_mask = native_set_iopl_mask,
> +	.io_delay = xen_io_delay,
> +
> +	/* Xen takes care of %gs when switching to usermode for us */
> +	.swapgs = native_swapgs,
> +
> +	.start_context_switch = paravirt_start_context_switch,
> +	.end_context_switch = xen_end_context_switch,

why are you using the paravirt version of start_context_switch and
end_context_switch? Is this for the non-autotranslate version?

> +};
> +
>  static const struct pv_cpu_ops xen_cpu_ops __initdata = {
>  	.cpuid = xen_cpuid,
>
> @@ -1010,6 +1095,11 @@ static const struct pv_apic_ops xen_apic_ops __initdata = {
>  #endif
>  };
>
> +static void __init xen_hybrid_override_autox_cpu_ops(void)
> +{
> +	pv_cpu_ops.cpuid = xen_cpuid;
> +}
> +
>  static void xen_reboot(int reason)
>  {
>  	struct sched_shutdown r = { .reason = reason };
> @@ -1071,6 +1161,10 @@ static const struct machine_ops __initdata xen_machine_ops = {
>   */
>  static void __init xen_setup_stackprotector(void)
>  {
> +	if (xen_hybrid_domain()) {
> +		switch_to_new_gdt(0);
> +		return;
> +	}
>  	pv_cpu_ops.write_gdt_entry = xen_write_gdt_entry_boot;
>  	pv_cpu_ops.load_gdt = xen_load_gdt_boot;
>
> @@ -1093,14 +1187,22 @@ asmlinkage void __init xen_start_kernel(void)
>
>  	xen_domain_type = XEN_PV_DOMAIN;
>
> +	xen_setup_features();
>  	xen_setup_machphys_mapping();
>
>  	/* Install Xen paravirt ops */
>  	pv_info = xen_info;
>  	pv_init_ops = xen_init_ops;
> -	pv_cpu_ops = xen_cpu_ops;
>  	pv_apic_ops = xen_apic_ops;
>
> +	if (xen_hybrid_domain()) {
> +		if (xen_feature(XENFEAT_auto_translated_physmap))
> +			xen_hybrid_override_autox_cpu_ops();
> +		else
> +			pv_cpu_ops = xen_hybrid_cpu_ops;
> +	} else
> +		pv_cpu_ops = xen_cpu_ops;

[...]

>  void __init xen_init_mmu_ops(void)
>  {
> +	memset(dummy_mapping, 0xff, PAGE_SIZE);
> +	x86_init.paging.pagetable_setup_done = xen_pagetable_setup_done;
> +
> +	if (xen_feature(XENFEAT_auto_translated_physmap))
> +		return;
> +
>  	x86_init.mapping.pagetable_reserve = xen_mapping_pagetable_reserve;
>  	x86_init.paging.pagetable_setup_start = xen_pagetable_setup_start;
> -	x86_init.paging.pagetable_setup_done = xen_pagetable_setup_done;
> -	pv_mmu_ops = xen_mmu_ops;
> +	pv_mmu_ops = xen_mmu_ops;
>
> -	memset(dummy_mapping, 0xff, PAGE_SIZE);
> +	if (xen_hybrid_domain())	/* hybrid without EPT, ie, pv paging. */
> +		xen_hyb_override_mmu_ops();
>  }
>
>  /* Protected by xen_reservation_lock. */

So in theory HYB-EPT is running with native_cpu_ops and native_mmu_ops;
in this case I don't understand why the performances are lower than HVM.
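To keep the three configurations straight while reading the diff, the op
selection in xen_start_kernel() and xen_init_mmu_ops() reduces to a small
decision table. The sketch below is a standalone paraphrase for illustration
only; the struct and the strings are made up, and only the branch logic
mirrors the patch (xen_hybrid_domain() and XENFEAT_auto_translated_physmap
are the two inputs).

/* ops_selection.c - standalone model of the op selection in the patch above.
 * Not kernel code; only the decisions mirror the diff.
 */
#include <stdbool.h>
#include <stdio.h>

struct guest {
	const char *name;
	bool hybrid;		/* xen_hybrid_domain() */
	bool auto_xlate;	/* XENFEAT_auto_translated_physmap (EPT) */
};

int main(void)
{
	const struct guest guests[] = {
		{ "PV",      false, false },
		{ "Hybrid",  true,  false },
		{ "HYB-EPT", true,  true  },
	};

	for (int i = 0; i < 3; i++) {
		const struct guest *g = &guests[i];
		const char *cpu_ops, *mmu_ops;

		/* mirrors the pv_cpu_ops assignment in xen_start_kernel() */
		if (g->hybrid)
			cpu_ops = g->auto_xlate ?
				"default/native (only .cpuid overridden)" :
				"xen_hybrid_cpu_ops (mostly native_*)";
		else
			cpu_ops = "xen_cpu_ops (full PV)";

		/* mirrors xen_init_mmu_ops(): auto-translated guests return early */
		if (g->auto_xlate)
			mmu_ops = "default/native paging";
		else
			mmu_ops = g->hybrid ?
				"xen_mmu_ops + xen_hyb_override_mmu_ops()" :
				"xen_mmu_ops";

		printf("%-8s cpu_ops: %-42s mmu_ops: %s\n",
		       g->name, cpu_ops, mmu_ops);
	}
	return 0;
}

This is exactly the reasoning behind the question above: with both flags set,
HYB-EPT ends up on essentially native cpu and mmu ops, so its LMBench numbers
would be expected to track HVM.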
On Fri, 18 Nov 2011 12:21:19 +0000 Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:

> On Thu, 17 Nov 2011, Mukesh Rathor wrote:
> > Alright, got hybrid with EPT numbers in now from my prototype, it
> > needs some perf work..
>
> Is HVM a PV on HVM guest or a pure HVM guest (no CONFIG_XEN)?

PV on HVM.

> How is it possible that the HYB-EPT numbers here are so much worse
> than HVM? Shouldn't they be the same as in the other tests?

Yeah, I know. I wondered that myself. I need to investigate.

> why are you using the paravirt version of start_context_switch and
> end_context_switch? Is this for the non-autotranslate version?

This is for the non-autotranslate version.

> So in theory HYB-EPT is running with native_cpu_ops and
> native_mmu_ops; in this case I don't understand why the performances
> are lower than HVM.

Yup, same here. I'll have to investigate to see what's going on. Keep you
posted. I am working on SMP now, then I'll take a look.

thanks,
Mukesh
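A quick way to chase the context-switch gap without rerunning all of LMBench
is a two-process pipe ping-pong, which is roughly what lat_ctx measures for
the 2p/0K case. The sketch below is my own illustration, not LMBench code;
run it in each single-vcpu guest and compare the HYB-EPT and HVM numbers
directly.

/* ctxsw_pingpong.c - minimal 2-process context-switch estimate.
 * Sketch only, not LMBench code.  Build: gcc -O2 ctxsw_pingpong.c
 */
#include <stdio.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

#define ITERS 200000

int main(void)
{
	int p2c[2], c2p[2];
	char tok = 'x';
	struct timeval t0, t1;
	pid_t pid;

	if (pipe(p2c) || pipe(c2p))
		return 1;

	pid = fork();
	if (pid < 0)
		return 1;
	if (pid == 0) {				/* child: echo the token back */
		for (int i = 0; i < ITERS; i++) {
			if (read(p2c[0], &tok, 1) != 1 ||
			    write(c2p[1], &tok, 1) != 1)
				_exit(1);
		}
		_exit(0);
	}

	gettimeofday(&t0, NULL);
	for (int i = 0; i < ITERS; i++) {	/* parent: send, wait for echo */
		if (write(p2c[1], &tok, 1) != 1 ||
		    read(c2p[0], &tok, 1) != 1)
			return 1;
	}
	gettimeofday(&t1, NULL);
	wait(NULL);

	double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
	/* on a single cpu, each round trip costs two context switches */
	printf("%.2f us per switch\n", us / (2.0 * ITERS));
	return 0;
}

If this simple loop already shows the gap seen in the 2p/0K column above, the
overhead is in the basic switch and pipe wakeup path rather than in the larger
working-set cases.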