OK, with conferences and things it's been a while since I've sent an update. The initially scheduled feature freeze is in 20 days' time, so please respond with any status updates, as well as any suggestions for interventions if you have them.

The big change from last time is that PVH changed to "tech preview only" for 4.3.

This information will be mirrored on the Xen 4.3 Roadmap wiki page:
http://wiki.xen.org/wiki/Xen_Roadmap/4.3

= Timeline =

We are planning on a 9-month release cycle. Based on that, below are our estimated dates:
* Feature Freeze: 25 March 2013
* First RC: 6 May 2013
* Release: 17 June 2013

The RCs and release will of course depend on stability and bugs, and will therefore be fairly unpredictable. The feature freeze may be slipped for especially important features which are near completion.

Last updated: 17 January 2013

= Feature tracking =

Below is a list of features we're tracking for this release. Please respond to this mail with any updates to the status.

There are a number of items whose owners are marked as "?". If you are working on this, or know who is working on it, please respond and let me know. Alternately, if you would *like* to work on it, please let me know as well. And if there is something you're working on you'd like tracked, please respond, and I will add it to the list.

NB: Several of the items on this list are from external projects: Linux, qemu, and libvirt. These are not part of the Xen tree, but are directly related to our users' experience (e.g., work in Linux or qemu) or to integration with other important projects (e.g., libvirt bindings). Since all of these are part of the Xen community's work, and come from the same pool of labor, it makes sense to track the progress here, even though they won't explicitly be released as part of 4.3.

Meanings of prognoses:
- Excellent: It would be very unlikely for this not to be finished in time.
- Good: Everything is on track, and is likely to make it.
- Fair: A pretty good chance of making it, but not as certain
- Poor: Likely not to make it unless intervention is made
- Not for 4.3: Self-explanatory

== Completed ==

* Serial console improvements
 - EHCI debug port

* Default to QEMU upstream (partial)
 - pci pass-thru (external)
 - enable dirtybit tracking during migration (external)
 - xl cd-{insert,eject} (external)

* CPUID-based idle (don't rely on ACPI info f/ dom0)

* Persistent grants for blk (external) - Linux

* Allow XSM to override IS_PRIV checks in the hypervisor

* Scalability: 16TiB of RAM

* xl QXL Spice support

== Bugs ==

* xl, compat mode, and older kernels
 owner: ?
 Many older 32-bit PV kernels that can run on a 64-bit hypervisor with
 xend do not work when started with xl. The following work-around seems to
 work:
  xl create -p lightning.cfg
  xenstore-write /local/domain/$(xl domid lightning)/device/vbd/51713/protocol x86_32-abi
  xl unpause lightning
 This node is normally written by the guest kernel, but for older kernels
 seems not to be. xend must have a work-around; port this work-around to xl.

* AMD NPT performance regression after c/s 24770:7f79475d3de7
 owner: ?
 Reference: http://marc.info/?l=xen-devel&m=135075376805215

* qemu-upstream: cd-insert and cd-eject not working
 http://marc.info/?l=xen-devel&m=135850249808960

== Not yet complete ==

* PVH mode (w/ Linux)
 owner: mukesh@oracle
 status (Linux): 3rd draft patches posted.
 status (Xen): RFC submitted
 prognosis: Tech preview only

* Event channel scalability
 owner: wei@citrix
 status: RFC submitted
 prognosis: Good
 Increase limit on event channels (currently 1024 for 32-bit guests,
 4096 for 64-bit guests)

* ARM v7 server port
 owner: ijc@citrix
 prognosis: Excellent
 status: Core hypervisor and Linux patches accepted. Tools patches submitted.

* ARM v8 server port (tech preview)
 owner: ijc@citrix
 status: ?
 prognosis: Tech preview only

* NUMA scheduler affinity critical
 owner: dario@citrix
 status: Patches posted
 prognosis: Excellent

* NUMA Memory migration
 owner: dario@citrix
 status: in progress
 prognosis: Fair

* blktap3
 owner: thanos@citrix
 status: RFCs posted
 prognosis: Not for 4.3

* Default to QEMU upstream
 > Add "intel-hda" to xmexample file, since it works with 64-bit Win7/8
 - qemu-based stubdom (Linux or BSD libc)
   owner: anthony@citrix
   status: in progress
   prognosis: Good
   qemu-upstream needs a more fully-featured libc than exists in
   mini-os. Either work on a minimalist linux-based stubdom with glibc,
   or port one of the BSD libcs to minios.

* Persistent grants for blk (external)
 owner: roger@citrix
 status: qemu patches submitted
 prognosis: Good

* Persistent grants for net
 owner: annie.li@citrix
 status: Initial implementation not getting expected gains; more
 investigation required
 prognosis: Poor

* Multi-page blk rings (external)
 - blkback in kernel roger@citrix
 - qemu blkback
 status: Overall blk architecture being discussed
 prognosis: Fair

* Multi-page net protocol (external)
 owner: ijc@citrix or annie.li@oracle
 status: Initial patches posted (by Wei Liu)
 prognosis: Poor
 expand the network ring protocol to allow multiple pages for
 increased throughput

* vTPM updates
 owner: Matthew Fioravante @ Johns Hopkins
 status: some patches submitted, more in progress
 prognosis: Good
 - Allow all vTPM components to run in stub domains for increased security
 - Update vtpm to 0.7.4
 - Remove dom0-based vtpmd

* Guest EFI booting
 status: tianocore in-tree, some build problems.
 prognosis: Poor. Needs new owner.

* libvirt/libxl integration (external)
 - Update libvirt to 4.2
   status: Patch accepted
 - Migration
   owner: cyliu@suse (?)
   status: first draft implemented, not yet submitted
   prognosis: ?
 - Itemize other things that need work
   To begin with, we need someone to go and make some lists:
   - Features available in libvirt/KVM not available in libvirt/libxl
     See http://libvirt.org/hvsupport.html
   - Features available in xl/Xen but not available in libvirt/Xen

* V4V: Inter-domain communication
 owner (Xen): jean.guyader@citrix.com
 status (Xen): patches submitted
 prognosis: ?
 owner (Linux driver): stefano.panella@citrix
 status (Linux driver): in progress

* Wait queues for mm
 owner: ?
 status: Draft posted Feb 2012; more work to do.
 prognosis: Poor

* xl PVUSB pass-through for PV guests
* xl PVUSB pass-through for HVM guests
 owner: George
 status: ?
 prognosis: Fair
 xm/xend supports PVUSB pass-through to guests with PVUSB drivers (both
 PV and HVM guests).
 - port the xm/xend functionality to xl.
 - this PVUSB feature does not require support or emulation from Qemu.
 - upstream the Linux frontend/backend drivers. Current work-in-progress
   versions are in Konrad's git tree.
 - James Harper's GPLPV drivers for Windows include PVUSB frontend drivers.

* xl USB pass-through for HVM guests using Qemu USB emulation
 owner: George
 status: Config file pass-through submitted.
 prognosis: Good
 xm/xend with qemu-traditional supports USB passthrough to HVM guests
 using the Qemu emulated USB controller. The HVM guest does not need any
 special drivers for this feature. So basically the qemu cmdline needs
 to have:
  -usb -usbdevice host:xxxx:yyyy
 - port the xm/xend functionality to xl.
 - make sure USB passthrough with xl works with both qemu-traditional
   and qemu-upstream.

* xl: passing more defaults in configuration in xl.conf
 owner: ?
 There are a number of options for which it might be useful to pass a
 default in xl.conf. For example, if we could have a default "backend"
 parameter for vifs, then it would be easy to switch back and forth
 between a backend in a driver domain and a backend in dom0.

* Remove hardcoded modprobes in xencommons
 owner: ?
 status: ?
 prognosis: Poor.
* openvswitch toolstack integration
 owner: ?
 prognosis: Poor
 status: Sample script posted by Bastian ("[RFC] openvswitch support script")
 - See if we can engage Bastian to do a more fully-featured script?

* Rationalized backend scripts
 owner: roger@citrix
 status: libxl hotplug submitted. Protocol still needs to be finalized.
 prognosis: Good

* Scripts for driver domains (depends on backend scripts)
 owner: roger@citrix
 status:
 prognosis: Fair

* Xen EFI feature: pvops dom0 able to make use of EFI run-time services (external)
 owner: Daniel Kiper
 status: Just begun
 prognosis: Probably not for 4.3 (?)

* Xen EFI feature: Xen can boot from grub.efi
 owner: Daniel Kiper
 status: Just begun
 prognosis: Fair

* Serial console improvements
 owner: ?
 status: Stalled (see below)
 prognosis: Probably not for 4.3.
 - xHCI debug port (needs hardware)
 - Firewire (needs hardware)

* Make storage migration possible
 owner: ?
 status: none
 prognosis: Probably delay until 4.4
 There needs to be a way, either via command-line or via some hooks,
 that someone can build a "storage migration" feature on top of libxl
 or xl.

* Full-VM snapshotting
 owner: ?
 status: none
 prognosis: Probably delay until 4.4
 Have a way of coordinating the taking and restoring of VM memory and
 disk snapshots. This would involve some investigation into the best
 way to accomplish this.

* VM Cloning
 owner: ?
 status: none
 prognosis: Probably need 4.4
 Again, a way of coordinating the memory and disk aspects. Research
 into the best way to do this would probably go along with the
 snapshotting feature.

* xl vm-{export,import}
 owner: ?
 status: none
 prognosis: Prob put off until 4.4 (or GSoC project)
 Allow xl to import and export VMs to other formats; particularly ovf,
 perhaps the XenServer format, or more.

* Memory: Replace PoD with paging mechanism
 owner: george@citrix
 status: none
 prognosis: Prob put off until 4.4

* PV audio (audio for stubdom qemu)
 owner: stefano.panella@citrix
 status: ?
 prognosis: ?
* IllumOS (OpenSolaris fork) support
 owner: Igor Kozhukov
 status: Stopped
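The xl compat-mode work-around listed under "Bugs" above is mechanical enough to script. Below is a minimal sketch of what such a wrapper could look like; the xenstore path and the "x86_32-abi" string come straight from the work-around in this mail, but the function names and the use of the `xl` / `xenstore-write` binaries via subprocess are purely illustrative, not an existing tool.

```python
import subprocess

def protocol_node(domid, devid):
    """Build the xenstore path of the vbd protocol node (path taken
    from the work-around in the Bugs section)."""
    return "/local/domain/%s/device/vbd/%d/protocol" % (domid, devid)

def start_with_forced_protocol(domname, cfg, devid=51713, abi="x86_32-abi"):
    """Create the domain paused, write the protocol node that old
    32-bit PV kernels fail to write themselves, then unpause.
    Requires a dom0 with xl and xenstore-write installed."""
    subprocess.check_call(["xl", "create", "-p", cfg])
    domid = subprocess.check_output(["xl", "domid", domname]).strip().decode()
    subprocess.check_call(["xenstore-write", protocol_node(domid, devid), abi])
    subprocess.check_call(["xl", "unpause", domname])
```

A proper fix in xl would write this node itself at device setup time, as xend does, rather than racing the guest from a wrapper script.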
>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>  owner: ?
>  Reference: http://marc.info/?l=xen-devel&m=135075376805215

Suravee, Jacob?

Jan
>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> * V4V: Inter-domain communication
>  owner (Xen): jean.guyader@citrix.com
>  status (Xen): patches submitted
>  prognosis: ?

Last I recall was that all issues I had pointed out had been addressed. It would be Keir needing to ack and/or commit them, if no-one else spotted any issues with this code.

Jan

> owner (Linux driver): stefano.panella@citrix
> status (Linux driver): in progress
>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> OK, with conferences and things it's been a while since I've sent an
> update. The initially scheduled feature freeze is in 20 days time, so
> please respond with any status updates, as well as any suggestions for
> any interventions if you have them.

With recent Linux development, there's one more thing I'd like to add (and if I can find enough time, also carry out): Multi-vector PCI MSI support at least for Dom0. It wouldn't look nice if we didn't do this for 4.3, as that would mean we'd lag behind native Linux for almost a year (i.e. until whenever 4.4 gets released).

Jan
On 05/03/13 14:36, Jan Beulich wrote:
>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>> * V4V: Inter-domain communication
>>  owner (Xen): jean.guyader@citrix.com
>>  status (Xen): patches submitted
>>  prognosis: ?
>
> Last I recall was that all issues I had pointed out had been
> addressed. It would be Keir needing to ack and/or commit them,
> if no-one else spotted any issues with this code.

We're taking a look at the v4v patches and at the moment we're not happy with the quality as-is. I think there were some locking bugs and issues with the compat code.

Dominic (Cc'd) is doing the work and he should be able to give more details.

David
On 05/03/13 14:45, David Vrabel wrote:
> On 05/03/13 14:36, Jan Beulich wrote:
>>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>>> * V4V: Inter-domain communication
>>>  owner (Xen): jean.guyader@citrix.com
>>>  status (Xen): patches submitted
>>>  prognosis: ?
>>
>> Last I recall was that all issues I had pointed out had been
>> addressed. It would be Keir needing to ack and/or commit them,
>> if no-one else spotted any issues with this code.
>
> We're taking a look at the v4v patches and at the moment we're not
> happy with the quality as-is. I think there were some locking bugs and
> issues with the compat code.
>
> Dominic (Cc'd) is doing the work and he should be able to give more
> details.

Actually Cc him.

David
On Tue, Mar 05, 2013 at 02:40:08PM +0000, Jan Beulich wrote:
> >>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> > OK, with conferences and things it's been a while since I've sent an
> > update. The initially scheduled feature freeze is in 20 days time, so
> > please respond with any status updates, as well as any suggestions for
> > any interventions if you have them.
>
> With recent Linux development, there's one more thing I'd like to
> add (and if I can find enough time, also carry out): Multi-vector
> PCI MSI support at least for Dom0. It wouldn't look nice if we

Oh that would be good! I presume this is along the line of a new hypercall that would just allocate nvecs and provide the pirqs back to dom0?

> didn't do this for 4.3, as that would mean we'd lag behind native
> Linux for almost a year (i.e. until whenever 4.4 gets released).
>
> Jan
>>> On 05.03.13 at 16:51, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Tue, Mar 05, 2013 at 02:40:08PM +0000, Jan Beulich wrote:
>> >>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>> > OK, with conferences and things it's been a while since I've sent an
>> > update. The initially scheduled feature freeze is in 20 days time, so
>> > please respond with any status updates, as well as any suggestions for
>> > any interventions if you have them.
>>
>> With recent Linux development, there's one more thing I'd like to
>> add (and if I can find enough time, also carry out): Multi-vector
>> PCI MSI support at least for Dom0. It wouldn't look nice if we
>
> Oh that would be good! I presume this is along the line of a new
> hypercall that would just allocate nvecs and provide the pirqs
> back to dom0?

Yes, sort of at least.

Jan

>> didn't do this for 4.3, as that would mean we'd lag behind native
>> Linux for almost a year (i.e. until whenever 4.4 gets released).
>>
>> Jan
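[Editorial aside on the interface being discussed: the PCI MSI capability only grants vectors in power-of-two blocks of 1 to 32 (the Multiple Message Capable/Enable fields encode log2 of the count), so any "allocate nvecs of pirqs" hypercall would have to round a driver's request up before allocating. The sketch below just illustrates that rounding; the function name is made up and nothing here is an actual Xen interface.]

```python
def msi_vectors_to_allocate(nvec):
    """Round a requested MSI vector count up to the power of two the
    PCI MSI capability can actually encode (1, 2, 4, 8, 16 or 32)."""
    if not 1 <= nvec <= 32:
        raise ValueError("PCI MSI supports 1-32 vectors")
    n = 1
    while n < nvec:
        n <<= 1
    return n
```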
> == Not yet complete ==
>
> * PVH mode (w/ Linux)
>  owner: mukesh@oracle
>  status (Linux): 3rd draft patches posted.
>  status (Xen): RFC submitted
>  prognosis: Tech preview only

Mukesh was going to post an RFC this week but that has to be delayed.

> * Persistent grants for blk (external)
>  owner: roger@citrix
>  status: qemu patches submitted
>  prognosis: Good
>
> * Persistent grants for net
>  owner: annie.li@citrix
>  status: Initial implementation not getting expected gains; more
>  investigation required
>  prognosis: Poor

Going to be delayed further. Might as well move it out of the Xen 4.3 feature list to Xen 4.4.

> * Multi-page blk rings (external)
>  - blkback in kernel roger@citrix
>  - qemu blkback
>  status: Overall blk architecture being discussed
>  prognosis: Fair

I think we came up with so many design things that Roger is going to be buried with this for a year or so :-(

But I would say that the 'multi-page' aspect of this (which I think is actually the indirect descriptor) is going to show up in v3.10. Which is two months away. The only Xen patches are to update the protocol description.

> * Multi-page net protocol (external)
>  owner: ijc@citrix or annie.li@oracle
>  status: Initial patches posted (by Wei Liu)
>  prognosis: Poor
>  expand the network ring protocol to allow multiple pages for
>  increased throughput
>
> * vTPM updates
>  owner: Matthew Fioravante @ Johns Hopkins
>  status: some patches submitted, more in progress
>  prognosis: Good
>  - Allow all vTPM components to run in stub domains for increased security
>  - Update vtpm to 0.7.4
>  - Remove dom0-based vtpmd

I think we are waiting for Matthew. He posted the Linux ones but they need to be updated. Could make it for v3.10.

.. snip..

> * Xen EFI feature: pvops dom0 able to make use of EFI run-time
>  services (external)
>  owner: Daniel Kiper
>  status: Just begun
>  prognosis: Probably not for 4.3 (?)

Not for 4.3.
Might not even be ready for v3.10.

> * Xen EFI feature: Xen can boot from grub.efi
>  owner: Daniel Kiper
>  status: Just begun
>  prognosis: Fair

Not sure, Daniel?
At 14:57 +0000 on 05 Mar (1362495447), David Vrabel wrote:
> On 05/03/13 14:45, David Vrabel wrote:
> > On 05/03/13 14:36, Jan Beulich wrote:
> >>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> >>> * V4V: Inter-domain communication
> >>>  owner (Xen): jean.guyader@citrix.com
> >>>  status (Xen): patches submitted
> >>>  prognosis: ?
> >>
> >> Last I recall was that all issues I had pointed out had been
> >> addressed. It would be Keir needing to ack and/or commit them,
> >> if no-one else spotted any issues with this code.
> >
> > We're taking a look at the v4v patches and at the moment we're not
> > happy with the quality as-is. I think there were some locking bugs and
> > issues with the compat code.
> >
> > Dominic (Cc'd) is doing the work and he should be able to give more
> > details.
>
> Actually Cc him.

Actually actually Cc him. :)
On Tue, 2013-03-05 at 16:33 +0000, Tim Deegan wrote:
> At 14:57 +0000 on 05 Mar (1362495447), David Vrabel wrote:
> > On 05/03/13 14:45, David Vrabel wrote:
> > > On 05/03/13 14:36, Jan Beulich wrote:
> > >>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> > >>> * V4V: Inter-domain communication
> > >>>  owner (Xen): jean.guyader@citrix.com
> > >>>  status (Xen): patches submitted
> > >>>  prognosis: ?
> > >>
> > >> Last I recall was that all issues I had pointed out had been
> > >> addressed. It would be Keir needing to ack and/or commit them,
> > >> if no-one else spotted any issues with this code.
> > >
> > > We're taking a look at the v4v patches and at the moment we're not
> > > happy with the quality as-is. I think there were some locking bugs and
> > > issues with the compat code.
> > >
> > > Dominic (Cc'd) is doing the work and he should be able to give more
> > > details.
> >
> > Actually Cc him.
>
> Actually actually Cc him. :)

Nope ;-)

Mailman has a "feature" where it will strip the CC of people who are subscribed to the list and have unchecked the "receive copies of my own postings" option (or one of the ones like that...)

Ian.
At 16:36 +0000 on 05 Mar (1362501373), Ian Campbell wrote:
> On Tue, 2013-03-05 at 16:33 +0000, Tim Deegan wrote:
> > At 14:57 +0000 on 05 Mar (1362495447), David Vrabel wrote:
> > > On 05/03/13 14:45, David Vrabel wrote:
> > > > On 05/03/13 14:36, Jan Beulich wrote:
> > > >>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> > > >>> * V4V: Inter-domain communication
> > > >>>  owner (Xen): jean.guyader@citrix.com
> > > >>>  status (Xen): patches submitted
> > > >>>  prognosis: ?
> > > >>
> > > >> Last I recall was that all issues I had pointed out had been
> > > >> addressed. It would be Keir needing to ack and/or commit them,
> > > >> if no-one else spotted any issues with this code.
> > > >
> > > > We're taking a look at the v4v patches and at the moment we're not
> > > > happy with the quality as-is. I think there were some locking bugs and
> > > > issues with the compat code.
> > > >
> > > > Dominic (Cc'd) is doing the work and he should be able to give more
> > > > details.
> > >
> > > Actually Cc him.
> >
> > Actually actually Cc him. :)
>
> Nope ;-)
>
> Mailman has a "feature" where it will strip the CC of people who are
> subscribed to the list and have unchecked the "receive copies of my own
> postings" option (or one of the ones like that...)

Hilarious. Well, Dominic's been well and truly Cc'd, then. :)

Tim.
On 05/03/2013 14:45, "David Vrabel" <david.vrabel@citrix.com> wrote:
> On 05/03/13 14:36, Jan Beulich wrote:
>>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>>> * V4V: Inter-domain communication
>>>  owner (Xen): jean.guyader@citrix.com
>>>  status (Xen): patches submitted
>>>  prognosis: ?
>>
>> Last I recall was that all issues I had pointed out had been
>> addressed. It would be Keir needing to ack and/or commit them,
>> if no-one else spotted any issues with this code.
>
> We're taking a look at the v4v patches and at the moment we're not
> happy with the quality as-is. I think there were some locking bugs and
> issues with the compat code.

I also have to make a judgement call on it. I want to be sure that it really is providing something useful beyond what can be built out of existing primitives, and that we don't lower the bar for acceptance because it's a Citrix submission. I don't think we do, but I'm wary of it.

 -- Keir

> Dominic (Cc'd) is doing the work and he should be able to give more
> details.
>
> David
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
On 05/03/13 13:44, George Dunlap wrote:
> * Persistent grants for blk (external)
>  owner: roger@citrix
>  status: qemu patches submitted
>  prognosis: Good

Done. For both Linux kernel and Qemu.
On 3/5/2013 8:32 AM, Jan Beulich wrote:
>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>  owner: ?
>>  Reference: http://marc.info/?l=xen-devel&m=135075376805215
> Suravee, Jacob?

Let me take a look and I'll get back to you.

Suravee

> Jan
On Tue, 2013-03-05 at 14:36 +0000, Jan Beulich wrote:
> >>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> > * V4V: Inter-domain communication
> >  owner (Xen): jean.guyader@citrix.com
> >  status (Xen): patches submitted
> >  prognosis: ?
>
> Last I recall was that all issues I had pointed out had been
> addressed. It would be Keir needing to ack and/or commit them,
> if no-one else spotted any issues with this code.

One concern I have is that AFAIK there hasn't been any non-PoC client side code posted to use this stuff, e.g. libvchan integration or AF_XEN socket support in the kernel, etc. There are some interesting use cases for v4v but we don't seem to actually have any ways for people to use it for them.

I was also concerned about the lack of a maintainer now that Jean has moved on to other things. It's one thing when a maintainer disappears after we commit something, but accepting code which has already been orphaned is a step too far! It seems that Dominic is going to be picking that up, so I'm not so concerned about that any more.

Ian.

> Jan
>
> > owner (Linux driver): stefano.panella@citrix
> > status (Linux driver): in progress
On Tue, Mar 05, 2013 at 11:12:16AM -0500, Konrad Rzeszutek Wilk wrote:
[...]
> > * Xen EFI feature: pvops dom0 able to make use of EFI run-time
> >  services (external)
> >  owner: Daniel Kiper
> >  status: Just begun
> >  prognosis: Probably not for 4.3 (?)
>
> Not for 4.3. Might not even be ready for v3.10.

I hope it will be in ~3.12.

> > * Xen EFI feature: Xen can boot from grub.efi
> >  owner: Daniel Kiper
> >  status: Just begun
> >  prognosis: Fair
>
> Not sure, Daniel?

I think that this thing will be ready close to the feature freeze date.

Daniel
On Tue, Mar 05, 2013 at 12:44:23PM +0000, George Dunlap wrote:
> * Event channel scalability
>  owner: wei@citrix
>  status: RFC submitted
>  prognosis: Good
>  Increase limit on event channels (currently 1024 for 32-bit guests,
>  4096 for 64-bit guests)

Hypervisor side RFC V4 submitted, will submit RFC V5 shortly. Rewriting kernel side code.

Wei.
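[Editorial aside: the 1024/4096 limits quoted above fall directly out of the 2-level event channel ABI, where a single guest-word-sized selector has one bit per guest-word of pending bits, giving word_size squared channels. A quick sanity check of that arithmetic:]

```python
def max_event_channels(word_size_bits):
    """2-level event channel ABI: a word-sized selector, each bit of
    which covers one word of pending bits -> word_size ** 2 channels."""
    return word_size_bits ** 2

# 32-bit guests: 32 * 32 = 1024; 64-bit guests: 64 * 64 = 4096
```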
On Tue, Mar 5, 2013 at 2:40 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>> OK, with conferences and things it's been a while since I've sent an
>> update. The initially scheduled feature freeze is in 20 days time, so
>> please respond with any status updates, as well as any suggestions for
>> any interventions if you have them.
>
> With recent Linux development, there's one more thing I'd like to
> add (and if I can find enough time, also carry out): Multi-vector
> PCI MSI support at least for Dom0. It wouldn't look nice if we
> didn't do this for 4.3, as that would mean we'd lag behind native
> Linux for almost a year (i.e. until whenever 4.4 gets released).

OK, I'll add it in.

FWIW, given the number of nearly-completed projects that we have that will miss 4.3, I think we might consider a shorter release cycle for 4.4. But we can cross that bridge once we do the code freeze.

 -George
On Tue, Mar 5, 2013 at 2:57 PM, David Vrabel <david.vrabel@citrix.com> wrote:
> On 05/03/13 14:45, David Vrabel wrote:
>> On 05/03/13 14:36, Jan Beulich wrote:
>>>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>>>> * V4V: Inter-domain communication
>>>>  owner (Xen): jean.guyader@citrix.com
>>>>  status (Xen): patches submitted
>>>>  prognosis: ?
>>>
>>> Last I recall was that all issues I had pointed out had been
>>> addressed. It would be Keir needing to ack and/or commit them,
>>> if no-one else spotted any issues with this code.
>>
>> We're taking a look at the v4v patches and at the moment we're not
>> happy with the quality as-is. I think there were some locking bugs and
>> issues with the compat code.
>>
>> Dominic (Cc'd) is doing the work and he should be able to give more
>> details.
>
> Actually Cc him.

Dominic, shall I put you down as the contact for this feature, then?

 -George
>> * Multi-page blk rings (external)
>>  - blkback in kernel roger@citrix
>>  - qemu blkback
>>  status: Overall blk architecture being discussed
>>  prognosis: Fair
>
> I think we came up with so many design things that Roger is going to
> be buried with this for a year or so :-(
>
> But I would say that the 'multi-page' aspect of this (which I think
> is actually the indirect descriptor) is going to show up in v3.10.
> Which is two months away. The only Xen patches are to update the
> protocol description.

It sounds like we're actually talking about different things -- it seems like you're talking about a large blk protocol architecture rewrite, whereas I think I had always intended this just to be about multi-page blk rings, as that was perceived to be a scalability limitation. If the multi-page aspect will turn up within a few months of the 4.3 release, I think that still counts as a success for this item.

If that aspect looks good, can we mark this one as "Good" instead of "Fair"?

 -George
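[Editorial aside: the scalability limitation being discussed is that a single shared ring page caps the number of outstanding block requests. Xen's generic ring macros take the page minus the shared-ring header, divide by the entry size, and round down to a power of two. The sketch below reproduces that calculation; the 64-byte header and 112-byte blkif entry figures are my recollection of the blkif ABI at the time, not taken from this thread.]

```python
def ring_entries(pages, page_size=4096, header=64, entry=112):
    """Usable ring entries, per the __RING_SIZE-style calculation:
    (total bytes - shared-ring header) / entry size, rounded down
    to a power of two."""
    usable = (pages * page_size - header) // entry
    n = 1
    while n * 2 <= usable:
        n *= 2
    return n
```

With these (assumed) sizes, a one-page blkif ring holds 32 requests, which is why multi-page rings (or indirect descriptors, which pack more segments per request) were seen as the way to get more I/O in flight.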
On Tue, Mar 5, 2013 at 7:13 PM, Roger Pau Monné <roger.pau@citrix.com> wrote:
> On 05/03/13 13:44, George Dunlap wrote:
>> * Persistent grants for blk (external)
>>  owner: roger@citrix
>>  status: qemu patches submitted
>>  prognosis: Good
>
> Done. For both Linux kernel and Qemu.

Great, thanks!

 -G
On Tue, Mar 5, 2013 at 9:27 PM, Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> wrote:
> On 3/5/2013 8:32 AM, Jan Beulich wrote:
>>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>  owner: ?
>>>  Reference: http://marc.info/?l=xen-devel&m=135075376805215
>> Suravee, Jacob?
>
> Let me take a look and I'll get back to you.

FYI, if you do reproduce a regression from this changeset, it's in part Andres' responsibility to help sort it out, since it would be his code that introduced it.

 -George
On Tue, Mar 5, 2013 at 12:44 PM, George Dunlap <dunlapg@umich.edu> wrote:
> * openvswitch toolstack integration
>  owner: ?
>  prognosis: Poor
>  status: Sample script posted by Bastian ("[RFC] openvswitch support script")
>  - See if we can engage Bastian to do a more fully-featured script?

James,

It looked like you had started to take a look at this. Are you still working on it / planning to work on it?

 -George
> -----Original Message-----
> From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of
> George Dunlap
> Sent: 06 March 2013 3:46 AM
> To: David Vrabel
> Cc: Jan Beulich; Jean Guyader; xen-devel@lists.xen.org; Dominic Curran
> Subject: Re: [Xen-devel] Xen 4.3 development update
>
> On Tue, Mar 5, 2013 at 2:57 PM, David Vrabel <david.vrabel@citrix.com>
> wrote:
> > On 05/03/13 14:45, David Vrabel wrote:
> >> On 05/03/13 14:36, Jan Beulich wrote:
> >>>>>> On 05.03.13 at 13:44, George Dunlap <dunlapg@umich.edu> wrote:
> >>>> * V4V: Inter-domain communication
> >>>>  owner (Xen): jean.guyader@citrix.com
> >>>>  status (Xen): patches submitted
> >>>>  prognosis: ?
> >>>
> >>> Last I recall was that all issues I had pointed out had been
> >>> addressed. It would be Keir needing to ack and/or commit them, if
> >>> no-one else spotted any issues with this code.
> >>
> >> We're taking a look at the v4v patches and at the moment we're not
> >> happy with the quality as-is. I think there were some locking bugs
> >> and issues with the compat code.
> >>
> >> Dominic (Cc'd) is doing the work and he should be able to give more
> >> details.
> >
> > Actually Cc him.
>
> Dominic, shall I put you down as the contact for this feature, then?
>
> -George

Yes, that's fine.
I am still testing (and finding issues) so I can't give you a date for re-submitting yet.
On Wed, Mar 6, 2013 at 5:01 PM, Dominic Curran <dominic.curran@citrix.com> wrote:
>> Dominic, shall I put you down as the contact for this feature, then?
>>
>> -George
>
> Yes, that's fine.
> I am still testing (and finding issues) so I can't give you a date for re-submitting yet.

Cool, thanks. Could you take a stab at a prognosis from the list below?

Meanings of prognoses:
- Excellent: It would be very unlikely for this not to be finished in time.
- Good: Everything is on track, and is likely to make it.
- Fair: A pretty good chance of making it, but not as certain
- Poor: Likely not to make it unless intervention is made
- Not for 4.3: Self-explanatory
On Wed, Mar 6, 2013 at 5:15 PM, George Dunlap <dunlapg@umich.edu> wrote:
> On Wed, Mar 6, 2013 at 5:01 PM, Dominic Curran
> <dominic.curran@citrix.com> wrote:
>>> Dominic, shall I put you down as the contact for this feature, then?
>>>
>>> -George
>>
>> Yes, that's fine.
>> I am still testing (and finding issues) so I can't give you a date for re-submitting yet.
>
> Cool, thanks. Could take a stab at a prognosis from one of the list below?

...meant to add: keeping in mind that the scheduled feature freeze is 25 March?

> Meanings of prognoses:
> - Excellent: It would be very unlikely for this not to be finished in time.
> - Good: Everything is on track, and is likely to make it.
> - Fair: A pretty good chance of making it, but not as certain
> - Poor: Likely not to make it unless intervention is made
> - Not for 4.3: Self-explanatory
> >> Yes, that's fine.
> >> I am still testing (and finding issues) so I can't give you a date for re-
> submitting yet.
> >
> > Cool, thanks. Could take a stab at a prognosis from one of the list below?
>
> ...meant to add: keeping in mind that the scheduled feature freeze is 25
> March?
>
> > Meanings of prognoses:
> > - Excellent: It would be very unlikely for this not to be finished in time.
> > - Good: Everything is on track, and is likely to make it.
> > - Fair: A pretty good chance of making it, but not as certain
> > - Poor: Likely not to make it unless intervention is made
> > - Not for 4.3: Self-explanatory

Errr, I'd say Fair b/c I just don't have a handle on what the issue is at the moment.
On Wed, Mar 06, 2013 at 11:52:50AM +0000, George Dunlap wrote:
> >> * Multi-page blk rings (external)
> >>  - blkback in kernel roger@citrix
> >>  - qemu blkback
> >>  status: Overall blk architecture being discussed
> >>  prognosis: Fair
> >
> > I think we came up with so many design things that Roger is going to
> > be buried with this for a year or so :-(
> >
> > But I would say that the 'multi-page' aspect of this (which I think
> > is actually the indirect descriptor) is going to show up in v3.10.
> > Which is two months away. The only Xen patches are to update the
> > protocol description.
>
> It sounds like we're actually talking about different things -- it
> seems like you're talking about a large blk protocol architecture
> rewrite, whereas I think I had always intended this just to be about
> multi-page blk rings, as that was perceived to be a scalability
> limitation. If the multi-page aspect will turn up within a few months
> of the 4.3 release, I think that still counts as a success for this
> item.

Oh, in that case it is Bad. The multi-page has not even been on the list, as the other protocol architecture re-writes give a bigger boost in performance.

> If that aspect looks good, can we mark this one as "Good" instead of "Fair"?
>
> -George
On Tue, 5 Mar 2013 11:12:16 -0500, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > == Not yet complete =
> >
> > * PVH mode (w/ Linux)
> >   owner: mukesh@oracle
> >   status (Linux): 3rd draft patches posted.
> >   status (Xen): RFC submitted
> >   prognosis: Tech preview only
>
> Mukesh was going to post an RFC this week but that has to be delayed.

Ok, I've gone through all the patch comments now and addressed them. I tested it and it works. My xen tree is about 2 months old now, so the next step is to refresh it and submit v2 of the patches. Thanks for the patience.

Mukesh
On 06/03/13 18:54, Dominic Curran wrote:
>>>> Yes, that's fine.
>>>> I am still testing (and finding issues) so I can't give you a date for re-submitting yet.
>>>
>>> Cool, thanks. Could you take a stab at a prognosis from the list below?
>> ...meant to add: keeping in mind that the scheduled feature freeze is 25 March?
>>
>>> Meanings of prognoses:
>>> - Excellent: It would be very unlikely for this not to be finished in time.
>>> - Good: Everything is on track, and is likely to make it.
>>> - Fair: A pretty good chance of making it, but not as certain
>>> - Poor: Likely not to make it unless intervention is made
>>> - Not for 4.3: Self-explanatory
> Errr, I'd say Fair, b/c I just don't have a handle on what the issue is at the moment.

Great, thanks!

 -George
On 05/03/2013 13:44, George Dunlap wrote:
> * xl, compat mode, and older kernels
>   owner: ?
>   Many older 32-bit PV kernels that can run on a 64-bit hypervisor with
>   xend do not work when started with xl. The following work-around seems to work:
>     xl create -p lightning.cfg
>     xenstore-write /local/domain/$(xl domid lightning)/device/vbd/51713/protocol x86_32-abi
>     xl unpause lightning
>   This node is normally written by the guest kernel, but for older kernels
>   seems not to be. xend must have a work-around; port this work-around to xl.

The workaround uses the native_protocol field of struct xc_dom_arch. Here is the code in the xend DevController module:

    frontpath = self.frontendPath(devid)
    backpath = self.backendPath(backdom, devid)

    frontDetails.update({
        'backend' : backpath,
        'backend-id' : "%i" % backdom.getDomid(),
        'state' : str(xenbusState['Initialising'])
        })

    if self.vm.native_protocol:
        frontDetails.update({'protocol' : self.vm.native_protocol})

    backDetails.update({
        'domain' : self.vm.getName(),
        'frontend' : frontpath,
        'frontend-id' : "%i" % self.vm.getDomid(),
        'state' : str(xenbusState['Initialising']),
        'online' : "1"
        })

Paolo
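For reference, the xend logic boils down to conditionally adding a `protocol` key to the frontend details before they are written to xenstore. A minimal, self-contained sketch of that step -- with hypothetical stand-in values for the paths, the xenbus state, and the `native_protocol` string; this is not the real xend API:

```python
# Sketch of xend's workaround: if the toolstack knows the guest's native
# ABI (e.g. a 32-bit PV kernel on a 64-bit hypervisor), it writes it into
# the frontend's xenstore details so blkfront protocol negotiation works.
# All names and values here are illustrative stand-ins, not the xend code.

def build_front_details(backpath, backdom_id, native_protocol=None):
    """Return the frontend xenstore details dict (simplified model)."""
    front = {
        'backend': backpath,
        'backend-id': "%i" % backdom_id,
        'state': '1',  # xenbusState['Initialising'] in xend
    }
    # The key step: older kernels never write this node themselves,
    # so the toolstack fills it in (e.g. 'x86_32-abi').
    if native_protocol:
        front['protocol'] = native_protocol
    return front

details = build_front_details('/local/domain/0/backend/vbd/5/51713', 0,
                              native_protocol='x86_32-abi')
print(details['protocol'])  # x86_32-abi
```

Porting the workaround to xl would mean doing the equivalent conditional write in libxl's device-attach path, which is what the bug item above asks for.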
On 05/03/13 13:44, George Dunlap wrote:
> * Rationalized backend scripts
>   owner: roger@citrix
>   status: libxl hotplug submitted. Protocol still needs to be finalized.
>   prognosis: Good

I sent v1 quite some time ago (collecting comments from the RFC), but there has still been no reply. I would like some of the tools maintainers to take a look at the series:

http://lists.xen.org/archives/html/xen-devel/2013-01/msg01962.html

Also, I would like to add a new item to the list that's xm/xl parity related:

* Add vif-route support to libxl/xl
  owner: roger@citrix
  status: needs review
  prognosis: Good

The last version of the series can be found at:

http://lists.xen.org/archives/html/xen-devel/2013-02/msg00450.html
This information will be mirrored on the Xen 4.3 Roadmap wiki page:

http://wiki.xen.org/wiki/Xen_Roadmap/4.3

A couple of notes:
- I have moved the "Code freezing point" to 15 April, since one of the toolstack maintainers (Ian Campbell) is away until the 8th.
- As we focus on getting a release out of the 4.3 codebase, I have removed items from the list that are either "not for 4.3" or are purely external (e.g., Linux kernel or libvirt).
- Please start suggesting bug reports to put on this list.

= Timeline

We are planning on a 9-month release cycle. Based on that, below are our estimated dates:
* Feature freeze: 25 March 2013
* Code freezing point: 15 April 2013
* First RC: 6 May 2013
* Release: 17 June 2013

The RCs and release will of course depend on stability and bugs, and will therefore be fairly unpredictable.

Each new feature will be considered on a case-by-case basis, but the general rule will be as follows:
* Between feature freeze and code freeze, only features which have had a v1 posted before the feature freeze, or are on this list, will be considered for inclusion.
* Between the "code freezing point" and the first RC, any new code will need to be justified, and it will become progressively more difficult to get non-bugfix patches accepted. Criteria will include the size of the patch, the importance of the codepath, whether it's new functionality being added or existing functionality being changed, and so on.

Last updated: 2 April 2013

= Feature tracking

Below is a list of features we're tracking for this release. Please respond to this mail with any updates to the status.

There are a number of items whose owners are marked as "?". If you are working on this, or know who is working on it, please respond and let me know. Alternately, if you would *like* to work on it, please let me know as well.

And if there is something you're working on you'd like tracked, please respond, and I will add it to the list.
NB: Several of the items on this list are from external projects: Linux, qemu, and libvirt. These are not part of the Xen tree, but are directly related to our users' experience (e.g., work in Linux or qemu) or to integration with other important projects (e.g., libvirt bindings). Since all of these are part of the Xen community's work, and come from the same pool of labor, it makes sense to track their progress here, even though they won't explicitly be released as part of 4.3.

Meanings of prognoses:
- Excellent: It would be very unlikely for this not to be finished in time.
- Good: Everything is on track, and is likely to make it.
- Fair: A pretty good chance of making it, but not as certain
- Poor: Likely not to make it unless intervention is made
- Not for 4.3: Self-explanatory

== Completed =

* Serial console improvements
  - EHCI debug port
* Default to QEMU upstream (partial)
  - pci pass-thru (external)
  - enable dirty-bit tracking during migration (external)
  - xl cd-{insert,eject} (external)
* CPUID-based idle (don't rely on ACPI info f/ dom0)
* Persistent grants for blk (external)
  - Linux
  - qemu
* Allow XSM to override IS_PRIV checks in the hypervisor
* Scalability: 16TiB of RAM
* xl QXL Spice support

== Bugs =

* xl, compat mode, and older kernels
  owner: ?
  Many older 32-bit PV kernels that can run on a 64-bit hypervisor with xend do not work when started with xl. The following work-around seems to work:
    xl create -p lightning.cfg
    xenstore-write /local/domain/$(xl domid lightning)/device/vbd/51713/protocol x86_32-abi
    xl unpause lightning
  This node is normally written by the guest kernel, but for older kernels seems not to be. xend must have a work-around; port this work-around to xl.

* AMD NPT performance regression after c/s 24770:7f79475d3de7
  owner: ?
  Reference: http://marc.info/?l=xen-devel&m=135075376805215

* qemu-upstream: cd-insert and cd-eject not working
  http://marc.info/?l=xen-devel&m=135850249808960

* Install into /usr/local by default
  owner: Ian Campbell

== Not yet complete =

* PVH mode (w/ Linux)
  owner: mukesh@oracle
  status (Linux): 3rd draft patches posted.
  status (Xen): RFC submitted
  prognosis: Tech preview only

* Event channel scalability
  owner: wei@citrix or david@citrix
  status: RFC v5 submitted
  prognosis: Deciding whether to shoot for 3-level (4.3) or FIFO (4.4)
  Increase the limit on event channels (currently 1024 for 32-bit guests, 4096 for 64-bit guests)

* ARM v7 server port
  owner: ijc@citrix
  prognosis: Excellent
  status: SMP support missing.

* ARM v8 server port (tech preview)
  owner: ijc@citrix
  status: ?
  prognosis: Tech preview only

* NUMA scheduler affinity
  critical
  owner: dario@citrix
  status: Patches posted
  prognosis: Excellent

* NUMA memory migration
  owner: dario@citrix
  status: in progress
  prognosis: Fair

* Default to QEMU upstream
  - Add "intel-hda" to the xmexample file, since it works with 64-bit Win7/8
  - qemu-based stubdom (Linux or BSD libc)
    owner: anthony@citrix
    status: in progress
    prognosis: ?
    qemu-upstream needs a more fully-featured libc than exists in mini-os. Either work on a minimalist Linux-based stubdom with glibc, or port one of the BSD libcs to mini-os.

* Multi-vector PCI MSI (support at least for Dom0)
  owner: jan@suse
  status: Draft hypervisor side done, Linux side in progress.
  prognosis: Fair

* vTPM updates
  owner: Matthew Fioravante @ Johns Hopkins
  status: some patches submitted, more in progress
  prognosis: Good
  - Allow all vTPM components to run in stub domains for increased security
  - Update vtpm to 0.7.4
  - Remove dom0-based vtpmd

* V4V: Inter-domain communication
  owner (Xen): dominic.curran@citrix.com
  status (Xen): patches submitted
  prognosis: Fair
  owner (Linux driver): stefano.panella@citrix
  status (Linux driver): in progress

* xl PVUSB pass-through for PV guests
* xl PVUSB pass-through for HVM guests
  owner: George
  status: ?
  prognosis: Poor
  xm/xend supports PVUSB pass-through to guests with PVUSB drivers (both PV and HVM guests).
  - port the xm/xend functionality to xl.
  - this PVUSB feature does not require support or emulation from qemu.
  - upstream the Linux frontend/backend drivers. Current work-in-progress versions are in Konrad's git tree.
  - James Harper's GPLPV drivers for Windows include PVUSB frontend drivers.

* xl USB pass-through for HVM guests using qemu USB emulation
  owner: George
  status: Config file pass-through submitted.
  prognosis: Fair
  xm/xend with qemu-traditional supports USB pass-through to HVM guests using the qemu emulated USB controller. The HVM guest does not need any special drivers for this feature, so basically the qemu cmdline needs to have:
    -usb -usbdevice host:xxxx:yyyy
  - port the xm/xend functionality to xl.
  - make sure USB pass-through with xl works with both qemu-traditional and qemu-upstream.

* xl: passing more defaults in configuration in xl.conf
  owner: ?
  There are a number of options for which it might be useful to pass a default in xl.conf. For example, if we could have a default "backend" parameter for vifs, then it would be easy to switch back and forth between a backend in a driver domain and a backend in dom0.

* Remove hardcoded modprobes in xencommons
  owner: ?
  status: ?
  prognosis: Poor

* openvswitch toolstack integration
  owner: ?
  prognosis: Poor
  status: Sample script posted by Bastian ("[RFC] openvswitch support script")
  - See if we can engage Bastian to do a more fully-featured script?

* Rationalized backend scripts
  owner: roger@citrix
  status: libxl hotplug submitted. Protocol still needs to be finalized.
  prognosis: Good

* Scripts for driver domains (depends on backend scripts)
  owner: roger@citrix
  status:
  prognosis: Fair
>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>   owner: ?
>   Reference: http://marc.info/?l=xen-devel&m=135075376805215

This is supposedly fixed with the RTC changes Tim committed the other day. Suravee, is that correct?

> * Remove hardcoded modprobes in xencommons
>   owner: ?
>   status: ?
>   prognosis: Poor

So before 4.2 got released it was promised this would get dealt with, and now it's "poor" with no owner and no status? Disappointing.

Jan
On 4/2/2013 10:42 AM, Jan Beulich wrote:
>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>   owner: ?
>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
> This is supposedly fixed with the RTC changes Tim committed the
> other day. Suravee, is that correct?

Let me verify this again with the new changes. I was looking at the clock-drifting issue on 64-bit XP, which was running fine. Let me check 32-bit and get back to you today.

>> * Remove hardcoded modprobes in xencommons
>>   owner: ?
>>   status: ?
>>   prognosis: Poor
> So before 4.2 got released it was promised to get dealt with, and
> now it's "poor" with no owner and status? Disappointing.

I was not aware of this issue. Could you give me some context for it?

Suravee
On 02/04/13 16:45, Suravee Suthikulanit wrote:
> On 4/2/2013 10:42 AM, Jan Beulich wrote:
>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>   owner: ?
>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>> This is supposedly fixed with the RTC changes Tim committed the
>> other day. Suravee, is that correct?
> Let me verify this again with the new changes. I was looking at the
> clock-drifting issue on 64-bit XP, which was running fine. Let me
> check 32-bit and get back to you today.
>
>>> * Remove hardcoded modprobes in xencommons
>>>   owner: ?
>>>   status: ?
>>>   prognosis: Poor
>> So before 4.2 got released it was promised to get dealt with, and
>> now it's "poor" with no owner and status? Disappointing.
> I was not aware of this issue. Could you give me some context for it?

Just to be clear, this doesn't have anything to do with AMD -- it's a separate subject Jan is complaining about. :-)

Jan: "Poor" just means that it needs intervention; namely, we need someone to step up and volunteer to do it. I'll ask for volunteers sometime this week.

 -George
At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>   owner: ?
>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>
> This is supposedly fixed with the RTC changes Tim committed the
> other day. Suravee, is that correct?

This is a separate problem. IIRC the AMD XP perf issue is caused by the emulation of LAPIC TPR accesses slowing down with Andres's p2m locking patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a _lot_ of vmexits for IRQL reads and writes.

Tim.
On 4/2/2013 11:34 AM, Tim Deegan wrote:
> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>   owner: ?
>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>> This is supposedly fixed with the RTC changes Tim committed the
>> other day. Suravee, is that correct?
> This is a separate problem. IIRC the AMD XP perf issue is caused by the
> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
> _lot_ of vmexits for IRQL reads and writes.

Is this only for 32-bit XP, or also 64-bit?

Suravee
On 4/2/2013 11:34 AM, Tim Deegan wrote:
> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>   owner: ?
>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>> This is supposedly fixed with the RTC changes Tim committed the
>> other day. Suravee, is that correct?
> This is a separate problem. IIRC the AMD XP perf issue is caused by the
> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
> _lot_ of vmexits for IRQL reads and writes.

Are there any tools or good ways to count the number of VMEXITs in Xen?

Thanks,
Suravee
On 4/2/2013 12:06 PM, Suravee Suthikulpanit wrote:
> On 4/2/2013 11:34 AM, Tim Deegan wrote:
>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>   owner: ?
>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>> This is supposedly fixed with the RTC changes Tim committed the
>>> other day. Suravee, is that correct?
>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>> _lot_ of vmexits for IRQL reads and writes.
> Are there any tools or good ways to count the number of VMEXITs in Xen?

Tim/Jan,

I have used the iperf benchmark to compare network performance (bandwidth) between two versions of the hypervisor:
1. good: 24769:730f6ed72d70
2. bad: 24770:7f79475d3de7

In the "bad" case, I am seeing that the network bandwidth has dropped by about 13-15%.

However, when I use the xentrace utility to trace the number of VMEXITs, I actually see about 25% more VMEXITs in the good case. This is inconsistent with the statement Tim made above.

Suravee
>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote:
> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>> >>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>> > * AMD NPT performance regression after c/s 24770:7f79475d3de7
>> >   owner: ?
>> >   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>
>> This is supposedly fixed with the RTC changes Tim committed the
>> other day. Suravee, is that correct?
>
> This is a separate problem. IIRC the AMD XP perf issue is caused by the
> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
> _lot_ of vmexits for IRQL reads and writes.

Ah, okay, sorry for mixing this up. But how is this a regression then?

Jan
On 02.04.13 19:06, Suravee Suthikulpanit wrote:
> On 4/2/2013 11:34 AM, Tim Deegan wrote:
>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>   owner: ?
>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>> This is supposedly fixed with the RTC changes Tim committed the
>>> other day. Suravee, is that correct?
>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>> _lot_ of vmexits for IRQL reads and writes.
> Are there any tools or good ways to count the number of VMEXITs in Xen?

xentrace -e 0x8f000 > xentrace.out
[Hit ^C to abort]
xentrace_format formats < xentrace.out > xentrace.dump

You need to manually install 'formats' from tools/xentrace/formats to a proper place.

Christoph
On 03/04/13 09:37, Christoph Egger wrote:
> On 02.04.13 19:06, Suravee Suthikulpanit wrote:
>> Are there any tools or good ways to count the number of VMEXITs in Xen?
> xentrace -e 0x8f000 > xentrace.out
> [Hit ^C to abort]
> xentrace_format formats < xentrace.out > xentrace.dump
>
> You need to manually install 'formats' from tools/xentrace/formats
> to a proper place.

Even better is to use xenalyze "summary" mode:

http://xenbits.xen.org/ext/xenalyze

Build, then run:

xenalyze --svm-mode -s [trace file] > summary

 -George
On 03/04/13 00:48, Suravee Suthikulanit wrote:
> On 4/2/2013 12:06 PM, Suravee Suthikulpanit wrote:
>> On 4/2/2013 11:34 AM, Tim Deegan wrote:
>>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>>   owner: ?
>>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>>> This is supposedly fixed with the RTC changes Tim committed the
>>>> other day. Suravee, is that correct?
>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>>> _lot_ of vmexits for IRQL reads and writes.
>> Are there any tools or good ways to count the number of VMEXITs in Xen?
>
> Tim/Jan,
>
> I have used the iperf benchmark to compare network performance (bandwidth)
> between two versions of the hypervisor:
> 1. good: 24769:730f6ed72d70
> 2. bad: 24770:7f79475d3de7
>
> In the "bad" case, I am seeing that the network bandwidth has dropped
> by about 13-15%.
>
> However, when I use the xentrace utility to trace the number of VMEXITs,
> I actually see about 25% more VMEXITs in the good case. This is
> inconsistent with the statement Tim made above.

I was going to say: what I remember from my little bit of investigation back in November was that it had all the earmarks of micro-architectural "drag", which happens when the TLB or the caches can't be effective.

Suravee, if you look at xenalyze, micro-architectural "drag" looks like:
* fewer VMEXITs, but
* each VMEXIT taking longer

If you post the results of "xenalyze --svm-mode -s" for both traces, I can tell you what I see.

 -George
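George's "drag" signature can be checked mechanically once the xentrace records are summarized: compare the exit count against the mean cost per exit. A toy illustration of the two metrics -- the numbers below are invented for the example, not real trace data:

```python
# Toy illustration of the "micro-architectural drag" signature: the slow
# build shows *fewer* VMEXITs, but each exit costs more cycles overall.
# The cycle counts here are invented; real data would come from
# xentrace/xenalyze summaries.

def summarize(exit_cycles):
    """Return (number of exits, mean cycles per exit)."""
    n = len(exit_cycles)
    return n, sum(exit_cycles) / n

good = [4000] * 125   # many exits, each cheap
bad  = [8000] * 100   # fewer exits, each ~2x as expensive

n_good, mean_good = summarize(good)
n_bad, mean_bad = summarize(bad)

# Drag signature: exit count drops while per-exit cost rises -- which is
# consistent with seeing ~25% *more* exits in the "good" hypervisor.
assert n_good > n_bad and mean_bad > mean_good
print(n_good, n_bad, mean_good, mean_bad)  # 125 100 4000.0 8000.0
```

This is exactly the pattern xenalyze's summary mode exposes: per-exit-reason counts alongside cycles spent per exit.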
On 03/04/13 08:27, Jan Beulich wrote:
>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote:
>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>   owner: ?
>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>> This is supposedly fixed with the RTC changes Tim committed the
>>> other day. Suravee, is that correct?
>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>> _lot_ of vmexits for IRQL reads and writes.
> Ah, okay, sorry for mixing this up. But how is this a regression
> then?

My sense, when I looked at this a while back, was that there was much more to this. The XP IRQL updating is a problem, but it's made terribly worse by the changeset in question. It seemed to me like the kind of thing that would be caused by the TLB or caches suddenly becoming much less effective.

 -George
On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 03/04/13 08:27, Jan Beulich wrote:
>>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote:
>>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>>   owner: ?
>>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>>> This is supposedly fixed with the RTC changes Tim committed the
>>>> other day. Suravee, is that correct?
>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>>> _lot_ of vmexits for IRQL reads and writes.
>> Ah, okay, sorry for mixing this up. But how is this a regression
>> then?
>
> My sense, when I looked at this a while back, was that there was much more to this. The XP IRQL updating is a problem, but it's made terribly worse by the changeset in question. It seemed to me like the kind of thing that would be caused by the TLB or caches suddenly becoming much less effective.

The commit in question does not add p2m mutations, so it doesn't nuke the NPT/EPT TLBs. It introduces a spin lock in the hot path, and that is the problem. Later in the 4.2 cycle we changed the common case to use an rwlock. Does the same perf degradation occur with the tip of 4.2?

Andres

> -George
At 11:47 -0500 on 02 Apr (1364903259), Suravee Suthikulpanit wrote:
> On 4/2/2013 11:34 AM, Tim Deegan wrote:
>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>   owner: ?
>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>> This is supposedly fixed with the RTC changes Tim committed the
>>> other day. Suravee, is that correct?
>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>> _lot_ of vmexits for IRQL reads and writes.
> Is this only for 32-bit XP, or also 64-bit?

I don't have a 64-bit XP image handy to test, but a bit of googling suggests that 64-bit XP has 'lazy IRQL', so this TPR problem should only affect 32-bit XP (and earlier Windowses).

Tim.
On 03.04.13 12:49, George Dunlap wrote:
> On 03/04/13 09:37, Christoph Egger wrote:
>> On 02.04.13 19:06, Suravee Suthikulpanit wrote:
>>> Are there any tools or good ways to count the number of VMEXITs in Xen?
>> xentrace -e 0x8f000 > xentrace.out
>> [Hit ^C to abort]
>> xentrace_format formats < xentrace.out > xentrace.dump
>>
>> You need to manually install 'formats' from tools/xentrace/formats
>> to a proper place.
>
> Even better is to use xenalyze "summary" mode:
>
> http://xenbits.xen.org/ext/xenalyze
>
> Build, then run:
>
> xenalyze --svm-mode -s [trace file] > summary

It does not build for me. argp.h is not portable.

Christoph
On 04/04/2013 01:19 PM, Christoph Egger wrote:
> On 03.04.13 12:49, George Dunlap wrote:
>> On 03/04/13 09:37, Christoph Egger wrote:
>>> On 02.04.13 19:06, Suravee Suthikulpanit wrote:
>>>> Are there any tools or good ways to count the number of VMEXITs in Xen?
>>> xentrace -e 0x8f000 > xentrace.out
>>> [Hit ^C to abort]
>>> xentrace_format formats < xentrace.out > xentrace.dump
>>>
>>> You need to manually install 'formats' from tools/xentrace/formats
>>> to a proper place.
>>
>> Even better is to use xenalyze "summary" mode:
>>
>> http://xenbits.xen.org/ext/xenalyze
>>
>> Build, then run:
>>
>> xenalyze --svm-mode -s [trace file] > summary
>
> It does not build for me. argp.h is not portable.

It's a shame NetBSD (which I think is what you use) hasn't implemented it yet -- it's a lot nicer interface to use. You could always run it in a Debian VM if you were really keen. :-)

 -George
At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote:
> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>> On 03/04/13 08:27, Jan Beulich wrote:
>>>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote:
>>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>>>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>>>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>>>> _lot_ of vmexits for IRQL reads and writes.
>>> Ah, okay, sorry for mixing this up. But how is this a regression
>>> then?
>>
>> My sense, when I looked at this a while back, was that there was much
>> more to this. The XP IRQL updating is a problem, but it's made terribly
>> worse by the changeset in question. It seemed to me like the kind of
>> thing that would be caused by the TLB or caches suddenly becoming much
>> less effective.
>
> The commit in question does not add p2m mutations, so it doesn't nuke
> the NPT/EPT TLBs. It introduces a spin lock in the hot path, and that
> is the problem. Later in the 4.2 cycle we changed the common case to
> use an rwlock. Does the same perf degradation occur with the tip of 4.2?

Yes, 4.2 is definitely slower. A compile test on a 4-vcpu VM that takes about 12 minutes before this locking change takes more than 20 minutes on the current tip of xen-unstable (I gave up at 22 minutes and rebooted to test something else).

Tim.
On 4/3/2013 5:51 AM, George Dunlap wrote:
> On 03/04/13 00:48, Suravee Suthikulanit wrote:
>> On 4/2/2013 12:06 PM, Suravee Suthikulpanit wrote:
>>> On 4/2/2013 11:34 AM, Tim Deegan wrote:
>>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>>>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>>>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>>>> _lot_ of vmexits for IRQL reads and writes.
>>
>> Tim/Jan,
>>
>> I have used the iperf benchmark to compare network performance (bandwidth)
>> between two versions of the hypervisor:
>> 1. good: 24769:730f6ed72d70
>> 2. bad: 24770:7f79475d3de7
>>
>> In the "bad" case, I am seeing that the network bandwidth has dropped
>> by about 13-15%.
>>
>> However, when I use the xentrace utility to trace the number of VMEXITs,
>> I actually see about 25% more VMEXITs in the good case. This is
>> inconsistent with the statement Tim made above.
>
> I was going to say: what I remember from my little bit of investigation
> back in November was that it had all the earmarks of micro-architectural
> "drag", which happens when the TLB or the caches can't be effective.
>
> Suravee, if you look at xenalyze, micro-architectural "drag" looks like:
> * fewer VMEXITs, but
> * each VMEXIT taking longer
>
> If you post the results of "xenalyze --svm-mode -s" for both traces, I
> can tell you what I see.
>
> -George

George,

Here are the two sets of data from xenalyze.

Suravee
At 16:23 +0100 on 04 Apr (1365092601), Tim Deegan wrote:
> At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote:
> > On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>
> Yes, 4.2 is definitely slower.  A compile test on a 4-vcpu VM that
> takes about 12 minutes before this locking change takes more than 20
> minutes on the current tip of xen-unstable (I gave up at 22 minutes
> and rebooted to test something else).

I did a bit of prodding at this, but messed up my measurements in a
bunch of different ways over the afternoon. :(  I'm going to be away
from my test boxes for a couple of weeks now, so all I can say is, if
you're investigating this bug, beware that:

- the revision before this change still has the RTC bugs that were
  fixed last week, so don't measure performance based on guest
  wallclock time, or your 'before' perf will look too good.

- the current unstable tip has test code to exercise the new
  map_domain_page(), which will badly affect all the many memory
  accesses done in HVM emulation, so make sure you use debug=n builds
  for measurement.

Also, if there is still a bad slowdown caused by the p2m lookups, this
might help a little bit:

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 38e87ce..7bd8646 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1361,6 +1361,18 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
         }
     }
 
+    /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
+     * a fast path for LAPIC accesses, skipping the p2m lookup. */
+    if ( !nestedhvm_vcpu_in_guestmode(v)
+         && gfn == vlapic_base_address(vcpu_vlapic(current)) >> PAGE_SHIFT )
+    {
+        if ( !handle_mmio() )
+            hvm_inject_hw_exception(TRAP_gp_fault, 0);
+        rc = 1;
+        goto out;
+    }
+
     p2m = p2m_get_hostp2m(v->domain);
     mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
                               P2M_ALLOC | (access_w ? P2M_UNSHARE : 0),
                               NULL);

but in fact, the handle_mmio() will have to do GVA->GFN lookups for its
%RIP and all its operands, and each of those will involve multiple
GFN->MFN lookups for the pagetable entries, so if the GFN->MFN lookup
has got slower, eliminating just the one at the start may not be all
that great.

Cheers,

Tim.
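Tim's closing caveat can be made concrete with some back-of-the-envelope
arithmetic. A hypothetical illustration (the pagetable depth and operand
count below are assumptions for the sake of the example, not measured
data):

```python
# Rough count of GFN->MFN (p2m) lookups needed to emulate one MMIO
# instruction, assuming 4-level guest pagetables -- an illustration of
# Tim's point, not measurements.
PT_LEVELS = 4          # a GVA->GFN walk reads one pagetable entry per level
gva_accesses = 1 + 2   # the %RIP fetch plus, say, two memory operands

# Each GVA->GFN translation needs PT_LEVELS GFN->MFN lookups for the
# pagetable entries, plus one more for the final access itself.
gfn_to_mfn_lookups = gva_accesses * (PT_LEVELS + 1)

print(gfn_to_mfn_lookups)  # -> 15
```

So even under these modest assumptions, skipping the single p2m lookup
at the start of hvm_hap_nested_page_fault() removes only a small
fraction of the GFN->MFN lookups the emulation path performs.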
On 4/3/2013 5:51 AM, George Dunlap wrote:
> On 03/04/13 00:48, Suravee Suthikulanit wrote:
>> On 4/2/2013 12:06 PM, Suravee Suthikulpanit wrote:
>>> On 4/2/2013 11:34 AM, Tim Deegan wrote:
>>>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>>>   owner: ?
>>>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>>>> This is supposedly fixed with the RTC changes Tim committed the
>>>>> other day. Suravee, is that correct?
>>>> This is a separate problem. IIRC the AMD XP perf issue is caused by
>>>> the emulation of LAPIC TPR accesses slowing down with Andres's p2m
>>>> locking patches. XP doesn't have 'lazy IRQL' or support for CR8, so
>>>> it takes a _lot_ of vmexits for IRQL reads and writes.
>>> Are there any tools or good ways to count the number of VMEXITs in
>>> Xen?
>>>
>> Tim/Jan,
>>
>> I have used the iperf benchmark to compare network performance
>> (bandwidth) between the two versions of the hypervisor:
>> 1. good: 24769:730f6ed72d70
>> 2. bad: 24770:7f79475d3de7
>>
>> In the "bad" case, I am seeing that the network bandwidth has dropped
>> about 13-15%.
>>
>> However, when I use the xentrace utility to trace the number of
>> VMEXITs, I actually see about 25% more VMEXITs in the good case. This
>> is inconsistent with the statement that Tim mentioned above.
>
> I was going to say: what I remember from my little bit of
> investigation back in November was that it had all the earmarks of
> micro-architectural "drag", which happens when the TLB or the caches
> can't be effective.
>
> Suravee, if you look at xenalyze, a microarchitectural "drag" looks
> like:
> * fewer VMEXITs, but
> * the time for each vmexit takes longer
>
> If you post the results of "xenalyze --svm-mode -s" for both traces, I
> can tell you what I see.
> -George

Here's another version of the outputs from xenalyze with only VMEXIT.
In this case, I pin all the VCPUs (4) and pin my application process to
VCPU 3.

NOTE: This measurement is without the RTC bug.

BAD:
-- v3 --
 Runstates:
   running:      1  4.51s 10815429411 {10815429411|10815429411|10815429411}
 cpu affinity:   1 10816540697 {10816540697|10816540697|10816540697}
   [7]:          1 10816540697 {10816540697|10816540697|10816540697}
 Exit reasons:
  VMEXIT_CR0_READ          633  0.00s  0.00%     1503 cyc { 1092| 1299| 2647}
  VMEXIT_CR4_READ            3  0.00s  0.00%     1831 cyc { 1309| 1659| 2526}
  VMEXIT_CR0_WRITE         305  0.00s  0.00%     1660 cyc { 1158| 1461| 2507}
  VMEXIT_CR4_WRITE           6  0.00s  0.00%    19771 cyc { 1738| 5031|79600}
  VMEXIT_EXCEPTION_NM        1  0.00s  0.00%     2272 cyc { 2272| 2272| 2272}
  VMEXIT_INTR               28  0.00s  0.00%     3374 cyc { 1225| 3770| 6095}
  VMEXIT_VINTR             388  0.00s  0.00%     1023 cyc {  819|  901| 1744}
  VMEXIT_PAUSE              33  0.00s  0.00%     7476 cyc { 4881| 6298|18941}
  VMEXIT_HLT               388  3.35s 14.84% 20701800 cyc {169589|3848166|55770601}
  VMEXIT_IOIO             5581  0.19s  0.85%    82514 cyc { 4250|81909|146439}
  VMEXIT_NPF            108072  0.71s  3.14%    15702 cyc { 6362| 6865|37280}
 Guest interrupt counts:
 Emulate eip list

GOOD:
-- v3 --
 Runstates:
   running:      4 12.10s 7257234016 {18132721625|18132721625|18132721625}
   lost:        12  1.24s  248210482 {188636654|719488416|719488416}
 cpu affinity:   1 32007462122 {32007462122|32007462122|32007462122}
   [7]:          1 32007462122 {32007462122|32007462122|32007462122}
 Exit reasons:
  VMEXIT_CR0_READ         4748  0.00s  0.01%     1275 cyc { 1007| 1132| 1878}
  VMEXIT_CR4_READ            6  0.00s  0.00%     1752 cyc { 1189| 1629| 2600}
  VMEXIT_CR0_WRITE        3099  0.00s  0.01%     1541 cyc { 1157| 1420| 2151}
  VMEXIT_CR4_WRITE          12  0.00s  0.00%     4105 cyc { 1885| 4380| 5515}
  VMEXIT_EXCEPTION_NM       18  0.00s  0.00%     2169 cyc { 1973| 2152| 2632}
  VMEXIT_INTR              258  0.00s  0.00%     4622 cyc { 1358| 4235| 8987}
  VMEXIT_VINTR            2552  0.00s  0.00%      971 cyc {  850|  928| 1131}
  VMEXIT_PAUSE             370  0.00s  0.00%     5758 cyc { 4381| 5688| 7933}
  VMEXIT_HLT              1505  6.14s 27.19%  9788981 cyc {268573|3768704|56331182}
  VMEXIT_IOIO            53835  1.97s  8.74%    87959 cyc { 4996|82423|144207}
  VMEXIT_NPF            855101  2.06s  9.13%     5787 cyc { 4903| 5328| 8572}
 Guest interrupt counts:
 Emulate eip list

Suravee
On 04/04/13 18:14, Suravee Suthikulanit wrote:
> On 4/3/2013 5:51 AM, George Dunlap wrote:
>> On 03/04/13 00:48, Suravee Suthikulanit wrote:
[snip]
>> I was going to say: what I remember from my little bit of
>> investigation back in November was that it had all the earmarks of
>> micro-architectural "drag", which happens when the TLB or the caches
>> can't be effective.
>>
>> Suravee, if you look at xenalyze, a microarchitectural "drag" looks
>> like:
>> * fewer VMEXITs, but
>> * the time for each vmexit takes longer
>>
>> If you post the results of "xenalyze --svm-mode -s" for both traces,
>> I can tell you what I see.
>>
>> -George
>>
> Here's another version of the outputs from xenalyze with only VMEXIT.
> In this case, I pin all the VCPUs (4) and pin my application process
> to VCPU 3.
>
> NOTE: This measurement is without the RTC bug.
>
> BAD:
> -- v3 --
> VMEXIT_CR0_WRITE         305  0.00s  0.00%  1660 cyc { 1158| 1461| 2507}
> VMEXIT_CR4_WRITE           6  0.00s  0.00% 19771 cyc { 1738| 5031|79600}
[snip]
> VMEXIT_IOIO             5581  0.19s  0.85% 82514 cyc { 4250|81909|146439}
> VMEXIT_NPF            108072  0.71s  3.14% 15702 cyc { 6362| 6865|37280}

> GOOD:
> -- v3 --
> VMEXIT_CR0_WRITE        3099  0.00s  0.01%  1541 cyc { 1157| 1420| 2151}
> VMEXIT_CR4_WRITE          12  0.00s  0.00%  4105 cyc { 1885| 4380| 5515}
[snip]
> VMEXIT_IOIO            53835  1.97s  8.74% 87959 cyc { 4996|82423|144207}
> VMEXIT_NPF            855101  2.06s  9.13%  5787 cyc { 4903| 5328| 8572}
[snip]

So in the good run, we have 855k NPF exits, each of which takes about
5.7k cycles.  In the bad run, we have only 108k NPF exits, each of
which takes an average of 15k cycles.  (Although the 50th percentile is
still only 6.8k cycles -- so most are about the same, but a few take a
lot longer.)

It's a bit strange -- the reduced number of NPF exits is consistent
with the idea of some micro-architectural thing slowing down the
processing of the guest.  However, in my experience usually this also
has an effect on other processing as well -- i.e., the time to process
an IOIO would also go up, because dom0 would be slowed down as well;
and the time to process any random VMEXIT (say, the CR0 writes) would
also go up.  But maybe it only has an effect inside the guest, because
of the tagged TLBs or something?
Suravee, could you run this one again, but:
* Trace everything, not just vmexits
* Send me the trace files somehow (FTP or Dropbox), and/or add
  "--with-interrupt-eip-enumeration=249 --with-mmio-enumeration" when
  you run the summary?

That will give us an idea where the guest is spending its time
statistically, and what kinds of MMIO it is doing, which may give us a
clearer picture of what's going on.

Thanks,
 -George
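The "fewer exits, each one slower" signature discussed in this exchange
can be checked directly from the two xenalyze summaries quoted above. A
small sketch, with the VMEXIT_NPF numbers hardcoded from the quoted
output (good = c/s 24769, bad = c/s 24770):

```python
# Count and average cycles per VMEXIT_NPF, taken from the xenalyze
# summaries quoted earlier in this thread.
good_npf_count, good_npf_cyc = 855_101, 5_787
bad_npf_count,  bad_npf_cyc  = 108_072, 15_702

# Each NPF exit in the bad run costs roughly 2.7x as many cycles...
slowdown_per_exit = bad_npf_cyc / good_npf_cyc

# ...even though the bad run takes far fewer NPF exits overall -- about
# an 8x reduction in exit count.
exit_ratio = good_npf_count / bad_npf_count

print(f"per-exit slowdown: {slowdown_per_exit:.1f}x")   # -> 2.7x
print(f"exit-count ratio (good/bad): {exit_ratio:.1f}x")  # -> 7.9x
```

This matches George's reading: most exits cost about the same (the 50th
percentile barely moves), but the long tail drags the bad run's average
per-exit cost up by almost a factor of three.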
On mar, 2013-04-02 at 15:07 +0100, George Dunlap wrote:
> == Not yet complete =
>
> [..]
>
> * NUMA scheduler affinity
>   critical
>   owner: dario@citrix
>   status: Patches posted
>   prognosis: Excellent
>
I have all the Acks I need on the last version I posted... I will take
care of the minor issues that people pointed out there and repost the,
hopefully, final version of this ASAP.

> * NUMA Memory migration
>   owner: dario@citrix
>   status: in progress
>   prognosis: Fair
>
Well, I'm afraid I have to back out of this. I already said I wasn't
sure about making it during one of the latest "development updates",
and I'm afraid I have to confirm that. I'm sending something out right
now, but it's in RFC status, so not really suitable for being
considered for 4.3.

Unfortunately, I got distracted by many other things while working on
this, and it also was more challenging than I originally thought... I
of course will continue working on it (despite the release), but I
guess we should queue it for 4.4. :-/

Thanks and Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
On Tue, 2013-04-02 at 15:07 +0100, George Dunlap wrote:
>
> * Install into /usr/local by default
>   owner: Ian Campbell

This needs someone to review and ack/nack
<1360081193-17948-1-git-send-email-ian.campbell@citrix.com>

The same person could also look at
<1360081193-17948-2-git-send-email-ian.campbell@citrix.com>, but that
one was in theory acked the first time around, before it was reverted
(it's not clear whether Acks should stand in that case!)

Ian.
On Tue, 2013-04-02 at 15:07 +0100, George Dunlap wrote:
>
> * ARM v7 server port
>   owner: ijc@citrix
>   prognosis: Excellent
>   status: SMP support missing.

Patches have been posted for both host and guest SMP.

> * ARM v8 server port (tech preview)
>   owner: ijc@citrix
>   status: ?
>   prognosis: Tech preview only

Not sure how well the statuses map onto tech preview stuff, but "good"
I think. One thing which I think is going to miss for both v7 and v8 is
migration support.

My big remaining concern is declaring the hypercall ABI stable/frozen.
Really, that's a must before we remove the tech preview label from v7
or v8 IMHO. Given that we've run 32- and 64-bit guests on a 64-bit
hypervisor, perhaps that's enough to make the call.

Ian.
> = Timeline
>
> We are planning on a 9-month release cycle.  Based on that, below are
> our estimated dates:
> * Feature freeze: 25 March 2013
> * Code freezing point: 15 April 2013

Is it possible to extend this? One of the reviewers (Ian) is only just
now able to look at code and review. That means the developers have
only 3 business days to repost the changes.

> * First RC: 6 May 2013
> * Release: 17 June 2013
>
> The RCs and release will of course depend on stability and bugs, and
> will therefore be fairly unpredictable.  Each new feature will be
> considered on a case-by-case basis; but the general rule will be as
> follows:
>
> * Between feature freeze and code freeze, only features which have had
> a v1 posted before the feature freeze, or are on this list, will be
> considered for inclusion.
>
> * Between the "code freezing point" and the first RC, any new code
> will need to be justified, and it will become progressively more
> difficult to get non-bugfix patches accepted.  Criteria will include

I am hoping you can explain what is meant by 'new code'. Say a patchset
is posted where only one of the patches is being modified at the
reviewer's request (case a). Or the reviewers would like new code to be
written to handle a different case (so new code - case b).

Case b) would fall under 'new code will need to be justified'.. But if
the new code does not meet the criteria, is the full patchset going to
languish until the next window? Or can the parts of it that pass the
reviewer's muster be committed?

Case a) I would think would not be a problem.

> the size of the patch, the importance of the codepath, whether it's
> new functionality being added or existing functionality being changed,
> and so on.

Sorry about asking these questions - but I am being increasingly
pressed to fix upstream bugs, upstream new material for v3.10, upstream
the claim toolstack changes, and also go on vacation. Hence I am trying
to figure out what I need to focus on to meet these deadlines.
On 10/04/13 17:41, Konrad Rzeszutek Wilk wrote:
>> = Timeline
>>
>> We are planning on a 9-month release cycle.  Based on that, below are
>> our estimated dates:
>> * Feature freeze: 25 March 2013
>> * Code freezing point: 15 April 2013
> Is it possible to extend this? One of the reviewers (Ian) is only just
> now able to look at code and review. That means the developers have
> only 3 business days to repost the changes.

So when I said "freezing point", I meant, "we will start rejecting
features". Each feature will need to be considered individually. I
think, for example, that PVH is not going to make it -- it touches too
much code, and is just not in good enough shape to get in as it is. But
AFAICT the TMEM stuff should be fine next week.

IanC knows that he's on the hot path, and so will be working
double-time over the next few days to review / commit patches.

>> * Between the "code freezing point" and the first RC, any new code
>> will need to be justified, and it will become progressively more
>> difficult to get non-bugfix patches accepted.  Criteria will include
> I am hoping you can explain what is meant by 'new code'. Say a
> patchset is posted where only one of the patches is being modified at
> the reviewer's request (case a). Or the reviewers would like new code
> to be written to handle a different case (so new code - case b).
>
> Case b) would fall under 'new code will need to be justified'.. But if
> the new code does not meet the criteria, is the full patchset going to
> languish until the next window? Or can the parts of it that pass the
> reviewer's muster be committed?
>
> Case a) I would think would not be a problem.

I mean "code that does new things", as opposed to code that fixes bugs.
Any code that is not already in xen.git and is not fixing a bug is "new
code", whether it was posted yesterday or 6 months ago.

The point is that every new bit of functionality introduces the risk of
bugs; and each additional bug at this point risks slipping the release
date. So for each new bit of code, we will have to do a risk analysis.
Criteria will include:
* Risk of the code having bugs in itself
* Risk of the code introducing bugs in other key functionality
* Cost of bugs
* Value of the new code

The tmem toolstack stuff, for instance, if I understand correctly, is
mostly about paths that only tmem users use. If you're not using tmem,
the risk of having a bug should be low; and the cost of fixing
toolstack bugs in a point release should also be low. So I would think
that sometime next week would be fine.

PVH stuff, however, touches a lot of core hypervisor code in really
quirky ways. It has a very high risk of introducing bugs in other bits
of functionality, which would have a very high cost. Also, since it has
a fairly high risk of still being immature itself, we couldn't really
recommend that people use it in 4.3; and thus it has low value as a
release feature at this point. (Hope that makes sense -- as a mature
feature, it's really important; but as a "tech preview only" feature
it's not of much overall value to customers.) So I doubt that PVH will
get in.

We had a talk yesterday about the Linux stubdomain stuff -- that's a
bit less clear. Changes to libxl should be in "linux-stubdom-only"
paths, so there is little risk of breaking libxl functionality. On the
other hand, it makes a lot of changes to the build system, adding
moderate risk to an important component; and it hasn't had wide public
testing yet, so there's no telling how reliable it will be. On the
other other hand, it's a blocker for being able to switch to
qemu-upstream by default, which was one of our key goals for 4.3; that
may or may not be worth risking slipping the release a bit for.

Does that give you an idea of what the criteria are and how I'm going
to be applying them? Basically, my job at this point is to make sure
that the release only slips:
1. On purpose, because we've considered the benefit worth the cost.
2. Because of completely unforeseeable circumstances

If I ACK a patch, thinking that it won't slip, and then it does, and I
might reasonably have known that that was a risk, then I have... well,
"failed" is kind of a strong word, but yeah, I haven't done the job as
well as I should have done. :-) Does that make sense?

> Sorry about asking these questions - but I am being increasingly
> pressed to fix upstream bugs, upstream new material for v3.10,
> upstream the claim toolstack changes, and also go on vacation. Hence I
> am trying to figure out what I need to focus on to meet these
> deadlines.

Sure. What patches do you have outstanding, BTW? There's the tmem
stuff; the PVH stuff as I said seems pretty unlikely to make it (unless
there's an amazing patch series posted in the next day or two). Is
there anything else you'd like my opinion on?

 -George
On Thu, 2013-04-11 at 10:28 +0100, George Dunlap wrote:
> On 10/04/13 17:41, Konrad Rzeszutek Wilk wrote:
[snip]
> So when I said "freezing point", I meant, "we will start rejecting
> features". Each feature will need to be considered individually. I
> think, for example, that PVH is not going to make it -- it touches too
> much code, and is just not in good enough shape to get in as it is.
> But AFAICT the TMEM stuff should be fine next week.
>
> IanC knows that he's on the hot path, and so will be working
> double-time over the next few days to review / commit patches.

FWIW for the tmem stuff specifically IanJ seems to have a handle on the
review, so it's not high in my queue.

> We had a talk yesterday about the Linux stubdomain stuff -- that's a
> bit less clear. Changes to libxl should be in "linux-stubdom-only"
> paths, so there is little risk of breaking libxl functionality. On the
> other hand, it makes a lot of changes to the build system, adding
> moderate risk to an important component; and it hasn't had wide public
> testing yet, so there's no telling how reliable it will be. On the
> other other hand, it's a blocker for being able to switch to
> qemu-upstream by default, which was one of our key goals for 4.3; that
> may or may not be worth risking slipping the release a bit for.

I think we switched the default for the non-stubdom case already (or
were planning to!). I think this is sufficient for 4.3.

Ian.
On 11/04/13 10:33, Ian Campbell wrote:
> On Thu, 2013-04-11 at 10:28 +0100, George Dunlap wrote:
[snip]
>> On the other other hand, it's a blocker for being able to switch to
>> qemu-upstream by default, which was one of our key goals for 4.3;
>> that may or may not be worth risking slipping the release a bit for.
> I think we switched the default for the non-stubdom case already (or
> were planning to!). I think this is sufficient for 4.3.

I've been thinking about this -- wouldn't this mean that if you do the
"default" thing and just install a Windows guest in an HVM domain (not
thinking about qemu or whatever), then I'm stuck and I *can't* use
stubdoms (because Windows doesn't like the hardware changing that much
under its feet)? That doesn't sound like a very good user experience to
me.

 -George
On Thu, 2013-04-11 at 10:43 +0100, George Dunlap wrote:
> On 11/04/13 10:33, Ian Campbell wrote:
[snip]
>> I think we switched the default for the non-stubdom case already (or
>> were planning to!). I think this is sufficient for 4.3.
>
> I've been thinking about this -- wouldn't this mean that if you do the
> "default" thing and just install a Windows guest in an HVM domain (not
> thinking about qemu or whatever), then I'm stuck and I *can't* use
> stubdoms (because Windows doesn't like the hardware changing that much
> under its feet)? That doesn't sound like a very good user experience
> to me.

For that one domain, yes. You could reinstall that VM (or a new one).

However I think using stubdoms is not yet, sadly, the common case, and
those who are using it are likely to use it from the point of
installation. On balance I think making the switch for the non-stubdom
case is the least bad option for most use cases.

Aside: The "Windows not liking the change of hardware" thing is mostly
supposition which no one has proved or disproved one way or the other,
AFAICT.

Ian.
Pasi Kärkkäinen
2013-Apr-25 13:51 UTC
Re: Xen 4.3 development update / winxp AMD performance regression
On Wed, Apr 03, 2013 at 11:34:13AM -0400, Andres Lagar-Cavilla wrote:
> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> > On 03/04/13 08:27, Jan Beulich wrote:
[snip]
> >> Ah, okay, sorry for mixing this up. But how is this a regression
> >> then?
> >
> > My sense, when I looked at this back whenever, was that there was
> > much more to this. The XP IRQL updating is a problem, but it's made
> > terribly worse by the changeset in question. It seemed to me like
> > the kind of thing that would be caused by the TLB or caches suddenly
> > becoming much less effective.
>
> The commit in question does not add p2m mutations, so it doesn't nuke
> the NPT/EPT TLBs. It introduces a spin lock in the hot path and that
> is the problem. Later in the 4.2 cycle we changed the common case to
> use an rwlock. Does the same perf degradation occur with the tip of
> 4.2?
>

Adding Peter to CC, who reported the original winxp performance
problem/regression on AMD.

Peter: Can you try Xen 4.2.2 please and report whether it has the
performance problem or not?

Thanks,

-- Pasi
George Dunlap
2013-Apr-25 14:00 UTC
Re: Xen 4.3 development update / winxp AMD performance regression
On 04/25/2013 02:51 PM, Pasi Kärkkäinen wrote:
> On Wed, Apr 03, 2013 at 11:34:13AM -0400, Andres Lagar-Cavilla wrote:
[snip]
>> The commit in question does not add p2m mutations, so it doesn't nuke
>> the NPT/EPT TLBs. It introduces a spin lock in the hot path and that
>> is the problem. Later in the 4.2 cycle we changed the common case to
>> use an rwlock. Does the same perf degradation occur with the tip of
>> 4.2?
>
> Adding Peter to CC, who reported the original winxp performance
> problem/regression on AMD.
>
> Peter: Can you try Xen 4.2.2 please and report whether it has the
> performance problem or not?

Do you want to compare 4.2.2 to 4.2.1, or 4.3?

The changeset in question was included in the initial release of 4.2,
so unless you think it's been fixed since, I would expect 4.2 to have
this regression.

 -George
Andres Lagar-Cavilla
2013-Apr-25 14:24 UTC
Re: Xen 4.3 development update / winxp AMD performance regression
On Apr 25, 2013, at 10:00 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 04/25/2013 02:51 PM, Pasi Kärkkäinen wrote:
[snip]
>> Peter: Can you try Xen 4.2.2 please and report whether it has the
>> performance problem or not?
>
> Do you want to compare 4.2.2 to 4.2.1, or 4.3?
>
> The changeset in question was included in the initial release of 4.2,
> so unless you think it's been fixed since, I would expect 4.2 to have
> this regression.

I believe you will see this from 4.2 onwards. 4.2 includes the rwlock
optimization. Nothing has been added to the tree in that regard
recently.

Andres
On Thu, Apr 4, 2013 at 4:23 PM, Tim Deegan <tim@xen.org> wrote:
> At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote:
>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>>
>>> On 03/04/13 08:27, Jan Beulich wrote:
>>>>>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote:
>>>>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the
>>>>>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>>>>>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>>>>>> _lot_ of vmexits for IRQL reads and writes.
>>>>> Ah, okay, sorry for mixing this up. But how is this a regression
>>>>> then?
>>>>
>>>> My sense, when I looked at this back whenever, was that there was much more to this. The XP IRQL updating is a problem, but it's made terribly worse by the changeset in question. It seemed to me like the kind of thing that would be caused by TLB or caches suddenly becoming much less effective.
>>>
>>> The commit in question does not add p2m mutations, so it doesn't nuke the NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the problem. Later in the 4.2 cycle we changed the common case to use an rwlock. Does the same perf degradation occur with tip of 4.2?
>>
> Yes, 4.2 is definitely slower. A compile test on a 4-vcpu VM that takes
> about 12 minutes before this locking change takes more than 20 minutes
> on the current tip of xen-unstable (I gave up at 22 minutes and rebooted
> to test something else).

Tim,

Can you go into a bit more detail about what you compiled on what kind of OS?

I just managed to actually find a c/s from which I could build the tools (git 914e61c), and then compared that with just rebuilding xen on the accused changeset (6b719c3).

The VM was a Debian Wheezy VM, stock kernel (3.2), PVHVM mode, 1G of RAM, 4 vcpus, LVM-backed 8G disk.

Host is an AMD Barcelona (I think), 8 cores, 4G RAM.
The test was "make -C xen clean && make -j 6 XEN_TARGET_ARCH=x86_64 xen".

Time was measured on the "test controller" machine -- i.e., my dev box, which is not running Xen. (This means there's some potential for timing variance with ssh and the network, but no potential for timing variance due to virtual time issues.)

"Good" (c/s 914e61c):
334.92
312.22
311.21
311.71
315.87

"Bad" (c/s 6b719c3):
326.50
295.77
288.50
296.43
276.66

In the "Good" run I had a vnc display going, whereas in the "bad" run I didn't; that could account for the speed-up. But so far it contradicts the idea of a systematic problem in c/s 6b719c3.

I'm going to try some other combinations as well...

 -George
[And remembering to cc everyone this time]

On Thu, Apr 25, 2013 at 4:20 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> "Good" (c/s 914e61c):
> 334.92
> 312.22
> 311.21
> 311.71
> 315.87
>
> "Bad" (c/s 6b719c3)
> 326.50
> 295.77
> 288.50
> 296.43
> 276.66

Sorry, this is "seconds to complete", lower is better.

 -George
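As a quick cross-check, the two five-run sets above can be averaged with a few lines of Python (an ad-hoc sketch; the numbers are copied verbatim from the mail, the helper itself is not part of any Xen tooling):

```python
# Timings in "seconds to complete" (lower is better), from the mail above.
good = [334.92, 312.22, 311.21, 311.71, 315.87]  # c/s 914e61c
bad = [326.50, 295.77, 288.50, 296.43, 276.66]   # c/s 6b719c3

def mean(xs):
    return sum(xs) / len(xs)

# The "bad" changeset actually completes ~7% faster in this test,
# i.e. no regression is visible in this particular workload.
print(round(mean(good), 1))  # 317.2
print(round(mean(bad), 1))   # 296.8
print(round(mean(good) / mean(bad), 2))  # 1.07
```

This is just the arithmetic behind the observation in the mail that the data "contradicts the idea of a systematic problem" in the accused changeset.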
At 16:20 +0100 on 25 Apr (1366906804), George Dunlap wrote:> On Thu, Apr 4, 2013 at 4:23 PM, Tim Deegan <tim@xen.org> wrote: > > At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote: > >> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote: > >> > >> > On 03/04/13 08:27, Jan Beulich wrote: > >> >>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote: > >> >>> This is a separate problem. IIRC the AMD XP perf issue is caused by the > >> >>> emulation of LAPIC TPR accesses slowing down with Andres''s p2m locking > >> >>> patches. XP doesn''t have ''lazy IRQL'' or support for CR8, so it takes a > >> >>> _lot_ of vmexits for IRQL reads and writes. > >> >> Ah, okay, sorry for mixing this up. But how is this a regression > >> >> then? > >> > > >> > My sense, when I looked at this back whenever that there was much more to this. The XP IRQL updating is a problem, but it''s made terribly worse by the changset in question. It seemed to me like the kind of thing that would be caused by TLB or caches suddenly becoming much less effective. > >> > >> The commit in question does not add p2m mutations, so it doesn''t nuke the NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the problem. Later in the 4.2 cycle we changed the common case to use an rwlock. Does the same perf degradation occur with tip of 4.2? > >> > > > > Yes, 4.2 is definitely slower. A compile test on a 4-vcpu VM that takes > > about 12 minutes before this locking change takes more than 20 minutes > > on the current tip of xen-unstable (I gave up at 22 minutes and rebooted > > to test something else). > > Tim, > > Can you go into a bit more detail about what you complied on what kind of OS?I was compiling on Win XP sp3, 32-bit, 1vcpu, 4G ram. The compile was the Windows DDK sample code. 
As I think I mentioned later, all my measurements are extremely suspect as I was relying on guest wallclock time, and the 'before' case was before the XP wallclock time was fixed. :(

> The VM was a Debian Wheezy VM, stock kernel (3.2), PVHVM mode, 1G of
> RAM, 4 vcpus, LVM-backed 8G disk.

I suspect the TPR access patterns of XP are not seen on linux; it's been known for long enough now that it's super-slow on emulated platforms and AFAIK it was only ever Windows that used the TPR so aggressively anyway.

Tim.
On 04/25/2013 04:46 PM, Tim Deegan wrote:> At 16:20 +0100 on 25 Apr (1366906804), George Dunlap wrote: >> On Thu, Apr 4, 2013 at 4:23 PM, Tim Deegan <tim@xen.org> wrote: >>> At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote: >>>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>> >>>>> On 03/04/13 08:27, Jan Beulich wrote: >>>>>>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote: >>>>>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the >>>>>>> emulation of LAPIC TPR accesses slowing down with Andres''s p2m locking >>>>>>> patches. XP doesn''t have ''lazy IRQL'' or support for CR8, so it takes a >>>>>>> _lot_ of vmexits for IRQL reads and writes. >>>>>> Ah, okay, sorry for mixing this up. But how is this a regression >>>>>> then? >>>>> >>>>> My sense, when I looked at this back whenever that there was much more to this. The XP IRQL updating is a problem, but it''s made terribly worse by the changset in question. It seemed to me like the kind of thing that would be caused by TLB or caches suddenly becoming much less effective. >>>> >>>> The commit in question does not add p2m mutations, so it doesn''t nuke the NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the problem. Later in the 4.2 cycle we changed the common case to use an rwlock. Does the same perf degradation occur with tip of 4.2? >>>> >>> >>> Yes, 4.2 is definitely slower. A compile test on a 4-vcpu VM that takes >>> about 12 minutes before this locking change takes more than 20 minutes >>> on the current tip of xen-unstable (I gave up at 22 minutes and rebooted >>> to test something else). >> >> Tim, >> >> Can you go into a bit more detail about what you complied on what kind of OS? > > I was compiling on Win XP sp3, 32-bit, 1vcpu, 4G ram. The compile was > the Windows DDK sample code. 
>
> As I think I mentioned later, all my measurements are extremely suspect
> as I was relying on guest wallclock time, and the 'before' case was
> before the XP wallclock time was fixed. :(
>
>> The VM was a Debian Wheezy VM, stock kernel (3.2), PVHVM mode, 1G of
>> RAM, 4 vcpus, LVM-backed 8G disk.
>
> I suspect the TPR access patterns of XP are not seen on linux; it's been
> known for long enough now that it's super-slow on emulated platforms and
> AFAIK it was only ever Windows that used the TPR so aggressively anyway.

Right. IIRC w2k3 sp2 has the "lazy tpr" feature, so if I can get consistent results with that one then we can say... well, we can at least say it's not easy to reproduce. :-)

 -George
Peter Maloney
2013-Apr-28 10:18 UTC
Re: Xen 4.3 development update / winxp AMD performance regression
On 04/25/2013 04:24 PM, Andres Lagar-Cavilla wrote:> On Apr 25, 2013, at 10:00 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote: > >> On 04/25/2013 02:51 PM, Pasi Kärkkäinen wrote: >>> On Wed, Apr 03, 2013 at 11:34:13AM -0400, Andres Lagar-Cavilla wrote: >>>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>> >>>>> On 03/04/13 08:27, Jan Beulich wrote: >>>>>>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote: >>>>>>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote: >>>>>>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote: >>>>>>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7 >>>>>>>>> owner: ? >>>>>>>>> Reference: http://marc.info/?l=xen-devel&m=135075376805215 >>>>>>>> This is supposedly fixed with the RTC changes Tim committed the >>>>>>>> other day. Suravee, is that correct? >>>>>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the >>>>>>> emulation of LAPIC TPR accesses slowing down with Andres''s p2m locking >>>>>>> patches. XP doesn''t have ''lazy IRQL'' or support for CR8, so it takes a >>>>>>> _lot_ of vmexits for IRQL reads and writes. >>>>>> Ah, okay, sorry for mixing this up. But how is this a regression >>>>>> then? >>>>> My sense, when I looked at this back whenever that there was much more to this. The XP IRQL updating is a problem, but it''s made terribly worse by the changset in question. It seemed to me like the kind of thing that would be caused by TLB or caches suddenly becoming much less effective. >>>> The commit in question does not add p2m mutations, so it doesn''t nuke the NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the problem. Later in the 4.2 cycle we changed the common case to use an rwlock. Does the same perf degradation occur with tip of 4.2? >>>> >>> Adding Peter to CC who reported the original winxp performance problem/regression on AMD. 
>>>
>>> Peter: Can you try Xen 4.2.2 please and report if it has the performance problem or not?
>>
>> Do you want to compare 4.2.2 to 4.2.1, or 4.3?
>>
>> The changeset in question was included in the initial release of 4.2, so unless you think it's been fixed since, I would expect 4.2 to have this regression.
>
> I believe you will see this 4.2 onwards. 4.2 includes the rwlock optimization. Nothing has been added to the tree in that regard recently.
>
> Andres

Bad news... It is very slow still. With 7 vcpus, it took very long to get to the login screen; then I hit the login button at 10:30:30, and at 10:32:10 I could watch my icons starting to appear one by one very slowly. When the icons are all there, I see a blue bar instead of the taskbar. At 10:32:47 the taskbar finally looks normal, but the systray is still empty. I clicked the start menu at 10:33:40 (still empty systray). At 10:33:54, the start menu opened. At 10:34:20, the first systray icon appeared. At 10:36 I managed to get Task Manager loaded, and it shows 88-95% CPU usage on 7 cpus, but doesn't show any processes using much. (xming using 16, System using 11, taskmgr.exe using 9, CCC.exe using 5, explorer and services using 4%, etc.) xm top shows the domain at 646.9% CPU.

xentop - 10:39:37   Xen 4.2.2
2 domains: 2 running, 0 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 16757960k total, 12768800k used, 3989160k free    CPUs: 8 @ 4499MHz
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
  Domain-0 -----r       1184   25.4    8344320   49.8    8388608      50.1     8    0        0        0    0        0        0        0          0          0    0
windowsxp2 -----r       3853  587.3    4197220   25.0    4198400      25.1     7    1      392       20    1        0    17657     4661     806510      58398    0

(about 8% of the dom0 stuff is the qemu-dm process, and the rest is unrelated things)

And this was expected, since I already tested 4.2.1, and you said that this fix should be in 4.2 onwards, so I would have already tested it in 4.2.1.
Here's xm vcpu-list:

Name                              ID  VCPU   CPU State   Time(s) CPU Affinity
Domain-0                           0     0     2   -b-     461.7 any cpu
Domain-0                           0     1     4   -b-     340.1 any cpu
Domain-0                           0     2     5   -b-     182.8 any cpu
Domain-0                           0     3     3   -b-      84.9 any cpu
Domain-0                           0     4     2   -b-      67.5 any cpu
Domain-0                           0     5     2   r--      62.5 any cpu
Domain-0                           0     6     3   -b-      44.3 any cpu
Domain-0                           0     7     2   -b-      46.5 any cpu
windowsxp2                         3     0     5   r--     755.4 any cpu
windowsxp2                         3     1     1   r--     688.1 any cpu
windowsxp2                         3     2     3   r--     702.6 any cpu
windowsxp2                         3     3     7   r--     723.4 any cpu
windowsxp2                         3     4     6   r--     724.7 any cpu
windowsxp2                         3     5     0   r--     725.0 any cpu
windowsxp2                         3     6     4   r--     821.3 any cpu

Here's dmesg, just to see the version:

 __  __            _  _    ____    ____
 \ \/ /___ _ __   | || |  |___ \  |___ \
  \  // _ \ '_ \  | || |_   __) |   __) |
  /  \  __/ | | | |__   _| / __/ _ / __/
 /_/\_\___|_| |_|   |_|(_)_____(_)_____|

(XEN) Xen version 4.2.2 (root@site) (gcc (SUSE Linux) 4.7.1 20120723 [gcc-4_7-branch revision 189773]) Sun Apr 28 00:16:04 CEST 2013
(XEN) Latest ChangeSet: unavailable
(XEN) Bootloader: GRUB2 2.00
(XEN) Command line: dom0_mem=8192M,max:8192M iommu=1 loglvl=all guest_loglvl=all

Here's the dmesg after the domU boots:

(XEN) HVM2: Press F12 for boot menu.
(XEN) HVM2:
(XEN) HVM2: Booting from Hard Disk...
(XEN) HVM2: Booting from 0000:7c00
(XEN) HVM3: HVM Loader
(XEN) HVM3: Detected Xen v4.2.2
(XEN) HVM3: Xenbus rings @0xfeffc000, event channel 9
(XEN) HVM3: System requested ROMBIOS
(XEN) HVM3: CPU speed is 4500 MHz
(XEN) irq.c:270: Dom3 PCI link 0 changed 0 -> 5
(XEN) HVM3: PCI-ISA link 0 routed to IRQ5
(XEN) irq.c:270: Dom3 PCI link 1 changed 0 -> 10
(XEN) HVM3: PCI-ISA link 1 routed to IRQ10
(XEN) irq.c:270: Dom3 PCI link 2 changed 0 -> 11
(XEN) HVM3: PCI-ISA link 2 routed to IRQ11
(XEN) irq.c:270: Dom3 PCI link 3 changed 0 -> 5
(XEN) HVM3: PCI-ISA link 3 routed to IRQ5
(XEN) HVM3: pci dev 01:2 INTD->IRQ5
(XEN) HVM3: pci dev 01:3 INTA->IRQ10
(XEN) HVM3: pci dev 03:0 INTA->IRQ5
(XEN) HVM3: pci dev 04:0 INTA->IRQ5
(XEN) HVM3: pci dev 05:0 INTA->IRQ10
(XEN) HVM3: pci dev 02:0 bar 10 size 02000000: f0000008
(XEN) HVM3: pci dev 03:0 bar 14 size 01000000: f2000008
(XEN) HVM3: pci dev 02:0 bar 14 size 00001000: f3000000
(XEN) HVM3: pci dev 03:0 bar 10 size 00000100: 0000c001
(XEN) HVM3: pci dev 04:0 bar 10 size 00000100: 0000c101
(XEN) HVM3: pci dev 04:0 bar 14 size 00000100: f3001000
(XEN) HVM3: pci dev 05:0 bar 10 size 00000100: 0000c201
(XEN) HVM3: pci dev 01:2 bar 20 size 00000020: 0000c301
(XEN) HVM3: pci dev 01:1 bar 20 size 00000010: 0000c321
(XEN) HVM3: Multiprocessor initialisation:
(XEN) HVM3:  - CPU0 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3:  - CPU1 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3:  - CPU2 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3:  - CPU3 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3:  - CPU4 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3:  - CPU5 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3:  - CPU6 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3: Testing HVM environment:
(XEN) HVM3:  - REP INSB across page boundaries ... passed
(XEN) HVM3:  - GS base MSRs and SWAPGS ... passed
(XEN) HVM3: Passed 2 of 2 tests
(XEN) HVM3: Writing SMBIOS tables ...
(XEN) HVM3: Loading ROMBIOS ...
(XEN) HVM3: 12604 bytes of ROMBIOS high-memory extensions:
(XEN) HVM3:   Relocating to 0xfc001000-0xfc00413c ... done
(XEN) HVM3: Creating MP tables ...
(XEN) HVM3: Loading Cirrus VGABIOS ...
(XEN) HVM3: Loading PCI Option ROM ...
(XEN) HVM3:  - Manufacturer: http://ipxe.org
(XEN) HVM3:  - Product name: iPXE
(XEN) HVM3: Option ROMs:
(XEN) HVM3:  c0000-c8fff: VGA BIOS
(XEN) HVM3:  c9000-d8fff: Etherboot ROM
(XEN) HVM3: Loading ACPI ...
(XEN) HVM3: vm86 TSS at fc010200
(XEN) HVM3: BIOS map:
(XEN) HVM3:  f0000-fffff: Main BIOS
(XEN) HVM3: E820 table:
(XEN) HVM3:  [00]: 00000000:00000000 - 00000000:0009e000: RAM
(XEN) HVM3:  [01]: 00000000:0009e000 - 00000000:000a0000: RESERVED
(XEN) HVM3:  HOLE: 00000000:000a0000 - 00000000:000e0000
(XEN) HVM3:  [02]: 00000000:000e0000 - 00000000:00100000: RESERVED
(XEN) HVM3:  [03]: 00000000:00100000 - 00000000:f0000000: RAM
(XEN) HVM3:  HOLE: 00000000:f0000000 - 00000000:fc000000
(XEN) HVM3:  [04]: 00000000:fc000000 - 00000001:00000000: RESERVED
(XEN) HVM3:  [05]: 00000001:00000000 - 00000001:10000000: RAM
(XEN) HVM3: Invoking ROMBIOS ...
(XEN) HVM3: $Revision: 1.221 $ $Date: 2008/12/07 17:32:29 $
(XEN) stdvga.c:147:d3 entering stdvga and caching modes
(XEN) HVM3: VGABios $Id: vgabios.c,v 1.67 2008/01/27 09:44:12 vruppert Exp $
(XEN) HVM3: Bochs BIOS - build: 06/23/99
(XEN) HVM3: $Revision: 1.221 $ $Date: 2008/12/07 17:32:29 $
(XEN) HVM3: Options: apmbios pcibios eltorito PMM
(XEN) HVM3:
(XEN) HVM3: ata0-0: PCHS=16383/16/63 translation=lba LCHS=1024/255/63
(XEN) HVM3: ata0 master: QEMU HARDDISK ATA-7 Hard-Disk (40960 MBytes)
(XEN) HVM3: IDE time out
(XEN) HVM3:
(XEN) HVM3:
(XEN) HVM3:
(XEN) HVM3: Press F12 for boot menu.
(XEN) HVM3:
(XEN) HVM3: Booting from Hard Disk...
(XEN) HVM3: Booting from 0000:7c00
(XEN) HVM3: int13_harddisk: function 15, unmapped device for ELDL=81
(XEN) HVM3: *** int 15h function AX=e980, BX=007e not yet supported!
(XEN) irq.c:270: Dom3 PCI link 0 changed 5 -> 0
(XEN) irq.c:270: Dom3 PCI link 1 changed 10 -> 0
(XEN) irq.c:270: Dom3 PCI link 2 changed 11 -> 0
(XEN) irq.c:270: Dom3 PCI link 3 changed 5 -> 0
(XEN) grant_table.c:1237:d3 Expanding dom (3) grant table from (4) to (32) frames.
(XEN) irq.c:375: Dom3 callback via changed to GSI 28
(XEN) stdvga.c:151:d3 leaving stdvga
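For anyone wanting to script the kind of check Peter did above, the per-domain row of xentop's batch output can be split on whitespace (a rough sketch; the row string is copied from the xentop output earlier in this mail, and the field positions are taken from its header line):

```python
# Parse one xentop domain row. Column order per the header:
# NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS ...
row = ("windowsxp2 -----r 3853 587.3 4197220 25.0 "
       "4198400 25.1 7 1 392 20 1 0 17657 4661 806510 58398 0")

fields = row.split()
name, state = fields[0], fields[1]
cpu_pct = float(fields[3])   # percent of one physical CPU
vcpus = int(fields[8])

# 587.3% across 7 vcpus is ~84% per vcpu -- all of them spinning.
print(name, cpu_pct, vcpus, round(cpu_pct / vcpus, 1))
```

Splitting on whitespace like this assumes no spaces in domain names; it is just a convenience for watching CPU% over time rather than eyeballing the xentop screen.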
George Dunlap
2013-Apr-29 09:01 UTC
Re: Xen 4.3 development update / winxp AMD performance regression
On 28/04/13 11:18, Peter Maloney wrote:> On 04/25/2013 04:24 PM, Andres Lagar-Cavilla wrote: >> On Apr 25, 2013, at 10:00 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> >>> On 04/25/2013 02:51 PM, Pasi Kärkkäinen wrote: >>>> On Wed, Apr 03, 2013 at 11:34:13AM -0400, Andres Lagar-Cavilla wrote: >>>>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>>> >>>>>> On 03/04/13 08:27, Jan Beulich wrote: >>>>>>>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote: >>>>>>>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote: >>>>>>>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@eu.citrix.com> wrote: >>>>>>>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7 >>>>>>>>>> owner: ? >>>>>>>>>> Reference: http://marc.info/?l=xen-devel&m=135075376805215 >>>>>>>>> This is supposedly fixed with the RTC changes Tim committed the >>>>>>>>> other day. Suravee, is that correct? >>>>>>>> This is a separate problem. IIRC the AMD XP perf issue is caused by the >>>>>>>> emulation of LAPIC TPR accesses slowing down with Andres''s p2m locking >>>>>>>> patches. XP doesn''t have ''lazy IRQL'' or support for CR8, so it takes a >>>>>>>> _lot_ of vmexits for IRQL reads and writes. >>>>>>> Ah, okay, sorry for mixing this up. But how is this a regression >>>>>>> then? >>>>>> My sense, when I looked at this back whenever that there was much more to this. The XP IRQL updating is a problem, but it''s made terribly worse by the changset in question. It seemed to me like the kind of thing that would be caused by TLB or caches suddenly becoming much less effective. >>>>> The commit in question does not add p2m mutations, so it doesn''t nuke the NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the problem. Later in the 4.2 cycle we changed the common case to use an rwlock. Does the same perf degradation occur with tip of 4.2? 
>>>>>
>>>> Adding Peter to CC who reported the original winxp performance problem/regression on AMD.
>>>>
>>>> Peter: Can you try Xen 4.2.2 please and report if it has the performance problem or not?
>>> Do you want to compare 4.2.2 to 4.2.1, or 4.3?
>>>
>>> The changeset in question was included in the initial release of 4.2, so unless you think it's been fixed since, I would expect 4.2 to have this regression.
>> I believe you will see this 4.2 onwards. 4.2 includes the rwlock optimization. Nothing has been added to the tree in that regard recently.
>>
>> Andres
> Bad news... It is very slow still. With 7 vcpus, it took very long to
> get to the login screen, then I hit the login button at 10:30:30 and at
> 10:32:10 I can watch my icons starting to appear one by one very slowly.
> When the icons are all there, I see a blue bar instead of the taskbar.
> 10:32:47 the taskbar looks normal finally, but systray is still empty. I
> clicked the start menu at 10:33:40 (still empty systray). At 10:33:54,
> the start menu opened. At 10:34:20, the first systray icon appeared. At
> 10:36 I managed to get Task manager loaded, and it shows 88-95% CPU
> usage in 7 cpus, but doesn't show any processes using much. (xming using
> 16, System using 11, taskmgr.exe using 9, CCC.exe using 5, explorer and
> services using 4%, etc.) xm top shows the domain at 646.9% CPU.

What guest OS is this again? Windows XP? Do you see the same behavior with other Windows OSes? (e.g., Win7, Win8, w2k3sp2, w2k8?)

If you're really keen, you could do a quick xentrace for me after the VM has mostly booted:

1. Run "xentrace -D -e all -S 32 -T 30 /tmp/[name].trace" on your Xen host
2. Clone and build the following hg repo: http://xenbits.xen.org/ext/xenalyze
3. Run "xenalyze --svm-mode -s [name].trace > [name].summary" and send me the results

 -George
On 04/04/2013 07:05 PM, Tim Deegan wrote:
> At 16:23 +0100 on 04 Apr (1365092601), Tim Deegan wrote:
>> At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote:
>>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>> Yes, 4.2 is definitely slower. A compile test on a 4-vcpu VM that takes
>> about 12 minutes before this locking change takes more than 20 minutes
>> on the current tip of xen-unstable (I gave up at 22 minutes and rebooted
>> to test something else).
> I did a bit of prodding at this, but messed up my measurements in a
> bunch of different ways over the afternoon. :( I'm going to be away
> from my test boxes for a couple of weeks now, so all I can say is, if
> you're investigating this bug, beware that:
>
>  - the revision before this change still has the RTC bugs that were
>    fixed last week, so don't measure performance based on guest
>    wallclock time, or your 'before' perf will look too good.
>  - the current unstable tip has test code to exercise the new
>    map_domain_page(), which will badly affect all the many memory
>    accesses done in HVM emulation, so make sure you use debug=n builds
>    for measurement.
>
> Also, if there is still a bad slowdown, caused by the p2m lookups, this
> might help a little bit:
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 38e87ce..7bd8646 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1361,6 +1361,18 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>          }
>      }
>
> +
> +    /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
> +     * a fast path for LAPIC accesses, skipping the p2m lookup. */
> +    if ( !nestedhvm_vcpu_in_guestmode(v)
> +         && gfn == vlapic_base_address(vcpu_vlapic(current)) >> PAGE_SHIFT )
> +    {
> +        if ( !handle_mmio() )
> +            hvm_inject_hw_exception(TRAP_gp_fault, 0);
> +        rc = 1;
> +        goto out;
> +    }
> +
>      p2m = p2m_get_hostp2m(v->domain);
>      mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
>                                P2M_ALLOC | (access_w ? P2M_UNSHARE : 0), NULL);

This patch (applied to 4.2.2) has a very large improvement on my box (AMD FX-8150) and WinXP 32 bit. It only took about 2.5 minutes to log in and see task manager. It takes about 6 minutes without the patch. And 2.5 minutes is still terrible, but obviously better.

> but in fact, the handle_mmio() will have to do GVA->GFN lookups for its
> %RIP and all its operands, and each of those will involve multiple
> GFN->MFN lookups for the pagetable entries, so if the GFN->MFN lookup
> has got slower, eliminating just the one at the start may not be all
> that great.
>
> Cheers,
>
> Tim.
At 15:21 +0200 on 29 Apr (1367248894), Peter Maloney wrote:
> On 04/04/2013 07:05 PM, Tim Deegan wrote:
> > Also, if there is still a bad slowdown, caused by the p2m lookups, this
> > might help a little bit:
> >
> > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > index 38e87ce..7bd8646 100644
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -1361,6 +1361,18 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
> >          }
> >      }
> >
> > +
> > +    /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
> > +     * a fast path for LAPIC accesses, skipping the p2m lookup. */
> > +    if ( !nestedhvm_vcpu_in_guestmode(v)
> > +         && gfn == vlapic_base_address(vcpu_vlapic(current)) >> PAGE_SHIFT )
> > +    {
> > +        if ( !handle_mmio() )
> > +            hvm_inject_hw_exception(TRAP_gp_fault, 0);
> > +        rc = 1;
> > +        goto out;
> > +    }
> > +
> >      p2m = p2m_get_hostp2m(v->domain);
> >      mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
> >                                P2M_ALLOC | (access_w ? P2M_UNSHARE : 0), NULL);
>
> This patch (applied to 4.2.2) has a very large improvement on my box
> (AMD FX-8150) and WinXP 32 bit.

Hmm - I expected it to be only a mild improvement. How about this one, which puts in the same shortcut in another place as well? I don't think it will be much better than the last one, but it's worth a try.

Tim.

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c8487b8..10b6f6b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1361,6 +1361,17 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
         }
     }
 
+    /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
+     * a fast path for LAPIC accesses, skipping the p2m lookup. */
+    if ( !nestedhvm_vcpu_in_guestmode(v)
+         && gfn == vlapic_base_address(vcpu_vlapic(v)) >> PAGE_SHIFT )
+    {
+        if ( !handle_mmio() )
+            hvm_inject_hw_exception(TRAP_gp_fault, 0);
+        rc = 1;
+        goto out;
+    }
+
     p2m = p2m_get_hostp2m(v->domain);
     mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
                               P2M_ALLOC | (access_w ? P2M_UNSHARE : 0), NULL);
@@ -2471,6 +2482,12 @@ static enum hvm_copy_result __hvm_copy(
         gfn = addr >> PAGE_SHIFT;
     }
 
+    /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
+     * a fast path for LAPIC accesses, skipping the p2m lookup. */
+    if ( !nestedhvm_vcpu_in_guestmode(curr)
+         && gfn == vlapic_base_address(vcpu_vlapic(curr)) >> PAGE_SHIFT )
+        return HVMCOPY_bad_gfn_to_mfn;
+
     page = get_page_from_gfn(curr->domain, gfn, &p2mt, P2M_UNSHARE);
 
     if ( p2m_is_paging(p2mt) )
On Thu, Apr 25, 2013 at 4:20 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:> On Thu, Apr 4, 2013 at 4:23 PM, Tim Deegan <tim@xen.org> wrote: >> At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote: >>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>> >>> > On 03/04/13 08:27, Jan Beulich wrote: >>> >>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xen.org> wrote: >>> >>> This is a separate problem. IIRC the AMD XP perf issue is caused by the >>> >>> emulation of LAPIC TPR accesses slowing down with Andres''s p2m locking >>> >>> patches. XP doesn''t have ''lazy IRQL'' or support for CR8, so it takes a >>> >>> _lot_ of vmexits for IRQL reads and writes. >>> >> Ah, okay, sorry for mixing this up. But how is this a regression >>> >> then? >>> > >>> > My sense, when I looked at this back whenever that there was much more to this. The XP IRQL updating is a problem, but it''s made terribly worse by the changset in question. It seemed to me like the kind of thing that would be caused by TLB or caches suddenly becoming much less effective. >>> >>> The commit in question does not add p2m mutations, so it doesn''t nuke the NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the problem. Later in the 4.2 cycle we changed the common case to use an rwlock. Does the same perf degradation occur with tip of 4.2? >>> >> >> Yes, 4.2 is definitely slower. A compile test on a 4-vcpu VM that takes >> about 12 minutes before this locking change takes more than 20 minutes >> on the current tip of xen-unstable (I gave up at 22 minutes and rebooted >> to test something else). > > Tim, > > Can you go into a bit more detail about what you complied on what kind of OS? > > I just managed to actually find a c/s from which I could build the > tools (git 914e61c), and then compared that with just rebuilding xen > on accused changeset (6b719c3). 
>
> The VM was a Debian Wheezy VM, stock kernel (3.2), PVHVM mode, 1G of
> RAM, 4 vcpus, LVM-backed 8G disk.
>
> Host is an AMD Barcelona (I think), 8 cores, 4G RAM.
>
> The test was "make -C xen clean && make -j 6 XEN_TARGET_ARCH=x86_64 xen".
>
> Time was measured on the "test controller" machine -- i.e., my dev
> box, which is not running Xen. (This means there's some potential for
> timing variance with ssh and the network, but no potential for timing
> variance due to virtual time issues.)
>
> "Good" (c/s 914e61c):
> 334.92
> 312.22
> 311.21
> 311.71
> 315.87
>
> "Bad" (c/s 6b719c3)
> 326.50
> 295.77
> 288.50
> 296.43
> 276.66
>
> In the "Good" run I had a vnc display going, whereas in the "bad" run
> I didn't; that could account for the speed-up. But so far it
> contradicts the idea of a systematic problem in c/s 6b719c3.
>
> I'm going to try some other combinations as well...

BTW, I did the same test with 4.1, 4.2.2-RC2, and a recent xen-unstable tip. Here are all the results, presented in order of the version of Xen tested:

v4.1:
292.35
267.31
270.91
285.81
278.30

"Good" git c/s 914e61c:
334.92
312.22
311.21
311.71
315.87

"Bad" git c/s 6b719c3:
326.50
295.77
288.50
296.43
276.66

4.2.2-rc2:
261.49
250.75
246.82
246.23
247.64

Xen-unstable "recent" master:
267.31
258.49
256.83
250.77
252.36

So overall I think we can say that c/s 6b719c3 didn't cause a general performance regression on AMD HVM guests. I'm in the process of duplicating the exact test which Peter noticed, namely time to boot Windows XP.

 -George
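Averaging each set of five runs makes the ordering across versions easier to see (a quick ad-hoc sketch; the timings are copied verbatim from the mail, and the helper is not part of any Xen tooling):

```python
# Compile-test timings in seconds (lower is better), from the mail above.
runs = {
    "v4.1":            [292.35, 267.31, 270.91, 285.81, 278.30],
    "good 914e61c":    [334.92, 312.22, 311.21, 311.71, 315.87],
    "bad 6b719c3":     [326.50, 295.77, 288.50, 296.43, 276.66],
    "4.2.2-rc2":       [261.49, 250.75, 246.82, 246.23, 247.64],
    "unstable master": [267.31, 258.49, 256.83, 250.77, 252.36],
}

for name, xs in runs.items():
    print(f"{name:16s} mean {sum(xs) / len(xs):6.1f}")
# The "bad" changeset's mean is lower (faster) than the "good" one's, and
# 4.2.2-rc2 is the fastest of all -- consistent with the conclusion that
# c/s 6b719c3 caused no general AMD HVM regression in this workload.
```

The means come out around 278.9, 317.2, 296.8, 250.6, and 257.2 seconds respectively.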
On 02/05/13 16:48, Tim Deegan wrote:> At 15:21 +0200 on 29 Apr (1367248894), Peter Maloney wrote: >> On 04/04/2013 07:05 PM, Tim Deegan wrote: >>> Also, if there is still a bad slowdown, caused by the p2m lookups, this >>> might help a little bit: >>> >>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c >>> index 38e87ce..7bd8646 100644 >>> --- a/xen/arch/x86/hvm/hvm.c >>> +++ b/xen/arch/x86/hvm/hvm.c >>> @@ -1361,6 +1361,18 @@ int hvm_hap_nested_page_fault(paddr_t gpa, >>> } >>> } >>> >>> + >>> + /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs, >>> + * a fast path for LAPIC accesses, skipping the p2m lookup. */ >>> + if ( !nestedhvm_vcpu_in_guestmode(v) >>> + && gfn == vlapic_base_address(vcpu_vlapic(current)) >> PAGE_SHIFT ) >>> + { >>> + if ( !handle_mmio() ) >>> + hvm_inject_hw_exception(TRAP_gp_fault, 0); >>> + rc = 1; >>> + goto out; >>> + } >>> + >>> p2m = p2m_get_hostp2m(v->domain); >>> mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, >>> P2M_ALLOC | (access_w ? P2M_UNSHARE : 0), NULL); >> This patch (applied to 4.2.2) has a very large improvement on my box >> (AMD FX-8150) and WinXP 32 bit. > Hmm - I expected it to be only a mild improvement. How about this one, > which puts in the same shortcut in another place as well? I don''t think > it will be much better than the last one, but it''s worth a try.So I dusted off my old perf testing scripts and added in one to measure boot performance. Below are boot times, from after "xl create" returns, until a specific python daemon running in the VM starts responding to requests. So lower is better. There are a number of places where there can be a few seconds of noise either way, but on the whole the tests seem fairly repeatable. I ran this with w2k3eesp2 and with winxpsp3, using some of the auto-install test images made for the XenServer regression testing. All of them are using a flat file disk backend with qemu-traditional. 
Results are in order of commits:

Xen 4.1:
  w2k3:  43 34 34 33 34
  winxp: 110 111 111 110 112

Xen 4.2:
  w2k3:  34 44 45 45 45
  winxp: 203 221 210 211 200

Xen-unstable w/ RTC fix:
  w2k3:  43 44 44 45 44
  winxp: 268 275 265 276 265

Xen-unstable with rtc fix + this "fast lapic" patch:
  w2k3:  43 45 44 45 45
  winxp: 224 232 232 232 232

So w2k3 boots fairly quickly anyway; it has a 50% slow-down when moving from 4.1 to 4.2, and no discernible change after that.

winxp boots fairly slowly; its boot time nearly doubles for 4.2, and gets even worse for xen-unstable. The patch is a measurable improvement, but still nowhere near 4.1, or even 4.2.

On the whole however -- I'm not sure that boot time by itself is a blocker. If the problem really is primarily the "eager TPR" issue for Windows XP, then I'm not terribly motivated either: the Citrix PV drivers patch Windows XP to modify the routine to be lazy (like w2k3); there is hardware available which allows the TPR to be virtualized; and there are plenty of Windows-based OSes available which do not have this problem.

I'll be doing some more workload-based benchmarks (probably starting with the Windows ddk example build) to see if there are other issues I turn up.

 -George
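To put rough numbers on "nearly doubles": averaging the winxp rows gives these boot-time means and ratios versus 4.1 (an ad-hoc sketch; the timings are copied from the figures above):

```python
# winxp boot times in seconds (lower is better), from the results above.
winxp = {
    "4.1":                   [110, 111, 111, 110, 112],
    "4.2":                   [203, 221, 210, 211, 200],
    "unstable + rtc fix":    [268, 275, 265, 276, 265],
    "unstable + fast lapic": [224, 232, 232, 232, 232],
}

base = sum(winxp["4.1"]) / 5
for name, xs in winxp.items():
    m = sum(xs) / 5
    print(f"{name:22s} mean {m:6.1f}  x{m / base:.2f} vs 4.1")
# 4.2 takes ~1.9x the 4.1 boot time; the fast-lapic patch claws back part
# of the further xen-unstable slowdown but still sits around 2.1x of 4.1.
```

This is only the arithmetic behind the prose summary; it doesn't account for the few seconds of noise per run mentioned in the mail.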
At 17:41 +0100 on 03 May (1367602895), George Dunlap wrote:
> winxp boots fairly slowly; its boot time nearly doubles for 4.2, and
> gets even worse for xen-unstable. The patch is a measurable
> improvement, but still nowhere near 4.1, or even 4.2.

Ergh. :(

> On the whole, however, I'm not sure that boot time by itself is a
> blocker. If the problem really is primarily the "eager TPR" issue for
> Windows XP, then I'm not terribly motivated either: the Citrix PV
> drivers patch Windows XP to modify the routine to be lazy (like w2k3);
> there is hardware available which allows the TPR to be virtualized;

One reason I was chasing this is that the AMD hardware acceleration for
TPR (via the CR8 register) needs software changes in the OS to make use
of it (which XP doesn't have). The Intel acceleration works fine for XP,
AFAICT.

Tim.

> and there are plenty of Windows-based OSes available which do not have
> this problem.
On Fri, May 03, 2013 at 05:41:35PM +0100, George Dunlap wrote:
> On 02/05/13 16:48, Tim Deegan wrote:
> [boot-time measurements trimmed]
>
> On the whole, however, I'm not sure that boot time by itself is a
> blocker. If the problem really is primarily the "eager TPR" issue
> for Windows XP, then I'm not terribly motivated either: the Citrix
> PV drivers patch Windows XP to modify the routine to be lazy (like
> w2k3); there is hardware available which allows the TPR to be
> virtualized; and there are plenty of Windows-based OSes available
> which do not have this problem.

A couple of questions:

- Does the Citrix XenServer Windows PV driver work with vanilla Xen
  4.2.x? I remember someone complaining on the list that it doesn't
  work (but I'm not sure about that).

- Does GPLPV do the lazy patching for WinXP on AMD?

-- Pasi
On Fri, May 3, 2013 at 5:41 PM, George Dunlap
<george.dunlap@eu.citrix.com> wrote:
> I'll be doing some more workload-based benchmarks (probably starting
> with the Windows ddk example build) to see if there are other issues I
> turn up.

So here are my results with ddk-build for Windows 2003 (which again has
the "lazy IRQL" feature, and so isn't impacted as hard by the extra
exit-processing time). It's a "time to complete" test, so lower is
better. (I recommend ignoring the first run, as it will be warming up
the disk cache.)

Xen 4.1:            223 167 167 170 165
Xen 4.2:            216 140 145 145 150
Xen-unstable:       227 200 190 200 210
Xen-unstable+lapic: 246 175 175 180 175

So it appears that there has been a regression from 4.1 to unstable; but
since 4.2 is actually significantly *better* than 4.1, it's probably not
related to the c/s we've been discussing. In any case, the lapic patch
seems to give a measurable advantage, so it's probably worth putting in.

I'm going to try the same builds on an Intel box and see what we get as
well. Not sure XP is worth doing, as each build is going to take
forever...

 -George
On Mon, Apr 29, 2013 at 03:21:34PM +0200, Peter Maloney wrote:
> On 04/04/2013 07:05 PM, Tim Deegan wrote:
> > At 16:23 +0100 on 04 Apr (1365092601), Tim Deegan wrote:
> >> At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote:
> >>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> >> Yes, 4.2 is definitely slower. A compile test on a 4-vcpu VM that
> >> takes about 12 minutes before this locking change takes more than 20
> >> minutes on the current tip of xen-unstable (I gave up at 22 minutes
> >> and rebooted to test something else).
> > I did a bit of prodding at this, but messed up my measurements in a
> > bunch of different ways over the afternoon. :( I'm going to be away
> > from my test boxes for a couple of weeks now, so all I can say is, if
> > you're investigating this bug, beware that:
> >
> >  - the revision before this change still has the RTC bugs that were
> >    fixed last week, so don't measure performance based on guest
> >    wallclock time, or your 'before' perf will look too good.
> >  - the current unstable tip has test code to exercise the new
> >    map_domain_page(), which will badly affect all the many memory
> >    accesses done in HVM emulation, so make sure you use debug=n
> >    builds for measurement.
> >
> > Also, if there is still a bad slowdown, caused by the p2m lookups,
> > this might help a little bit:
> >
> > [fast-lapic patch to hvm_hap_nested_page_fault() trimmed]
>
> This patch (applied to 4.2.2) has a very large improvement on my box
> (AMD FX-8150) and WinXP 32 bit.
>
> It only took about 2.5 minutes to log in and see task manager. It takes
> about 6 minutes without the patch. And 2.5 minutes is still terrible,
> but obviously better.

So is the problem only on WinXP with "booting up / logging in to
Windows", or do you see performance regressions on some actual benchmark
tools as well (after Windows has started up)?

-- Pasi
On 04/05/13 11:47, Pasi Kärkkäinen wrote:
> On Fri, May 03, 2013 at 05:41:35PM +0100, George Dunlap wrote:
> [boot-time results and discussion trimmed]
>
> A couple of questions:
>
> - Does the Citrix XenServer Windows PV driver work with vanilla Xen
>   4.2.x? I remember someone complaining on the list that it doesn't
>   work (but I'm not sure about that).

I did a quick test of the XS 6.0.2 drivers on unstable and they didn't
work. Didn't do any debugging, however.

> - Does GPLPV do the lazy patching for WinXP on AMD?

I highly doubt it, but you'd have to ask James Harper.

 -George
On 07/05/13 14:56, Pasi Kärkkäinen wrote:
> On Mon, Apr 29, 2013 at 03:21:34PM +0200, Peter Maloney wrote:
> [quoted fast-lapic patch discussion trimmed]
>
> So is the problem only on WinXP with "booting up / logging in to
> Windows", or do you see performance regressions on some actual
> benchmark tools as well (after Windows has started up)?

For the sake of people watching this thread: the last 4-5 mails I've
sent to Peter Maloney have bounced with "Mailbox Full" messages, so it's
possible he's not actually hearing this part of the discussion...

 -George
On Tue, May 7, 2013 at 2:15 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> So here are my results with ddk-build for Windows 2003 [...] It's a
> "time to complete" test, so lower is better.
>
> Xen 4.1:            223 167 167 170 165
> Xen 4.2:            216 140 145 145 150
> Xen-unstable:       227 200 190 200 210
> Xen-unstable+lapic: 246 175 175 180 175

If anyone's interested, the numbers on my Intel box (which I believe
does have the vlapic support) are:

Xen 4.1:        110 70 65 70 70
Xen 4.2:        110 70 65 65 65
unstable:       115 70 70 70 71
unstable+lapic:  75 65 65 65 65

There seems to be a bit of a quantization effect, so I'm not sure I
would read much into the differences in the results here, except to
conclude that the fast lapic patch doesn't seem to hurt Intel. It
should, however, reduce suspicion of other things which may have changed
(e.g. regressions in qemu, &c).

 -George
> > A couple of questions:
> >
> > - Does the Citrix XenServer Windows PV driver work with vanilla Xen
> >   4.2.x? I remember someone complaining on the list that it doesn't
> >   work (but I'm not sure about that).
>
> I did a quick test of the XS 6.0.2 drivers on unstable and they didn't
> work. Didn't do any debugging, however.
>
> > - Does GPLPV do the lazy patching for WinXP on AMD?
>
> I highly doubt it, but you'd have to ask James Harper.

GPLPV does do some TPR patching. You need to add the /PATCHTPR option to
your boot.ini. It works for 2000 as well (and 2003 before MS stopped
using TPR at all in SP2), if anyone cares :)

For AMD, TPR access is changed to a LOCK MOV CR8 instruction, which
enables setting of TPR without a VMEXIT. For Intel, TPR writes are only
done if they would change the value of TPR, and reads are always done
from a cached value. I guess this is what you mean by 'lazy'.

I think Xen itself does TPR optimisation for Intel these days, so this
may be unnecessary.

It certainly makes a big difference for XP.

James
On 07/05/13 23:23, James Harper wrote:
> GPLPV does do some TPR patching. You need to add the /PATCHTPR option
> to your boot.ini. It works for 2000 as well (and 2003 before MS
> stopped using TPR at all in SP2), if anyone cares :)
>
> For AMD, TPR access is changed to a LOCK MOV CR8 instruction, which
> enables setting of TPR without a VMEXIT. For Intel, TPR writes are
> only done if they would change the value of TPR, and reads are always
> done from a cached value. I guess this is what you mean by 'lazy'.
>
> I think Xen itself does TPR optimisation for Intel these days, so this
> may be unnecessary.
>
> It certainly makes a big difference for XP.

Well, the context of this thread is a set of changes that makes the
non-lazy TPR exits *much, much* more expensive on AMD hardware. The
existence of a widely-available set of drivers as a work-around would be
a pretty important factor in how we decide to proceed.

So if I just download your latest drivers and add /PATCHTPR to boot.ini,
the AMD TPR patching should work?

 -George
> > The gplpv-no-patchtpr times are comparable to the times booting
> > without gplpv drivers. So the patchtpr seems to work pretty well.
> > This is WinXP SP3, with whatever version we use for testing in
> > XenServer. Possible, as you say, that newer patches break things.
>
> Hmm, but booting the same image on an Intel box (with /patchtpr)
> causes the VM to crash at boot. :-( Seems to work fine w/o the switch
> though.

What sort of crash are you getting? I managed to round up an Intel box
and tested it, and XP blows up before it even gets a chance to log
anything to /var/log/xen/qemu-dm-<domu>.log, even with no /patchtpr; but
my Intel box is running 4.2.0 (from Debian experimental) while my AMD
box is running 4.1.2 (from Debian wheezy), so maybe there is something
in that.

Did I see a thread about 2003 failing recently? Win 2012 seems to work
just fine.

James