thr3ads.net - Xen devel - [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap [Jul 2007]


Huang, Xinmei

2007-Jul-28 07:28 UTC

[Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

[This email is either empty or too large to be displayed at this time]

Huang, Xinmei

2007-Jul-28 08:10 UTC

[Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

With the current accelerated VGA for qemu-dm, the guest can access the
LFB directly; however, qemu-dm is not aware of these accesses. The
accompanying task is to determine which range of the LFB has to be
redrawn in the guest display window. Currently qemu-dm maintains a copy
of the LFB and obtains the LFB dirty bitmap through memcmp. This patch
takes another approach: one hypercall instructs the hypervisor to fill
the dirty bitmap. The hypervisor checks the D-bit of the PTEs and
updates the dirty bitmap.
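
As a rough illustration of the memcmp-based method described above (this
is not the actual qemu-dm code), the sketch below compares the live LFB
with a saved copy page by page. The function name, the 4 KiB page size
and the one-bit-per-page bitmap layout are assumptions made for this
example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* assumed page size for this sketch */

/*
 * Hypothetical helper: compare the live framebuffer with the saved copy
 * one page at a time.  Bit i of bitmap is set when page i differs; the
 * copy is refreshed for every page that changed.  Returns the number of
 * dirty pages found.
 */
static size_t lfb_dirty_by_memcmp(const uint8_t *lfb, uint8_t *copy,
                                  size_t npages, uint8_t *bitmap)
{
    size_t i, dirty = 0;

    assert(lfb && copy && bitmap);
    memset(bitmap, 0, (npages + 7) / 8);

    for (i = 0; i < npages; i++) {
        const uint8_t *src = lfb + i * PAGE_SIZE;
        uint8_t *dst = copy + i * PAGE_SIZE;

        if (memcmp(src, dst, PAGE_SIZE) != 0) {
            memcpy(dst, src, PAGE_SIZE);   /* keep the copy in sync */
            bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            dirty++;
        }
    }
    return dirty;
}
```

Every scan touches the whole LFB and its copy, which is the cost the
hypercall-based approach tries to avoid.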

Theoretically, the overhead of the memcmp-based method depends on the
graphics-intensive workload, more precisely on the probability
distribution of the addresses of LFB writes, whereas the overhead of the
hypercall-based method is relatively stable. The hypercall and the LFB
L1 pagetable walk account for the overhead of the latter.

Normal shadow pagetables can be reclaimed, i.e. the L1 shadow for the
LFB can disappear, which causes some issues. One approach is to keep all
the shadow pagetables for the LFB pinned. That is complicated, as only
the top-level pagetable is pinnable in the current shadow code (I tried
but failed). I'm not sure this LFB shadow pinning approach would bring
sufficient performance benefit. This patch simply returns an all-dirty
bitmap when the L1 shadow for the LFB changes. This approach avoids a
complicated tracking mechanism for shadows at the cost of some
unnecessary redrawing by qemu-dm. The ideal solution might be an optimum
point between the two.

 

I ran some tests to show the benefit of this patch: DB + n idle winxp
guests.

linux/DB guest: running sysbench/DB, 2 vcpus, 512M
winxp guest:    2 vcpus, 128M (8M shadow)

The test results show that this patch benefits our bottleneck, dom0, and
system scalability, besides qemu-dm itself.

DB throughput:
1. w/o patch -- 49% degradation with 8 winxp guests and 34% degradation
   with 4 winxp guests
2. w/ patch  -- <2% degradation with 8 winxp guests and <1% degradation
   with 4 winxp guests

The following two charts show the CPU utilization scatter of dom0 w/ and
w/o the patch.

[Two charts: dom0 CPU utilization scatter, w/ and w/o patch]

- Xinmei

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Tim Deegan

2007-Jul-30 08:58 UTC

Re: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

Hi,

At 16:10 +0800 on 28 Jul (1185639053), Huang, Xinmei wrote:
> With the current accelerated VGA for qemu-dm, the guest can access the
> LFB directly; however, qemu-dm is not aware of these accesses. The
> accompanying task is to determine which range of the LFB has to be
> redrawn in the guest display window. Currently qemu-dm maintains a copy
> of the LFB and obtains the LFB dirty bitmap through memcmp. This patch
> takes another approach: one hypercall instructs the hypervisor to fill
> the dirty bitmap. The hypervisor checks the D-bit of the PTEs and
> updates the dirty bitmap.

Thanks for this -- those numbers look very good!

The shadow-code modifications seem to have a lot of moving parts,
though.  Since we expect that the guest will have a single, contiguous,
kernel-mode mapping of the LFB, we should be able to do this with less
administration:

 - Figure out the VA of the writable mapping of the LFB.
 - When asked for the bitmap, walk the shadow linear page tables of the
   area, recording and clearing the _PAGE_DIRTY bits.  If you see a PTE
   pointing at the wrong place, back off and tell qemu to try the slow
   way.  If you see an LFB mfn with a writeable count > 1, either give
   up or assume it's dirty.  (If you take a page-fault, then the guest
   has marked his writeable mapping of the LFB non-writable at a higher
   level -- probably just back off at that point).
 - When a shadow PTE pointing at the LFB is made or cleared, set the bit
   in the bitmap.

That involves a single equality test in sh_page_fault() to spot the VA,
a few lines in shadow_set_l1e() to spot new/departing mappings, and
almost everything else can happen in one routine that reads/writes the
linear pagetables with a single for() loop.
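
To make the bitmap-generation step above concrete, here is a minimal,
self-contained sketch that operates on a plain array standing in for the
shadow L1 entries. It is not Xen code: lfb_scan_dirty(), the lfb_mfn
array and the bitmap layout are hypothetical names chosen for the
example; only the PTE bit positions are the architectural x86 ones. It
uses two passes rather than one loop so that it can back off without
having cleared any dirty bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 PTE flag bits (architectural positions). */
#define PTE_PRESENT (1ull << 0)
#define PTE_DIRTY   (1ull << 6)
#define PTE_FRAME(pte) (((pte) & 0x000ffffffffff000ull) >> 12)

/*
 * Hypothetical helper: ptes[i] is the shadow L1 entry expected to map
 * LFB page i, and lfb_mfn[i] is the frame that page lives in (the
 * frames need not be contiguous).
 *
 * Returns 0 on success, or -1 if a present PTE points somewhere else;
 * in that case nothing is modified and the caller should fall back to
 * the slow path.  Dirty pages are ORed into bitmap so that bits set
 * elsewhere (e.g. when a mapping is made or cleared) are preserved.
 */
static int lfb_scan_dirty(uint64_t *ptes, const uint64_t *lfb_mfn,
                          size_t npages, uint8_t *bitmap)
{
    size_t i;

    assert(ptes && lfb_mfn && bitmap);

    /* Pass 1: validate before touching anything. */
    for (i = 0; i < npages; i++)
        if ((ptes[i] & PTE_PRESENT) && PTE_FRAME(ptes[i]) != lfb_mfn[i])
            return -1;

    /* Pass 2: record and clear the dirty bits. */
    for (i = 0; i < npages; i++) {
        if ((ptes[i] & PTE_PRESENT) && (ptes[i] & PTE_DIRTY)) {
            bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            ptes[i] &= ~PTE_DIRTY;
        }
    }

    /* A real implementation would flush TLBs here so that the CPU
     * sets the D-bit again on the next write. */
    return 0;
}
```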

A few other points:
 - The assumption that the LFB is MFN-contiguous is not valid.  You do
   work around the debug=y allocator's habit of handing out pages
   backwards, but that's there to alert you to the more general
   problem of discontiguous mfns.
 - Since the dirty bits are only one per word, they can be atomically
   cleared without needing locked operations to protect their
   neighbours. That means that you don't need to pause the domain: the
   shadow lock will be enough to keep the operation safe.
 - After clearing the dirty bits, you need to flush TLBs to make sure
   they'll get set again.  VMX guests get their TLBs flushed on every
   VMEXIT at the moment, but that's not true on SVM on some hardware,
   and won't be true on VMX when Intel processors get tagged TLBs.

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
Registered office c/o EC2Y 5EB, UK; company number 05334508


Huang, Xinmei

2007-Jul-30 10:41 UTC

RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

Tim, thanks for your comments; please see my replies inline.
>-----Original Message-----
>From: Tim Deegan [mailto:Tim.Deegan@xensource.com] 
>Sent: July 30, 2007 16:58
>To: Huang, Xinmei
>Cc: xen-devel@lists.xensource.com; Ian.Pratt@cl.cam.ac.uk
>Subject: Re: [Xen-devel] [PATCH]RFC: VGA accleration using 
>shadow PTE D-bit to construct LFB dirty bitmap
>
>Hi,
>
>At 16:10 +0800 on 28 Jul (1185639053), Huang, Xinmei wrote:
>> With the current accelerated VGA for qemu-dm, the guest can access the
>> LFB directly; however, qemu-dm is not aware of these accesses. The
>> accompanying task is to determine which range of the LFB has to be
>> redrawn in the guest display window. Currently qemu-dm maintains a copy
>> of the LFB and obtains the LFB dirty bitmap through memcmp. This patch
>> takes another approach: one hypercall instructs the hypervisor to fill
>> the dirty bitmap. The hypervisor checks the D-bit of the PTEs and
>> updates the dirty bitmap.
>
>Thanks for this -- those numbers look very good!
>
>The shadow-code modifications seem to have a lot of moving parts,
>though.  Since we expect that the guest will have a single, contiguous,
>kernel-mode mapping of the LFB, we should be able to do this with less
>administration:
>
> - Figure out the VA of the writable mapping of the LFB.
> - When asked for the bitmap, walk the shadow linear page tables of the
When current != vcpu-of-guest, can we use this mechanism or map_shadow_page() to
access shadow pages?
>   area, recording and clearing the _PAGE_DIRTY bits.  If you see a PTE
>   pointing at the wrong place, back off and tell qemu to try the slow
>   way.  If you see an LFB mfn with a writeable count > 1, either give
>   up or assume it's dirty.  (If you take a page-fault, then the guest
>   has marked his writeable mapping of the LFB non-writable at a higher
>   level -- probably just back off at that point).
> - When a shadow PTE pointing at the LFB is made or cleared, set the bit
>   in the bitmap.
>
>That involves a single equality test in sh_page_fault() to spot the VA,
Is the guest's VA mapped to the LFB assumed to be invariant?
>a few lines in shadow_set_l1e() to spot new/departing mappings, and
>almost everything else can happen in one routine that reads/writes the
>linear pagetables with a single for() loop.
>
>A few other points:
> - The assumption that the LFB is MFN-contiguous is not valid.  You do
>   work around the debug=y allocator's habit of handing out pages
>   backwards, but that's there to alert you to the more general
>   problem of discontiguous mfns.
> - Since the dirty bits are only one per word, they can be atomically
>   cleared without needing locked operations to protect their
>   neighbours. That means that you don't need to pause the domain: the
>   shadow lock will be enough to keep the operation safe.
> - After clearing the dirty bits, you need to flush TLBs to make sure
>   they'll get set again.  VMX guests get their TLBs flushed on every
>   VMEXIT at the moment, but that's not true on SVM on some hardware,
>   and won't be true on VMX when Intel processors get tagged TLBs.
>
Any more comments on this patch?  Does the Pinned-LFB-shadow idea make sense?

>Cheers,
>
>Tim.
>
>-- 
>Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
>Registered office c/o EC2Y 5EB, UK; company number 05334508
>

Tim Deegan

2007-Jul-30 11:00 UTC

Re: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

Hi,

At 18:41 +0800 on 30 Jul (1185820874), Huang, Xinmei wrote:
> > - Figure out the VA of the writable mapping of the LFB.
> > - When asked for the bitmap, walk the shadow linear page tables of the
>
> When current != vcpu-of-guest, can we use this mechanism or
> map_shadow_page() to access shadow pages?

Ah, yes, we'd need to either walk the guest's shadow pagetable
explicitly or cause the bitmap to be generated by one of the guest's
vcpus (ugh!).

> >That involves a single equality test in sh_page_fault() to spot the VA,
>
> Is the guest's VA mapped to the LFB assumed to be invariant?

Not necessarily.  Every time we see a write fault to a new VA mapping
the LFB we can move it.  (I expect that not to happen very often.)

> Any more comments on this patch?  Does the Pinned-LFB-shadow idea make
> sense?

It would be possible, but at the moment "pinnable" pages are defined
entirely by their type, with this one hack to allow 64bit l3s to be
pinnable if we think we're looking at an old linux kernel.  So you'd
either need another shadow type, or for all users of the up-pointer to
be able to check whether the l1 they're looking at has a vram mapping in
it.

If there is one kernel-mode mapping of the LFB, though, I'd expect it to
be fairly rarely torn down; it would need the shadows of all processes
that were ever running when the kernel touched the video RAM to be
reaped.  So just marking the page dirty when you make or tear down a
vram mapping should be better.

Tim.

-- 
Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
Registered office c/o EC2Y 5EB, UK; company number 05334508


Ian Pratt

2007-Jul-30 12:11 UTC

RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

> If there is one kernel-mode mapping of the LFB, though, I'd
> expect it to be fairly rarely torn down; it would need the
> shadows of all processes that were ever running when the
> kernel touched the video RAM to be reaped.  So just marking
> the page dirty when you make or tear down a vram mapping
> should be better.

Ideally, we'd return 'all-dirty' for a missing shadow page just once,
and then 'all-clean' on subsequent scans until the PT page gets
reshadowed.

I can imagine that if an OS's screen blanker kicked in we might not see
any further writes to the LFB, and this could lead to the shadows
ultimately getting evicted, resulting in pessimal performance if we're
always returning 'all-dirty'.

There's no real need to do this on a page granularity -- if any of our
LFB-mapping shadow pages have been evicted we could evict them all. I'm
not sure this makes things any easier, though.

We can use missing shadows as an optimization: if we return a bitmap
with 'everything clean' a few times in a row, we are probably better off
pro-actively unshadowing the page to avoid even doing the dirty bit
scanning.
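
A minimal sketch of the "few times in a row" check, assuming the
bookkeeping is a simple counter kept next to the bitmap. The struct, the
function name and the idea of passing the threshold in as `limit` are
all hypothetical; nothing here is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for consecutive all-clean scans. */
struct clean_streak {
    unsigned int count;   /* consecutive all-clean scans so far */
    unsigned int limit;   /* streak length that triggers unshadowing */
};

/*
 * Call once per scan with the bitmap that was just returned.
 * Returns 1 when the streak reaches the limit (and resets it), meaning
 * the caller may want to unshadow pro-actively; returns 0 otherwise.
 * Any dirty bit resets the streak.
 */
static int scan_suggests_unshadow(struct clean_streak *s,
                                  const uint8_t *bitmap, size_t nbytes)
{
    size_t i;

    assert(s && bitmap && s->limit > 0);

    for (i = 0; i < nbytes; i++) {
        if (bitmap[i]) {
            s->count = 0;
            return 0;
        }
    }

    if (++s->count < s->limit)
        return 0;

    s->count = 0;
    return 1;
}
```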

Best,
Ian


Tim Deegan

2007-Jul-30 12:26 UTC

Re: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

At 13:11 +0100 on 30 Jul (1185801076), Ian Pratt wrote:
> > If there is one kernel-mode mapping of the LFB, though, I'd
> > expect it to be fairly rarely torn down; it would need the
> > shadows of all processes that were ever running when the
> > kernel touched the video RAM to be reaped.  So just marking
> > the page dirty when you make or tear down a vram mapping
> > should be better.
>
> Ideally, we'd return 'all-dirty' for a missing shadow page just once,
> and then 'all-clean' on subsequent scans until the PT page gets
> reshadowed.

Yes, that's the behaviour you get if you mark the page dirty when you
clear the PTE that mapped it.  It will then get seen once as dirty and
not again until it's mapped again.  (Not-present and read-only PTEs
won't cause the page to be marked dirty because they can't cause it to
have changed.)
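
The behaviour described here can be sketched as a hook that runs
whenever a shadow L1 entry changes. This is an illustration only: the
struct, the function names and the linear search over the LFB frames are
assumptions for the example, not the shadow_set_l1e() code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 PTE flag bits (architectural positions). */
#define PTE_PRESENT (1ull << 0)
#define PTE_RW      (1ull << 1)
#define PTE_FRAME(pte) (((pte) & 0x000ffffffffff000ull) >> 12)

/* Hypothetical description of the LFB; frames need not be contiguous. */
struct lfb_track {
    const uint64_t *mfn;   /* frame of each LFB page */
    size_t npages;
    uint8_t *bitmap;       /* one dirty bit per LFB page */
};

/* Mark the LFB page dirty if pte is a present, writable mapping of it. */
static void lfb_mark_if_writable(struct lfb_track *t, uint64_t pte)
{
    size_t i;

    /* Not-present and read-only entries cannot have changed the page. */
    if ((pte & (PTE_PRESENT | PTE_RW)) != (PTE_PRESENT | PTE_RW))
        return;

    for (i = 0; i < t->npages; i++) {
        if (t->mfn[i] == PTE_FRAME(pte)) {
            t->bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            return;
        }
    }
}

/* To be called when a shadow L1 entry changes from old_pte to new_pte. */
static void lfb_l1e_changed(struct lfb_track *t,
                            uint64_t old_pte, uint64_t new_pte)
{
    assert(t && t->mfn && t->bitmap);
    lfb_mark_if_writable(t, old_pte);   /* departing mapping */
    lfb_mark_if_writable(t, new_pte);   /* new mapping */
}
```

Once the consumer has read and cleared the bitmap, a torn-down mapping
contributes no further dirty bits until the page is mapped again.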

> We can use missing shadows as an optimization: if we return a bitmap
> with 'everything clean' a few times in a row, we are probably better
> off pro-actively unshadowing the page to avoid even doing the dirty bit
> scanning.

True, and easily checked.

Tim.

-- 
Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
Registered office c/o EC2Y 5EB, UK; company number 05334508


Huang, Xinmei

2007-Jul-31 03:10 UTC

RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

>-----Original Message-----
>From: Ian Pratt [mailto:Ian.Pratt@cl.cam.ac.uk] 
>Sent: July 30, 2007 20:11
>To: Tim Deegan; Huang, Xinmei
>Cc: xen-devel@lists.xensource.com; Ian.Pratt@cl.cam.ac.uk
>Subject: RE: [Xen-devel] [PATCH]RFC: VGA accleration using 
>shadow PTE D-bit to construct LFB dirty bitmap
>
>
>> If there is one kernel-mode mapping of the LFB, though, I'd
>> expect it to be fairly rarely torn down; it would need the
>> shadows of all processes that were ever running when the
>> kernel touched the video RAM to be reaped.  So just marking
>> the page dirty when you make or tear down a vram mapping
>> should be better.
>
>Ideally, we'd return 'all-dirty' for a missing shadow page just once,
>and then 'all-clean' on subsequent scans until the PT page gets
>reshadowed.

Agreed. Redrawing is very time-consuming; we should avoid unnecessary
redrawing where possible.

>I can imagine that if an OS's screen blanker kicked in we might not see
>any further writes to the LFB, and this could lead to the shadows
>ultimately getting evicted, resulting in pessimal performance if we're
>always returning 'all-dirty'.
>
>There's no real need to do this on a page granularity -- if any of our
>LFB-mapping shadow pages have been evicted we could evict them all. I'm
>not sure this makes things any easier, though.
>
>We can use missing shadows as an optimization: if we return a bitmap
>with 'everything clean' a few times in a row, we are probably better off
>pro-actively unshadowing the page to avoid even doing the dirty bit
>scanning.

Unshadowing and re-shadowing all the LFB pages is expensive.
Performance would vary with the characteristics of the graphics-intensive
workload and with the value of N, the number of consecutive 'all-clean'
results. I'm not sure this brings sufficient benefit.

>
>Best,
>Ian

-Xinmei


Ian Pratt

2007-Jul-31 07:38 UTC

RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap

> >We can use missing shadows as an optimization: if we return a bitmap
> >with 'everything clean' a few times in a row, we are probably better
> >off pro-actively unshadowing the page to avoid even doing the dirty bit
> >scanning.
>
> Unshadowing and re-shadowing all the LFB pages is expensive.
> Performance would vary with the characteristics of the graphics-intensive
> workload and with the value of N, the number of consecutive 'all-clean'
> results. I'm not sure this brings sufficient benefit.

It's not that expensive, and it would be good to avoid the scanning
altogether for a screen that is updating very rarely (e.g. screen
blanker enabled). You could leave the shadows in place but write protect
the entries in the L2. However, since the shadows are liable to get
evicted in that scenario anyhow it may be an unnecessary complexity.

Ian

