Huang, Xinmei
2007-Jul-28 07:28 UTC
[Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
[This email is either empty or too large to be displayed at this time]
Huang, Xinmei
2007-Jul-28 08:10 UTC
[Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
With the current accelerated VGA in qemu-dm, the guest can access the LFB directly; however, qemu-dm is not aware of these accesses. The accompanying task is to determine which range of the LFB must be redrawn in the guest display window. Currently qemu-dm maintains a copy of the LFB and derives the LFB dirty bitmap through memcmp. This patch takes another approach: a single hypercall instructs the hypervisor to fill the dirty bitmap. The hypervisor checks the D-bit of the PTEs and updates the dirty bitmap.

Theoretically, the overhead of the memcmp-based method depends on the graphics-intensive workload, more exactly on the probability distribution of the addresses of LFB writes, whereas the overhead of the hypercall-based method is relatively stable. The hypercall itself and the walk of the LFB L1 pagetables account for the overhead of the latter.

A normal shadow pagetable can be reclaimed, i.e. the L1 shadow for the LFB can disappear, which causes some issues. One approach is to keep all the shadow pagetables for the LFB pinned. That is complicated, as only the top-level pagetable is pinnable in the current shadow code (I'd tried but failed). I'm not sure that pinning the shadows for the LFB would bring sufficient performance benefit. This patch simply returns an all-dirty bitmap whenever the L1 shadow for the LFB changes. This approach avoids a complicated tracking mechanism for shadows at the cost of some unnecessary redrawing by qemu-dm. The ideal solution may lie at some optimum point between the two.

I did some tests to show the benefit of this patch: DB + n idle winxp

  linux/DB guest: running sysbench/DB, 2 vcpus, 512M
  winxp guest: 2 vcpus, 128M (8M shadow)

The test results show that this patch benefits our bottleneck, dom0, and the system scalability, besides qemu-dm itself.
The DB throughput:

  1. w/o patch -- 49% downgrade for 8 winxp guests and 34% downgrade for 4 winxp guests
  2. w/ patch -- <2% downgrade for 8 winxp guests and <1% downgrade for 4 winxp guests

The following two charts show the CPU utilization scatter of dom0 w/ and w/o the patch.

- Xinmei

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
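For comparison with the hypercall-based method, the memcmp-based baseline described in the message above can be sketched roughly as follows. This is only an illustration: the function name `lfb_dirty_by_memcmp`, the per-page granularity, and the `LFB_PAGE_SIZE` constant are assumptions made for the sketch, not the actual qemu-dm code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LFB_PAGE_SIZE 4096u  /* assumed granularity of the dirty bitmap */

/*
 * Hypothetical sketch: compare the live framebuffer against a private
 * copy, one page at a time. Bit i of 'bitmap' is set when page i differs,
 * and the copy is refreshed for that page. Returns the number of dirty
 * pages. The cost is proportional to the whole LFB on every call, since
 * every page is compared whether or not the guest touched it.
 */
static size_t lfb_dirty_by_memcmp(const uint8_t *lfb, uint8_t *copy,
                                  size_t npages, uint8_t *bitmap)
{
    size_t dirty = 0;

    memset(bitmap, 0, (npages + 7) / 8);
    for (size_t i = 0; i < npages; i++) {
        const uint8_t *live = lfb + i * LFB_PAGE_SIZE;
        uint8_t *old = copy + i * LFB_PAGE_SIZE;

        if (memcmp(live, old, LFB_PAGE_SIZE) != 0) {
            memcpy(old, live, LFB_PAGE_SIZE);   /* remember the new contents */
            bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            dirty++;
        }
    }
    return dirty;
}
```

The D-bit approach replaces the per-page compare with a read of one page-table entry per page, which is why its cost is expected to be more stable across workloads.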
Tim Deegan
2007-Jul-30 08:58 UTC
Re: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
Hi,

At 16:10 +0800 on 28 Jul (1185639053), Huang, Xinmei wrote:
> With current accelerated VGA for qemu-dm, guest can access LFB directly,
> however, qemu-dm is not conscious of these accesses to LFB. The
> accompanying task is to determine the range of LFB to be redrawn on
> guest display window. Current qemu-dm maintains a copy of LFB, and gets
> the LFB dirty-bitmap through memcmp. This patch adopts another way to
> get the LFB dirty-bitmap: one hypercall to instruct hypervisor to fill
> the dirty-bitmap. Hypervisor checks the D-bit of PTEs and updates the
> dirty-bitmap.

Thanks for this -- those numbers look very good!

The shadow-code modifications seem to have a lot of moving parts, though. Since we expect that the guest will have a single, contiguous, kernel-mode mapping of the LFB, we should be able to do this with less administration:

 - Figure out the VA of the writable mapping of the LFB.
 - When asked for the bitmap, walk the shadow linear page tables of the area, recording and clearing the _PAGE_DIRTY bits. If you see a PTE pointing at the wrong place, back off and tell qemu to try the slow way. If you see an LFB mfn with a writeable count > 1, either give up or assume it's dirty. (If you take a page-fault, then the guest has marked his writeable mapping of the LFB non-writable at a higher level -- probably just back off at that point).
 - When a shadow PTE pointing at the LFB is made or cleared, set the bit in the bitmap.

That involves a single equality test in sh_page_fault() to spot the VA, a few lines in shadow_set_l1e() to spot new/departing mappings, and almost everything else can happen in one routine that reads/writes the linear pagetables with a single for() loop.

A few other points:

 - The assumption that the LFB is MFN-contiguous is not valid. You do work around the debug=y allocator's habit of handing out pages backwards, but that's there to alert you to the more general problem of discontiguous mfns.
 - Since the dirty bits are only one per word, they can be atomically cleared without needing locked operations to protect their neighbours. That means that you don't need to pause the domain: the shadow lock will be enough to keep the operation safe.
 - After clearing the dirty bits, you need to flush TLBs to make sure they'll get set again. VMX guests get their TLBs flushed on every VMEXIT at the moment, but that's not true on SVM on some hardware, and won't be true on VMX when Intel processors get tagged TLBs.

Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
Registered office c/o EC2Y 5EB, UK; company number 05334508
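The single-loop scan suggested above might look roughly like the sketch below. It is not Xen code: the flat `l1e` array stands in for the shadow L1 entries covering the LFB mapping, `lfb_mfn` is a hypothetical per-page table of expected frame numbers (kept per page because the LFB is not MFN-contiguous), and the names `lfb_scan_dirty`, `SCAN_OK` and `SCAN_FALLBACK` are invented for illustration. Only the x86 PTE bit positions (present = bit 0, writable = bit 1, dirty = bit 6) are architectural.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PTE_PRESENT  (1ull << 0)
#define PTE_RW       (1ull << 1)
#define PTE_DIRTY    (1ull << 6)
#define PTE_FRAME(p) (((p) & 0x000ffffffffff000ull) >> 12)

enum scan_result { SCAN_OK, SCAN_FALLBACK };

/*
 * Hypothetical sketch: walk the L1 entries that map the LFB, record each
 * set D-bit in 'bitmap' and clear it so the next write sets it again.
 * If an entry points somewhere other than the expected LFB frame, back
 * off; the caller should then treat the whole LFB as dirty (or use the
 * slow path), because some D-bits may already have been cleared.
 * A real implementation would hold the shadow lock and flush TLBs after
 * clearing D-bits; neither is modelled here.
 */
static enum scan_result lfb_scan_dirty(uint64_t *l1e, const uint64_t *lfb_mfn,
                                       size_t npages, uint8_t *bitmap)
{
    memset(bitmap, 0, (npages + 7) / 8);
    for (size_t i = 0; i < npages; i++) {
        uint64_t pte = l1e[i];

        if (!(pte & PTE_PRESENT))
            continue;                 /* no mapping, so no write through it */
        if (PTE_FRAME(pte) != lfb_mfn[i])
            return SCAN_FALLBACK;     /* PTE points at the wrong place */
        if (pte & PTE_DIRTY) {
            bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            l1e[i] = pte & ~PTE_DIRTY;
        }
    }
    return SCAN_OK;
}
```

The writeable-count check and the page-fault back-off mentioned in the list are left out of the sketch, since they depend on shadow-code internals not shown in this thread.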
Huang, Xinmei
2007-Jul-30 10:41 UTC
RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
Tim, thanks for your comments; please see my replies embedded below.

>-----Original Message-----
>From: Tim Deegan [mailto:Tim.Deegan@xensource.com]
>Sent: 30 July 2007 16:58
>To: Huang, Xinmei
>Cc: xen-devel@lists.xensource.com; Ian.Pratt@cl.cam.ac.uk
>Subject: Re: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
>
>Hi,
>
>At 16:10 +0800 on 28 Jul (1185639053), Huang, Xinmei wrote:
>> With current accelerated VGA for qemu-dm, guest can access LFB directly,
>> however, qemu-dm is not conscious of these accesses to LFB. The
>> accompanying task is to determine the range of LFB to be redrawn on
>> guest display window. Current qemu-dm maintains a copy of LFB, and gets
>> the LFB dirty-bitmap through memcmp. This patch adopts another way to
>> get the LFB dirty-bitmap: one hypercall to instruct hypervisor to fill
>> the dirty-bitmap. Hypervisor checks the D-bit of PTEs and updates the
>> dirty-bitmap.
>
>Thanks for this -- those numbers look very good!
>
>The shadow-code modifications seem to have a lot of moving parts,
>though. Since we expect that the guest will have a single, contiguous,
>kernel-mode mapping of the LFB, we should be able to do this with less
>administration:
>
> - Figure out the VA of the writable mapping of the LFB.
> - When asked for the bitmap, walk the shadow linear page tables of the

When current != vcpu-of-guest, can we use this mechanism or map_shadow_page() to access shadow pages?

>   area, recording and clearing the _PAGE_DIRTY bits. If you see a PTE
>   pointing at the wrong place, back off and tell qemu to try the slow
>   way. If you see an LFB mfn with a writeable count > 1, either give
>   up or assume it's dirty. (If you take a page-fault, then the guest
>   has marked his writeable mapping of the LFB non-writable at a higher
>   level -- probably just back off at that point).
> - When a shadow PTE pointing at the LFB is made or cleared, set the bit
>   in the bitmap.
>
>That involves a single equality test in sh_page_fault() to spot the VA,

Guest's va mapped to LFB is assumed to be invariable?

>a few lines in shadow_set_l1e() to spot new/departing mappings, and
>almost everything else can happen in one routine that reads/writes the
>linear pagetables with a single for() loop.
>
>A few other points:
> - The assumption that the LFB is MFN-contiguous is not valid. You do
>   work around the debug=y allocator's habit of handing out pages
>   backwards, but that's there to alert you to the more general
>   problem of discontiguous mfns.
> - Since the dirty bits are only one per word, they can be atomically
>   cleared without needing locked operations to protect their
>   neighbours. That means that you don't need to pause the domain: the
>   shadow lock will be enough to keep the operation safe.
> - After clearing the dirty bits, you need to flush TLBs to make sure
>   they'll get set again. VMX guests get their TLBs flushed on every
>   VMEXIT at the moment, but that's not true on SVM on some hardware,
>   and won't be true on VMX when Intel processors get tagged TLBs.
>

Any more comments on this patch? Does the Pinned-LFB-shadow idea make sense?

>Cheers,
>
>Tim.
>
>--
>Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
>Registered office c/o EC2Y 5EB, UK; company number 05334508
Tim Deegan
2007-Jul-30 11:00 UTC
Re: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
Hi,

At 18:41 +0800 on 30 Jul (1185820874), Huang, Xinmei wrote:
> > - Figure out the VA of the writable mapping of the LFB.
> > - When asked for the bitmap, walk the shadow linear page tables of the
>
> When current != vcpu-of-guest, can we use this mechanism or map_shadow_page() to access shadow pages?

Ah, yes, we'd need to either walk the guest's shadow pagetable explicitly or cause the bitmap to be generated by one of the guest's vcpus (ugh!).

> >That involves a single equality test in sh_page_fault() to spot the VA,
>
> Guest's va mapped to LFB is assumed to be invariable?

Not necessarily. Every time we see a write fault to a new VA mapping the LFB we can move it. (I expect that not to happen very often.)

> Any more comments on this patch? Does the Pinned-LFB-shadow idea make sense?

It would be possible, but at the moment "pinnable" pages are defined entirely by their type, with this one hack to allow 64bit l3s to be pinnable if we think we're looking at an old linux kernel. So you'd either need another shadow type, or for all users of the up-pointer to be able to check whether the l1 they're looking at has a vram mapping in it.

If there is one kernel-mode mapping of the LFB, though, I'd expect it to be fairly rarely torn down; it would need the shadows of all processes that were ever running when the kernel touched the video RAM to be reaped. So just marking the page dirty when you make or tear down a vram mapping should be better.

Tim.
Ian Pratt
2007-Jul-30 12:11 UTC
RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
> If there is one kernel-mode mapping of the LFB, though, I'd
> expect it to be fairly rarely torn down; it would need the
> shadows of all processes that were ever running when the
> kernel touched the video RAM to be reaped. So just marking
> the page dirty when you make or tear down a vram mapping
> should be better.

Ideally, we'd return 'all-dirty' for a missing shadow page just once, and then 'all-clean' on subsequent scans until the PT page gets reshadowed.

I can imagine that if an OS's screen blanker kicked in we might not see any further writes to the LFB, and this could lead to the shadows ultimately getting evicted, resulting in pessimal performance if we're always returning 'all-dirty'.

There's no real need to do this on a page granularity -- if any of our LFB-mapping shadow pages have been evicted we could evict them all. I'm not sure this makes things any easier, though.

We can use missing shadows as an optimization: if we return a bitmap with 'everything clean' a few times in a row, we are probably better off pro-actively unshadowing the page to avoid even doing the dirty bit scanning.

Best,
Ian
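The "clean a few times in a row" heuristic in the message above could be tracked with a small counter, as in this sketch. Everything here is hypothetical: the `lfb_tracker` structure, the `lfb_note_scan` function, and the threshold value are invented for illustration, and the thread does not say what number of consecutive clean scans would be appropriate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed threshold; the thread leaves the actual value open. */
#define CLEAN_SCANS_BEFORE_UNSHADOW 3u

struct lfb_tracker {
    unsigned clean_streak;   /* consecutive scans that found nothing dirty */
};

/*
 * Hypothetical sketch: call after each dirty-bitmap scan. Returns true
 * when the bitmap has been entirely clean for the threshold number of
 * scans in a row, i.e. when it may be cheaper to unshadow the LFB
 * mapping than to keep scanning it. The streak restarts after a dirty
 * scan and after each 'true' result.
 */
static bool lfb_note_scan(struct lfb_tracker *t, const uint8_t *bitmap,
                          size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        if (bitmap[i] != 0) {
            t->clean_streak = 0;      /* something was dirty: start over */
            return false;
        }
    }
    if (++t->clean_streak >= CLEAN_SCANS_BEFORE_UNSHADOW) {
        t->clean_streak = 0;
        return true;
    }
    return false;
}
```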
Tim Deegan
2007-Jul-30 12:26 UTC
Re: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
At 13:11 +0100 on 30 Jul (1185801076), Ian Pratt wrote:
> > If there is one kernel-mode mapping of the LFB, though, I'd
> > expect it to be fairly rarely torn down; it would need the
> > shadows of all processes that were ever running when the
> > kernel touched the video RAM to be reaped. So just marking
> > the page dirty when you make or tear down a vram mapping
> > should be better.
>
> Ideally, we'd return 'all-dirty' for a missing shadow page just once,
> and then 'all-clean' on subsequent scans until the PT page gets
> reshadowed.

Yes, that's the behaviour you get if you mark the page dirty when you clear the PTE that mapped it. It will then get seen once as dirty and not again until it's mapped again. (Not-present and read-only PTEs won't cause the page to be marked dirty because they can't cause it to have changed.)

> We can use missing shadows as an optimization: if we return a bit map
> with 'everything clean' a few times in a row, we are probably better off
> pro-actively unshadowing the page to avoid even doing the dirty bit
> scanning.

True, and easily checked.

Tim.
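The "mark dirty when a mapping is made or cleared" rule discussed above can be sketched as a hook that inspects the old and new values of a page-table entry. This is an illustration only: `lfb_info`, `lfb_index_of` and `lfb_note_l1e_change` are hypothetical names, and the sketch assumes such a hook would be called from wherever shadow L1 entries are written (the thread mentions shadow_set_l1e() as the likely place). Only present, writable entries count, matching the point that not-present and read-only PTEs cannot have changed the page.

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT  (1ull << 0)
#define PTE_RW       (1ull << 1)
#define PTE_FRAME(p) (((p) & 0x000ffffffffff000ull) >> 12)

struct lfb_info {
    const uint64_t *mfn;   /* expected frame of each LFB page (not contiguous) */
    size_t npages;
    uint8_t *bitmap;       /* one dirty bit per LFB page */
};

/* Index of the LFB page a PTE maps writably, or -1 if it maps none. */
static long lfb_index_of(const struct lfb_info *lfb, uint64_t pte)
{
    if ((pte & (PTE_PRESENT | PTE_RW)) != (PTE_PRESENT | PTE_RW))
        return -1;         /* not-present or read-only: cannot dirty the page */
    for (size_t i = 0; i < lfb->npages; i++)
        if (lfb->mfn[i] == PTE_FRAME(pte))
            return (long)i;
    return -1;
}

/*
 * Hypothetical hook: when a writable mapping of an LFB page appears or
 * goes away, mark that page dirty once. Later scans then report it
 * clean until it is mapped and written again.
 */
static void lfb_note_l1e_change(struct lfb_info *lfb, uint64_t old_pte,
                                uint64_t new_pte)
{
    long old_idx = lfb_index_of(lfb, old_pte);
    long new_idx = lfb_index_of(lfb, new_pte);

    if (old_idx == new_idx)
        return;            /* same page still mapped, or neither maps the LFB */
    if (old_idx >= 0)
        lfb->bitmap[old_idx / 8] |= (uint8_t)(1u << (old_idx % 8));
    if (new_idx >= 0)
        lfb->bitmap[new_idx / 8] |= (uint8_t)(1u << (new_idx % 8));
}
```

The linear search over `mfn` keeps the sketch short; a real implementation would presumably want a cheaper way to recognise LFB frames.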
Huang, Xinmei
2007-Jul-31 03:10 UTC
RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
>-----Original Message-----
>From: Ian Pratt [mailto:Ian.Pratt@cl.cam.ac.uk]
>Sent: 30 July 2007 20:11
>To: Tim Deegan; Huang, Xinmei
>Cc: xen-devel@lists.xensource.com; Ian.Pratt@cl.cam.ac.uk
>Subject: RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
>
>> If there is one kernel-mode mapping of the LFB, though, I'd
>> expect it to be fairly rarely torn down; it would need the
>> shadows of all processes that were ever running when the
>> kernel touched the video RAM to be reaped. So just marking
>> the page dirty when you make or tear down a vram mapping
>> should be better.
>
>Ideally, we'd return 'all-dirty' for a missing shadow page just once,
>and then 'all-clean' on subsequent scans until the PT page gets
>reshadowed.

Agreed. Redrawing is very time-consuming. We should avoid unnecessary redrawing where possible.

>I can imagine that if an OS's screen blanker kicked in we might not see
>any further writes to the LFB, and this could lead to the shadows
>ultimately getting evicted, resulting in pessimal performance if we're
>always returning 'all-dirty'.
>
>There's no real need to do this on a page granularity -- if any of our
>LFB-mapping shadow pages have been evicted we could evict them all. I'm
>not sure this makes things any easier, though.
>
>We can use missing shadows as an optimization: if we return a bit map
>with 'everything clean' a few times in a row, we are probably better off
>pro-actively unshadowing the page to avoid even doing the dirty bit
>scanning.

Unshadowing and re-shadowing all the LFB pages is expensive. Performance would vary with the characteristics of the graphics-intensive workload and with the value of N, the number of consecutive 'all-clean' results. I'm not sure this brings sufficient benefit.

>
>Best,
>Ian
>

-Xinmei
Ian Pratt
2007-Jul-31 07:38 UTC
RE: [Xen-devel] [PATCH]RFC: VGA accleration using shadow PTE D-bit to construct LFB dirty bitmap
> >We can use missing shadows as an optimization: if we return a bit map
> >with 'everything clean' a few times in a row, we are probably
> >better off pro-actively unshadowing the page to avoid even doing
> >the dirty bit scanning.
>
> Unshadowing & re-shadowing the all LFB pages are expensive.
> Performance would vary because the characteristic of graphic-intensive
> workload and the value of N -- the times of continuous 'all-clean'.
> I'm not sure this brings sufficient benefit.

It's not that expensive, and it would be good to avoid the scanning altogether for a screen that is updating very rarely (e.g. screen blanker enabled).

You could leave the shadows in place but write protect the entries in the L2. However, since the shadows are liable to get evicted in that scenario anyhow it may be an unnecessary complexity.

Ian