Anthony Liguori
2007-Mar-13 19:32 UTC
[Xen-devel] vram_dirty vs. shadow paging dirty tracking
When thinking about multithreading the device model, it occurred to me that it's a little odd that we're doing a memcmp to determine which portions of the VRAM have changed. Couldn't we just use dirty page tracking in the shadow paging code? That should significantly lower the overhead, plus I believe the infrastructure is already mostly there in the shadow2 code.

Is this a sane idea?

Regards,

Anthony Liguori
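In outline, the memcmp scan in question looks something like this (a minimal sketch with hypothetical names, not the actual qemu-dm code):

    /*
     * Sketch of the memcmp-based VRAM scan: compare each page of the
     * guest-visible framebuffer against a private shadow copy, and
     * mark the pages whose contents changed since the last scan.
     */
    #include <stdint.h>
    #include <string.h>

    #define VRAM_SIZE (4 * 1024 * 1024)
    #define PAGE_SIZE 4096
    #define NR_PAGES  (VRAM_SIZE / PAGE_SIZE)

    static uint8_t vram[VRAM_SIZE];        /* guest-visible framebuffer   */
    static uint8_t shadow_vram[VRAM_SIZE]; /* private copy from last scan */
    static uint8_t vram_dirty[NR_PAGES];   /* which pages changed         */

    /* Called from the display refresh timer, up to ~50 times a second. */
    static void scan_vram(void)
    {
        for (unsigned i = 0; i < NR_PAGES; i++) {
            uint8_t *p = vram + i * PAGE_SIZE;
            uint8_t *s = shadow_vram + i * PAGE_SIZE;
            if (memcmp(p, s, PAGE_SIZE) != 0) {
                memcpy(s, p, PAGE_SIZE); /* remember the new contents */
                vram_dirty[i] = 1;       /* redraw this region        */
            }
        }
    }

The memcmp itself is cheap per page, but touching the entire framebuffer every refresh is what burns the CPU time discussed below.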
Ian Pratt
2007-Mar-13 21:02 UTC
[Xen-devel] RE: vram_dirty vs. shadow paging dirty tracking
> When thinking about multithreading the device model, it occurred to me
> that it's a little odd that we're doing a memcmp to determine which
> portions of the VRAM have changed. Couldn't we just use dirty page
> tracking in the shadow paging code? That should significantly lower
> the overhead, plus I believe the infrastructure is already mostly
> there in the shadow2 code.

Yep, it's been in the roadmap doc for quite a while. However, the log dirty code isn't ideal for this. We'd need to extend it so that it can be turned on for just a subset of the GFN range (we could use a xen rangeset for this). Even so, I'm not super keen on the idea of tearing down and rebuilding 1024 PTEs up to 50 times a second.

A lower-overhead solution would be to scan and reset the dirty bits on the PTEs (followed by a global TLB flush). In the general case this is tricky, as the framebuffer could be mapped by multiple PTEs; in practice, I believe this doesn't happen for either Linux or Windows. There's always a good fallback of just returning 'all dirty' if the heuristic is violated. Would be good to knock this up.

Best,
Ian
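The dirty-bit scan Ian describes might look like this in outline (a sketch only; fb_shadow_pte and flush_tlb_all are invented helpers, and a real implementation would need the shadow lock and atomic PTE updates):

    /*
     * Sketch: scan and reset the x86 dirty bit on the shadow PTEs that
     * map the framebuffer, instead of comparing VRAM contents.
     */
    #include <stdint.h>

    #define _PAGE_DIRTY (1UL << 6)  /* x86 PTE dirty bit */
    #define FB_PAGES    1024        /* e.g. a 4MB framebuffer */

    extern uint64_t *fb_shadow_pte[FB_PAGES]; /* hypothetical: PTE per fb page */
    extern void flush_tlb_all(void);          /* hypothetical global flush */

    static void scan_fb_dirty(uint8_t *dirty) /* one byte per fb page */
    {
        int any = 0;

        for (int i = 0; i < FB_PAGES; i++) {
            uint64_t *pte = fb_shadow_pte[i];
            if (pte && (*pte & _PAGE_DIRTY)) {
                *pte &= ~_PAGE_DIRTY; /* real code: atomic clear */
                dirty[i] = 1;
                any = 1;
            }
        }

        /*
         * The dirty bit is cached in the TLB entry, so after clearing
         * it in the PTE we must flush, or later guest writes through a
         * stale TLB entry would never re-set it.
         */
        if (any)
            flush_tlb_all();
    }

Unlike tearing the PTEs down, this never forces the guest to fault: writes keep succeeding, and the hardware re-sets the dirty bit on the next page walk after each flush.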
Anthony Liguori
2007-Mar-13 21:30 UTC
[Xen-devel] Re: vram_dirty vs. shadow paging dirty tracking
Ian Pratt wrote:
>> When thinking about multithreading the device model, it occurred to me
>> that it's a little odd that we're doing a memcmp to determine which
>> portions of the VRAM have changed. Couldn't we just use dirty page
>> tracking in the shadow paging code? That should significantly lower
>> the overhead, plus I believe the infrastructure is already mostly
>> there in the shadow2 code.
>
> Yep, it's been in the roadmap doc for quite a while. However, the log
> dirty code isn't ideal for this. We'd need to extend it so that it can
> be turned on for just a subset of the GFN range (we could use a xen
> rangeset for this).

Okay, I was curious whether the log dirty stuff could do ranges. I guess not.

> Even so, I'm not super keen on the idea of tearing down and rebuilding
> 1024 PTEs up to 50 times a second.
>
> A lower-overhead solution would be to scan and reset the dirty bits on
> the PTEs (followed by a global TLB flush).

Right, this is the approach I was assuming. There's really no point in tearing down the whole PTE, since you would then take an extraneous read fault.

> In the general case this is tricky, as the framebuffer could be mapped
> by multiple PTEs; in practice, I believe this doesn't happen for
> either Linux or Windows.

I wouldn't think so, but showing my ignorance for a moment: does shadow2 not provide a mechanism to look up VAs given a GFN? This lookup could be cheap if the structures were built during shadow page table construction.

Sounds like this is a good long-term goal, but I think I'll stick with the threading as an intermediate goal. I've got a minor concern that threading isn't going to help us much when dom0 is UP, since the VGA scanning won't happen while an MMIO/PIO request is being handled. With an SMP dom0, you could potentially do all the VGA scanning on one processor, ensuring that qemu-dm is never "busy" when a request arrives. I'm slightly concerned, though, that a thread as CPU-hungry as the VGA scanning may increase context switches during MMIO/PIO handling, which would actually hurt performance. We'll see soon enough.

Regards,

Anthony Liguori

> There's always a good fallback of just returning 'all dirty' if the
> heuristic is violated. Would be good to knock this up.
>
> Best,
> Ian
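The intermediate threading step Anthony mentions could be as simple as the sketch below (hypothetical names; real qemu-dm would also need to make the shared display state thread-safe):

    /*
     * Sketch: run the VGA scan in its own thread so it no longer
     * blocks MMIO/PIO handling on the main event loop.
     */
    #include <pthread.h>
    #include <unistd.h>

    extern void vga_scan_and_update(void); /* hypothetical: scan + redraw */

    static void *vga_thread(void *arg)
    {
        (void)arg;
        for (;;) {
            vga_scan_and_update();
            usleep(1000000 / 50); /* ~50Hz refresh, as discussed */
        }
        return NULL;
    }

    static pthread_t vga_tid;

    static void start_vga_thread(void)
    {
        /* MMIO/PIO requests stay on the main thread. */
        pthread_create(&vga_tid, NULL, vga_thread, NULL);
    }

On a UP dom0 this only helps if the scanner gets pre-empted when an ioreq arrives, which is exactly the concern raised above.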
Ian Pratt
2007-Mar-14 00:17 UTC
RE: [Xen-devel] Re: vram_dirty vs. shadow paging dirty tracking
> > Yep, it's been in the roadmap doc for quite a while. However, the
> > log dirty code isn't ideal for this. We'd need to extend it so that
> > it can be turned on for just a subset of the GFN range (we could use
> > a xen rangeset for this).
>
> Okay, I was curious whether the log dirty stuff could do ranges. I
> guess not.

It could certainly be added, but I prefer the dirty bit solution to this particular problem.

> > Even so, I'm not super keen on the idea of tearing down and
> > rebuilding 1024 PTEs up to 50 times a second.
> >
> > A lower-overhead solution would be to scan and reset the dirty bits
> > on the PTEs (followed by a global TLB flush).
>
> Right, this is the approach I was assuming. There's really no point in
> tearing down the whole PTE, since you would then take an extraneous
> read fault.
>
> > In the general case this is tricky, as the framebuffer could be
> > mapped by multiple PTEs; in practice, I believe this doesn't happen
> > for either Linux or Windows.
>
> I wouldn't think so, but showing my ignorance for a moment: does
> shadow2 not provide a mechanism to look up VAs given a GFN? This
> lookup could be cheap if the structures were built during shadow page
> table construction.

No, it deliberately doesn't, because threading all the PTEs that point to a GFN can consume quite a bit of memory, introduces locking complexity that will affect future scalability, and turns out to be completely unnecessary for normal shadow mode operation, since some simple heuristics get a near-perfect hit rate.

> Sounds like this is a good long-term goal, but I think I'll stick with
> the threading as an intermediate goal.

Yes, that's more immediately useful, thanks.

> I've got a minor concern that threading isn't going to help us much
> when dom0 is UP, since the VGA scanning won't happen while an MMIO/PIO
> request is being handled.

I think the VGA scanning burns enough CPU to stand a good chance of being pre-empted when an MMIO/PIO request arrives. We need to make sure there's no synchronization required that prevents this.

Best,
Ian
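The 'all dirty' fallback mentioned earlier in the thread is cheap to express (a sketch; fb_heuristic_ok is an invented check for the one-mapping-per-GFN assumption):

    /*
     * Sketch: fall back to reporting the whole framebuffer dirty when
     * the single-mapping heuristic is violated. Correct but
     * pessimistic; the caller simply redraws everything.
     */
    #include <stdint.h>
    #include <string.h>

    #define FB_PAGES 1024

    extern int fb_heuristic_ok(void);          /* hypothetical check      */
    extern void scan_fb_dirty(uint8_t *dirty); /* per-PTE scan from above */

    static void get_fb_dirty(uint8_t dirty[FB_PAGES])
    {
        if (!fb_heuristic_ok()) {
            memset(dirty, 1, FB_PAGES); /* give up on precision */
            return;
        }
        scan_fb_dirty(dirty);
    }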
Zhai, Edwin
2007-Mar-14 08:22 UTC
Re: [Xen-devel] vram_dirty vs. shadow paging dirty tracking
On Tue, Mar 13, 2007 at 02:32:56PM -0500, Anthony Liguori wrote:
> When thinking about multithreading the device model, it occurred to me
> that it's a little odd that we're doing a memcmp to determine which
> portions of the VRAM have changed. Couldn't we just use dirty page
> tracking in the shadow paging code? That should significantly lower
> the overhead, plus I believe the infrastructure is already mostly
> there in the shadow2 code.
>
> Is this a sane idea?

We added this code a while ago to improve VNC responsiveness. QEMU now has a new VNC implementation that resolves that issue, and this code introduces a performance drop for Linux guests running X and for Windows guests. So I'd like to send a patch to revert it and make a proper solution in the future.

Thanks,

-- 
best rgds,
edwin
Anthony Liguori
2007-Mar-14 16:00 UTC
Re: [Xen-devel] vram_dirty vs. shadow paging dirty tracking
Zhai, Edwin wrote:
> We added this code a while ago to improve VNC responsiveness. QEMU now
> has a new VNC implementation that resolves that issue, and this code
> introduces a performance drop for Linux guests running X and for
> Windows guests.

Compared to what, just updating the full screen 30 times a second? I suspect that's not as bad as it sounds, since SDL will be using an XShmImage.

The VNC minimization is done based on a timer, however, so moving the timer work into a thread is still useful. Of course, we should be able to determine quickly how useful this is by just changing SDL to update the whole image...

Regards,

Anthony Liguori

> So I'd like to send a patch to revert it and make a proper solution in
> the future.
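The quick experiment Anthony suggests is nearly a one-liner with the SDL 1.2 API (a sketch; `screen` stands in for the display surface qemu-dm draws into):

    /*
     * Sketch: skip dirty-rectangle minimization entirely and repaint
     * the whole frame each refresh. In SDL 1.2, passing w = h = 0 to
     * SDL_UpdateRect updates the entire surface.
     */
    #include <SDL/SDL.h>

    extern SDL_Surface *screen; /* display surface qemu-dm draws into */

    static void refresh_full(void)
    {
        SDL_UpdateRect(screen, 0, 0, 0, 0); /* no scan, just repaint */
    }

With an XShmImage backend the blit itself is cheap, so this isolates the cost of the memcmp scan from the cost of actually pushing pixels.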
Zhai, Edwin
2007-Mar-15 02:59 UTC
Re: [Xen-devel] vram_dirty vs. shadow paging dirty tracking
On Wed, Mar 14, 2007 at 11:00:17AM -0500, Anthony Liguori wrote:
> Compared to what, just updating the full screen 30 times a second? I
> suspect that's not as bad as it sounds, since SDL will be using an
> XShmImage.

Removing the memcpy and updating the whole screen each time gives better performance.

-- 
best rgds,
edwin
Dong, Eddie
2007-Mar-15 03:22 UTC
RE: [Xen-devel] vram_dirty vs. shadow paging dirty tracking
> Compared to what, just updating the full screen 30 times a second? I
> suspect that's not as bad as it sounds, since SDL will be using an
> XShmImage.

It depends on how you run the benchmark. With multiple VMs, where several qemus (say 8) are running at once, this kind of comparison eats an unacceptable number of CPU cycles.

Eddie