At OLS I gave a talk on some of the Xen scalability inhibitors, and one of
these was writable page tables. We went over why the feature does not scale,
but just as important, we found that the uniprocessor case does not provide
any advantage either. Those tests were done on x86_64, so I wanted to run the
1-way test on 32-bit to show the same problem. So, I have run several
benchmarks with writable PTs and with emulation forced on:

on Xeon MP processor, uniprocessor dom0 kernel, pae=y:

benchmark                  c/s 10729   force_emulate
------------------------   ---------   -------------
lmbench fork+exit:           469.5833       470.3913   usec, lower is better
lmbench fork+execve:        1241.0000      1225.7778   usec, lower is better
lmbench fork+/sbin/bash:   12190.000      12119.000    usec, lower is better
dbench 3.03                  186.354        191.278    MB/sec
reaim_aim9                  1890.01        2055.97     jobs/min
reaim_compute               2538.75        2522.90     jobs/min
reaim_dbase                 3852.14        3739.38     jobs/min
reaim_fserver               4437.93        4389.71     jobs/min
reaim_shared                2365.85        2362.97     jobs/min
SPEC SDET                   4315.91        4312.02     scripts/hr

These are all within the noise level (some slightly better, some slightly
worse for emulate). There really isn't much of a difference here. I'd like to
propose turning on the emulate path all the time in Xen.

-Andrew Theurer

Applies to c/s 10729
Signed-off-by: Andrew Theurer <habanero@us.ibm.com>
> on Xeon MP processor, uniprocessor dom0 kernel, pae=y:
>
> benchmark                  c/s 10729   force_emulate
> ------------------------   ---------   -------------
> lmbench fork+exit:           469.5833       470.3913   usec, lower is better
> lmbench fork+execve:        1241.0000      1225.7778   usec, lower is better
> lmbench fork+/sbin/bash:   12190.000      12119.000    usec, lower is better

It's kinda weird that these scores are so close -- I guess it's just
coincidence that we must be getting something like an average of 10-20 PTEs
updated per pagetable page, and the cost of doing multiple emulates perfectly
balances the cost of unhooking/rehooking.

I would like to make sure we fully understand what's going on, though. I'd
like to make sure there's no 'dumb stuff' happening, that writeable
pagetables isn't being used erroneously where we don't expect it (hence
crippling the scores), and that it's actually functioning as intended, i.e.
that we get one fault to unhook, and then a fault causing a rehook once we
move to the next page in the fork.

If you write a little test program that dirties a large chunk of memory just
before the fork, we should see writeable pagetables winning easily. It would
also be good to use some of the trace buffer stuff to find out exactly what
the sequence of faults and flushes is.

I have no problem with enabling forced emulation, I'd just like to fully
understand the tradeoff. I suspect the answer is that typically only a
handful of PTEs are dirty, and hence there are relatively few updates to the
parent process's page tables. It's worth understanding this as it also has
implications for shadow pagetables.

Thanks,
Ian

> dbench 3.03                  186.354        191.278    MB/sec
> reaim_aim9                  1890.01        2055.97     jobs/min
> reaim_compute               2538.75        2522.90     jobs/min
> reaim_dbase                 3852.14        3739.38     jobs/min
> reaim_fserver               4437.93        4389.71     jobs/min
> reaim_shared                2365.85        2362.97     jobs/min
> SPEC SDET                   4315.91        4312.02     scripts/hr
>
> These are all within the noise level (some slightly better, some
> slightly worse for emulate). There really isn't much of a difference
> here. I'd like to propose turning on the emulate path all the time in
> Xen.
Nivedita Singhvi
2006-Jul-25 22:43 UTC
Re: [Xen-devel] [PATCH] turn off writable page tables
Andrew Theurer wrote:
>
> These are all within the noise level (some slightly better, some
> slightly worse for emulate). There really isn't much of a difference
> here. I'd like to propose turning on the emulate path all the time in Xen.
> -Andrew Theurer
>
> Applies to c/s 10729
> Signed-off-by: Andrew Theurer <habanero@us.ibm.com>

Andrew, is this something for 3.0.3/Fedora6/RHEL5 consideration,
or post-3.0.3?

thanks,
Nivedita
Andrew Theurer
2006-Jul-25 23:19 UTC
Re: [Xen-devel] [PATCH] turn off writable page tables
Nivedita Singhvi wrote:
> Andrew Theurer wrote:
>
>> These are all within the noise level (some slightly better, some
>> slightly worse for emulate). There really isn't much of a difference
>> here. I'd like to propose turning on the emulate path all the time
>> in Xen.
>> -Andrew Theurer
>>
>> Applies to c/s 10729
>> Signed-off-by: Andrew Theurer <habanero@us.ibm.com>
>
> Andrew, is this something for 3.0.3/Fedora6/RHEL5 consideration,
> or post-3.0.3?

I think it could go either way. There should be no "risk" using emulate
over writable PT when it comes to stability, etc.

-Andrew
Andrew Theurer
2006-Jul-26 02:25 UTC
Re: [Xen-devel] [PATCH] turn off writable page tables
Ian Pratt wrote:
>> on Xeon MP processor, uniprocessor dom0 kernel, pae=y:
>>
>> benchmark                  c/s 10729   force_emulate
>> ------------------------   ---------   -------------
>> lmbench fork+exit:           469.5833       470.3913   usec, lower is better
>> lmbench fork+execve:        1241.0000      1225.7778   usec, lower is better
>> lmbench fork+/sbin/bash:   12190.000      12119.000    usec, lower is better
>
> It's kinda weird that these scores are so close -- I guess it's just
> coincidence that we must be getting something like an average of 10-20
> PTEs updated per pagetable page and the cost of doing multiple emulates
> perfectly balances the cost of unhooking/rehooking.

Ian, I'll try a small program which dirties a large chunk of memory. I'll
also try the trace tool and see what we get.

Thanks,

Andrew
Jacob Gorm Hansen
2006-Jul-26 05:31 UTC
Re: [Xen-devel] [PATCH] turn off writable page tables
On 7/25/06, Andrew Theurer <habanero@us.ibm.com> wrote:
> Ian Pratt wrote:
> >> on Xeon MP processor, uniprocessor dom0 kernel, pae=y:
> >>
> >> benchmark                  c/s 10729   force_emulate
> >> ------------------------   ---------   -------------
> >> lmbench fork+exit:           469.5833       470.3913   usec, lower is better
> >> lmbench fork+execve:        1241.0000      1225.7778   usec, lower is better
> >> lmbench fork+/sbin/bash:   12190.000      12119.000    usec, lower is better
> >
> > It's kinda weird that these scores are so close -- I guess it's just
> > coincidence that we must be getting something like an average of 10-20
> > PTEs updated per pagetable page and the cost of doing multiple emulates
> > perfectly balances the cost of unhooking/rehooking.

Just a silly question; is the old batched update mechanism totally out of
the picture here? Is it the cost of taking additional faults that makes
writable PTs as slow as emulation (which I suppose just means shadow page
tables)? Is there tension between the shadow page table cache size inside
Xen and runtime performance?

Regards,
Jacob
Hi,

> I'd like to make sure there's no 'dumb stuff' happening, and the
> writeable pagetables isn't being used erroneously where we don't expect
> it (hence crippling the scores), and that it's actually functioning as
> intended i.e. that we get one fault to unhook, and then a fault causing
> a rehook once we move to the next page in the fork.
>
> If you write a little test program that dirties a large chunk of memory
> just before the fork, we should see writeable pagetables winning easily.

Just an idea: Any chance mm_pin() and mm_unpin() cause this? The bulk page
table updates for the new process created by fork() are not seen by Xen
anyway, I think. The first schedule of the new process triggers pinning,
i.e. r/o mapping and verification ...

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>
http://www.suse.de/~kraxel/julika-dora.jpeg
On 26 Jul 2006, at 09:18, Gerd Hoffmann wrote:

>> I'd like to make sure there's no 'dumb stuff' happening, and the
>> writeable pagetables isn't being used erroneously where we don't expect
>> it (hence crippling the scores), and that it's actually functioning as
>> intended i.e. that we get one fault to unhook, and then a fault causing
>> a rehook once we move to the next page in the fork.
>>
>> If you write a little test program that dirties a large chunk of memory
>> just before the fork, we should see writeable pagetables winning easily.
>
> Just an idea: Any chance mm_pin() and mm_unpin() cause this? The bulk
> page table updates for the new process created by fork() are not seen by
> Xen anyway, I think. The first schedule of the new process triggers
> pinning, i.e. r/o mapping and verification ...

The batching should still benefit the write-protecting of the parent
pagetables, which are visible to Xen during fork() (since the fork() runs
on them!).

Hence the suggestion of dirtying pages before the fork -- that will ensure
that lots of PTEs are definitely writable, and so they will have to be
updated to make them read-only.

 -- Keir
Andrew Theurer
2006-Jul-26 21:10 UTC
Re: [Xen-devel] [PATCH] turn off writable page tables
Keir Fraser wrote:
>
> On 26 Jul 2006, at 09:18, Gerd Hoffmann wrote:
>
>>> I'd like to make sure there's no 'dumb stuff' happening, and the
>>> writeable pagetables isn't being used erroneously where we don't expect
>>> it (hence crippling the scores), and that it's actually functioning as
>>> intended i.e. that we get one fault to unhook, and then a fault causing
>>> a rehook once we move to the next page in the fork.
>>>
>>> If you write a little test program that dirties a large chunk of memory
>>> just before the fork, we should see writeable pagetables winning
>>> easily.
>>
>> Just an idea: Any chance mm_pin() and mm_unpin() cause this? The bulk
>> page table updates for the new process created by fork() are not seen by
>> Xen anyway, I think. The first schedule of the new process triggers
>> pinning, i.e. r/o mapping and verification ...
>
> The batching should still benefit the write-protecting of the parent
> pagetables, which are visible to Xen during fork() (since the fork()
> runs on them!).
>
> Hence the suggestion of dirtying pages before the fork -- that will
> ensure that lots of PTEs are definitely writable, and so they will
> have to be updated to make them read-only.

And it does make a difference in this case. I now have a test program which
dirties a number of virtually contiguous pages then forks (it also resets
the Xen perf counters before the fork and collects them right after), then
records the elapsed time for the fork. The difference is quite amazing in
this case. For both writable and emulate, I ran with a range of dirty
pages, from 1280 to 128000. The elapsed times for fork are quite linear
from a small number to a large number of dirty pages. Below are the min and
max:

              1280 pages   128000 pages
wtpt:           813 usec     37552 usec
emulate:       3279 usec    283879 usec

The perf counters showed just about every writable page had all entries
modified (for 128000 pages below):

writable pt updates:   total:                253
                       all entries updated:  250

So, in a -perfect-world- this works great. The problem is that most
workloads don't appear to have a vast percentage of entries that need to be
updated. I'll go ahead and expand this test to find out what the threshold
is to break even. I'll also see if we can implement a batched call in fork
to update the parent -- I hope this will show just as good performance even
when most entries need modification, and even better performance over wtpt
with a low number of entries modified.

-Andrew
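For reference, a minimal sketch of the kind of dirty-then-fork timing test
described above -- this is not Andrew's actual program (the Xen perf-counter
reset/collection around the fork is omitted, and the page count handling and
output format are illustrative):

/* Dirty N virtually contiguous pages, then time how long fork() takes. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/wait.h>

#define PAGE_SIZE 4096UL

int main(int argc, char **argv)
{
    unsigned long npages = (argc > 1) ? strtoul(argv[1], NULL, 0) : 1280;
    char *buf = malloc(npages * PAGE_SIZE);
    struct timeval t0, t1;
    unsigned long i;
    pid_t pid;

    if (buf == NULL) {
        perror("malloc");
        return 1;
    }

    /* Touch every page so its PTE is present and writable (dirty). */
    for (i = 0; i < npages; i++)
        buf[i * PAGE_SIZE] = 1;

    gettimeofday(&t0, NULL);
    pid = fork();              /* all parent PTEs must be made read-only here */
    gettimeofday(&t1, NULL);

    if (pid == 0)
        _exit(0);              /* child does nothing */
    waitpid(pid, NULL, 0);

    printf("fork with %lu dirty pages: %ld usec\n", npages,
           (long)((t1.tv_sec - t0.tv_sec) * 1000000L +
                  (t1.tv_usec - t0.tv_usec)));
    free(buf);
    return 0;
}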
> And it does make a difference in this case. I now have a test program
> which dirties a number of virtually contiguous pages then forks (it also
> resets the Xen perf counters before the fork and collects them right
> after), then records the elapsed time for the fork. The difference is
> quite amazing in this case. For both writable and emulate, I ran with a
> range of dirty pages, from 1280 to 128000. The elapsed times for fork are
> quite linear from a small number to a large number of dirty pages. Below
> are the min and max:
>
>               1280 pages   128000 pages
> wtpt:           813 usec     37552 usec
> emulate:       3279 usec    283879 usec

Good, at least that suggests that the code works for the usage it was
intended for.

> So, in a -perfect-world- this works great. The problem is that most
> workloads don't appear to have a vast percentage of entries that need to
> be updated. I'll go ahead and expand this test to find out what the
> threshold is to break even. I'll also see if we can implement a batched
> call in fork to update the parent -- I hope this will show just as good
> performance even when most entries need modification, and even better
> performance over wtpt with a low number of entries modified.

With license to make more invasive changes to core Linux mm it certainly
should be possible to optimize this specific case with a batched update
fairly easily. You could even go further and implement a 'make all PTEs in
pagetable RO' hypercall, possibly including a copy to the child. This could
potentially work better than the current 'late pin'; at least the
validation would be incremental rather than in one big hit at the end.

Ian
xen-devel-request@lists.xensource.com wrote:
>
> Date: Wed, 26 Jul 2006 22:38:32 +0100
> From: "Ian Pratt" <m+Ian.Pratt@cl.cam.ac.uk>
> Subject: RE: [Xen-devel] [PATCH] turn off writable page tables
> To: "Andrew Theurer" <habanero@us.ibm.com>, "Keir Fraser"
>     <Keir.Fraser@cl.cam.ac.uk>
> Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>, Gerd Hoffmann
>     <kraxel@suse.de>, xen-devel@lists.xensource.com
>
>> And it does make a difference in this case. I now have a test program
>> which dirties a number of virtually contiguous pages then forks (it also
>> resets the Xen perf counters before the fork and collects them right
>> after), then records the elapsed time for the fork. The difference is
>> quite amazing in this case. For both writable and emulate, I ran with a
>> range of dirty pages, from 1280 to 128000. The elapsed times for fork
>> are quite linear from a small number to a large number of dirty pages.
>> Below are the min and max:
>>
>>               1280 pages   128000 pages
>> wtpt:           813 usec     37552 usec
>> emulate:       3279 usec    283879 usec
>
> Good, at least that suggests that the code works for the usage it was
> intended for.
>
>> So, in a -perfect-world- this works great. The problem is that most
>> workloads don't appear to have a vast percentage of entries that need to
>> be updated. I'll go ahead and expand this test to find out what the
>> threshold is to break even. I'll also see if we can implement a batched
>> call in fork to update the parent -- I hope this will show just as good
>> performance even when most entries need modification, and even better
>> performance over wtpt with a low number of entries modified.
>
> With license to make more invasive changes to core Linux mm it certainly
> should be possible to optimize this specific case with a batched update
> fairly easily. You could even go further and implement a 'make all PTEs
> in pagetable RO' hypercall, possibly including a copy to the child. This
> could potentially work better than the current 'late pin'; at least the
> validation would be incremental rather than in one big hit at the end.
>
> Ian

OpenSolaris could easily use the "make all PTEs in pagetable RO" hypercall.
But we don't copy in bulk to the child, so if you go down that path please
make the copy-to-child part optional.

Joe
Andrew Theurer
2006-Jul-27 14:43 UTC
Re: [Xen-devel] [PATCH] turn off writable page tables
>> fork are quite linear from a small number to a large number of dirty
>> pages. Below are the min and max:
>>
>>               1280 pages   128000 pages
>> wtpt:           813 usec     37552 usec
>> emulate:       3279 usec    283879 usec
>
> Good, at least that suggests that the code works for the usage it was
> intended for.
>
>> So, in a -perfect-world- this works great. The problem is that most
>> workloads don't appear to have a vast percentage of entries that need to
>> be updated. I'll go ahead and expand this test to find out what the
>> threshold is to break even. I'll also see if we can implement a batched
>> call in fork to update the parent -- I hope this will show just as good
>> performance even when most entries need modification, and even better
>> performance over wtpt with a low number of entries modified.
>
> With license to make more invasive changes to core Linux mm it certainly
> should be possible to optimize this specific case with a batched update
> fairly easily. You could even go further and implement a 'make all PTEs
> in pagetable RO' hypercall, possibly including a copy to the child. This
> could potentially work better than the current 'late pin'; at least the
> validation would be incremental rather than in one big hit at the end.
>
> Ian

FWIW, I found the threshold for emulate vs wtpt. I ran the fork test with a
set number of pages dirtied such that we had x number of PTEs per pte_page:

writable-pt
-----------
#pte  usec
002   5242
004   5251
006   5373
008   5519
010   5873

emulate
-------
#pte  usec
002   4922
004   5265
006   6074
008   6991
010   7806
012   5988

So, the threshold appears to be around 4 PTEs/page. I was a little shocked
at first at how low this number is, but considering the near-identical
performance with the various workloads, this makes sense. All of the
workloads had the vast majority of writable pages flushed with just 2
PTEs/page changed, and a handful with more PTEs/page changed. It would not
surprise me if the overall average was around 4 PTEs/page.

I am having a hard time finding any "enterprise" workloads which have a lot
of PTEs/page right before fork. If anyone can point me to some, that would
be great.

I will look into batching next, but I am curious if simply using a
hypercall instead of write fault + emulate will make any difference at all.
I'll try that first, then implement the batched update. Eventually a
hypercall which does more would be nice, but I guess we'll have to convince
the Linux maintainers it's a good idea.

-Andrew
On 27 Jul 2006, at 15:43, Andrew Theurer wrote:

> So, the threshold appears to be around 4 PTEs/page. I was a little
> shocked at first at how low this number is, but considering the
> near-identical performance with the various workloads, this makes sense.
> All of the workloads had the vast majority of writable pages flushed with
> just 2 PTEs/page changed, and a handful with more PTEs/page changed. It
> would not surprise me if the overall average was around 4 PTEs/page.
>
> I am having a hard time finding any "enterprise" workloads which have a
> lot of PTEs/page right before fork. If anyone can point me to some, that
> would be great.
>
> I will look into batching next, but I am curious if simply using a
> hypercall instead of write fault + emulate will make any difference at
> all. I'll try that first, then implement the batched update. Eventually a
> hypercall which does more would be nice, but I guess we'll have to
> convince the Linux maintainers it's a good idea.

The obvious thing to do is emulate the first 4 updates to a particular
page, and only then switch to batched mode. Slows down the batched path a
bit, but stops it firing in many cases where it is no help.

 -- Keir
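A hedged sketch of the heuristic Keir describes; the threshold constant,
counter field, and names here are hypothetical, not Xen's actual ptwr code:

#define PTWR_EMULATE_THRESHOLD 4

struct pt_page_state {
    unsigned int write_faults;   /* write faults seen so far on this PT page */
};

enum pt_fault_action { PT_EMULATE, PT_UNHOOK };

/* Decide how to handle a write fault on a page-table page.  A real
 * implementation would reset write_faults when the page is rehooked. */
static enum pt_fault_action pt_write_fault_policy(struct pt_page_state *pt)
{
    if (pt->write_faults++ < PTWR_EMULATE_THRESHOLD)
        return PT_EMULATE;   /* cheap when only a few PTEs change */
    return PT_UNHOOK;        /* many updates expected: engage batched wrpt */
}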
> > I am having a hard time finding any "enterprise" workloads which have
> > a lot of PTEs/page right before fork. If anyone can point me to some,
> > that would be great.
> >
> > I will look into batching next, but I am curious if simply using a
> > hypercall instead of write fault + emulate will make any difference at
> > all. I'll try that first, then implement the batched update. Eventually
> > a hypercall which does more would be nice, but I guess we'll have to
> > convince the Linux maintainers it's a good idea.
>
> The obvious thing to do is emulate the first 4 updates to a particular
> page, and only then switch to batched mode. Slows down the batched path
> a bit, but stops it firing in many cases where it is no help.

Why? There should be no overhead to just building batches on the stack (or
a per-vcpu area) and flushing at the end of the page. Certainly if we were
to keep wrpt it would make sense to take a few emulation faults first on a
page before engaging wrpt, but for explicit batches we don't need any
smarts.

[Although the batching strategy would (currently) work for Linux, we do
have to bear in mind that some OSes (possibly NetBSD) won't rely on a lock
to protect updates to pagetables and will use individual atomic ops.]

Ian
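A hedged sketch of the guest-side batching Ian describes: queue the
read-only PTE updates for a page-table page in a small per-CPU (or on-stack)
buffer and flush them as one multi-entry MMU-update request at the end of
the page. The request layout mirrors Xen's mmu_update interface, but the
batch size, type names, and the stubbed flush are illustrative, not code
from the Linux Xen patches:

#include <stdint.h>

struct mmu_update_req {
    uint64_t ptr;   /* machine address of the PTE to update */
    uint64_t val;   /* new (read-only) PTE value */
};

#define PT_BATCH_SIZE 32

struct pt_batch {
    struct mmu_update_req req[PT_BATCH_SIZE];
    unsigned int count;
};

/* Stub: a real guest would issue a single HYPERVISOR_mmu_update()-style
 * hypercall (or a multicall) covering all queued entries. */
static void issue_mmu_update(const struct mmu_update_req *req, unsigned int n)
{
    (void)req;
    (void)n;
}

static void pt_batch_flush(struct pt_batch *b)
{
    if (b->count != 0) {
        issue_mmu_update(b->req, b->count);   /* one hypercall, many PTEs */
        b->count = 0;
    }
}

static void pt_batch_queue(struct pt_batch *b, uint64_t pte_maddr,
                           uint64_t ro_val)
{
    b->req[b->count].ptr = pte_maddr;
    b->req[b->count].val = ro_val;
    if (++b->count == PT_BATCH_SIZE)
        pt_batch_flush(b);
}

/* fork()'s copy loop would call pt_batch_queue() for each PTE it
 * write-protects and pt_batch_flush() at the end of each page-table page. */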
On 27 Jul 2006, at 18:31, Ian Pratt wrote:

>> The obvious thing to do is emulate the first 4 updates to a particular
>> page, and only then switch to batched mode. Slows down the batched path
>> a bit, but stops it firing in many cases where it is no help.
>
> Why? There should be no overhead to just building batches on the stack
> (or a per-vcpu area) and flushing at the end of the page. Certainly if
> we were to keep wrpt it would make sense to take a few emulation faults
> first on a page before engaging wrpt, but for explicit batches we don't
> need any smarts.
>
> [Although the batching strategy would (currently) work for Linux, we do
> have to bear in mind that some OSes (possibly NetBSD) won't rely on a
> lock to protect updates to pagetables and will use individual atomic
> ops.]

It wasn't clear to me there was a batching strategy that would integrate
nicely with Linux generic mm code and be useful to any other OSes. We don't
particularly want to accumulate OS-specific hacks unless it's a significant
win (which we have no evidence it would be here).

 -- Keir
Andrew Theurer
2006-Jul-28 15:21 UTC
Re: [Xen-devel] [PATCH] turn off writable page tables
Keir Fraser wrote:
>
> On 27 Jul 2006, at 18:31, Ian Pratt wrote:
>
>>> The obvious thing to do is emulate the first 4 updates to a particular
>>> page, and only then switch to batched mode. Slows down the batched
>>> path a bit, but stops it firing in many cases where it is no help.
>>
>> Why? There should be no overhead to just building batches on the stack
>> (or a per-vcpu area) and flushing at the end of the page. Certainly if
>> we were to keep wrpt it would make sense to take a few emulation faults
>> first on a page before engaging wrpt, but for explicit batches we don't
>> need any smarts.
>>
>> [Although the batching strategy would (currently) work for Linux, we do
>> have to bear in mind that some OSes (possibly NetBSD) won't rely on a
>> lock to protect updates to pagetables and will use individual atomic
>> ops.]
>
> It wasn't clear to me there was a batching strategy that would integrate
> nicely with Linux generic mm code and be useful to any other OSes. We
> don't particularly want to accumulate OS-specific hacks unless it's a
> significant win (which we have no evidence it would be here).

I think there are only a couple of spots where batching is obvious (the
fork parent being one). However, I don't think we'll see any significant
improvement, as we don't see any right now on typical workloads with
writable pages either. And I think that's the point I want to make: we are
not seeing an advantage for writable pages unless you have a workload with
a lot of dirty PTEs/page which forks a lot, and that apparently is not that
common (please, if anyone has such a workload, let me know, I would like to
test it).

Now, if this were the only consequence, I would not even bother trying to
remove writable page tables. However, writable pages do not scale with SMP
guests, partly because of the single page available (not counting the
inactive page, since it's never used anymore), but also because of the TLB
flush of all active CPUs in that guest after page detachment. Keeping
writable page tables will probably also make implementing fine-grained
locking in the mm.c hypercall functions quite difficult.

One other point: for those OSes which use cmpxchg on PTEs, I believe
keeping the emulate path will preserve the cmpxchg, so I don't think we
need wtpt for that. Alternatively, we could add a set_pte_cmpxchg call if
needed.

So, in summary, we know writable page tables are not broken; they just
don't help on typical workloads because the PTEs/page are so low. However,
they do hurt SMP guest performance. If we are not seeing a benefit today,
should we turn it off? Should we make it a compile-time option, with the
default off?

Thanks,

-Andrew
> So, in summary, we know writable page tables are not broken; they just
> don't help on typical workloads because the PTEs/page are so low.
> However, they do hurt SMP guest performance. If we are not seeing a
> benefit today, should we turn it off? Should we make it a compile-time
> option, with the default off?

I wouldn't mind seeing wrpt removed altogether, or at least emulation made
the compile-time default for the moment. There's bound to be some workload
that bites us in the future, which is why batching updates on the fork path
mightn't be a bad thing if it can be done without too much gratuitous
hacking of Linux core code.

Ian
On 28 Jul 2006, at 16:51, Ian Pratt wrote:

>> So, in summary, we know writable page tables are not broken; they just
>> don't help on typical workloads because the PTEs/page are so low.
>> However, they do hurt SMP guest performance. If we are not seeing a
>> benefit today, should we turn it off? Should we make it a compile-time
>> option, with the default off?
>
> I wouldn't mind seeing wrpt removed altogether, or at least emulation
> made the compile-time default for the moment. There's bound to be some
> workload that bites us in the future, which is why batching updates on
> the fork path mightn't be a bad thing if it can be done without too much
> gratuitous hacking of Linux core code.

My only fear is that batched wrpt has some guest-visible effects. For
example, the guest has to be able to cope with seeing page directory
entries with the present bit cleared. Also, on SMP, it has to be able to
cope with spurious page faults anywhere in its address space (e.g., faults
on an unhooked page table which some other VCPU has rehooked by the time
the Xen page fault handler runs, hence the fault is bounced back to the
guest even though there is no work to be done).

If we turn off batched wrpt then guests will not be tested against it, and
we are likely to hit problems if we ever want to turn it back on again --
we'll find that some guests are not able to correctly handle the weird side
effects. On the other hand, perhaps we can find a neater, more explicit
alternative to batched wrpt in future.

 -- Keir
On 28/07/06 10:21 -0500, Andrew Theurer wrote:
> So, in summary, we know writable page tables are not broken; they just
> don't help on typical workloads because the PTEs/page are so low.
> However, they do hurt SMP guest performance. If we are not seeing a
> benefit today, should we turn it off? Should we make it a compile-time
> option, with the default off?

Keep in mind that the time is upon us when all servers are SMP, and soon we
will see 8-way blades. If you believe that what we are releasing this year
will take 6-12 months to make it to production at a customer's site, then
we should be assuming that everything (even laptops) is SMP. Looking ahead
to next year, it's not unlikely that the sweet spot for running Xen will go
up from 2-4 cores to 4-8 cores.

As a result, I think wpt should not be the default.

Mike

(In the best tradition of a hardware vendor who likes for everyone to have
the latest iron.)