Magenheimer, Dan (HP Labs Fort Collins) wrote:
> Hi Eddie --
>>> I see what you mean about vMMU. Is it ifdef'd or
>>> a runtime choice? I'd like to be able to do some
>>> comparisons of a global lVHPT vs a per-domain lVHPT.
>> Besides performance, a global VHPT makes it hard to support multiple
>> page sizes, e.g. Dom1 using a 16KB default page size while
>> Dom2 uses a 4KB page size. Also, once a VM sets rr.ps to a new
>> value, I guess you need to purge the whole VHPT, which causes
>> serious problems as the scale grows.
>
> This is not as much of an issue for paravirtualized domains
> as a minimum page size can be specified. Also rr.ps
> can be virtualized (and in fact is virtualized in the
> current implementation).

With a single global VHPT and forcing the same page size limitation, all the domains must be paravirtualized to a hard-defined page size; this definitely limits the capability of Xen/ia64. Will this also imply that only certain versions of the domains can run on the same platform? What will be the scalability issue with a single VHPT table? Imagine multi-VP/multi-LP, with all the LPs walking the same table: you would need a global purge, or to send an IPI to all processors, just to purge a single entry. Costly!

> I agree that purging is a problem, however Linux does not
> currently change page sizes frequently.

Again, I hope we are not limiting this Xen/ia64 to run only one OS version. I believe Linux also needs to purge entries for reasons other than page size changes! Through a per-VP VHPT and the VT-i feature of ia64, we can expand Xen/ia64's capability to enable multiple unmodified OSes to run on Xen/ia64 without knowing what page size the domain is using.
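[To spell out the cost Eddie is pointing at: with one VHPT shared by every logical processor, removing even a single translation means every LP that may have cached it has to be involved. The sketch below is hypothetical C; the helper names (vhpt_remove, local_tlb_purge, send_ipi_all_cpus) are stand-ins, not actual Xen/ia64 or Linux primitives, and on real hardware ptc.ga can replace the IPI broadcast, though the synchronization cost still scales with the number of LPs.]

    struct purge_req {
        unsigned long vaddr;
        unsigned long rid;
    };

    /* Assumed helpers -- stand-ins for whatever primitives the HV has. */
    extern void vhpt_remove(unsigned long vaddr, unsigned long rid);
    extern void local_tlb_purge(unsigned long vaddr, unsigned long rid);
    extern void send_ipi_all_cpus(void (*fn)(void *), void *arg);

    static void flush_local_tlb(void *arg)
    {
        struct purge_req *req = arg;
        local_tlb_purge(req->vaddr, req->rid); /* drop any cached TLB copy */
    }

    static void global_purge(unsigned long vaddr, unsigned long rid)
    {
        struct purge_req req = { vaddr, rid };

        vhpt_remove(vaddr, rid);                  /* shared table: remove once */
        send_ipi_all_cpus(flush_local_tlb, &req); /* cost grows with LP count  */
    }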
Magenheimer, Dan (HP Labs Fort Collins)
2005-Apr-29 15:29 UTC
[Xen-devel] RE: Xen/ia64 - global or per VP VHPT
(Sorry for the cross-posting... I wanted to ensure that
the xen-ia64-devel list members are able to chime in.
With the high traffic on xen-devel, I'm sure many -- like
myself -- fall behind on reading xen-devel!)

> With a single global VHPT and forcing the same page size
> limitation, all the domains must be paravirtualized
> to a hard-defined page size; this definitely limits the
> capability of Xen/ia64. Will this also imply that only
> certain versions of the domains can run on the same platform?
> What will be the scalability issue with a single VHPT table?
> Imagine multi-VP/multi-LP, with all the LPs walking the same
> table: you would need a global purge, or to send an IPI to all
> processors, just to purge a single entry. Costly!

No, multiple page sizes are supported, though there does have
to be a system-wide minimum page size (e.g. if this were defined
as 16KB, a 4KB-page mapping request from a guestOS would be rejected).
Larger page sizes are represented by multiple entries in the
global VHPT.

Purging is definitely expensive but there may be ways to
minimize that. That's where the research comes in.

I expect the answer to be that the global VHPT will have advantages
for some workloads and the per-domain VHPT will have advantages
for other workloads. (The classic answer to any performance
question... "it depends..." :-)

> > I agree that purging is a problem, however Linux does not
> > currently change page sizes frequently.
> Again, I hope we are not limiting this Xen/ia64 to run
> only one OS version. I believe Linux also needs to purge
> entries for reasons other than page size changes!
> Through a per-VP VHPT and the VT-i feature of ia64, we can expand
> Xen/ia64's capability to enable multiple unmodified OSes to run on
> Xen/ia64 without knowing what page size the domain is using.

Per-domain VHPT will have its disadvantages too, namely a large
chunk of memory per domain that is not owned by the domain.
Perhaps this is not as much of a problem on VT, which will be
limited to 16 domains, but I hope to support more non-VT domains
(at least 64, maybe more).

Is the per-domain VHPT the same size as whatever the domain allocates
for its own VHPT (essentially a shadow)? Aren't there purge
performance problems with this too?

Dan
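[As an illustration of the scheme Dan describes above -- a system-wide minimum page size, with larger guest pages expanded into multiple global-VHPT entries -- here is a minimal sketch in C. The constants and the vhpt_insert() helper are hypothetical, not the actual Xen/ia64 code.]

    #define XEN_MIN_PAGE_SHIFT  14              /* 16KB system-wide minimum  */
    #define XEN_MIN_PAGE_SIZE   (1UL << XEN_MIN_PAGE_SHIFT)

    /* Hypothetical helper: install one minimum-size entry in the global VHPT. */
    extern void vhpt_insert(unsigned long vaddr, unsigned long paddr,
                            unsigned long rid);

    /* Map a guest page of 2^page_shift bytes; reject anything smaller than
     * the minimum, and expand larger pages into multiple 16KB entries.     */
    static int guest_map_page(unsigned long vaddr, unsigned long paddr,
                              unsigned int page_shift, unsigned long rid)
    {
        unsigned long off, size;

        if (page_shift < XEN_MIN_PAGE_SHIFT)
            return -1;                          /* e.g. 4KB request rejected */

        size = 1UL << page_shift;
        for (off = 0; off < size; off += XEN_MIN_PAGE_SIZE)
            vhpt_insert(vaddr + off, paddr + off, rid);
        return 0;
    }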
> Per-domain VHPT will have its disadvantages too, namely a large
> chunk of memory per domain that is not owned by the domain.
> Perhaps this is not as much of a problem on VT, which will be
> limited to 16 domains, but I hope to support more non-VT domains
> (at least 64, maybe more).

For the quick answer on this, we are using a fixed partition of the RID space to get 16 domains for a start - enough to get the basic code working. The scheme can be switched to dynamic RID partitioning to get to >64 domains.
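[For reference, a minimal sketch of the fixed-partition scheme Fred describes, assuming the 24-bit ia64 region ID with its top 4 bits used as a domain number (as mentioned later in the thread). The names are illustrative, not the actual Xen/ia64 code.]

    #define RID_BITS        24
    #define DOM_BITS        4                      /* 16 domains              */
    #define GUEST_RID_BITS  (RID_BITS - DOM_BITS)  /* 20-bit space per domain */

    /* Map a guest-visible RID into a machine RID by stamping the domain
     * number into the top 4 bits of the 24-bit region ID.                  */
    static inline unsigned long machine_rid(unsigned int domain_id,
                                            unsigned long guest_rid)
    {
        return ((unsigned long)domain_id << GUEST_RID_BITS) |
               (guest_rid & ((1UL << GUEST_RID_BITS) - 1));
    }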
Magenheimer, Dan (HP Labs Fort Collins)
2005-Apr-29 15:44 UTC
[Xen-devel] RE: Xen/ia64 - global or per VP VHPT
> > Per-domain VHPT will have its disadvantages too, namely a large
> > chunk of memory per domain that is not owned by the domain.
> > Perhaps this is not as much of a problem on VT, which will be
> > limited to 16 domains, but I hope to support more non-VT domains
> > (at least 64, maybe more).
> For the quick answer on this, we are using a fixed partition of
> the RID space to get 16 domains for a start - enough to get the
> basic code working. The scheme can be switched to dynamic RID
> partitioning to get to >64 domains.

But only with a full TLB purge on every domain switch, correct?

Dan
Hi Dan,

Magenheimer, Dan (HP Labs Fort Collins) <mailto:dan.magenheimer@hp.com> wrote on Friday, April 29, 2005 8:29 AM:

> (Sorry for the cross-posting... I wanted to ensure that
> the xen-ia64-devel list members are able to chime in.
> With the high traffic on xen-devel, I'm sure many -- like
> myself -- fall behind on reading xen-devel!)
>
>> With a single global VHPT and forcing the same page size
>> limitation, all the domains must be paravirtualized
>> to a hard-defined page size; this definitely limits the
>> capability of Xen/ia64. Will this also imply that only
>> certain versions of the domains can run on the same platform?
>> What will be the scalability issue with a single VHPT table?
>> Imagine multi-VP/multi-LP, with all the LPs walking the same
>> table: you would need a global purge, or to send an IPI to all
>> processors, just to purge a single entry. Costly!
>
> No, multiple page sizes are supported, though there does have
> to be a system-wide minimum page size (e.g. if this were defined
> as 16KB, a 4KB-page mapping request from a guestOS would be rejected).
> Larger page sizes are represented by multiple entries in the
> global VHPT.

In my opinion this is a moot point, because in order to provide the appropriate semantics for physical mode emulation (PSR.dt, PSR.it, or PSR.rt == 0) it is necessary to support a 4K page size as the minimum (unless you special-case translations for physical mode emulation). Also, in terms of machine memory utilization, it is better to have smaller pages (I know this functionality is not yet available in Xen, but I believe it will become important once people are done working on the basics).

> Purging is definitely expensive but there may be ways to
> minimize that. That's where the research comes in.

It is not just purging. Having a global VHPT is, in general, really bad for scalability. Every time the hypervisor wants to modify anything in the VHPT, it must guarantee that no other processors are accessing that VHPT (this is a fairly complex thing to do in TLB miss handlers). If you make this synchronization mechanism dependent on the number of domains (and processors/cores/threads) in the system, rather than on the degree of SMP of a domain, as it would be with a per-domain VHPT, you will severely limit scalability.

Also consider the caching effects (assuming you have hash chains, which I think you would need in order to avoid forward-progress issues). Every time a processor walks a hash chain, it must bring all those PTEs into its cache. Every time you set an access or dirty bit, you must get the line private. If you are only considering 2-way (maybe even 4-way) machines this is not a big deal, but if we are talking about larger machines (IPF's bread and butter), these problems are really serious.

Another important thing is hashing into the VHPT. If you have a single VHPT for multiple guests (and those guests are the same, e.g., same version of Linux) then you are depending 100% on having a good RID allocator (per domain); otherwise the translations for different domains will start colliding in your hash chains and thus reducing the efficiency of your VHPT. The point here is that guest OSs (that care about this type of stuff) are designed to spread RIDs such that they minimize their own hash chain collisions, but they are not designed to avoid colliding with other guests'. Also, the fact that the hash algorithm is implementation-specific makes this problem even worse.

> I expect the answer to be that the global VHPT will have advantages
> for some workloads and the per-domain VHPT will have advantages
> for other workloads. (The classic answer to any performance
> question... "it depends..." :-)

As you point out, this is ALWAYS the case, but what matters is what your target workloads and target systems are: how many domains per system do you expect to support, how many processors/cores/threads do you expect per system, etc.

>>> I agree that purging is a problem, however Linux does not
>>> currently change page sizes frequently.
>> Again, I hope we are not limiting this Xen/ia64 to run
>> only one OS version. I believe Linux also needs to purge
>> entries for reasons other than page size changes!
>> Through a per-VP VHPT and the VT-i feature of ia64, we can expand
>> Xen/ia64's capability to enable multiple unmodified OSes to run on
>> Xen/ia64 without knowing what page size the domain is using.
>
> Per-domain VHPT will have its disadvantages too, namely a large
> chunk of memory per domain that is not owned by the domain.
> Perhaps this is not as much of a problem on VT, which will be
> limited to 16 domains, but I hope to support more non-VT domains
> (at least 64, maybe more).

Memory footprint is really not that big a deal for these large machines, but in any case, the size of the VHPT is typically proportional to the size of physical memory (some people suggest 4 PTEs per physical page frame and some people suggest 2, but in any case, there is a linear relationship between the two). If you follow this guideline, then individual VHPTs for 5 guests should be 1/5 of the size of the combined VHPT for all 5 guests.

> Is the per-domain VHPT the same size as whatever the domain allocates
> for its own VHPT (essentially a shadow)? Aren't there purge
> performance problems with this too?

Purges are always an issue for SMP machines, but you do not want the problem to scale with the number of domains and the number of processors/cores/threads in the system.

> Dan

Bert
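[For concreteness, this is what Bert's sizing rule of thumb looks like in code: a fixed number of long-format entries per physical page frame, rounded up to a power of two so the result can be programmed into PTA.size. The constants and the helper are illustrative, not taken from Xen/ia64.]

    #define VHPT_ENTRY_SIZE  32UL          /* long-format VHPT entry: 32 bytes  */
    #define PTES_PER_FRAME   2UL           /* rule of thumb: 2..4 per frame     */
    #define VHPT_MIN_SIZE    (1UL << 15)   /* 32KB floor, smallest PTA size     */

    /* Size (in bytes) of a VHPT covering 'nr_frames' physical page frames,
     * rounded up to a power of two.                                           */
    static unsigned long vhpt_size_bytes(unsigned long nr_frames)
    {
        unsigned long want = nr_frames * PTES_PER_FRAME * VHPT_ENTRY_SIZE;
        unsigned long size = VHPT_MIN_SIZE;

        while (size < want)
            size <<= 1;
        return size;
    }

[For example, a 4GB guest with 16KB pages has 256K frames; at 2 entries per frame that comes to a 16MB table, well under 1% of guest memory. The linearity is also why one VHPT sized for all guests and several per-guest VHPTs end up consuming roughly the same total memory.]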
Hi Dan,

Magenheimer, Dan (HP Labs Fort Collins) <mailto:dan.magenheimer@hp.com> wrote on Friday, April 29, 2005 1:05 PM:

>> In my opinion this is a moot point, because in order to provide the
>> appropriate semantics for physical mode emulation (PSR.dt, PSR.it, or
>> PSR.rt == 0) it is necessary to support a 4K page size as the minimum
>> (unless you special-case translations for physical mode
>> emulation). Also, in terms of machine memory utilization, it is
>> better to have smaller pages (I know this functionality is not yet
>> available in Xen, but I believe it will become important once people
>> are done working on the basics).
>
> In my opinion, performance when emulating physical mode is
> a moot point.

Linux IPF TLB miss handlers turn off PSR.dt. This is very performance sensitive.

> It might make sense to simply not insert
> metaphysical addresses into the VHPT and just rely on the
> TLB (though perhaps a one-entry virtual TLB might be required
> to ensure forward progress).
>
> Remember, one major difference between full virtualization (VT)
> and paravirtualization is that you have to handle any case that
> any crazy OS designer might try, while I just have to ensure that
> I can tell the crazy OS designer what crazy things need to be
> removed to make sure it works on Xen :-) This guarantees that
> our design choices will sometimes differ.

I have not forgotten that (just as I have not forgotten this same argument used in other contexts in the past: let's just do it this way, because we know no reasonable software will ever do that...). The way I see you applying this argument here is a bit different, though: there are things that Linux does today that will cause trouble with this particular design choice, but all I have to do is to make sure these troublesome things get designed out of the paravirtualized OS.

In any case, I think it is critical to define exactly what an IPF paravirtualized guest is (maybe this has already been done and I missed it) before making assumptions as to what the guest will and will not do (especially when those things are done by native guests today). I don't think it is quite the same as an x86 XenoLinux, as a number of the hypercalls are very specific to addressing x86 virtualization holes, which do not have equivalents in IPF. I know that there have been attempts at paravirtualizing (actually more like dynamically patching) IPF Linux before (e.g., vBlades, you may be familiar with it :-), but I am not sure if the Xen project for IPF has decided exactly what an IPF paravirtualized XenoLinux will look like. I am also not sure if it has been decided that no native IPF guests (no binary patching) will be supported.

>> It is not just purging. Having a global VHPT is, in general,
>> really bad for scalability....
>
>> Another important thing is hashing into the VHPT. If you have ...
>
>> As you point out, this is ALWAYS the case, but what matters is
>> what your target workloads and target systems are...
>
> All this just says that a global VHPT may not be good for a
> big machine. This may be true. I'm not suggesting that
> Xen/ia64 support ONLY a global VHPT or even necessarily that
> it be the default, just that we preserve the capability to
> configure either (or even both).

Let's define "big" in an environment where there are multiple cores per die... Another argument (independent of scalability) here is that interference between guests/domains in a virtualization environment should be minimized. This particular design of a single VHPT is fostering this interference.

> I wasn't present in the early Itanium architecture discussions
> but I'll bet there were advocates for both lVHPT and sVHPT who
> each thought it a terrible waste that the architecture support
> both. That was silicon and both are supported; this is a small
> matter of software :-)

I was present during those early discussions, and the argument went this way: we need to support both Windows (a MAS OS) and HP-UX (a SAS OS) => we need to support both short- and long-format VHPT.

>> Memory footprint is really not that big a deal for these
>> large machines, but in any case, the size of the VHPT is typically
>> proportional to the size of physical memory (some people suggest
>> 4 PTEs per physical page frame and some people suggest 2, but in
>> any case, there is a linear relationship between the two). If you
>> follow this guideline, then individual VHPTs for 5 guests should
>> be 1/5 of the size of the combined VHPT for all 5 guests.
>
> The point is that significant memory needs to be reserved in advance
> or dynamically recovered whenever a domain launches. Maybe this
> isn't a big deal with a good flexible memory allocator and
> "hidden ballooning" to steal physical memory from running domains.

Going back to the example of 5 VHPTs of size X vs. one VHPT of size 5X, I would say that this problem is worse with the single VHPT, as it either has to have the ability to grow dynamically as domains get created, or has to be pre-allocated to a size that supports a maximum number of domains.

> E.g., assume an administrator automatically configures all domains
> with a nominal 4GB but the ability to dynamically grow up to 64GB. The
> per-guest VHPT would need to pre-allocate a shadow VHPT for the
> largest of these (say 1% of 64GB) even if each of the domains never
> grew beyond the 4GB, right? (Either that or some kind of VHPT
> resizing might be required whenever memory is "hot-plugged"?)

I am not sure I understand your example. As I said in my previous posting, experience has shown that the optimal size of the VHPT (for performance) depends on the number of physical pages it supports (not how many domains, but how many total pages those domains will be using). In other words, the problem of having a VHPT support more memory is independent of whether it represents one domain or multiple domains. It depends on how many total memory pages are being supported.

I believe that you somehow think that having a single VHPT to support multiple domains would save you some memory, or rather the need to grow a VHPT? Or, put another way, why do you think that the situation you describe above is unique to the multiple-VHPT design and not to the single-VHPT design?

> Again, there are a lot of interesting questions and discussion around
> this... which means it's best to preserve our options if possible.

I see it a bit more black and white than you do.

> Dan

Bert
Hi, Dan:

See my later comments, as I have a 15-hour time difference with you guys.

Magenheimer, Dan (HP Labs Fort Collins) wrote:
>>> Per-domain VHPT will have its disadvantages too, namely a large
>>> chunk of memory per domain that is not owned by the domain.
>>> Perhaps this is not as much of a problem on VT, which will be
>>> limited to 16 domains, but I hope to support more non-VT domains
>>> (at least 64, maybe more).
>> For the quick answer on this, we are using a fixed partition of
>> the RID space to get 16 domains for a start - enough to get the
>> basic code working. The scheme can be switched to dynamic RID
>> partitioning to get to >64 domains.
>
> But only with a full TLB purge on every domain switch, correct?

Actually we have designed the RID virtualization mechanism, but it is not in this implementation yet. In this area there is no real difference between your approach (a starting_rid/ending_rid per domain) and using the high 4 bits to indicate the domain ID; merging the two is quite easy.

In our implementation, a full TLB purge happens only when all machine TLB entries are exhausted and the HV decides to recycle all machine TLBs (like current Linux does). For a domain switch, we don't have any extra requirement except switching the machine PTA (to point to the per-domain VHPT).

> All this just says that a global VHPT may not be good for a
> big machine. This may be true. I'm not suggesting that
> Xen/ia64 support ONLY a global VHPT or even necessarily that
> it be the default, just that we preserve the capability to
> configure either (or even both).

I am afraid supporting both solutions is an extremely high burden, as the vMMU is too fundamental a thing. For example: how do you support hypercall information passing between guest and HV? You are using the poor man's exception handler now, which is OK as a temporary debugging aid, but as we discussed, it has critical problems/limitations.

The way we solve that in our vMMU is that we keep all guest TLBs in an HV-internal data structure, and we have defined a separate TLB section type, like ForeignMap (the term in x86 Xen) or the hypercall shared page, in the vTLB. XenoLinux, the device model, or others can insert special maps of this type. This type of section will not be automatically purged when the collision chain is full. In this way the guest will not see a TLB miss for "uaccess" in the HV when accessing guest data. How do you solve that with a global VHPT? I am afraid it is really hard. Why do we want to spend more time discarding the existing approach and investigating a direction with no hints?

BTW, how do you support the MMIO map for domain N if domain N is an unmodified Linux? I am afraid a global VHPT will also eventually need a similar vTLB data structure to support that.

> Is the per-domain VHPT the same size as whatever the domain allocates
> for its own VHPT (essentially a shadow)? Aren't there purge
> performance problems with this too?

In our vMMU implementation, the per-domain VHPT is only used to assist the software data structure (the per-domain vTLB), so we are actually not a shadow.

Eddie
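[A minimal sketch of the kind of vTLB bookkeeping Eddie describes: each guest translation carries a section type, and entries marked as foreign-map/shared-page sections are skipped when a full collision chain needs a victim. All names and fields here are illustrative, not the actual vMMU code.]

    enum vtlb_section {
        VTLB_SEC_GUEST,        /* ordinary guest translation, may be evicted */
        VTLB_SEC_FOREIGN_MAP,  /* e.g. foreign map / hypercall shared page   */
    };

    struct vtlb_entry {
        unsigned long      vaddr;
        unsigned long      maddr;
        unsigned long      rid;
        enum vtlb_section  section;
        struct vtlb_entry *next;   /* collision chain */
    };

    /* Pick a victim on a full collision chain; never evict pinned sections. */
    static struct vtlb_entry *vtlb_find_victim(struct vtlb_entry *chain)
    {
        for (; chain != NULL; chain = chain->next)
            if (chain->section == VTLB_SEC_GUEST)
                return chain;
        return NULL;   /* only pinned entries here; caller must grow the chain */
    }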
Matt Chapman
2005-May-01 02:21 UTC
Re: [Xen-ia64-devel] RE: Xen/ia64 - global or per VP VHPT
(I'm coming in late here, so apologies if I'm missing something.)

>> No, multiple page sizes are supported, though there does have
>> to be a system-wide minimum page size (e.g. if this were defined
>> as 16KB, a 4KB-page mapping request from a guestOS would be rejected).

Of course, if necessary, smaller page sizes could be supported in software at a performance cost, as suggested in the ASDM (2.II.5.5 Subpaging).

> In my opinion this is a moot point, because in order to provide the
> appropriate semantics for physical mode emulation (PSR.dt, PSR.it, or
> PSR.rt == 0) it is necessary to support a 4K page size as the minimum
> (unless you special-case translations for physical mode emulation).

Can you explain why this is the case? Surely the granularity of the metaphysical->physical mapping can be arbitrary?

> Also, in terms of machine memory utilization, it is better to have
> smaller pages (I know this functionality is not yet available in Xen,
> but I believe it will become important once people are done working
> on the basics).

Below you say "Memory footprint is really not that big a deal for these large machines" ;)

As it is, just about everyone runs Itanium Linux with a 16KB page size, so 16KB memory granularity is obviously not a big deal. Since the mappings inserted by the hypervisor are limited to this granularity (at least, without some complicated superpage logic to allocate and map pages sequentially), I like the idea of using a larger granularity in order to increase TLB coverage.

>> Purging is definitely expensive but there may be ways to
>> minimize that. That's where the research comes in.
>
> It is not just purging. Having a global VHPT is, in general, really bad
> for scalability. Every time the hypervisor wants to modify anything in
> the VHPT, it must guarantee that no other processors are accessing that
> VHPT (this is a fairly complex thing to do in TLB miss handlers).

I think there are more than two options here? From what I gather, I understand that you are comparing a single global lVHPT to a per-domain lVHPT. There is also the option of a per-physical-CPU lVHPT, and a per-domain per-virtual-CPU lVHPT. When implementing the lVHPT in Linux I decided on a per-CPU VHPT for the scalability reasons that you cite. And one drawback is, as Dan says, that it may be difficult to find a large enough chunk of free physical memory to bring up a new processor (or domain, in the per-domain case).

> Another important thing is hashing into the VHPT. If you have a single
> VHPT for multiple guests (and those guests are the same, e.g., same
> version of Linux) then you are depending 100% on having a good RID
> allocator (per domain); otherwise the translations for different domains
> will start colliding in your hash chains and thus reducing the efficiency
> of your VHPT. The point here is that guest OSs (that care about this type
> of stuff) are designed to spread RIDs such that they minimize their own
> hash chain collisions, but they are not designed to avoid colliding with
> other guests'. Also, the fact that the hash algorithm is
> implementation-specific makes this problem even worse.

RID allocation is certainly an issue, but I think it's an issue even with a per-domain VHPT. If you have a guest that uses the short VHPT, such as Linux by default, it may not produce good RID allocation even with just one domain. For best performance one would need to either modify the guest, or virtualise RIDs completely, in which case a global or per-physical-CPU VHPT can be made to work well too.

Matt
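[As an illustration of what "virtualise RIDs completely" could mean on top of the fixed partitioning sketched earlier in the thread: map each guest RID to a machine RID that both carries the domain number and scrambles the low bits, so that identical guests do not compete for the same VHPT hash buckets. Everything here is hypothetical (the multiplier is just an arbitrary odd constant, and no actual Xen/ia64 code is implied); it assumes the guest confines itself to a 20-bit RID space.]

    #define RID_BITS        24
    #define DOM_BITS        4
    #define GUEST_RID_BITS  (RID_BITS - DOM_BITS)

    static unsigned long virtualize_rid(unsigned int domain_id,
                                        unsigned long guest_rid)
    {
        /* Multiplying by an odd constant modulo 2^20 permutes the 20-bit
         * guest RID space, spreading consecutive guest RIDs apart.        */
        unsigned long scrambled =
            (guest_rid * 2654435761UL) & ((1UL << GUEST_RID_BITS) - 1);

        return ((unsigned long)domain_id << GUEST_RID_BITS) | scrambled;
    }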
Magenheimer, Dan (HP Labs Fort Collins)
2005-May-01 02:37 UTC
[Xen-devel] RE: Xen/ia64 - global or per VP VHPT
Since this discussion has gotten so ia64 (IPF) specific, I'm going to move it entirely to xen-ia64-devel to avoid excessive cross-posting. For reasons of courtesy, we should probably limit cross-posting to issues that are of general interest to the broader Xen development community.

Let me take this opportunity to advertise xen-ia64-devel. If you are interested in following this (or other Xen/ia64) discussion, please sign up at http://lists.xensource.com/xen-ia64-devel or read the archives at http://lists.xensource.com/archives/html/xen-ia64-devel/

> -----Original Message-----
> From: Munoz, Alberto J [mailto:alberto.j.munoz@intel.com]
> Sent: Friday, April 29, 2005 2:58 PM
> To: Magenheimer, Dan (HP Labs Fort Collins); Yang, Fred; Dong, Eddie
> Cc: ipf-xen; Xen-devel; xen-ia64-devel@lists.xensource.com
> Subject: RE: Xen/ia64 - global or per VP VHPT
>
> [...]
Magenheimer, Dan (HP Labs Fort Collins)
2005-May-01 03:16 UTC
[Xen-devel] RE: Xen/ia64 - global or per VP VHPT
Moving to xen-ia64-devel only...

> -----Original Message-----
> From: Dong, Eddie [mailto:eddie.dong@intel.com]
> Sent: Friday, April 29, 2005 8:11 PM
> To: Magenheimer, Dan (HP Labs Fort Collins); Yang, Fred
> Cc: ipf-xen; Xen-devel; xen-ia64-devel@lists.xensource.com
> Subject: RE: Xen/ia64 - global or per VP VHPT
>
> [...]