Jan Beulich
2010-Jan-12 14:47 UTC
[Xen-devel] pre-reservation of memory for domain creation
Close to the top of XendDomainInfo::_constructDomain() there's a comment

    # Hack to pre-reserve some memory for initial domain creation.
    # There is an implicit memory overhead for any domain creation. This
    # overhead is greater for some types of domain than others. For
    # example, an x86 HVM domain will have a default shadow-pagetable
    # allocation of 1MB. We free up 4MB here to be on the safe side.
    # 2MB memory allocation was not enough in some cases, so it's 4MB now

which would seem inconsistent with the current implementation in the hypervisor: sh_set_allocation() as called from shadow_enable() wants *at least* 4MB - modified from 1MB by c/s 20389. It is not clear (to me) from the changeset description why this change was needed (and even less why it was needed uniformly for 64- and 32-bits). If indeed it is needed, the tools should be adjusted accordingly.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2010-Jan-12 15:04 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
On 12/01/2010 14:47, "Jan Beulich" <JBeulich@novell.com> wrote:

> which would seem inconsistent with the current implementation in the
> hypervisor: sh_set_allocation() as called from shadow_enable() wants
> *at least* 4MB - modified from 1MB by c/s 20389. It is not clear (to me)
> from the changeset description why this change was needed (and even
> less why it was needed uniformly for 64- and 32-bits). If indeed it is
> needed, the tools should be adjusted accordingly.

ISTR trying the patch without the sh_set_allocation() chunk and failing to create many-VCPU HVM guests without it. Either the creation failed or Xen crashed (!) -- I can't remember which now, it may even have been both (with Xen crash during destruction of partially-created domain).

 -- Keir
Jan Beulich
2010-Jan-12 15:32 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
>>> Keir Fraser <keir.fraser@eu.citrix.com> 12.01.10 16:04 >>>
>ISTR trying the patch without the sh_set_allocation() chunk and failing to
>create many-VCPU HVM guests without it. Either the creation failed or Xen
>crashed (!) -- I can't remember which now, it may even have been both (with
>Xen crash during destruction of partially-created domain).

Hmm - given that we're talking about order-2 allocations here, it would seem a mistake in the first place to use a fixed amount here if it really depends on the number of vCPU-s. If that number isn't known at that point (I think it isn't, since XEN_DOMCTL_createdomain doesn't take it as input), the amount should be adjusted when setting that count (i.e. from XEN_DOMCTL_max_vcpus).

Also - how did that fixed amount get determined? shadow_min_acceptable_pages() says 128 pages per vCPU, but 4MB (1024 pages) is not matching this (given that it ought to be fine for 128 vCPU-s), just as the old value of 1MB wasn't matching the supposed need of 32 vCPU-s.

In any case - the larger the value here, the more likely VM creation will fail due to fragmented memory (and fragmentation when using ballooning here grows with the total amount of memory Dom0 owns). Hence it's not even clear whether setting the pre-reservation value to 8MB would be good enough, or whether even 16MB would fall short in not too uncommon cases.

And btw., I think that papering over a Xen crash during destruction of a partially-created domain is a rather bad thing to do.

Jan
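The mismatch Jan points out is plain arithmetic. A minimal sketch, assuming 4 KiB pages (consistent with "4MB (1024 pages)" above) and the 128-pages-per-vCPU figure quoted for shadow_min_acceptable_pages(); the helper names are made up for illustration and are not Xen or xend functions:

```python
PAGE_SIZE = 4096       # bytes; assumes 4 KiB pages, i.e. 4 MB == 1024 pages
PAGES_PER_VCPU = 128   # per-vCPU minimum quoted in the thread
MIB = 1024 * 1024


def pages_in(mib: int) -> int:
    """Number of 4 KiB pages in `mib` megabytes."""
    return mib * MIB // PAGE_SIZE


def vcpus_covered(mib: int) -> int:
    """How many vCPUs a fixed allocation covers at 128 pages each."""
    return pages_in(mib) // PAGES_PER_VCPU


def mib_needed(vcpus: int) -> int:
    """Megabytes needed to give every vCPU its 128-page minimum."""
    return vcpus * PAGES_PER_VCPU * PAGE_SIZE // MIB


if __name__ == "__main__":
    for mib in (1, 4):
        print(f"{mib} MB = {pages_in(mib)} pages, covers {vcpus_covered(mib)} vCPUs")
    for n in (32, 128):
        print(f"{n} vCPUs need {mib_needed(n)} MB")
```

Under these assumptions neither fixed value lines up with the per-vCPU rule: 1 MB covers 2 vCPUs and 4 MB covers 8, while 32 and 128 vCPUs would need 16 MB and 64 MB respectively.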
Keir Fraser
2010-Jan-12 15:53 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
On 12/01/2010 15:32, "Jan Beulich" <JBeulich@novell.com> wrote:

> Also - how did that fixed amount get determined?
> shadow_min_acceptable_pages() says 128 pages per vCPU, but 4MB
> (1024 pages) is not matching this (given that it ought to be fine for
> 128 vCPU-s), just as the old value of 1MB wasn't matching the
> supposed need of 32 vCPU-s.

I'm not sure I really believe the toolstack comment. As you say, it wasn't correct before or after the patch we're talking about.

> And btw., I think that papering over a Xen crash during
> destruction of partially-created domain is rather bad a thing to do.

Depends on whether domain-creation failure at the point it failed -- and due to inadequate pre-reservation -- is expected and allowed for by the HVM paging logic.

 -- Keir
Jan Beulich
2010-Jan-12 16:17 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
>>> Keir Fraser <keir.fraser@eu.citrix.com> 12.01.10 16:53 >>>
>On 12/01/2010 15:32, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>> Also - how did that fixed amount get determined?
>> shadow_min_acceptable_pages() says 128 pages per vCPU, but 4MB
>> (1024 pages) is not matching this (given that it ought to be fine for
>> 128 vCPU-s), just as the old value of 1MB wasn't matching the
>> supposed need of 32 vCPU-s.
>
>I'm not sure I really believe the toolstack comment. As you say, it wasn't
>correct before or after the patch we're talking about.

The tool stack comment was correct before that patch; it isn't now. The size in shadow_enable() doesn't match shadow_min_acceptable_pages(), but that's all hypervisor code.

And that's also no answer to the question of where the particular value came from (and, in particular, why its being much smaller than what shadow_min_acceptable_pages() would determine still isn't going to be a problem).

>> And btw., I think that papering over a Xen crash during
>> destruction of partially-created domain is rather bad a thing to do.
>
>Depends on whether domain-creation failure at the point it failed -- and due
>to inadequate pre-reservation -- is expected and allowed for by the HVM
>paging logic.

So you say it's acceptable for a flaw in the tools (exposed during guest creation) to bring down the whole machine?

Jan
Keir Fraser
2010-Jan-12 16:57 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
> So you say it's acceptable for a flaw in the tools (exposed during guest
> creation) to bring down the whole machine?

I think I regarded it at the time as a flaw in the hypervisor, not giving a large enough value to sh_set_allocation(). The tools shouldn't be able to crash Xen via the normal dom0 interfaces, at least without trying.

 -- Keir
Xu, Dongxiao
2010-Jan-13 02:34 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
If we didn't add this change, as Keir said, Xen will crash during destruction of a partially-created domain. However, I didn't notice the toolstack and shadow_min_acceptable_pages() side at that time...

For now, should we adjust the shadow pre-alloc size to match shadow_min_acceptable_pages() and modify the toolstack accordingly?

Thanks!
Dongxiao
Jan Beulich
2010-Jan-13 07:55 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 13.01.10 03:34 >>>
>If we didn't add this change, as Keir said, Xen will crash during destruction of partially-created domain.

If this indeed is reproducible, I think it should be fixed.

>However I didn't noticed the toolstack and shadow_min_acceptable_pages() side at that time...
>For now, should we adjust the shadow pre-alloc size to match shadow_min_acceptable_pages() and modify toolstack accordingly?

I would say so, just with the problem that I can't reliably say what "accordingly" here would be (and hence I can't craft a patch I can guarantee will work at least in most of the cases).

And as said before, I'm also not convinced that using the maximum possible number of vCPU-s for this initial calculation is really the right thing to do.

Jan
Keir Fraser
2010-Jan-13 08:00 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
It's either that or stub out the feature (return to previous #VCPUs limit) I think.

 -- Keir

On 13/01/2010 02:34, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:

> If we didn't add this change, as Keir said, Xen will crash during destruction
> of partially-created domain.
> However I didn't noticed the toolstack and shadow_min_acceptable_pages() side
> at that time...
> For now, should we adjust the shadow pre-alloc size to match
> shadow_min_acceptable_pages() and modify toolstack accordingly?
>
> Thanks!
> Dongxiao
Xu, Dongxiao
2010-Jan-13 08:53 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
Keir, I will work on that.

Thanks!
Dongxiao

Keir Fraser wrote:
> It's either that or stub out the feature (return to previous #VCPUs
> limit) I think.
>
>  -- Keir
Xu, Dongxiao
2010-Jan-14 07:16 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
Keir and Jan,

I am working on the issue "pre-reservation of memory for domain creation". Now I have the following findings.

Currently the guest initialization process in xend (XendDomainInfo.py) is:

_constructDomain() --> domain_create() --> domain_max_vcpus() ... --> _initDomain() --> shadow_mem_control() ...

In domain_create, we previously reserved 1M of memory for domain creation (as described in the xend comment), and this memory SHOULD NOT be related to the vcpu number. Later, shadow_mem_control() will modify the shadow size to 256 pages per vcpu (plus some other values related to guest memory size...). Therefore C/S 20389, which changes 1M to 4M to fit a larger vcpu number, is wrong. I'm sorry for that.

The following is the reason why 1M currently doesn't work for a big number of vcpus and, as we mentioned, causes a Xen crash.

Each time sh_set_allocation() is called, it checks whether shadow_min_acceptable_pages() has been allocated; if not, it will allocate them. That is to say, 128 pages per vcpu. But before we define d->max_vcpu, the guest vcpus haven't been initialized, so shadow_min_acceptable_pages() always returns 0. Therefore we only allocated 1M of shadow memory for domain_create, and didn't satisfy 128 pages per vcpu for alloc_vcpu().

As we know, vcpu allocation is done in the XEN_DOMCTL_max_vcpus hypercall. However, at this point we haven't called shadow_mem_control() and are still using the pre-allocated 1M of shadow memory to allocate that many vcpus. So it should be a BUG. Therefore when the vcpu number increases, 1M is not enough and causes the Xen crash. C/S 20389 exposes this issue.

So I think the right process should be: after d->max_vcpu is set and before alloc_vcpu(), we should call sh_set_allocation() to satisfy 128 pages per vcpu. The following patch does this work. Does it work for you?

Thanks!

Best Regards,
-- Dongxiao

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>

diff -r 13d4e78ede97 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c Wed Jan 13 08:33:34 2010 +0000
+++ b/xen/arch/x86/mm/shadow/common.c Thu Jan 14 14:02:23 2010 +0800
@@ -41,6 +41,9 @@

 DEFINE_PER_CPU(uint32_t,trace_shadow_path_flags);

+static unsigned int sh_set_allocation(struct domain *d,
+                                      unsigned int pages,
+                                      int *preempted);
 /* Set up the shadow-specific parts of a domain struct at start of day.
  * Called for every domain from arch_domain_create() */
 void shadow_domain_init(struct domain *d, unsigned int domcr_flags)
@@ -82,6 +85,12 @@ void shadow_vcpu_init(struct vcpu *v)
     }
 #endif

+    if ( !is_idle_domain(v->domain) )
+    {
+        shadow_lock(v->domain);
+        sh_set_allocation(v->domain, 128, NULL);
+        shadow_unlock(v->domain);
+    }
     v->arch.paging.mode = &SHADOW_INTERNAL_NAME(sh_paging_mode, 3);
 }

@@ -3099,7 +3108,7 @@ int shadow_enable(struct domain *d, u32
     {
         unsigned int r;
         shadow_lock(d);
Xu, Dongxiao
2010-Jan-14 07:19 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
Keir and Jan,

Ignore the previous mail, the patch in the text is incomplete...

I am working on the issue "pre-reservation of memory for domain creation". Now I have the following findings.

Currently the guest initialization process in xend (XendDomainInfo.py) is:

_constructDomain() --> domain_create() --> domain_max_vcpus() ... --> _initDomain() --> shadow_mem_control() ...

In domain_create, we previously reserved 1M of memory for domain creation (as described in the xend comment), and this memory SHOULD NOT be related to the vcpu number. Later, shadow_mem_control() will modify the shadow size to 256 pages per vcpu (plus some other values related to guest memory size...). Therefore C/S 20389, which changes 1M to 4M to fit a larger vcpu number, is wrong. I'm sorry for that.

The following is the reason why 1M currently doesn't work for a big number of vcpus and, as we mentioned, causes a Xen crash.

Each time sh_set_allocation() is called, it checks whether shadow_min_acceptable_pages() has been allocated; if not, it will allocate them. That is to say, 128 pages per vcpu. But before we define d->max_vcpu, the guest vcpus haven't been initialized, so shadow_min_acceptable_pages() always returns 0. Therefore we only allocated 1M of shadow memory for domain_create, and didn't satisfy 128 pages per vcpu for alloc_vcpu().

As we know, vcpu allocation is done in the XEN_DOMCTL_max_vcpus hypercall. However, at this point we haven't called shadow_mem_control() and are still using the pre-allocated 1M of shadow memory to allocate that many vcpus. So it should be a BUG. Therefore when the vcpu number increases, 1M is not enough and causes the Xen crash. C/S 20389 exposes this issue.

So I think the right process should be: after d->max_vcpu is set and before alloc_vcpu(), we should call sh_set_allocation() to satisfy 128 pages per vcpu. The following patch does this work. Does it work for you?

Thanks!

Best Regards,
-- Dongxiao

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>

diff -r 13d4e78ede97 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c Wed Jan 13 08:33:34 2010 +0000
+++ b/xen/arch/x86/mm/shadow/common.c Thu Jan 14 14:02:23 2010 +0800
@@ -41,6 +41,9 @@

 DEFINE_PER_CPU(uint32_t,trace_shadow_path_flags);

+static unsigned int sh_set_allocation(struct domain *d,
+                                      unsigned int pages,
+                                      int *preempted);
 /* Set up the shadow-specific parts of a domain struct at start of day.
  * Called for every domain from arch_domain_create() */
 void shadow_domain_init(struct domain *d, unsigned int domcr_flags)
@@ -82,6 +85,12 @@ void shadow_vcpu_init(struct vcpu *v)
     }
 #endif

+    if ( !is_idle_domain(v->domain) )
+    {
+        shadow_lock(v->domain);
+        sh_set_allocation(v->domain, 128, NULL);
+        shadow_unlock(v->domain);
+    }
     v->arch.paging.mode = &SHADOW_INTERNAL_NAME(sh_paging_mode, 3);
 }

@@ -3099,7 +3108,7 @@ int shadow_enable(struct domain *d, u32
     {
         unsigned int r;
         shadow_lock(d);
-        r = sh_set_allocation(d, 1024, NULL); /* Use at least 4MB */
+        r = sh_set_allocation(d, 256, NULL); /* Use at least 1MB */
         if ( r != 0 )
         {
             sh_set_allocation(d, 0, NULL);
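The ordering problem described in this mail can be mimicked with a toy model. This is only a sketch of the sequence as the mail tells it: `ToyDomain`, `set_allocation()`, `create()` and `set_max_vcpus()` are hypothetical stand-ins rather than Xen or xend interfaces, and the real sh_set_allocation() may take more into account than this model does.

```python
PAGES_PER_VCPU = 128   # minimum per vCPU, as quoted in the thread


class ToyDomain:
    """Hypothetical stand-in for a domain: vCPU count and shadow pool size."""

    def __init__(self) -> None:
        self.vcpus = 0
        self.shadow_pages = 0


def min_acceptable_pages(d: ToyDomain) -> int:
    # Zero until vCPUs exist, which is the crux of the problem described above.
    return PAGES_PER_VCPU * d.vcpus


def set_allocation(d: ToyDomain, pages: int) -> None:
    # Simplification: never go below the current per-vCPU minimum.
    d.shadow_pages = max(pages, min_acceptable_pages(d))


def create(d: ToyDomain, initial_pages: int = 256) -> None:
    # Domain creation: fixed initial pool (256 pages == 1 MB with 4 KiB pages).
    set_allocation(d, initial_pages)


def set_max_vcpus(d: ToyDomain, n: int, grow_pool: bool = False) -> None:
    # vCPUs are added one by one; the pool is re-checked only if grow_pool is set.
    for _ in range(n):
        d.vcpus += 1
        if grow_pool:
            set_allocation(d, d.shadow_pages)


def shortfall(d: ToyDomain) -> int:
    """Pages missing relative to the per-vCPU minimum."""
    return max(0, min_acceptable_pages(d) - d.shadow_pages)


if __name__ == "__main__":
    for grow in (False, True):
        d = ToyDomain()
        create(d)
        set_max_vcpus(d, 32, grow_pool=grow)
        print(f"grow_pool={grow}: pool={d.shadow_pages} shortfall={shortfall(d)}")
```

In this model, a pool that is sized once at creation falls behind as soon as more than two vCPUs are added, whereas re-checking the minimum on each vCPU addition keeps the two in step. Whether the host can actually supply those pages at that point is a separate question.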
Jan Beulich
2010-Jan-14 09:00 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 14.01.10 08:19 >>>
> Currently guest initialization process in xend (XendDomainInfo.py) is:
>
> _constructDomain() --> domain_create() --> domain_max_vcpus() ... -->
> _initDomain() --> shadow_mem_control() ...

While the patch certainly matches what I had in mind, with this sequence it is clear that the tools still will need adjustment: The full ballooning only happens from _initDomain(), and hence the pre-reservation (from _constructDomain) of 4MB would still be too small for large vCPU counts.

I wonder though what all this memory is needed for before the domain (not to speak of secondary CPUs) actually gets started. If that could be got under control, tools side adjustment would not be necessary. Tim?

Jan
Xu, Dongxiao
2010-Jan-14 09:09 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
Jan Beulich wrote:
>>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 14.01.10 08:19 >>>
>> Currently guest initialization process in xend
>> (XendDomainInfo.py) is:
>>
>> _constructDomain() --> domain_create() --> domain_max_vcpus() ...
>> --> _initDomain() --> shadow_mem_control() ...
>
> While the patch certainly matches what I had in mind, with this
> sequence it is clear that the tools still will need adjustment: The
> full ballooning only happens from _initDomain(), and hence the
> pre-reservation (from _constructDomain) of 4MB would still be too
> small for large vCPU counts.

The pre-reserved memory size should have no relationship with the vcpu number, as the vcpus haven't been initialized at that point. On vcpu initialization, the patch has ballooned the shadow size to 128 pages per vcpu. Did you find some piece of code that uses the pre-reserved memory to allocate vcpu-related memory?

Thanks!
Dongxiao

> I wonder though what all this memory is needed for before the domain
> (not to speak of secondary CPUs) actually gets started. If that could
> be got under control, tools side adjustment would not be necessary. Tim?
>
> Jan
Keir Fraser
2010-Jan-14 09:27 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
On 14/01/2010 07:19, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:

> +    if ( !is_idle_domain(v->domain) )
> +    {
> +        shadow_lock(v->domain);
> +        sh_set_allocation(v->domain, 128, NULL);
> +        shadow_unlock(v->domain);
> +    }

What if sh_set_allocation() fails?

 -- Keir
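Keir's question is about the ignored return value. As a rough illustration of the pattern he is asking for (check the result and hand the error back to the caller), here is a toy sketch; `set_allocation()` and `vcpu_init()` below are hypothetical models with made-up signatures, not the hypervisor functions, and the free-page accounting is an assumption for the example.

```python
import errno

PAGES_PER_VCPU = 128   # per-vCPU minimum quoted in the thread


def set_allocation(pool_pages: int, target_pages: int, free_pages: int):
    """Toy stand-in: grow the pool to `target_pages` if the host has the pages.

    Returns (new_pool_pages, rc) where rc is 0 on success or -errno.ENOMEM.
    """
    extra = max(0, target_pages - pool_pages)
    if extra > free_pages:
        return pool_pages, -errno.ENOMEM
    return pool_pages + extra, 0


def vcpu_init(pool_pages: int, existing_vcpus: int, free_pages: int):
    """Make room for one more vCPU and report failure instead of ignoring it."""
    target = (existing_vcpus + 1) * PAGES_PER_VCPU
    new_pool, rc = set_allocation(pool_pages, target, free_pages)
    if rc != 0:
        # Leave the pool untouched and let the caller fail vCPU creation cleanly.
        return pool_pages, rc
    return new_pool, 0


if __name__ == "__main__":
    print(vcpu_init(256, 7, 1024))     # enough free pages for an 8th vCPU
    print(vcpu_init(256, 127, 1024))   # 128th vCPU, far too few free pages
```

The point of the sketch is only that the failing case surfaces as an error code to the caller instead of leaving the domain with an undersized pool.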
Jan Beulich
2010-Jan-14 09:36 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 14.01.10 10:09 >>>
>Jan Beulich wrote:
>> it is clear that the tools still will need adjustment: The full
>> ballooning only happens from _initDomain(), and hence the
>> pre-reservation (from _constructDomain) of 4MB would still be too
>> small for large vCPU counts.
>
>The pre-reservation of memory size should be no relationship with
>vcpu number. As vcpu hasn't been initialized at that point. On vcpu
>initialization, the patch has ballooned the shadow size to 128 pages
>per vcpu. Did you find some piece of code that use the pre-reserved
>memory to allocate vcpu-related memory?

Your patch adds a call to sh_set_allocation() in a call tree starting at XEN_DOMCTL_max_vcpus. That in turn originates from the call to xc.domain_max_vcpus() in the tools, which happens between the pre-reservation (_constructDomain()) and the full ballooning (_initDomain()). Hence at this point only a maximum of 4MB (as stated before, with an unknown fraction of it being suitable) can be assumed to be available in Xen, but 128 vCPU-s require 64MB.

In order to make the pre-reservation not more complicated (and error prone), I was asking whether (and if so, how and why) this memory really is needed this early.

Jan
Tim Deegan
2010-Jan-14 12:46 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
At 09:00 +0000 on 14 Jan (1263459616), Jan Beulich wrote:
> >>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 14.01.10 08:19 >>>
> > Currently guest initialization process in xend (XendDomainInfo.py) is:
> >
> > _constructDomain() --> domain_create() --> domain_max_vcpus() ... -->
> > _initDomain() --> shadow_mem_control() ...
>
> While the patch certainly matches what I had in mind, with this sequence
> it is clear that the tools still will need adjustment: The full ballooning only
> happens from _initDomain(), and hence the pre-reservation (from
> _constructDomain) of 4MB would still be too small for large vCPU counts.
>
> I wonder though what all this memory is needed for before the domain
> (not to speak of secondary CPUs) actually gets started. If that could be
> got under control, tools side adjustment would not be necessary. Tim?

Hmmm. Some shadow memory has to be allocated before the VCPUs are initialized so that they can be given monitor pagetables etc. Some shadow memory has to be allocated before the guest's main memory is assigned because the p2m is built out of shadow memory.

It looks like there are (at least) two bugs here:
 - shadow_min_acceptable_pages() returns 0 for a domain with no vcpus, leading to a domain having shadow enabled but no shadow memory at all.
 - shadow_min_acceptable_pages() increases as more VCPUs are assigned but the shadow allocation is never increased to match.

Fixing the first one should be enough, so long as xend assigns vcpus and memory before assigning shadow memory properly (which I believe it does). Patch attached.

The separate issue of how much memory should be ballooned out before starting a domain really needs a full overhaul of all allocations in Xen so that we can assign blame for every page. :) Shadow memory is the best-behaved part of all this; most HVM overhead is just allocated anonymously.

Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]
Jan Beulich
2010-Jan-21 16:24 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
What's the status of this? As I had pointed out before, this change alone (without a tools side adjustment) will not work. And the question still wasn't answered clearly whether the full amount of memory will really be needed at this point (i.e. before the second stage of initialization, which happens after the full size ballooning has taken place).

Of course, the patch as it was presented here would at least restore old behavior for guests with not too many vCPU-s (and get hypervisor and tools back in sync again), so I'd rather see this patch applied than nothing further happening at all.

Thanks, Jan

>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 14.01.10 08:19 >>>
Keir and Jan,

Ignore the previous mail, the patch in the text is incomplete...

I am working on the issue "pre-reservation of memory for domain creation". Now I have the following findings.

Currently the guest initialization process in xend (XendDomainInfo.py) is:

_constructDomain() --> domain_create() --> domain_max_vcpus() ... --> _initDomain() --> shadow_mem_control() ...

In domain_create, we previously reserved 1M of memory for domain creation (as described in the xend comment), and this memory SHOULD NOT be related to the vcpu number. Later, shadow_mem_control() will modify the shadow size to 256 pages per vcpu (plus some other values related to guest memory size...). Therefore C/S 20389, which changes 1M to 4M to fit a larger vcpu number, is wrong. I'm sorry for that.

The following is the reason why 1M currently doesn't work for a big number of vcpus; as we mentioned, it caused a Xen crash.

Each time sh_set_allocation() is called, it checks whether shadow_min_acceptable_pages() has been allocated; if not, it will allocate them. That is to say, it is 128 pages per vcpu. But before we define d->max_vcpu, the guest vcpus haven't been initialized, so shadow_min_acceptable_pages() always returns 0. Therefore we only allocated 1M of shadow memory for domain_create, and didn't satisfy 128 pages per vcpu for alloc_vcpu().

As we know, vcpu allocation is done in the XEN_DOMCTL_max_vcpus hypercall. However, at this point we haven't called shadow_mem_control() and are still using the pre-allocated 1M of shadow memory to allocate so many vcpus. So it should be a BUG. Therefore when the vcpu number increases, 1M is not enough and causes a Xen crash. C/S 20389 exposes this issue.

So I think the right process should be: after d->max_vcpu is set and before alloc_vcpu(), we should call sh_set_allocation() to satisfy 128 pages per vcpu. The following patch does this work. Does it work for you? Thanks!

Best Regards,
-- Dongxiao

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>

diff -r 13d4e78ede97 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c Wed Jan 13 08:33:34 2010 +0000
+++ b/xen/arch/x86/mm/shadow/common.c Thu Jan 14 14:02:23 2010 +0800
@@ -41,6 +41,9 @@
 
 DEFINE_PER_CPU(uint32_t,trace_shadow_path_flags);
 
+static unsigned int sh_set_allocation(struct domain *d,
+                                      unsigned int pages,
+                                      int *preempted);
 /* Set up the shadow-specific parts of a domain struct at start of day.
  * Called for every domain from arch_domain_create() */
 void shadow_domain_init(struct domain *d, unsigned int domcr_flags)
@@ -82,6 +85,12 @@ void shadow_vcpu_init(struct vcpu *v)
     }
 #endif
 
+    if ( !is_idle_domain(v->domain) )
+    {
+        shadow_lock(v->domain);
+        sh_set_allocation(v->domain, 128, NULL);
+        shadow_unlock(v->domain);
+    }
     v->arch.paging.mode = &SHADOW_INTERNAL_NAME(sh_paging_mode, 3);
 }
 
@@ -3099,7 +3108,7 @@ int shadow_enable(struct domain *d, u32
 {
     unsigned int r;
     shadow_lock(d);
-    r = sh_set_allocation(d, 1024, NULL); /* Use at least 4MB */
+    r = sh_set_allocation(d, 256, NULL); /* Use at least 1MB */
     if ( r != 0 )
     {
         sh_set_allocation(d, 0, NULL);

Jan Beulich wrote:
>>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 13.01.10 03:34 >>>
>> If we didn't add this change, as Keir said, Xen will crash during
>> destruction of a partially-created domain.
>
> If this indeed is reproducible, I think it should be fixed.
>
>> However I didn't notice the toolstack and
>> shadow_min_acceptable_pages() side at that time...
>> For now, should we adjust the shadow pre-alloc size to match
>> shadow_min_acceptable_pages() and modify the toolstack accordingly?
>
> I would say so, just with the problem that I can't reliably say what
> "accordingly" here would be (and hence I can't craft a patch I can
> guarantee will work at least in most of the cases).
>
> And as said before, I'm also not convinced that using the maximum
> possible number of vCPU-s for this initial calculation is really the
> right thing to do.
>
> Jan

Best Regards,
-- Dongxiao
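The page counts in Dongxiao's analysis can be put side by side with a few lines of arithmetic. This is only a sketch: it assumes 4 KiB pages (which matches the patch comments equating 256 pages with 1MB and 1024 pages with 4MB), and the helper names are hypothetical, not Xen or xend functions. Whether the full 128 pages per vcpu are really consumed this early is the open question in the thread; the sketch only shows what that lower bound would amount to.

```python
PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, so 256 pages == 1 MiB


def pages_to_mib(pages):
    """Convert a shadow page count to MiB (hypothetical helper)."""
    return pages * PAGE_SIZE_KIB / 1024.0


def min_pages_for_vcpus(vcpus, pages_per_vcpu=128):
    """Per-vcpu lower bound as described in the thread (128 pages per vcpu)."""
    return vcpus * pages_per_vcpu


for vcpus in (1, 2, 8, 32, 128):
    need = min_pages_for_vcpus(vcpus)
    print("%3d vcpus: %5d pages = %5.1f MiB" % (vcpus, need, pages_to_mib(need)))
```

Under these assumptions a 1MB pool matches the lower bound for 2 vcpus and a 4MB pool for 8, while 128 vcpus would correspond to 64 MiB.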
Xu, Dongxiao
2010-Jan-22 04:28 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
Hi, Jan,

This patch may not work because when allocating 128 pages for each vcpu, the tool side hasn't ballooned out that much memory for it. So the allocation may fail.

What about this solution: we change the tool side to balloon out more memory for the pre-reservation (for example 8M), and pre-allocate 4M for domain creation (which works for 128 vcpus)?

Anyway, as the comment in _constructDomain says, it's still somewhat hacky.

Thanks!
Dongxiao

Jan Beulich wrote:
> What's the status of this? As I had pointed out before, this change
> alone (without a tools side adjustment) will not work. And the
> question still wasn't answered clearly whether the full amount of
> memory will really be needed at this point (i.e. before the second
> stage of initialization, which happens after the full size ballooning
> has taken place).
>
> Of course, the patch as it was presented here would at least restore
> old behavior for guests with not too many vCPU-s (and get hypervisor
> and tools back in sync again), so I'd rather see this patch applied
> than nothing further happening at all.
>
> Thanks, Jan
Jan Beulich
2010-Jan-22 08:13 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 22.01.10 05:28 >>>
>Hi, Jan,
> This patch may not work because when allocating 128 pages for
>each vcpu, tool side hasn't ballooned so much memory for it. So the
>allocation may fail.
> What about this solution, we change the tool side to balloon
>more memory for pre-reserve memory (For example 8M), and
>pre-alloc 4M for domain creation (which works for 128 vcpus)?

Based on what information do you judge that 4M will work for 128 vCPU-s?

And how did you conclude that ballooning 8M will be sufficient to be able to allocate 4M (before the patch in question 4M got ballooned in order to be able to allocate 1M)?

>Anyway like the comment in _constructDomain, it's still somewhat
>hacky.

Indeed. Which is why I'm concerned about extending this hack.

Jan
Xu, Dongxiao
2010-Jan-22 09:20 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
Jan,

Removing this 1M pre-allocation of shadow memory may be difficult. If we want to ensure safety, we should balloon out enough memory before any shadow operation. This requires moving the full memory ballooning to the beginning of _constructDomain(), which in turn needs the domain's vcpu number information. This change may have the flaw that we balloon out a lot of memory at an early stage and the guest may then fail to boot up correctly anyway (for example, because of a configuration error).

Another problem is that currently the operations of setting the vcpu number and alloc_vcpu() are done within one hypercall. Memory ballooning and allocation would have to happen between them in order to allocate 128 pages for each vcpu (actually I think this "128" is also from experience).

Thanks!
Dongxiao

Jan Beulich wrote:
>>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 22.01.10 05:28 >>>
>> Hi, Jan,
>> This patch may not work because when allocating 128 pages for
>> each vcpu, tool side hasn't ballooned so much memory for it. So the
>> allocation may fail. What about this solution, we change the tool
>> side to balloon more memory for pre-reserve memory (For example
>> 8M), and pre-alloc 4M for domain creation (which works for 128 vcpus)?
>
> Based on what information do you judge that 4M will work for 128
> vCPU-s?
>
> And how did you conclude that ballooning 8M will be sufficient to be
> able to allocate 4M (before the patch in question 4M got ballooned in
> order to be able to allocate 1M)?
>
>> Anyway like the comment in _constructDomain, it's still somewhat
>> hacky.
>
> Indeed. Which is why I'm concerned about extending this hack.
>
> Jan
Jan Beulich
2010-Feb-08 16:25 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
>>> Tim Deegan <Tim.Deegan@citrix.com> 14.01.10 13:46 >>>
>At 09:00 +0000 on 14 Jan (1263459616), Jan Beulich wrote:
>> >>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 14.01.10 08:19 >>>
>> > Currently guest initialization process in xend (XendDomainInfo.py) is:
>> >
>> > _constructDomain() --> domain_create() --> domain_max_vcpus() ... -->
>> > _initDomain() --> shadow_mem_control() ...
>>
>> While the patch certainly matches what I had in mind, with this sequence
>> it is clear that the tools still will need adjustment: The full ballooning only
>> happens from _initDomain(), and hence the pre-reservation (from
>> _constructDomain) of 4Mb would still be too small for large vCPU counts.
>>
>> I wonder though what all this memory is needed for before the domain
>> (not to speak of secondary CPUs) actually gets started. If that could be
>> got under control, tools side adjustment would not be necessary. Tim?
>
>Hmmm. Some shadow memory has to be allocated before the VCPUs are
>initialized so that they can be given monitor pagetables etc. Some
>shadow memory has to be allocated before the guest's main memory is
>assigned because the p2m is built out of shadow memory.

So is there a way to quantify that? In particular, is that *initial* amount in any way dependent on the number of vCPU-s?

>Fixing the first one should be enough, so long as xend assigns vcpus and
>memory before assigning shadow memory properly (which I believe it
>does). Patch attached.

The full memory assignment happens after vCPU-s got assigned, but the initial assignment happens before. Dongxiao's patch tried to account for that, but neither was that patch accepted so far, nor am I convinced this is really correct or even necessary.

I'm re-raising this question because we're not seeming to make any progress towards a satisfactory resolution of the regression c/s 20389 introduced.

Jan
Xu, Dongxiao
2010-Feb-09 08:02 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
Jan Beulich wrote:
> The full memory assignment happens after vCPU-s got assigned, but
> the initial assignment happens before. Dongxiao's patch tried to
> account for that, but neither was that patch accepted so far, nor am I
> convinced this is really correct or even necessary.

The patch I attached last time could not solve this issue. The reason is the same: at the point when shadow allocates memory for each vcpu, xend hasn't ballooned out enough memory.

I discussed this issue within our team, however we couldn't arrive at a good solution for now since the Chinese New Year vacation will soon start. Anyway I made a work-around patch for it, though you may not like it.

Thanks!
Dongxiao

Xend: Enlarge the memory balloon size for domain creation since the shadow pre-allocation size has changed from 1M to 4M.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>

diff -r 5b895c3f4386 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py Mon Feb 08 10:18:51 2010 +0000
+++ b/tools/python/xen/xend/XendDomainInfo.py Tue Feb 09 15:07:47 2010 +0800
@@ -2519,9 +2519,8 @@ class XendDomainInfo:
         # There is an implicit memory overhead for any domain creation. This
         # overhead is greater for some types of domain than others. For
         # example, an x86 HVM domain will have a default shadow-pagetable
-        # allocation of 1MB. We free up 4MB here to be on the safe side.
-        # 2MB memory allocation was not enough in some cases, so it's 4MB now
-        balloon.free(4*1024, self) # 4MB should be plenty
+        # allocation of 4MB. We free up 16MB here to be on the safe side.
+        balloon.free(16*1024, self) # 16MB should be plenty
 
         ssidref = 0
         if security.on() == xsconstants.XS_POLICY_USE:

> I'm re-raising this question because we're not seeming to make any
> progress towards a satisfactory resolution of the regression c/s 20389
> introduced.
>
> Jan
Jan Beulich
2010-Feb-09 09:18 UTC
RE: [Xen-devel] pre-reservation of memory for domain creation
>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 09.02.10 09:02 >>>
>The patch I attached last time could not solve this issue. The reason is
>the same that, at the point when shadow allocates memory for each vcpu,
>xend hasn't ballooned out enough memory.

This is why I pinged Tim again.

>I discussed this issue within our team, however we couldn't achieve a
>good solution currently since Chinese New Year vacation will soon start.
>Anyway I made a work-around patch for it, though you may not like it.

Yes, this is what I don't want to do unless absolutely necessary. At the risk of unduly repeating myself - ballooning out 4Mb to be able to allocate 1Mb in 4-page chunks was empirically sufficient (though there never was a guarantee). Increasing the to-be-allocated and to-be-ballooned-out amounts by the same factor does not yield the same confidence that allocation will actually succeed. Hence my goal to reduce the initial allocation (to be satisfied from the pre-ballooning) as much as possible. Unfortunately there was no reply from Tim on that matter so far.

Jan
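Jan's remark about allocating "in 4-page chunks" can be illustrated with a toy model: the amount of free memory alone does not say how many aligned, contiguous 4-page blocks can be carved out of it. The sketch below is purely illustrative; the free-page layouts and the helper name are made up, and it does not reflect how Xen's allocator is actually implemented.

```python
def count_aligned_chunks(free_pages, chunk=4):
    """Count chunk-aligned runs of `chunk` pages that are entirely free.

    free_pages: set of free page frame numbers (hypothetical layout).
    """
    starts = set(pfn - pfn % chunk for pfn in free_pages)
    return sum(
        1 for s in starts
        if all((s + i) in free_pages for i in range(chunk))
    )


# 1024 free pages (4 MiB at 4 KiB per page) in one contiguous run.
contiguous = set(range(1024))

# Also 1024 free pages, but every fourth frame is in use elsewhere.
fragmented = set(pfn for pfn in range(1365) if pfn % 4 != 3)

print(len(contiguous), count_aligned_chunks(contiguous))
print(len(fragmented), count_aligned_chunks(fragmented))
```

Both layouts hold the same number of free pages, yet the first yields 256 usable 4-page chunks and the second none, which is one way to read why scaling both amounts by the same factor gives no guarantee.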
Tim Deegan
2010-Feb-09 10:39 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
At 16:25 +0000 on 08 Feb (1265646333), Jan Beulich wrote:
> >Hmmm. Some shadow memory has to be allocated before the VCPUs are
> >initialized so that they can be given monitor pagetables etc. Some
> >shadow memory has to be allocated before the guest's main memory is
> >assigned because the p2m is built out of shadow memory.
>
> So is there a way to quantify that? In particular, is that *initial*
> amount in any way dependent on the number of vCPU-s?

Sort of, and not very. We currently allocate about a page per megabyte for p2m; we use a small amount per vcpu (maybe as many as 7 pages) before the vcpu can be scheduled.

> I'm re-raising this question because we're not seeming to make any
> progress towards a satisfactory resolution of the regression c/s 20389
> introduced.

I thought the regression was the Xen crash and that should be fixed by the patch I sent.

Tim.

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
Jan Beulich
2010-Feb-09 11:21 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
>>> Tim Deegan <Tim.Deegan@citrix.com> 09.02.10 11:39 >>>
>At 16:25 +0000 on 08 Feb (1265646333), Jan Beulich wrote:
>> >Hmmm. Some shadow memory has to be allocated before the VCPUs are
>> >initialized so that they can be given monitor pagetables etc. Some
>> >shadow memory has to be allocated before the guest's main memory is
>> >assigned because the p2m is built out of shadow memory.
>>
>> So is there a way to quantify that? In particular, is that *initial*
>> amount in any way dependent on the number of vCPU-s?
>
>Sort of, and not very. We currently allocate about a page per megabyte
>for p2m; we use a small amount per vcpu (maybe as many as 7 pages)
>before the vcpu can be scheduled.

Sounds all pretty vague to base calculations upon. Also, a per-megabyte allocation seems questionable at this point, since with that even the code prior to 20389 would have failed (with 1M pre-allocated you wouldn't have been able to create a 1G guest). Besides that I don't think Xen even knows the memory size of the guest prior to the full ballooning having taken place in Dom0.

>> I'm re-raising this question because we're not seeming to make any
>> progress towards a satisfactory resolution of the regression c/s 20389
>> introduced.
>
>I thought the regression was the Xen crash and that should be fixed by
>the patch I sent.

The Xen crash was what Keir talked about; I was referring to the increased early allocation which continues to be out of sync with the ballooning happening in the tools (and hence in environments where ballooning is being used likely has no chance of succeeding).

Jan
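Jan's objection to the per-megabyte figure can be checked with a little arithmetic. The sketch below plugs in the approximate numbers quoted from Tim (about one p2m page per megabyte of guest memory, up to about 7 pages per vcpu) and assumes 4 KiB pages; the function names are hypothetical, and the result only shows what those figures would imply if they applied before the full shadow allocation is set up, which is exactly what Jan doubts.

```python
PAGES_PER_MIB = 256  # assumption: 4 KiB pages


def early_shadow_pages(guest_mib, vcpus, p2m_pages_per_mib=1, pages_per_vcpu=7):
    """Rough early shadow need, using the approximate figures quoted from Tim."""
    return guest_mib * p2m_pages_per_mib + vcpus * pages_per_vcpu


def fits(pool_mib, guest_mib, vcpus):
    """Would the estimated need fit into an initial pool of pool_mib MiB?"""
    return early_shadow_pages(guest_mib, vcpus) <= pool_mib * PAGES_PER_MIB


print(early_shadow_pages(1024, 1))  # 1 GiB guest with a single vcpu
print(fits(1, 1024, 1))             # against the old 1 MiB initial pool
print(fits(1, 128, 1))              # a 128 MiB guest against the same pool
```

With these inputs a 1G guest would need roughly 1031 pages, about four times the 256 pages of a 1MB pool, so if the p2m term really applied at this stage the pre-20389 code could not have created such a guest.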
Tim Deegan
2010-Feb-09 11:34 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
At 11:21 +0000 on 09 Feb (1265714487), Jan Beulich wrote:
> The Xen crash was what Keir talked about; I was referring to the
> increased early allocation which continues to be out of sync with the
> ballooning happening in the tools (and hence in environments where
> ballooning is being used likely has no chance of succeeding).

OK; the early allocation, for a domain with no vcpus and no RAM, is a constant 4 MiB, and just needs to be added to whatever overheads are already being taken into account (domain and vcpu structs, VMCBs, &c).

If there's a problem with allocating the RAM before the proper shadow allocation is set up, then xend should set the shadow allocation before it builds the guest. Does it not already do that?

Tim.
Jan Beulich
2010-Feb-09 12:41 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
>>> Tim Deegan <Tim.Deegan@citrix.com> 09.02.10 12:34 >>>
>At 11:21 +0000 on 09 Feb (1265714487), Jan Beulich wrote:
>> The Xen crash was what Keir talked about; I was referring to the
>> increased early allocation which continues to be out of sync with the
>> ballooning happening in the tools (and hence in environments where
>> ballooning is being used likely has no chance of succeeding).
>
>OK; the early allocation, for a domain with no vcpus and no RAM, is a
>constant 4 MiB, and just needs to be added to whatever overheads are
>already being taken into account (domain and vcpu structs, VMCBs, &c).

The issue is that this used to be 1Mb. Hence the question whether in fact it needs to be 4Mb now (which it got increased to when max HVM vCPU-s got grown to 128). And if indeed the increase was necessary, then to make it depend on the number of vCPU-s the VM will actually have (hence the question on how much memory is needed at this stage per vCPU).

The point here is that on one hand there's no clear picture how much memory is really needed before the full shadow allocation gets set up, and on the other hand Keir is recalling issues resulting from not growing the amount to 4Mb.

Jan
Tim Deegan
2010-Feb-09 13:47 UTC
Re: [Xen-devel] pre-reservation of memory for domain creation
At 12:41 +0000 on 09 Feb (1265719265), Jan Beulich wrote:
> >OK; the early allocation, for a domain with no vcpus and no RAM, is a
> >constant 4 MiB, and just needs to be added to whatever overheads are
> >already being taken into account (domain and vcpu structs, VMCBs, &c).
>
> The issue is that this used to be 1Mb. Hence the question whether
> in fact it needs to be 4Mb now (which it got increased to when max
> HVM vCPU-s got grown to 128).
>
> The point here is that on one hand there's no clear picture how
> much memory is really needed before the full shadow allocation
> gets set up, and on the other hand Keir is recalling issues
> resulting from not growing the amount to 4Mb.

Righto. The change from 1MB to 4MB is not the right fix for those issues -- instead, Xend ought to set the shadow memory allocation before it allocates vcpus.

Tim
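The ordering problem discussed throughout the thread can be sketched as a toy model. Everything here is hypothetical: `ToyDomain` is not Xen code, the 128-pages-per-vcpu lower bound is the figure cited earlier in the thread, and the sizing rule used when the toolstack sets the allocation first is an assumption for illustration only. The model also ignores whether enough memory has been ballooned out to back the allocation, which is the other half of the problem.

```python
class ToyDomain(object):
    """Toy model of the shadow pool and its lower bound; not Xen code."""

    PAGES_PER_VCPU = 128  # per-vcpu minimum cited in the thread

    def __init__(self):
        self.max_vcpus = 0
        self.shadow_pages = 0

    def min_acceptable_pages(self):
        # Evaluates to 0 until the vcpu count is known.
        return self.max_vcpus * self.PAGES_PER_VCPU

    def set_allocation(self, pages):
        # The pool is never set below the current lower bound.
        self.shadow_pages = max(pages, self.min_acceptable_pages())

    def pool_covers_vcpus(self):
        return self.shadow_pages >= self.min_acceptable_pages()


def current_order(vcpus):
    """Initial 1MB pool first, then vcpus are allocated against it."""
    d = ToyDomain()
    d.set_allocation(256)   # lower bound is still 0 at this point
    d.max_vcpus = vcpus
    return d.pool_covers_vcpus()


def shadow_allocation_first(vcpus):
    """Toolstack sizes the pool from the configured vcpu count beforehand."""
    d = ToyDomain()
    d.set_allocation(max(256, vcpus * ToyDomain.PAGES_PER_VCPU))  # assumed sizing rule
    d.max_vcpus = vcpus
    return d.pool_covers_vcpus()


print(current_order(2), current_order(128))
print(shadow_allocation_first(128))
```

In this model the existing order only satisfies the lower bound for very small vcpu counts, while setting the allocation up front satisfies it for any configured count.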