tmem could cause dom0 allocation to fail with the error "Not enough RAM for DOM0 reservation", particularly when used without the dom0_mem option. The sequence is:

- init_tmem allocates a set of pages and sets up dstmem and workmem to allocate pages in the MP case (with cpu notifiers)
- construct_dom0 estimates nr_pages by calling avail_domheap_pages
- On other CPUs, the tmem cpu_notifier gets called and allocates pages from the domheap, making construct_dom0's estimate stale
- construct_dom0 fails

tmem=off or dom0_mem=xxx both solve the problem for now.

-dulloor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2010-Jun-19 07:26 UTC
Re: [Xen-devel] tmem and construct_dom0 memory allocation race
On 19/06/2010 00:10, "Dulloor" <dulloor@gmail.com> wrote:

> Following is the sequence :
> - init_tmem allocates a set of pages and sets up dstmem and workmem to
> alloc pages in MP case (with cpu notifiers)
> - construct_dom0 estimates nr_pages by calling avail_domheap_pages
> - On other CPUs, tmem cpu_notifier gets called and allocates pages
> from domheap, making the construct_dom0's estimate stale.
> - construct_dom0 fails
>
> tmem=off or dom0_mem=xxx both solve the problem for now.

Xen boot is pretty serialised. In particular SMP boot, and all cpu
notification calls, should be done before dom0 is constructed. So, have you
actually seen this race?

 -- Keir
Dan Magenheimer
2010-Jun-21 15:35 UTC
RE: [Xen-devel] tmem and construct_dom0 memory allocation race
Hi Dulloor --

Intel had previously reported a failure for 2.6.18-xen dom0+tmem with
dom0_mem unspecified. I'm not sure if this is the same bug or not.

The latest versions of the Linux-side tmem patch disable tmem by default
(in Linux, not Xen!) and require a kernel boot option to turn it on. Since
dom0 is special and I've done very little testing with dom0 using tmem (as
tmem is primarily used with guests), I think the correct (at least
short-term) fix for this will be to not enable tmem for dom0 when dom0_mem
is unspecified. I haven't gotten around to updating 2.6.18-xen for a while,
assuming it is increasingly rarely used (except in products where dom0_mem
is always specified).

I'll try to submit a major update to the Linux-side tmem patch for the
2.6.18-xen tree soon so at least it is consistent with other Linux-side
Xen patches.

Dan

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Saturday, June 19, 2010 1:27 AM
> To: Dulloor; xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] tmem and construct_dom0 memory allocation race
>
> On 19/06/2010 00:10, "Dulloor" <dulloor@gmail.com> wrote:
>
> > Following is the sequence :
> > - init_tmem allocates a set of pages and sets up dstmem and workmem
> > to alloc pages in MP case (with cpu notifiers)
> > - construct_dom0 estimates nr_pages by calling avail_domheap_pages
> > - On other CPUs, tmem cpu_notifier gets called and allocates pages
> > from domheap, making the construct_dom0's estimate stale.
> > - construct_dom0 fails
> >
> > tmem=off or dom0_mem=xxx both solve the problem for now.
>
> Xen boot is pretty serialised. In particular SMP boot, and all cpu
> notification calls, should be done before dom0 is constructed. So, have
> you actually seen this race?
>
> -- Keir
Dulloor
2010-Jun-22 07:17 UTC
Re: [Xen-devel] tmem and construct_dom0 memory allocation race
Hi Dan, I am using a pvops kernel.

Hi Keir, you are right, there is no race. I spent some time debugging
this. The problem is that a zero-order allocation (from alloc_chunk, for
the last dom0 page) fails with tmem on (in alloc_heap_pages), even though
there are pages available in the heap. I don't think tmem really intends
to get triggered so early. What do you think?

Also, on an unrelated note, the number of pages estimated for dom0
(nr_pages) could be off by (opt_dom0_vcpus/16) pages, due to the
perdomain_pt_page allocation (in vcpu_initialise).

thanks
dulloor

On Mon, Jun 21, 2010 at 8:35 AM, Dan Magenheimer
<dan.magenheimer@oracle.com> wrote:
> Hi Dulloor --
>
> Intel had previously reported a failure for 2.6.18-xen
> dom0+tmem with dom0_mem unspecified. I'm not sure if
> this is the same bug or not.
>
> The latest versions of the Linux-side tmem patch disable
> tmem by default (in Linux, not Xen!) and require a kernel
> boot option to turn it on. Since dom0 is special and
> I've done very little testing with dom0 using tmem (as
> tmem is primarily used with guests), I think the correct
> (at least short-term) fix for this will be to not enable
> tmem for dom0 when dom0_mem is unspecified. I haven't
> gotten around to updating 2.6.18-xen for awhile, assuming
> it is increasingly rarely used (except in products where
> dom0_mem is always specified).
>
> I'll try to submit a major update to the Linux-side
> tmem patch for the 2.6.18-xen tree soon so at least
> it is consistent with other Linux-side Xen patches.
>
> Dan
>
>> -----Original Message-----
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Sent: Saturday, June 19, 2010 1:27 AM
>> To: Dulloor; xen-devel@lists.xensource.com
>> Subject: Re: [Xen-devel] tmem and construct_dom0 memory allocation race
>>
>> On 19/06/2010 00:10, "Dulloor" <dulloor@gmail.com> wrote:
>>
>> > Following is the sequence :
>> > - init_tmem allocates a set of pages and sets up dstmem and workmem
>> > to alloc pages in MP case (with cpu notifiers)
>> > - construct_dom0 estimates nr_pages by calling avail_domheap_pages
>> > - On other CPUs, tmem cpu_notifier gets called and allocates pages
>> > from domheap, making the construct_dom0's estimate stale.
>> > - construct_dom0 fails
>> >
>> > tmem=off or dom0_mem=xxx both solve the problem for now.
>>
>> Xen boot is pretty serialised. In particular SMP boot, and all cpu
>> notification calls, should be done before dom0 is constructed. So, have
>> you actually seen this race?
>>
>> -- Keir
Jan Beulich
2010-Jun-22 07:36 UTC
Re: [Xen-devel] tmem and construct_dom0 memory allocation race
>>> On 22.06.10 at 09:17, Dulloor <dulloor@gmail.com> wrote:
> Hi Keir, You are right .. there is no race. I spent some time
> debugging this. The problem is that a zero-order allocation (from
> alloc_chunk, for the last dom0 page) fails with tmem on (in
> alloc_heap_pages), even though there are pages available in the heap.
> I don't think tmem really intends to get triggered so early. What do
> you think ?

How can that allocation fail if the heap isn't empty? How can tmem get
into the picture when Dom0 didn't even start yet?

> Also, on an unrelated note, the number of pages estimated for dom0
> (nr_pages) could be off by (opt_dom0_vcpus/16) pages, due to the
> perdomain_pt_page allocation (in vcpu_initialise).

Certainly this could also be included in the calculation, but you can't
make Dom0 consume all of the memory Xen has available anyway, so the
worst that can happen afaict is that Dom0's swiotlb could end up being a
few pages smaller than intended.

Jan
Keir Fraser
2010-Jun-22 07:38 UTC
Re: [Xen-devel] tmem and construct_dom0 memory allocation race
On 22/06/2010 08:17, "Dulloor" <dulloor@gmail.com> wrote:

> Hi Keir, You are right .. there is no race. I spent some time
> debugging this. The problem is that a zero-order allocation (from
> alloc_chunk, for the last dom0 page) fails with tmem on (in
> alloc_heap_pages), even though there are pages available in the heap.
> I don't think tmem really intends to get triggered so early. What do
> you think ?

That's one for Dan to comment on.

> Also, on an unrelated note, the number of pages estimated for dom0
> (nr_pages) could be off by (opt_dom0_vcpus/16) pages, due to the
> perdomain_pt_page allocation (in vcpu_initialise).

You could send a patch for this.

 -- Keir
Dan Magenheimer
2010-Jun-22 17:23 UTC
RE: [Xen-devel] tmem and construct_dom0 memory allocation race
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Subject: Re: [Xen-devel] tmem and construct_dom0 memory allocation race
>
> On 22/06/2010 08:17, "Dulloor" <dulloor@gmail.com> wrote:
>
> > Hi Keir, You are right .. there is no race. I spent some time
> > debugging this. The problem is that a zero-order allocation (from
> > alloc_chunk, for the last dom0 page) fails with tmem on (in
> > alloc_heap_pages), even though there are pages available in the heap.
> > I don't think tmem really intends to get triggered so early. What do
> > you think ?
>
> That's one for Dan to comment on.

Hmmm... the special casing in alloc_heap_pages to avoid fragmentation
need not be invoked if tmem doesn't hold any pages (as is the case at
dom0 boot)...

Does this patch fix the problem? If so...

Signed-off-by: Dan Magenheimer

diff -r a24dbfcbdf69 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c   Tue Jun 22 07:19:38 2010 +0100
+++ b/xen/common/page_alloc.c   Tue Jun 22 11:17:44 2010 -0600
@@ -316,11 +316,14 @@ static struct page_info *alloc_heap_page
     spin_lock(&heap_lock);
 
     /*
-     * TMEM: When available memory is scarce, allow only mid-size allocations
-     * to avoid worst of fragmentation issues. Others try TMEM pools then fail.
+     * TMEM: When available memory is scarce due to tmem absorbing it, allow
+     * only mid-size allocations to avoid worst of fragmentation issues.
+     * Others try tmem pools then fail. This is a workaround until all
+     * post-dom0-creation-multi-page allocations can be eliminated.
      */
     if ( opt_tmem && ((order == 0) || (order >= 9)) &&
-         (total_avail_pages <= midsize_alloc_zone_pages) )
+         (total_avail_pages <= midsize_alloc_zone_pages) &&
+         tmem_freeable_pages() )
         goto try_tmem;
 
     /*
diff -r a24dbfcbdf69 xen/common/tmem.c
--- a/xen/common/tmem.c Tue Jun 22 07:19:38 2010 +0100
+++ b/xen/common/tmem.c Tue Jun 22 11:17:44 2010 -0600
@@ -2850,6 +2850,11 @@ EXPORT void *tmem_relinquish_pages(unsig
     return pfp;
 }
 
+EXPORT unsigned long tmem_freeable_pages(void)
+{
+    return tmh_freeable_pages();
+}
+
 /* called at hypervisor startup */
 static int __init init_tmem(void)
 {
diff -r a24dbfcbdf69 xen/include/xen/tmem.h
--- a/xen/include/xen/tmem.h    Tue Jun 22 07:19:38 2010 +0100
+++ b/xen/include/xen/tmem.h    Tue Jun 22 11:17:44 2010 -0600
@@ -11,6 +11,7 @@
 
 extern void tmem_destroy(void *);
 extern void *tmem_relinquish_pages(unsigned int, unsigned int);
+extern unsigned long tmem_freeable_pages(void);
 extern int opt_tmem;
 
 #endif /* __XEN_TMEM_H__ */
Dulloor
2010-Jun-22 18:56 UTC
Re: [Xen-devel] tmem and construct_dom0 memory allocation race
On Tue, Jun 22, 2010 at 10:23 AM, Dan Magenheimer
<dan.magenheimer@oracle.com> wrote:
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Subject: Re: [Xen-devel] tmem and construct_dom0 memory allocation race
>>
>> On 22/06/2010 08:17, "Dulloor" <dulloor@gmail.com> wrote:
>>
>> > Hi Keir, You are right .. there is no race. I spent some time
>> > debugging this. The problem is that a zero-order allocation (from
>> > alloc_chunk, for the last dom0 page) fails with tmem on (in
>> > alloc_heap_pages), even though there are pages available in the heap.
>> > I don't think tmem really intends to get triggered so early. What do
>> > you think ?
>>
>> That's one for Dan to comment on.
>
> Hmmm... the special casing in alloc_heap_pages to avoid fragmentation
> need not be invoked if tmem doesn't hold any pages (as is the
> case at dom0 boot)...
>
> Does this patch fix the problem? If so...

I have already tried something like this and it works. Also, we could
check for tmem_freeable_pages right after the opt_tmem check and before
doing the other order and fragmentation checks.

>
> Signed-off-by: Dan Magenheimer
>
> diff -r a24dbfcbdf69 xen/common/page_alloc.c
> --- a/xen/common/page_alloc.c   Tue Jun 22 07:19:38 2010 +0100
> +++ b/xen/common/page_alloc.c   Tue Jun 22 11:17:44 2010 -0600
> @@ -316,11 +316,14 @@ static struct page_info *alloc_heap_page
>      spin_lock(&heap_lock);
>
>      /*
> -     * TMEM: When available memory is scarce, allow only mid-size allocations
> -     * to avoid worst of fragmentation issues. Others try TMEM pools then fail.
> +     * TMEM: When available memory is scarce due to tmem absorbing it, allow
> +     * only mid-size allocations to avoid worst of fragmentation issues.
> +     * Others try tmem pools then fail. This is a workaround until all
> +     * post-dom0-creation-multi-page allocations can be eliminated.
>       */
>      if ( opt_tmem && ((order == 0) || (order >= 9)) &&
> -         (total_avail_pages <= midsize_alloc_zone_pages) )
> +         (total_avail_pages <= midsize_alloc_zone_pages) &&
> +         tmem_freeable_pages() )
>          goto try_tmem;
>
>      /*
> diff -r a24dbfcbdf69 xen/common/tmem.c
> --- a/xen/common/tmem.c Tue Jun 22 07:19:38 2010 +0100
> +++ b/xen/common/tmem.c Tue Jun 22 11:17:44 2010 -0600
> @@ -2850,6 +2850,11 @@ EXPORT void *tmem_relinquish_pages(unsig
>      return pfp;
>  }
>
> +EXPORT unsigned long tmem_freeable_pages(void)
> +{
> +    return tmh_freeable_pages();
> +}
> +
>  /* called at hypervisor startup */
>  static int __init init_tmem(void)
>  {
> diff -r a24dbfcbdf69 xen/include/xen/tmem.h
> --- a/xen/include/xen/tmem.h    Tue Jun 22 07:19:38 2010 +0100
> +++ b/xen/include/xen/tmem.h    Tue Jun 22 11:17:44 2010 -0600
> @@ -11,6 +11,7 @@
>
>  extern void tmem_destroy(void *);
>  extern void *tmem_relinquish_pages(unsigned int, unsigned int);
> +extern unsigned long tmem_freeable_pages(void);
>  extern int opt_tmem;
>
>  #endif /* __XEN_TMEM_H__ */