Guests with vCPU count not divisible by 4 have unused bits in the last word of their inuse bitmap, and the garbage collection code therefore would get mislead believing that some entries were actually recoverable for use. Also use an earlier established local variable in mapcache_vcpu_init() instead of re-calculating the value (noticed while investigating the generally better option of setting those overhanging bits once during setup - this didn''t work out in a simple enough fashion because the mapping getting established there isn''t in the current address space, and hence the bitmap isn''t directly accessible there). Reported-by: Konrad Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/domain_page.c +++ b/xen/arch/x86/domain_page.c @@ -111,16 +111,17 @@ void *map_domain_page(unsigned long mfn) idx = find_next_zero_bit(dcache->inuse, dcache->entries, dcache->cursor); if ( unlikely(idx >= dcache->entries) ) { - unsigned long accum = 0; + unsigned long accum = 0, prev = 0; /* /First/, clean the garbage map and update the inuse list. */ for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ ) { + accum |= prev; dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0); - accum |= ~dcache->inuse[i]; + prev = ~dcache->inuse[i]; } - if ( accum ) + if ( accum | (prev & BITMAP_LAST_WORD_MASK(dcache->entries)) ) idx = find_first_zero_bit(dcache->inuse, dcache->entries); else { @@ -280,8 +281,7 @@ int mapcache_vcpu_init(struct vcpu *v) if ( ents > dcache->entries ) { /* Populate page tables. */ - int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, - d->max_vcpus * MAPCACHE_VCPU_ENTRIES, + int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents, NIL(l1_pgentry_t *), NULL); /* Populate bit maps. */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Keir Fraser
2013-Jun-12 17:27 UTC
Re: [PATCH] x86: fix map_domain_page() last resort fallback
On 12/06/2013 16:59, "Jan Beulich" <JBeulich@suse.com> wrote:> Guests with vCPU count not divisible by 4 have unused bits in the last > word of their inuse bitmap, and the garbage collection code therefore > would get mislead believing that some entries were actually recoverable > for use. > > Also use an earlier established local variable in mapcache_vcpu_init() > instead of re-calculating the value (noticed while investigating the > generally better option of setting those overhanging bits once during > setup - this didn''t work out in a simple enough fashion because the > mapping getting established there isn''t in the current address space, > and hence the bitmap isn''t directly accessible there). > > Reported-by: Konrad Wilk <konrad.wilk@oracle.com> > Signed-off-by: Jan Beulich <jbeulich@suse.com>Whilst I can''t argue against this as the obvious bugfix to the existing code, I personally object to clawing back hash-table entries at all. The size of the per-vcpu hashtable is small, and it should be perfectly possible to always allow enough extra entries in the mapcache to always be able to allocate an entry even when all vcpu''s maphash buckets are in use. Perhaps this is the right fix for 4.3 at this point, but in that case I am quite inclined to simplify this down after 4.3, sidestepping the whole issue. -- Keir> --- a/xen/arch/x86/domain_page.c > +++ b/xen/arch/x86/domain_page.c > @@ -111,16 +111,17 @@ void *map_domain_page(unsigned long mfn) > idx = find_next_zero_bit(dcache->inuse, dcache->entries, dcache->cursor); > if ( unlikely(idx >= dcache->entries) ) > { > - unsigned long accum = 0; > + unsigned long accum = 0, prev = 0; > > /* /First/, clean the garbage map and update the inuse list. */ > for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ ) > { > + accum |= prev; > dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0); > - accum |= ~dcache->inuse[i]; > + prev = ~dcache->inuse[i]; > } > > - if ( accum ) > + if ( accum | (prev & BITMAP_LAST_WORD_MASK(dcache->entries)) ) > idx = find_first_zero_bit(dcache->inuse, dcache->entries); > else > { > @@ -280,8 +281,7 @@ int mapcache_vcpu_init(struct vcpu *v) > if ( ents > dcache->entries ) > { > /* Populate page tables. */ > - int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, > - d->max_vcpus * > MAPCACHE_VCPU_ENTRIES, > + int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents, > NIL(l1_pgentry_t *), NULL); > > /* Populate bit maps. */ > > >
Jan Beulich
2013-Jun-13 07:49 UTC
Re: [PATCH] x86: fix map_domain_page() last resort fallback
>>> On 12.06.13 at 19:27, Keir Fraser <keir.xen@gmail.com> wrote: > On 12/06/2013 16:59, "Jan Beulich" <JBeulich@suse.com> wrote: > >> Guests with vCPU count not divisible by 4 have unused bits in the last >> word of their inuse bitmap, and the garbage collection code therefore >> would get mislead believing that some entries were actually recoverable >> for use. >> >> Also use an earlier established local variable in mapcache_vcpu_init() >> instead of re-calculating the value (noticed while investigating the >> generally better option of setting those overhanging bits once during >> setup - this didn''t work out in a simple enough fashion because the >> mapping getting established there isn''t in the current address space, >> and hence the bitmap isn''t directly accessible there). >> >> Reported-by: Konrad Wilk <konrad.wilk@oracle.com> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Whilst I can''t argue against this as the obvious bugfix to the existing > code, I personally object to clawing back hash-table entries at all. The > size of the per-vcpu hashtable is small, and it should be perfectly possible > to always allow enough extra entries in the mapcache to always be able to > allocate an entry even when all vcpu''s maphash buckets are in use. > > Perhaps this is the right fix for 4.3 at this point, but in that case I am > quite inclined to simplify this down after 4.3, sidestepping the whole > issue.I won''t object undoing this, and moving the MAPHASH_ENTRIES definition into config.h, but I also won''t put my name under it. Jan
Keir Fraser
2013-Jun-13 08:06 UTC
Re: [PATCH] x86: fix map_domain_page() last resort fallback
On 13/06/2013 08:49, "Jan Beulich" <JBeulich@suse.com> wrote:>>>> On 12.06.13 at 19:27, Keir Fraser <keir.xen@gmail.com> wrote: >> On 12/06/2013 16:59, "Jan Beulich" <JBeulich@suse.com> wrote: >> >>> Guests with vCPU count not divisible by 4 have unused bits in the last >>> word of their inuse bitmap, and the garbage collection code therefore >>> would get mislead believing that some entries were actually recoverable >>> for use. >>> >>> Also use an earlier established local variable in mapcache_vcpu_init() >>> instead of re-calculating the value (noticed while investigating the >>> generally better option of setting those overhanging bits once during >>> setup - this didn''t work out in a simple enough fashion because the >>> mapping getting established there isn''t in the current address space, >>> and hence the bitmap isn''t directly accessible there). >>> >>> Reported-by: Konrad Wilk <konrad.wilk@oracle.com> >>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> >> Whilst I can''t argue against this as the obvious bugfix to the existing >> code, I personally object to clawing back hash-table entries at all. The >> size of the per-vcpu hashtable is small, and it should be perfectly possible >> to always allow enough extra entries in the mapcache to always be able to >> allocate an entry even when all vcpu''s maphash buckets are in use. >> >> Perhaps this is the right fix for 4.3 at this point, but in that case I am >> quite inclined to simplify this down after 4.3, sidestepping the whole >> issue. > > I won''t object undoing this, and moving the MAPHASH_ENTRIES > definition into config.h, but I also won''t put my name under it.I think your fix is best for 4.3. Let''s get it checked in. Acked-by: Keir Fraser <keir@xen.org>> Jan >