Jan Beulich
2013-Jul-11 11:30 UTC
[PATCH] add locking around certain calls to map_pages_to_xen()
While boot-time calls don't need this, run-time uses of the function
which may result in L2 page tables getting populated need to be
serialized to avoid two CPUs populating the same L2 (or L3) entry,
overwriting each other's results.

This fixes what would seem to be a regression from commit b0581b92
("x86: make map_domain_page_global() a simple wrapper around vmap()"),
albeit that change only made the already existing issue more readily
visible.

The __init addition to memguard_init(), while seemingly unrelated,
helps make it obvious that this function's use of map_pages_to_xen()
is a boot-time-only one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5757,13 +5757,33 @@ void destroy_xen_mappings(unsigned long
 void __set_fixmap(
     enum fixed_addresses idx, unsigned long mfn, unsigned long flags)
 {
-    BUG_ON(idx >= __end_of_fixed_addresses);
+    /*
+     * As the fixmap area requires more than a single L2 entry, we need a lock
+     * to avoid races of map_pages_to_xen() populating those. There's no need
+     * for this at boot time, and not doing so avoids needing to disable IRQs
+     * while holding the lock.
+     */
+    static DEFINE_SPINLOCK(lock);
+
+    if ( system_state > SYS_STATE_boot )
+        spin_lock(&lock);
     map_pages_to_xen(fix_to_virt(idx), mfn, 1, flags);
+    if ( system_state > SYS_STATE_boot )
+        spin_unlock(&lock);
 }
 
 void *__init arch_vmap_virt_end(void)
 {
-    return (void *)fix_to_virt(__end_of_fixed_addresses);
+    /*
+     * If the fixmap area required more than a single L3 entry, we'd need to
+     * lock against vmap()'s calls to map_pages_to_xen(). Pre-populate the
+     * L3 entry possibly shared with vmap(), and make sure we don't share an
+     * L2 entry with it.
+     */
+    BUILD_BUG_ON(((FIXADDR_TOP - 1) ^ FIXADDR_START) >> L3_PAGETABLE_SHIFT);
+    map_pages_to_xen(fix_to_virt(__end_of_fixed_addresses), 0, 1, _PAGE_AVAIL);
+    return (void *)(fix_to_virt(__end_of_fixed_addresses) &
+                    (~0L << L2_PAGETABLE_SHIFT));
 }
 
 void __iomem *ioremap(paddr_t pa, size_t len)
@@ -6011,7 +6031,7 @@ void free_perdomain_mappings(struct doma
 
 #ifdef MEMORY_GUARD
 
-void memguard_init(void)
+void __init memguard_init(void)
 {
     unsigned long start = max_t(unsigned long, xen_phys_start, 1UL << 20);
     map_pages_to_xen(
--- a/xen/arch/x86/x86_64/mmconfig_64.c
+++ b/xen/arch/x86/x86_64/mmconfig_64.c
@@ -118,6 +118,8 @@ static void __iomem *mcfg_ioremap(const
                                   unsigned long idx, unsigned int prot)
 {
     unsigned long virt, size;
+    int rc;
+    static DEFINE_SPINLOCK(lock);
 
     virt = PCI_MCFG_VIRT_START + (idx << mmcfg_pci_segment_shift) +
            (cfg->start_bus_number << 20);
@@ -125,10 +127,13 @@ static void __iomem *mcfg_ioremap(const
     if (virt + size < virt || virt + size > PCI_MCFG_VIRT_END)
         return NULL;
 
-    if (map_pages_to_xen(virt,
-                         (cfg->address >> PAGE_SHIFT) +
-                         (cfg->start_bus_number << (20 - PAGE_SHIFT)),
-                         size >> PAGE_SHIFT, prot))
+    spin_lock(&lock);
+    rc = map_pages_to_xen(virt,
+                          (cfg->address >> PAGE_SHIFT) +
+                          (cfg->start_bus_number << (20 - PAGE_SHIFT)),
+                          size >> PAGE_SHIFT, prot);
+    spin_unlock(&lock);
+    if ( rc )
         return NULL;
 
     return (void __iomem *) virt;
--- a/xen/common/vmap.c
+++ b/xen/common/vmap.c
@@ -179,7 +179,12 @@ void *__vmap(const unsigned long *mfn, u
 
     for ( ; va && nr--; ++mfn, cur += PAGE_SIZE * granularity )
     {
-        if ( map_pages_to_xen(cur, *mfn, granularity, flags) )
+        int rc;
+
+        spin_lock(&vm_lock);
+        rc = map_pages_to_xen(cur, *mfn, granularity, flags);
+        spin_unlock(&vm_lock);
+        if ( rc )
         {
             vunmap(va);
             va = NULL;
@@ -198,7 +203,9 @@ void vunmap(const void *va)
 {
     unsigned long addr = (unsigned long)va;
 
+    spin_lock(&vm_lock);
     destroy_xen_mappings(addr, addr + PAGE_SIZE * vm_size(va));
+    spin_unlock(&vm_lock);
     vm_free(va);
 }
 #endif
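To illustrate the race the description refers to, here is a minimal
sketch (hypothetical code, not taken from Xen; the helpers only
approximate Xen's page-table accessors) of two CPUs populating the same
L3 slot:

/*
 * Hypothetical sketch of the race being fixed -- not actual Xen code.
 */
static void populate_l2_unsynchronized(l3_pgentry_t *pl3e)
{
    if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
    {
        /* CPU0 and CPU1 can both observe the entry as not present... */
        l2_pgentry_t *l2t = alloc_xen_pagetable();

        clear_page(l2t);
        /*
         * ...and both then write it: the second store silently discards
         * the first CPU's L2 table, together with any leaf entries
         * already entered into it -- exactly the "overwriting each
         * other's results" described above.
         */
        l3e_write(pl3e, l3e_from_paddr(__pa(l2t), __PAGE_HYPERVISOR));
    }
}

Any serialization fixing this only needs to cover the populate step,
which is what the discussion below converges on.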
Andrew Cooper
2013-Jul-11 11:37 UTC
Re: [PATCH] add locking around certain calls to map_pages_to_xen()
On 11/07/13 12:30, Jan Beulich wrote:
> While boot-time calls don't need this, run-time uses of the function
> which may result in L2 page tables getting populated need to be
> serialized to avoid two CPUs populating the same L2 (or L3) entry,
> overwriting each other's results.
>
> This fixes what would seem to be a regression from commit b0581b92
> ("x86: make map_domain_page_global() a simple wrapper around vmap()"),
> albeit that change only made the already existing issue more readily
> visible.
>
> The __init addition to memguard_init(), while seemingly unrelated,
> helps make it obvious that this function's use of map_pages_to_xen()
> is a boot-time-only one.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5757,13 +5757,33 @@ void destroy_xen_mappings(unsigned long
>  void __set_fixmap(
>      enum fixed_addresses idx, unsigned long mfn, unsigned long flags)
>  {
> -    BUG_ON(idx >= __end_of_fixed_addresses);
> +    /*
> +     * As the fixmap area requires more than a single L2 entry, we need a lock
> +     * to avoid races of map_pages_to_xen() populating those. There's no need
> +     * for this at boot time, and not doing so avoids needing to disable IRQs
> +     * while holding the lock.
> +     */
> +    static DEFINE_SPINLOCK(lock);
> +
> +    if ( system_state > SYS_STATE_boot )
> +        spin_lock(&lock);
>      map_pages_to_xen(fix_to_virt(idx), mfn, 1, flags);
> +    if ( system_state > SYS_STATE_boot )
> +        spin_unlock(&lock);

Perhaps read system_state to a local variable?

While I really hope this can't happen, a race condition with
system_state changing would be a very bad thing here.

~Andrew
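To spell out the hazard Andrew is pointing at (Jan explains below why
it cannot actually occur): if system_state were to advance between the
two reads, the lock/unlock pairing would become unbalanced. A sketch of
the bad interleaving, assuming such a transition were possible:

/*
 * Hypothetical interleaving -- it cannot occur in practice, because
 * system_state advances past SYS_STATE_boot exactly once, before any
 * concurrent users of __set_fixmap() exist.
 */
if ( system_state > SYS_STATE_boot )   /* reads SYS_STATE_boot: false */
    spin_lock(&lock);                  /* skipped */
map_pages_to_xen(fix_to_virt(idx), mfn, 1, flags);
/* ... suppose system_state advanced beyond SYS_STATE_boot here ... */
if ( system_state > SYS_STATE_boot )   /* now true */
    spin_unlock(&lock);                /* unlocks a lock never taken */

Latching the state into a local variable, as suggested, makes both
branches take the same decision by construction.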
Jan Beulich
2013-Jul-11 11:56 UTC
Re: [PATCH] add locking around certain calls to map_pages_to_xen()
>>> On 11.07.13 at 13:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 11/07/13 12:30, Jan Beulich wrote:
>> While boot-time calls don't need this, run-time uses of the function
>> which may result in L2 page tables getting populated need to be
>> serialized to avoid two CPUs populating the same L2 (or L3) entry,
>> overwriting each other's results.
>>
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5757,13 +5757,33 @@ void destroy_xen_mappings(unsigned long
>>  void __set_fixmap(
>>      enum fixed_addresses idx, unsigned long mfn, unsigned long flags)
>>  {
>> -    BUG_ON(idx >= __end_of_fixed_addresses);
>> +    /*
>> +     * As the fixmap area requires more than a single L2 entry, we need a lock
>> +     * to avoid races of map_pages_to_xen() populating those. There's no need
>> +     * for this at boot time, and not doing so avoids needing to disable IRQs
>> +     * while holding the lock.
>> +     */
>> +    static DEFINE_SPINLOCK(lock);
>> +
>> +    if ( system_state > SYS_STATE_boot )
>> +        spin_lock(&lock);
>>      map_pages_to_xen(fix_to_virt(idx), mfn, 1, flags);
>> +    if ( system_state > SYS_STATE_boot )
>> +        spin_unlock(&lock);
>
> Perhaps read system_state to a local variable?
>
> While I really hope this can't happen, a race condition with
> system_state changing would be a very bad thing here.

That really can't happen, and hence I don't see the need to use a local
variable for latching the state.

Jan
Jan Beulich
2013-Jul-12 08:17 UTC
[PATCH v2] add locking around certain calls to map_pages_to_xen()
While boot-time calls don't need this, run-time uses of the function
which may result in L2 page tables getting populated need to be
serialized to avoid two CPUs populating the same L2 (or L3) entry,
overwriting each other's results.

This fixes what would seem to be a regression from commit b0581b92
("x86: make map_domain_page_global() a simple wrapper around vmap()"),
albeit that change only made the already existing issue more readily
visible.

The __init addition to memguard_init(), while seemingly unrelated,
helps make it obvious that this function's use of map_pages_to_xen()
is a boot-time-only one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Latch system_state into a local variable, as suggested by Andrew
    Cooper (in the hope that this might accelerate getting the patch in
    to overcome the stage tester failures).

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5758,13 +5758,34 @@ void destroy_xen_mappings(unsigned long
 void __set_fixmap(
     enum fixed_addresses idx, unsigned long mfn, unsigned long flags)
 {
-    BUG_ON(idx >= __end_of_fixed_addresses);
+    /*
+     * As the fixmap area requires more than a single L2 entry, we need a lock
+     * to avoid races of map_pages_to_xen() populating those. There's no need
+     * for this at boot time, and not doing so avoids needing to disable IRQs
+     * while holding the lock.
+     */
+    static DEFINE_SPINLOCK(lock);
+    bool_t locking = system_state > SYS_STATE_boot;
+
+    if ( locking )
+        spin_lock(&lock);
     map_pages_to_xen(fix_to_virt(idx), mfn, 1, flags);
+    if ( locking )
+        spin_unlock(&lock);
 }
 
 void *__init arch_vmap_virt_end(void)
 {
-    return (void *)fix_to_virt(__end_of_fixed_addresses);
+    /*
+     * If the fixmap area required more than a single L3 entry, we'd need to
+     * lock against vmap()'s calls to map_pages_to_xen(). Pre-populate the
+     * L3 entry possibly shared with vmap(), and make sure we don't share an
+     * L2 entry with it.
+     */
+    BUILD_BUG_ON(((FIXADDR_TOP - 1) ^ FIXADDR_START) >> L3_PAGETABLE_SHIFT);
+    map_pages_to_xen(fix_to_virt(__end_of_fixed_addresses), 0, 1, _PAGE_AVAIL);
+    return (void *)(fix_to_virt(__end_of_fixed_addresses) &
+                    (~0L << L2_PAGETABLE_SHIFT));
 }
 
 void __iomem *ioremap(paddr_t pa, size_t len)
@@ -6012,7 +6033,7 @@ void free_perdomain_mappings(struct doma
 
 #ifdef MEMORY_GUARD
 
-void memguard_init(void)
+void __init memguard_init(void)
 {
     unsigned long start = max_t(unsigned long, xen_phys_start, 1UL << 20);
     map_pages_to_xen(
--- a/xen/arch/x86/x86_64/mmconfig_64.c
+++ b/xen/arch/x86/x86_64/mmconfig_64.c
@@ -118,6 +118,8 @@ static void __iomem *mcfg_ioremap(const
                                   unsigned long idx, unsigned int prot)
 {
     unsigned long virt, size;
+    int rc;
+    static DEFINE_SPINLOCK(lock);
 
     virt = PCI_MCFG_VIRT_START + (idx << mmcfg_pci_segment_shift) +
            (cfg->start_bus_number << 20);
@@ -125,10 +127,13 @@ static void __iomem *mcfg_ioremap(const
     if (virt + size < virt || virt + size > PCI_MCFG_VIRT_END)
         return NULL;
 
-    if (map_pages_to_xen(virt,
-                         (cfg->address >> PAGE_SHIFT) +
-                         (cfg->start_bus_number << (20 - PAGE_SHIFT)),
-                         size >> PAGE_SHIFT, prot))
+    spin_lock(&lock);
+    rc = map_pages_to_xen(virt,
+                          (cfg->address >> PAGE_SHIFT) +
+                          (cfg->start_bus_number << (20 - PAGE_SHIFT)),
+                          size >> PAGE_SHIFT, prot);
+    spin_unlock(&lock);
+    if ( rc )
         return NULL;
 
     return (void __iomem *) virt;
--- a/xen/common/vmap.c
+++ b/xen/common/vmap.c
@@ -179,7 +179,12 @@ void *__vmap(const unsigned long *mfn, u
 
     for ( ; va && nr--; ++mfn, cur += PAGE_SIZE * granularity )
     {
-        if ( map_pages_to_xen(cur, *mfn, granularity, flags) )
+        int rc;
+
+        spin_lock(&vm_lock);
+        rc = map_pages_to_xen(cur, *mfn, granularity, flags);
+        spin_unlock(&vm_lock);
+        if ( rc )
         {
             vunmap(va);
             va = NULL;
@@ -198,7 +203,9 @@ void vunmap(const void *va)
 {
     unsigned long addr = (unsigned long)va;
 
+    spin_lock(&vm_lock);
     destroy_xen_mappings(addr, addr + PAGE_SIZE * vm_size(va));
+    spin_unlock(&vm_lock);
     vm_free(va);
 }
 #endif
Andrew Cooper
2013-Jul-12 09:48 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
On 12/07/13 09:17, Jan Beulich wrote:
> While boot-time calls don't need this, run-time uses of the function
> which may result in L2 page tables getting populated need to be
> serialized to avoid two CPUs populating the same L2 (or L3) entry,
> overwriting each other's results.
>
> This fixes what would seem to be a regression from commit b0581b92
> ("x86: make map_domain_page_global() a simple wrapper around vmap()"),
> albeit that change only made the already existing issue more readily
> visible.
>
> The __init addition to memguard_init(), while seemingly unrelated,
> helps make it obvious that this function's use of map_pages_to_xen()
> is a boot-time-only one.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Keir Fraser
2013-Jul-12 12:15 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
On 12/07/2013 09:17, "Jan Beulich" <JBeulich@suse.com> wrote:

> While boot-time calls don't need this, run-time uses of the function
> which may result in L2 page tables getting populated need to be
> serialized to avoid two CPUs populating the same L2 (or L3) entry,
> overwriting each other's results.
>
> This fixes what would seem to be a regression from commit b0581b92
> ("x86: make map_domain_page_global() a simple wrapper around vmap()"),
> albeit that change only made the already existing issue more readily
> visible.
>
> The __init addition to memguard_init(), while seemingly unrelated,
> helps make it obvious that this function's use of map_pages_to_xen()
> is a boot-time-only one.

Why can't the locking be implemented inside map_pages_to_xen()? The
requirement is due to implementation details of that function after
all. Pushing the synchronisation out to the callers is uglier and more
fragile.

 -- Keir
Jan Beulich
2013-Jul-12 12:44 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
>>> On 12.07.13 at 14:15, Keir Fraser <keir.xen@gmail.com> wrote:
> Why can't the locking be implemented inside map_pages_to_xen()? The
> requirement is due to implementation details of that function after
> all. Pushing the synchronisation out to the callers is uglier and more
> fragile.

Not all use cases of the function require synchronization, so the only
kind of synchronization I would see validly adding there instead of in
the callers would be a mechanism marking a to-be-populated non-leaf
page table entry as "being processed" first, and having racing
invocations spin until that state clears. Afaict that wouldn't cope
with eventual (future) races through destroy_xen_mappings() though, and
hence I think the proposed patch is the better alternative. But if
you're fine with ignoring that last aspect, I'm okay with going the
alternative route.

Jan
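The per-entry scheme described above might look roughly like this
(illustrative only; none of these names exist in Xen, and ENTRY_BUSY
assumes a bit pattern no valid entry can carry):

/*
 * Illustrative sketch of the "being processed" marker idea -- not Xen
 * code.  An empty non-leaf slot is claimed via cmpxchg(); racing CPUs
 * spin until the winner publishes the real table, so disjoint parts of
 * the address space never contend on a global lock.
 */
#define ENTRY_EMPTY 0UL
#define ENTRY_BUSY  1UL   /* assumed: no valid entry ever has this value */

static unsigned long get_or_populate(unsigned long *slot)
{
    for ( ; ; )
    {
        unsigned long e = *(volatile unsigned long *)slot;

        if ( e == ENTRY_BUSY )
        {
            cpu_relax();              /* another CPU is populating it */
            continue;
        }
        if ( e != ENTRY_EMPTY )
            return e;                 /* already populated */
        if ( cmpxchg(slot, ENTRY_EMPTY, ENTRY_BUSY) == ENTRY_EMPTY )
        {
            /* We won the claim; build the table and publish it. */
            unsigned long new_e = make_table_entry();  /* hypothetical */

            *(volatile unsigned long *)slot = new_e;
            return new_e;
        }
        /* Lost the claim race -- go around again. */
    }
}

As noted above, a concurrent destroy_xen_mappings() could still tear
the entry down between the claim and its use, which is why the
externally-locked variant was posted instead.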
Keir Fraser
2013-Jul-12 13:37 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
On 12/07/2013 13:44, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 12.07.13 at 14:15, Keir Fraser <keir.xen@gmail.com> wrote:
>> Why can't the locking be implemented inside map_pages_to_xen()? The
>> requirement is due to implementation details of that function after
>> all. Pushing the synchronisation out to the callers is uglier and more
>> fragile.
>
> Not all use cases of the function require synchronization, so the only
> kind of synchronization I would see validly adding there instead of in
> the callers would be a mechanism marking a to-be-populated non-leaf
> page table entry as "being processed" first, and having racing
> invocations spin until that state clears. Afaict that wouldn't cope
> with eventual (future) races through destroy_xen_mappings() though, and
> hence I think the proposed patch is the better alternative. But if
> you're fine with ignoring that last aspect, I'm okay with going the
> alternative route.

Is it unsafe to just stick a lock around the guts of map_pages_to_xen(),
or at least the parts that add new page tables?
Jan Beulich
2013-Jul-12 13:41 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
>>> On 12.07.13 at 15:37, Keir Fraser <keir.xen@gmail.com> wrote:
> Is it unsafe to just stick a lock around the guts of map_pages_to_xen(),
> or at least the parts that add new page tables?

I'm not certain about the safety of this, but clearly two CPUs changing
entirely different parts of the address space don't need to lock out
one another, so I rather view adding a global lock here as being
(potentially) harmful in terms of performance (and hence the thought of
locking at page table entry granularity instead).

Jan
Keir Fraser
2013-Jul-12 14:01 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
On 12/07/2013 14:41, "Jan Beulich" <JBeulich@suse.com> wrote:

>> Is it unsafe to just stick a lock around the guts of map_pages_to_xen(),
>> or at least the parts that add new page tables?
>
> I'm not certain about the safety of this, but clearly two CPUs changing
> entirely different parts of the address space don't need to lock out
> one another, so I rather view adding a global lock here as being
> (potentially) harmful in terms of performance (and hence the thought of
> locking at page table entry granularity instead).

Ah, I see. Well, locking only on changes to page-directory entries
wouldn't be too bad, even if it were a single global lock? That would
be a rare occurrence. It's reasonable to assume that callers will not
conflict on the page-aligned regions they modify, so this would
suffice?

 -- Keir
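Keir's suggestion would amount to something like the following inside
map_pages_to_xen()'s table-walking path (a sketch only, assuming
virt_to_xen_l3e() and the remaining helpers behave as the mm.c
internals of the time; this is not code from any posted patch):

/* Sketch: locking limited to the rare populate-a-new-table path. */
static DEFINE_SPINLOCK(map_pgdir_lock);

static l2_pgentry_t *virt_to_xen_l2e_sketch(unsigned long v)
{
    l3_pgentry_t *pl3e = virt_to_xen_l3e(v);

    if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
    {
        spin_lock(&map_pgdir_lock);
        /* Re-check under the lock: another CPU may have won the race. */
        if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
        {
            l2_pgentry_t *l2t = alloc_xen_pagetable();

            clear_page(l2t);    /* error handling elided */
            l3e_write(pl3e, l3e_from_paddr(__pa(l2t), __PAGE_HYPERVISOR));
        }
        spin_unlock(&map_pgdir_lock);
    }

    /* Leaf updates below this level proceed without the lock. */
    return l3e_to_l2e(*pl3e) + l2_table_offset(v);
}

The double check of _PAGE_PRESENT around the lock keeps the common,
already-populated path lock-free while still serializing the rare
populate step.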
Jan Beulich
2013-Jul-12 14:30 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
>>> On 12.07.13 at 16:01, Keir Fraser <keir.xen@gmail.com> wrote:
> Ah, I see. Well, locking only on changes to page-directory entries
> wouldn't be too bad, even if it were a single global lock? That would
> be a rare occurrence. It's reasonable to assume that callers will not
> conflict on the page-aligned regions they modify, so this would
> suffice?

Well, okay, I'll do it that way then. Are you okay with skipping the
locking during boot, just as done in __set_fixmap() in the current
version of the patch?

Jan
Jan Beulich
2013-Jul-15 08:24 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
>>> On 12.07.13 at 16:30, "Jan Beulich" <JBeulich@suse.com> wrote:
> Well, okay, I'll do it that way then. Are you okay with skipping the
> locking during boot, just as done in __set_fixmap() in the current
> version of the patch?

Further to this, there's a worrying comment prior to map_pages_to_xen():

"map_pages_to_xen() can be called with interrupts disabled:
 * During early bootstrap; or
 * alloc_xenheap_pages() via memguard_guard_range"

The former would be taken care of by only doing any locking post-boot,
but the latter would also require skipping the locking when interrupts
are disabled. Yet I wonder whether that part of the comment isn't
stale - alloc_xenheap_pages() surely shouldn't be called with
interrupts disabled? (I didn't get to try yet whether check_lock()
would trigger on any such path.)

Jan
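One illustrative way to test that suspicion (not part of any posted
patch) would be an assertion at the top of alloc_xenheap_pages():

/*
 * Illustrative only: if the second bullet of the comment is stale,
 * this should never fire.  check_lock() in the spinlock debugging
 * code would catch much the same inconsistency once an IRQ-unsafe
 * lock is taken on such a path.
 */
ASSERT(!in_irq() && local_irq_is_enabled());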
Keir Fraser
2013-Jul-15 08:36 UTC
Re: [PATCH v2] add locking around certain calls to map_pages_to_xen()
On 15/07/2013 09:24, "Jan Beulich" <JBeulich@suse.com> wrote:

>> Well, okay, I'll do it that way then. Are you okay with skipping the
>> locking during boot, just as done in __set_fixmap() in the current
>> version of the patch?

Yes.

> Further to this, there's a worrying comment prior to map_pages_to_xen():
>
> "map_pages_to_xen() can be called with interrupts disabled:
>  * During early bootstrap; or
>  * alloc_xenheap_pages() via memguard_guard_range"
>
> The former would be taken care of by only doing any locking post-boot,
> but the latter would also require skipping the locking when interrupts
> are disabled. Yet I wonder whether that part of the comment isn't
> stale - alloc_xenheap_pages() surely shouldn't be called with
> interrupts disabled? (I didn't get to try yet whether check_lock()
> would trigger on any such path.)

Agreed it must be stale!

 -- Keir