Jan Beulich
2011-Aug-16 14:07 UTC
[Xen-devel] [PATCH] xen/x86: replace order-based range checking of M2P table by linear one
The order-based approach is not only less efficient (requiring a shift and a compare, typical generated code looking like this mov eax, [machine_to_phys_order] mov ecx, eax shr ebx, cl test ebx, ebx jnz ... whereas a direct check requires just a compare, like in cmp ebx, [machine_to_phys_nr] jae ... ), but also slightly dangerous in the 32-on-64 case - the element address calculation can wrap if the next power of two boundary is sufficiently far away from the actual upper limit of the table, and hence can result in user space addresses being accessed (with it being unknown what may actually be mapped there). Additionally, the elimination of the mistaken use of fls() here (should have been __fls()) fixes a latent issue on x86-64 that would trigger if the code was run on a system with memory extending beyond the 44-bit boundary. Signed-off-by: Jan Beulich <jbeulich@novell.com> --- arch/x86/include/asm/xen/page.h | 4 ++-- arch/x86/xen/enlighten.c | 4 ++-- arch/x86/xen/mmu.c | 12 ++++++++---- 3 files changed, 12 insertions(+), 8 deletions(-) --- 3.1-rc2/arch/x86/include/asm/xen/page.h +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/include/asm/xen/page.h @@ -39,7 +39,7 @@ typedef struct xpaddr { ((unsigned long)((u64)CONFIG_XEN_MAX_DOMAIN_MEMORY * 1024 * 1024 * 1024 / PAGE_SIZE)) extern unsigned long *machine_to_phys_mapping; -extern unsigned int machine_to_phys_order; +extern unsigned long machine_to_phys_nr; extern unsigned long get_phys_to_machine(unsigned long pfn); extern bool set_phys_to_machine(unsigned long pfn, unsigned long mfn); @@ -87,7 +87,7 @@ static inline unsigned long mfn_to_pfn(u if (xen_feature(XENFEAT_auto_translated_physmap)) return mfn; - if (unlikely((mfn >> machine_to_phys_order) != 0)) { + if (unlikely(mfn >= machine_to_phys_nr)) { pfn = ~0; goto try_override; } --- 3.1-rc2/arch/x86/xen/enlighten.c +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/enlighten.c @@ -77,8 +77,8 @@ EXPORT_SYMBOL_GPL(xen_domain_type); unsigned long *machine_to_phys_mapping = (void *)MACH2PHYS_VIRT_START; EXPORT_SYMBOL(machine_to_phys_mapping); -unsigned int machine_to_phys_order; -EXPORT_SYMBOL(machine_to_phys_order); +unsigned long machine_to_phys_nr; +EXPORT_SYMBOL(machine_to_phys_nr); struct start_info *xen_start_info; EXPORT_SYMBOL_GPL(xen_start_info); --- 3.1-rc2/arch/x86/xen/mmu.c +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/mmu.c @@ -1713,15 +1713,19 @@ static void __init xen_map_identity_earl void __init xen_setup_machphys_mapping(void) { struct xen_machphys_mapping mapping; - unsigned long machine_to_phys_nr_ents; if (HYPERVISOR_memory_op(XENMEM_machphys_mapping, &mapping) == 0) { machine_to_phys_mapping = (unsigned long *)mapping.v_start; - machine_to_phys_nr_ents = mapping.max_mfn + 1; + machine_to_phys_nr = mapping.max_mfn + 1; } else { - machine_to_phys_nr_ents = MACH2PHYS_NR_ENTRIES; + machine_to_phys_nr = MACH2PHYS_NR_ENTRIES; } - machine_to_phys_order = fls(machine_to_phys_nr_ents - 1); +#ifdef CONFIG_X86_32 + if (machine_to_phys_mapping + machine_to_phys_nr + < machine_to_phys_mapping) + machine_to_phys_nr = (unsigned long *)NULL + - machine_to_phys_mapping; +#endif } #ifdef CONFIG_X86_64 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 14:13 UTC
Re: [Xen-devel] [PATCH] xen/x86: replace order-based range checking of M2P table by linear one
On Tue, Aug 16, 2011 at 03:07:41PM +0100, Jan Beulich wrote:> The order-based approach is not only less efficient (requiring a shift > and a compare, typical generated code looking like this > > mov eax, [machine_to_phys_order] > mov ecx, eax > shr ebx, cl > test ebx, ebx > jnz ... > > whereas a direct check requires just a compare, like in > > cmp ebx, [machine_to_phys_nr] > jae ... > > ), but also slightly dangerous in the 32-on-64 case - the element > address calculation can wrap if the next power of two boundary is > sufficiently far away from the actual upper limit of the table, and > hence can result in user space addresses being accessed (with it being > unknown what may actually be mapped there).Ugh. OK, let me queue it up for 3.1 - this also sounds like a good candidate for stable trees.> > Additionally, the elimination of the mistaken use of fls() here (should > have been __fls()) fixes a latent issue on x86-64 that would trigger > if the code was run on a system with memory extending beyond the 44-bit > boundary. > > Signed-off-by: Jan Beulich <jbeulich@novell.com> > > --- > arch/x86/include/asm/xen/page.h | 4 ++-- > arch/x86/xen/enlighten.c | 4 ++-- > arch/x86/xen/mmu.c | 12 ++++++++---- > 3 files changed, 12 insertions(+), 8 deletions(-) > > --- 3.1-rc2/arch/x86/include/asm/xen/page.h > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/include/asm/xen/page.h > @@ -39,7 +39,7 @@ typedef struct xpaddr { > ((unsigned long)((u64)CONFIG_XEN_MAX_DOMAIN_MEMORY * 1024 * 1024 * 1024 / PAGE_SIZE)) > > extern unsigned long *machine_to_phys_mapping; > -extern unsigned int machine_to_phys_order; > +extern unsigned long machine_to_phys_nr; > > extern unsigned long get_phys_to_machine(unsigned long pfn); > extern bool set_phys_to_machine(unsigned long pfn, unsigned long mfn); > @@ -87,7 +87,7 @@ static inline unsigned long mfn_to_pfn(u > if (xen_feature(XENFEAT_auto_translated_physmap)) > return mfn; > > - if (unlikely((mfn >> machine_to_phys_order) != 0)) { > + if (unlikely(mfn >= machine_to_phys_nr)) { > pfn = ~0; > goto try_override; > } > --- 3.1-rc2/arch/x86/xen/enlighten.c > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/enlighten.c > @@ -77,8 +77,8 @@ EXPORT_SYMBOL_GPL(xen_domain_type); > > unsigned long *machine_to_phys_mapping = (void *)MACH2PHYS_VIRT_START; > EXPORT_SYMBOL(machine_to_phys_mapping); > -unsigned int machine_to_phys_order; > -EXPORT_SYMBOL(machine_to_phys_order); > +unsigned long machine_to_phys_nr; > +EXPORT_SYMBOL(machine_to_phys_nr); > > struct start_info *xen_start_info; > EXPORT_SYMBOL_GPL(xen_start_info); > --- 3.1-rc2/arch/x86/xen/mmu.c > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/mmu.c > @@ -1713,15 +1713,19 @@ static void __init xen_map_identity_earl > void __init xen_setup_machphys_mapping(void) > { > struct xen_machphys_mapping mapping; > - unsigned long machine_to_phys_nr_ents; > > if (HYPERVISOR_memory_op(XENMEM_machphys_mapping, &mapping) == 0) { > machine_to_phys_mapping = (unsigned long *)mapping.v_start; > - machine_to_phys_nr_ents = mapping.max_mfn + 1; > + machine_to_phys_nr = mapping.max_mfn + 1; > } else { > - machine_to_phys_nr_ents = MACH2PHYS_NR_ENTRIES; > + machine_to_phys_nr = MACH2PHYS_NR_ENTRIES; > } > - machine_to_phys_order = fls(machine_to_phys_nr_ents - 1); > +#ifdef CONFIG_X86_32 > + if (machine_to_phys_mapping + machine_to_phys_nr > + < machine_to_phys_mapping) > + machine_to_phys_nr = (unsigned long *)NULL > + - machine_to_phys_mapping; > +#endif > } > > #ifdef CONFIG_X86_64 > > >> The order-based approach is not only less efficient (requiring a shift > and a compare, typical generated code looking like this > > mov eax, [machine_to_phys_order] > mov ecx, eax > shr ebx, cl > test ebx, ebx > jnz ... > > whereas a direct check requires just a compare, like in > > cmp ebx, [machine_to_phys_nr] > jae ... > > ), but also slightly dangerous in the 32-on-64 case - the element > address calculation can wrap if the next power of two boundary is > sufficiently far away from the actual upper limit of the table, and > hence can result in user space addresses being accessed (with it being > unknown what may actually be mapped there). > > Additionally, the elimination of the mistaken use of fls() here (should > have been __fls()) fixes a latent issue on x86-64 that would trigger > if the code was run on a system with memory extending beyond the 44-bit > boundary. > > Signed-off-by: Jan Beulich <jbeulich@novell.com> > > --- > arch/x86/include/asm/xen/page.h | 4 ++-- > arch/x86/xen/enlighten.c | 4 ++-- > arch/x86/xen/mmu.c | 12 ++++++++---- > 3 files changed, 12 insertions(+), 8 deletions(-) > > --- 3.1-rc2/arch/x86/include/asm/xen/page.h > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/include/asm/xen/page.h > @@ -39,7 +39,7 @@ typedef struct xpaddr { > ((unsigned long)((u64)CONFIG_XEN_MAX_DOMAIN_MEMORY * 1024 * 1024 * 1024 / PAGE_SIZE)) > > extern unsigned long *machine_to_phys_mapping; > -extern unsigned int machine_to_phys_order; > +extern unsigned long machine_to_phys_nr; > > extern unsigned long get_phys_to_machine(unsigned long pfn); > extern bool set_phys_to_machine(unsigned long pfn, unsigned long mfn); > @@ -87,7 +87,7 @@ static inline unsigned long mfn_to_pfn(u > if (xen_feature(XENFEAT_auto_translated_physmap)) > return mfn; > > - if (unlikely((mfn >> machine_to_phys_order) != 0)) { > + if (unlikely(mfn >= machine_to_phys_nr)) { > pfn = ~0; > goto try_override; > } > --- 3.1-rc2/arch/x86/xen/enlighten.c > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/enlighten.c > @@ -77,8 +77,8 @@ EXPORT_SYMBOL_GPL(xen_domain_type); > > unsigned long *machine_to_phys_mapping = (void *)MACH2PHYS_VIRT_START; > EXPORT_SYMBOL(machine_to_phys_mapping); > -unsigned int machine_to_phys_order; > -EXPORT_SYMBOL(machine_to_phys_order); > +unsigned long machine_to_phys_nr; > +EXPORT_SYMBOL(machine_to_phys_nr); > > struct start_info *xen_start_info; > EXPORT_SYMBOL_GPL(xen_start_info); > --- 3.1-rc2/arch/x86/xen/mmu.c > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/mmu.c > @@ -1713,15 +1713,19 @@ static void __init xen_map_identity_earl > void __init xen_setup_machphys_mapping(void) > { > struct xen_machphys_mapping mapping; > - unsigned long machine_to_phys_nr_ents; > > if (HYPERVISOR_memory_op(XENMEM_machphys_mapping, &mapping) == 0) { > machine_to_phys_mapping = (unsigned long *)mapping.v_start; > - machine_to_phys_nr_ents = mapping.max_mfn + 1; > + machine_to_phys_nr = mapping.max_mfn + 1; > } else { > - machine_to_phys_nr_ents = MACH2PHYS_NR_ENTRIES; > + machine_to_phys_nr = MACH2PHYS_NR_ENTRIES; > } > - machine_to_phys_order = fls(machine_to_phys_nr_ents - 1); > +#ifdef CONFIG_X86_32 > + if (machine_to_phys_mapping + machine_to_phys_nr > + < machine_to_phys_mapping) > + machine_to_phys_nr = (unsigned long *)NULL > + - machine_to_phys_mapping; > +#endif > } > > #ifdef CONFIG_X86_64> _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Aug-16 17:45 UTC
[Xen-devel] Re: [PATCH] xen/x86: replace order-based range checking of M2P table by linear one
On 08/16/2011 07:07 AM, Jan Beulich wrote:> The order-based approach is not only less efficient (requiring a shift > and a compare, typical generated code looking like this > > mov eax, [machine_to_phys_order] > mov ecx, eax > shr ebx, cl > test ebx, ebx > jnz ... > > whereas a direct check requires just a compare, like in > > cmp ebx, [machine_to_phys_nr] > jae ... > > ), but also slightly dangerous in the 32-on-64 case - the element > address calculation can wrap if the next power of two boundary is > sufficiently far away from the actual upper limit of the table, and > hence can result in user space addresses being accessed (with it being > unknown what may actually be mapped there). > > Additionally, the elimination of the mistaken use of fls() here (should > have been __fls()) fixes a latent issue on x86-64 that would trigger > if the code was run on a system with memory extending beyond the 44-bit > boundary.I never really understood the rationale for the order stuff; I copied it across from 2.6.18-xen without really thinking about it. This looks sensible. But...> > Signed-off-by: Jan Beulich <jbeulich@novell.com> > > --- > arch/x86/include/asm/xen/page.h | 4 ++-- > arch/x86/xen/enlighten.c | 4 ++-- > arch/x86/xen/mmu.c | 12 ++++++++---- > 3 files changed, 12 insertions(+), 8 deletions(-) > > --- 3.1-rc2/arch/x86/include/asm/xen/page.h > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/include/asm/xen/page.h > @@ -39,7 +39,7 @@ typedef struct xpaddr { > ((unsigned long)((u64)CONFIG_XEN_MAX_DOMAIN_MEMORY * 1024 * 1024 * 1024 / PAGE_SIZE)) > > extern unsigned long *machine_to_phys_mapping; > -extern unsigned int machine_to_phys_order; > +extern unsigned long machine_to_phys_nr; > > extern unsigned long get_phys_to_machine(unsigned long pfn); > extern bool set_phys_to_machine(unsigned long pfn, unsigned long mfn); > @@ -87,7 +87,7 @@ static inline unsigned long mfn_to_pfn(u > if (xen_feature(XENFEAT_auto_translated_physmap)) > return mfn; > > - if (unlikely((mfn >> machine_to_phys_order) != 0)) { > + if (unlikely(mfn >= machine_to_phys_nr)) { > pfn = ~0; > goto try_override; > } > --- 3.1-rc2/arch/x86/xen/enlighten.c > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/enlighten.c > @@ -77,8 +77,8 @@ EXPORT_SYMBOL_GPL(xen_domain_type); > > unsigned long *machine_to_phys_mapping = (void *)MACH2PHYS_VIRT_START; > EXPORT_SYMBOL(machine_to_phys_mapping); > -unsigned int machine_to_phys_order; > -EXPORT_SYMBOL(machine_to_phys_order); > +unsigned long machine_to_phys_nr; > +EXPORT_SYMBOL(machine_to_phys_nr); > > struct start_info *xen_start_info; > EXPORT_SYMBOL_GPL(xen_start_info); > --- 3.1-rc2/arch/x86/xen/mmu.c > +++ 3.1-rc2-x86-xen-p2m-nr/arch/x86/xen/mmu.c > @@ -1713,15 +1713,19 @@ static void __init xen_map_identity_earl > void __init xen_setup_machphys_mapping(void) > { > struct xen_machphys_mapping mapping; > - unsigned long machine_to_phys_nr_ents; > > if (HYPERVISOR_memory_op(XENMEM_machphys_mapping, &mapping) == 0) { > machine_to_phys_mapping = (unsigned long *)mapping.v_start; > - machine_to_phys_nr_ents = mapping.max_mfn + 1; > + machine_to_phys_nr = mapping.max_mfn + 1; > } else { > - machine_to_phys_nr_ents = MACH2PHYS_NR_ENTRIES; > + machine_to_phys_nr = MACH2PHYS_NR_ENTRIES; > } > - machine_to_phys_order = fls(machine_to_phys_nr_ents - 1); > +#ifdef CONFIG_X86_32 > + if (machine_to_phys_mapping + machine_to_phys_nr > + < machine_to_phys_mapping)I''d prefer extra parens around the + just to make it very clear. Is this kind of overflow check fully defined, or could the compiler find some way of screwing it up?> + machine_to_phys_nr = (unsigned long *)NULL > + - machine_to_phys_mapping;Is the machine_to_phys_mapping array guaranteed to end at the end of the address space? And I think a literal ''0'' there would make it a bit clearer what''s going on, rather than invoking all the stuff that NULL implies. Thanks, J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Aug-17 06:58 UTC
[Xen-devel] Re: [PATCH] xen/x86: replace order-based range checking of M2P table by linear one
>>> On 16.08.11 at 19:45, Jeremy Fitzhardinge <jeremy@goop.org> wrote: > On 08/16/2011 07:07 AM, Jan Beulich wrote: >> +#ifdef CONFIG_X86_32 >> + if (machine_to_phys_mapping + machine_to_phys_nr >> + < machine_to_phys_mapping) > > I''d prefer extra parens around the + just to make it very clear. Is > this kind of overflow check fully defined, or could the compiler find > some way of screwing it up?This check is fully defined afaik, ...>> + machine_to_phys_nr = (unsigned long *)NULL >> + - machine_to_phys_mapping; > > Is the machine_to_phys_mapping array guaranteed to end at the end of the > address space? And I think a literal ''0'' there would make it a bit > clearer what''s going on, rather than invoking all the stuff that NULL > implies.... and this operation is as long as machine_to_phys_mapping is aligned to a long boundary (which we know it is). machine_to_phys_mapping not necessarily ends at the end of address space (it never will on a 32-bit hypervisor, and the 64-bit one puts a 2Mb guard page at the end), but the purpose here is not to determine a precise machine_to_phys_nr, but just to avoid overflow, since the number reported by the hypervisor isn''t necessarily precise. But wait, only on x86-64 it''s possibly much larger than the actual number of entries, while what a 32-bit kernel gets to see (no matter whether running on a 32- or 64-bit hypervisor) should really never itself result in an overflow (only the previous order-based calculation could), so it should even be okay to make this a BUG_ON(). I may give this a try, and come up with an incremental patch (but I don''t see a pressing need to re-spin this one). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel