Andres Lagar-Cavilla
2011-Nov-24 17:58 UTC
[PATCH] Don't trigger unnecessary shadow scans on p2m entry update
 xen/arch/x86/mm/shadow/common.c  |  6 ++----
 xen/arch/x86/mm/shadow/multi.c   |  2 +-
 xen/arch/x86/mm/shadow/private.h |  6 ++++++
 3 files changed, 9 insertions(+), 5 deletions(-)

When updating a p2m entry, the hypervisor scans all shadow ptes to find
mappings of that gfn and tear them down. This is avoided if the page count
reveals that there are no additional mappings. The current test ignores the
PGC_allocated flag and its effect on the page count.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>

diff -r 5c339eb31d34 -r a264fb979b0c xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2464,7 +2464,6 @@ int sh_remove_write_access_from_sl1p(str
 int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn)
 {
     struct page_info *page = mfn_to_page(gmfn);
-    int expected_count;
 
     /* Dispatch table for getting per-type functions */
     static const hash_callback_t callbacks[SH_type_unused] = {
@@ -2501,7 +2500,7 @@ int sh_remove_all_mappings(struct vcpu *
         ;
 
     perfc_incr(shadow_mappings);
-    if ( (page->count_info & PGC_count_mask) == 0 )
+    if ( __check_page_no_refs(page) )
         return 0;
 
     /* Although this is an externally visible function, we do not know
@@ -2517,8 +2516,7 @@ int sh_remove_all_mappings(struct vcpu *
     hash_foreach(v, callback_mask, callbacks, gmfn);
 
     /* If that didn't catch the mapping, something is very wrong */
-    expected_count = (page->count_info & PGC_allocated) ? 1 : 0;
-    if ( (page->count_info & PGC_count_mask) != expected_count )
+    if ( !__check_page_no_refs(page) )
     {
         /* Don't complain if we're in HVM and there are some extra mappings:
          * The qemu helper process has an untyped mapping of this dom's RAM
diff -r 5c339eb31d34 -r a264fb979b0c xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4591,7 +4591,7 @@ int sh_rm_mappings_from_l1(struct vcpu *
         {
             (void) shadow_set_l1e(v, sl1e, shadow_l1e_empty(),
                                   p2m_invalid, sl1mfn);
-            if ( (mfn_to_page(target_mfn)->count_info & PGC_count_mask) == 0 )
+            if ( __check_page_no_refs(mfn_to_page(target_mfn)) )
                 /* This breaks us cleanly out of the FOREACH macro */
                 done = 1;
         }
diff -r 5c339eb31d34 -r a264fb979b0c xen/arch/x86/mm/shadow/private.h
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -815,6 +815,12 @@ static inline unsigned long vtlb_lookup(
 }
 #endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */
 
+static inline int __check_page_no_refs(struct page_info *page)
+{
+    unsigned long __count = page->count_info;
+    return ( (__count & PGC_count_mask) ==
+             ((__count & PGC_allocated) ? 1 : 0) );
+}
 
 #endif /* _XEN_SHADOW_PRIVATE_H */
Keir Fraser
2011-Nov-24 18:31 UTC
Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
On 24/11/2011 17:58, "Andres Lagar-Cavilla" <andres@lagarcavilla.org> wrote:

> When updating a p2m entry, the hypervisor scans all shadow ptes to find
> mappings of that gfn and tear them down. This is avoided if the page count
> reveals that there are no additional mappings. The current test ignores the
> PGC_allocated flag and its effect on the page count.
>
> [...]
>
> +static inline int __check_page_no_refs(struct page_info *page)
> +{
> +    unsigned long __count = page->count_info;
> +    return ( (__count & PGC_count_mask) ==
> +             ((__count & PGC_allocated) ? 1 : 0) );

It's a fussy point, but it might be better to use
atomic_read_ulong(&page->count_info) here, to ensure that the compiler reads
the field once only. It's cheap (compiles to a single mov instruction), but
atomic_read_ulong doesn't exist so you'd have to add its (obvious)
definition to asm-x86/atomic.h

 -- Keir

> +}
>
> #endif /* _XEN_SHADOW_PRIVATE_H */
Jan Beulich
2011-Nov-25 08:45 UTC
Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
>>> On 24.11.11 at 19:31, Keir Fraser <keir.xen@gmail.com> wrote:
> On 24/11/2011 17:58, "Andres Lagar-Cavilla" <andres@lagarcavilla.org> wrote:
>> @@ -815,6 +815,12 @@ static inline unsigned long vtlb_lookup(
>> }
>> #endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */
>>
>> +static inline int __check_page_no_refs(struct page_info *page)
>> +{
>> +    unsigned long __count = page->count_info;
>> +    return ( (__count & PGC_count_mask) ==
>> +             ((__count & PGC_allocated) ? 1 : 0) );
>
> It's a fussy point, but it might be better to use
> atomic_read_ulong(&page->count_info) here, to ensure that the compiler reads
> the field once only. It's cheap (compiles to a single mov instruction), but
> atomic_read_ulong doesn't exist so you'd have to add its (obvious)
> definition to asm-x86/atomic.h

I think Tim suggested anyway to use

    (__count & (PGC_count_mask|PGC_allocated))

matched against (1|PGC_allocated) here, which would eliminate the multiple
read potential if used in a switch statement.

Also, for clarity the function should probably return bool_t.

Jan
Keir Fraser
2011-Nov-25 09:18 UTC
Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
On 25/11/2011 08:45, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 24.11.11 at 19:31, Keir Fraser <keir.xen@gmail.com> wrote:
>>
>> [...]
>>
>> It's a fussy point, but it might be better to use
>> atomic_read_ulong(&page->count_info) here, to ensure that the compiler
>> reads the field once only. It's cheap (compiles to a single mov
>> instruction), but atomic_read_ulong doesn't exist so you'd have to add
>> its (obvious) definition to asm-x86/atomic.h
>
> I think Tim suggested anyway to use
>
>     (__count & (PGC_count_mask|PGC_allocated))
>
> matched against (1|PGC_allocated) here, which would eliminate the multiple
> read potential if used in a switch statement.

Yes, that would be better.

 -- Keir

> Also, for clarity the function should probably return bool_t.
>
> Jan
Andres Lagar-Cavilla
2011-Nov-29 20:18 UTC
Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
Not sure about that. We need to check for either of two conditions on
page->count_info: that it's zero, or that it's PGC_allocated | 1.

    (__count & (PGC_count_mask|PGC_allocated))

matched against (1|PGC_allocated) would lose the zero case.

Resubmitting shortly.

Andres

On Nov 25, 2011, at 4:18 AM, Keir Fraser wrote:

> On 25/11/2011 08:45, "Jan Beulich" <JBeulich@suse.com> wrote:
>
>> I think Tim suggested anyway to use
>>
>>     (__count & (PGC_count_mask|PGC_allocated))
>>
>> matched against (1|PGC_allocated) here, which would eliminate the multiple
>> read potential if used in a switch statement.
>
> Yes, that would be better.
>
> -- Keir
>
>> Also, for clarity the function should probably return bool_t.
>>
>> Jan