This is v2 of the patch I posted last week, after feedback from Andres.

The first four patches are cleanups that should probably happen even
if the main waitq patch doesn't go in.

Patch 5 is the waitq patch, updated but largely following the same
approach. I'm not suggesting it be applied right now because I know
at least one path (get_two_gfns()) needs work before it's safe.

Patch 6 is a follow-up to avoid claiming a slot unnecessarily in
p2m_mem_paging_populate(); it's also a good idea even if the main
waitq patch doesn't happen but would need to be backported since it
only applies on top of those changes.

Tim.
Tim Deegan
2012-Feb-23 16:34 UTC
[PATCH 1 of 6] mm: guest_remove_page() should not populate or unshare
Tim Deegan
2012-Feb-23 16:34 UTC
[PATCH 3 of 6] x86/mm: make 'query type' argument to get_gfn into a set of flags
Tim Deegan
2012-Feb-23 16:34 UTC
[PATCH 5 of 6] [RFC] x86/mm: use wait queues for mem_paging
Tim Deegan
2012-Feb-23 16:34 UTC
[PATCH 6 of 6] x86/mm: Don't claim a slot on the paging ring if we might not need it
At 16:34 +0000 on 23 Feb (1330014862), Tim Deegan wrote:
> This is v2 of the patch I posted last week, after feedback from Andres.
>
> The first four patches are cleanups that should probably happen even
> if the main waitq patch doesn't go in.
>
> Patch 5 is the waitq patch, updated but largely following the same
> approach. I'm not suggesting it be applied right now because I know
> at least one path (get_two_gfns()) needs work before it's safe.
>
> Patch 6 is a follow-up to avoid claiming a slot unnecessarily in
> p2m_mem_paging_populate(); it's also a good idea even if the main
> waitq patch doesn't happen but would need to be backported since
> it only applies on top of those changes.

My apologies for these being attached rather than inline -- my hg email
runes are clearly rusty. :)

Tim
Andres Lagar-Cavilla
2012-Feb-23 16:45 UTC
Re: [PATCH 3 of 6] x86/mm: make 'query type' argument to get_gfn into a set of flags
On Feb 23, 2012, at 11:34 AM, Tim Deegan wrote:> # HG changeset patch > # User Tim Deegan <tim@xen.org> > # Date 1330013729 0 > # Node ID 00f61d0186a6b098b349b6f94c9a11c5a18dc99a > # Parent ec94098841bfae20101608f3daf7a1a01ff71c49 > x86/mm: make ''query type'' argument to get_gfn into a set of flags > > Having an enum for this won''t work if we want to add any orthogonal > options to it -- the existing code is only correct (after the removal of > p2m_guest in the previous patch) because there are no tests anywhere for > ''== p2m_alloc'', only for ''!= p2m_query'' and ''== p2m_unshare''. > > Replace it with a set of flags. > > Signed-off-by: Tim Deegan <tim@xen.org>P2M_UNSHARE flag should imply P2M_ALLOC. They make no sense separate, and code is simplified a bit. Otherwise: Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>> > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/hvm/emulate.c > --- a/xen/arch/x86/hvm/emulate.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/hvm/emulate.c Thu Feb 23 16:15:29 2012 +0000 > @@ -696,7 +696,7 @@ static int hvmemul_rep_movs( > > get_two_gfns(current->domain, sgpa >> PAGE_SHIFT, &sp2mt, NULL, NULL, > current->domain, dgpa >> PAGE_SHIFT, &dp2mt, NULL, NULL, > - p2m_alloc, &tg); > + P2M_ALLOC, &tg); > > if ( !p2m_is_ram(sp2mt) && !p2m_is_grant(sp2mt) ) > { > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/hvm/hvm.c > --- a/xen/arch/x86/hvm/hvm.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/hvm/hvm.c Thu Feb 23 16:15:29 2012 +0000 > @@ -1235,7 +1235,7 @@ int hvm_hap_nested_page_fault(unsigned l > > p2m = p2m_get_hostp2m(v->domain); > mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, > - access_w ? p2m_unshare : p2m_alloc, NULL); > + P2M_ALLOC | (access_w ? P2M_UNSHARE : 0), NULL); > > /* Check access permissions first, then handle faults */ > if ( mfn_x(mfn) != INVALID_MFN ) > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/hvm/svm/svm.c > --- a/xen/arch/x86/hvm/svm/svm.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/hvm/svm/svm.c Thu Feb 23 16:15:29 2012 +0000 > @@ -1265,7 +1265,7 @@ static void svm_do_nested_pgfault(struct > p2m = p2m_get_p2m(v); > _d.gpa = gpa; > _d.qualification = 0; > - mfn = get_gfn_type_access(p2m, gfn, &_d.p2mt, &p2ma, p2m_query, NULL); > + mfn = get_gfn_type_access(p2m, gfn, &_d.p2mt, &p2ma, 0, NULL); > __put_gfn(p2m, gfn); > _d.mfn = mfn_x(mfn); > > @@ -1287,7 +1287,7 @@ static void svm_do_nested_pgfault(struct > if ( p2m == NULL ) > p2m = p2m_get_p2m(v); > /* Everything else is an error. 
*/ > - mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, p2m_query, NULL); > + mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL); > __put_gfn(p2m, gfn); > gdprintk(XENLOG_ERR, > "SVM violation gpa %#"PRIpaddr", mfn %#lx, type %i\n", > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/guest_walk.c > --- a/xen/arch/x86/mm/guest_walk.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/guest_walk.c Thu Feb 23 16:15:29 2012 +0000 > @@ -98,7 +98,8 @@ static inline void *map_domain_gfn(struc > void *map; > > /* Translate the gfn, unsharing if shared */ > - *mfn = get_gfn_type_access(p2m, gfn_x(gfn), p2mt, &p2ma, p2m_unshare, NULL); > + *mfn = get_gfn_type_access(p2m, gfn_x(gfn), p2mt, &p2ma, > + P2M_ALLOC | P2M_UNSHARE, NULL); > if ( p2m_is_paging(*p2mt) ) > { > ASSERT(!p2m_is_nestedp2m(p2m)); > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/hap/guest_walk.c > --- a/xen/arch/x86/mm/hap/guest_walk.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/hap/guest_walk.c Thu Feb 23 16:15:29 2012 +0000 > @@ -60,7 +60,8 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA > > /* Get the top-level table''s MFN */ > top_gfn = cr3 >> PAGE_SHIFT; > - top_mfn = get_gfn_type_access(p2m, top_gfn, &p2mt, &p2ma, p2m_unshare, NULL); > + top_mfn = get_gfn_type_access(p2m, top_gfn, &p2mt, &p2ma, > + P2M_ALLOC | P2M_UNSHARE, NULL); > if ( p2m_is_paging(p2mt) ) > { > ASSERT(!p2m_is_nestedp2m(p2m)); > @@ -96,7 +97,8 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA > if ( missing == 0 ) > { > gfn_t gfn = guest_l1e_get_gfn(gw.l1e); > - (void)get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma, p2m_unshare, NULL); > + (void)get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma, > + P2M_ALLOC | P2M_UNSHARE, NULL); > if ( p2m_is_paging(p2mt) ) > { > ASSERT(!p2m_is_nestedp2m(p2m)); > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/hap/nested_hap.c > --- a/xen/arch/x86/mm/hap/nested_hap.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/hap/nested_hap.c Thu Feb 23 16:15:29 2012 +0000 > @@ -150,7 +150,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain > > /* walk L0 P2M table */ > mfn = get_gfn_type_access(p2m, L1_gpa >> PAGE_SHIFT, &p2mt, &p2ma, > - p2m_query, page_order); > + 0, page_order); > > rc = NESTEDHVM_PAGEFAULT_MMIO; > if ( p2m_is_mmio(p2mt) ) > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/mem_sharing.c > --- a/xen/arch/x86/mm/mem_sharing.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/mem_sharing.c Thu Feb 23 16:15:29 2012 +0000 > @@ -734,7 +734,7 @@ int mem_sharing_share_pages(struct domai > > get_two_gfns(sd, sgfn, &smfn_type, NULL, &smfn, > cd, cgfn, &cmfn_type, NULL, &cmfn, > - p2m_query, &tg); > + 0, &tg); > > /* This tricky business is to avoid two callers deadlocking if > * grabbing pages in opposite client/source order */ > @@ -849,7 +849,7 @@ int mem_sharing_add_to_physmap(struct do > > get_two_gfns(sd, sgfn, &smfn_type, NULL, &smfn, > cd, cgfn, &cmfn_type, &a, &cmfn, > - p2m_query, &tg); > + 0, &tg); > > /* Get the source shared page, check and lock */ > ret = XENMEM_SHARING_OP_S_HANDLE_INVALID; > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/p2m-ept.c > --- a/xen/arch/x86/mm/p2m-ept.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/p2m-ept.c Thu Feb 23 16:15:29 2012 +0000 > @@ -514,7 +514,7 @@ static mfn_t ept_get_entry(struct p2m_do > goto out; > else if ( ret == GUEST_TABLE_POD_PAGE ) > { > - if ( q == p2m_query ) > + if ( !(q & P2M_ALLOC) ) > { > *t = p2m_populate_on_demand; > goto out; > @@ -541,7 +541,7 @@ static mfn_t ept_get_entry(struct p2m_do > > if ( 
ept_entry->sa_p2mt == p2m_populate_on_demand ) > { > - if ( q == p2m_query ) > + if ( !(q & P2M_ALLOC) ) > { > *t = p2m_populate_on_demand; > goto out; > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/p2m-pod.c > --- a/xen/arch/x86/mm/p2m-pod.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/p2m-pod.c Thu Feb 23 16:15:29 2012 +0000 > @@ -529,7 +529,7 @@ p2m_pod_decrease_reservation(struct doma > p2m_access_t a; > p2m_type_t t; > > - (void)p2m->get_entry(p2m, gpfn + i, &t, &a, p2m_query, NULL); > + (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL); > > if ( t == p2m_populate_on_demand ) > pod++; > @@ -570,7 +570,7 @@ p2m_pod_decrease_reservation(struct doma > p2m_type_t t; > p2m_access_t a; > > - mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL); > if ( t == p2m_populate_on_demand ) > { > set_p2m_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid, p2m->default_access); > @@ -656,7 +656,7 @@ p2m_pod_zero_check_superpage(struct p2m_ > for ( i=0; i<SUPERPAGE_PAGES; i++ ) > { > p2m_access_t a; > - mfn = p2m->get_entry(p2m, gfn + i, &type, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL); > > if ( i == 0 ) > { > @@ -786,7 +786,7 @@ p2m_pod_zero_check(struct p2m_domain *p2 > for ( i=0; i<count; i++ ) > { > p2m_access_t a; > - mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, p2m_query, NULL); > + mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL); > /* If this is ram, and not a pagetable or from the xen heap, and probably not mapped > elsewhere, map it; otherwise, skip. */ > if ( p2m_is_ram(types[i]) > @@ -932,7 +932,7 @@ p2m_pod_emergency_sweep(struct p2m_domai > for ( i=p2m->pod.reclaim_single; i > 0 ; i-- ) > { > p2m_access_t a; > - (void)p2m->get_entry(p2m, i, &t, &a, p2m_query, NULL); > + (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL); > if ( p2m_is_ram(t) ) > { > gfns[j] = i; > @@ -1130,7 +1130,7 @@ guest_physmap_mark_populate_on_demand(st > for ( i = 0; i < (1UL << order); i++ ) > { > p2m_access_t a; > - omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, p2m_query, NULL); > + omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL); > if ( p2m_is_ram(ot) ) > { > printk("%s: gfn_to_mfn returned type %d!\n", > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/p2m-pt.c > --- a/xen/arch/x86/mm/p2m-pt.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/p2m-pt.c Thu Feb 23 16:15:29 2012 +0000 > @@ -513,7 +513,7 @@ pod_retry_l3: > (p2m_flags_to_type(l3e_get_flags(l3e)) == p2m_populate_on_demand) ) > { > /* The read has succeeded, so we know that mapping exists */ > - if ( q != p2m_query ) > + if ( q & P2M_ALLOC ) > { > if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_1G, q) ) > goto pod_retry_l3; > @@ -565,7 +565,7 @@ pod_retry_l2: > { > /* The read has succeeded, so we know that the mapping > * exits at this point. */ > - if ( q != p2m_query ) > + if ( q & P2M_ALLOC ) > { > if ( !p2m_pod_demand_populate(p2m, gfn, > PAGE_ORDER_2M, q) ) > @@ -623,7 +623,7 @@ pod_retry_l1: > { > /* The read has succeeded, so we know that the mapping > * exits at this point. 
*/ > - if ( q != p2m_query ) > + if ( q & P2M_ALLOC ) > { > if ( !p2m_pod_demand_populate(p2m, gfn, > PAGE_ORDER_4K, q) ) > @@ -714,7 +714,7 @@ pod_retry_l3: > { > if ( p2m_flags_to_type(l3e_get_flags(*l3e)) == p2m_populate_on_demand ) > { > - if ( q != p2m_query ) > + if ( q & P2M_ALLOC ) > { > if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_1G, q) ) > goto pod_retry_l3; > @@ -753,7 +753,7 @@ pod_retry_l2: > /* PoD: Try to populate a 2-meg chunk */ > if ( p2m_flags_to_type(l2e_get_flags(*l2e)) == p2m_populate_on_demand ) > { > - if ( q != p2m_query ) { > + if ( q & P2M_ALLOC ) { > if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_2M, q) ) > goto pod_retry_l2; > } else > @@ -786,7 +786,7 @@ pod_retry_l1: > /* PoD: Try to populate */ > if ( p2m_flags_to_type(l1e_get_flags(*l1e)) == p2m_populate_on_demand ) > { > - if ( q != p2m_query ) { > + if ( q & P2M_ALLOC ) { > if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_4K, q) ) > goto pod_retry_l1; > } else > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/p2m.c > --- a/xen/arch/x86/mm/p2m.c Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/p2m.c Thu Feb 23 16:15:29 2012 +0000 > @@ -167,7 +167,7 @@ mfn_t __get_gfn_type_access(struct p2m_d > mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order); > > #ifdef __x86_64__ > - if ( q == p2m_unshare && p2m_is_shared(*t) ) > + if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) ) > { > ASSERT(!p2m_is_nestedp2m(p2m)); > mem_sharing_unshare_page(p2m->domain, gfn, 0); > @@ -180,7 +180,7 @@ mfn_t __get_gfn_type_access(struct p2m_d > { > /* Return invalid_mfn to avoid caller''s access */ > mfn = _mfn(INVALID_MFN); > - if (q != p2m_query) > + if ( q & P2M_ALLOC ) > domain_crash(p2m->domain); > } > #endif > @@ -367,7 +367,7 @@ void p2m_teardown(struct p2m_domain *p2m > for ( gfn=0; gfn < p2m->max_mapped_pfn; gfn++ ) > { > p2m_access_t a; > - mfn = p2m->get_entry(p2m, gfn, &t, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL); > if ( mfn_valid(mfn) && (t == p2m_ram_shared) ) > { > ASSERT(!p2m_is_nestedp2m(p2m)); > @@ -437,7 +437,7 @@ p2m_remove_page(struct p2m_domain *p2m, > { > for ( i = 0; i < (1UL << page_order); i++ ) > { > - mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, p2m_query, NULL); > + mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL); > if ( !p2m_is_grant(t) && !p2m_is_shared(t) ) > set_gpfn_from_mfn(mfn+i, INVALID_M2P_ENTRY); > ASSERT( !p2m_is_valid(t) || mfn + i == mfn_x(mfn_return) ); > @@ -499,7 +499,7 @@ guest_physmap_add_entry(struct domain *d > /* First, remove m->p mappings for existing p->m mappings */ > for ( i = 0; i < (1UL << page_order); i++ ) > { > - omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, p2m_query, NULL); > + omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL); > #ifdef __x86_64__ > if ( p2m_is_shared(ot) ) > { > @@ -512,7 +512,7 @@ guest_physmap_add_entry(struct domain *d > p2m_unlock(p2m); > return rc; > } > - omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, p2m_query, NULL); > + omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL); > ASSERT(!p2m_is_shared(ot)); > } > #endif /* __x86_64__ */ > @@ -561,7 +561,7 @@ guest_physmap_add_entry(struct domain *d > * address */ > P2M_DEBUG("aliased! 
mfn=%#lx, old gfn=%#lx, new gfn=%#lx\n", > mfn + i, ogfn, gfn + i); > - omfn = p2m->get_entry(p2m, ogfn, &ot, &a, p2m_query, NULL); > + omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL); > if ( p2m_is_ram(ot) && !p2m_is_paged(ot) ) > { > ASSERT(mfn_valid(omfn)); > @@ -620,7 +620,7 @@ p2m_type_t p2m_change_type(struct domain > > gfn_lock(p2m, gfn, 0); > > - mfn = p2m->get_entry(p2m, gfn, &pt, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL); > if ( pt == ot ) > set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, nt, p2m->default_access); > > @@ -648,7 +648,7 @@ void p2m_change_type_range(struct domain > > for ( gfn = start; gfn < end; gfn++ ) > { > - mfn = p2m->get_entry(p2m, gfn, &pt, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL); > if ( pt == ot ) > set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, nt, p2m->default_access); > } > @@ -674,7 +674,7 @@ set_mmio_p2m_entry(struct domain *d, uns > return 0; > > gfn_lock(p2m, gfn, 0); > - omfn = p2m->get_entry(p2m, gfn, &ot, &a, p2m_query, NULL); > + omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL); > if ( p2m_is_grant(ot) ) > { > p2m_unlock(p2m); > @@ -710,7 +710,7 @@ clear_mmio_p2m_entry(struct domain *d, u > return 0; > > gfn_lock(p2m, gfn, 0); > - mfn = p2m->get_entry(p2m, gfn, &t, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL); > > /* Do not use mfn_valid() here as it will usually fail for MMIO pages. */ > if ( (INVALID_MFN == mfn_x(mfn)) || (t != p2m_mmio_direct) ) > @@ -741,7 +741,7 @@ set_shared_p2m_entry(struct domain *d, u > return 0; > > gfn_lock(p2m, gfn, 0); > - omfn = p2m->get_entry(p2m, gfn, &ot, &a, p2m_query, NULL); > + omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL); > /* At the moment we only allow p2m change if gfn has already been made > * sharable first */ > ASSERT(p2m_is_shared(ot)); > @@ -793,7 +793,7 @@ int p2m_mem_paging_nominate(struct domai > > gfn_lock(p2m, gfn, 0); > > - mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL); > > /* Check if mfn is valid */ > if ( !mfn_valid(mfn) ) > @@ -856,7 +856,7 @@ int p2m_mem_paging_evict(struct domain * > gfn_lock(p2m, gfn, 0); > > /* Get mfn */ > - mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL); > if ( unlikely(!mfn_valid(mfn)) ) > goto out; > > @@ -980,7 +980,7 @@ void p2m_mem_paging_populate(struct doma > > /* Fix p2m mapping */ > gfn_lock(p2m, gfn, 0); > - mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL); > /* Allow only nominated or evicted pages to enter page-in path */ > if ( p2mt == p2m_ram_paging_out || p2mt == p2m_ram_paged ) > { > @@ -1042,7 +1042,7 @@ int p2m_mem_paging_prep(struct domain *d > > gfn_lock(p2m, gfn, 0); > > - mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL); > > ret = -ENOENT; > /* Allow missing pages */ > @@ -1135,7 +1135,7 @@ void p2m_mem_paging_resume(struct domain > if ( !(rsp.flags & MEM_EVENT_FLAG_DROP_PAGE) ) > { > gfn_lock(p2m, rsp.gfn, 0); > - mfn = p2m->get_entry(p2m, rsp.gfn, &p2mt, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, rsp.gfn, &p2mt, &a, 0, NULL); > /* Allow only pages which were prepared properly, or pages which > * were nominated but not evicted */ > if ( mfn_valid(mfn) && (p2mt == p2m_ram_paging_in) ) > @@ -1168,7 +1168,7 @@ bool_t p2m_mem_access_check(unsigned lon > > /* First, handle rx2rw 
conversion automatically */ > gfn_lock(p2m, gfn, 0); > - mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL); > > if ( access_w && p2ma == p2m_access_rx2rw ) > { > @@ -1297,7 +1297,7 @@ int p2m_set_mem_access(struct domain *d, > p2m_lock(p2m); > for ( pfn = start_pfn; pfn < start_pfn + nr; pfn++ ) > { > - mfn = p2m->get_entry(p2m, pfn, &t, &_a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, pfn, &t, &_a, 0, NULL); > if ( p2m->set_entry(p2m, pfn, mfn, PAGE_ORDER_4K, t, a) == 0 ) > { > rc = -ENOMEM; > @@ -1338,7 +1338,7 @@ int p2m_get_mem_access(struct domain *d, > } > > gfn_lock(p2m, gfn, 0); > - mfn = p2m->get_entry(p2m, pfn, &t, &a, p2m_query, NULL); > + mfn = p2m->get_entry(p2m, pfn, &t, &a, 0, NULL); > gfn_unlock(p2m, gfn, 0); > > if ( mfn_x(mfn) == INVALID_MFN ) > @@ -1579,7 +1579,7 @@ void audit_p2m(struct domain *d, > continue; > } > > - p2mfn = get_gfn_type_access(p2m, gfn, &type, &p2ma, p2m_query, NULL); > + p2mfn = get_gfn_type_access(p2m, gfn, &type, &p2ma, 0, NULL); > if ( mfn_x(p2mfn) != mfn ) > { > mpbad++; > diff -r ec94098841bf -r 00f61d0186a6 xen/arch/x86/mm/shadow/types.h > --- a/xen/arch/x86/mm/shadow/types.h Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/arch/x86/mm/shadow/types.h Thu Feb 23 16:15:29 2012 +0000 > @@ -193,7 +193,7 @@ static inline shadow_l4e_t shadow_l4e_fr > > /* Override get_gfn to work with gfn_t */ > #undef get_gfn_query > -#define get_gfn_query(d, g, t) get_gfn_type((d), gfn_x(g), (t), p2m_query) > +#define get_gfn_query(d, g, t) get_gfn_type((d), gfn_x(g), (t), 0) > > /* The shadow types needed for the various levels. */ > > diff -r ec94098841bf -r 00f61d0186a6 xen/include/asm-x86/guest_pt.h > --- a/xen/include/asm-x86/guest_pt.h Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/include/asm-x86/guest_pt.h Thu Feb 23 16:15:29 2012 +0000 > @@ -53,7 +53,7 @@ gfn_to_paddr(gfn_t gfn) > > /* Override get_gfn to work with gfn_t */ > #undef get_gfn > -#define get_gfn(d, g, t) get_gfn_type((d), gfn_x(g), (t), p2m_alloc) > +#define get_gfn(d, g, t) get_gfn_type((d), gfn_x(g), (t), P2M_ALLOC) > > > /* Types of the guest''s page tables and access functions for them */ > diff -r ec94098841bf -r 00f61d0186a6 xen/include/asm-x86/p2m.h > --- a/xen/include/asm-x86/p2m.h Thu Feb 23 16:15:29 2012 +0000 > +++ b/xen/include/asm-x86/p2m.h Thu Feb 23 16:15:29 2012 +0000 > @@ -115,11 +115,9 @@ typedef enum { > } p2m_access_t; > > /* Modifiers to the query */ > -typedef enum { > - p2m_query, /* Do not populate a PoD entries */ > - p2m_alloc, /* Automatically populate PoD entries */ > - p2m_unshare, /* Break c-o-w sharing; implies alloc */ > -} p2m_query_t; > +typedef unsigned int p2m_query_t; > +#define P2M_ALLOC (1u<<0) /* Populate PoD and paged-out entries */ > +#define P2M_UNSHARE (1u<<1) /* Break CoW sharing */ > > /* We use bitmaps and maks to handle groups of types */ > #define p2m_to_mask(_t) (1UL << (_t)) > @@ -330,9 +328,10 @@ static inline mfn_t get_gfn_type(struct > * N.B. get_gfn_query() is the _only_ one guaranteed not to take the > * p2m lock; none of the others can be called with the p2m or paging > * lock held. 
*/ > -#define get_gfn(d, g, t) get_gfn_type((d), (g), (t), p2m_alloc) > -#define get_gfn_query(d, g, t) get_gfn_type((d), (g), (t), p2m_query) > -#define get_gfn_unshare(d, g, t) get_gfn_type((d), (g), (t), p2m_unshare) > +#define get_gfn(d, g, t) get_gfn_type((d), (g), (t), P2M_ALLOC) > +#define get_gfn_query(d, g, t) get_gfn_type((d), (g), (t), 0) > +#define get_gfn_unshare(d, g, t) get_gfn_type((d), (g), (t), \ > + P2M_ALLOC | P2M_UNSHARE) > > /* Compatibility function exporting the old untyped interface */ > static inline unsigned long get_gfn_untyped(struct domain *d, unsigned long gpfn) > @@ -364,8 +363,7 @@ static inline mfn_t get_gfn_query_unlock > p2m_type_t *t) > { > p2m_access_t a; > - return __get_gfn_type_access(p2m_get_hostp2m(d), gfn, t, &a, > - p2m_query, NULL, 0); > + return __get_gfn_type_access(p2m_get_hostp2m(d), gfn, t, &a, 0, NULL, 0); > } > > /* General conversion function from mfn to gfn */
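A minimal sketch of the suggestion above (P2M_UNSHARE implying P2M_ALLOC),
using only the definitions from the quoted hunk; the exact bit values are
illustrative, not what was eventually committed:

    /* Modifiers to the query -- illustrative sketch of the suggestion only. */
    typedef unsigned int p2m_query_t;
    #define P2M_ALLOC    (1u<<0)               /* Populate PoD and paged-out entries */
    #define P2M_UNSHARE  ((1u<<1) | P2M_ALLOC) /* Break CoW sharing; implies alloc */

Callers such as hvm_hap_nested_page_fault() in the quoted hunk could then
pass a single flag instead of building the combination by hand:

    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
                              access_w ? P2M_UNSHARE : P2M_ALLOC, NULL);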
Andres Lagar-Cavilla
2012-Feb-23 16:48 UTC
Re: [PATCH 6 of 6] x86/mm: Don't claim a slot on the paging ring if we might not need it
On Feb 23, 2012, at 11:34 AM, Tim Deegan wrote:

> # HG changeset patch
> # User Tim Deegan <tim@xen.org>
> # Date 1330013929 0
> # Node ID 516f96172e67ff463c371357e0ab93724c8b9048
> # Parent 510d80343793227bd39b9a5d4d3b61b9a3f776e4
> x86/mm: Don't claim a slot on the paging ring if we might not need it.
>
> This avoids the possibility of hitting a wait queue for a slot and then
> finding out later that it wasn't necessary.
>
> Signed-off-by: Tim Deegan <tim@xen.org>
>
> diff -r 510d80343793 -r 516f96172e67 xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c	Thu Feb 23 16:18:09 2012 +0000
> +++ b/xen/arch/x86/mm/p2m.c	Thu Feb 23 16:18:49 2012 +0000
> @@ -1001,19 +1001,23 @@ void p2m_mem_paging_populate(struct doma
>      p2m_access_t a;
>      mfn_t mfn;
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -    int send_request = 0;
> +    int rc, send_request = 0;
>
> -    /* We're paging. There should be a ring */
> -    int rc = mem_event_claim_slot(d, &d->mem_event->paging);
> -    if ( rc == -ENOSYS )
> +    if ( !d->mem_event->paging.ring_page )

There is a trivial call to do this in non open-coded fashion

bool_t mem_event_check_ring(struct mem_event_domain *med);

I don't use it frequently myself… otherwise

Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

>      {
>          gdprintk(XENLOG_ERR, "Domain %hu paging gfn %lx yet no ring "
>                               "in place\n", d->domain_id, gfn);
>          domain_crash(d);
>          return;
>      }
> -    else if ( rc < 0 )
> -        return;
> +
> +    /* Foreign users must grab a ring entry before doing any work since
> +     * they might not be able to get one later.  Local users shouldn't
> +     * grab a slot until they know they need it, to avoid going on a
> +     * wait-queue unnecessarily. */
> +    if ( d != current->domain )
> +        if ( mem_event_claim_slot(d, &d->mem_event->paging) < 0 )
> +            return;
>
>      /* Fix p2m mapping */
>      gfn_lock(p2m, gfn, 0);
> @@ -1042,10 +1046,18 @@ void p2m_mem_paging_populate(struct doma
>      if ( p2mt != p2m_ram_paging_out && p2mt != p2m_ram_paged )
>      {
>          /* gfn is already on its way back and vcpu is not paused */
> -        mem_event_cancel_slot(d, &d->mem_event->paging);
> +        if ( d != current->domain )
> +            mem_event_cancel_slot(d, &d->mem_event->paging);
>          return;
>      }
>
> +    /* Now local users should claim a slot. */
> +    if ( d == current->domain )
> +    {
> +        rc = mem_event_claim_slot(d, &d->mem_event->paging);
> +        ASSERT(rc == 0);
> +    }
> +
>      /* Send request to pager */
>      req.p2mt = p2mt;
>      req.vcpu_id = v->vcpu_id;
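A rough sketch of the non-open-coded check mentioned above, assuming
mem_event_check_ring() returns nonzero when the ring is set up (its exact
behaviour in the tree at this point is an assumption, not shown in the
thread):

    /* Hypothetical replacement for the open-coded ring_page test above;
     * assumes mem_event_check_ring() returns true iff the ring exists. */
    if ( !mem_event_check_ring(&d->mem_event->paging) )
    {
        gdprintk(XENLOG_ERR, "Domain %hu paging gfn %lx yet no ring "
                             "in place\n", d->domain_id, gfn);
        domain_crash(d);
        return;
    }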
Andres Lagar-Cavilla
2012-Feb-23 16:49 UTC
Re: [PATCH 0 of 6] [RFC] Use wait queues for paging, v2
On Feb 23, 2012, at 11:34 AM, Tim Deegan wrote:

> This is v2 of the patch I posted last week, after feedback from Andres.
>
> The first four patches are cleanups that should probably happen even
> if the main waitq patch doesn't go in.
>
> Patch 5 is the waitq patch, updated but largely following the same
> approach. I'm not suggesting it be applied right now because I know
> at least one path (get_two_gfns()) needs work before it's safe.
>
> Patch 6 is a follow-up to avoid claiming a slot unnecessarily in
> p2m_mem_paging_populate(); it's also a good idea even if the main
> waitq patch doesn't happen but would need to be backported since
> it only applies on top of those changes.

1, 2 and 4: Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

3 and 6 commented separately. Still looking at 5.

Andres

> Tim.
Olaf Hering
2012-Feb-24 13:45 UTC
Re: [PATCH 5 of 6] [RFC] x86/mm: use wait queues for mem_paging
On Thu, Feb 23, Tim Deegan wrote:
> This is based on an earlier RFC patch by Olaf Hering, but heavily
> simplified (removing a per-gfn queue of waiting vcpus in favour of
> a scan of all vcpus on page-in).

Tim, thanks for that work.  From just reading the change it looks ok.
A few comments below:

> +++ b/xen/arch/x86/mm/p2m.c	Thu Feb 23 16:18:09 2012 +0000
> @@ -160,13 +160,49 @@ mfn_t __get_gfn_type_access(struct p2m_d
>      }
>
>      /* For now only perform locking on hap domains */
> -    if ( locked && (hap_enabled(p2m->domain)) )
> +    locked = locked && hap_enabled(p2m->domain);
> +
> +#ifdef __x86_64__
> +again:
> +#endif
> +    if ( locked )
>          /* Grab the lock here, don't release until put_gfn */
>          gfn_lock(p2m, gfn, 0);
>
>      mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
>
>  #ifdef __x86_64__
> +    if ( p2m_is_paging(*t) && (q & P2M_ALLOC)
> +         && p2m->domain == current->domain )
> +    {
> +        if ( locked )
> +            gfn_unlock(p2m, gfn, 0);
> +
> +        /* Ping the pager */
> +        if ( *t == p2m_ram_paging_out || *t == p2m_ram_paged )
> +            p2m_mem_paging_populate(p2m->domain, gfn);
> +
> +        /* Wait until the pager finishes paging it in */
> +        current->arch.mem_paging_gfn = gfn;
> +        wait_event(current->arch.mem_paging_wq, ({
> +            int done;
> +            mfn = p2m->get_entry(p2m, gfn, t, a, 0, page_order);
> +            done = (*t != p2m_ram_paging_in);

I assume p2m_mem_paging_populate() will not return until the state is
forwarded to p2m_ram_paging_in.  Maybe p2m_is_paging(*t) would make it
more obvious what this check is supposed to do.

> +            /* Safety catch: it _should_ be safe to wait here
> +             * but if it's not, crash the VM, not the host */
> +            if ( in_atomic() )
> +            {
> +                WARN();
> +                domain_crash(p2m->domain);
> +                done = 1;
> +            }
> +            done;
> +        }));
> +        goto again;
> +    }
> +#endif
> +
> +#ifdef __x86_64__
>      if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) )
>      {
>          ASSERT(!p2m_is_nestedp2m(p2m));
> @@ -946,17 +982,17 @@ void p2m_mem_paging_drop_page(struct dom
>   * This function needs to be called whenever gfn_to_mfn() returns any of the p2m
>   * paging types because the gfn may not be backed by a mfn.
>   *
> - * The gfn can be in any of the paging states, but the pager needs only be
> - * notified when the gfn is in the paging-out path (paging_out or paged). This
> - * function may be called more than once from several vcpus. If the vcpu belongs
> - * to the guest, the vcpu must be stopped and the pager notified that the vcpu
> - * was stopped. The pager needs to handle several requests for the same gfn.
> + * The gfn can be in any of the paging states, but the pager needs only
> + * be notified when the gfn is in the paging-out path (paging_out or
> + * paged). This function may be called more than once from several
> + * vcpus. The pager needs to handle several requests for the same gfn.
>   *
> - * If the gfn is not in the paging-out path and the vcpu does not belong to the
> - * guest, nothing needs to be done and the function assumes that a request was
> - * already sent to the pager. In this case the caller has to try again until the
> - * gfn is fully paged in again.
> + * If the gfn is not in the paging-out path nothing needs to be done and
> + * the function assumes that a request was already sent to the pager.
> + * In this case the caller has to try again until the gfn is fully paged
> + * in again.
>   */
> +
>  void p2m_mem_paging_populate(struct domain *d, unsigned long gfn)
>  {
>      struct vcpu *v = current;
> @@ -965,6 +1001,7 @@ void p2m_mem_paging_populate(struct doma
>      p2m_access_t a;
>      mfn_t mfn;
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    int send_request = 0;

Is that variable supposed to be used?

Perhaps the feature to fast-forward (or rollback) from
p2m_ram_paging_out to p2m_ram_rw could be a separate patch.  My initial
version of this patch did not have a strict requirement for this
feature, if I remember correctly.

>      /* We're paging. There should be a ring */
>      int rc = mem_event_claim_slot(d, &d->mem_event->paging);
> @@ -987,19 +1024,22 @@ void p2m_mem_paging_populate(struct doma
>          /* Evict will fail now, tag this request for pager */
>          if ( p2mt == p2m_ram_paging_out )
>              req.flags |= MEM_EVENT_FLAG_EVICT_FAIL;
> -
> -        set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in, a);
> +        if ( p2mt == p2m_ram_paging_out && mfn_valid(mfn) && v->domain == d )
> +            /* Short-cut back to paged-in state (but not for foreign
> +             * mappings, or the pager couldn't map it to page it out) */
> +            set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
> +                          paging_mode_log_dirty(d)
> +                          ? p2m_ram_logdirty : p2m_ram_rw, a);
> +        else
> +        {
> +            set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in, a);
> +            send_request = 1;
> +        }
>      }
>      gfn_unlock(p2m, gfn, 0);
>
> -    /* Pause domain if request came from guest and gfn has paging type */
> -    if ( p2m_is_paging(p2mt) && v->domain == d )
> -    {
> -        vcpu_pause_nosync(v);
> -        req.flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
> -    }
>      /* No need to inform pager if the gfn is not in the page-out path */
> -    else if ( p2mt != p2m_ram_paging_out && p2mt != p2m_ram_paged )
> +    if ( p2mt != p2m_ram_paging_out && p2mt != p2m_ram_paged )
>      {
>          /* gfn is already on its way back and vcpu is not paused */
>          mem_event_cancel_slot(d, &d->mem_event->paging);
On Thu, Feb 23, Tim Deegan wrote:

> This is v2 of the patch I posted last week, after feedback from Andres.

Tried this series, but processes in dom0 started to hang in D state
when a paged guest is started. I will see if I can spot the error.

Olaf
On Sun, Feb 26, Olaf Hering wrote:

> On Thu, Feb 23, Tim Deegan wrote:
>
> > This is v2 of the patch I posted last week, after feedback from Andres.
>
> Tried this series, but processes in dom0 started to hang in D state
> when a paged guest is started. I will see if I can spot the error.

This change for patch #5 is needed, especially the first part.
Now it appears to work.

Olaf

# HG changeset patch
# Parent c3738598897f5239a72cabde676f5e86fd4c8241

diff -r c3738598897f xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -999,6 +999,10 @@ int hvm_vcpu_initialise(struct vcpu *v)
 
     v->arch.hvm_vcpu.inject_trap = -1;
 
+#ifdef CONFIG_X86_64
+    init_waitqueue_head(&v->arch.hvm_vcpu.mem_paging_wq);
+#endif
+
 #ifdef CONFIG_COMPAT
     rc = setup_compat_arg_xlat(v);
     if ( rc != 0 )
diff -r c3738598897f xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -183,8 +183,8 @@ again:
             p2m_mem_paging_populate(p2m->domain, gfn);
 
         /* Wait until the pager finishes paging it in */
-        current->arch.mem_paging_gfn = gfn;
-        wait_event(current->arch.mem_paging_wq, ({
+        current->arch.hvm_vcpu.mem_paging_gfn = gfn;
+        wait_event(current->arch.hvm_vcpu.mem_paging_wq, ({
             int done;
             mfn = p2m->get_entry(p2m, gfn, t, a, 0, page_order);
             done = (*t != p2m_ram_paging_in);
@@ -1190,8 +1190,8 @@ void p2m_mem_paging_resume(struct domain
         }
         /* Wake any vcpus that were waiting for this GFN */
         for_each_vcpu ( d, v )
-            if ( v->arch.mem_paging_gfn == rsp.gfn )
-                wake_up_all(&v->arch.mem_paging_wq);
+            if ( v->arch.hvm_vcpu.mem_paging_gfn == rsp.gfn )
+                wake_up_all(&v->arch.hvm_vcpu.mem_paging_wq);
     }
 }
 
diff -r c3738598897f xen/include/asm-x86/domain.h
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -494,12 +494,6 @@ struct arch_vcpu
 
     struct paging_vcpu paging;
 
-#ifdef CONFIG_X86_64
-    /* Mem-paging: this vcpu is waiting for a gfn to be paged in */
-    struct waitqueue_head mem_paging_wq;
-    unsigned long mem_paging_gfn;
-#endif
-
 #ifdef CONFIG_X86_32
     /* map_domain_page() mapping cache. */
     struct mapcache_vcpu mapcache;
diff -r c3738598897f xen/include/asm-x86/hvm/vcpu.h
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -170,6 +170,13 @@ struct hvm_vcpu {
     unsigned long inject_cr2;
 
     struct viridian_vcpu viridian;
+
+#ifdef CONFIG_X86_64
+    /* Mem-paging: this vcpu is waiting for a gfn to be paged in */
+    struct waitqueue_head mem_paging_wq;
+    unsigned long mem_paging_gfn;
+#endif
+
 };
 
 #endif /* __ASM_X86_HVM_VCPU_H__ */
Tim Deegan
2012-Feb-27 19:26 UTC
Re: [PATCH 5 of 6] [RFC] x86/mm: use wait queues for mem_paging
Hi,

At 14:45 +0100 on 24 Feb (1330094744), Olaf Hering wrote:
> >  #ifdef __x86_64__
> > +    if ( p2m_is_paging(*t) && (q & P2M_ALLOC)
> > +         && p2m->domain == current->domain )
> > +    {
> > +        if ( locked )
> > +            gfn_unlock(p2m, gfn, 0);
> > +
> > +        /* Ping the pager */
> > +        if ( *t == p2m_ram_paging_out || *t == p2m_ram_paged )
> > +            p2m_mem_paging_populate(p2m->domain, gfn);
> > +
> > +        /* Wait until the pager finishes paging it in */
> > +        current->arch.mem_paging_gfn = gfn;
> > +        wait_event(current->arch.mem_paging_wq, ({
> > +            int done;
> > +            mfn = p2m->get_entry(p2m, gfn, t, a, 0, page_order);
> > +            done = (*t != p2m_ram_paging_in);
>
> I assume p2m_mem_paging_populate() will not return until the state is
> forwarded to p2m_ram_paging_in. Maybe p2m_is_paging(*t) would make it
> more obvious what this check is supposed to do.

But it would be wrong.  If the type is anything other than
p2m_ram_paging_in, then we can't be sure that the pager is working on
unblocking us.  Andres made the same suggestion - clearly this code
needs a comment. :)

> > +            /* Safety catch: it _should_ be safe to wait here
> > +             * but if it's not, crash the VM, not the host */
> > +            if ( in_atomic() )
> > +            {
> > +                WARN();
> > +                domain_crash(p2m->domain);
> > +                done = 1;
> > +            }
> > +            done;
> > +        }));
> > +        goto again;
> > +    }
> > +#endif
> >  void p2m_mem_paging_populate(struct domain *d, unsigned long gfn)
> >  {
> >      struct vcpu *v = current;
> > @@ -965,6 +1001,7 @@ void p2m_mem_paging_populate(struct doma
> >      p2m_access_t a;
> >      mfn_t mfn;
> >      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > +    int send_request = 0;
>
> Is that variable supposed to be used?

Erk.  Clearly something got mangled in the rebase.  I'll sort that out.

> Perhaps the feature to fast-forward (or rollback) from
> p2m_ram_paging_out to p2m_ram_rw could be a separate patch. My initial
> version of this patch did not have a strict requirement for this
> feature, if I remember correctly.

Sure, I can split that into a separate patch.

Cheers,

Tim.
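Since the condition is said above to need a comment, here is one way that
comment could read, dropped into the hunk quoted in the review (the wording
is a suggestion only, not part of the posted patch):

    /* Wait until the pager finishes paging it in */
    current->arch.mem_paging_gfn = gfn;
    wait_event(current->arch.mem_paging_wq, ({
        int done;
        mfn = p2m->get_entry(p2m, gfn, t, a, 0, page_order);
        /* Only keep sleeping while the entry is p2m_ram_paging_in: that is
         * the one state in which we know the pager has seen a request and
         * will eventually wake us.  Any other type means nobody is
         * guaranteed to be working on this gfn, so stop waiting and retry
         * the lookup from the top (the "goto again" above). */
        done = (*t != p2m_ram_paging_in);
        /* (in_atomic() safety catch from the patch goes here) */
        done;
    }));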
Olaf Hering
2012-Feb-27 20:18 UTC
Re: [PATCH 5 of 6] [RFC] x86/mm: use wait queues for mem_paging
On Mon, Feb 27, Tim Deegan wrote:
> At 14:45 +0100 on 24 Feb (1330094744), Olaf Hering wrote:
> > >  #ifdef __x86_64__
> > > +    if ( p2m_is_paging(*t) && (q & P2M_ALLOC)
> > > +         && p2m->domain == current->domain )
> > > +    {
> > > +        if ( locked )
> > > +            gfn_unlock(p2m, gfn, 0);
> > > +
> > > +        /* Ping the pager */
> > > +        if ( *t == p2m_ram_paging_out || *t == p2m_ram_paged )
> > > +            p2m_mem_paging_populate(p2m->domain, gfn);
> > > +
> > > +        /* Wait until the pager finishes paging it in */
> > > +        current->arch.mem_paging_gfn = gfn;
> > > +        wait_event(current->arch.mem_paging_wq, ({
> > > +            int done;
> > > +            mfn = p2m->get_entry(p2m, gfn, t, a, 0, page_order);
> > > +            done = (*t != p2m_ram_paging_in);
> >
> > I assume p2m_mem_paging_populate() will not return until the state is
> > forwarded to p2m_ram_paging_in. Maybe p2m_is_paging(*t) would make it
> > more obvious what this check is supposed to do.
>
> But it would be wrong.  If the type is anything other than
> p2m_ram_paging_in, then we can't be sure that the pager is working on
> unblocking us.

Not really wrong; I think it depends on what will happen to the p2mt once
it leaves the paging state and once the condition in wait_event() is
executed again.

Now I see what it's supposed to mean: it enters wait_event() with
p2m_ram_paging_in and it is supposed to leave once the type changes to
something else. It works because the page-in path now has only one p2mt,
not two like a few weeks ago.

Olaf
Andres Lagar-Cavilla
2012-Feb-28 21:11 UTC
Re: [PATCH 0 of 6] [RFC] Use wait queues for paging, v2
> > On Feb 27, 2012, at 11:51 AM, Olaf Hering wrote: > >> On Sun, Feb 26, Olaf Hering wrote: >> >>> On Thu, Feb 23, Tim Deegan wrote: >>> >>>> This is v2 of the patch I posted last week, after feedback from >>>> Andres. >>> >>> Tried this series, but processes in dom0 started to hang in D state >>> when a paged guest is started. I will see if I can spot the error. >> >> This change for patch #5 is needed, especially the first part. >> Now it appears to work. >> >> OlafUnfortunately, I get all kinds of borking on Win7 domains with Citrix PV drivers 6.0. So it''s a no-go from my end for now. The pv drivers are calling memops on pages that are paged out, ka-boom. I applied Tim''s 6 patch series plus Olaf''s fix. Xen WARNs below. Andres (XEN) Xen WARN at p2m.c:200 (XEN) ----[ Xen-4.2-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 4 (XEN) RIP: e008:[<ffff82c4801e08e2>] __get_gfn_type_access+0x230/0x3ee (XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor (XEN) rax: 0000000000000001 rbx: 000000000000000b rcx: ffff83083ffaff18 (XEN) rdx: 00000043bfcac680 rsi: 0000000000000000 rdi: 000000183f775000 (XEN) rbp: ffff83083ffaf9a8 rsp: ffff83083ffaf928 r8: 0000000000000001 (XEN) r9: 0000000000000000 r10: 000000000000000c r11: fffff88002fe3b0e (XEN) r12: ffff83183f7a1bb0 r13: ffff83083ffaf9dc r14: 0000000000000000 (XEN) r15: 00000000000033c0 cr0: 000000008005003b cr4: 00000000000026f0 (XEN) cr3: 000000183f793000 cr2: 000000007724f4df (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Xen stack trace from rsp=ffff83083ffaf928: (XEN) ffff83083ffaff18 ffff83083ffaff18 0000000000000000 000000018012635a (XEN) ffff83083ffaf9d8 ffff83083ffaff18 ffff82c48030a760 ffff82c48030a760 (XEN) ffff82c48030a760 ffff82c48030a760 01ff82c4801264b3 0000000000000000 (XEN) ffff83183f7a3000 ffff83083ffafa28 0000000000000000 000000000183f7b9 (XEN) ffff83083ffafa08 ffff82c480170b91 ffff830000000001 ffff83183f7a3010 (XEN) 0000000000000000 ffff82c4801b3d94 0000000b00000007 ffff83183f7a3000 (XEN) fffff88002fe3b80 fffff88002fe3b80 ffff8300bf2ea000 ffff82c4801b5344 (XEN) ffff83083ffafa88 ffff82c480170e4b 8b13687800000000 7d4d94af0000009f (XEN) 0000000100007ff0 0000000000000000 00000000000033c0 ffff82c48015a3ff (XEN) 0000000000000202 ffff83183f7a3000 ffff82c4801265c1 0000000000000008 (XEN) 0000000000000007 fffff88002fe3b80 ffff8300bf2ea000 ffff82c4801b5344 (XEN) ffff83083ffafc78 ffff82c480114c72 0000000000000286 ffff83083ffafab8 (XEN) ffff82c4801265c1 ffff83183f6a19a0 ffff83083ffafae8 ffff82c48012daa1 (XEN) ffff83083ffafb18 0000000000000040 ffff83183f6a1990 ffff83183f7a3000 (XEN) ffff83083ffafb18 ffff82c48012dbc5 0000000000000040 ffff83083ffeffd0 (XEN) ffff83183f7a3000 ffff82c4803099c0 ffff83083ffafb68 ffff82c480105e63 (XEN) 0000000000000003 0000000000000000 ffff83183f6a19d0 ffff83083ffb6048 (XEN) ffff83183f7a1bb0 0000000000000001 0000000000000001 0000000000000001 (XEN) ffff83083ffafbf8 ffff82c4801e6f0d 00000000000000ff 0000000300000000 (XEN) ffff83083ffafbc8 ffff83083ffafbc0 0000000000000000 ffff83083ffafcd8 (XEN) Xen call trace: (XEN) [<ffff82c4801e08e2>] __get_gfn_type_access+0x230/0x3ee (XEN) [<ffff82c480170b91>] xenmem_add_to_physmap_once+0x610/0x773 (XEN) [<ffff82c480170e4b>] arch_memory_op+0x157/0xc7a (XEN) [<ffff82c480114c72>] do_memory_op+0x1d60/0x1dce (XEN) [<ffff82c4801b5409>] hvm_memory_op+0x62/0x6a (XEN) [<ffff82c4801b84f1>] hvm_do_hypercall+0x1af/0x2f5 (XEN) [<ffff82c4801d73a5>] vmx_vmexit_handler+0xef3/0x1849 (XEN) (XEN) domain_crash called from p2m.c:200 (XEN) Domain 2 (vcpu#0) crashed on cpu#4: (XEN) 
----[ Xen-4.2-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 4 (XEN) RIP: 0010:[<fffffa8002252185>] (XEN) RFLAGS: 0000000000000082 CONTEXT: hvm guest (XEN) rax: 000000000000000c rbx: 00000000000033c0 rcx: 0000000000000180 (XEN) rdx: 0000000000000007 rsi: fffff88002fe3b80 rdi: 0000000000000007 (XEN) rbp: 0000000000000001 rsp: fffff88002fe3b30 r8: fffff88002fe3b80 (XEN) r9: 00000000ffffffff r10: 0000000000000000 r11: fffff88002fe3b0e (XEN) r12: 0000000000000004 r13: fffffa800222d200 r14: fffff880010d7730 (XEN) r15: 0000000000000001 cr0: 0000000080050031 cr4: 00000000000006f8 (XEN) cr3: 0000000000187000 cr2: 000000007724f4df (XEN) ds: 002b es: 002b fs: 0053 gs: 002b ss: 0000 cs: 0010 (XEN) Xen WARN at p2m.c:200 (XEN) ----[ Xen-4.2-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 4 (XEN) RIP: e008:[<ffff82c4801e08e2>] __get_gfn_type_access+0x230/0x3ee (XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor (XEN) rax: 0000000000000001 rbx: 000000000000000b rcx: ffff83083ffaff18 (XEN) rdx: 00000043bfcac680 rsi: 0000000000000000 rdi: 000000183f775000 (XEN) rbp: ffff83083ffaf9a8 rsp: ffff83083ffaf928 r8: 0000000000000001 (XEN) r9: 0000000000000000 r10: 0000000000000002 r11: 0000000000000002 (XEN) r12: ffff83183f7a1bb0 r13: ffff83083ffaf9dc r14: 0000000000000000 (XEN) r15: 00000000000033c0 cr0: 000000008005003b cr4: 00000000000026f0 (XEN) cr3: 000000183f793000 cr2: 000000007724f4df (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Xen stack trace from rsp=ffff83083ffaf928: (XEN) ffff83083ffaff18 ffff83083ffaff18 0000000000000000 000000018012635a (XEN) ffff83083ffaf9d8 ffff83083ffaff18 ffff82c48030a760 ffff82c48030a760 (XEN) ffff82c48030a760 ffff82c48030a760 01ff82c4801264b3 0000000000000000 (XEN) ffff83183f7a3000 ffff83083ffafa28 0000000000000000 000000000183f7b9 (XEN) ffff83083ffafa08 ffff82c480170b91 ffff830000000001 ffff83183f7a3010 (XEN) 0000000000000000 ffff82c4801b3d94 0000000b00000007 ffff83183f7a3000 (XEN) fffff88002fe3b80 fffff88002fe3b80 ffff8300bf2ea000 ffff82c4801b5344 (XEN) ffff83083ffafa88 ffff82c480170e4b 8b13687800000000 7d4d94af0000009f (XEN) 0000000100007ff0 0000000000000000 00000000000033c0 ffff82c48015a3ff (XEN) 0000000000000202 ffff83183f7a3000 ffff82c4801265c1 0000000000000008 (XEN) 0000000000000007 fffff88002fe3b80 ffff8300bf2ea000 ffff82c4801b5344 (XEN) ffff83083ffafc78 ffff82c480114c72 0000000000000286 ffff83083ffafab8 (XEN) ffff82c4801265c1 ffff83183f6a19a0 ffff83083ffafae8 ffff82c48012daa1 (XEN) ffff83083ffafb18 0000000000000040 ffff83183f6a1990 ffff83183f7a3000 (XEN) ffff83083ffafb18 ffff82c48012dbc5 0000000000000040 ffff83083ffeffd0 (XEN) ffff83183f7a3000 ffff82c4803099c0 ffff83083ffafb68 ffff82c480105e63 (XEN) 0000000000000003 0000000000000000 ffff83183f6a19d0 ffff83083ffb6048 (XEN) ffff83183f7a1bb0 0000000000000001 0000000000000001 0000000000000001 (XEN) ffff83083ffafbf8 ffff82c4801e6f0d 00000000000000ff 0000000300000000 (XEN) ffff83083ffafbc8 ffff83083ffafbc0 0000000000000000 ffff83083ffafcd8 (XEN) Xen call trace: (XEN) [<ffff82c4801e08e2>] __get_gfn_type_access+0x230/0x3ee (XEN) [<ffff82c480170b91>] xenmem_add_to_physmap_once+0x610/0x773 (XEN) [<ffff82c480170e4b>] arch_memory_op+0x157/0xc7a (XEN) [<ffff82c480114c72>] do_memory_op+0x1d60/0x1dce (XEN) [<ffff82c4801b5409>] hvm_memory_op+0x62/0x6a (XEN) [<ffff82c4801b84f1>] hvm_do_hypercall+0x1af/0x2f5 (XEN) [<ffff82c4801d73a5>] vmx_vmexit_handler+0xef3/0x1849 (XEN) (XEN) domain_crash called from p2m.c:200 (XEN) Xen WARN at p2m.c:200 (XEN) ----[ Xen-4.2-unstable x86_64 debug=y Not tainted ]---- 
(XEN) CPU: 4 (XEN) RIP: e008:[<ffff82c4801e08e2>] __get_gfn_type_access+0x230/0x3ee (XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor (XEN) rax: 0000000000000001 rbx: 000000000000000b rcx: ffff83083ffaff18 (XEN) rdx: 00000043bfcac680 rsi: 0000000000000000 rdi: 000000183f775000 (XEN) rbp: ffff83083ffaf9a8 rsp: ffff83083ffaf928 r8: 0000000000000001 (XEN) r9: 0000000000000000 r10: 0000000000000003 r11: 0000000000000003 (XEN) r12: ffff83183f7a1bb0 r13: ffff83083ffaf9dc r14: 0000000000000000 (XEN) r15: 00000000000033c0 cr0: 000000008005003b cr4: 00000000000026f0 (XEN) cr3: 000000183f793000 cr2: 000000007724f4df (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Xen stack trace from rsp=ffff83083ffaf928: (XEN) ffff83083ffaff18 ffff83083ffaff18 0000000000000000 000000018012635a (XEN) ffff83083ffaf9d8 ffff83083ffaff18 ffff82c48030a760 ffff82c48030a760 (XEN) ffff82c48030a760 ffff82c48030a760 01ff82c4801264b3 0000000000000000 (XEN) ffff83183f7a3000 ffff83083ffafa28 0000000000000000 000000000183f7b9 (XEN) ffff83083ffafa08 ffff82c480170b91 ffff830000000001 ffff83183f7a3010 (XEN) 0000000000000000 ffff82c4801b3d94 0000000b00000007 ffff83183f7a3000 (XEN) fffff88002fe3b80 fffff88002fe3b80 ffff8300bf2ea000 ffff82c4801b5344 (XEN) ffff83083ffafa88 ffff82c480170e4b 8b13687800000000 7d4d94af0000009f (XEN) 0000000100007ff0 0000000000000000 00000000000033c0 ffff82c48015a3ff (XEN) 0000000000000202 ffff83183f7a3000 ffff82c4801265c1 0000000000000008 (XEN) 0000000000000007 fffff88002fe3b80 ffff8300bf2ea000 ffff82c4801b5344 (XEN) ffff83083ffafc78 ffff82c480114c72 0000000000000286 ffff83083ffafab8 (XEN) ffff82c4801265c1 ffff83183f6a19a0 ffff83083ffafae8 ffff82c48012daa1 (XEN) ffff83083ffafb18 0000000000000040 ffff83183f6a1990 ffff83183f7a3000 (XEN) ffff83083ffafb18 ffff82c48012dbc5 0000000000000040 ffff83083ffeffd0 (XEN) ffff83183f7a3000 ffff82c4803099c0 ffff83083ffafb68 ffff82c480105e63 (XEN) 0000000000000003 0000000000000000 ffff83183f6a19d0 ffff83083ffb6048 (XEN) ffff83183f7a1bb0 0000000000000001 0000000000000001 0000000000000001 (XEN) ffff83083ffafbf8 ffff82c4801e6f0d 00000000000000ff 0000000300000000 (XEN) ffff83083ffafbc8 ffff83083ffafbc0 0000000000000000 ffff83083ffafcd8 (XEN) Xen call trace: (XEN) [<ffff82c4801e08e2>] __get_gfn_type_access+0x230/0x3ee (XEN) [<ffff82c480170b91>] xenmem_add_to_physmap_once+0x610/0x773 (XEN) [<ffff82c480170e4b>] arch_memory_op+0x157/0xc7a (XEN) [<ffff82c480114c72>] do_memory_op+0x1d60/0x1dce (XEN) [<ffff82c4801b5409>] hvm_memory_op+0x62/0x6a (XEN) [<ffff82c4801b84f1>] hvm_do_hypercall+0x1af/0x2f5 (XEN) [<ffff82c4801d73a5>] vmx_vmexit_handler+0xef3/0x1849 (XEN) (XEN) domain_crash called from p2m.c:200>> >> >> # HG changeset patch >> # Parent c3738598897f5239a72cabde676f5e86fd4c8241 >> >> diff -r c3738598897f xen/arch/x86/hvm/hvm.c >> --- a/xen/arch/x86/hvm/hvm.c >> +++ b/xen/arch/x86/hvm/hvm.c >> @@ -999,6 +999,10 @@ int hvm_vcpu_initialise(struct vcpu *v) >> >> v->arch.hvm_vcpu.inject_trap = -1; >> >> +#ifdef CONFIG_X86_64 >> + init_waitqueue_head(&v->arch.hvm_vcpu.mem_paging_wq); >> +#endif >> + >> #ifdef CONFIG_COMPAT >> rc = setup_compat_arg_xlat(v); >> if ( rc != 0 ) >> diff -r c3738598897f xen/arch/x86/mm/p2m.c >> --- a/xen/arch/x86/mm/p2m.c >> +++ b/xen/arch/x86/mm/p2m.c >> @@ -183,8 +183,8 @@ again: >> p2m_mem_paging_populate(p2m->domain, gfn); >> >> /* Wait until the pager finishes paging it in */ >> - current->arch.mem_paging_gfn = gfn; >> - wait_event(current->arch.mem_paging_wq, ({ >> + current->arch.hvm_vcpu.mem_paging_gfn = gfn; >> + 
wait_event(current->arch.hvm_vcpu.mem_paging_wq, ({ >> int done; >> mfn = p2m->get_entry(p2m, gfn, t, a, 0, page_order); >> done = (*t != p2m_ram_paging_in); >> @@ -1190,8 +1190,8 @@ void p2m_mem_paging_resume(struct domain >> } >> /* Wake any vcpus that were waiting for this GFN */ >> for_each_vcpu ( d, v ) >> - if ( v->arch.mem_paging_gfn == rsp.gfn ) >> - wake_up_all(&v->arch.mem_paging_wq); >> + if ( v->arch.hvm_vcpu.mem_paging_gfn == rsp.gfn ) >> + wake_up_all(&v->arch.hvm_vcpu.mem_paging_wq); >> } >> } >> >> diff -r c3738598897f xen/include/asm-x86/domain.h >> --- a/xen/include/asm-x86/domain.h >> +++ b/xen/include/asm-x86/domain.h >> @@ -494,12 +494,6 @@ struct arch_vcpu >> >> struct paging_vcpu paging; >> >> -#ifdef CONFIG_X86_64 >> - /* Mem-paging: this vcpu is waiting for a gfn to be paged in */ >> - struct waitqueue_head mem_paging_wq; >> - unsigned long mem_paging_gfn; >> -#endif >> - >> #ifdef CONFIG_X86_32 >> /* map_domain_page() mapping cache. */ >> struct mapcache_vcpu mapcache; >> diff -r c3738598897f xen/include/asm-x86/hvm/vcpu.h >> --- a/xen/include/asm-x86/hvm/vcpu.h >> +++ b/xen/include/asm-x86/hvm/vcpu.h >> @@ -170,6 +170,13 @@ struct hvm_vcpu { >> unsigned long inject_cr2; >> >> struct viridian_vcpu viridian; >> + >> +#ifdef CONFIG_X86_64 >> + /* Mem-paging: this vcpu is waiting for a gfn to be paged in */ >> + struct waitqueue_head mem_paging_wq; >> + unsigned long mem_paging_gfn; >> +#endif >> + >> }; >> >> #endif /* __ASM_X86_HVM_VCPU_H__ */ > >
On Tue, Feb 28, Andres Lagar-Cavilla wrote:
> > On Feb 27, 2012, at 11:51 AM, Olaf Hering wrote:
> >
> >> On Sun, Feb 26, Olaf Hering wrote:
> >>
> >>> On Thu, Feb 23, Tim Deegan wrote:
> >>>
> >>>> This is v2 of the patch I posted last week, after feedback from
> >>>> Andres.
> >>>
> >>> Tried this series, but processes in dom0 started to hang in D state
> >>> when a paged guest is started. I will see if I can spot the error.
> >>
> >> This change for patch #5 is needed, especially the first part.
> >> Now it appears to work.
> >>
> >> Olaf
>
> Unfortunately, I get all kinds of borking on Win7 domains with Citrix PV
> drivers 6.0. So it's a no-go from my end for now.

This is the domain_lock() in xenmem_add_to_physmap_once().

Is get_gfn_untyped() correct, or would get_gfn_query() work as well in
this context?

Olaf
On Wed, Feb 29, Olaf Hering wrote:

> This is the domain_lock() in xenmem_add_to_physmap_once().
>
> Is get_gfn_untyped() correct, or would get_gfn_query() work as well in
> this context?

Another case is emulate_privileged_op(): in the "Write CR3" case
get_gfn_untyped() is called with domain_lock() held. Same for
XENMEM_remove_from_physmap.

These are the places I found by a quick grep for get_gfn_untyped,
get_gfn and get_gfn_unshare.

Olaf
At 20:56 +0100 on 29 Feb (1330548983), Olaf Hering wrote:
> On Wed, Feb 29, Olaf Hering wrote:
>
> > This is the domain_lock() in xenmem_add_to_physmap_once().
> >
> > Is get_gfn_untyped() correct, or would get_gfn_query() work as well in
> > this context?

get_gfn_untyped() is correct.  I'm not sure that we really need to take
the domain lock while we're doing it, though.  It might be that the new
gfn locks will be enough to serialize these updates.

> Another case is emulate_privileged_op(), in "Write CR3" case
> get_gfn_untyped() is called with domain_lock().

I think that should be OK as it only happens for PV guests.

> Same for XENMEM_remove_from_physmap.

That can almost certainly be done without the domain lock.

Sadly I haven't got time to fix these all up today.  I have committed
patches 1 to 4 of this series, and made some of the changes suggested to
the other two, but there seems little point in reposting them until we
can sort out the locking.

Tim.
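For reference, the distinction being drawn here comes from the patch 3
definitions quoted earlier in the thread; the last comment below is an
inference about why the populating variant is wanted, not something stated
in the thread, and it assumes get_gfn_untyped() wraps the populating
get_gfn() lookup:

    /* From the patch 3 hunks quoted earlier: */
    #define get_gfn(d, g, t)        get_gfn_type((d), (g), (t), P2M_ALLOC)
    #define get_gfn_query(d, g, t)  get_gfn_type((d), (g), (t), 0)
    /* add_to_physmap needs the frame to actually exist, so a populating
     * lookup is wanted; a query-style lookup could hand back INVALID_MFN
     * for a PoD or paged-out gfn even though the mapping is otherwise
     * fine (presumed rationale). */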
On Thu, Mar 15, Tim Deegan wrote:
> At 20:56 +0100 on 29 Feb (1330548983), Olaf Hering wrote:
> > On Wed, Feb 29, Olaf Hering wrote:
> >
> > > This is the domain_lock() in xenmem_add_to_physmap_once().
> > >
> > > Is get_gfn_untyped() correct, or would get_gfn_query() work as well in
> > > this context?
>
> get_gfn_untyped() is correct.  I'm not sure that we really need to take
> the domain lock while we're doing it, though.  It might be that the new
> gfn locks will be enough to serialize these updates.
>
> > Another case is emulate_privileged_op(), in "Write CR3" case
> > get_gfn_untyped() is called with domain_lock().
>
> I think that should be OK as it only happens for PV guests.

I think it's just me not knowing these things, but would it make sense
to add comments to XX_lock users describing what they protect (or used
to protect, since comments can go stale over time)?

Olaf
At 16:37 +0100 on 15 Mar (1331829432), Olaf Hering wrote:
> On Thu, Mar 15, Tim Deegan wrote:
>
> > At 20:56 +0100 on 29 Feb (1330548983), Olaf Hering wrote:
> > > On Wed, Feb 29, Olaf Hering wrote:
> > >
> > > > This is the domain_lock() in xenmem_add_to_physmap_once().
> > > >
> > > > Is get_gfn_untyped() correct, or would get_gfn_query() work as well in
> > > > this context?
> >
> > get_gfn_untyped() is correct.  I'm not sure that we really need to take
> > the domain lock while we're doing it, though.  It might be that the new
> > gfn locks will be enough to serialize these updates.
> >
> > > Another case is emulate_privileged_op(), in "Write CR3" case
> > > get_gfn_untyped() is called with domain_lock().
> >
> > I think that should be OK as it only happens for PV guests.
>
> I think it's just me not knowing these things, but would it make sense
> to add comments to XX_lock users describing what they protect (or used
> to protect, since comments can go stale over time)?

Yes!  I'm trying to get this sorted out for all the MM locks, as well as
the discipline of which order to take them in.  But the domain lock
(a.k.a. the Big Lock) is an ancient and venerable lock, now poorly
understood, at least by me. :)  The add_to_physmap path started taking
this lock in cset 9187:fbeb0a5b7219, in 2006.

Tim.
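As an illustration of the kind of ordering discipline mentioned above, a
purely hypothetical helper (names and levels made up for the example, not
Xen's actual mm-locks code): give each MM lock a level, and assert that
locks are only ever taken in increasing order.

    /* Hypothetical sketch: per-CPU record of the highest lock level held. */
    static DEFINE_PER_CPU(int, mm_lock_level);

    static inline void ordered_lock(spinlock_t *l, int level)
    {
        /* Taking a lower-level lock while holding a higher one is a bug. */
        ASSERT(this_cpu(mm_lock_level) < level);
        spin_lock(l);
        this_cpu(mm_lock_level) = level;
        /* The matching unlock would restore the previous level. */
    }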
Andres Lagar-Cavilla
2012-Mar-15 15:56 UTC
Re: [PATCH 0 of 6] [RFC] Use wait queues for paging, v2
> At 16:37 +0100 on 15 Mar (1331829432), Olaf Hering wrote:
>> On Thu, Mar 15, Tim Deegan wrote:
>>
>> > At 20:56 +0100 on 29 Feb (1330548983), Olaf Hering wrote:
>> > > On Wed, Feb 29, Olaf Hering wrote:
>> > >
>> > > > This is the domain_lock() in xenmem_add_to_physmap_once().
>> > > >
>> > > > Is get_gfn_untyped() correct, or would get_gfn_query() work as
>> > > > well in this context?
>> >
>> > get_gfn_untyped() is correct.  I'm not sure that we really need to
>> > take the domain lock while we're doing it, though.  It might be that
>> > the new gfn locks will be enough to serialize these updates.
>> >
>> > > Another case is emulate_privileged_op(), in "Write CR3" case
>> > > get_gfn_untyped() is called with domain_lock().
>> >
>> > I think that should be OK as it only happens for PV guests.
>>
>> I think it's just me not knowing these things, but would it make sense
>> to add comments to XX_lock users describing what they protect (or used
>> to protect, since comments can go stale over time)?
>
> Yes!  I'm trying to get this sorted out for all the MM locks, as well as
> the discipline of which order to take them in.  But the domain lock
> (a.k.a. the Big Lock) is an ancient and venerable lock, now poorly
> understood, at least by me. :)  The add_to_physmap path started taking
> this lock in cset 9187:fbeb0a5b7219, in 2006.

We've come a long way in terms of discipline for the mm.  But there is
still the odd interaction -- see recent posts about the deadlock between
the p2m and event channel locks.  And the big domain lock is surely a big
rug thrown over a lot of painful dirt ("metaphors, not Andres's forte").

Even more venerable and hideous is the domctl lock.  It's actually
somewhat surprising the switch to memops for paging/sharing worked
rather seamlessly.

Andres

> Tim.