Andres Lagar-Cavilla
2012-Jan-11 18:02 UTC
Re: [PATCH] mem_event: use wait queue when ring is full
> Date: Mon, 19 Dec 2011 12:39:46 +0100
> From: Olaf Hering <olaf@aepfle.de>
> To: xen-devel@lists.xensource.com
> Subject: [Xen-devel] [PATCH] mem_event: use wait queue when ring is full
> Message-ID: <d910d738f31b4ee1a6b0.1324294786@probook.site>
> Content-Type: text/plain; charset="us-ascii"
>
> # HG changeset patch
> # User Olaf Hering <olaf@aepfle.de>
> # Date 1324294734 -3600
> # Node ID d910d738f31b4ee1a6b0727ac0a66cff4c8933b0
> # Parent  03138a08366b895d79e143119d4c9c72833cdbcd
> mem_event: use wait queue when ring is full
>
> This change is based on an idea/patch from Adin Scannell.

Olaf,
thanks for the post. We'll have to nack this patch in its current form: it
hard reboots our host during our testing.

What we did is take this patch and amalgamate it with some bits from our
ring management approach. We're ready to submit that, along with a
description of how we test it. It works for us, and it involves wait
queues for corner cases.

Stay tuned, test, and hopefully, ack.
Thanks,
Andres

> If the ring is full, put the current vcpu to sleep if it belongs to the
> target domain. The wakeup happens in the p2m_*_resume functions. Wakeup
> will take the number of free slots into account.
>
> A request from a foreign domain has to succeed once a slot was claimed,
> because such vcpus can not sleep.
>
> This change also fixes a bug in p2m_mem_paging_drop_page(): up to now a
> full ring would lead to a harmless inconsistency in the pager.
>
> v7:
>  - add ring accounting so that each vcpu can place at least one request
>
> v6:
>  - take foreign requests into account before calling wake_up_nr()
>  - call wake_up_nr() outside of ring lock
>  - rename ->bit to ->pause_flag
>  - update comment in mem_event_enable()
>
> v5:
>  - rename mem_event_check_ring() to mem_event_claim_slot()
>  - rename mem_event_put_req_producers() to mem_event_release_slot()
>  - add local/foreign request accounting
>  - keep room for at least one guest request
>
> v4:
>  - fix off-by-one bug in _mem_event_put_request
>  - add mem_event_wake_requesters() and use wake_up_nr()
>  - rename mem_event_mark_and_pause() and mem_event_mark_and_pause()
>    functions
>  - req_producers counts foreign request producers, rename member
>
> v3:
>  - rename ->mem_event_bit to ->bit
>  - remove me_ from new VPF_ defines
>
> v2:
>  - p2m_mem_paging_populate: move vcpu_pause after put_request, otherwise
>    the vcpu will not wake_up after a wait_event because the pause_count
>    was increased twice. Fixes guest hangs.
>  - update free space check in _mem_event_put_request()
>  - simplify mem_event_put_request()
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
>
> diff -r 03138a08366b -r d910d738f31b xen/arch/x86/hvm/hvm.c
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4172,8 +4172,8 @@ static int hvm_memory_event_traps(long p
>      if ( (p & HVMPME_onchangeonly) && (value == old) )
>          return 1;
>
> -    rc = mem_event_check_ring(d, &d->mem_event->access);
> -    if ( rc )
> +    rc = mem_event_claim_slot(d, &d->mem_event->access);
> +    if ( rc < 0 )
>          return rc;
>
>      memset(&req, 0, sizeof(req));
> diff -r 03138a08366b -r d910d738f31b xen/arch/x86/mm/mem_event.c
> --- a/xen/arch/x86/mm/mem_event.c
> +++ b/xen/arch/x86/mm/mem_event.c
> @@ -23,6 +23,7 @@
>
>  #include <asm/domain.h>
>  #include <xen/event.h>
> +#include <xen/wait.h>
>  #include <asm/p2m.h>
>  #include <asm/mem_event.h>
>  #include <asm/mem_paging.h>
> @@ -41,6 +42,7 @@ static int mem_event_enable(
>      struct domain *d,
>      xen_domctl_mem_event_op_t *mec,
>      struct mem_event_domain *med,
> +    int pause_flag,
>      xen_event_channel_notification_t notification_fn)
>  {
>      int rc;
> @@ -110,8 +112,25 @@ static int mem_event_enable(
>
>      mem_event_ring_lock_init(med);
>
> -    /* Wake any VCPUs paused for memory events */
> -    mem_event_unpause_vcpus(d);
> +    med->pause_flag = pause_flag;
> +
> +    /*
> +     * Configure ring accounting:
> +     * Each guest vcpu should be able to place at least one request.
> +     * If there are more vcpus than available slots in the ring, not all
> +     * vcpus can place requests in the ring anyway. A minimum (arbitrary)
> +     * number of foreign requests will be allowed in this case.
> +     */
> +    if ( d->max_vcpus < RING_SIZE(&med->front_ring) )
> +        med->max_foreign = RING_SIZE(&med->front_ring) - d->max_vcpus;
> +    if ( med->max_foreign < 13 )
> +        med->max_foreign = 13;
> +    med->max_target = RING_SIZE(&med->front_ring) - med->max_foreign;
> +
> +    init_waitqueue_head(&med->wq);
> +
> +    /* Wake any VCPUs waiting for the ring to appear */
> +    mem_event_wake_waiters(d, med);
>
>      return 0;
>
> @@ -127,6 +146,9 @@ static int mem_event_enable(
>
>  static int mem_event_disable(struct mem_event_domain *med)
>  {
> +    if (!list_empty(&med->wq.list))
> +        return -EBUSY;
> +
>      unmap_domain_page(med->ring_page);
>      med->ring_page = NULL;
>
> @@ -136,14 +158,30 @@ static int mem_event_disable(struct mem_
>      return 0;
>  }
>
> -void mem_event_put_request(struct domain *d, struct mem_event_domain *med, mem_event_request_t *req)
> +static int _mem_event_put_request(struct domain *d,
> +                                  struct mem_event_domain *med,
> +                                  mem_event_request_t *req,
> +                                  int *done)
>  {
>      mem_event_front_ring_t *front_ring;
> +    int free_req, claimed_req, ret;
>      RING_IDX req_prod;
>
> +    if ( *done )
> +        return 1;
> +
>      mem_event_ring_lock(med);
>
>      front_ring = &med->front_ring;
> +
> +    /* Foreign requests must succeed because their vcpus can not sleep */
> +    claimed_req = med->foreign_producers;
> +    free_req = RING_FREE_REQUESTS(front_ring);
> +    if ( !free_req || ( current->domain == d && free_req <= claimed_req ) ) {
> +        mem_event_ring_unlock(med);
> +        return 0;
> +    }
> +
>      req_prod = front_ring->req_prod_pvt;
>
>      if ( current->domain != d )
> @@ -156,14 +194,45 @@ void mem_event_put_request(struct domain
>      memcpy(RING_GET_REQUEST(front_ring, req_prod), req, sizeof(*req));
>      req_prod++;
>
> +    ret = 1;
> +    *done = 1;
> +    free_req--;
> +
> +    /* Update accounting */
> +    if ( current->domain == d )
> +    {
> +        med->target_producers--;
> +        /* Ring is full, no more requests from this vcpu, go to sleep */
> +        if ( free_req < med->max_target )
> +            ret = 0;
> +    }
> +    else
> +        med->foreign_producers--;
> +
>      /* Update ring */
> -    med->req_producers--;
>      front_ring->req_prod_pvt = req_prod;
>      RING_PUSH_REQUESTS(front_ring);
>
>      mem_event_ring_unlock(med);
>
>      notify_via_xen_event_channel(d, med->xen_port);
> +
> +    return ret;
> +}
> +
> +void mem_event_put_request(struct domain *d, struct mem_event_domain *med,
> +                           mem_event_request_t *req)
> +{
> +    int done = 0;
> +    /* Go to sleep if request came from guest */
> +    if (current->domain == d) {
> +        wait_event(med->wq, _mem_event_put_request(d, med, req, &done));
> +        return;
> +    }
> +    /* Ring was full anyway, unable to sleep in non-guest context */
> +    if (!_mem_event_put_request(d, med, req, &done))
> +        printk("Failed to put memreq: d %u t %x f %x gfn %lx\n", d->domain_id,
> +               req->type, req->flags, (unsigned long)req->gfn);
>  }
>
>  int mem_event_get_response(struct mem_event_domain *med, mem_event_response_t *rsp)
> @@ -195,32 +264,97 @@ int mem_event_get_response(struct mem_ev
>      return 1;
>  }
>
> -void mem_event_unpause_vcpus(struct domain *d)
> +/**
> + * mem_event_wake_requesters - Wake vcpus waiting for room in the ring
> + * @d: guest domain
> + * @med: mem_event ring
> + *
> + * mem_event_wake_requesters() will wakeup vcpus waiting for room in the
> + * ring. Only as many as can place another request in the ring will
> + * resume execution.
> + */
> +void mem_event_wake_requesters(struct mem_event_domain *med)
> +{
> +    int free_req;
> +
> +    mem_event_ring_lock(med);
> +    free_req = RING_FREE_REQUESTS(&med->front_ring);
> +    free_req -= med->foreign_producers;
> +    mem_event_ring_unlock(med);
> +
> +    if ( free_req )
> +        wake_up_nr(&med->wq, free_req);
> +}
> +
> +/**
> + * mem_event_wake_waiters - Wake all vcpus waiting for the ring
> + * @d: guest domain
> + * @med: mem_event ring
> + *
> + * mem_event_wake_waiters() will wakeup all vcpus waiting for the ring to
> + * become available.
> + */
> +void mem_event_wake_waiters(struct domain *d, struct mem_event_domain *med)
>  {
>      struct vcpu *v;
>
>      for_each_vcpu ( d, v )
> -        if ( test_and_clear_bit(_VPF_mem_event, &v->pause_flags) )
> +        if ( test_and_clear_bit(med->pause_flag, &v->pause_flags) )
>              vcpu_wake(v);
>  }
>
> -void mem_event_mark_and_pause(struct vcpu *v)
> +/**
> + * mem_event_mark_and_sleep - Put vcpu to sleep
> + * @v: guest vcpu
> + * @med: mem_event ring
> + *
> + * mem_event_mark_and_sleep() tags vcpu and put it to sleep.
> + * The vcpu will resume execution in mem_event_wake_waiters().
> + */
> +void mem_event_mark_and_sleep(struct vcpu *v, struct mem_event_domain *med)
>  {
> -    set_bit(_VPF_mem_event, &v->pause_flags);
> +    set_bit(med->pause_flag, &v->pause_flags);
>      vcpu_sleep_nosync(v);
>  }
>
> -void mem_event_put_req_producers(struct mem_event_domain *med)
> +/**
> + * mem_event_release_slot - Release a claimed slot
> + * @med: mem_event ring
> + *
> + * mem_event_release_slot() releases a claimed slot in the mem_event ring.
> + */
> +void mem_event_release_slot(struct domain *d, struct mem_event_domain *med)
>  {
>      mem_event_ring_lock(med);
> -    med->req_producers--;
> +    if ( current->domain == d )
> +        med->target_producers--;
> +    else
> +        med->foreign_producers--;
>      mem_event_ring_unlock(med);
>  }
>
> -int mem_event_check_ring(struct domain *d, struct mem_event_domain *med)
> +/**
> + * mem_event_claim_slot - Check state of a mem_event ring
> + * @d: guest domain
> + * @med: mem_event ring
> + *
> + * Return codes: < 0: the ring is not yet configured
> + *                 0: the ring has some room
> + *               > 0: the ring is full
> + *
> + * mem_event_claim_slot() checks the state of the given mem_event ring.
> + * If the current vcpu belongs to the guest domain, the function assumes that
> + * mem_event_put_request() will sleep until the ring has room again.
> + * A guest can always place at least one request.
> + *
> + * If the current vcpu does not belong to the target domain the caller must try
> + * again until there is room. A slot is claimed and the caller can place a
> + * request. If the caller does not need to send a request, the claimed slot has
> + * to be released with mem_event_release_slot().
> + */
> +int mem_event_claim_slot(struct domain *d, struct mem_event_domain *med)
>  {
> -    struct vcpu *curr = current;
> -    int free_requests;
> +    int free_req;
>      int ring_full = 1;
>
>      if ( !med->ring_page )
> @@ -228,15 +362,19 @@ int mem_event_check_ring(struct domain *
>
>      mem_event_ring_lock(med);
>
> -    free_requests = RING_FREE_REQUESTS(&med->front_ring);
> -    if ( med->req_producers < free_requests )
> +    free_req = RING_FREE_REQUESTS(&med->front_ring);
> +
> +    if ( current->domain == d )
>      {
> -        med->req_producers++;
> +        med->target_producers++;
>          ring_full = 0;
>      }
> -
> -    if ( ring_full && (curr->domain == d) )
> -        mem_event_mark_and_pause(curr);
> +    else if ( med->foreign_producers + med->target_producers < free_req &&
> +              med->foreign_producers < med->max_foreign )
> +    {
> +        med->foreign_producers++;
> +        ring_full = 0;
> +    }
>
>      mem_event_ring_unlock(med);
>
> @@ -316,7 +454,7 @@ int mem_event_domctl(struct domain *d, x
>          if ( p2m->pod.entry_count )
>              break;
>
> -        rc = mem_event_enable(d, mec, med, mem_paging_notification);
> +        rc = mem_event_enable(d, mec, med, _VPF_mem_paging, mem_paging_notification);
>      }
>      break;
>
> @@ -355,7 +493,7 @@ int mem_event_domctl(struct domain *d, x
>          if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
>              break;
>
> -        rc = mem_event_enable(d, mec, med, mem_access_notification);
> +        rc = mem_event_enable(d, mec, med, _VPF_mem_access, mem_access_notification);
>      }
>      break;
>
> diff -r 03138a08366b -r d910d738f31b xen/arch/x86/mm/mem_sharing.c
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -253,18 +253,10 @@ static void mem_sharing_audit(void)
>  #endif
>
>
> -static struct page_info* mem_sharing_alloc_page(struct domain *d,
> -                                                unsigned long gfn)
> +static void mem_sharing_notify_helper(struct domain *d, unsigned long gfn)
>  {
> -    struct page_info* page;
>      struct vcpu *v = current;
> -    mem_event_request_t req;
> -
> -    page = alloc_domheap_page(d, 0);
> -    if(page != NULL) return page;
> -
> -    memset(&req, 0, sizeof(req));
> -    req.type = MEM_EVENT_TYPE_SHARED;
> +    mem_event_request_t req = { .type = MEM_EVENT_TYPE_SHARED };
>
>      if ( v->domain != d )
>      {
> @@ -275,20 +267,18 @@ static struct page_info* mem_sharing_all
>          gdprintk(XENLOG_ERR,
>                   "Failed alloc on unshare path for foreign (%d) lookup\n",
>                   d->domain_id);
> -        return page;
> +        return;
>      }
>
> -    vcpu_pause_nosync(v);
> -    req.flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
> +    if (mem_event_claim_slot(d, &d->mem_event->share) < 0)
> +        return;
>
> -    if(mem_event_check_ring(d, &d->mem_event->share)) return page;
> -
> +    req.flags = MEM_EVENT_FLAG_VCPU_PAUSED;
>      req.gfn = gfn;
>      req.p2mt = p2m_ram_shared;
>      req.vcpu_id = v->vcpu_id;
>      mem_event_put_request(d, &d->mem_event->share, &req);
> -
> -    return page;
> +    vcpu_pause_nosync(v);
>  }
>
>  unsigned int mem_sharing_get_nr_saved_mfns(void)
> @@ -656,7 +646,7 @@ gfn_found:
>      if(ret == 0) goto private_page_found;
>
>      old_page = page;
> -    page = mem_sharing_alloc_page(d, gfn);
> +    page = alloc_domheap_page(d, 0);
>      if(!page)
>      {
>          /* We've failed to obtain memory for private page. Need to re-add the
> @@ -664,6 +654,7 @@ gfn_found:
>          list_add(&gfn_info->list, &hash_entry->gfns);
>          put_gfn(d, gfn);
>          shr_unlock();
> +        mem_sharing_notify_helper(d, gfn);
>          return -ENOMEM;
>      }
>
> diff -r 03138a08366b -r d910d738f31b xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -861,20 +861,12 @@ int p2m_mem_paging_evict(struct domain *
>   */
>  void p2m_mem_paging_drop_page(struct domain *d, unsigned long gfn)
>  {
> -    struct vcpu *v = current;
> -    mem_event_request_t req;
> +    mem_event_request_t req = { .type = MEM_EVENT_TYPE_PAGING, .gfn = gfn };
>
> -    /* Check that there's space on the ring for this request */
> -    if ( mem_event_check_ring(d, &d->mem_event->paging) == 0)
> -    {
> -        /* Send release notification to pager */
> -        memset(&req, 0, sizeof(req));
> -        req.flags |= MEM_EVENT_FLAG_DROP_PAGE;
> -        req.gfn = gfn;
> -        req.vcpu_id = v->vcpu_id;
> +    /* Send release notification to pager */
> +    req.flags = MEM_EVENT_FLAG_DROP_PAGE;
>
> -        mem_event_put_request(d, &d->mem_event->paging, &req);
> -    }
> +    mem_event_put_request(d, &d->mem_event->paging, &req);
>  }
>
>  /**
> @@ -908,7 +900,7 @@ void p2m_mem_paging_populate(struct doma
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>
>      /* Check that there's space on the ring for this request */
> -    if ( mem_event_check_ring(d, &d->mem_event->paging) )
> +    if ( mem_event_claim_slot(d, &d->mem_event->paging) )
>          return;
>
>      memset(&req, 0, sizeof(req));
> @@ -938,7 +930,7 @@ void p2m_mem_paging_populate(struct doma
>      else if ( p2mt != p2m_ram_paging_out && p2mt != p2m_ram_paged )
>      {
>          /* gfn is already on its way back and vcpu is not paused */
> -        mem_event_put_req_producers(&d->mem_event->paging);
> +        mem_event_release_slot(d, &d->mem_event->paging);
>          return;
>      }
>
> @@ -1080,8 +1072,8 @@ void p2m_mem_paging_resume(struct domain
>              vcpu_unpause(d->vcpu[rsp.vcpu_id]);
>      }
>
> -    /* Unpause any domains that were paused because the ring was full */
> -    mem_event_unpause_vcpus(d);
> +    /* Wake vcpus waiting for room in the ring */
> +    mem_event_wake_requesters(&d->mem_event->paging);
>  }
>
>  bool_t p2m_mem_access_check(unsigned long gpa, bool_t gla_valid, unsigned long gla,
> @@ -1115,7 +1107,7 @@ bool_t p2m_mem_access_check(unsigned lon
>      p2m_unlock(p2m);
>
>      /* Otherwise, check if there is a memory event listener, and send the message along */
> -    res = mem_event_check_ring(d, &d->mem_event->access);
> +    res = mem_event_claim_slot(d, &d->mem_event->access);
>      if ( res < 0 )
>      {
>          /* No listener */
> @@ -1125,7 +1117,7 @@ bool_t p2m_mem_access_check(unsigned lon
>                       "Memory access permissions failure, no mem_event listener: pausing VCPU %d, dom %d\n",
>                       v->vcpu_id, d->domain_id);
>
> -        mem_event_mark_and_pause(v);
> +        mem_event_mark_and_sleep(v, &d->mem_event->access);
>      }
>      else
>      {
> @@ -1186,9 +1178,11 @@ void p2m_mem_access_resume(struct domain
>              vcpu_unpause(d->vcpu[rsp.vcpu_id]);
>      }
>
> -    /* Unpause any domains that were paused because the ring was full or no listener
> -     * was available */
> -    mem_event_unpause_vcpus(d);
> +    /* Wake vcpus waiting for room in the ring */
> +    mem_event_wake_requesters(&d->mem_event->access);
> +
> +    /* Unpause all vcpus that were paused because no listener was available */
> +    mem_event_wake_waiters(d, &d->mem_event->access);
>  }
>
>
> diff -r 03138a08366b -r d910d738f31b xen/include/asm-x86/mem_event.h
> --- a/xen/include/asm-x86/mem_event.h
> +++ b/xen/include/asm-x86/mem_event.h
> @@ -24,13 +24,13 @@
>  #ifndef __MEM_EVENT_H__
>  #define __MEM_EVENT_H__
>
> -/* Pauses VCPU while marking pause flag for mem event */
> -void mem_event_mark_and_pause(struct vcpu *v);
> -int mem_event_check_ring(struct domain *d, struct mem_event_domain *med);
> -void mem_event_put_req_producers(struct mem_event_domain *med);
> +int mem_event_claim_slot(struct domain *d, struct mem_event_domain *med);
> +void mem_event_release_slot(struct domain *d, struct mem_event_domain *med);
>  void mem_event_put_request(struct domain *d, struct mem_event_domain *med, mem_event_request_t *req);
>  int mem_event_get_response(struct mem_event_domain *med, mem_event_response_t *rsp);
> -void mem_event_unpause_vcpus(struct domain *d);
> +void mem_event_wake_requesters(struct mem_event_domain *med);
> +void mem_event_wake_waiters(struct domain *d, struct mem_event_domain *med);
> +void mem_event_mark_and_sleep(struct vcpu *v, struct mem_event_domain *med);
>
>  int mem_event_domctl(struct domain *d, xen_domctl_mem_event_op_t *mec,
>                       XEN_GUEST_HANDLE(void) u_domctl);
> diff -r 03138a08366b -r d910d738f31b xen/include/xen/sched.h
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -14,6 +14,7 @@
>  #include <xen/nodemask.h>
>  #include <xen/radix-tree.h>
>  #include <xen/multicall.h>
> +#include <xen/wait.h>
>  #include <public/xen.h>
>  #include <public/domctl.h>
>  #include <public/sysctl.h>
> @@ -183,7 +184,11 @@ struct mem_event_domain
>  {
>      /* ring lock */
>      spinlock_t ring_lock;
> -    unsigned int req_producers;
> +    /* The ring has 64 entries */
> +    unsigned char foreign_producers;
> +    unsigned char max_foreign;
> +    unsigned char target_producers;
> +    unsigned char max_target;
>      /* shared page */
>      mem_event_shared_page_t *shared_page;
>      /* shared ring page */
> @@ -192,6 +197,10 @@
>      mem_event_front_ring_t front_ring;
>      /* event channel port (vcpu0 only) */
>      int xen_port;
> +    /* mem_event bit for vcpu->pause_flags */
> +    int pause_flag;
> +    /* list of vcpus waiting for room in the ring */
> +    struct waitqueue_head wq;
>  };
>
>  struct mem_event_per_domain
> @@ -615,9 +624,12 @@ static inline struct domain *next_domain
>  /* VCPU affinity has changed: migrating to a new CPU. */
>  #define _VPF_migrating       3
>  #define VPF_migrating        (1UL<<_VPF_migrating)
> - /* VCPU is blocked on memory-event ring. */
> -#define _VPF_mem_event       4
> -#define VPF_mem_event        (1UL<<_VPF_mem_event)
> + /* VCPU is blocked due to missing mem_paging ring. */
> +#define _VPF_mem_paging      4
> +#define VPF_mem_paging       (1UL<<_VPF_mem_paging)
> + /* VCPU is blocked due to missing mem_access ring. */
> +#define _VPF_mem_access      5
> +#define VPF_mem_access       (1UL<<_VPF_mem_access)
>
>  static inline int vcpu_runnable(struct vcpu *v)
>  {
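For readers following along, the core pattern the patch implements -- guest
vcpus sleeping on a wait queue while the ring is full, and being woken as the
consumer frees slots -- can be sketched as a user-space analogue, with a
pthread condition variable standing in for Xen's wait queues. Everything
below is an illustrative stand-in, not hypervisor code:

    #include <pthread.h>
    #include <stdio.h>

    #define RING_SLOTS 8

    static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  ring_room = PTHREAD_COND_INITIALIZER; /* ~ med->wq */
    static int ring_used;                   /* occupied request slots */

    /* ~ mem_event_put_request() for a guest vcpu: block while the ring is full */
    static void put_request(void)
    {
        pthread_mutex_lock(&ring_lock);
        while (ring_used >= RING_SLOTS)     /* ~ wait_event(med->wq, ...) */
            pthread_cond_wait(&ring_room, &ring_lock);
        ring_used++;                        /* place the request */
        pthread_mutex_unlock(&ring_lock);
    }

    /* ~ mem_event_get_response() followed by mem_event_wake_requesters() */
    static int get_response(void)
    {
        int got = 0;
        pthread_mutex_lock(&ring_lock);
        if (ring_used > 0) {
            ring_used--;                    /* consume one request */
            got = 1;
        }
        pthread_mutex_unlock(&ring_lock);
        if (got)
            pthread_cond_signal(&ring_room); /* slot freed: wake one waiter */
        return got;
    }

    int main(void)
    {
        int i;
        for (i = 0; i < 4; i++)
            put_request();
        while (get_response())
            ;
        printf("ring drained, used=%d\n", ring_used);
        return 0;
    }

The patch additionally partitions the ring between guest ("target") and
foreign producers so that foreign vcpus, which cannot sleep, always find the
slot they claimed; the sketch above omits that accounting for brevity.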
On Wed, Jan 11, Andres Lagar-Cavilla wrote:

> > mem_event: use wait queue when ring is full
> >
> > This change is based on an idea/patch from Adin Scannell.
>
> Olaf,
> thanks for the post. We'll have to nack this patch in its current form:
> it hard reboots our host during our testing.

That's very unfortunate. I have seen such unexpected reboots myself a few
weeks ago. I suspect they were caused by an incorrect debug change which I
had on top of my waitqueue changes. Also the fixes Keir provided a few
weeks ago may have helped.

Is it an otherwise unmodified xen-unstable build, or do you use other
patches as well? What is your environment and workload in dom0 and domU?

It would be very good to know why the reboots happen. Perhaps such
failures can not be debugged without special hardware, or a detailed code
review.

I just tested an otherwise unmodified xen-unstable build and did not
encounter reboots while ballooning a single 2G guest up and down. The
guest did just hang after a few iterations, most likely because v7 of my
patch again (or still?) has the math wrong in the ring accounting. I will
check what the issue is. I think v6 was ok in that respect, but I will
double-check that older version as well.

> What we did is take this patch and amalgamate it with some bits from our
> ring management approach. We're ready to submit that, along with a
> description of how we test it. It works for us, and it involves wait
> queues for corner cases.

Now if the patch you just sent out uses wait queues as well, and using
wait queues causes sudden host reboots for reasons not yet known, how is
your patch any better, other than that the reboots don't appear to happen
anymore?

I did not use anything but paging for testing; perhaps I should also run
some access tests. How should I use tools/tests/xen-access/xen-access.c?

Olaf
Andres Lagar-Cavilla
2012-Jan-12 16:11 UTC
Re: [PATCH] mem_event: use wait queue when ring is full
> On Wed, Jan 11, Andres Lagar-Cavilla wrote:
>
>> > mem_event: use wait queue when ring is full
>> >
>> > This change is based on an idea/patch from Adin Scannell.
>>
>> Olaf,
>> thanks for the post. We'll have to nack this patch in its current form:
>> it hard reboots our host during our testing.
>
> That's very unfortunate. I have seen such unexpected reboots myself a
> few weeks ago. I suspect they were caused by an incorrect debug change
> which I had on top of my waitqueue changes. Also the fixes Keir provided
> a few weeks ago may have helped.
>
> Is it an otherwise unmodified xen-unstable build, or do you use other
> patches as well? What is your environment and workload in dom0 and domU?
>
> It would be very good to know why the reboots happen. Perhaps such
> failures can not be debugged without special hardware, or a detailed
> code review.
>
> I just tested an otherwise unmodified xen-unstable build and did not
> encounter reboots while ballooning a single 2G guest up and down. The
> guest did just hang after a few iterations, most likely because v7 of my
> patch again (or still?) has the math wrong in the ring accounting. I
> will check what the issue is. I think v6 was ok in that respect, but I
> will double-check that older version as well.
>
>> What we did is take this patch and amalgamate it with some bits from
>> our ring management approach. We're ready to submit that, along with a
>> description of how we test it. It works for us, and it involves wait
>> queues for corner cases.
>
> Now if the patch you just sent out uses wait queues as well, and using
> wait queues causes sudden host reboots for reasons not yet known, how is
> your patch any better, other than that the reboots don't appear to
> happen anymore?

I believe you were missing some unlocks, which were triggering ASSERTs
going into a wait queue. In any case, the patch was crashing. We spent
quite some time merging it all towards the endgame we all want (wait
queues and better ring logic), and now it doesn't seem to crash. But
obviously our testing rigs are quite different, which is a good thing.

I'll post the mem access testing code, with a description of how we drive
that test.

Thanks!
Andres

> I did not use anything but paging for testing; perhaps I should also run
> some access tests. How should I use
> tools/tests/xen-access/xen-access.c?
>
> Olaf
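The failure mode Andres suspects can be shown in miniature: the helper that
wait_event() evaluates must drop the ring lock on every exit path, because
going to sleep on a wait queue while still holding a lock is exactly what
the wait-queue ASSERTs catch. A minimal sketch of that discipline, with the
spot where a missing unlock would bite marked -- names are stand-ins, not
the actual patch:

    #include <assert.h>
    #include <pthread.h>

    static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;
    static int locks_held;          /* stand-in for the hypervisor's lock tracking */
    static int ring_free_slots = 0; /* 0 == ring full, so try_claim() fails */

    /* Condition helper: 1 = request placed, 0 = caller should go to sleep. */
    static int try_claim(void)
    {
        pthread_mutex_lock(&ring_lock);
        locks_held++;

        if (ring_free_slots == 0) {
            /* Forgetting these two lines is the suspected class of bug:  */
            /* the caller would then sleep with the ring lock still held. */
            locks_held--;
            pthread_mutex_unlock(&ring_lock);
            return 0;
        }

        ring_free_slots--;          /* place the request */
        locks_held--;
        pthread_mutex_unlock(&ring_lock);
        return 1;
    }

    int main(void)
    {
        if (!try_claim()) {
            /* ~ prepare_to_wait(): must not be reached holding any lock */
            assert(locks_held == 0);
        }
        return 0;
    }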
Adin Scannell
2012-Jan-12 17:50 UTC
Re: [PATCH] mem_event: use wait queue when ring is full
On Thu, Jan 12, 2012 at 11:11 AM, Andres Lagar-Cavilla
<andres@lagarcavilla.org> wrote:

>>> What we did is take this patch and amalgamate it with some bits from
>>> our ring management approach. We're ready to submit that, along with a
>>> description of how we test it. It works for us, and it involves wait
>>> queues for corner cases.
>>
>> Now if the patch you just sent out uses wait queues as well, and using
>> wait queues causes sudden host reboots for reasons not yet known, how
>> is your patch any better, other than that the reboots don't appear to
>> happen anymore?

I didn't spend a lot of time diagnosing exactly what was going wrong with
the patch. I did have some local patches applied (in the process of being
submitted to the list) and some debug changes to ensure that the correct
code paths were hit, so it's quite possible that it may have been my
mistake. If so, I apologize. I didn't want to spend a lot of time
debugging, and I'd had a similar experience with waitqueues in the fall.

As Andres pointed out, we spent time merging our local approach into your
patch and testing that one. As a result of the combination, I also made a
few interface changes to ensure that callers use the mem_event code
correctly (i.e. calls to wake() are handled internally, rather than
relying on callers), and dropped some of the complexity of accounting
separately for foreign mappers. With the waitqueue failsafe in place, I
don't think that's necessary. Anyways, I tried to preserve the spirit of
your code, and would love to hear thoughts.

We'll be doing more testing today to ensure that we've properly exercised
all the different code paths (including wait queues).

Cheers,
-Adin
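The interface point Adin describes -- handling the wake() internally rather
than relying on callers -- amounts to folding the wake-up into the
response-consuming path, so a slot freed by the consumer always reaches a
sleeping producer. A hypothetical sketch of that shape (not the posted
patch; types and names are stand-ins):

    #include <stdio.h>

    struct mem_ring {
        int pending;        /* responses not yet consumed */
        int sleepers;       /* producers waiting for room */
    };

    static void wake_requesters(struct mem_ring *r)
    {
        if (r->sleepers > 0) {
            r->sleepers--;  /* in Xen this would be wake_up_nr(&med->wq, n) */
            printf("woke one waiting producer\n");
        }
    }

    /* After the change: consuming a response wakes waiters internally,
     * so no call site can forget the second step. */
    static int get_response(struct mem_ring *r)
    {
        if (r->pending == 0)
            return 0;
        r->pending--;
        wake_requesters(r); /* a slot is free now; no caller action needed */
        return 1;
    }

    int main(void)
    {
        struct mem_ring r = { .pending = 2, .sleepers = 1 };
        while (get_response(&r))
            ;
        return 0;
    }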
Andres Lagar-Cavilla
2012-Jan-12 19:22 UTC
Re: [PATCH] mem_event: use wait queue when ring is full
>
> I didn't spend a lot of time diagnosing exactly what was going wrong
> with the patch. I did have some local patches applied (in the process of
> being submitted to the list) and some debug changes to ensure that the
> correct code paths were hit, so it's quite possible that it may have
> been my mistake. If so, I apologize. I didn't want to spend a lot of
> time debugging, and I'd had a similar experience with waitqueues in the
> fall.
>
> As Andres pointed out, we spent time merging our local approach into
> your patch and testing that one. As a result of the combination, I also
> made a few interface changes to ensure that callers use the mem_event
> code correctly (i.e. calls to wake() are handled internally, rather than
> relying on callers), and dropped some of the complexity of accounting
> separately for foreign mappers. With the waitqueue failsafe in place, I
> don't think that's necessary. Anyways, I tried to preserve the spirit of
> your code, and would love to hear thoughts.
>
> We'll be doing more testing today to ensure that we've properly
> exercised all the different code paths (including wait queues).

It works.

1. Fire off a 1GB HVM guest with PV drivers. Enable the balloon.
2. Fire off xenpaging.
3. xenstore write memory/target-tot_pages 524288 (wait until everything
   is paged).
4. xenstore write memory/target 524288.

No crashes: neither domain, nor host, nor xenpaging. Phew ;)

Olaf, let us know if you have further concerns; afaict the patch is ready
for showtime.

Andres

> Cheers,
> -Adin
>
>>>> What we did is take this patch and amalgamate it with some bits from
>>>> our ring management approach. We're ready to submit that, along with
>>>> a description of how we test it. It works for us, and it involves
>>>> wait queues for corner cases.
>>>
>>> Now if the patch you just sent out uses wait queues as well, and using
>>> wait queues causes sudden host reboots for reasons not yet known, how
>>> is your patch any better, other than that the reboots don't appear to
>>> happen anymore?