Annie Li
2012-Nov-15 07:03 UTC
[PATCH 0/4] Implement persistent grant in xen-netfront/netback
This patch series implements persistent grants for xen-netfront/netback. The
mechanism maintains page pools in netback/netfront; these pools hold grant
pages that are already mapped, which avoids the CPU time currently spent on
grant operations.

The current netback/netfront does map/unmap grant operations frequently when
transmitting/receiving packets, and grant operations cost many CPU cycles.
With this patch, netfront/netback maps grant pages when needed and then saves
them into a page pool for future use. All these pages are unmapped only when
the net device is removed/released.

In netfront, two pools are maintained, one for transmitting and one for
receiving packets. When new grant pages are needed, the driver takes grant
pages from the pool first. If no free grant page exists, it allocates a new
page, maps it, and then saves it into the pool. The pool size for
transmit/receive is exactly the tx/rx ring size. The driver uses memcpy (not
grant copy) to copy data between grant pages. Here, memcpy copies a whole
page of data. I tried to copy only len bytes starting at offset, but the
network did not seem to work well; I am trying to find the root cause now.

In netback, two page pools are likewise maintained for tx/rx. When netback
gets a request, it first searches its page pool to find out whether the grant
reference of this request is already mapped. If the grant ref is mapped, the
address of the mapped page is looked up and memcpy is used to copy data
between grant pages. If the grant ref is not mapped, a new page is allocated,
mapped with this grant ref, and then saved into the page pool for future use.
Similarly, memcpy replaces grant copy for copying data between grant pages.
In this implementation, two arrays (gnttab_tx_vif, gnttab_rx_vif) are used to
save the vif pointer for every request because the current netback is not
per-vif based. This will change after the 1:1 model is implemented in
netback.

This patch supports both persistent grants and non-persistent grants. A new
xenstore key "feature-persistent-grants" is used to advertise this feature.

This patch is based on linux 3.4-rc3. I hit a netperf/netserver failure on
the latest Linux versions v3.7-rc1, v3.7-rc2 and v3.7-rc4. I am not sure
whether this netperf/netserver failure is connected to the compound page
commit in v3.7-rc1, but I did hit the BUG_ON with the debug patch from the
thread http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html

Annie Li (4):
  xen/netback: implements persistent grant with one page pool.
  xen/netback: Split one page pool into two(tx/rx) page pool.
  Xen/netfront: Implement persistent grant in netfront.
  fix code indent issue in xen-netfront.

 drivers/net/xen-netback/common.h    |   24 ++-
 drivers/net/xen-netback/interface.c |   26 +++
 drivers/net/xen-netback/netback.c   |  215 ++++++++++++++++++--
 drivers/net/xen-netback/xenbus.c    |   14 ++-
 drivers/net/xen-netfront.c          |  378 +++++++++++++++++++++++++++++------
 5 files changed, 570 insertions(+), 87 deletions(-)

--
1.7.3.4
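A condensed sketch of the flow the cover letter describes, for orientation
before reading the patches: each side keeps a small pool of grant references
it has already mapped, and falls back to a one-time map only on a miss. The
names below (struct gnt_pool, gnt_pool_get()) are illustrative only and are
not the names used in the patches, which keep the arrays in struct xenvif
and struct netfront_info.

#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/mm.h>
#include <xen/grant_table.h>

struct gnt_pool_entry {
	grant_ref_t gref;		/* foreign grant reference */
	struct page *page;		/* local page it stays mapped into */
	struct gnttab_map_grant_ref map;
};

struct gnt_pool {
	struct gnt_pool_entry *ent[256];	/* e.g. one slot per ring entry */
	unsigned int cnt;
};

/* Return the kernel address of the page backing @gref, mapping it the
 * first time it is seen and remembering the mapping afterwards. */
static void *gnt_pool_get(struct gnt_pool *pool, grant_ref_t gref,
			  domid_t otherend)
{
	struct gnt_pool_entry *e;
	unsigned int i;

	for (i = 0; i < pool->cnt; i++)		/* linear scan, as in patch 1 */
		if (pool->ent[i]->gref == gref)
			return page_address(pool->ent[i]->page);

	if (pool->cnt >= ARRAY_SIZE(pool->ent))
		return NULL;

	e = kmalloc(sizeof(*e), GFP_KERNEL);
	if (!e)
		return NULL;
	e->page = alloc_page(GFP_KERNEL);
	if (!e->page) {
		kfree(e);
		return NULL;
	}

	/* Map the foreign grant once; it stays mapped until teardown. */
	gnttab_set_map_op(&e->map, (unsigned long)page_address(e->page),
			  GNTMAP_host_map, gref, otherend);
	if (gnttab_map_refs(&e->map, NULL, &e->page, 1)) {
		__free_page(e->page);
		kfree(e);
		return NULL;
	}

	e->gref = gref;
	pool->ent[pool->cnt++] = e;
	return page_address(e->page);
}

With such a pool on each side, the per-packet grant map/unmap or
GNTTABOP_copy hypercalls turn into a memcpy() between the local buffer and
the persistently mapped page, which is where the CPU saving comes from.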
Annie Li
2012-Nov-15 07:04 UTC
[PATCH 1/4] xen/netback: implements persistent grant with one page pool.
This patch implements persistent grant in netback driver. Tx and rx share the same page pool, this pool will be split into two parts in next patch. Signed-off-by: Annie Li <annie.li@oracle.com> --- drivers/net/xen-netback/common.h | 18 +++- drivers/net/xen-netback/interface.c | 22 ++++ drivers/net/xen-netback/netback.c | 212 +++++++++++++++++++++++++++++++---- drivers/net/xen-netback/xenbus.c | 14 ++- 4 files changed, 239 insertions(+), 27 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 94b79c3..a85cac6 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -45,8 +45,19 @@ #include <xen/grant_table.h> #include <xen/xenbus.h> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \ + (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE) + struct xen_netbk; +struct persistent_entry { + grant_ref_t forgranted; + struct page *fpage; + struct gnttab_map_grant_ref map; +}; + struct xenvif { /* Unique identifier for this interface. */ domid_t domid; @@ -75,6 +86,7 @@ struct xenvif { /* Internal feature information. */ u8 can_queue:1; /* can queue packets for receiver? */ + u8 persistent_grant:1; /* * Allow xenvif_start_xmit() to peek ahead in the rx request @@ -98,6 +110,9 @@ struct xenvif { struct net_device *dev; wait_queue_head_t waiting_to_free; + + struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS]; + unsigned int persistent_gntcnt; }; static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) @@ -105,9 +120,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) return to_xenbus_device(vif->dev->dev.parent); } -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) - struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, unsigned int handle); diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index b7d41f8..226d159 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -300,6 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, return ERR_PTR(err); } + vif->persistent_gntcnt = 0; + netdev_dbg(dev, "Successfully created xenvif\n"); return vif; } @@ -343,6 +345,23 @@ err: return err; } +void xenvif_free_grants(struct persistent_entry **pers_entry, + unsigned int count) +{ + int i, ret; + struct gnttab_unmap_grant_ref unmap; + + for (i = 0; i < count; i++) { + gnttab_set_unmap_op(&unmap, + (unsigned long)page_to_pfn(pers_entry[i]->fpage), + GNTMAP_host_map, + pers_entry[i]->map.handle); + ret = gnttab_unmap_refs(&unmap, &pers_entry[i]->fpage, + 1, false); + BUG_ON(ret); + } +} + void xenvif_disconnect(struct xenvif *vif) { struct net_device *dev = vif->dev; @@ -366,6 +385,9 @@ void xenvif_disconnect(struct xenvif *vif) unregister_netdev(vif->dev); xen_netbk_unmap_frontend_rings(vif); + if (vif->persistent_grant) + xenvif_free_grants(vif->persistent_gnt, + vif->persistent_gntcnt); free_netdev(vif->dev); } diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 2596401..a26d3fc 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -80,6 +80,8 @@ union page_ext { void *mapping; }; +struct xenvif; + struct xen_netbk { wait_queue_head_t wq; struct 
task_struct *task; @@ -102,6 +104,7 @@ struct xen_netbk { struct pending_tx_info pending_tx_info[MAX_PENDING_REQS]; struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; + struct xenvif *gnttab_tx_vif[MAX_PENDING_REQS]; u16 pending_ring[MAX_PENDING_REQS]; @@ -111,12 +114,139 @@ struct xen_netbk { * straddles two buffers in the frontend. */ struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE]; + struct xenvif *gnttab_rx_vif[2*XEN_NETIF_RX_RING_SIZE]; struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE]; }; static struct xen_netbk *xen_netbk; static int xen_netbk_group_nr; +static struct persistent_entry* +get_per_gnt(struct persistent_entry **pers_entry, + unsigned int count, grant_ref_t gref) +{ + int i; + + for (i = 0; i < count; i++) + if (gref == pers_entry[i]->forgranted) + return pers_entry[i]; + + return NULL; +} + +static void* +map_new_gnt(struct persistent_entry **pers_entry, unsigned int count, + grant_ref_t ref, domid_t domid) +{ + struct gnttab_map_grant_ref *map; + struct page *page; + unsigned long vaddr; + unsigned long pfn; + uint32_t flags; + int ret = 0; + + pers_entry[count] = (struct persistent_entry *) + kmalloc(sizeof(struct persistent_entry), + GFP_KERNEL); + if (!pers_entry[count]) + return ERR_PTR(-ENOMEM); + + map = &pers_entry[count]->map; + page = alloc_page(GFP_KERNEL); + if (!page) { + kfree(pers_entry[count]); + pers_entry[count] = NULL; + return ERR_PTR(-ENOMEM); + } + + pers_entry[count]->fpage = page; + pfn = page_to_pfn(page); + vaddr = (unsigned long)pfn_to_kaddr(pfn); + flags = GNTMAP_host_map; + + gnttab_set_map_op(map, vaddr, flags, ref, domid); + ret = gnttab_map_refs(map, NULL, &page, 1); + BUG_ON(ret); + + pers_entry[count]->forgranted = ref; + + return page_address(page); +} + +static void* +get_ref_page(struct persistent_entry **pers_entry, unsigned int *count, + grant_ref_t ref, domid_t domid, unsigned int total) +{ + struct persistent_entry *per_gnt; + void *vaddr; + + per_gnt = get_per_gnt(pers_entry, *count, ref); + + if (per_gnt != NULL) + return page_address(per_gnt->fpage); + else { + BUG_ON(*count >= total); + vaddr = map_new_gnt(pers_entry, *count, ref, domid); + if (IS_ERR_OR_NULL(vaddr)) + return vaddr; + *count += 1; + return vaddr; + } +} + +static int +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count, + struct xen_netbk *netbk, bool tx_pool) +{ + int i; + struct xenvif *vif; + struct gnttab_copy *uop = vuop; + unsigned int *gnt_count; + unsigned int gnt_total; + struct persistent_entry **pers_entry; + int ret = 0; + + BUG_ON(cmd != GNTTABOP_copy); + for (i = 0; i < count; i++) { + if (tx_pool) + vif = netbk->gnttab_tx_vif[i]; + else + vif = netbk->gnttab_rx_vif[i]; + + pers_entry = vif->persistent_gnt; + gnt_count = &vif->persistent_gntcnt; + gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS; + + if (vif->persistent_grant) { + void *saddr, *daddr; + + saddr = uop[i].source.domid == DOMID_SELF ? + (void *) uop[i].source.u.gmfn : + get_ref_page(pers_entry, gnt_count, + uop[i].source.u.ref, + uop[i].source.domid, + gnt_total); + if (IS_ERR_OR_NULL(saddr)) + return -ENOMEM; + + daddr = uop[i].dest.domid == DOMID_SELF ? 
+ (void *) uop[i].dest.u.gmfn : + get_ref_page(pers_entry, gnt_count, + uop[i].dest.u.ref, + uop[i].dest.domid, + gnt_total); + if (IS_ERR_OR_NULL(daddr)) + return -ENOMEM; + + memcpy(daddr+uop[i].dest.offset, + saddr+uop[i].source.offset, uop[i].len); + } else + ret = HYPERVISOR_grant_table_op(cmd, &uop[i], 1); + } + + return ret; +} + void xen_netbk_add_xenvif(struct xenvif *vif) { int i; @@ -387,13 +517,15 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif, * Set up the grant operations for this fragment. If it''s a flipping * interface, we also set up the unmap request from here. */ -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, - struct netrx_pending_operations *npo, - struct page *page, unsigned long size, - unsigned long offset, int *head) +static int netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, + struct netrx_pending_operations *npo, + struct page *page, unsigned long size, + unsigned long offset, int *head, + struct xenvif **grxvif) { struct gnttab_copy *copy_gop; struct netbk_rx_meta *meta; + int count = 0; /* * These variables are used iff get_page_ext returns true, * in which case they are guaranteed to be initialized. @@ -425,6 +557,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, bytes = MAX_BUFFER_OFFSET - npo->copy_off; copy_gop = npo->copy + npo->copy_prod++; + *grxvif = vif; + grxvif++; + count++; + copy_gop->flags = GNTCOPY_dest_gref; if (foreign) { struct xen_netbk *netbk = &xen_netbk[group]; @@ -438,7 +574,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, } else { void *vaddr = page_address(page); copy_gop->source.domid = DOMID_SELF; - copy_gop->source.u.gmfn = virt_to_mfn(vaddr); + if (!vif->persistent_grant) + copy_gop->source.u.gmfn = virt_to_mfn(vaddr); + else + copy_gop->source.u.gmfn = (unsigned long)vaddr; } copy_gop->source.offset = offset; copy_gop->dest.domid = vif->domid; @@ -460,6 +599,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, *head = 0; /* There must be something in this buffer now. */ } + return count; } /* @@ -474,8 +614,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, * zero GSO descriptors (for non-GSO packets) or one descriptor (for * frontend-side LRO). 
*/ -static int netbk_gop_skb(struct sk_buff *skb, - struct netrx_pending_operations *npo) +static int netbk_gop_skb(struct xen_netbk *netbk, + struct sk_buff *skb, + struct netrx_pending_operations *npo, + struct xenvif **grxvif) { struct xenvif *vif = netdev_priv(skb->dev); int nr_frags = skb_shinfo(skb)->nr_frags; @@ -518,17 +660,19 @@ static int netbk_gop_skb(struct sk_buff *skb, if (data + len > skb_tail_pointer(skb)) len = skb_tail_pointer(skb) - data; - netbk_gop_frag_copy(vif, skb, npo, - virt_to_page(data), len, offset, &head); + grxvif += netbk_gop_frag_copy(vif, skb, npo, + virt_to_page(data), len, + offset, &head, grxvif); + data += len; } for (i = 0; i < nr_frags; i++) { - netbk_gop_frag_copy(vif, skb, npo, - skb_frag_page(&skb_shinfo(skb)->frags[i]), - skb_frag_size(&skb_shinfo(skb)->frags[i]), - skb_shinfo(skb)->frags[i].page_offset, - &head); + grxvif += netbk_gop_frag_copy(vif, skb, npo, + skb_frag_page(&skb_shinfo(skb)->frags[i]), + skb_frag_size(&skb_shinfo(skb)->frags[i]), + skb_shinfo(skb)->frags[i].page_offset, + &head, grxvif); } return npo->meta_prod - old_meta_prod; @@ -593,6 +737,8 @@ struct skb_cb_overlay { static void xen_netbk_rx_action(struct xen_netbk *netbk) { struct xenvif *vif = NULL, *tmp; + struct xenvif **grxvif = netbk->gnttab_rx_vif; + int old_copy_prod = 0; s8 status; u16 irq, flags; struct xen_netif_rx_response *resp; @@ -619,7 +765,9 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) nr_frags = skb_shinfo(skb)->nr_frags; sco = (struct skb_cb_overlay *)skb->cb; - sco->meta_slots_used = netbk_gop_skb(skb, &npo); + sco->meta_slots_used = netbk_gop_skb(netbk, skb, &npo, grxvif); + grxvif += npo.copy_prod - old_copy_prod; + old_copy_prod = npo.copy_prod; count += nr_frags + 1; @@ -636,8 +784,10 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) return; BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op)); - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, &netbk->grant_copy_op, - npo.copy_prod); + ret = grant_memory_copy_op(GNTTABOP_copy, + &netbk->grant_copy_op, + npo.copy_prod, netbk, + false); BUG_ON(ret != 0); while ((skb = __skb_dequeue(&rxq)) != NULL) { @@ -918,7 +1068,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, struct xenvif *vif, struct sk_buff *skb, struct xen_netif_tx_request *txp, - struct gnttab_copy *gop) + struct gnttab_copy *gop, + struct xenvif **gtxvif) { struct skb_shared_info *shinfo = skb_shinfo(skb); skb_frag_t *frags = shinfo->frags; @@ -944,7 +1095,11 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, gop->source.domid = vif->domid; gop->source.offset = txp->offset; - gop->dest.u.gmfn = virt_to_mfn(page_address(page)); + if (!vif->persistent_grant) + gop->dest.u.gmfn = virt_to_mfn(page_address(page)); + else + gop->dest.u.gmfn = (unsigned long)page_address(page); + gop->dest.domid = DOMID_SELF; gop->dest.offset = txp->offset; @@ -952,6 +1107,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, gop->flags = GNTCOPY_source_gref; gop++; + *gtxvif = vif; + gtxvif++; + memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp)); xenvif_get(vif); @@ -1218,6 +1376,7 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) { struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; + struct xenvif **gtxvif = netbk->gnttab_tx_vif; struct sk_buff *skb; int ret; @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) 
gop->source.domid = vif->domid; gop->source.offset = txreq.offset; - gop->dest.u.gmfn = virt_to_mfn(page_address(page)); + if (!vif->persistent_grant) + gop->dest.u.gmfn = virt_to_mfn(page_address(page)); + else + gop->dest.u.gmfn = (unsigned long)page_address(page); + gop->dest.domid = DOMID_SELF; gop->dest.offset = txreq.offset; @@ -1346,6 +1509,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) gop->flags = GNTCOPY_source_gref; gop++; + *gtxvif++ = vif; memcpy(&netbk->pending_tx_info[pending_idx].req, &txreq, sizeof(txreq)); @@ -1369,12 +1533,13 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) netbk->pending_cons++; request_gop = xen_netbk_get_requests(netbk, vif, - skb, txfrags, gop); + skb, txfrags, gop, gtxvif); if (request_gop == NULL) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); continue; } + gtxvif += request_gop - gop; gop = request_gop; vif->tx.req_cons = idx; @@ -1467,8 +1632,9 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk) if (nr_gops == 0) return; - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, - netbk->tx_copy_ops, nr_gops); + ret = grant_memory_copy_op(GNTTABOP_copy, + netbk->tx_copy_ops, nr_gops, + netbk, true); BUG_ON(ret); xen_netbk_tx_submit(netbk); diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 410018c..938e908 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -114,6 +114,13 @@ static int netback_probe(struct xenbus_device *dev, goto abort_transaction; } + err = xenbus_printf(xbt, dev->nodename, + "feature-persistent-grants", "%u", 1); + if (err) { + message = "writing feature-persistent-grants"; + goto abort_transaction; + } + err = xenbus_transaction_end(xbt, 0); } while (err == -EAGAIN); @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be) val = 0; vif->csum = !val; - /* Map the shared frame, irq etc. */ + if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants", + "%u", &val) < 0) + val = 0; + vif->persistent_grant = !!val; + +/* Map the shared frame, irq etc. */ err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn); if (err) { xenbus_dev_fatal(dev, err, -- 1.7.3.4
Annie Li
2012-Nov-15 07:04 UTC
[PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) page pool.
For tx path, this implementation simplifies the work of searching out grant
page from page pool based on grant reference.

Signed-off-by: Annie Li <annie.li@oracle.com>
---
 drivers/net/xen-netback/common.h    |   14 ++++++++++----
 drivers/net/xen-netback/interface.c |   12 ++++++++----
 drivers/net/xen-netback/netback.c   |   15 +++++++++------
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index a85cac6..02c8573 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -47,8 +47,6 @@

 #define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
 #define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
-#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
-	(XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)

 struct xen_netbk;

@@ -111,8 +109,16 @@ struct xenvif {

 	wait_queue_head_t waiting_to_free;

-	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
-	unsigned int persistent_gntcnt;
+	struct persistent_entry *persistent_tx_gnt[XEN_NETIF_TX_RING_SIZE];
+
+	/*
+	 * 2*XEN_NETIF_RX_RING_SIZE is for the case of each head/fragment page
+	 * using 2 copy operations.
+	 */
+	struct persistent_entry *persistent_rx_gnt[2*XEN_NETIF_RX_RING_SIZE];
+
+	unsigned int persistent_tx_gntcnt;
+	unsigned int persistent_rx_gntcnt;
 };

 static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 226d159..ecbe116 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -300,7 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 		return ERR_PTR(err);
 	}

-	vif->persistent_gntcnt = 0;
+	vif->persistent_tx_gntcnt = 0;
+	vif->persistent_rx_gntcnt = 0;

 	netdev_dbg(dev, "Successfully created xenvif\n");
 	return vif;
@@ -385,9 +386,12 @@ void xenvif_disconnect(struct xenvif *vif)
 		unregister_netdev(vif->dev);

 	xen_netbk_unmap_frontend_rings(vif);
-	if (vif->persistent_grant)
-		xenvif_free_grants(vif->persistent_gnt,
-				   vif->persistent_gntcnt);
+	if (vif->persistent_grant) {
+		xenvif_free_grants(vif->persistent_tx_gnt,
+				   vif->persistent_tx_gntcnt);
+		xenvif_free_grants(vif->persistent_rx_gnt,
+				   vif->persistent_rx_gntcnt);
+	}

 	free_netdev(vif->dev);
 }
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index a26d3fc..ec59c73 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -208,14 +208,17 @@ grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,

 	BUG_ON(cmd != GNTTABOP_copy);
 	for (i = 0; i < count; i++) {
-		if (tx_pool)
+		if (tx_pool) {
 			vif = netbk->gnttab_tx_vif[i];
-		else
+			gnt_count = &vif->persistent_tx_gntcnt;
+			gnt_total = XEN_NETIF_TX_RING_SIZE;
+			pers_entry = vif->persistent_tx_gnt;
+		} else {
 			vif = netbk->gnttab_rx_vif[i];
-
-		pers_entry = vif->persistent_gnt;
-		gnt_count = &vif->persistent_gntcnt;
-		gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
+			gnt_count = &vif->persistent_rx_gntcnt;
+			gnt_total = 2*XEN_NETIF_RX_RING_SIZE;
+			pers_entry = vif->persistent_rx_gnt;
+		}

 		if (vif->persistent_grant) {
 			void *saddr, *daddr;
--
1.7.3.4
Annie Li
2012-Nov-15 07:05 UTC
[PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
Tx/rx page pool are maintained. New grant is mapped and put into pool, unmap only happens when releasing/removing device. Signed-off-by: Annie Li <annie.li@oracle.com> --- drivers/net/xen-netfront.c | 372 +++++++++++++++++++++++++++++++++++++------- 1 files changed, 315 insertions(+), 57 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 0ebbb19..17b81c0 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -79,6 +79,13 @@ struct netfront_stats { struct u64_stats_sync syncp; }; +struct gnt_list { + grant_ref_t gref; + struct page *gnt_pages; + void *gnt_target; + struct gnt_list *tail; +}; + struct netfront_info { struct list_head list; struct net_device *netdev; @@ -109,6 +116,10 @@ struct netfront_info { grant_ref_t grant_tx_ref[NET_TX_RING_SIZE]; unsigned tx_skb_freelist; + struct gnt_list *tx_grant[NET_TX_RING_SIZE]; + struct gnt_list *tx_gnt_list; + unsigned int tx_gnt_cnt; + spinlock_t rx_lock ____cacheline_aligned_in_smp; struct xen_netif_rx_front_ring rx; int rx_ring_ref; @@ -126,6 +137,10 @@ struct netfront_info { grant_ref_t gref_rx_head; grant_ref_t grant_rx_ref[NET_RX_RING_SIZE]; + struct gnt_list *rx_grant[NET_RX_RING_SIZE]; + struct gnt_list *rx_gnt_list; + unsigned int rx_gnt_cnt; + unsigned long rx_pfn_array[NET_RX_RING_SIZE]; struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1]; struct mmu_update rx_mmu[NET_RX_RING_SIZE]; @@ -134,6 +149,7 @@ struct netfront_info { struct netfront_stats __percpu *stats; unsigned long rx_gso_checksum_fixup; + u8 persistent_gnt:1; }; struct netfront_rx_info { @@ -194,6 +210,16 @@ static grant_ref_t xennet_get_rx_ref(struct netfront_info *np, return ref; } +static struct gnt_list *xennet_get_rx_grant(struct netfront_info *np, + RING_IDX ri) +{ + int i = xennet_rxidx(ri); + struct gnt_list *gntlist = np->rx_grant[i]; + np->rx_grant[i] = NULL; + + return gntlist; +} + #ifdef CONFIG_SYSFS static int xennet_sysfs_addif(struct net_device *netdev); static void xennet_sysfs_delif(struct net_device *netdev); @@ -231,6 +257,68 @@ static void xennet_maybe_wake_tx(struct net_device *dev) netif_wake_queue(dev); } +static grant_ref_t xennet_alloc_rx_ref(struct net_device *dev, + unsigned long mfn, void *vaddr, + unsigned int id, + grant_ref_t ref) +{ + struct netfront_info *np = netdev_priv(dev); + grant_ref_t gnt_ref; + struct gnt_list *gnt_list_entry; + + if (np->persistent_gnt && np->rx_gnt_cnt) { + gnt_list_entry = np->rx_gnt_list; + np->rx_gnt_list = np->rx_gnt_list->tail; + np->rx_gnt_cnt--; + + gnt_list_entry->gnt_target = vaddr; + gnt_ref = gnt_list_entry->gref; + np->rx_grant[id] = gnt_list_entry; + } else { + struct page *page; + + BUG_ON(!np->persistent_gnt && np->rx_gnt_cnt); + if (!ref) + gnt_ref + gnttab_claim_grant_reference(&np->gref_rx_head); + else + gnt_ref = ref; + BUG_ON((signed short)gnt_ref < 0); + + if (np->persistent_gnt) { + page = alloc_page(GFP_KERNEL); + if (!page) { + if (!ref) + gnttab_release_grant_reference( + &np->gref_rx_head, ref); + return -ENOMEM; + } + mfn = pfn_to_mfn(page_to_pfn(page)); + + gnt_list_entry = kmalloc(sizeof(struct gnt_list), + GFP_KERNEL); + if (!gnt_list_entry) { + __free_page(page); + if (!ref) + gnttab_release_grant_reference( + &np->gref_rx_head, ref); + return -ENOMEM; + } + gnt_list_entry->gref = gnt_ref; + gnt_list_entry->gnt_pages = page; + gnt_list_entry->gnt_target = vaddr; + + np->rx_grant[id] = gnt_list_entry; + } + + gnttab_grant_foreign_access_ref(gnt_ref, np->xbdev->otherend_id, + mfn, 0); + } + np->grant_rx_ref[id] = gnt_ref; + + 
return gnt_ref; +} + static void xennet_alloc_rx_buffers(struct net_device *dev) { unsigned short id; @@ -240,8 +328,6 @@ static void xennet_alloc_rx_buffers(struct net_device *dev) int i, batch_target, notify; RING_IDX req_prod = np->rx.req_prod_pvt; grant_ref_t ref; - unsigned long pfn; - void *vaddr; struct xen_netif_rx_request *req; if (unlikely(!netif_carrier_ok(dev))) @@ -306,19 +392,16 @@ no_skb: BUG_ON(np->rx_skbs[id]); np->rx_skbs[id] = skb; - ref = gnttab_claim_grant_reference(&np->gref_rx_head); - BUG_ON((signed short)ref < 0); - np->grant_rx_ref[id] = ref; + page = skb_frag_page(&skb_shinfo(skb)->frags[0]); - pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0])); - vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0])); + ref = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)), + page_address(page), id, 0); + if ((signed short)ref < 0) { + __skb_queue_tail(&np->rx_batch, skb); + break; + } req = RING_GET_REQUEST(&np->rx, req_prod + i); - gnttab_grant_foreign_access_ref(ref, - np->xbdev->otherend_id, - pfn_to_mfn(pfn), - 0); - req->id = id; req->gref = ref; } @@ -375,17 +458,30 @@ static void xennet_tx_buf_gc(struct net_device *dev) id = txrsp->id; skb = np->tx_skbs[id].skb; - if (unlikely(gnttab_query_foreign_access( - np->grant_tx_ref[id]) != 0)) { - printk(KERN_ALERT "xennet_tx_buf_gc: warning " - "-- grant still in use by backend " - "domain.\n"); - BUG(); + + if (np->persistent_gnt) { + struct gnt_list *gnt_list_entry; + + gnt_list_entry = np->tx_grant[id]; + BUG_ON(!gnt_list_entry); + + gnt_list_entry->tail = np->tx_gnt_list; + np->tx_gnt_list = gnt_list_entry; + np->tx_gnt_cnt++; + } else { + if (unlikely(gnttab_query_foreign_access( + np->grant_tx_ref[id]) != 0)) { + printk(KERN_ALERT "xennet_tx_buf_gc: warning " + "-- grant still in use by backend " + "domain.\n"); + BUG(); + } + + gnttab_end_foreign_access_ref( + np->grant_tx_ref[id], GNTMAP_readonly); + gnttab_release_grant_reference( + &np->gref_tx_head, np->grant_tx_ref[id]); } - gnttab_end_foreign_access_ref( - np->grant_tx_ref[id], GNTMAP_readonly); - gnttab_release_grant_reference( - &np->gref_tx_head, np->grant_tx_ref[id]); np->grant_tx_ref[id] = GRANT_INVALID_REF; add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id); dev_kfree_skb_irq(skb); @@ -409,6 +505,59 @@ static void xennet_tx_buf_gc(struct net_device *dev) xennet_maybe_wake_tx(dev); } +static grant_ref_t xennet_alloc_tx_ref(struct net_device *dev, + unsigned long mfn, + unsigned int id) +{ + struct netfront_info *np = netdev_priv(dev); + grant_ref_t ref; + struct page *granted_page; + + if (np->persistent_gnt && np->tx_gnt_cnt) { + struct gnt_list *gnt_list_entry; + + gnt_list_entry = np->tx_gnt_list; + np->tx_gnt_list = np->tx_gnt_list->tail; + np->tx_gnt_cnt--; + + ref = gnt_list_entry->gref; + np->tx_grant[id] = gnt_list_entry; + } else { + struct gnt_list *gnt_list_entry; + + BUG_ON(!np->persistent_gnt && np->tx_gnt_cnt); + ref = gnttab_claim_grant_reference(&np->gref_tx_head); + BUG_ON((signed short)ref < 0); + + if (np->persistent_gnt) { + granted_page = alloc_page(GFP_KERNEL); + if (!granted_page) { + gnttab_release_grant_reference( + &np->gref_tx_head, ref); + return -ENOMEM; + } + + mfn = pfn_to_mfn(page_to_pfn(granted_page)); + gnt_list_entry = kmalloc(sizeof(struct gnt_list), + GFP_KERNEL); + if (!gnt_list_entry) { + __free_page(granted_page); + gnttab_release_grant_reference( + &np->gref_tx_head, ref); + return -ENOMEM; + } + + gnt_list_entry->gref = ref; + gnt_list_entry->gnt_pages = granted_page; + np->tx_grant[id] 
= gnt_list_entry; + } + gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id, + mfn, 0); + } + + return ref; +} + static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev, struct xen_netif_tx_request *tx) { @@ -421,6 +570,9 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev, unsigned int len = skb_headlen(skb); unsigned int id; grant_ref_t ref; + struct gnt_list *gnt_list_entry; + void *out_addr; + void *in_addr; int i; /* While the header overlaps a page boundary (including being @@ -436,17 +588,19 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev, np->tx_skbs[id].skb = skb_get(skb); tx = RING_GET_REQUEST(&np->tx, prod++); tx->id = id; - ref = gnttab_claim_grant_reference(&np->gref_tx_head); - BUG_ON((signed short)ref < 0); - mfn = virt_to_mfn(data); - gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id, - mfn, GNTMAP_readonly); - + ref = xennet_alloc_tx_ref(dev, mfn, id); tx->gref = np->grant_tx_ref[id] = ref; tx->offset = offset; tx->size = len; tx->flags = 0; + if (np->persistent_gnt) { + gnt_list_entry = np->tx_grant[id]; + out_addr = page_address(gnt_list_entry->gnt_pages); + in_addr = (void *)((unsigned long)data + & ~(PAGE_SIZE-1)); + memcpy(out_addr, in_addr, PAGE_SIZE); + } } /* Grant backend access to each skb fragment page. */ @@ -459,17 +613,19 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev, np->tx_skbs[id].skb = skb_get(skb); tx = RING_GET_REQUEST(&np->tx, prod++); tx->id = id; - ref = gnttab_claim_grant_reference(&np->gref_tx_head); - BUG_ON((signed short)ref < 0); - mfn = pfn_to_mfn(page_to_pfn(skb_frag_page(frag))); - gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id, - mfn, GNTMAP_readonly); - + ref = xennet_alloc_tx_ref(dev, mfn, id); tx->gref = np->grant_tx_ref[id] = ref; tx->offset = frag->page_offset; tx->size = skb_frag_size(frag); tx->flags = 0; + if (np->persistent_gnt) { + gnt_list_entry = np->tx_grant[id]; + out_addr = page_address(gnt_list_entry->gnt_pages); + in_addr = (void *)((unsigned long)page_address( + skb_frag_page(frag)) & ~(PAGE_SIZE-1)); + memcpy(out_addr, in_addr, PAGE_SIZE); + } } np->tx.req_prod_pvt = prod; @@ -491,6 +647,9 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) unsigned int offset = offset_in_page(data); unsigned int len = skb_headlen(skb); unsigned long flags; + struct gnt_list *gnt_list_entry; + void *out_addr; + void *in_addr; frags += DIV_ROUND_UP(offset + len, PAGE_SIZE); if (unlikely(frags > MAX_SKB_FRAGS + 1)) { @@ -517,16 +676,20 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) tx = RING_GET_REQUEST(&np->tx, i); tx->id = id; - ref = gnttab_claim_grant_reference(&np->gref_tx_head); - BUG_ON((signed short)ref < 0); mfn = virt_to_mfn(data); - gnttab_grant_foreign_access_ref( - ref, np->xbdev->otherend_id, mfn, GNTMAP_readonly); + ref = xennet_alloc_tx_ref(dev, mfn, id); tx->gref = np->grant_tx_ref[id] = ref; tx->offset = offset; tx->size = len; extra = NULL; + if (np->persistent_gnt) { + gnt_list_entry = np->tx_grant[id]; + out_addr = page_address(gnt_list_entry->gnt_pages); + in_addr = (void *)((unsigned long)data & ~(PAGE_SIZE-1)); + memcpy(out_addr, in_addr, PAGE_SIZE); + } + tx->flags = 0; if (skb->ip_summed == CHECKSUM_PARTIAL) /* local packet? 
*/ @@ -595,13 +758,17 @@ static int xennet_close(struct net_device *dev) } static void xennet_move_rx_slot(struct netfront_info *np, struct sk_buff *skb, - grant_ref_t ref) + grant_ref_t ref, RING_IDX cons) { int new = xennet_rxidx(np->rx.req_prod_pvt); BUG_ON(np->rx_skbs[new]); np->rx_skbs[new] = skb; np->grant_rx_ref[new] = ref; + + if (np->persistent_gnt) + np->rx_grant[new] = xennet_get_rx_grant(np, cons); + RING_GET_REQUEST(&np->rx, np->rx.req_prod_pvt)->id = new; RING_GET_REQUEST(&np->rx, np->rx.req_prod_pvt)->gref = ref; np->rx.req_prod_pvt++; @@ -644,7 +811,7 @@ static int xennet_get_extras(struct netfront_info *np, skb = xennet_get_rx_skb(np, cons); ref = xennet_get_rx_ref(np, cons); - xennet_move_rx_slot(np, skb, ref); + xennet_move_rx_slot(np, skb, ref, cons); } while (extra->flags & XEN_NETIF_EXTRA_FLAG_MORE); np->rx.rsp_cons = cons; @@ -665,6 +832,12 @@ static int xennet_get_responses(struct netfront_info *np, int frags = 1; int err = 0; unsigned long ret; + struct gnt_list *gnt_list_entry; + + if (np->persistent_gnt) { + gnt_list_entry = xennet_get_rx_grant(np, cons); + BUG_ON(!gnt_list_entry); + } if (rx->flags & XEN_NETRXF_extra_info) { err = xennet_get_extras(np, extras, rp); @@ -677,7 +850,7 @@ static int xennet_get_responses(struct netfront_info *np, if (net_ratelimit()) dev_warn(dev, "rx->offset: %x, size: %u\n", rx->offset, rx->status); - xennet_move_rx_slot(np, skb, ref); + xennet_move_rx_slot(np, skb, ref, cons); err = -EINVAL; goto next; } @@ -695,11 +868,29 @@ static int xennet_get_responses(struct netfront_info *np, goto next; } - ret = gnttab_end_foreign_access_ref(ref, 0); - BUG_ON(!ret); - - gnttab_release_grant_reference(&np->gref_rx_head, ref); - + if (!np->persistent_gnt) { + ret = gnttab_end_foreign_access_ref(ref, 0); + BUG_ON(!ret); + gnttab_release_grant_reference(&np->gref_rx_head, ref); + } else { + struct page *grant_page; + void *grant_target; + + grant_page = gnt_list_entry->gnt_pages; + grant_target = gnt_list_entry->gnt_target; + BUG_ON(grant_page == 0); + BUG_ON(grant_target == 0); + + if (rx->status > 0) + memcpy(grant_target+rx->offset, + page_address(grant_page)+rx->offset, + rx->status); /* status encodes size */ + + gnt_list_entry->gref = ref; + gnt_list_entry->tail = np->rx_gnt_list; + np->rx_gnt_list = gnt_list_entry; + np->rx_gnt_cnt++; + } __skb_queue_tail(list, skb); next: @@ -716,6 +907,10 @@ next: rx = RING_GET_RESPONSE(&np->rx, cons + frags); skb = xennet_get_rx_skb(np, cons + frags); ref = xennet_get_rx_ref(np, cons + frags); + if (np->persistent_gnt) { + gnt_list_entry = xennet_get_rx_grant(np, cons + frags); + BUG_ON(!gnt_list_entry); + } frags++; } @@ -1090,16 +1285,32 @@ static void xennet_release_tx_bufs(struct netfront_info *np) struct sk_buff *skb; int i; + if (np->persistent_gnt) { + struct gnt_list *gnt_list_entry; + + while (np->tx_gnt_list) { + gnt_list_entry = np->tx_gnt_list; + np->tx_gnt_list = np->tx_gnt_list->tail; + gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL); + gnttab_release_grant_reference(&np->gref_tx_head, + gnt_list_entry->gref); + + __free_page(gnt_list_entry->gnt_pages); + kfree(gnt_list_entry); + } + } + for (i = 0; i < NET_TX_RING_SIZE; i++) { /* Skip over entries which are actually freelist references */ if (skb_entry_is_link(&np->tx_skbs[i])) continue; - skb = np->tx_skbs[i].skb; - gnttab_end_foreign_access_ref(np->grant_tx_ref[i], - GNTMAP_readonly); - gnttab_release_grant_reference(&np->gref_tx_head, - np->grant_tx_ref[i]); + if (!np->persistent_gnt) { + 
gnttab_end_foreign_access_ref(np->grant_tx_ref[i], + GNTMAP_readonly); + gnttab_release_grant_reference(&np->gref_tx_head, + np->grant_tx_ref[i]); + } np->grant_tx_ref[i] = GRANT_INVALID_REF; add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, i); dev_kfree_skb_irq(skb); @@ -1124,6 +1335,20 @@ static void xennet_release_rx_bufs(struct netfront_info *np) spin_lock_bh(&np->rx_lock); + if (np->persistent_gnt) { + struct gnt_list *gnt_list_entry; + + while (np->rx_gnt_list) { + gnt_list_entry = np->rx_gnt_list; + np->rx_gnt_list = np->rx_gnt_list->tail; + gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL); + gnttab_release_grant_reference(&np->gref_rx_head, + gnt_list_entry->gref); + __free_page(gnt_list_entry->gnt_pages); + kfree(gnt_list_entry); + } + } + for (id = 0; id < NET_RX_RING_SIZE; id++) { ref = np->grant_rx_ref[id]; if (ref == GRANT_INVALID_REF) { @@ -1132,8 +1357,10 @@ static void xennet_release_rx_bufs(struct netfront_info *np) } skb = np->rx_skbs[id]; - mfn = gnttab_end_foreign_transfer_ref(ref); - gnttab_release_grant_reference(&np->gref_rx_head, ref); + if (!np->persistent_gnt) { + mfn = gnttab_end_foreign_transfer_ref(ref); + gnttab_release_grant_reference(&np->gref_rx_head, ref); + } np->grant_rx_ref[id] = GRANT_INVALID_REF; if (0 == mfn) { @@ -1607,6 +1834,13 @@ again: goto abort_transaction; } + err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants", + "%u", info->persistent_gnt); + if (err) { + message = "writing feature-persistent-grants"; + xenbus_dev_fatal(dev, err, "%s", message); + } + err = xenbus_transaction_end(xbt, 0); if (err) { if (err == -EAGAIN) @@ -1634,6 +1868,7 @@ static int xennet_connect(struct net_device *dev) grant_ref_t ref; struct xen_netif_rx_request *req; unsigned int feature_rx_copy; + int ret, val; err = xenbus_scanf(XBT_NIL, np->xbdev->otherend, "feature-rx-copy", "%u", &feature_rx_copy); @@ -1646,6 +1881,13 @@ static int xennet_connect(struct net_device *dev) return -ENODEV; } + err = xenbus_scanf(XBT_NIL, np->xbdev->otherend, + "feature-persistent-grants", "%u", &val); + if (err != 1) + val = 0; + + np->persistent_gnt = !!val; + err = talk_to_netback(np->xbdev, np); if (err) return err; @@ -1657,9 +1899,24 @@ static int xennet_connect(struct net_device *dev) spin_lock_bh(&np->rx_lock); spin_lock_irq(&np->tx_lock); + np->tx_gnt_cnt = 0; + np->rx_gnt_cnt = 0; + /* Step 1: Discard all pending TX packet fragments. */ xennet_release_tx_bufs(np); + if (np->persistent_gnt) { + struct gnt_list *gnt_list_entry; + + while (np->rx_gnt_list) { + gnt_list_entry = np->rx_gnt_list; + np->rx_gnt_list = np->rx_gnt_list->tail; + gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL); + __free_page(gnt_list_entry->gnt_pages); + kfree(gnt_list_entry); + } + } + /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */ for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) { skb_frag_t *frag; @@ -1673,10 +1930,11 @@ static int xennet_connect(struct net_device *dev) frag = &skb_shinfo(skb)->frags[0]; page = skb_frag_page(frag); - gnttab_grant_foreign_access_ref( - ref, np->xbdev->otherend_id, - pfn_to_mfn(page_to_pfn(page)), - 0); + ret = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)), + page_address(page), requeue_idx, ref); + if ((signed short)ret < 0) + break; + req->gref = ref; req->id = requeue_idx; -- 1.7.3.4
Annie Li

[PATCH 4/4] fix code indent issue in xen-netfront.

Signed-off-by: Annie Li <annie.li@oracle.com>
---
 drivers/net/xen-netfront.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 17b81c0..66bb29f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -202,7 +202,7 @@ static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
 }

 static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
-					    RING_IDX ri)
+				     RING_IDX ri)
 {
 	int i = xennet_rxidx(ri);
 	grant_ref_t ref = np->grant_rx_ref[i];
@@ -1420,7 +1420,7 @@ static void xennet_uninit(struct net_device *dev)
 }

 static netdev_features_t xennet_fix_features(struct net_device *dev,
-	netdev_features_t features)
+					     netdev_features_t features)
 {
 	struct netfront_info *np = netdev_priv(dev);
 	int val;
@@ -1447,7 +1447,7 @@ static netdev_features_t xennet_fix_features(struct net_device *dev,
 }

 static int xennet_set_features(struct net_device *dev,
-	netdev_features_t features)
+			       netdev_features_t features)
 {
 	if (!(features & NETIF_F_SG) && dev->mtu > ETH_DATA_LEN) {
 		netdev_info(dev, "Reducing MTU because no SG offload");
--
1.7.3.4
Pasi Kärkkäinen
2012-Nov-15 07:40 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
Hello,

On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> This patch implements persistent grants for xen-netfront/netback. This
> mechanism maintains page pools in netback/netfront, these page pools is used to
> save grant pages which are mapped. This way improve performance which is wasted
> when doing grant operations.
>
> Current netback/netfront does map/unmap grant operations frequently when
> transmitting/receiving packets, and grant operations costs much cpu clock. In
> this patch, netfront/netback maps grant pages when needed and then saves them
> into a page pool for future use. All these pages will be unmapped when
> removing/releasing the net device.
>

Do you have performance numbers available already? with/without persistent grants?

> In netfront, two pools are maintained for transmitting and receiving packets.
> When new grant pages are needed, the driver gets grant pages from this pool
> first. If no free grant page exists, it allocates new page, maps it and then
> saves it into the pool. The pool size for transmit/receive is exactly tx/rx
> ring size. The driver uses memcpy(not grantcopy) to copy data grant pages.
> Here, memcpy is copying the whole page size data. I tried to copy len size data
> from offset, but network does not seem work well. I am trying to find the root
> cause now.
>
> In netback, it also maintains two page pools for tx/rx. When netback gets a
> request, it does a search first to find out whether the grant reference of
> this request is already mapped into its page pool. If the grant ref is mapped,
> the address of this mapped page is gotten and memcpy is used to copy data
> between grant pages. However, if the grant ref is not mapped, a new page is
> allocated, mapped with this grant ref, and then saved into page pool for
> future use. Similarly, memcpy replaces grant copy to copy data between grant
> pages. In this implementation, two arrays(gnttab_tx_vif,gnttab_rx_vif) are
> used to save vif pointer for every request because current netback is not
> per-vif based. This would be changed after implementing 1:1 model in netback.
>

Btw is xen-netback/xen-netfront multiqueue support something you're planning
to implement aswell? multiqueue allows single vif scaling to multiple
vcpus/cores.

Thanks,

-- Pasi

> This patch supports both persistent-grant and non persistent grant. A new
> xenstore key "feature-persistent-grants" is used to represent this feature.
>
> This patch is based on linux3.4-rc3. I hit netperf/netserver failure on
> linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure whether this
> netperf/netserver failure connects compound page commit in v3.7-rc1, but I did
> hit BUG_ON with debug patch from thread
> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html
>
>
> Annie Li (4):
>   xen/netback: implements persistent grant with one page pool.
>   xen/netback: Split one page pool into two(tx/rx) page pool.
>   Xen/netfront: Implement persistent grant in netfront.
>   fix code indent issue in xen-netfront.
>
>  drivers/net/xen-netback/common.h    |   24 ++-
>  drivers/net/xen-netback/interface.c |   26 +++
>  drivers/net/xen-netback/netback.c   |  215 ++++++++++++++++++--
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  drivers/net/xen-netfront.c          |  378 +++++++++++++++++++++++++++++------
>  5 files changed, 570 insertions(+), 87 deletions(-)
>
> --
> 1.7.3.4
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
ANNIE LI
2012-Nov-15 08:38 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> Hello,
>
> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>> This patch implements persistent grants for xen-netfront/netback. This
>> mechanism maintains page pools in netback/netfront, these page pools is used to
>> save grant pages which are mapped. This way improve performance which is wasted
>> when doing grant operations.
>>
>> Current netback/netfront does map/unmap grant operations frequently when
>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>> this patch, netfront/netback maps grant pages when needed and then saves them
>> into a page pool for future use. All these pages will be unmapped when
>> removing/releasing the net device.
>>
> Do you have performance numbers available already? with/without persistent grants?

I have some simple netperf/netserver test results with/without persistent
grants.

Following are the results with the persistent grant patch:

Guests, Sum, Avg, Min, Max
1, 15106.4, 15106.4, 15106.36, 15106.36
2, 13052.7, 6526.34, 6261.81, 6790.86
3, 12675.1, 6337.53, 6220.24, 6454.83
4, 13194, 6596.98, 6274.70, 6919.25

Following are the results without the persistent grant patch:

Guests, Sum, Avg, Min, Max
1, 10864.1, 10864.1, 10864.10, 10864.10
2, 10898.5, 5449.24, 4862.08, 6036.40
3, 10734.5, 5367.26, 5261.43, 5473.08
4, 10924, 5461.99, 5314.84, 5609.14

>> In netfront, two pools are maintained for transmitting and receiving packets.
>> When new grant pages are needed, the driver gets grant pages from this pool
>> first. If no free grant page exists, it allocates new page, maps it and then
>> saves it into the pool. The pool size for transmit/receive is exactly tx/rx
>> ring size. The driver uses memcpy(not grantcopy) to copy data grant pages.
>> Here, memcpy is copying the whole page size data. I tried to copy len size data
>> from offset, but network does not seem work well. I am trying to find the root
>> cause now.
>>
>> In netback, it also maintains two page pools for tx/rx. When netback gets a
>> request, it does a search first to find out whether the grant reference of
>> this request is already mapped into its page pool. If the grant ref is mapped,
>> the address of this mapped page is gotten and memcpy is used to copy data
>> between grant pages. However, if the grant ref is not mapped, a new page is
>> allocated, mapped with this grant ref, and then saved into page pool for
>> future use. Similarly, memcpy replaces grant copy to copy data between grant
>> pages. In this implementation, two arrays(gnttab_tx_vif,gnttab_rx_vif) are
>> used to save vif pointer for every request because current netback is not
>> per-vif based. This would be changed after implementing 1:1 model in netback.
>>
> Btw is xen-netback/xen-netfront multiqueue support something you're planning to implement aswell?

Currently, some patches exist for implementing the 1:1 model in netback, but
that should be different from what you mentioned, and they are not ready for
upstream. These patches make the netback thread per VIF, and mainly implement
some concepts from netchannel2, such as multipage rings, separate tx and rx
rings, separate tx and rx event channels, etc.

Thanks
Annie

> multiqueue allows single vif scaling to multiple vcpus/cores.
>
>
> Thanks,
>
> -- Pasi
>
>
>> This patch supports both persistent-grant and non persistent grant. A new
>> xenstore key "feature-persistent-grants" is used to represent this feature.
>>
>> This patch is based on linux3.4-rc3. I hit netperf/netserver failure on
>> linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure whether this
>> netperf/netserver failure connects compound page commit in v3.7-rc1, but I did
>> hit BUG_ON with debug patch from thread
>> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html
>>
>>
>> Annie Li (4):
>>   xen/netback: implements persistent grant with one page pool.
>>   xen/netback: Split one page pool into two(tx/rx) page pool.
>>   Xen/netfront: Implement persistent grant in netfront.
>>   fix code indent issue in xen-netfront.
>>
>>  drivers/net/xen-netback/common.h    |   24 ++-
>>  drivers/net/xen-netback/interface.c |   26 +++
>>  drivers/net/xen-netback/netback.c   |  215 ++++++++++++++++++--
>>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>>  drivers/net/xen-netfront.c          |  378 +++++++++++++++++++++++++++++------
>>  5 files changed, 570 insertions(+), 87 deletions(-)
>>
>> --
>> 1.7.3.4
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
Ian Campbell
2012-Nov-15 08:51 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, 2012-11-15 at 08:38 +0000, ANNIE LI wrote:
>
> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > Hello,
> >
> > On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >> This patch implements persistent grants for xen-netfront/netback. This
> >> mechanism maintains page pools in netback/netfront, these page pools is used to
> >> save grant pages which are mapped. This way improve performance which is wasted
> >> when doing grant operations.
> >>
> >> Current netback/netfront does map/unmap grant operations frequently when
> >> transmitting/receiving packets, and grant operations costs much cpu clock. In
> >> this patch, netfront/netback maps grant pages when needed and then saves them
> >> into a page pool for future use. All these pages will be unmapped when
> >> removing/releasing the net device.
> >>
> > Do you have performance numbers available already? with/without persistent grants?
> I have some simple netperf/netserver test result with/without persistent
> grants,
>
> Following is result of with persistent grant patch,
> Guests, Sum, Avg, Min, Max
> 1, 15106.4, 15106.4, 15106.36, 15106.36
> 2, 13052.7, 6526.34, 6261.81, 6790.86
> 3, 12675.1, 6337.53, 6220.24, 6454.83
> 4, 13194, 6596.98, 6274.70, 6919.25

Are these pairs of guests or individual ones?

I think the really interesting cases are when you get up to larger numbers
of guests, aren't they? ISTR that for blkio things got most interesting WRT
persistent grants at the dozens of guests stage. Do you have any numbers for
those?

Have you run any tests other than netperf?

Do you have numbers for a persistent capable backend with a non-persistent
frontend and vice versa?

>
>
> Following are result of without persistent patch
>
> Guests, Sum, Avg, Min, Max
> 1, 10864.1, 10864.1, 10864.10, 10864.10
> 2, 10898.5, 5449.24, 4862.08, 6036.40
> 3, 10734.5, 5367.26, 5261.43, 5473.08
> 4, 10924, 5461.99, 5314.84, 5609.14
Ian Campbell
2012-Nov-15 08:53 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>
> This patch is based on linux3.4-rc3. I hit netperf/netserver failure
> on linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure
> whether this netperf/netserver failure connects compound page commit in
> v3.7-rc1, but I did hit BUG_ON with debug patch from thread
> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html

Do you think you could cook up a netfront fix similar in principle to the
6a8ed462f16b8455eec5ae00eb6014159a6721f0 fix for netback?

Ian.
Ian Campbell
2012-Nov-15 08:56 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>
> This patch implements persistent grants for xen-netfront/netback. This
> mechanism maintains page pools in netback/netfront, these page pools
> is used to save grant pages which are mapped. This way improve
> performance which is wasted when doing grant operations.

Please can you send a patch against xen-unstable.hg to document this new
protocol variant in xen/include/public/io/netif.h. This header is a bit
under-documented, but let's not let it fall further behind (if you want to
go further and document the existing features, in the style of the current
blkif.h, then that would be awesome!).

You may also want to provide a similar patch to Linux's copy, which is in
linux/include/xen/interface/io/netif.h

Ian.
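For reference, the sort of netif.h note being requested could be as small as
the following; the wording here is only a sketch based on the cover letter's
description of the feature, not text taken from xen-unstable:

/*
 * "feature-persistent-grants" (frontend and backend, optional, bool)
 *
 *	Advertised by a frontend or backend that can keep grant
 *	references mapped for the lifetime of the connection.  When both
 *	ends advertise it, grants used on the rings are mapped once and
 *	reused ("persistent grants"); packet payload is then transferred
 *	by copying into/out of the persistently granted pages instead of
 *	by per-packet grant map/unmap or grant-copy operations.
 */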
ANNIE LI
2012-Nov-15 09:02 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-15 16:51, Ian Campbell wrote:
> On Thu, 2012-11-15 at 08:38 +0000, ANNIE LI wrote:
>> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>>> Hello,
>>>
>>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>>> This patch implements persistent grants for xen-netfront/netback. This
>>>> mechanism maintains page pools in netback/netfront, these page pools is used to
>>>> save grant pages which are mapped. This way improve performance which is wasted
>>>> when doing grant operations.
>>>>
>>>> Current netback/netfront does map/unmap grant operations frequently when
>>>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>>>> this patch, netfront/netback maps grant pages when needed and then saves them
>>>> into a page pool for future use. All these pages will be unmapped when
>>>> removing/releasing the net device.
>>>>
>>> Do you have performance numbers available already? with/without persistent grants?
>> I have some simple netperf/netserver test result with/without persistent
>> grants,
>>
>> Following is result of with persistent grant patch,
>> Guests, Sum, Avg, Min, Max
>> 1, 15106.4, 15106.4, 15106.36, 15106.36
>> 2, 13052.7, 6526.34, 6261.81, 6790.86
>> 3, 12675.1, 6337.53, 6220.24, 6454.83
>> 4, 13194, 6596.98, 6274.70, 6919.25
> Are these pairs of guests or individual ones?

They are pairs of guests.

>
> I think the really interesting cases are when you get up to larger
> numbers of guests, aren't they?

Right.

> ISTR that for blkio things got most
> interesting WRT persistent grants at the dozens of guests stage. Do you
> have any numbers for those?

No, I will run more tests with more guests.

>
> Have you run any tests other than netperf?

No, I didn't.

>
> Do you have numbers for a persistent capable backend with a
> non-persistent frontend and vice versa?

I did it, but that test also only ran with 4 guests; I will test with more
guests.

Thanks
Annie

>
>
>>
>> Following are result of without persistent patch
>>
>> Guests, Sum, Avg, Min, Max
>> 1, 10864.1, 10864.1, 10864.10, 10864.10
>> 2, 10898.5, 5449.24, 4862.08, 6036.40
>> 3, 10734.5, 5367.26, 5261.43, 5473.08
>> 4, 10924, 5461.99, 5314.84, 5609.14
>
Ian Campbell
2012-Nov-15 09:10 UTC
Re: [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
> This patch implements persistent grant in netback driver. Tx and rx
> share the same page pool, this pool will be split into two parts
> in next patch.
>
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netback/common.h    |   18 +++-
>  drivers/net/xen-netback/interface.c |   22 ++++
>  drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  4 files changed, 239 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 94b79c3..a85cac6 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,8 +45,19 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
>
> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \

BLOCK?

> +	(XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
> +
>  struct xen_netbk;
>
> +struct persistent_entry {
> +	grant_ref_t forgranted;
> +	struct page *fpage;
> +	struct gnttab_map_grant_ref map;
> +};

Isn't this duplicating a bunch of infrastructure which is also in blkback?
Can we put it into some common helpers please?

> +
>  struct xenvif {
>  	/* Unique identifier for this interface. */
>  	domid_t domid;
> @@ -75,6 +86,7 @@ struct xenvif {
>
>  	/* Internal feature information. */
>  	u8 can_queue:1;	    /* can queue packets for receiver? */
> +	u8 persistent_grant:1;
>
>  	/*
>  	 * Allow xenvif_start_xmit() to peek ahead in the rx request
> @@ -98,6 +110,9 @@ struct xenvif {
>  	struct net_device *dev;
>
>  	wait_queue_head_t waiting_to_free;
> +
> +	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];

What is the per-vif memory overhead of this array?

> +static struct persistent_entry*
> +get_per_gnt(struct persistent_entry **pers_entry,
> +	    unsigned int count, grant_ref_t gref)
> +{
> +	int i;
> +
> +	for (i = 0; i < count; i++)
> +		if (gref == pers_entry[i]->forgranted)
> +			return pers_entry[i];

Isn't this linear scan rather expensive? I think Roger implemented some sort
of hash lookup for blkback which I think is required here too (and should be
easy if you make that code common).

> +
> +	return NULL;
> +}
> +

> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>  		gop->source.domid = vif->domid;
>  		gop->source.offset = txreq.offset;
>
> -		gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +		if (!vif->persistent_grant)
> +			gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +		else
> +			gop->dest.u.gmfn = (unsigned long)page_address(page);

page_address doesn't return any sort of frame number, does it? This is
rather confusing...

> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>  		val = 0;
>  	vif->csum = !val;
>
> -	/* Map the shared frame, irq etc. */
> +	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
> +			 "%u", &val) < 0)
> +		val = 0;
> +	vif->persistent_grant = !!val;
> +
> +/* Map the shared frame, irq etc. */

Please run the patches through checkpatch.pl

>  	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>  	if (err) {
>  		xenbus_dev_fatal(dev, err,
> --
> 1.7.3.4
>
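For comparison, the blkback persistent-grant work referred to above keeps its
mapped grants in a red-black tree keyed on the grant reference, which makes
the lookup O(log n) rather than a linear scan. A minimal sketch of that kind
of lookup is below; the type and function names are illustrative and are not
taken from either driver:

#include <linux/rbtree.h>
#include <xen/grant_table.h>

struct persistent_gnt {
	grant_ref_t gref;
	struct page *page;
	struct rb_node node;
};

/* Find an already-mapped grant by reference, or NULL if not present. */
static struct persistent_gnt *persistent_gnt_find(struct rb_root *root,
						  grant_ref_t gref)
{
	struct rb_node *n = root->rb_node;

	while (n) {
		struct persistent_gnt *this =
			rb_entry(n, struct persistent_gnt, node);

		if (gref < this->gref)
			n = n->rb_left;
		else if (gref > this->gref)
			n = n->rb_right;
		else
			return this;
	}
	return NULL;
}

/* Insert a newly mapped grant, keeping the tree ordered by gref. */
static void persistent_gnt_insert(struct rb_root *root,
				  struct persistent_gnt *gnt)
{
	struct rb_node **p = &root->rb_node, *parent = NULL;

	while (*p) {
		struct persistent_gnt *this;

		parent = *p;
		this = rb_entry(parent, struct persistent_gnt, node);
		if (gnt->gref < this->gref)
			p = &(*p)->rb_left;
		else
			p = &(*p)->rb_right;
	}
	rb_link_node(&gnt->node, parent, p);
	rb_insert_color(&gnt->node, root);
}

Turning this bookkeeping into a small set of common helpers, as suggested,
would let netback and blkback share both the tree and the map/unmap handling.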
Ian Campbell
2012-Nov-15 09:15 UTC
Re: [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) page pool.
On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
> For tx path, this implementation simplifies the work of searching out
> grant page from page pool based on grant reference.

It's still a linear search though, and it doesn't look much simpler to me:

        for (i = 0; i < count; i++) {
                if (tx_pool)
                        vif = netbk->gnttab_tx_vif[i];
                else
                        vif = netbk->gnttab_rx_vif[i];

                pers_entry = vif->persistent_gnt;
                gnt_count = &vif->persistent_gntcnt;
                gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;

becomes:

        for (i = 0; i < count; i++) {
                if (tx_pool) {
                        vif = netbk->gnttab_tx_vif[i];
                        gnt_count = &vif->persistent_tx_gntcnt;
                        gnt_total = XEN_NETIF_TX_RING_SIZE;
                        pers_entry = vif->persistent_tx_gnt;
                } else {
                        vif = netbk->gnttab_rx_vif[i];
                        gnt_count = &vif->persistent_rx_gntcnt;
                        gnt_total = 2*XEN_NETIF_RX_RING_SIZE;
                        pers_entry = vif->persistent_rx_gnt;
                }

> @@ -111,8 +109,16 @@ struct xenvif {
>
> wait_queue_head_t waiting_to_free;
>
> - struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
> - unsigned int persistent_gntcnt;
> + struct persistent_entry *persistent_tx_gnt[XEN_NETIF_TX_RING_SIZE];
> +
> + /*
> + * 2*XEN_NETIF_RX_RING_SIZE is for the case of each head/fragment page

Shouldn't that be incorporated into MAXIMUM_OUTSTANDING_BLOCK_REQS (sic) too?

> + * using 2 copy operations.
> + */
> + struct persistent_entry *persistent_rx_gnt[2*XEN_NETIF_RX_RING_SIZE];

What is the per-vif memory overhead after this change?

Ian.
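One way to shrink that duplication (a sketch only; the descriptor struct and helper are illustrative, not part of the posted patches) is to select the pool once and let the rest of the copy loop stay identical for tx and rx:

	struct gnt_pool {
		struct persistent_entry **entries;  /* per-vif pointer array */
		unsigned int *count;                /* entries currently in use */
		unsigned int size;                  /* capacity from the ring size */
	};

	static struct gnt_pool xenvif_pick_pool(struct xenvif *vif, bool tx_pool)
	{
		struct gnt_pool pool;

		if (tx_pool) {
			pool.entries = vif->persistent_tx_gnt;
			pool.count = &vif->persistent_tx_gntcnt;
			pool.size = XEN_NETIF_TX_RING_SIZE;
		} else {
			pool.entries = vif->persistent_rx_gnt;
			pool.count = &vif->persistent_rx_gntcnt;
			pool.size = 2*XEN_NETIF_RX_RING_SIZE;
		}
		return pool;
	}

The loop body then only deals with a struct gnt_pool, regardless of direction.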
Wei Liu
2012-Nov-15 09:35 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com> wrote:> > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote: > >> Hello, >> >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: >> >>> This patch implements persistent grants for xen-netfront/netback. This >>> mechanism maintains page pools in netback/netfront, these page pools is >>> used to >>> save grant pages which are mapped. This way improve performance which is >>> wasted >>> when doing grant operations. >>> >>> Current netback/netfront does map/unmap grant operations frequently when >>> transmitting/receiving packets, and grant operations costs much cpu >>> clock. In >>> this patch, netfront/netback maps grant pages when needed and then saves >>> them >>> into a page pool for future use. All these pages will be unmapped when >>> removing/releasing the net device. >>> >>> Do you have performance numbers available already? with/without >> persistent grants? >> > I have some simple netperf/netserver test result with/without persistent > grants, > > Following is result of with persistent grant patch, > > Guests, Sum, Avg, Min, Max > 1, 15106.4, 15106.4, 15106.36, 15106.36 > 2, 13052.7, 6526.34, 6261.81, 6790.86 > 3, 12675.1, 6337.53, 6220.24, 6454.83 > 4, 13194, 6596.98, 6274.70, 6919.25 > > > Following are result of without persistent patch > > Guests, Sum, Avg, Min, Max > 1, 10864.1, 10864.1, 10864.10, 10864.10 > 2, 10898.5, 5449.24, 4862.08, 6036.40 > 3, 10734.5, 5367.26, 5261.43, 5473.08 > 4, 10924, 5461.99, 5314.84, 5609.14 > > >Interesting results. Have you tested how good it is on a 10G nic, i.e. guest sending packets through physical network to another host. Wei. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Roger Pau Monné
2012-Nov-15 09:57 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On 15/11/12 08:04, Annie Li wrote:> This patch implements persistent grant in netback driver. Tx and rx > share the same page pool, this pool will be split into two parts > in next patch. > > Signed-off-by: Annie Li <annie.li@oracle.com> > --- > drivers/net/xen-netback/common.h | 18 +++- > drivers/net/xen-netback/interface.c | 22 ++++ > drivers/net/xen-netback/netback.c | 212 +++++++++++++++++++++++++++++++---- > drivers/net/xen-netback/xenbus.c | 14 ++- > 4 files changed, 239 insertions(+), 27 deletions(-) > > diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h > index 94b79c3..a85cac6 100644 > --- a/drivers/net/xen-netback/common.h > +++ b/drivers/net/xen-netback/common.h > @@ -45,8 +45,19 @@ > #include <xen/grant_table.h> > #include <xen/xenbus.h> > > +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) > +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) > +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \ > + (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE) > + > struct xen_netbk; > > +struct persistent_entry { > + grant_ref_t forgranted; > + struct page *fpage; > + struct gnttab_map_grant_ref map; > +};This should be common with the blkback implementation, I think we should move some structures/functions from blkback to a common place. When I implementated some functions in blkback I though they could be reused by other backends that wanted to use persistent grants, so I keep them free of blkback specific structures.> struct xenvif { > /* Unique identifier for this interface. */ > domid_t domid; > @@ -75,6 +86,7 @@ struct xenvif { > > /* Internal feature information. */ > u8 can_queue:1; /* can queue packets for receiver? */ > + u8 persistent_grant:1; > > /* > * Allow xenvif_start_xmit() to peek ahead in the rx request > @@ -98,6 +110,9 @@ struct xenvif { > struct net_device *dev; > > wait_queue_head_t waiting_to_free; > + > + struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS]; > + unsigned int persistent_gntcnt;This should be a red-black tree, which has the property of a search time <= O(log n), using an array is more expensive in terms of memory and has a worse search time O(n), this is specially interesting for netback, which can have twice as much persistent grants as blkback (because two rings are used). Take a look at the following functions from blkback; foreach_grant, add_persistent_gnt and get_persistent_gnt. 
They are generic functions to deal with persistent grants.> }; > > static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) > @@ -105,9 +120,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) > return to_xenbus_device(vif->dev->dev.parent); > } > > -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) > -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) > - > struct xenvif *xenvif_alloc(struct device *parent, > domid_t domid, > unsigned int handle); > diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c > index b7d41f8..226d159 100644 > --- a/drivers/net/xen-netback/interface.c > +++ b/drivers/net/xen-netback/interface.c > @@ -300,6 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, > return ERR_PTR(err); > } > > + vif->persistent_gntcnt = 0; > + > netdev_dbg(dev, "Successfully created xenvif\n"); > return vif; > } > @@ -343,6 +345,23 @@ err: > return err; > } > > +void xenvif_free_grants(struct persistent_entry **pers_entry, > + unsigned int count) > +{ > + int i, ret; > + struct gnttab_unmap_grant_ref unmap; > + > + for (i = 0; i < count; i++) { > + gnttab_set_unmap_op(&unmap, > + (unsigned long)page_to_pfn(pers_entry[i]->fpage), > + GNTMAP_host_map, > + pers_entry[i]->map.handle); > + ret = gnttab_unmap_refs(&unmap, &pers_entry[i]->fpage, > + 1, false);This is not correct, you should call gnttab_set_unmap_op on a batch of grants (up to BLKIF_MAX_SEGMENTS_PER_REQUEST), and then call gnttab_unmap_refs on all of them. Here is a simple example (take a look at blkback.c function xen_blkif_schedule to see an example with a red-black tree, I think this part of the code should also be made common): for (i = 0, segs_to_unmap = 0; i < count; i++) { gnttab_set_unmap_op(&unmap[segs_to_unmap], (unsigned long)page_to_pfn(pers_entry[i]->fpage), GNTMAP_host_map, pers_entry[i]->map.handle); pages[segs_to_unmap] (unsigned long)page_to_pfn(pers_entry[i]->fpage); if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST || (i + 1) == count) { ret = gnttab_unmap_refs(unmap, NULL, pages, segs_to_unmap); BUG_ON(ret); segs_to_unmap == 0; } }> + BUG_ON(ret); > + } > +} > + > void xenvif_disconnect(struct xenvif *vif) > { > struct net_device *dev = vif->dev; > @@ -366,6 +385,9 @@ void xenvif_disconnect(struct xenvif *vif) > unregister_netdev(vif->dev); > > xen_netbk_unmap_frontend_rings(vif); > + if (vif->persistent_grant) > + xenvif_free_grants(vif->persistent_gnt, > + vif->persistent_gntcnt); > > free_netdev(vif->dev); > } > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c > index 2596401..a26d3fc 100644 > --- a/drivers/net/xen-netback/netback.c > +++ b/drivers/net/xen-netback/netback.c > @@ -80,6 +80,8 @@ union page_ext { > void *mapping; > }; > > +struct xenvif; > + > struct xen_netbk { > wait_queue_head_t wq; > struct task_struct *task; > @@ -102,6 +104,7 @@ struct xen_netbk { > > struct pending_tx_info pending_tx_info[MAX_PENDING_REQS]; > struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; > + struct xenvif *gnttab_tx_vif[MAX_PENDING_REQS]; > > u16 pending_ring[MAX_PENDING_REQS]; > > @@ -111,12 +114,139 @@ struct xen_netbk { > * straddles two buffers in the frontend. 
> */ > struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE]; > + struct xenvif *gnttab_rx_vif[2*XEN_NETIF_RX_RING_SIZE]; > struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE]; > }; > > static struct xen_netbk *xen_netbk; > static int xen_netbk_group_nr; > > +static struct persistent_entry* > +get_per_gnt(struct persistent_entry **pers_entry, > + unsigned int count, grant_ref_t gref) > +{ > + int i; > + > + for (i = 0; i < count; i++) > + if (gref == pers_entry[i]->forgranted) > + return pers_entry[i]; > + > + return NULL; > +}This should be replaced with common code shared with all persistent backends implementations.> + > +static void* > +map_new_gnt(struct persistent_entry **pers_entry, unsigned int count, > + grant_ref_t ref, domid_t domid) > +{ > + struct gnttab_map_grant_ref *map; > + struct page *page; > + unsigned long vaddr; > + unsigned long pfn; > + uint32_t flags; > + int ret = 0; > + > + pers_entry[count] = (struct persistent_entry *) > + kmalloc(sizeof(struct persistent_entry), > + GFP_KERNEL); > + if (!pers_entry[count]) > + return ERR_PTR(-ENOMEM); > + > + map = &pers_entry[count]->map; > + page = alloc_page(GFP_KERNEL); > + if (!page) { > + kfree(pers_entry[count]); > + pers_entry[count] = NULL; > + return ERR_PTR(-ENOMEM); > + } > + > + pers_entry[count]->fpage = page; > + pfn = page_to_pfn(page); > + vaddr = (unsigned long)pfn_to_kaddr(pfn); > + flags = GNTMAP_host_map; > + > + gnttab_set_map_op(map, vaddr, flags, ref, domid); > + ret = gnttab_map_refs(map, NULL, &page, 1); > + BUG_ON(ret);This is highly inefficient, one of the points of using gnttab_set_map_op is that you can queue a bunch of grants, and then map them at the same time using gnttab_map_refs, but here you are using it to map a single grant at a time. You should instead see how much grants you need to map to complete the request and map them all at the same time.> + > + pers_entry[count]->forgranted = ref; > + > + return page_address(page); > +} > + > +static void* > +get_ref_page(struct persistent_entry **pers_entry, unsigned int *count, > + grant_ref_t ref, domid_t domid, unsigned int total) > +{ > + struct persistent_entry *per_gnt; > + void *vaddr; > + > + per_gnt = get_per_gnt(pers_entry, *count, ref); > + > + if (per_gnt != NULL) > + return page_address(per_gnt->fpage); > + else { > + BUG_ON(*count >= total); > + vaddr = map_new_gnt(pers_entry, *count, ref, domid); > + if (IS_ERR_OR_NULL(vaddr)) > + return vaddr; > + *count += 1; > + return vaddr; > + } > +} > + > +static int > +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count, > + struct xen_netbk *netbk, bool tx_pool) > +{ > + int i; > + struct xenvif *vif; > + struct gnttab_copy *uop = vuop; > + unsigned int *gnt_count; > + unsigned int gnt_total; > + struct persistent_entry **pers_entry; > + int ret = 0; > + > + BUG_ON(cmd != GNTTABOP_copy); > + for (i = 0; i < count; i++) { > + if (tx_pool) > + vif = netbk->gnttab_tx_vif[i]; > + else > + vif = netbk->gnttab_rx_vif[i]; > + > + pers_entry = vif->persistent_gnt; > + gnt_count = &vif->persistent_gntcnt; > + gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS; > + > + if (vif->persistent_grant) { > + void *saddr, *daddr; > + > + saddr = uop[i].source.domid == DOMID_SELF ? > + (void *) uop[i].source.u.gmfn : > + get_ref_page(pers_entry, gnt_count, > + uop[i].source.u.ref, > + uop[i].source.domid, > + gnt_total); > + if (IS_ERR_OR_NULL(saddr)) > + return -ENOMEM; > + > + daddr = uop[i].dest.domid == DOMID_SELF ? 
> + (void *) uop[i].dest.u.gmfn : > + get_ref_page(pers_entry, gnt_count, > + uop[i].dest.u.ref, > + uop[i].dest.domid, > + gnt_total); > + if (IS_ERR_OR_NULL(daddr)) > + return -ENOMEM; > + > + memcpy(daddr+uop[i].dest.offset, > + saddr+uop[i].source.offset, uop[i].len); > + } else > + ret = HYPERVISOR_grant_table_op(cmd, &uop[i], 1); > + } > + > + return ret; > +} > + > void xen_netbk_add_xenvif(struct xenvif *vif) > { > int i; > @@ -387,13 +517,15 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif, > * Set up the grant operations for this fragment. If it''s a flipping > * interface, we also set up the unmap request from here. > */ > -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, > - struct netrx_pending_operations *npo, > - struct page *page, unsigned long size, > - unsigned long offset, int *head) > +static int netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, > + struct netrx_pending_operations *npo, > + struct page *page, unsigned long size, > + unsigned long offset, int *head, > + struct xenvif **grxvif) > { > struct gnttab_copy *copy_gop; > struct netbk_rx_meta *meta; > + int count = 0; > /* > * These variables are used iff get_page_ext returns true, > * in which case they are guaranteed to be initialized. > @@ -425,6 +557,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, > bytes = MAX_BUFFER_OFFSET - npo->copy_off; > > copy_gop = npo->copy + npo->copy_prod++; > + *grxvif = vif; > + grxvif++; > + count++; > + > copy_gop->flags = GNTCOPY_dest_gref; > if (foreign) { > struct xen_netbk *netbk = &xen_netbk[group]; > @@ -438,7 +574,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, > } else { > void *vaddr = page_address(page); > copy_gop->source.domid = DOMID_SELF; > - copy_gop->source.u.gmfn = virt_to_mfn(vaddr); > + if (!vif->persistent_grant) > + copy_gop->source.u.gmfn = virt_to_mfn(vaddr); > + else > + copy_gop->source.u.gmfn = (unsigned long)vaddr; > } > copy_gop->source.offset = offset; > copy_gop->dest.domid = vif->domid; > @@ -460,6 +599,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, > *head = 0; /* There must be something in this buffer now. */ > > } > + return count; > } > > /* > @@ -474,8 +614,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, > * zero GSO descriptors (for non-GSO packets) or one descriptor (for > * frontend-side LRO). 
> */ > -static int netbk_gop_skb(struct sk_buff *skb, > - struct netrx_pending_operations *npo) > +static int netbk_gop_skb(struct xen_netbk *netbk, > + struct sk_buff *skb, > + struct netrx_pending_operations *npo, > + struct xenvif **grxvif) > { > struct xenvif *vif = netdev_priv(skb->dev); > int nr_frags = skb_shinfo(skb)->nr_frags; > @@ -518,17 +660,19 @@ static int netbk_gop_skb(struct sk_buff *skb, > if (data + len > skb_tail_pointer(skb)) > len = skb_tail_pointer(skb) - data; > > - netbk_gop_frag_copy(vif, skb, npo, > - virt_to_page(data), len, offset, &head); > + grxvif += netbk_gop_frag_copy(vif, skb, npo, > + virt_to_page(data), len, > + offset, &head, grxvif); > + > data += len; > } > > for (i = 0; i < nr_frags; i++) { > - netbk_gop_frag_copy(vif, skb, npo, > - skb_frag_page(&skb_shinfo(skb)->frags[i]), > - skb_frag_size(&skb_shinfo(skb)->frags[i]), > - skb_shinfo(skb)->frags[i].page_offset, > - &head); > + grxvif += netbk_gop_frag_copy(vif, skb, npo, > + skb_frag_page(&skb_shinfo(skb)->frags[i]), > + skb_frag_size(&skb_shinfo(skb)->frags[i]), > + skb_shinfo(skb)->frags[i].page_offset, > + &head, grxvif); > } > > return npo->meta_prod - old_meta_prod; > @@ -593,6 +737,8 @@ struct skb_cb_overlay { > static void xen_netbk_rx_action(struct xen_netbk *netbk) > { > struct xenvif *vif = NULL, *tmp; > + struct xenvif **grxvif = netbk->gnttab_rx_vif; > + int old_copy_prod = 0; > s8 status; > u16 irq, flags; > struct xen_netif_rx_response *resp; > @@ -619,7 +765,9 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) > nr_frags = skb_shinfo(skb)->nr_frags; > > sco = (struct skb_cb_overlay *)skb->cb; > - sco->meta_slots_used = netbk_gop_skb(skb, &npo); > + sco->meta_slots_used = netbk_gop_skb(netbk, skb, &npo, grxvif); > + grxvif += npo.copy_prod - old_copy_prod; > + old_copy_prod = npo.copy_prod; > > count += nr_frags + 1; > > @@ -636,8 +784,10 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) > return; > > BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op)); > - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, &netbk->grant_copy_op, > - npo.copy_prod); > + ret = grant_memory_copy_op(GNTTABOP_copy, > + &netbk->grant_copy_op, > + npo.copy_prod, netbk, > + false); > BUG_ON(ret != 0); > > while ((skb = __skb_dequeue(&rxq)) != NULL) { > @@ -918,7 +1068,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, > struct xenvif *vif, > struct sk_buff *skb, > struct xen_netif_tx_request *txp, > - struct gnttab_copy *gop) > + struct gnttab_copy *gop, > + struct xenvif **gtxvif) > { > struct skb_shared_info *shinfo = skb_shinfo(skb); > skb_frag_t *frags = shinfo->frags; > @@ -944,7 +1095,11 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, > gop->source.domid = vif->domid; > gop->source.offset = txp->offset; > > - gop->dest.u.gmfn = virt_to_mfn(page_address(page)); > + if (!vif->persistent_grant) > + gop->dest.u.gmfn = virt_to_mfn(page_address(page)); > + else > + gop->dest.u.gmfn = (unsigned long)page_address(page); > + > gop->dest.domid = DOMID_SELF; > gop->dest.offset = txp->offset; > > @@ -952,6 +1107,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, > gop->flags = GNTCOPY_source_gref; > > gop++; > + *gtxvif = vif; > + gtxvif++; > + > > memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp)); > xenvif_get(vif); > @@ -1218,6 +1376,7 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) > static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) > { > struct gnttab_copy 
*gop = netbk->tx_copy_ops, *request_gop; > + struct xenvif **gtxvif = netbk->gnttab_tx_vif; > struct sk_buff *skb; > int ret; > > @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) > gop->source.domid = vif->domid; > gop->source.offset = txreq.offset; > > - gop->dest.u.gmfn = virt_to_mfn(page_address(page)); > + if (!vif->persistent_grant) > + gop->dest.u.gmfn = virt_to_mfn(page_address(page)); > + else > + gop->dest.u.gmfn = (unsigned long)page_address(page); > + > gop->dest.domid = DOMID_SELF; > gop->dest.offset = txreq.offset; > > @@ -1346,6 +1509,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) > gop->flags = GNTCOPY_source_gref; > > gop++; > + *gtxvif++ = vif; > > memcpy(&netbk->pending_tx_info[pending_idx].req, > &txreq, sizeof(txreq)); > @@ -1369,12 +1533,13 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) > netbk->pending_cons++; > > request_gop = xen_netbk_get_requests(netbk, vif, > - skb, txfrags, gop); > + skb, txfrags, gop, gtxvif); > if (request_gop == NULL) { > kfree_skb(skb); > netbk_tx_err(vif, &txreq, idx); > continue; > } > + gtxvif += request_gop - gop; > gop = request_gop; > > vif->tx.req_cons = idx; > @@ -1467,8 +1632,9 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk) > > if (nr_gops == 0) > return; > - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, > - netbk->tx_copy_ops, nr_gops); > + ret = grant_memory_copy_op(GNTTABOP_copy, > + netbk->tx_copy_ops, nr_gops, > + netbk, true); > BUG_ON(ret); > > xen_netbk_tx_submit(netbk); > diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c > index 410018c..938e908 100644 > --- a/drivers/net/xen-netback/xenbus.c > +++ b/drivers/net/xen-netback/xenbus.c > @@ -114,6 +114,13 @@ static int netback_probe(struct xenbus_device *dev, > goto abort_transaction; > } > > + err = xenbus_printf(xbt, dev->nodename, > + "feature-persistent-grants", "%u", 1); > + if (err) { > + message = "writing feature-persistent-grants"; > + goto abort_transaction; > + } > + > err = xenbus_transaction_end(xbt, 0); > } while (err == -EAGAIN); > > @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be) > val = 0; > vif->csum = !val; > > - /* Map the shared frame, irq etc. */ > + if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants", > + "%u", &val) < 0)In block devices "feature-persistent" is used, so I think that for clearness it should be announced the same way in net.> + val = 0; > + vif->persistent_grant = !!val; > + > +/* Map the shared frame, irq etc. */ > err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn); > if (err) { > xenbus_dev_fatal(dev, err, > -- > 1.7.3.4 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel >
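For reference, a minimal sketch of the red-black-tree pool being suggested, loosely modelled on the blkback code referenced above; the structure layout and function names here are illustrative rather than the exact blkback ones:

	#include <linux/rbtree.h>
	#include <xen/grant_table.h>

	struct persistent_gnt {
		struct page *page;
		grant_ref_t gnt;
		grant_handle_t handle;
		struct rb_node node;
	};

	/* O(log n) lookup keyed on the grant reference. */
	static struct persistent_gnt *get_persistent_gnt(struct rb_root *root,
							 grant_ref_t gref)
	{
		struct rb_node *n = root->rb_node;

		while (n) {
			struct persistent_gnt *this =
				rb_entry(n, struct persistent_gnt, node);

			if (gref < this->gnt)
				n = n->rb_left;
			else if (gref > this->gnt)
				n = n->rb_right;
			else
				return this;
		}
		return NULL;
	}

	/* Insert a newly mapped grant; caller ensures gref is not present yet. */
	static void add_persistent_gnt(struct rb_root *root,
				       struct persistent_gnt *persistent_gnt)
	{
		struct rb_node **new = &root->rb_node, *parent = NULL;

		while (*new) {
			struct persistent_gnt *this =
				rb_entry(*new, struct persistent_gnt, node);

			parent = *new;
			if (persistent_gnt->gnt < this->gnt)
				new = &(*new)->rb_left;
			else
				new = &(*new)->rb_right;
		}
		rb_link_node(&persistent_gnt->node, parent, new);
		rb_insert_color(&persistent_gnt->node, root);
	}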
Roger Pau Monné
2012-Nov-15 10:52 UTC
Re: [Xen-devel] [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
On 15/11/12 08:05, Annie Li wrote:> Tx/rx page pool are maintained. New grant is mapped and put into > pool, unmap only happens when releasing/removing device. > > Signed-off-by: Annie Li <annie.li@oracle.com> > --- > drivers/net/xen-netfront.c | 372 +++++++++++++++++++++++++++++++++++++------- > 1 files changed, 315 insertions(+), 57 deletions(-) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index 0ebbb19..17b81c0 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -79,6 +79,13 @@ struct netfront_stats { > struct u64_stats_sync syncp; > }; > > +struct gnt_list { > + grant_ref_t gref; > + struct page *gnt_pages; > + void *gnt_target; > + struct gnt_list *tail; > +};This could also be shared with blkfront.> + > struct netfront_info { > struct list_head list; > struct net_device *netdev; > @@ -109,6 +116,10 @@ struct netfront_info { > grant_ref_t grant_tx_ref[NET_TX_RING_SIZE]; > unsigned tx_skb_freelist; > > + struct gnt_list *tx_grant[NET_TX_RING_SIZE]; > + struct gnt_list *tx_gnt_list; > + unsigned int tx_gnt_cnt;I don''t understand this, why do you need both an array and a list? I''m not familiar with net code, so I don''t know if this is some kind of special netfront thing? Anyway if you have to use a list I would recommend using one of the list constructions that''s already in the kernel, it simplifies the code and makes it more easy to understand than creating your own list structure.> + > spinlock_t rx_lock ____cacheline_aligned_in_smp; > struct xen_netif_rx_front_ring rx; > int rx_ring_ref; > @@ -126,6 +137,10 @@ struct netfront_info { > grant_ref_t gref_rx_head; > grant_ref_t grant_rx_ref[NET_RX_RING_SIZE]; > > + struct gnt_list *rx_grant[NET_RX_RING_SIZE]; > + struct gnt_list *rx_gnt_list; > + unsigned int rx_gnt_cnt;Same comment above here.> + > unsigned long rx_pfn_array[NET_RX_RING_SIZE]; > struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1]; > struct mmu_update rx_mmu[NET_RX_RING_SIZE]; > @@ -134,6 +149,7 @@ struct netfront_info { > struct netfront_stats __percpu *stats; > > unsigned long rx_gso_checksum_fixup; > + u8 persistent_gnt:1; > }; > > struct netfront_rx_info { > @@ -194,6 +210,16 @@ static grant_ref_t xennet_get_rx_ref(struct netfront_info *np, > return ref; > } > > +static struct gnt_list *xennet_get_rx_grant(struct netfront_info *np, > + RING_IDX ri) > +{ > + int i = xennet_rxidx(ri); > + struct gnt_list *gntlist = np->rx_grant[i]; > + np->rx_grant[i] = NULL;Ok, I think I get why do you need both an array and a list, is that because netfront doesn''t have some kind of shadow ring to keep track of issued requests? So each issued request has an associated gnt_list with the list of used grants? 
If so it would be good to add a comment about it.> + return gntlist; > +} > + > #ifdef CONFIG_SYSFS > static int xennet_sysfs_addif(struct net_device *netdev); > static void xennet_sysfs_delif(struct net_device *netdev); > @@ -231,6 +257,68 @@ static void xennet_maybe_wake_tx(struct net_device *dev) > netif_wake_queue(dev); > } > > +static grant_ref_t xennet_alloc_rx_ref(struct net_device *dev, > + unsigned long mfn, void *vaddr, > + unsigned int id, > + grant_ref_t ref) > +{ > + struct netfront_info *np = netdev_priv(dev); > + grant_ref_t gnt_ref; > + struct gnt_list *gnt_list_entry; > + > + if (np->persistent_gnt && np->rx_gnt_cnt) { > + gnt_list_entry = np->rx_gnt_list; > + np->rx_gnt_list = np->rx_gnt_list->tail; > + np->rx_gnt_cnt--; > + > + gnt_list_entry->gnt_target = vaddr; > + gnt_ref = gnt_list_entry->gref; > + np->rx_grant[id] = gnt_list_entry; > + } else { > + struct page *page; > + > + BUG_ON(!np->persistent_gnt && np->rx_gnt_cnt); > + if (!ref) > + gnt_ref > + gnttab_claim_grant_reference(&np->gref_rx_head); > + else > + gnt_ref = ref; > + BUG_ON((signed short)gnt_ref < 0); > + > + if (np->persistent_gnt) {So you are only using persistent grants if the backend also supports them. Have you benchmarked the performance of a persistent frontend with a non-persistent backend. In the block case, usign a persistent frontend with a non-persistent backend let to an overall performance improvement, so blkfront uses persistent grants even if blkback doesn''t support them. Take a look at the following graph: http://xenbits.xen.org/people/royger/persistent_grants/nonpers_read.png> + page = alloc_page(GFP_KERNEL); > + if (!page) { > + if (!ref) > + gnttab_release_grant_reference( > + &np->gref_rx_head, ref); > + return -ENOMEM; > + } > + mfn = pfn_to_mfn(page_to_pfn(page)); > + > + gnt_list_entry = kmalloc(sizeof(struct gnt_list), > + GFP_KERNEL); > + if (!gnt_list_entry) { > + __free_page(page); > + if (!ref) > + gnttab_release_grant_reference( > + &np->gref_rx_head, ref); > + return -ENOMEM; > + } > + gnt_list_entry->gref = gnt_ref; > + gnt_list_entry->gnt_pages = page; > + gnt_list_entry->gnt_target = vaddr; > + > + np->rx_grant[id] = gnt_list_entry; > + } > + > + gnttab_grant_foreign_access_ref(gnt_ref, np->xbdev->otherend_id, > + mfn, 0); > + } > + np->grant_rx_ref[id] = gnt_ref; > + > + return gnt_ref; > +} > + > static void xennet_alloc_rx_buffers(struct net_device *dev) > { > unsigned short id; > @@ -240,8 +328,6 @@ static void xennet_alloc_rx_buffers(struct net_device *dev) > int i, batch_target, notify; > RING_IDX req_prod = np->rx.req_prod_pvt; > grant_ref_t ref; > - unsigned long pfn; > - void *vaddr; > struct xen_netif_rx_request *req; > > if (unlikely(!netif_carrier_ok(dev))) > @@ -306,19 +392,16 @@ no_skb: > BUG_ON(np->rx_skbs[id]); > np->rx_skbs[id] = skb; > > - ref = gnttab_claim_grant_reference(&np->gref_rx_head); > - BUG_ON((signed short)ref < 0); > - np->grant_rx_ref[id] = ref; > + page = skb_frag_page(&skb_shinfo(skb)->frags[0]); > > - pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0])); > - vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0])); > + ref = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)), > + page_address(page), id, 0); > + if ((signed short)ref < 0) { > + __skb_queue_tail(&np->rx_batch, skb); > + break; > + } > > req = RING_GET_REQUEST(&np->rx, req_prod + i); > - gnttab_grant_foreign_access_ref(ref, > - np->xbdev->otherend_id, > - pfn_to_mfn(pfn), > - 0); > - > req->id = id; > req->gref = ref; > } > @@ -375,17 +458,30 @@ static 
void xennet_tx_buf_gc(struct net_device *dev) > > id = txrsp->id; > skb = np->tx_skbs[id].skb; > - if (unlikely(gnttab_query_foreign_access( > - np->grant_tx_ref[id]) != 0)) { > - printk(KERN_ALERT "xennet_tx_buf_gc: warning " > - "-- grant still in use by backend " > - "domain.\n"); > - BUG(); > + > + if (np->persistent_gnt) { > + struct gnt_list *gnt_list_entry; > + > + gnt_list_entry = np->tx_grant[id]; > + BUG_ON(!gnt_list_entry); > + > + gnt_list_entry->tail = np->tx_gnt_list; > + np->tx_gnt_list = gnt_list_entry; > + np->tx_gnt_cnt++; > + } else { > + if (unlikely(gnttab_query_foreign_access( > + np->grant_tx_ref[id]) != 0)) { > + printk(KERN_ALERT "xennet_tx_buf_gc: warning " > + "-- grant still in use by backend " > + "domain.\n"); > + BUG(); > + } > + > + gnttab_end_foreign_access_ref( > + np->grant_tx_ref[id], GNTMAP_readonly);If I''ve read the code correctly, you are giving this frame both read/write permissions to the other end on xennet_alloc_tx_ref, but then you are only removing the read permissions? (see comment below on the xennet_alloc_tx_ref function).> + gnttab_release_grant_reference( > + &np->gref_tx_head, np->grant_tx_ref[id]); > } > - gnttab_end_foreign_access_ref( > - np->grant_tx_ref[id], GNTMAP_readonly); > - gnttab_release_grant_reference( > - &np->gref_tx_head, np->grant_tx_ref[id]); > np->grant_tx_ref[id] = GRANT_INVALID_REF; > add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id); > dev_kfree_skb_irq(skb); > @@ -409,6 +505,59 @@ static void xennet_tx_buf_gc(struct net_device *dev) > xennet_maybe_wake_tx(dev); > } > > +static grant_ref_t xennet_alloc_tx_ref(struct net_device *dev, > + unsigned long mfn, > + unsigned int id) > +{ > + struct netfront_info *np = netdev_priv(dev); > + grant_ref_t ref; > + struct page *granted_page; > + > + if (np->persistent_gnt && np->tx_gnt_cnt) { > + struct gnt_list *gnt_list_entry; > + > + gnt_list_entry = np->tx_gnt_list; > + np->tx_gnt_list = np->tx_gnt_list->tail; > + np->tx_gnt_cnt--; > + > + ref = gnt_list_entry->gref; > + np->tx_grant[id] = gnt_list_entry; > + } else { > + struct gnt_list *gnt_list_entry; > + > + BUG_ON(!np->persistent_gnt && np->tx_gnt_cnt); > + ref = gnttab_claim_grant_reference(&np->gref_tx_head); > + BUG_ON((signed short)ref < 0); > + > + if (np->persistent_gnt) { > + granted_page = alloc_page(GFP_KERNEL); > + if (!granted_page) { > + gnttab_release_grant_reference( > + &np->gref_tx_head, ref); > + return -ENOMEM; > + } > + > + mfn = pfn_to_mfn(page_to_pfn(granted_page)); > + gnt_list_entry = kmalloc(sizeof(struct gnt_list), > + GFP_KERNEL); > + if (!gnt_list_entry) { > + __free_page(granted_page); > + gnttab_release_grant_reference( > + &np->gref_tx_head, ref); > + return -ENOMEM; > + } > + > + gnt_list_entry->gref = ref; > + gnt_list_entry->gnt_pages = granted_page; > + np->tx_grant[id] = gnt_list_entry; > + } > + gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id, > + mfn, 0);If you are not always using persistent grants I guess you need to give read only permissions to this frame (GNTMAP_readonly). 
Also, for keeping things in logical order, isn''t it best that this function comes before xennet_tx_buf_gc?> + } > + > + return ref; > +} > + > @@ -1132,8 +1357,10 @@ static void xennet_release_rx_bufs(struct netfront_info *np) > } > > skb = np->rx_skbs[id]; > - mfn = gnttab_end_foreign_transfer_ref(ref); > - gnttab_release_grant_reference(&np->gref_rx_head, ref); > + if (!np->persistent_gnt) { > + mfn = gnttab_end_foreign_transfer_ref(ref); > + gnttab_release_grant_reference(&np->gref_rx_head, ref); > + } > np->grant_rx_ref[id] = GRANT_INVALID_REF; > > if (0 == mfn) { > @@ -1607,6 +1834,13 @@ again: > goto abort_transaction; > } > > + err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants", > + "%u", info->persistent_gnt);As in netback, I think "feature-persistent" should be used.> + if (err) { > + message = "writing feature-persistent-grants"; > + xenbus_dev_fatal(dev, err, "%s", message); > + } > + > err = xenbus_transaction_end(xbt, 0); > if (err) { > if (err == -EAGAIN) > @@ -1634,6 +1868,7 @@ static int xennet_connect(struct net_device *dev) > grant_ref_t ref; > struct xen_netif_rx_request *req; > unsigned int feature_rx_copy; > + int ret, val; > > err = xenbus_scanf(XBT_NIL, np->xbdev->otherend, > "feature-rx-copy", "%u", &feature_rx_copy); > @@ -1646,6 +1881,13 @@ static int xennet_connect(struct net_device *dev) > return -ENODEV; > } > > + err = xenbus_scanf(XBT_NIL, np->xbdev->otherend, > + "feature-persistent-grants", "%u", &val); > + if (err != 1) > + val = 0; > + > + np->persistent_gnt = !!val; > + > err = talk_to_netback(np->xbdev, np); > if (err) > return err; > @@ -1657,9 +1899,24 @@ static int xennet_connect(struct net_device *dev) > spin_lock_bh(&np->rx_lock); > spin_lock_irq(&np->tx_lock); > > + np->tx_gnt_cnt = 0; > + np->rx_gnt_cnt = 0; > + > /* Step 1: Discard all pending TX packet fragments. */ > xennet_release_tx_bufs(np); > > + if (np->persistent_gnt) { > + struct gnt_list *gnt_list_entry; > + > + while (np->rx_gnt_list) { > + gnt_list_entry = np->rx_gnt_list; > + np->rx_gnt_list = np->rx_gnt_list->tail; > + gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL); > + __free_page(gnt_list_entry->gnt_pages); > + kfree(gnt_list_entry); > + } > + } > + > /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */ > for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) { > skb_frag_t *frag; > @@ -1673,10 +1930,11 @@ static int xennet_connect(struct net_device *dev) > > frag = &skb_shinfo(skb)->frags[0]; > page = skb_frag_page(frag); > - gnttab_grant_foreign_access_ref( > - ref, np->xbdev->otherend_id, > - pfn_to_mfn(page_to_pfn(page)), > - 0); > + ret = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)), > + page_address(page), requeue_idx, ref); > + if ((signed short)ret < 0) > + break; > + > req->gref = ref; > req->id = requeue_idx; > > -- > 1.7.3.4 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel >
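As an illustration of the "use a kernel list" suggestion above (a sketch only; gnt_list is the structure from the posted patch, while the list_head member and the two helpers are illustrative):

	#include <linux/list.h>
	#include <xen/grant_table.h>

	struct gnt_list {
		grant_ref_t gref;
		struct page *gnt_pages;
		void *gnt_target;
		struct list_head node;	/* replaces the hand-rolled ->tail pointer */
	};

	/* Free pool; in practice a list_head field in struct netfront_info. */
	static LIST_HEAD(tx_gnt_free);

	static void gnt_pool_put(struct gnt_list *ent)
	{
		list_add(&ent->node, &tx_gnt_free);
	}

	static struct gnt_list *gnt_pool_get(void)
	{
		struct gnt_list *ent;

		if (list_empty(&tx_gnt_free))
			return NULL;

		ent = list_first_entry(&tx_gnt_free, struct gnt_list, node);
		list_del(&ent->node);
		return ent;
	}

With this, the fast path only needs list_empty() rather than a separate counter.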
Roger Pau Monné
2012-Nov-15 10:56 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 15/11/12 09:38, ANNIE LI wrote:> > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote: >> Hello, >> >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: >>> This patch implements persistent grants for xen-netfront/netback. This >>> mechanism maintains page pools in netback/netfront, these page pools is used to >>> save grant pages which are mapped. This way improve performance which is wasted >>> when doing grant operations. >>> >>> Current netback/netfront does map/unmap grant operations frequently when >>> transmitting/receiving packets, and grant operations costs much cpu clock. In >>> this patch, netfront/netback maps grant pages when needed and then saves them >>> into a page pool for future use. All these pages will be unmapped when >>> removing/releasing the net device. >>> >> Do you have performance numbers available already? with/without persistent grants? > I have some simple netperf/netserver test result with/without persistent > grants, > > Following is result of with persistent grant patch, > > Guests, Sum, Avg, Min, Max > 1, 15106.4, 15106.4, 15106.36, 15106.36 > 2, 13052.7, 6526.34, 6261.81, 6790.86 > 3, 12675.1, 6337.53, 6220.24, 6454.83 > 4, 13194, 6596.98, 6274.70, 6919.25 > > > Following are result of without persistent patch > > Guests, Sum, Avg, Min, Max > 1, 10864.1, 10864.1, 10864.10, 10864.10 > 2, 10898.5, 5449.24, 4862.08, 6036.40 > 3, 10734.5, 5367.26, 5261.43, 5473.08 > 4, 10924, 5461.99, 5314.84, 5609.14In the block case, performance improvement is seen when using a large number of guests, could you perform the same benchmark increasing the number of guests to 15?
ANNIE LI
2012-Nov-15 11:12 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-15 17:35, Wei Liu wrote:> > On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com > <mailto:annie.li@oracle.com>> wrote: > > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote: > > Hello, > > On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: > > This patch implements persistent grants for > xen-netfront/netback. This > mechanism maintains page pools in netback/netfront, these > page pools is used to > save grant pages which are mapped. This way improve > performance which is wasted > when doing grant operations. > > Current netback/netfront does map/unmap grant operations > frequently when > transmitting/receiving packets, and grant operations costs > much cpu clock. In > this patch, netfront/netback maps grant pages when needed > and then saves them > into a page pool for future use. All these pages will be > unmapped when > removing/releasing the net device. > > Do you have performance numbers available already? > with/without persistent grants? > > I have some simple netperf/netserver test result with/without > persistent grants, > > Following is result of with persistent grant patch, > > Guests, Sum, Avg, Min, Max > 1, 15106.4, 15106.4, 15106.36, 15106.36 > 2, 13052.7, 6526.34, 6261.81, 6790.86 > 3, 12675.1, 6337.53, 6220.24, 6454.83 > 4, 13194, 6596.98, 6274.70, 6919.25 > > > Following are result of without persistent patch > > Guests, Sum, Avg, Min, Max > 1, 10864.1, 10864.1, 10864.10, 10864.10 > 2, 10898.5, 5449.24, 4862.08, 6036.40 > 3, 10734.5, 5367.26, 5261.43, 5473.08 > 4, 10924, 5461.99, 5314.84, 5609.14 > > > > Interesting results. Have you tested how good it is on a 10G nic, i.e. > guest sending packets > through physical network to another host.Not yet. Thanks Annie> > > Wei. > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
ANNIE LI
2012-Nov-15 11:14 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-15 18:56, Roger Pau Monné wrote:> On 15/11/12 09:38, ANNIE LI wrote: >> >> On 2012-11-15 15:40, Pasi Kärkkäinen wrote: >>> Hello, >>> >>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: >>>> This patch implements persistent grants for xen-netfront/netback. This >>>> mechanism maintains page pools in netback/netfront, these page pools is used to >>>> save grant pages which are mapped. This way improve performance which is wasted >>>> when doing grant operations. >>>> >>>> Current netback/netfront does map/unmap grant operations frequently when >>>> transmitting/receiving packets, and grant operations costs much cpu clock. In >>>> this patch, netfront/netback maps grant pages when needed and then saves them >>>> into a page pool for future use. All these pages will be unmapped when >>>> removing/releasing the net device. >>>> >>> Do you have performance numbers available already? with/without persistent grants? >> I have some simple netperf/netserver test result with/without persistent >> grants, >> >> Following is result of with persistent grant patch, >> >> Guests, Sum, Avg, Min, Max >> 1, 15106.4, 15106.4, 15106.36, 15106.36 >> 2, 13052.7, 6526.34, 6261.81, 6790.86 >> 3, 12675.1, 6337.53, 6220.24, 6454.83 >> 4, 13194, 6596.98, 6274.70, 6919.25 >> >> >> Following are result of without persistent patch >> >> Guests, Sum, Avg, Min, Max >> 1, 10864.1, 10864.1, 10864.10, 10864.10 >> 2, 10898.5, 5449.24, 4862.08, 6036.40 >> 3, 10734.5, 5367.26, 5261.43, 5473.08 >> 4, 10924, 5461.99, 5314.84, 5609.14 > In the block case, performance improvement is seen when using a large > number of guests, could you perform the same benchmark increasing the > number of guests to 15?Sure, but my current local environment does not meet such test requirement, let me find an environment and then start such test. Thanks Annie>
ANNIE LI
2012-Nov-15 11:14 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-15 16:53, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>> This patch is based on linux3.4-rc3. I hit netperf/netserver failure
>> on linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure
>> whether this netperf/netserver failure connects compound page commit in
>> v3.7-rc1, but I did hit BUG_ON with debug patch from thread
>> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html
> Do you think you could cook up a netfront fix similar in principal to
> the 6a8ed462f16b8455eec5ae00eb6014159a6721f0 fix for netback?

Ok, let me try to do something similar in netfront.

Thanks
Annie

>
> Ian.
>
>
ANNIE LI
2012-Nov-15 11:14 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-15 16:56, Ian Campbell wrote:> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote: > >> This patch implements persistent grants for xen-netfront/netback. This >> mechanism maintains page pools in netback/netfront, these page pools >> is used to save grant pages which are mapped. This way improve >> performance which is wasted when doing grant operations. > Please can you send a patch against xen-unstable.hg to document this new > protocol variant in xen/include/public/io/netif.h. This header is a bit > under-documented but lets not let it fall further behind (if you want to > go further and document the existing features, in the style of the > current blkif.h, then that would be awesome!).Ok, no problem.> > You may also want to provide a similar patch to Linux''s copy which is in > linux/include/xen/interface/io/netif.hOk, will do that. Thanks Annie> > Ian. > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Ian Campbell
2012-Nov-15 11:15 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:> On 15/11/12 09:38, ANNIE LI wrote: > > > > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote: > >> Hello, > >> > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: > >>> This patch implements persistent grants for xen-netfront/netback. This > >>> mechanism maintains page pools in netback/netfront, these page pools is used to > >>> save grant pages which are mapped. This way improve performance which is wasted > >>> when doing grant operations. > >>> > >>> Current netback/netfront does map/unmap grant operations frequently when > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In > >>> this patch, netfront/netback maps grant pages when needed and then saves them > >>> into a page pool for future use. All these pages will be unmapped when > >>> removing/releasing the net device. > >>> > >> Do you have performance numbers available already? with/without persistent grants? > > I have some simple netperf/netserver test result with/without persistent > > grants, > > > > Following is result of with persistent grant patch, > > > > Guests, Sum, Avg, Min, Max > > 1, 15106.4, 15106.4, 15106.36, 15106.36 > > 2, 13052.7, 6526.34, 6261.81, 6790.86 > > 3, 12675.1, 6337.53, 6220.24, 6454.83 > > 4, 13194, 6596.98, 6274.70, 6919.25 > > > > > > Following are result of without persistent patch > > > > Guests, Sum, Avg, Min, Max > > 1, 10864.1, 10864.1, 10864.10, 10864.10 > > 2, 10898.5, 5449.24, 4862.08, 6036.40 > > 3, 10734.5, 5367.26, 5261.43, 5473.08 > > 4, 10924, 5461.99, 5314.84, 5609.14 > > In the block case, performance improvement is seen when using a large > number of guests, could you perform the same benchmark increasing the > number of guests to 15?It would also be nice to see some analysis of the numbers which justify why this change is a good one without every reviewer having to evaluate the raw data themselves. In fact this should really be part of the commit message. Ian.
Konrad Rzeszutek Wilk
2012-Nov-15 18:29 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, Nov 15, 2012 at 11:15:06AM +0000, Ian Campbell wrote:> On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote: > > On 15/11/12 09:38, ANNIE LI wrote: > > > > > > > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote: > > >> Hello, > > >> > > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: > > >>> This patch implements persistent grants for xen-netfront/netback. This > > >>> mechanism maintains page pools in netback/netfront, these page pools is used to > > >>> save grant pages which are mapped. This way improve performance which is wasted > > >>> when doing grant operations. > > >>> > > >>> Current netback/netfront does map/unmap grant operations frequently when > > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In > > >>> this patch, netfront/netback maps grant pages when needed and then saves them > > >>> into a page pool for future use. All these pages will be unmapped when > > >>> removing/releasing the net device. > > >>> > > >> Do you have performance numbers available already? with/without persistent grants? > > > I have some simple netperf/netserver test result with/without persistent > > > grants, > > > > > > Following is result of with persistent grant patch, > > > > > > Guests, Sum, Avg, Min, Max > > > 1, 15106.4, 15106.4, 15106.36, 15106.36 > > > 2, 13052.7, 6526.34, 6261.81, 6790.86 > > > 3, 12675.1, 6337.53, 6220.24, 6454.83 > > > 4, 13194, 6596.98, 6274.70, 6919.25 > > > > > > > > > Following are result of without persistent patch > > > > > > Guests, Sum, Avg, Min, Max > > > 1, 10864.1, 10864.1, 10864.10, 10864.10 > > > 2, 10898.5, 5449.24, 4862.08, 6036.40 > > > 3, 10734.5, 5367.26, 5261.43, 5473.08 > > > 4, 10924, 5461.99, 5314.84, 5609.14 > > > > In the block case, performance improvement is seen when using a large > > number of guests, could you perform the same benchmark increasing the > > number of guests to 15? > > It would also be nice to see some analysis of the numbers which justify > why this change is a good one without every reviewer having to evaluate > the raw data themselves. In fact this should really be part of the > commit message.You mean like a nice graph, eh? I will run these patches on my 32GB box and see if I can give you a nice PDF/jpg.> > Ian. >
Ian Campbell
2012-Nov-15 19:11 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, 2012-11-15 at 18:29 +0000, Konrad Rzeszutek Wilk wrote:> On Thu, Nov 15, 2012 at 11:15:06AM +0000, Ian Campbell wrote: > > On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote: > > > On 15/11/12 09:38, ANNIE LI wrote: > > > > > > > > > > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote: > > > >> Hello, > > > >> > > > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: > > > >>> This patch implements persistent grants for xen-netfront/netback. This > > > >>> mechanism maintains page pools in netback/netfront, these page pools is used to > > > >>> save grant pages which are mapped. This way improve performance which is wasted > > > >>> when doing grant operations. > > > >>> > > > >>> Current netback/netfront does map/unmap grant operations frequently when > > > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In > > > >>> this patch, netfront/netback maps grant pages when needed and then saves them > > > >>> into a page pool for future use. All these pages will be unmapped when > > > >>> removing/releasing the net device. > > > >>> > > > >> Do you have performance numbers available already? with/without persistent grants? > > > > I have some simple netperf/netserver test result with/without persistent > > > > grants, > > > > > > > > Following is result of with persistent grant patch, > > > > > > > > Guests, Sum, Avg, Min, Max > > > > 1, 15106.4, 15106.4, 15106.36, 15106.36 > > > > 2, 13052.7, 6526.34, 6261.81, 6790.86 > > > > 3, 12675.1, 6337.53, 6220.24, 6454.83 > > > > 4, 13194, 6596.98, 6274.70, 6919.25 > > > > > > > > > > > > Following are result of without persistent patch > > > > > > > > Guests, Sum, Avg, Min, Max > > > > 1, 10864.1, 10864.1, 10864.10, 10864.10 > > > > 2, 10898.5, 5449.24, 4862.08, 6036.40 > > > > 3, 10734.5, 5367.26, 5261.43, 5473.08 > > > > 4, 10924, 5461.99, 5314.84, 5609.14 > > > > > > In the block case, performance improvement is seen when using a large > > > number of guests, could you perform the same benchmark increasing the > > > number of guests to 15? > > > > It would also be nice to see some analysis of the numbers which justify > > why this change is a good one without every reviewer having to evaluate > > the raw data themselves. In fact this should really be part of the > > commit message. > > You mean like a nice graph, eh?Together with an analysis of what it means and why it is a good thing, yes. Ian.> > I will run these patches on my 32GB box and see if I can give you > a nice PDF/jpg. > > > > > Ian. > >
ANNIE LI
2012-Nov-16 02:18 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On 2012-11-15 17:10, Ian Campbell wrote:> On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote: >> This patch implements persistent grant in netback driver. Tx and rx >> share the same page pool, this pool will be split into two parts >> in next patch. >> >> Signed-off-by: Annie Li<annie.li@oracle.com> >> --- >> drivers/net/xen-netback/common.h | 18 +++- >> drivers/net/xen-netback/interface.c | 22 ++++ >> drivers/net/xen-netback/netback.c | 212 +++++++++++++++++++++++++++++++---- >> drivers/net/xen-netback/xenbus.c | 14 ++- >> 4 files changed, 239 insertions(+), 27 deletions(-) >> >> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h >> index 94b79c3..a85cac6 100644 >> --- a/drivers/net/xen-netback/common.h >> +++ b/drivers/net/xen-netback/common.h >> @@ -45,8 +45,19 @@ >> #include<xen/grant_table.h> >> #include<xen/xenbus.h> >> >> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) >> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) >> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \ > BLOCK?Oh, an error when splitting the patch, will fix it, thanks.> >> + (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE) >> + >> struct xen_netbk; >> >> +struct persistent_entry { >> + grant_ref_t forgranted; >> + struct page *fpage; >> + struct gnttab_map_grant_ref map; >> +}; > Isn''t this duplicating a bunch of infrastructure which is also in > blkback? Can we put it into some common helpers please?Yes, "struct gnttab_map_grant_ref map" can be changed to handle like blkback to keep same as blkback, and share them in common helpers.> >> + >> struct xenvif { >> /* Unique identifier for this interface. */ >> domid_t domid; >> @@ -75,6 +86,7 @@ struct xenvif { >> >> /* Internal feature information. */ >> u8 can_queue:1; /* can queue packets for receiver? */ >> + u8 persistent_grant:1; >> >> /* >> * Allow xenvif_start_xmit() to peek ahead in the rx request >> @@ -98,6 +110,9 @@ struct xenvif { >> struct net_device *dev; >> >> wait_queue_head_t waiting_to_free; >> + >> + struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS]; > What is the per-vif memory overhead of this array?In this patch, The maximum of memory overhead is about (XEN_NETIF_TX_RING_SIZE+XEN_NETIF_RX_RING_SIZE)*PAGE_SIZE (plus size of grant_ref_t and handle) which is about 512 PAGE_SIZE. Normally, without heavy network offload, this maximum can not be reached. In next patch of splitting tx/rx pool, the maximum is about (256+512)PAGE_SIZE.> >> +static struct persistent_entry* >> +get_per_gnt(struct persistent_entry **pers_entry, >> + unsigned int count, grant_ref_t gref) >> +{ >> + int i; >> + >> + for (i = 0; i< count; i++) >> + if (gref == pers_entry[i]->forgranted) >> + return pers_entry[i]; > Isn''t this linear scan rather expensive? I think Roger implemented some > sort of hash lookup for blkback which I think is required here too (and > should be easy if you make that code common).Agree, thanks.> >> + >> + return NULL; >> +} >> + >> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) >> gop->source.domid = vif->domid; >> gop->source.offset = txreq.offset; >> >> - gop->dest.u.gmfn = virt_to_mfn(page_address(page)); >> + if (!vif->persistent_grant) >> + gop->dest.u.gmfn = virt_to_mfn(page_address(page)); >> + else >> + gop->dest.u.gmfn = (unsigned long)page_address(page); > page_address doesn''t return any sort of frame number, does it? This is > rather confusing...Yes. 
I only use the dest.u.gmfn element to save the page_address here for the later memcpy; it is not meant to hold a real frame number. To avoid confusion, I can use gop->dest.u.gmfn = virt_to_mfn(page_address(page)); here and then call mfn_to_virt when doing the memcpy.

>
>> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>> val = 0;
>> vif->csum = !val;
>>
>> - /* Map the shared frame, irq etc. */
>> + if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
>> + "%u", &val) < 0)
>> + val = 0;
>> + vif->persistent_grant = !!val;
>> +
>> +/* Map the shared frame, irq etc. */
> Please run the patches through checkpatch.pl

Yes, I ran checkpatch.pl before posting. The only warning is in the existing netfront.c code, a printk in xennet_tx_buf_gc, which I did not fix.

Thanks
Annie

>
>> err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>> if (err) {
>> xenbus_dev_fatal(dev, err,
>> --
>> 1.7.3.4
>
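Spelling that worst case out (back-of-the-envelope arithmetic, assuming 4 KiB pages and 256-entry tx/rx rings, which is what the 512 and 256+512 figures above imply):

	patch 1, shared pool:  (256 + 256) pages   x 4 KiB ~= 2 MiB per vif
	patch 2, split pools:  (256 + 2*256) pages x 4 KiB ~= 3 MiB per vif

plus the comparatively small bookkeeping (the struct persistent_entry allocations and the per-vif pointer arrays). As noted, that ceiling is only reached if the frontend actually cycles through that many distinct grant references.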
ANNIE LI
2012-Nov-16 02:49 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On 2012-11-15 17:57, Roger Pau Monné wrote:> On 15/11/12 08:04, Annie Li wrote: >> This patch implements persistent grant in netback driver. Tx and rx >> share the same page pool, this pool will be split into two parts >> in next patch. >> >> Signed-off-by: Annie Li<annie.li@oracle.com> >> --- >> drivers/net/xen-netback/common.h | 18 +++- >> drivers/net/xen-netback/interface.c | 22 ++++ >> drivers/net/xen-netback/netback.c | 212 +++++++++++++++++++++++++++++++---- >> drivers/net/xen-netback/xenbus.c | 14 ++- >> 4 files changed, 239 insertions(+), 27 deletions(-) >> >> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h >> index 94b79c3..a85cac6 100644 >> --- a/drivers/net/xen-netback/common.h >> +++ b/drivers/net/xen-netback/common.h >> @@ -45,8 +45,19 @@ >> #include<xen/grant_table.h> >> #include<xen/xenbus.h> >> >> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) >> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) >> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \ >> + (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE) >> + >> struct xen_netbk; >> >> +struct persistent_entry { >> + grant_ref_t forgranted; >> + struct page *fpage; >> + struct gnttab_map_grant_ref map; >> +}; > This should be common with the blkback implementation, I think we should > move some structures/functions from blkback to a common place. When I > implementated some functions in blkback I though they could be reused by > other backends that wanted to use persistent grants, so I keep them free > of blkback specific structures.Good idea, thanks.> >> struct xenvif { >> /* Unique identifier for this interface. */ >> domid_t domid; >> @@ -75,6 +86,7 @@ struct xenvif { >> >> /* Internal feature information. */ >> u8 can_queue:1; /* can queue packets for receiver? */ >> + u8 persistent_grant:1; >> >> /* >> * Allow xenvif_start_xmit() to peek ahead in the rx request >> @@ -98,6 +110,9 @@ struct xenvif { >> struct net_device *dev; >> >> wait_queue_head_t waiting_to_free; >> + >> + struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS]; >> + unsigned int persistent_gntcnt; > This should be a red-black tree, which has the property of a search time > <= O(log n), using an array is more expensive in terms of memory and has > a worse search time O(n), this is specially interesting for netback, > which can have twice as much persistent grants as blkback (because two > rings are used).Right, thanks.> > Take a look at the following functions from blkback; foreach_grant, > add_persistent_gnt and get_persistent_gnt. They are generic functions to > deal with persistent grants.Ok, thanks. 
Or moving those functions into a separate common file?> >> }; >> >> static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) >> @@ -105,9 +120,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) >> return to_xenbus_device(vif->dev->dev.parent); >> } >> >> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) >> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) >> - >> struct xenvif *xenvif_alloc(struct device *parent, >> domid_t domid, >> unsigned int handle); >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c >> index b7d41f8..226d159 100644 >> --- a/drivers/net/xen-netback/interface.c >> +++ b/drivers/net/xen-netback/interface.c >> @@ -300,6 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, >> return ERR_PTR(err); >> } >> >> + vif->persistent_gntcnt = 0; >> + >> netdev_dbg(dev, "Successfully created xenvif\n"); >> return vif; >> } >> @@ -343,6 +345,23 @@ err: >> return err; >> } >> >> +void xenvif_free_grants(struct persistent_entry **pers_entry, >> + unsigned int count) >> +{ >> + int i, ret; >> + struct gnttab_unmap_grant_ref unmap; >> + >> + for (i = 0; i< count; i++) { >> + gnttab_set_unmap_op(&unmap, >> + (unsigned long)page_to_pfn(pers_entry[i]->fpage), >> + GNTMAP_host_map, >> + pers_entry[i]->map.handle); >> + ret = gnttab_unmap_refs(&unmap,&pers_entry[i]->fpage, >> + 1, false); > This is not correct, you should call gnttab_set_unmap_op on a batch of > grants (up to BLKIF_MAX_SEGMENTS_PER_REQUEST), and then call > gnttab_unmap_refs on all of them. Here is a simple example (take a look > at blkback.c function xen_blkif_schedule to see an example with a > red-black tree, I think this part of the code should also be made common): > > for (i = 0, segs_to_unmap = 0; i< count; i++) { > gnttab_set_unmap_op(&unmap[segs_to_unmap], > (unsigned long)page_to_pfn(pers_entry[i]->fpage), > GNTMAP_host_map, > pers_entry[i]->map.handle); > pages[segs_to_unmap] > (unsigned long)page_to_pfn(pers_entry[i]->fpage); > if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST || > (i + 1) == count) { > ret = gnttab_unmap_refs(unmap, NULL, pages, > segs_to_unmap); > BUG_ON(ret); > segs_to_unmap == 0; > } > }Got it, thanks.> >> + BUG_ON(ret); >> + } >> +} >> + >> void xenvif_disconnect(struct xenvif *vif) >> { >> struct net_device *dev = vif->dev; >> @@ -366,6 +385,9 @@ void xenvif_disconnect(struct xenvif *vif) >> unregister_netdev(vif->dev); >> >> xen_netbk_unmap_frontend_rings(vif); >> + if (vif->persistent_grant) >> + xenvif_free_grants(vif->persistent_gnt, >> + vif->persistent_gntcnt); >> >> free_netdev(vif->dev); >> } >> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c >> index 2596401..a26d3fc 100644 >> --- a/drivers/net/xen-netback/netback.c >> +++ b/drivers/net/xen-netback/netback.c >> @@ -80,6 +80,8 @@ union page_ext { >> void *mapping; >> }; >> >> +struct xenvif; >> + >> struct xen_netbk { >> wait_queue_head_t wq; >> struct task_struct *task; >> @@ -102,6 +104,7 @@ struct xen_netbk { >> >> struct pending_tx_info pending_tx_info[MAX_PENDING_REQS]; >> struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; >> + struct xenvif *gnttab_tx_vif[MAX_PENDING_REQS]; >> >> u16 pending_ring[MAX_PENDING_REQS]; >> >> @@ -111,12 +114,139 @@ struct xen_netbk { >> * straddles two buffers in the frontend. 
>> */ >> struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE]; >> + struct xenvif *gnttab_rx_vif[2*XEN_NETIF_RX_RING_SIZE]; >> struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE]; >> }; >> >> static struct xen_netbk *xen_netbk; >> static int xen_netbk_group_nr; >> >> +static struct persistent_entry* >> +get_per_gnt(struct persistent_entry **pers_entry, >> + unsigned int count, grant_ref_t gref) >> +{ >> + int i; >> + >> + for (i = 0; i< count; i++) >> + if (gref == pers_entry[i]->forgranted) >> + return pers_entry[i]; >> + >> + return NULL; >> +} > This should be replaced with common code shared with all persistent > backends implementations.Ok, thanks.> >> + >> +static void* >> +map_new_gnt(struct persistent_entry **pers_entry, unsigned int count, >> + grant_ref_t ref, domid_t domid) >> +{ >> + struct gnttab_map_grant_ref *map; >> + struct page *page; >> + unsigned long vaddr; >> + unsigned long pfn; >> + uint32_t flags; >> + int ret = 0; >> + >> + pers_entry[count] = (struct persistent_entry *) >> + kmalloc(sizeof(struct persistent_entry), >> + GFP_KERNEL); >> + if (!pers_entry[count]) >> + return ERR_PTR(-ENOMEM); >> + >> + map =&pers_entry[count]->map; >> + page = alloc_page(GFP_KERNEL); >> + if (!page) { >> + kfree(pers_entry[count]); >> + pers_entry[count] = NULL; >> + return ERR_PTR(-ENOMEM); >> + } >> + >> + pers_entry[count]->fpage = page; >> + pfn = page_to_pfn(page); >> + vaddr = (unsigned long)pfn_to_kaddr(pfn); >> + flags = GNTMAP_host_map; >> + >> + gnttab_set_map_op(map, vaddr, flags, ref, domid); >> + ret = gnttab_map_refs(map, NULL,&page, 1); >> + BUG_ON(ret); > This is highly inefficient, one of the points of using gnttab_set_map_op > is that you can queue a bunch of grants, and then map them at the same > time using gnttab_map_refs, but here you are using it to map a single > grant at a time. You should instead see how much grants you need to map > to complete the request and map them all at the same time.Yes, it is inefficient here. But this is limited by current netback implementation. Current netback is not per-VIF based(not like blkback does). After combining persistent grant and non persistent grant together, every vif request in the queue may/may not support persistent grant. I have to judge whether every vif in the queue supports persistent grant or not. If it support, memcpy is used, if not, grantcopy is used. 
After making netback per-VIF works, this issue can be fixed.> >> + >> + pers_entry[count]->forgranted = ref; >> + >> + return page_address(page); >> +} >> + >> +static void* >> +get_ref_page(struct persistent_entry **pers_entry, unsigned int *count, >> + grant_ref_t ref, domid_t domid, unsigned int total) >> +{ >> + struct persistent_entry *per_gnt; >> + void *vaddr; >> + >> + per_gnt = get_per_gnt(pers_entry, *count, ref); >> + >> + if (per_gnt != NULL) >> + return page_address(per_gnt->fpage); >> + else { >> + BUG_ON(*count>= total); >> + vaddr = map_new_gnt(pers_entry, *count, ref, domid); >> + if (IS_ERR_OR_NULL(vaddr)) >> + return vaddr; >> + *count += 1; >> + return vaddr; >> + } >> +} >> + >> +static int >> +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count, >> + struct xen_netbk *netbk, bool tx_pool) >> +{ >> + int i; >> + struct xenvif *vif; >> + struct gnttab_copy *uop = vuop; >> + unsigned int *gnt_count; >> + unsigned int gnt_total; >> + struct persistent_entry **pers_entry; >> + int ret = 0; >> + >> + BUG_ON(cmd != GNTTABOP_copy); >> + for (i = 0; i< count; i++) { >> + if (tx_pool) >> + vif = netbk->gnttab_tx_vif[i]; >> + else >> + vif = netbk->gnttab_rx_vif[i]; >> + >> + pers_entry = vif->persistent_gnt; >> + gnt_count =&vif->persistent_gntcnt; >> + gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS; >> + >> + if (vif->persistent_grant) { >> + void *saddr, *daddr; >> + >> + saddr = uop[i].source.domid == DOMID_SELF ? >> + (void *) uop[i].source.u.gmfn : >> + get_ref_page(pers_entry, gnt_count, >> + uop[i].source.u.ref, >> + uop[i].source.domid, >> + gnt_total); >> + if (IS_ERR_OR_NULL(saddr)) >> + return -ENOMEM; >> + >> + daddr = uop[i].dest.domid == DOMID_SELF ? >> + (void *) uop[i].dest.u.gmfn : >> + get_ref_page(pers_entry, gnt_count, >> + uop[i].dest.u.ref, >> + uop[i].dest.domid, >> + gnt_total); >> + if (IS_ERR_OR_NULL(daddr)) >> + return -ENOMEM; >> + >> + memcpy(daddr+uop[i].dest.offset, >> + saddr+uop[i].source.offset, uop[i].len); >> + } else >> + ret = HYPERVISOR_grant_table_op(cmd,&uop[i], 1); >> + } >> + >> + return ret; >> +} >> + >> void xen_netbk_add_xenvif(struct xenvif *vif) >> { >> int i; >> @@ -387,13 +517,15 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif, >> * Set up the grant operations for this fragment. If it''s a flipping >> * interface, we also set up the unmap request from here. >> */ >> -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, >> - struct netrx_pending_operations *npo, >> - struct page *page, unsigned long size, >> - unsigned long offset, int *head) >> +static int netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, >> + struct netrx_pending_operations *npo, >> + struct page *page, unsigned long size, >> + unsigned long offset, int *head, >> + struct xenvif **grxvif) >> { >> struct gnttab_copy *copy_gop; >> struct netbk_rx_meta *meta; >> + int count = 0; >> /* >> * These variables are used iff get_page_ext returns true, >> * in which case they are guaranteed to be initialized. 
>> @@ -425,6 +557,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, >> bytes = MAX_BUFFER_OFFSET - npo->copy_off; >> >> copy_gop = npo->copy + npo->copy_prod++; >> + *grxvif = vif; >> + grxvif++; >> + count++; >> + >> copy_gop->flags = GNTCOPY_dest_gref; >> if (foreign) { >> struct xen_netbk *netbk =&xen_netbk[group]; >> @@ -438,7 +574,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, >> } else { >> void *vaddr = page_address(page); >> copy_gop->source.domid = DOMID_SELF; >> - copy_gop->source.u.gmfn = virt_to_mfn(vaddr); >> + if (!vif->persistent_grant) >> + copy_gop->source.u.gmfn = virt_to_mfn(vaddr); >> + else >> + copy_gop->source.u.gmfn = (unsigned long)vaddr; >> } >> copy_gop->source.offset = offset; >> copy_gop->dest.domid = vif->domid; >> @@ -460,6 +599,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, >> *head = 0; /* There must be something in this buffer now. */ >> >> } >> + return count; >> } >> >> /* >> @@ -474,8 +614,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, >> * zero GSO descriptors (for non-GSO packets) or one descriptor (for >> * frontend-side LRO). >> */ >> -static int netbk_gop_skb(struct sk_buff *skb, >> - struct netrx_pending_operations *npo) >> +static int netbk_gop_skb(struct xen_netbk *netbk, >> + struct sk_buff *skb, >> + struct netrx_pending_operations *npo, >> + struct xenvif **grxvif) >> { >> struct xenvif *vif = netdev_priv(skb->dev); >> int nr_frags = skb_shinfo(skb)->nr_frags; >> @@ -518,17 +660,19 @@ static int netbk_gop_skb(struct sk_buff *skb, >> if (data + len> skb_tail_pointer(skb)) >> len = skb_tail_pointer(skb) - data; >> >> - netbk_gop_frag_copy(vif, skb, npo, >> - virt_to_page(data), len, offset,&head); >> + grxvif += netbk_gop_frag_copy(vif, skb, npo, >> + virt_to_page(data), len, >> + offset,&head, grxvif); >> + >> data += len; >> } >> >> for (i = 0; i< nr_frags; i++) { >> - netbk_gop_frag_copy(vif, skb, npo, >> - skb_frag_page(&skb_shinfo(skb)->frags[i]), >> - skb_frag_size(&skb_shinfo(skb)->frags[i]), >> - skb_shinfo(skb)->frags[i].page_offset, >> -&head); >> + grxvif += netbk_gop_frag_copy(vif, skb, npo, >> + skb_frag_page(&skb_shinfo(skb)->frags[i]), >> + skb_frag_size(&skb_shinfo(skb)->frags[i]), >> + skb_shinfo(skb)->frags[i].page_offset, >> +&head, grxvif); >> } >> >> return npo->meta_prod - old_meta_prod; >> @@ -593,6 +737,8 @@ struct skb_cb_overlay { >> static void xen_netbk_rx_action(struct xen_netbk *netbk) >> { >> struct xenvif *vif = NULL, *tmp; >> + struct xenvif **grxvif = netbk->gnttab_rx_vif; >> + int old_copy_prod = 0; >> s8 status; >> u16 irq, flags; >> struct xen_netif_rx_response *resp; >> @@ -619,7 +765,9 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) >> nr_frags = skb_shinfo(skb)->nr_frags; >> >> sco = (struct skb_cb_overlay *)skb->cb; >> - sco->meta_slots_used = netbk_gop_skb(skb,&npo); >> + sco->meta_slots_used = netbk_gop_skb(netbk, skb,&npo, grxvif); >> + grxvif += npo.copy_prod - old_copy_prod; >> + old_copy_prod = npo.copy_prod; >> >> count += nr_frags + 1; >> >> @@ -636,8 +784,10 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) >> return; >> >> BUG_ON(npo.copy_prod> ARRAY_SIZE(netbk->grant_copy_op)); >> - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,&netbk->grant_copy_op, >> - npo.copy_prod); >> + ret = grant_memory_copy_op(GNTTABOP_copy, >> +&netbk->grant_copy_op, >> + npo.copy_prod, netbk, >> + false); >> BUG_ON(ret != 0); >> >> while ((skb = __skb_dequeue(&rxq)) != 
NULL) { >> @@ -918,7 +1068,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, >> struct xenvif *vif, >> struct sk_buff *skb, >> struct xen_netif_tx_request *txp, >> - struct gnttab_copy *gop) >> + struct gnttab_copy *gop, >> + struct xenvif **gtxvif) >> { >> struct skb_shared_info *shinfo = skb_shinfo(skb); >> skb_frag_t *frags = shinfo->frags; >> @@ -944,7 +1095,11 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, >> gop->source.domid = vif->domid; >> gop->source.offset = txp->offset; >> >> - gop->dest.u.gmfn = virt_to_mfn(page_address(page)); >> + if (!vif->persistent_grant) >> + gop->dest.u.gmfn = virt_to_mfn(page_address(page)); >> + else >> + gop->dest.u.gmfn = (unsigned long)page_address(page); >> + >> gop->dest.domid = DOMID_SELF; >> gop->dest.offset = txp->offset; >> >> @@ -952,6 +1107,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, >> gop->flags = GNTCOPY_source_gref; >> >> gop++; >> + *gtxvif = vif; >> + gtxvif++; >> + >> >> memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp)); >> xenvif_get(vif); >> @@ -1218,6 +1376,7 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) >> static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) >> { >> struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; >> + struct xenvif **gtxvif = netbk->gnttab_tx_vif; >> struct sk_buff *skb; >> int ret; >> >> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) >> gop->source.domid = vif->domid; >> gop->source.offset = txreq.offset; >> >> - gop->dest.u.gmfn = virt_to_mfn(page_address(page)); >> + if (!vif->persistent_grant) >> + gop->dest.u.gmfn = virt_to_mfn(page_address(page)); >> + else >> + gop->dest.u.gmfn = (unsigned long)page_address(page); >> + >> gop->dest.domid = DOMID_SELF; >> gop->dest.offset = txreq.offset; >> >> @@ -1346,6 +1509,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) >> gop->flags = GNTCOPY_source_gref; >> >> gop++; >> + *gtxvif++ = vif; >> >> memcpy(&netbk->pending_tx_info[pending_idx].req, >> &txreq, sizeof(txreq)); >> @@ -1369,12 +1533,13 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) >> netbk->pending_cons++; >> >> request_gop = xen_netbk_get_requests(netbk, vif, >> - skb, txfrags, gop); >> + skb, txfrags, gop, gtxvif); >> if (request_gop == NULL) { >> kfree_skb(skb); >> netbk_tx_err(vif,&txreq, idx); >> continue; >> } >> + gtxvif += request_gop - gop; >> gop = request_gop; >> >> vif->tx.req_cons = idx; >> @@ -1467,8 +1632,9 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk) >> >> if (nr_gops == 0) >> return; >> - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, >> - netbk->tx_copy_ops, nr_gops); >> + ret = grant_memory_copy_op(GNTTABOP_copy, >> + netbk->tx_copy_ops, nr_gops, >> + netbk, true); >> BUG_ON(ret); >> >> xen_netbk_tx_submit(netbk); >> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c >> index 410018c..938e908 100644 >> --- a/drivers/net/xen-netback/xenbus.c >> +++ b/drivers/net/xen-netback/xenbus.c >> @@ -114,6 +114,13 @@ static int netback_probe(struct xenbus_device *dev, >> goto abort_transaction; >> } >> >> + err = xenbus_printf(xbt, dev->nodename, >> + "feature-persistent-grants", "%u", 1); >> + if (err) { >> + message = "writing feature-persistent-grants"; >> + goto abort_transaction; >> + } >> + >> err = xenbus_transaction_end(xbt, 0); >> } while (err == -EAGAIN); >> >> @@ -453,7 +460,12 @@ static int connect_rings(struct 
backend_info *be) >> val = 0; >> vif->csum = !val; >> >> - /* Map the shared frame, irq etc. */ >> + if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants", >> + "%u",&val)< 0) > In block devices "feature-persistent" is used, so I think that for > clearness it should be announced the same way in net.Is it "feature-persistent" ? I checked your RFC patch, the key is "feature-persistent-grants". Thanks Annie> >> + val = 0; >> + vif->persistent_grant = !!val; >> + >> +/* Map the shared frame, irq etc. */ >> err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn); >> if (err) { >> xenbus_dev_fatal(dev, err, >> -- >> 1.7.3.4 >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel >>
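For reference, a minimal sketch of the batched unmap Roger describes, written against the array pool as posted. The batch-size macro is invented purely for illustration, and the gnttab_unmap_refs() call follows the four-argument form used in Roger's example; this is a sketch, not the implementation that was merged:

	#define UNMAP_BATCH 32	/* illustrative batch size, not from the patch */

	void xenvif_free_grants(struct persistent_entry **pers_entry,
				unsigned int count)
	{
		struct gnttab_unmap_grant_ref unmap[UNMAP_BATCH];
		struct page *pages[UNMAP_BATCH];
		unsigned int i, segs_to_unmap = 0;
		int ret;

		for (i = 0; i < count; i++) {
			struct persistent_entry *p = pers_entry[i];

			/* the unmap op takes the kernel virtual address the
			 * grant was mapped at, not a raw pfn */
			gnttab_set_unmap_op(&unmap[segs_to_unmap],
					    (unsigned long)pfn_to_kaddr(page_to_pfn(p->fpage)),
					    GNTMAP_host_map, p->map.handle);
			pages[segs_to_unmap] = p->fpage;

			/* flush a full batch, or the tail of the pool */
			if (++segs_to_unmap == UNMAP_BATCH || i + 1 == count) {
				ret = gnttab_unmap_refs(unmap, NULL, pages,
							segs_to_unmap);
				BUG_ON(ret);
				segs_to_unmap = 0;
			}
		}
	}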
ANNIE LI
2012-Nov-16 03:10 UTC
Re: [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) page pool.
On 2012-11-15 17:15, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
>> For tx path, this implementation simplifies the work of searching out
>> grant page from page pool based on grant reference.
> It's still a linear search though, and it doesn't look much simpler to
> me:
>	for (i = 0; i < count; i++) {
>		if (tx_pool)
>			vif = netbk->gnttab_tx_vif[i];
>		else
>			vif = netbk->gnttab_rx_vif[i];
>
>		pers_entry = vif->persistent_gnt;
>		gnt_count = &vif->persistent_gntcnt;
>		gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
> becomes:
>	for (i = 0; i < count; i++) {
>		if (tx_pool) {
>			vif = netbk->gnttab_tx_vif[i];
>			gnt_count = &vif->persistent_tx_gntcnt;
>			gnt_total = XEN_NETIF_TX_RING_SIZE;
>			pers_entry = vif->persistent_tx_gnt;
>		} else {
>			vif = netbk->gnttab_rx_vif[i];
>			gnt_count = &vif->persistent_rx_gntcnt;
>			gnt_total = 2*XEN_NETIF_RX_RING_SIZE;
>			pers_entry = vif->persistent_rx_gnt;
>		}

Yes, the code is not simpler. If we make netback per-VIF based, this code
will disappear.

The simplification here is that for the tx path the maximum search index
is XEN_NETIF_TX_RING_SIZE (256 here), so this change can save some time
when searching out the grant page for a specific grant reference.

>
>> @@ -111,8 +109,16 @@ struct xenvif {
>>
>>  	wait_queue_head_t waiting_to_free;
>>
>> -	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
>> -	unsigned int persistent_gntcnt;
>> +	struct persistent_entry *persistent_tx_gnt[XEN_NETIF_TX_RING_SIZE];
>> +
>> +	/*
>> +	 * 2*XEN_NETIF_RX_RING_SIZE is for the case of each head/fragment page
> Shouldn't that have been incorporated into MAXIMUM_OUTSTANDING_BLOCK_REQS
> (sic) too?

Yes, the total value is the same as MAXIMUM_OUTSTANDING_BLOCK_REQS. But
2*XEN_NETIF_RX_RING_SIZE here means it is only used by the rx path, and it
is used just like other elements of the netback structure, such as
grant_copy_op, meta, etc.

>> +	 * using 2 copy operations.
>> +	 */
>> +	struct persistent_entry *persistent_rx_gnt[2*XEN_NETIF_RX_RING_SIZE];
> What is the per-vif memory overhead after this change?

The per-vif memory overhead is the following:
for the tx path it is about XEN_NETIF_TX_RING_SIZE*PAGE_SIZE (256 PAGE_SIZE here),
for the rx path it is about 2*XEN_NETIF_RX_RING_SIZE*PAGE_SIZE (512 PAGE_SIZE here).
I can add a comment about this here.

Thanks
Annie

>
> Ian.
>
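Working those figures through for the usual 4 KiB pages, with both ring macros evaluating to 256 entries as quoted above: the tx pool is at most 256 * 4 KiB = 1 MiB and the rx pool at most 2 * 256 * 4 KiB = 2 MiB, so roughly 3 MiB of persistently granted pages per vif, plus the small per-entry bookkeeping (grant_ref_t, handle, struct persistent_entry).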
ANNIE LI
2012-Nov-16 05:22 UTC
Re: [Xen-devel] [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
On 2012-11-15 18:52, Roger Pau Monné wrote:> On 15/11/12 08:05, Annie Li wrote: >> Tx/rx page pool are maintained. New grant is mapped and put into >> pool, unmap only happens when releasing/removing device. >> >> Signed-off-by: Annie Li<annie.li@oracle.com> >> --- >> drivers/net/xen-netfront.c | 372 +++++++++++++++++++++++++++++++++++++------- >> 1 files changed, 315 insertions(+), 57 deletions(-) >> >> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c >> index 0ebbb19..17b81c0 100644 >> --- a/drivers/net/xen-netfront.c >> +++ b/drivers/net/xen-netfront.c >> @@ -79,6 +79,13 @@ struct netfront_stats { >> struct u64_stats_sync syncp; >> }; >> >> +struct gnt_list { >> + grant_ref_t gref; >> + struct page *gnt_pages; >> + void *gnt_target; >> + struct gnt_list *tail; >> +}; > This could also be shared with blkfront.Netfront does not have the shadow like blkfront, and it needs the gnt_target to save skb address of rx path. So we can share this too? blkfront would not use it actually.> >> + >> struct netfront_info { >> struct list_head list; >> struct net_device *netdev; >> @@ -109,6 +116,10 @@ struct netfront_info { >> grant_ref_t grant_tx_ref[NET_TX_RING_SIZE]; >> unsigned tx_skb_freelist; >> >> + struct gnt_list *tx_grant[NET_TX_RING_SIZE]; >> + struct gnt_list *tx_gnt_list; >> + unsigned int tx_gnt_cnt; > I don''t understand this, why do you need both an array and a list?The array tx_grant is just like other tx_skbs, grant_tx_ref. It saves grant entries corresponding every request in the ring. This is what netfront different from blkfront, netfront does not have shadow ring, and it only uses a ring size array to track every request in the ring. The list is like a pool to save all available persistent grants.> I''m > not familiar with net code, so I don''t know if this is some kind of > special netfront thing?Yes, this is different from blkfront. netfront uses ring size arrays to track every request in the ring.> > Anyway if you have to use a list I would recommend using one of the list > constructions that''s already in the kernel, it simplifies the code and > makes it more easy to understand than creating your own list structure.Ok, thanks.> >> + >> spinlock_t rx_lock ____cacheline_aligned_in_smp; >> struct xen_netif_rx_front_ring rx; >> int rx_ring_ref; >> @@ -126,6 +137,10 @@ struct netfront_info { >> grant_ref_t gref_rx_head; >> grant_ref_t grant_rx_ref[NET_RX_RING_SIZE]; >> >> + struct gnt_list *rx_grant[NET_RX_RING_SIZE]; >> + struct gnt_list *rx_gnt_list; >> + unsigned int rx_gnt_cnt; > Same comment above here.Same as above.> >> + >> unsigned long rx_pfn_array[NET_RX_RING_SIZE]; >> struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1]; >> struct mmu_update rx_mmu[NET_RX_RING_SIZE]; >> @@ -134,6 +149,7 @@ struct netfront_info { >> struct netfront_stats __percpu *stats; >> >> unsigned long rx_gso_checksum_fixup; >> + u8 persistent_gnt:1; >> }; >> >> struct netfront_rx_info { >> @@ -194,6 +210,16 @@ static grant_ref_t xennet_get_rx_ref(struct netfront_info *np, >> return ref; >> } >> >> +static struct gnt_list *xennet_get_rx_grant(struct netfront_info *np, >> + RING_IDX ri) >> +{ >> + int i = xennet_rxidx(ri); >> + struct gnt_list *gntlist = np->rx_grant[i]; >> + np->rx_grant[i] = NULL; > Ok, I think I get why do you need both an array and a list, is that > because netfront doesn''t have some kind of shadow ring to keep track of > issued requests?Yes.> > So each issued request has an associated gnt_list with the list of used > grants?gnt_list is kind of free grants. 
It is like a pool of free grants. If free grants exist in this list, free grant will be gotten from this list. If no, new grant will be allocated. In xennet_tx_buf_gc, free grants will be put into the list again if response status is OK.> If so it would be good to add a comment about it. > >> + return gntlist; >> +} >> + >> #ifdef CONFIG_SYSFS >> static int xennet_sysfs_addif(struct net_device *netdev); >> static void xennet_sysfs_delif(struct net_device *netdev); >> @@ -231,6 +257,68 @@ static void xennet_maybe_wake_tx(struct net_device *dev) >> netif_wake_queue(dev); >> } >> >> +static grant_ref_t xennet_alloc_rx_ref(struct net_device *dev, >> + unsigned long mfn, void *vaddr, >> + unsigned int id, >> + grant_ref_t ref) >> +{ >> + struct netfront_info *np = netdev_priv(dev); >> + grant_ref_t gnt_ref; >> + struct gnt_list *gnt_list_entry; >> + >> + if (np->persistent_gnt&& np->rx_gnt_cnt) { >> + gnt_list_entry = np->rx_gnt_list; >> + np->rx_gnt_list = np->rx_gnt_list->tail; >> + np->rx_gnt_cnt--; >> + >> + gnt_list_entry->gnt_target = vaddr; >> + gnt_ref = gnt_list_entry->gref; >> + np->rx_grant[id] = gnt_list_entry; >> + } else { >> + struct page *page; >> + >> + BUG_ON(!np->persistent_gnt&& np->rx_gnt_cnt); >> + if (!ref) >> + gnt_ref >> + gnttab_claim_grant_reference(&np->gref_rx_head); >> + else >> + gnt_ref = ref; >> + BUG_ON((signed short)gnt_ref< 0); >> + >> + if (np->persistent_gnt) { > So you are only using persistent grants if the backend also supports > them.Current implementation is: If netback supports persistent grant, the frontend will work with persistent grant feature too. If netback does not support persistent grant, the frontend will work without persistent grant feature.> Have you benchmarked the performance of a persistent frontend with > a non-persistent backend.I remember I did some test before, not so sure. Will check it.> In the block case, usign a persistent frontend > with a non-persistent backend let to an overall performance improvement, > so blkfront uses persistent grants even if blkback doesn''t support them. > Take a look at the following graph: > > http://xenbits.xen.org/people/royger/persistent_grants/nonpers_read.pngGood idea, that makes sense. 
I will change netfront too, thanks.> >> + page = alloc_page(GFP_KERNEL); >> + if (!page) { >> + if (!ref) >> + gnttab_release_grant_reference( >> +&np->gref_rx_head, ref); >> + return -ENOMEM; >> + } >> + mfn = pfn_to_mfn(page_to_pfn(page)); >> + >> + gnt_list_entry = kmalloc(sizeof(struct gnt_list), >> + GFP_KERNEL); >> + if (!gnt_list_entry) { >> + __free_page(page); >> + if (!ref) >> + gnttab_release_grant_reference( >> +&np->gref_rx_head, ref); >> + return -ENOMEM; >> + } >> + gnt_list_entry->gref = gnt_ref; >> + gnt_list_entry->gnt_pages = page; >> + gnt_list_entry->gnt_target = vaddr; >> + >> + np->rx_grant[id] = gnt_list_entry; >> + } >> + >> + gnttab_grant_foreign_access_ref(gnt_ref, np->xbdev->otherend_id, >> + mfn, 0); >> + } >> + np->grant_rx_ref[id] = gnt_ref; >> + >> + return gnt_ref; >> +} >> + >> static void xennet_alloc_rx_buffers(struct net_device *dev) >> { >> unsigned short id; >> @@ -240,8 +328,6 @@ static void xennet_alloc_rx_buffers(struct net_device *dev) >> int i, batch_target, notify; >> RING_IDX req_prod = np->rx.req_prod_pvt; >> grant_ref_t ref; >> - unsigned long pfn; >> - void *vaddr; >> struct xen_netif_rx_request *req; >> >> if (unlikely(!netif_carrier_ok(dev))) >> @@ -306,19 +392,16 @@ no_skb: >> BUG_ON(np->rx_skbs[id]); >> np->rx_skbs[id] = skb; >> >> - ref = gnttab_claim_grant_reference(&np->gref_rx_head); >> - BUG_ON((signed short)ref< 0); >> - np->grant_rx_ref[id] = ref; >> + page = skb_frag_page(&skb_shinfo(skb)->frags[0]); >> >> - pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0])); >> - vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0])); >> + ref = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)), >> + page_address(page), id, 0); >> + if ((signed short)ref< 0) { >> + __skb_queue_tail(&np->rx_batch, skb); >> + break; >> + } >> >> req = RING_GET_REQUEST(&np->rx, req_prod + i); >> - gnttab_grant_foreign_access_ref(ref, >> - np->xbdev->otherend_id, >> - pfn_to_mfn(pfn), >> - 0); >> - >> req->id = id; >> req->gref = ref; >> } >> @@ -375,17 +458,30 @@ static void xennet_tx_buf_gc(struct net_device *dev) >> >> id = txrsp->id; >> skb = np->tx_skbs[id].skb; >> - if (unlikely(gnttab_query_foreign_access( >> - np->grant_tx_ref[id]) != 0)) { >> - printk(KERN_ALERT "xennet_tx_buf_gc: warning " >> - "-- grant still in use by backend " >> - "domain.\n"); >> - BUG(); >> + >> + if (np->persistent_gnt) { >> + struct gnt_list *gnt_list_entry; >> + >> + gnt_list_entry = np->tx_grant[id]; >> + BUG_ON(!gnt_list_entry); >> + >> + gnt_list_entry->tail = np->tx_gnt_list; >> + np->tx_gnt_list = gnt_list_entry; >> + np->tx_gnt_cnt++; >> + } else { >> + if (unlikely(gnttab_query_foreign_access( >> + np->grant_tx_ref[id]) != 0)) { >> + printk(KERN_ALERT "xennet_tx_buf_gc: warning " >> + "-- grant still in use by backend " >> + "domain.\n"); >> + BUG(); >> + } >> + >> + gnttab_end_foreign_access_ref( >> + np->grant_tx_ref[id], GNTMAP_readonly); > If I''ve read the code correctly, you are giving this frame both > read/write permissions to the other end on xennet_alloc_tx_ref, but then > you are only removing the read permissions? (see comment below on the > xennet_alloc_tx_ref function).Yes, this is a bug. For non persistent grant, it should remove the read permissions. For persistent grant, it should remove both. As mentioned above, it is better to enable persistent grant, I will change code and not consider non persistent grant. 
See comments below about why needing both permissions in xennet_alloc_tx_ref.> >> + gnttab_release_grant_reference( >> +&np->gref_tx_head, np->grant_tx_ref[id]); >> } >> - gnttab_end_foreign_access_ref( >> - np->grant_tx_ref[id], GNTMAP_readonly); >> - gnttab_release_grant_reference( >> -&np->gref_tx_head, np->grant_tx_ref[id]); >> np->grant_tx_ref[id] = GRANT_INVALID_REF; >> add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id); >> dev_kfree_skb_irq(skb); >> @@ -409,6 +505,59 @@ static void xennet_tx_buf_gc(struct net_device *dev) >> xennet_maybe_wake_tx(dev); >> } >> >> +static grant_ref_t xennet_alloc_tx_ref(struct net_device *dev, >> + unsigned long mfn, >> + unsigned int id) >> +{ >> + struct netfront_info *np = netdev_priv(dev); >> + grant_ref_t ref; >> + struct page *granted_page; >> + >> + if (np->persistent_gnt&& np->tx_gnt_cnt) { >> + struct gnt_list *gnt_list_entry; >> + >> + gnt_list_entry = np->tx_gnt_list; >> + np->tx_gnt_list = np->tx_gnt_list->tail; >> + np->tx_gnt_cnt--; >> + >> + ref = gnt_list_entry->gref; >> + np->tx_grant[id] = gnt_list_entry; >> + } else { >> + struct gnt_list *gnt_list_entry; >> + >> + BUG_ON(!np->persistent_gnt&& np->tx_gnt_cnt); >> + ref = gnttab_claim_grant_reference(&np->gref_tx_head); >> + BUG_ON((signed short)ref< 0); >> + >> + if (np->persistent_gnt) { >> + granted_page = alloc_page(GFP_KERNEL); >> + if (!granted_page) { >> + gnttab_release_grant_reference( >> +&np->gref_tx_head, ref); >> + return -ENOMEM; >> + } >> + >> + mfn = pfn_to_mfn(page_to_pfn(granted_page)); >> + gnt_list_entry = kmalloc(sizeof(struct gnt_list), >> + GFP_KERNEL); >> + if (!gnt_list_entry) { >> + __free_page(granted_page); >> + gnttab_release_grant_reference( >> +&np->gref_tx_head, ref); >> + return -ENOMEM; >> + } >> + >> + gnt_list_entry->gref = ref; >> + gnt_list_entry->gnt_pages = granted_page; >> + np->tx_grant[id] = gnt_list_entry; >> + } >> + gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id, >> + mfn, 0); > If you are not always using persistent grants I guess you need to give > read only permissions to this frame (GNTMAP_readonly).We can not use GNTMAP_readonly here because tx path packet data will be copied into these persistent grant pages. Mabbe it is better to use GNTMAP_readonly for nonpersistent and 0 for persistent grant. As mentioned above, it is better to enable persistent grant, I will change code and not consider non persistent grant.> Also, for keeping > things in logical order, isn''t it best that this function comes before > xennet_tx_buf_gc?xennet_alloc_tx_ref is called by following function xennet_make_frags, so I assume xennet_alloc_tx_ref is better to be close to xennet_make_frags. 
Xennet_tx_buf_gc does not have any connection with xennet_alloc_tx_ref, did I miss something?> >> + } >> + >> + return ref; >> +} >> + >> @@ -1132,8 +1357,10 @@ static void xennet_release_rx_bufs(struct netfront_info *np) >> } >> >> skb = np->rx_skbs[id]; >> - mfn = gnttab_end_foreign_transfer_ref(ref); >> - gnttab_release_grant_reference(&np->gref_rx_head, ref); >> + if (!np->persistent_gnt) { >> + mfn = gnttab_end_foreign_transfer_ref(ref); >> + gnttab_release_grant_reference(&np->gref_rx_head, ref); >> + } >> np->grant_rx_ref[id] = GRANT_INVALID_REF; >> >> if (0 == mfn) { >> @@ -1607,6 +1834,13 @@ again: >> goto abort_transaction; >> } >> >> + err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants", >> + "%u", info->persistent_gnt); > As in netback, I think "feature-persistent" should be used.Same in blkback, I assume it is "feature-persistent-grants", right? I referred your RFC patch, did you change it later? Or I missed something? Thanks Annie> >> + if (err) { >> + message = "writing feature-persistent-grants"; >> + xenbus_dev_fatal(dev, err, "%s", message); >> + } >> + >> err = xenbus_transaction_end(xbt, 0); >> if (err) { >> if (err == -EAGAIN) >> @@ -1634,6 +1868,7 @@ static int xennet_connect(struct net_device *dev) >> grant_ref_t ref; >> struct xen_netif_rx_request *req; >> unsigned int feature_rx_copy; >> + int ret, val; >> >> err = xenbus_scanf(XBT_NIL, np->xbdev->otherend, >> "feature-rx-copy", "%u",&feature_rx_copy); >> @@ -1646,6 +1881,13 @@ static int xennet_connect(struct net_device *dev) >> return -ENODEV; >> } >> >> + err = xenbus_scanf(XBT_NIL, np->xbdev->otherend, >> + "feature-persistent-grants", "%u",&val); >> + if (err != 1) >> + val = 0; >> + >> + np->persistent_gnt = !!val; >> + >> err = talk_to_netback(np->xbdev, np); >> if (err) >> return err; >> @@ -1657,9 +1899,24 @@ static int xennet_connect(struct net_device *dev) >> spin_lock_bh(&np->rx_lock); >> spin_lock_irq(&np->tx_lock); >> >> + np->tx_gnt_cnt = 0; >> + np->rx_gnt_cnt = 0; >> + >> /* Step 1: Discard all pending TX packet fragments. */ >> xennet_release_tx_bufs(np); >> >> + if (np->persistent_gnt) { >> + struct gnt_list *gnt_list_entry; >> + >> + while (np->rx_gnt_list) { >> + gnt_list_entry = np->rx_gnt_list; >> + np->rx_gnt_list = np->rx_gnt_list->tail; >> + gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL); >> + __free_page(gnt_list_entry->gnt_pages); >> + kfree(gnt_list_entry); >> + } >> + } >> + >> /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */ >> for (requeue_idx = 0, i = 0; i< NET_RX_RING_SIZE; i++) { >> skb_frag_t *frag; >> @@ -1673,10 +1930,11 @@ static int xennet_connect(struct net_device *dev) >> >> frag =&skb_shinfo(skb)->frags[0]; >> page = skb_frag_page(frag); >> - gnttab_grant_foreign_access_ref( >> - ref, np->xbdev->otherend_id, >> - pfn_to_mfn(page_to_pfn(page)), >> - 0); >> + ret = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)), >> + page_address(page), requeue_idx, ref); >> + if ((signed short)ret< 0) >> + break; >> + >> req->gref = ref; >> req->id = requeue_idx; >> >> -- >> 1.7.3.4 >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel >>
ANNIE LI
2012-Nov-16 07:57 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On 2012-11-16 10:49, ANNIE LI wrote:
>
>
> On 2012-11-15 17:57, Roger Pau Monné wrote:
>>>
>>> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>>>  		val = 0;
>>>  	vif->csum = !val;
>>>
>>> -	/* Map the shared frame, irq etc. */
>>> +	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
>>> +			 "%u", &val) < 0)
>> In block devices "feature-persistent" is used, so I think that for
>> clearness it should be announced the same way in net.
> Is it "feature-persistent" ? I checked your RFC patch, the key is
> "feature-persistent-grants".
>
My mistake. In your v2 patch it is "feature-persistent". I will change the
code to match blkback/blkfront.

Thanks
Annie
ANNIE LI
2012-Nov-16 07:58 UTC
Re: [Xen-devel] [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
On 2012-11-16 13:22, ANNIE LI wrote:
>
>
> On 2012-11-15 18:52, Roger Pau Monné wrote:
>>> +	err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants",
>>> +			    "%u", info->persistent_gnt);
>> As in netback, I think "feature-persistent" should be used.
>
> Same in blkback, I assume it is "feature-persistent-grants", right?
> I referred your RFC patch, did you change it later? Or I missed
> something?
>
My mistake. In your v2 patch it is "feature-persistent". I will change the
code to match blkback/blkfront.

Thanks
Annie
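What the agreed rename would look like on both ends, sketched only for illustration from the hunks quoted earlier in the thread (the surrounding transaction and error handling stays as in the posted patches):

	/* netback (xenbus.c): announce the feature under the agreed name */
	err = xenbus_printf(xbt, dev->nodename, "feature-persistent",
			    "%u", 1);
	if (err) {
		message = "writing feature-persistent";
		goto abort_transaction;
	}

	/* netfront (and connect_rings() in netback): read the same key,
	 * defaulting to off when the other end does not advertise it */
	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent",
			 "%u", &val) != 1)
		val = 0;
	np->persistent_gnt = !!val;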
Ian Campbell
2012-Nov-16 09:27 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On Fri, 2012-11-16 at 02:18 +0000, ANNIE LI wrote:
> In this patch,
> The maximum of memory overhead is about
>
> (XEN_NETIF_TX_RING_SIZE+XEN_NETIF_RX_RING_SIZE)*PAGE_SIZE (plus size of grant_ref_t and handle)
> which is about 512 PAGE_SIZE. Normally, without heavy network offload, this maximum can not be reached.
>
> In next patch of splitting tx/rx pool, the maximum is about

"about" or just "is"?

> (256+512)PAGE_SIZE.

IOW 3MB.

> >
> >> +
> >> +	return NULL;
> >> +}
> >> +
> >> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
> >>  		gop->source.domid = vif->domid;
> >>  		gop->source.offset = txreq.offset;
> >>
> >> -		gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> >> +		if (!vif->persistent_grant)
> >> +			gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> >> +		else
> >> +			gop->dest.u.gmfn = (unsigned long)page_address(page);
> > page_address doesn't return any sort of frame number, does it? This is
> > rather confusing...
>
> Yes. I only use dest.u.gmfn element to save the page_address here for
> future memcpy, and it does not mean to use frame number actually. To
> avoid confusion, here I can use
>
> gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>
> and then call mfn_to_virt when doing memcpy.

It seems a bit odd to be using the gop structure in this way when you
aren't actually doing a grant op on it.

While investigating I noticed:

+static int
+grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
+		     struct xen_netbk *netbk, bool tx_pool)
...
+	struct gnttab_copy *uop = vuop;

Why void *vuop? Why not struct gnttab_copy * in the parameter?

I also noticed your new grant_memory_copy_op() seems to have unbatched
the grant ops in the non-persistent case, which is going to suck for
performance in non-persistent mode. You need to pull the conditional and
the HYPERVISOR_grant_table_op outside the loop and pass it the full array
instead of doing them one at a time.

Ian
Ian Campbell
2012-Nov-16 09:32 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On Fri, 2012-11-16 at 02:49 +0000, ANNIE LI wrote:
> >
> > Take a look at the following functions from blkback; foreach_grant,
> > add_persistent_gnt and get_persistent_gnt. They are generic functions to
> > deal with persistent grants.
>
> Ok, thanks.
> Or moving those functions into a separate common file?

Please put them somewhere common.

> > This is highly inefficient, one of the points of using gnttab_set_map_op
> > is that you can queue a bunch of grants, and then map them at the same
> > time using gnttab_map_refs, but here you are using it to map a single
> > grant at a time. You should instead see how much grants you need to map
> > to complete the request and map them all at the same time.
>
> Yes, it is inefficient here. But this is limited by current netback
> implementation. Current netback is not per-VIF based(not like blkback
> does). After combining persistent grant and non persistent grant
> together, every vif request in the queue may/may not support persistent
> grant. I have to judge whether every vif in the queue supports
> persistent grant or not. If it support, memcpy is used, if not,
> grantcopy is used.

You could (and should) still batch all the grant copies into one
hypercall, e.g. walk the list either doing memcpy or queuing up copyops
as appropriate, then at the end if the queue is non-zero length issue
the hypercall.

I'd expect this lack of batching here and in the other case I just
spotted to have a detrimental effect on guests running with this patch
but not using persistent grants. Did you benchmark that case?

> After making netback per-VIF works, this issue can be fixed.

You've mentioned improvements which are conditional on this work a few
times I think, perhaps it makes sense to make that change first?

Ian.
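A rough sketch of the batching Ian describes, on top of the structures from patch 1/4. The scratch fields (copy_batch, copy_batch_idx) and the persistent_memcpy() helper are invented here purely for illustration; they are not in the posted series:

	/* assumed additions to struct xen_netbk, sized like grant_copy_op[]:
	 *	struct gnttab_copy copy_batch[2*XEN_NETIF_RX_RING_SIZE];
	 *	u16 copy_batch_idx[2*XEN_NETIF_RX_RING_SIZE];
	 */
	static int grant_memory_copy_op(unsigned int cmd, struct gnttab_copy *uop,
					unsigned int count, struct xen_netbk *netbk,
					bool tx_pool)
	{
		unsigned int i, nbatch = 0;
		int ret;

		BUG_ON(cmd != GNTTABOP_copy);

		for (i = 0; i < count; i++) {
			struct xenvif *vif = tx_pool ? netbk->gnttab_tx_vif[i] :
						       netbk->gnttab_rx_vif[i];

			if (vif->persistent_grant) {
				/* pool lookup + memcpy, as the patch already does */
				persistent_memcpy(vif, &uop[i]);
			} else {
				/* defer the real grant copy */
				netbk->copy_batch[nbatch] = uop[i];
				netbk->copy_batch_idx[nbatch++] = i;
			}
		}

		if (nbatch == 0)
			return 0;

		/* one hypercall for every copy the hypervisor still has to do */
		ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, netbk->copy_batch,
						nbatch);
		BUG_ON(ret);

		/* write the status back into the caller's ops so the existing
		 * netbk_check_gop()/tx_submit() checks still see it */
		for (i = 0; i < nbatch; i++)
			uop[netbk->copy_batch_idx[i]].status =
				netbk->copy_batch[i].status;

		return 0;
	}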
ANNIE LI
2012-Nov-16 09:55 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On 2012-11-16 17:27, Ian Campbell wrote:
> On Fri, 2012-11-16 at 02:18 +0000, ANNIE LI wrote:
>> In this patch,
>> The maximum of memory overhead is about
>>
>> (XEN_NETIF_TX_RING_SIZE+XEN_NETIF_RX_RING_SIZE)*PAGE_SIZE (plus size of grant_ref_t and handle)
>> which is about 512 PAGE_SIZE. Normally, without heavy network offload, this maximum can not be reached.
>>
>> In next patch of splitting tx/rx pool, the maximum is about
> "about" or just "is"?

For the grant pages alone it is exactly this value; I said "about" because
I also took the other elements into account, grant_ref_t and map (which
will be changed to a handle in future).

>
>> (256+512)PAGE_SIZE.
> IOW 3MB.
>
>>>> +
>>>> +	return NULL;
>>>> +}
>>>> +
>>>> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>>>>  		gop->source.domid = vif->domid;
>>>>  		gop->source.offset = txreq.offset;
>>>>
>>>> -		gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>>>> +		if (!vif->persistent_grant)
>>>> +			gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>>>> +		else
>>>> +			gop->dest.u.gmfn = (unsigned long)page_address(page);
>>> page_address doesn't return any sort of frame number, does it? This is
>>> rather confusing...
>> Yes. I only use dest.u.gmfn element to save the page_address here for
>> future memcpy, and it does not mean to use frame number actually. To
>> avoid confusion, here I can use
>>
>> gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>>
>> and then call mfn_to_virt when doing memcpy.
> It seems a bit odd to be using the gop structure in this way when you
> aren't actually doing a grant op on it.
>
> While investigating I noticed:
> +static int
> +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
> +		     struct xen_netbk *netbk, bool tx_pool)
> ...
> +	struct gnttab_copy *uop = vuop;
>
> Why void *vuop? Why not struct gnttab_copy * in the parameter?

Sorry, my mistake.

>
> I also noticed your new grant_memory_copy_op() seems to have unbatched
> the grant ops in the non-persistent case, which is going to suck for
> performance in non-persistent mode. You need to pull the conditional and
> the HYPERVISOR_grant_table_op outside the loop and pass it the full array
> instead of doing them one at a time.

This again comes down to the per-VIF netback implementation. Currently
these cannot be pulled outside the loop, since the netback queue may
contain persistent and non-persistent vifs in the same queue. I did
consider implementing per-VIF first and then persistent grants, but since
that is part of Wei's patch series combined with other patches, I finally
decided to implement per-VIF later. This does limit the persistent grant
implementation, though.

Thanks
Annie

>
> Ian
>
Ian Campbell
2012-Nov-16 09:57 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
> This patch implements persistent grants for xen-netfront/netback.

Hang on a sec. It has just occurred to me that netfront/netback in the
current mainline kernels don't currently use grant maps at all, they use
grant copy on both the tx and rx paths.

The supposed benefit of persistent grants is to avoid the TLB shootdowns
on grant unmap, but in the current code there should be exactly zero of
those.

If I understand correctly this patch goes from using grant copy
operations to persistently mapping frames and then using memcpy on those
buffers to copy in/out to local buffers. I'm finding it hard to think of
a reason why this should perform any better, do you have a theory which
explains it? (my best theory is that it has a beneficial impact on the
cache locality of the data, but netperf doesn't typically actually access
the data so I'm not sure why that would matter)

Also AIUI this is also doing persistent grants for both Tx and Rx
directions?

For guest Rx does this mean it now copies twice, in dom0 from the DMA
buffer to the guest provided buffer and then again in the guest from the
granted buffer to a normal one?

For guest Tx how do you handle the lifecycle of the grant mapped pages
which are being sent up into the dom0 network stack? Or are you also now
copying twice in this case? (i.e. guest copies into a granted buffer and
dom0 copies out into a local buffer?)

Did you do measurements of the Tx and Rx cases independently? Do you know
that they both benefit from this change (rather than for example an
improvement in one direction masking a regression in the other)? Were
the numbers you previously posted in one particular direction or did you
measure both?

Ian.
ANNIE LI
2012-Nov-16 11:34 UTC
Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
On 2012-11-16 17:32, Ian Campbell wrote:
> On Fri, 2012-11-16 at 02:49 +0000, ANNIE LI wrote:
>>> Take a look at the following functions from blkback; foreach_grant,
>>> add_persistent_gnt and get_persistent_gnt. They are generic functions to
>>> deal with persistent grants.
>> Ok, thanks.
>> Or moving those functions into a separate common file?
> Please put them somewhere common.

Ok.

>>> This is highly inefficient, one of the points of using gnttab_set_map_op
>>> is that you can queue a bunch of grants, and then map them at the same
>>> time using gnttab_map_refs, but here you are using it to map a single
>>> grant at a time. You should instead see how much grants you need to map
>>> to complete the request and map them all at the same time.
>> Yes, it is inefficient here. But this is limited by current netback
>> implementation. Current netback is not per-VIF based(not like blkback
>> does). After combining persistent grant and non persistent grant
>> together, every vif request in the queue may/may not support persistent
>> grant. I have to judge whether every vif in the queue supports
>> persistent grant or not. If it support, memcpy is used, if not,
>> grantcopy is used.
> You could (and should) still batch all the grant copies into one
> hypercall, e.g. walk the list either doing memcpy or queuing up copyops
> as appropriate, then at the end if the queue is non-zero length issue
> the hypercall.

This still comes back to the per-VIF netback implementation.

> I'd expect this lack of batching here and in the other case I just
> spotted to have a detrimental effect on guests running with this patch
> but not using persistent grants. Did you benchmark that case?

I did some tests before, but I had better produce more detailed results
for the different cases.

>> After making netback per-VIF works, this issue can be fixed.
> You've mentioned improvements which are conditional on this work a few
> times I think, perhaps it makes sense to make that change first?

Yes, I did consider implementing per-VIF first, before persistent grants,
but since it is part of Wei's patch series combined with other patches, I
decided to implement it later. Making that change first would make things
clearer, though.

Thanks
Annie

> Ian.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
ANNIE LI
2012-Nov-16 11:37 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-16 17:57, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>> This patch implements persistent grants for xen-netfront/netback.
> Hang on a sec. It has just occurred to me that netfront/netback in the
> current mainline kernels don't currently use grant maps at all, they use
> grant copy on both the tx and rx paths.

Ah, this patch is based on v3.4-rc3. The current mainline kernel does not
pass the netperf/netserver case. As I mentioned earlier, I hit the BUG_ON
with your debug patch too when testing the mainline kernel with
netperf/netserver. This is interesting; I should have checked the latest
code.

>
> The supposed benefit of persistent grants is to avoid the TLB shootdowns
> on grant unmap, but in the current code there should be exactly zero of
> those.

Is there any performance documentation about the current grant copy code
in the mainline kernel?

>
> If I understand correctly this patch goes from using grant copy
> operations to persistently mapping frames and then using memcpy on those
> buffers to copy in/out to local buffers. I'm finding it hard to think of
> a reason why this should perform any better, do you have a theory which
> explains it?

This patch aims to fix the spinlock contention of grant operations; it
tries to avoid grant operations (including grant map and copy) where
possible.

> (my best theory is that it has a beneficial impact on the
> cache locality of the data, but netperf doesn't typically actually access
> the data so I'm not sure why that would matter)
>
> Also AIUI this is also doing persistent grants for both Tx and Rx
> directions?

Yes.

>
> For guest Rx does this mean it now copies twice, in dom0 from the DMA
> buffer to the guest provided buffer and then again in the guest from the
> granted buffer to a normal one?

Yes.

>
> For guest Tx how do you handle the lifecycle of the grant mapped pages
> which are being sent up into the dom0 network stack? Or are you also now
> copying twice in this case? (i.e. guest copies into a granted buffer and
> dom0 copies out into a local buffer?)

Copy twice: the guest copies into a granted buffer and dom0 copies out
into a local buffer.

>
> Did you do measurements of the Tx and Rx cases independently?

No.

> Do you know
> that they both benefit from this change (rather than for example an
> improvement in one direction masking a regression in the other)?

In theory, this implementation avoids the spinlock contention of grant
operations, so they should both benefit from it.

> Were
> the numbers you previously posted in one particular direction or did you
> measure both?

One particular direction: one side runs as server, the other as client.

Thanks
Annie

>
> Ian.
>
Ian Campbell
2012-Nov-16 11:46 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Fri, 2012-11-16 at 11:37 +0000, ANNIE LI wrote:
>
> On 2012-11-16 17:57, Ian Campbell wrote:
> > On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
> >> This patch implements persistent grants for xen-netfront/netback.
> > Hang on a sec. It has just occurred to me that netfront/netback in the
> > current mainline kernels don't currently use grant maps at all, they use
> > grant copy on both the tx and rx paths.
>
> Ah, this patch is based on v3.4-rc3.

Nothing has changed in more recent kernels in this regard.

> >
> > The supposed benefit of persistent grants is to avoid the TLB shootdowns
> > on grant unmap, but in the current code there should be exactly zero of
> > those.
>
> Is there any performance documentation about the current grant copy code
> in the mainline kernel?

Not AFAIK.

> > If I understand correctly this patch goes from using grant copy
> > operations to persistently mapping frames and then using memcpy on those
> > buffers to copy in/out to local buffers. I'm finding it hard to think of
> > a reason why this should perform any better, do you have a theory which
> > explains it?
>
> This patch aims to fix the spinlock contention of grant operations; it
> tries to avoid grant operations (including grant map and copy) where
> possible.

Makes sense. This is the sort of thing which ought to feature prominently
in commit messages and/or introductory mails.

> > Do you know
> > that they both benefit from this change (rather than for example an
> > improvement in one direction masking a regression in the other)?
>
> In theory, this implementation avoids the spinlock contention of grant
> operations, so they should both benefit from it.

It seems like having netfront simply allocate itself a pool of grant
references which it reuses would give equivalent benefits whilst being a
smaller patch, with no protocol change and avoiding double copying. In
fact, by avoiding the double copy I'd expect it to be even better.

> > Were
> > the numbers you previously posted in one particular direction or did you
> > measure both?
>
> One particular direction: one side runs as server, the other as client.

I think you need to measure both dom0->domU and domU->dom0 to get the
full picture since AIUI netperf sends the bulk data in only one direction
with just ACKs coming back the other way.

Ian.
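A rough sketch of the frontend-only reuse Ian suggests: a pool of pages that stay granted to the backend for the lifetime of the device, with tx data memcpy'd into them and no xenstore feature or backend change implied. The struct, the np->tx_gref_pool list and the function names are all illustrative, not from the posted patches:

	struct reused_gref {
		grant_ref_t gref;
		struct page *page;
		struct list_head node;
	};

	/* pop a ready-granted page, or grant a fresh one the first time */
	static struct reused_gref *gref_pool_get(struct netfront_info *np)
	{
		struct reused_gref *e;
		int ref;

		if (!list_empty(&np->tx_gref_pool)) {
			e = list_first_entry(&np->tx_gref_pool,
					     struct reused_gref, node);
			list_del(&e->node);
			return e;
		}

		e = kmalloc(sizeof(*e), GFP_ATOMIC);
		if (!e)
			return NULL;
		e->page = alloc_page(GFP_ATOMIC);
		if (!e->page)
			goto err_free;
		ref = gnttab_claim_grant_reference(&np->gref_tx_head);
		if (ref < 0)
			goto err_page;

		e->gref = ref;
		/* granted read-only once; never ended until device teardown */
		gnttab_grant_foreign_access_ref(e->gref, np->xbdev->otherend_id,
						pfn_to_mfn(page_to_pfn(e->page)), 1);
		return e;

	err_page:
		__free_page(e->page);
	err_free:
		kfree(e);
		return NULL;
	}

	/* tx completion: the page stays granted, the entry just goes back */
	static void gref_pool_put(struct netfront_info *np, struct reused_gref *e)
	{
		list_add(&e->node, &np->tx_gref_pool);
	}

With something like this, xennet_start_xmit() would memcpy the skb data into e->page and put e->gref in the tx request, and xennet_tx_buf_gc() would call gref_pool_put() instead of ending foreign access.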
Konrad Rzeszutek Wilk
2012-Nov-16 15:18 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:> This patch implements persistent grants for xen-netfront/netback. This > mechanism maintains page pools in netback/netfront, these page pools is used to > save grant pages which are mapped. This way improve performance which is wasted > when doing grant operations. > > Current netback/netfront does map/unmap grant operations frequently when > transmitting/receiving packets, and grant operations costs much cpu clock. In > this patch, netfront/netback maps grant pages when needed and then saves them > into a page pool for future use. All these pages will be unmapped when > removing/releasing the net device. > > In netfront, two pools are maintained for transmitting and receiving packets. > When new grant pages are needed, the driver gets grant pages from this pool > first. If no free grant page exists, it allocates new page, maps it and then > saves it into the pool. The pool size for transmit/receive is exactly tx/rx > ring size. The driver uses memcpy(not grantcopy) to copy data grant pages. > Here, memcpy is copying the whole page size data. I tried to copy len size data > from offset, but network does not seem work well. I am trying to find the root > cause now. > > In netback, it also maintains two page pools for tx/rx. When netback gets a > request, it does a search first to find out whether the grant reference of > this request is already mapped into its page pool. If the grant ref is mapped, > the address of this mapped page is gotten and memcpy is used to copy data > between grant pages. However, if the grant ref is not mapped, a new page is > allocated, mapped with this grant ref, and then saved into page pool for > future use. Similarly, memcpy replaces grant copy to copy data between grant > pages. In this implementation, two arrays(gnttab_tx_vif,gnttab_rx_vif) are > used to save vif pointer for every request because current netback is not > per-vif based. This would be changed after implementing 1:1 model in netback. > > This patch supports both persistent-grant and non persistent grant. A new > xenstore key "feature-persistent-grants" is used to represent this feature. > > This patch is based on linux3.4-rc3. I hit netperf/netserver failure on > linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure whether this > netperf/netserver failure connects compound page commit in v3.7-rc1, but I did > hit BUG_ON with debug patch from thread > http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.htmlFYI, I get this: 477.814511] BUG: sleeping function called from invalid context at /home/konrad/ssd/linux/mm/page_alloc.c:2487 [ 477.815281] in_atomic(): 1, irqs_disabled(): 1, pid: 3017, name: netperf [ 477.815281] Pid: 3017, comm: netperf Not tainted 3.5.0upstream-00004-g69047bb #1 [ 477.815281] Call Trace: [ 477.815281] [<ffffffff810b990a>] __might_sleep+0xda/0x100 [ 477.815281] [<ffffffff81142e93>] __alloc_pages_nodemask+0x223/0x920 [ 477.815281] [<ffffffff81158439>] ? zone_statistics+0x99/0xc0 [ 477.815281] [<ffffffff81076e79>] ? default_spin_lock_flags+0x9/0x10 [ 477.815281] [<ffffffff81615e3a>] ? _raw_spin_lock_irqsave+0x3a/0x50 [ 477.815281] [<ffffffff81076e79>] ? default_spin_lock_flags+0x9/0x10 [ 477.815281] [<ffffffff81098977>] ? lock_timer_base+0x37/0x70 [ 477.815281] [<ffffffff8109a03d>] ? mod_timer_pending+0x11d/0x230 [ 477.815281] [<ffffffff81616144>] ? 
_raw_spin_unlock_bh+0x24/0x30 [ 477.815281] [<ffffffff8117e7e1>] alloc_pages_current+0xb1/0x110 [ 477.815281] [<ffffffffa0034238>] xennet_alloc_tx_ref+0x78/0x1c0 [xen_netfront] [ 477.815281] [<ffffffffa00344eb>] xennet_start_xmit+0x16b/0x9f0 [xen_netfront] [ 477.815281] [<ffffffff814c69eb>] dev_hard_start_xmit+0x2fb/0x6f0 [ 477.815281] [<ffffffff814e4566>] sch_direct_xmit+0x116/0x1e0 [ 477.815281] [<ffffffff814c6f6a>] dev_queue_xmit+0x18a/0x6b0 [ 477.815281] [<ffffffff8151264e>] ip_finish_output+0x18e/0x300 [ 477.815281] [<ffffffff81512821>] ip_output+0x61/0xa0 [ 477.815281] [<ffffffff81511b82>] ? __ip_local_out+0xa2/0xb0 [ 477.815281] [<ffffffff81511bb4>] ip_local_out+0x24/0x30 [ 477.815281] [<ffffffff81511ffe>] ip_queue_xmit+0x15e/0x410 [ 477.815281] [<ffffffff81528354>] tcp_transmit_skb+0x424/0x8f0 [ 477.815281] [<ffffffff8152a8c2>] tcp_write_xmit+0x1f2/0x9c0 [ 477.815281] [<ffffffff81182194>] ? ksize+0x14/0x70 [ 477.815281] [<ffffffff8152b711>] __tcp_push_pending_frames+0x21/0x90 [ 477.815281] [<ffffffff8151db23>] tcp_sendmsg+0x983/0xcd0 [ 477.815281] [<ffffffff81540daf>] inet_sendmsg+0x7f/0xd0 [ 477.815281] [<ffffffff81290dde>] ? selinux_socket_sendmsg+0x1e/0x20 [ 477.815281] [<ffffffff814aed13>] sock_sendmsg+0xf3/0x120 [ 477.815281] [<ffffffff81076f48>] ? pvclock_clocksource_read+0x58/0xd0 [ 477.815281] [<ffffffff812de7c0>] ? timerqueue_add+0x60/0xb0 [ 477.815281] [<ffffffff810b0b85>] ? enqueue_hrtimer+0x25/0xb0 [ 477.815281] [<ffffffff814af4d4>] sys_sendto+0x104/0x140 [ 477.815281] [<ffffffff81041279>] ? xen_clocksource_read+0x39/0x50 [ 477.815281] [<ffffffff81041419>] ? xen_clocksource_get_cycles+0x9/0x10 [ 477.815281] [<ffffffff810d3242>] ? getnstimeofday+0x52/0xe0 [ 477.815281] [<ffffffff8161dfb9>] system_call_fastpath+0x16/0x1b> > > Annie Li (4): > xen/netback: implements persistent grant with one page pool. > xen/netback: Split one page pool into two(tx/rx) page pool. > Xen/netfront: Implement persistent grant in netfront. > fix code indent issue in xen-netfront. > > drivers/net/xen-netback/common.h | 24 ++- > drivers/net/xen-netback/interface.c | 26 +++ > drivers/net/xen-netback/netback.c | 215 ++++++++++++++++++-- > drivers/net/xen-netback/xenbus.c | 14 ++- > drivers/net/xen-netfront.c | 378 +++++++++++++++++++++++++++++------ > 5 files changed, 570 insertions(+), 87 deletions(-) > > -- > 1.7.3.4
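The trace above is the usual sleeping-allocation-in-atomic-context pattern: xennet_alloc_tx_ref() is reached from xennet_start_xmit(), which runs with bottom halves disabled, so the GFP_KERNEL allocations on that path would presumably need to become GFP_ATOMIC. This is only a guess at the fix, not something confirmed in the thread; the excerpt below simply restates the hunk from patch 3/4 with the flag changed:

	/* xennet_alloc_tx_ref() runs in the transmit path (bh-disabled
	 * context), so it must not sleep */
	granted_page = alloc_page(GFP_ATOMIC);
	if (!granted_page) {
		gnttab_release_grant_reference(&np->gref_tx_head, ref);
		return -ENOMEM;
	}

	gnt_list_entry = kmalloc(sizeof(struct gnt_list), GFP_ATOMIC);
	if (!gnt_list_entry) {
		__free_page(granted_page);
		gnttab_release_grant_reference(&np->gref_tx_head, ref);
		return -ENOMEM;
	}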
Konrad Rzeszutek Wilk
2012-Nov-16 15:21 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, Nov 15, 2012 at 11:56:17AM +0100, Roger Pau Monné wrote:> On 15/11/12 09:38, ANNIE LI wrote: > > > > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote: > >> Hello, > >> > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote: > >>> This patch implements persistent grants for xen-netfront/netback. This > >>> mechanism maintains page pools in netback/netfront, these page pools is used to > >>> save grant pages which are mapped. This way improve performance which is wasted > >>> when doing grant operations. > >>> > >>> Current netback/netfront does map/unmap grant operations frequently when > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In > >>> this patch, netfront/netback maps grant pages when needed and then saves them > >>> into a page pool for future use. All these pages will be unmapped when > >>> removing/releasing the net device. > >>> > >> Do you have performance numbers available already? with/without persistent grants? > > I have some simple netperf/netserver test result with/without persistent > > grants, > > > > Following is result of with persistent grant patch, > > > > Guests, Sum, Avg, Min, Max > > 1, 15106.4, 15106.4, 15106.36, 15106.36 > > 2, 13052.7, 6526.34, 6261.81, 6790.86 > > 3, 12675.1, 6337.53, 6220.24, 6454.83 > > 4, 13194, 6596.98, 6274.70, 6919.25 > > > > > > Following are result of without persistent patch > > > > Guests, Sum, Avg, Min, Max > > 1, 10864.1, 10864.1, 10864.10, 10864.10 > > 2, 10898.5, 5449.24, 4862.08, 6036.40 > > 3, 10734.5, 5367.26, 5261.43, 5473.08 > > 4, 10924, 5461.99, 5314.84, 5609.14 > > In the block case, performance improvement is seen when using a large > number of guests, could you perform the same benchmark increasing the > number of guests to 15?Keep in mind that one of the things that is limiting these numbers is that netback is very CPU intensive. So I think it could get much much faster - but netback pegs at 100%. With Wei Liu''s patches the CPU usage did drop by 40% (this is when I tested the old netback with Wei''s netback patches)- so we should see even a further speed increase.
Konrad Rzeszutek Wilk
2012-Nov-16 15:23 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, Nov 15, 2012 at 07:11:07PM +0000, Ian Campbell wrote:
> On Thu, 2012-11-15 at 18:29 +0000, Konrad Rzeszutek Wilk wrote:
> > On Thu, Nov 15, 2012 at 11:15:06AM +0000, Ian Campbell wrote:
> > > On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:
> > > > On 15/11/12 09:38, ANNIE LI wrote:
> > > > >
> > > > >
> > > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > > > >> Hello,
> > > > >>
> > > > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> > > > >>> This patch implements persistent grants for xen-netfront/netback. This
> > > > >>> mechanism maintains page pools in netback/netfront, these page pools is used to
> > > > >>> save grant pages which are mapped. This way improve performance which is wasted
> > > > >>> when doing grant operations.
> > > > >>>
> > > > >>> Current netback/netfront does map/unmap grant operations frequently when
> > > > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In
> > > > >>> this patch, netfront/netback maps grant pages when needed and then saves them
> > > > >>> into a page pool for future use. All these pages will be unmapped when
> > > > >>> removing/releasing the net device.
> > > > >>>
> > > > >> Do you have performance numbers available already? with/without persistent grants?
> > > > > I have some simple netperf/netserver test result with/without persistent
> > > > > grants,
> > > > >
> > > > > Following is result of with persistent grant patch,
> > > > >
> > > > > Guests, Sum, Avg, Min, Max
> > > > > 1, 15106.4, 15106.4, 15106.36, 15106.36
> > > > > 2, 13052.7, 6526.34, 6261.81, 6790.86
> > > > > 3, 12675.1, 6337.53, 6220.24, 6454.83
> > > > > 4, 13194, 6596.98, 6274.70, 6919.25
> > > > >
> > > > > Following are result of without persistent patch
> > > > >
> > > > > Guests, Sum, Avg, Min, Max
> > > > > 1, 10864.1, 10864.1, 10864.10, 10864.10
> > > > > 2, 10898.5, 5449.24, 4862.08, 6036.40
> > > > > 3, 10734.5, 5367.26, 5261.43, 5473.08
> > > > > 4, 10924, 5461.99, 5314.84, 5609.14
> > > >
> > > > In the block case, performance improvement is seen when using a large
> > > > number of guests, could you perform the same benchmark increasing the
> > > > number of guests to 15?
> > >
> > > It would also be nice to see some analysis of the numbers which justify
> > > why this change is a good one without every reviewer having to evaluate
> > > the raw data themselves. In fact this should really be part of the
> > > commit message.
> >
> > You mean like a nice graph, eh?
>
> Together with an analysis of what it means and why it is a good thing,
> yes.

OK, let's put that on the TODO list for the next posting. In the
meantime - it sounds like you (the maintainer) are happy with the
direction this is going. The other thing we want to do _after_ these
patches is to look at the Wei Liu patches and try to address the
different reviewers' comments. The neat thing about them is that they
have a concept of a page pool system. And with persistent pages in both
blkback and netback this gets more exciting.

>
> Ian.
>
> >
> > I will run these patches on my 32GB box and see if I can give you
> > a nice PDF/jpg.
> >
> >
> >
> > Ian.
> >
> >
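For what it is worth, a first cut of the comparison Ian is asking for can be
read straight off the sums quoted above; the ratios below are computed from
those quoted numbers (with-persistent divided by without-persistent aggregate
throughput), not from any additional measurement.

15106.4 / 10864.1 ~= 1.39   (1 guest:  ~39% more aggregate throughput)
13052.7 / 10898.5 ~= 1.20   (2 guests: ~20%)
12675.1 / 10734.5 ~= 1.18   (3 guests: ~18%)
13194   / 10924   ~= 1.21   (4 guests: ~21%)

A proper analysis would still need CPU utilisation figures and runs with more
guests, as discussed above, but the per-row ratios give a feel for the gain.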
Konrad Rzeszutek Wilk
2012-Nov-16 15:34 UTC
Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On Thu, Nov 15, 2012 at 05:35:13PM +0800, Wei Liu wrote:
> On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com> wrote:
>
> >
> >
> > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> >
> >> Hello,
> >>
> >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >>
> >>> This patch implements persistent grants for xen-netfront/netback. This
> >>> mechanism maintains page pools in netback/netfront, these page pools is
> >>> used to
> >>> save grant pages which are mapped. This way improve performance which is
> >>> wasted
> >>> when doing grant operations.
> >>>
> >>> Current netback/netfront does map/unmap grant operations frequently when
> >>> transmitting/receiving packets, and grant operations costs much cpu
> >>> clock. In
> >>> this patch, netfront/netback maps grant pages when needed and then saves
> >>> them
> >>> into a page pool for future use. All these pages will be unmapped when
> >>> removing/releasing the net device.
> >>>
> >>> Do you have performance numbers available already? with/without
> >> persistent grants?
> >>
> > I have some simple netperf/netserver test result with/without persistent
> > grants,
> >
> > Following is result of with persistent grant patch,
> >
> > Guests, Sum, Avg, Min, Max
> > 1, 15106.4, 15106.4, 15106.36, 15106.36
> > 2, 13052.7, 6526.34, 6261.81, 6790.86
> > 3, 12675.1, 6337.53, 6220.24, 6454.83
> > 4, 13194, 6596.98, 6274.70, 6919.25
> >
> >
> > Following are result of without persistent patch
> >
> > Guests, Sum, Avg, Min, Max
> > 1, 10864.1, 10864.1, 10864.10, 10864.10
> > 2, 10898.5, 5449.24, 4862.08, 6036.40
> > 3, 10734.5, 5367.26, 5261.43, 5473.08
> > 4, 10924, 5461.99, 5314.84, 5609.14
> >
> >
> >
> Interesting results. Have you tested how good it is on a 10G nic, i.e.
> guest sending packets
> through physical network to another host.

Not yet. This was done with two guests pounding each other. I am
setting two machines up for Annie so she can do that type of testing
and also with more guests.
annie li
2012-Nov-17 04:39 UTC
Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
On 2012-11-16 19:46, Ian Campbell wrote:
>
> It seems like having netfront simply allocate itself a pool of grant
> references which it reuses would give equivalent benefits whilst being a
> smaller patch, with no protocol change and avoiding double copying. In
> fact by avoiding the double copy I'd expect it to be even better.
>
>
OK, I can try this implementation. And I'd better compare the performance
of double copy with this implementation too, to see how much the grant
lock affects grant copy.

> I think you need to measure both dom0->domU and domU->dom0 to get the
> full picture since AIUI netperf sends the bulk data in only one
> direction with just ACKs coming back the other way.
>
Yes. My environment cannot run enough VMs for that, so I will do more
thorough testing after Konrad sets up the environment.

Thanks
Annie
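One possible reading of Ian's suggestion above - netfront keeping a small pool
of pages it has already granted to the backend and recycling them per packet,
instead of granting and revoking on every packet - could be sketched roughly
as below. All names (gref_page, gref_pool_get, gref_pool_put) are hypothetical
and this is only an illustration of the idea, not code from any posted patch.

#include <linux/list.h>
#include <linux/mm.h>
#include <xen/grant_table.h>

/* Hypothetical pool entry: a page that stays granted across packets. */
struct gref_page {
	struct list_head list;
	grant_ref_t gref;       /* granted once, left granted */
	struct page *page;
};

/* Take a reusable granted page, or NULL if the pool is currently empty. */
static struct gref_page *gref_pool_get(struct list_head *pool)
{
	struct gref_page *gp;

	if (list_empty(pool))
		return NULL;

	gp = list_first_entry(pool, struct gref_page, list);
	list_del(&gp->list);
	return gp;
}

/* Return a page to the pool; the grant is deliberately not revoked. */
static void gref_pool_put(struct list_head *pool, struct gref_page *gp)
{
	list_add(&gp->list, pool);
}

The interesting property is that gref_pool_put() never ends foreign access, so
the per-packet grant operations disappear on the frontend side; whether the
extra data copy can also be avoided is exactly the open question in the
exchange above.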