Changes in V4:
 - Squash several patches
 - Per-cpu scratch improvement, guard against race condition, NUMA-aware allocation
 - Extend xenbus ring mapping interface to benefit all Xen BE / FE
 - Remove RX protocol stub patch

Changes in V3:
 - Rework of per-cpu scratch space
 - Multi page ring support
 - Split event channels
 - Rx protocol stub
 - Fix a minor bug in module_put path

Changes in V2:
 - Fix minor bugs in V1
 - Embed pending_tx_info into page pool
 - Per-cpu scratch space
 - Notification code path clean up

This version has been tested by Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

V1: A new netback implementation which includes three major features:
 - Global page pool support
 - NAPI + kthread 1:1 model
 - Netback internal name changes

This patch series is the foundation of future work, so it is better to get it right first. Patches 1 and 3 have the real meat.

The first benefit of the 1:1 model is scheduling fairness. The rationale behind a global page pool is that we need to limit the overall memory consumed by all vifs. Using NAPI makes it possible to mitigate interrupts/events; that code path is cleaned up in a separate patch. The netback internal name changes clean up the code structure after switching to the 1:1 model and prepare netback for further code layout changes.

-----
 drivers/block/xen-blkback/xenbus.c    |    8 +-
 drivers/block/xen-blkfront.c          |    5 +-
 drivers/net/xen-netback/Makefile      |    2 +-
 drivers/net/xen-netback/common.h      |  107 +++-
 drivers/net/xen-netback/interface.c   |  230 ++++--
 drivers/net/xen-netback/netback.c     |  999 +++++++++++++++------------------
 drivers/net/xen-netback/page_pool.c   |  185 ++++++
 drivers/net/xen-netback/page_pool.h   |   66 +++
 drivers/net/xen-netback/xenbus.c      |  178 ++++++-
 drivers/net/xen-netfront.c            |  393 ++++++++++----
 drivers/pci/xen-pcifront.c            |    5 +-
 drivers/scsi/xen-scsiback/common.h    |    3 +-
 drivers/scsi/xen-scsiback/interface.c |    6 +-
 drivers/scsi/xen-scsiback/xenbus.c    |    4 +-
 drivers/scsi/xen-scsifront/xenbus.c   |    5 +-
 drivers/xen/xen-pciback/xenbus.c      |   11 +-
 drivers/xen/xenbus/xenbus_client.c    |  282 +++++++---
 include/xen/xenbus.h                  |   15 +-
 18 files changed, 1646 insertions(+), 858 deletions(-)
A global page pool. Since we are moving to 1:1 model netback, it is better to limit total RAM consumed by all the vifs. With this patch, each vif gets page from the pool and puts the page back when it is finished with the page. This pool is only meant to access via exported interfaces. Internals are subject to change when we discover new requirements for the pool. Current exported interfaces include: page_pool_init: pool init page_pool_destroy: pool destruction page_pool_get: get a page from pool page_pool_put: put page back to pool is_in_pool: tell whether a page belongs to the pool Current implementation has following defects: - Global locking - No starve prevention mechanism / reservation logic Global locking tends to cause contention on the pool. No reservation logic may cause vif to starve. A possible solution to these two problems will be each vif maintains its local cache and claims a portion of the pool. However the implementation will be tricky when coming to pool management, so let''s worry about that later. Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/Makefile | 2 +- drivers/net/xen-netback/common.h | 6 + drivers/net/xen-netback/netback.c | 158 ++++++++++++------------------ drivers/net/xen-netback/page_pool.c | 185 +++++++++++++++++++++++++++++++++++ drivers/net/xen-netback/page_pool.h | 63 ++++++++++++ 5 files changed, 317 insertions(+), 97 deletions(-) create mode 100644 drivers/net/xen-netback/page_pool.c create mode 100644 drivers/net/xen-netback/page_pool.h diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile index e346e81..dc4b8b1 100644 --- a/drivers/net/xen-netback/Makefile +++ b/drivers/net/xen-netback/Makefile @@ -1,3 +1,3 @@ obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o -xen-netback-y := netback.o xenbus.o interface.o +xen-netback-y := netback.o xenbus.o interface.o page_pool.o diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 94b79c3..288b2f3 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -45,6 +45,12 @@ #include <xen/grant_table.h> #include <xen/xenbus.h> +struct pending_tx_info { + struct xen_netif_tx_request req; + struct xenvif *vif; +}; +typedef unsigned int pending_ring_idx_t; + struct xen_netbk; struct xenvif { diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 59effac..d11205f 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -33,6 +33,7 @@ */ #include "common.h" +#include "page_pool.h" #include <linux/kthread.h> #include <linux/if_vlan.h> @@ -46,12 +47,6 @@ #include <asm/xen/hypercall.h> #include <asm/xen/page.h> -struct pending_tx_info { - struct xen_netif_tx_request req; - struct xenvif *vif; -}; -typedef unsigned int pending_ring_idx_t; - struct netbk_rx_meta { int id; int size; @@ -65,21 +60,6 @@ struct netbk_rx_meta { #define MAX_BUFFER_OFFSET PAGE_SIZE -/* extra field used in struct page */ -union page_ext { - struct { -#if BITS_PER_LONG < 64 -#define IDX_WIDTH 8 -#define GROUP_WIDTH (BITS_PER_LONG - IDX_WIDTH) - unsigned int group:GROUP_WIDTH; - unsigned int idx:IDX_WIDTH; -#else - unsigned int group, idx; -#endif - } e; - void *mapping; -}; - struct xen_netbk { wait_queue_head_t wq; struct task_struct *task; @@ -89,7 +69,7 @@ struct xen_netbk { struct timer_list net_timer; - struct page *mmap_pages[MAX_PENDING_REQS]; + 
idx_t mmap_pages[MAX_PENDING_REQS]; pending_ring_idx_t pending_prod; pending_ring_idx_t pending_cons; @@ -100,7 +80,6 @@ struct xen_netbk { atomic_t netfront_count; - struct pending_tx_info pending_tx_info[MAX_PENDING_REQS]; struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; u16 pending_ring[MAX_PENDING_REQS]; @@ -160,7 +139,7 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, static inline unsigned long idx_to_pfn(struct xen_netbk *netbk, u16 idx) { - return page_to_pfn(netbk->mmap_pages[idx]); + return page_to_pfn(to_page(netbk->mmap_pages[idx])); } static inline unsigned long idx_to_kaddr(struct xen_netbk *netbk, @@ -169,45 +148,6 @@ static inline unsigned long idx_to_kaddr(struct xen_netbk *netbk, return (unsigned long)pfn_to_kaddr(idx_to_pfn(netbk, idx)); } -/* extra field used in struct page */ -static inline void set_page_ext(struct page *pg, struct xen_netbk *netbk, - unsigned int idx) -{ - unsigned int group = netbk - xen_netbk; - union page_ext ext = { .e = { .group = group + 1, .idx = idx } }; - - BUILD_BUG_ON(sizeof(ext) > sizeof(ext.mapping)); - pg->mapping = ext.mapping; -} - -static int get_page_ext(struct page *pg, - unsigned int *pgroup, unsigned int *pidx) -{ - union page_ext ext = { .mapping = pg->mapping }; - struct xen_netbk *netbk; - unsigned int group, idx; - - group = ext.e.group - 1; - - if (group < 0 || group >= xen_netbk_group_nr) - return 0; - - netbk = &xen_netbk[group]; - - idx = ext.e.idx; - - if ((idx < 0) || (idx >= MAX_PENDING_REQS)) - return 0; - - if (netbk->mmap_pages[idx] != pg) - return 0; - - *pgroup = group; - *pidx = idx; - - return 1; -} - /* * This is the amount of packet we copy rather than map, so that the * guest can''t fiddle with the contents of the headers while we do @@ -398,8 +338,8 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, * These variables are used iff get_page_ext returns true, * in which case they are guaranteed to be initialized. */ - unsigned int uninitialized_var(group), uninitialized_var(idx); - int foreign = get_page_ext(page, &group, &idx); + unsigned int uninitialized_var(idx); + int foreign = is_in_pool(page, &idx); unsigned long bytes; /* Data must not cross a page boundary. 
*/ @@ -427,10 +367,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, copy_gop = npo->copy + npo->copy_prod++; copy_gop->flags = GNTCOPY_dest_gref; if (foreign) { - struct xen_netbk *netbk = &xen_netbk[group]; - struct pending_tx_info *src_pend; - - src_pend = &netbk->pending_tx_info[idx]; + struct pending_tx_info *src_pend = to_txinfo(idx); copy_gop->source.domid = src_pend->vif->domid; copy_gop->source.u.ref = src_pend->req.gref; @@ -906,11 +843,11 @@ static struct page *xen_netbk_alloc_page(struct xen_netbk *netbk, u16 pending_idx) { struct page *page; - page = alloc_page(GFP_KERNEL|__GFP_COLD); + int idx; + page = page_pool_get(netbk, &idx); if (!page) return NULL; - set_page_ext(page, netbk, pending_idx); - netbk->mmap_pages[pending_idx] = page; + netbk->mmap_pages[pending_idx] = idx; return page; } @@ -931,8 +868,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, for (i = start; i < shinfo->nr_frags; i++, txp++) { struct page *page; pending_ring_idx_t index; - struct pending_tx_info *pending_tx_info - netbk->pending_tx_info; + int idx; + struct pending_tx_info *pending_tx_info; index = pending_index(netbk->pending_cons++); pending_idx = netbk->pending_ring[index]; @@ -940,6 +877,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, if (!page) return NULL; + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); + gop->source.u.ref = txp->gref; gop->source.domid = vif->domid; gop->source.offset = txp->offset; @@ -953,9 +893,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, gop++; - memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp)); + memcpy(&pending_tx_info->req, txp, sizeof(*txp)); xenvif_get(vif); - pending_tx_info[pending_idx].vif = vif; + pending_tx_info->vif = vif; frag_set_pending_idx(&frags[i], pending_idx); } @@ -968,8 +908,9 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, { struct gnttab_copy *gop = *gopp; u16 pending_idx = *((u16 *)skb->data); - struct pending_tx_info *pending_tx_info = netbk->pending_tx_info; - struct xenvif *vif = pending_tx_info[pending_idx].vif; + struct pending_tx_info *pending_tx_info; + int idx; + struct xenvif *vif = NULL; struct xen_netif_tx_request *txp; struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -980,7 +921,10 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, if (unlikely(err)) { pending_ring_idx_t index; index = pending_index(netbk->pending_prod++); - txp = &pending_tx_info[pending_idx].req; + idx = netbk->mmap_pages[index]; + pending_tx_info = to_txinfo(idx); + txp = &pending_tx_info->req; + vif = pending_tx_info->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); netbk->pending_ring[index] = pending_idx; xenvif_put(vif); @@ -1005,7 +949,9 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, } /* Error on this fragment: respond to client with an error. 
*/ - txp = &netbk->pending_tx_info[pending_idx].req; + idx = netbk->mmap_pages[pending_idx]; + txp = &to_txinfo(idx)->req; + vif = to_txinfo(idx)->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); index = pending_index(netbk->pending_prod++); netbk->pending_ring[index] = pending_idx; @@ -1042,10 +988,15 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) struct xen_netif_tx_request *txp; struct page *page; u16 pending_idx; + int idx; + struct pending_tx_info *pending_tx_info; pending_idx = frag_get_pending_idx(frag); - txp = &netbk->pending_tx_info[pending_idx].req; + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); + + txp = &pending_tx_info->req; page = virt_to_page(idx_to_kaddr(netbk, pending_idx)); __skb_fill_page_desc(skb, i, page, txp->offset, txp->size); skb->len += txp->size; @@ -1053,7 +1004,7 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) skb->truesize += txp->size; /* Take an extra reference to offset xen_netbk_idx_release */ - get_page(netbk->mmap_pages[pending_idx]); + get_page(page); xen_netbk_idx_release(netbk, pending_idx); } } @@ -1233,6 +1184,8 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) int work_to_do; unsigned int data_len; pending_ring_idx_t index; + int pool_idx; + struct pending_tx_info *pending_tx_info; /* Get a netif from the list with work to do. */ vif = poll_net_schedule_list(netbk); @@ -1347,9 +1300,12 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) gop++; - memcpy(&netbk->pending_tx_info[pending_idx].req, + pool_idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(pool_idx); + + memcpy(&pending_tx_info->req, &txreq, sizeof(txreq)); - netbk->pending_tx_info[pending_idx].vif = vif; + pending_tx_info->vif = vif; *((u16 *)skb->data) = pending_idx; __skb_put(skb, data_len); @@ -1397,10 +1353,16 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk) struct xenvif *vif; u16 pending_idx; unsigned data_len; + int idx; + struct pending_tx_info *pending_tx_info; pending_idx = *((u16 *)skb->data); - vif = netbk->pending_tx_info[pending_idx].vif; - txp = &netbk->pending_tx_info[pending_idx].req; + + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); + + vif = pending_tx_info->vif; + txp = &pending_tx_info->req; /* Check the remap error code. */ if (unlikely(xen_netbk_tx_check_gop(netbk, skb, &gop))) { @@ -1480,12 +1442,14 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) struct xenvif *vif; struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; + int idx; /* Already complete? 
*/ - if (netbk->mmap_pages[pending_idx] == NULL) + if (netbk->mmap_pages[pending_idx] == INVALID_ENTRY) return; - pending_tx_info = &netbk->pending_tx_info[pending_idx]; + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); vif = pending_tx_info->vif; @@ -1496,9 +1460,9 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) xenvif_put(vif); - netbk->mmap_pages[pending_idx]->mapping = 0; - put_page(netbk->mmap_pages[pending_idx]); - netbk->mmap_pages[pending_idx] = NULL; + page_pool_put(netbk->mmap_pages[pending_idx]); + + netbk->mmap_pages[pending_idx] = INVALID_ENTRY; } static void make_tx_response(struct xenvif *vif, @@ -1681,19 +1645,21 @@ static int __init netback_init(void) wake_up_process(netbk->task); } - rc = xenvif_xenbus_init(); + rc = page_pool_init(); if (rc) goto failed_init; + rc = xenvif_xenbus_init(); + if (rc) + goto pool_failed_init; + return 0; +pool_failed_init: + page_pool_destroy(); failed_init: while (--group >= 0) { struct xen_netbk *netbk = &xen_netbk[group]; - for (i = 0; i < MAX_PENDING_REQS; i++) { - if (netbk->mmap_pages[i]) - __free_page(netbk->mmap_pages[i]); - } del_timer(&netbk->net_timer); kthread_stop(netbk->task); } diff --git a/drivers/net/xen-netback/page_pool.c b/drivers/net/xen-netback/page_pool.c new file mode 100644 index 0000000..294f48b --- /dev/null +++ b/drivers/net/xen-netback/page_pool.c @@ -0,0 +1,185 @@ +/* + * Global page pool for netback. + * + * Wei Liu <wei.liu2@citrix.com> + * Copyright (c) Citrix Systems + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#include "common.h" +#include "page_pool.h" +#include <asm/xen/page.h> + +static idx_t free_head; +static int free_count; +static unsigned long pool_size; +static DEFINE_SPINLOCK(pool_lock); +static struct page_pool_entry *pool; + +static int get_free_entry(void) +{ + int idx; + + spin_lock(&pool_lock); + + if (free_count == 0) { + spin_unlock(&pool_lock); + return -ENOSPC; + } + + idx = free_head; + free_count--; + free_head = pool[idx].u.fl; + pool[idx].u.fl = INVALID_ENTRY; + + spin_unlock(&pool_lock); + + return idx; +} + +static void put_free_entry(idx_t idx) +{ + spin_lock(&pool_lock); + + pool[idx].u.fl = free_head; + free_head = idx; + free_count++; + + spin_unlock(&pool_lock); +} + +static inline void set_page_ext(struct page *pg, unsigned int idx) +{ + union page_ext ext = { .idx = idx }; + + BUILD_BUG_ON(sizeof(ext) > sizeof(ext.mapping)); + pg->mapping = ext.mapping; +} + +static int get_page_ext(struct page *pg, unsigned int *pidx) +{ + union page_ext ext = { .mapping = pg->mapping }; + int idx; + + idx = ext.idx; + + if ((idx < 0) || (idx >= pool_size)) + return 0; + + if (pool[idx].page != pg) + return 0; + + *pidx = idx; + + return 1; +} + +int is_in_pool(struct page *page, int *pidx) +{ + return get_page_ext(page, pidx); +} + +struct page *page_pool_get(struct xen_netbk *netbk, int *pidx) +{ + int idx; + struct page *page; + + idx = get_free_entry(); + if (idx < 0) + return NULL; + page = alloc_page(GFP_ATOMIC); + + if (page == NULL) { + put_free_entry(idx); + return NULL; + } + + set_page_ext(page, idx); + pool[idx].u.netbk = netbk; + pool[idx].page = page; + + *pidx = idx; + + return page; +} + +void page_pool_put(int idx) +{ + struct page *page = pool[idx].page; + + pool[idx].page = NULL; + pool[idx].u.netbk = NULL; + page->mapping = 0; + put_page(page); + put_free_entry(idx); +} + +int page_pool_init() +{ + int cpus = 0; + int i; + + cpus = num_online_cpus(); + pool_size = cpus * ENTRIES_PER_CPU; + + pool = vzalloc(sizeof(struct page_pool_entry) * pool_size); + + if (!pool) + return -ENOMEM; + + for (i = 0; i < pool_size - 1; i++) + pool[i].u.fl = i+1; + pool[pool_size-1].u.fl = INVALID_ENTRY; + free_count = pool_size; + free_head = 0; + + return 0; +} + +void page_pool_destroy() +{ + int i; + for (i = 0; i < pool_size; i++) + if (pool[i].page) + put_page(pool[i].page); + + vfree(pool); +} + +struct page *to_page(int idx) +{ + return pool[idx].page; +} + +struct xen_netbk *to_netbk(int idx) +{ + return pool[idx].u.netbk; +} + +struct pending_tx_info *to_txinfo(int idx) +{ + return &pool[idx].tx_info; +} diff --git a/drivers/net/xen-netback/page_pool.h b/drivers/net/xen-netback/page_pool.h new file mode 100644 index 0000000..572b037 --- /dev/null +++ b/drivers/net/xen-netback/page_pool.h @@ -0,0 +1,63 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * 
The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef __PAGE_POOL_H__ +#define __PAGE_POOL_H__ + +#include "common.h" + +typedef uint32_t idx_t; + +#define ENTRIES_PER_CPU (1024) +#define INVALID_ENTRY 0xffffffff + +struct page_pool_entry { + struct page *page; + struct pending_tx_info tx_info; + union { + struct xen_netbk *netbk; + idx_t fl; + } u; +}; + +union page_ext { + idx_t idx; + void *mapping; +}; + +int page_pool_init(void); +void page_pool_destroy(void); + + +struct page *page_pool_get(struct xen_netbk *netbk, int *pidx); +void page_pool_put(int idx); +int is_in_pool(struct page *page, int *pidx); + +struct page *to_page(int idx); +struct xen_netbk *to_netbk(int idx); +struct pending_tx_info *to_txinfo(int idx); + +#endif /* __PAGE_POOL_H__ */ -- 1.7.2.5
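To make the calling convention of the new pool concrete, here is a minimal usage sketch. It is illustration only, not part of the patch; it assumes the common.h and page_pool.h interfaces above, and the helper name example_grab_slot and the txreq parameter are made up:

static int example_grab_slot(struct xen_netbk *netbk,
			     struct xen_netif_tx_request *txreq)
{
	int idx;
	struct page *page;

	/* check a page out of the global pool; idx names the pool entry */
	page = page_pool_get(netbk, &idx);
	if (!page)
		return -ENOMEM;

	/* the per-request tx info is embedded in the pool entry */
	memcpy(&to_txinfo(idx)->req, txreq, sizeof(*txreq));

	/* ... use the page, e.g. as a grant-copy destination ... */

	/* check the page back in; the entry returns to the free list */
	page_pool_put(idx);
	return 0;
}

On the receive side, is_in_pool(page, &idx) recognises a page that came from the pool and to_txinfo(idx) recovers the originating request, which is how netbk_gop_frag_copy uses it in the diff above. page_pool_init() and page_pool_destroy() are called once from the module init/exit paths.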
Enables users to unload netback module. Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 1 + drivers/net/xen-netback/netback.c | 14 ++++++++++++++ drivers/net/xen-netback/xenbus.c | 5 +++++ 3 files changed, 20 insertions(+), 0 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 288b2f3..372c7f5 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -126,6 +126,7 @@ void xenvif_get(struct xenvif *vif); void xenvif_put(struct xenvif *vif); int xenvif_xenbus_init(void); +void xenvif_xenbus_exit(void); int xenvif_schedulable(struct xenvif *vif); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index d11205f..3059684 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1670,5 +1670,19 @@ failed_init: module_init(netback_init); +static void __exit netback_exit(void) +{ + int i; + xenvif_xenbus_exit(); + for (i = 0; i < xen_netbk_group_nr; i++) { + struct xen_netbk *netbk = &xen_netbk[i]; + del_timer_sync(&netbk->net_timer); + kthread_stop(netbk->task); + } + vfree(xen_netbk); + page_pool_destroy(); +} +module_exit(netback_exit); + MODULE_LICENSE("Dual BSD/GPL"); MODULE_ALIAS("xen-backend:vif"); diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 410018c..65d14f2 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -485,3 +485,8 @@ int xenvif_xenbus_init(void) { return xenbus_register_backend(&netback_driver); } + +void xenvif_xenbus_exit(void) +{ + return xenbus_unregister_driver(&netback_driver); +} -- 1.7.2.5
Wei Liu
2012-Feb-02 16:49 UTC
[RFC PATCH V4 03/13] netback: add module get/put operations along with vif connect/disconnect.
If there is a vif running and the user unloads netback, it will certainly cause problems -- the guest's network interface just mysteriously stops working. The disconnect function may get called by the generic framework even before the vif connects. Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/interface.c | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 1825629..4795c0f 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -312,6 +312,8 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, if (vif->irq) return 0; + __module_get(THIS_MODULE); + err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref); if (err < 0) goto err; @@ -339,12 +341,15 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, err_unmap: xen_netbk_unmap_frontend_rings(vif); err: + module_put(THIS_MODULE); return err; } void xenvif_disconnect(struct xenvif *vif) { struct net_device *dev = vif->dev; + int need_module_put = 0; + if (netif_carrier_ok(dev)) { rtnl_lock(); netif_carrier_off(dev); /* discard queued packets */ @@ -359,12 +364,17 @@ void xenvif_disconnect(struct xenvif *vif) del_timer_sync(&vif->credit_timeout); - if (vif->irq) + if (vif->irq) { unbind_from_irqhandler(vif->irq, vif); + need_module_put = 1; + } unregister_netdev(vif->dev); xen_netbk_unmap_frontend_rings(vif); free_netdev(vif->dev); + + if (need_module_put) + module_put(THIS_MODULE); } -- 1.7.2.5
Wei Liu
2012-Feb-02 16:49 UTC
[RFC PATCH V4 04/13] netback: switch to NAPI + kthread model
This patch implements 1:1 model netback. We utilizes NAPI and kthread to do the weight-lifting job: - NAPI is used for guest side TX (host side RX) - kthread is used for guest side RX (host side TX) This model provides better scheduling fairness among vifs. It also lays the foundation for future work. Changes in V4: Remove unwanted notification generation during NAPI processing. In original implementation, tx_build_gops tends to update req_event pointer every time it sees tx error or finish one batch. Remove those code to only update req_event pointer when we really want to shut down NAPI. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 36 ++-- drivers/net/xen-netback/interface.c | 95 ++++++--- drivers/net/xen-netback/netback.c | 381 +++++++++++------------------------ drivers/net/xen-netback/xenbus.c | 1 - 4 files changed, 195 insertions(+), 318 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 372c7f5..1e4d462 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -47,7 +47,6 @@ struct pending_tx_info { struct xen_netif_tx_request req; - struct xenvif *vif; }; typedef unsigned int pending_ring_idx_t; @@ -61,14 +60,17 @@ struct xenvif { /* Reference to netback processing backend. */ struct xen_netbk *netbk; + /* Use NAPI for guest TX */ + struct napi_struct napi; + /* Use kthread for guest RX */ + struct task_struct *task; + wait_queue_head_t wq; + u8 fe_dev_addr[6]; /* Physical parameters of the comms window. */ unsigned int irq; - /* List of frontends to notify after a batch of frames sent. */ - struct list_head notify_list; - /* The shared rings and indexes. */ struct xen_netif_tx_back_ring tx; struct xen_netif_rx_back_ring rx; @@ -99,11 +101,7 @@ struct xenvif { unsigned long rx_gso_checksum_fixup; /* Miscellaneous private stuff. */ - struct list_head schedule_list; - atomic_t refcnt; struct net_device *dev; - - wait_queue_head_t waiting_to_free; }; static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) @@ -122,9 +120,6 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, unsigned long rx_ring_ref, unsigned int evtchn); void xenvif_disconnect(struct xenvif *vif); -void xenvif_get(struct xenvif *vif); -void xenvif_put(struct xenvif *vif); - int xenvif_xenbus_init(void); void xenvif_xenbus_exit(void); @@ -140,18 +135,8 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, grant_ref_t tx_ring_ref, grant_ref_t rx_ring_ref); -/* (De)Register a xenvif with the netback backend. 
*/ -void xen_netbk_add_xenvif(struct xenvif *vif); -void xen_netbk_remove_xenvif(struct xenvif *vif); - -/* (De)Schedule backend processing for a xenvif */ -void xen_netbk_schedule_xenvif(struct xenvif *vif); -void xen_netbk_deschedule_xenvif(struct xenvif *vif); - /* Check for SKBs from frontend and schedule backend processing */ void xen_netbk_check_rx_xenvif(struct xenvif *vif); -/* Receive an SKB from the frontend */ -void xenvif_receive_skb(struct xenvif *vif, struct sk_buff *skb); /* Queue an SKB for transmission to the frontend */ void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb); @@ -161,4 +146,13 @@ void xenvif_notify_tx_completion(struct xenvif *vif); /* Returns number of ring slots required to send an skb to the frontend */ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); +/* Allocate and free xen_netbk structure */ +struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif); +void xen_netbk_free_netbk(struct xen_netbk *netbk); + +int xen_netbk_tx_action(struct xen_netbk *netbk, int budget); +void xen_netbk_rx_action(struct xen_netbk *netbk); + +int xen_netbk_kthread(void *data); + #endif /* __XEN_NETBACK__COMMON_H__ */ diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 4795c0f..1d9688a 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -30,6 +30,7 @@ #include "common.h" +#include <linux/kthread.h> #include <linux/ethtool.h> #include <linux/rtnetlink.h> #include <linux/if_vlan.h> @@ -38,17 +39,7 @@ #include <asm/xen/hypercall.h> #define XENVIF_QUEUE_LENGTH 32 - -void xenvif_get(struct xenvif *vif) -{ - atomic_inc(&vif->refcnt); -} - -void xenvif_put(struct xenvif *vif) -{ - if (atomic_dec_and_test(&vif->refcnt)) - wake_up(&vif->waiting_to_free); -} +#define XENVIF_NAPI_WEIGHT 64 int xenvif_schedulable(struct xenvif *vif) { @@ -67,14 +58,38 @@ static irqreturn_t xenvif_interrupt(int irq, void *dev_id) if (vif->netbk == NULL) return IRQ_NONE; - xen_netbk_schedule_xenvif(vif); - if (xenvif_rx_schedulable(vif)) netif_wake_queue(vif->dev); + if (RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)) + napi_schedule(&vif->napi); + return IRQ_HANDLED; } +static int xenvif_poll(struct napi_struct *napi, int budget) +{ + struct xenvif *vif = container_of(napi, struct xenvif, napi); + int work_done; + + work_done = xen_netbk_tx_action(vif->netbk, budget); + + if (work_done < budget) { + int more_to_do = 0; + unsigned long flag; + + local_irq_save(flag); + + RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, more_to_do); + if (!more_to_do || work_done < 0) + __napi_complete(napi); + + local_irq_restore(flag); + } + + return work_done; +} + static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct xenvif *vif = netdev_priv(dev); @@ -90,7 +105,6 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) /* Reserve ring slots for the worst-case number of fragments. 
*/ vif->rx_req_cons_peek += xen_netbk_count_skb_slots(vif, skb); - xenvif_get(vif); if (vif->can_queue && xen_netbk_must_stop_queue(vif)) netif_stop_queue(dev); @@ -105,11 +119,6 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; } -void xenvif_receive_skb(struct xenvif *vif, struct sk_buff *skb) -{ - netif_rx_ni(skb); -} - void xenvif_notify_tx_completion(struct xenvif *vif) { if (netif_queue_stopped(vif->dev) && xenvif_rx_schedulable(vif)) @@ -124,16 +133,15 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev) static void xenvif_up(struct xenvif *vif) { - xen_netbk_add_xenvif(vif); + napi_enable(&vif->napi); enable_irq(vif->irq); xen_netbk_check_rx_xenvif(vif); } static void xenvif_down(struct xenvif *vif) { + napi_disable(&vif->napi); disable_irq(vif->irq); - xen_netbk_deschedule_xenvif(vif); - xen_netbk_remove_xenvif(vif); } static int xenvif_open(struct net_device *dev) @@ -259,14 +267,11 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, vif = netdev_priv(dev); vif->domid = domid; vif->handle = handle; - vif->netbk = NULL; + vif->netbk = NULL; + vif->can_sg = 1; vif->csum = 1; - atomic_set(&vif->refcnt, 1); - init_waitqueue_head(&vif->waiting_to_free); vif->dev = dev; - INIT_LIST_HEAD(&vif->schedule_list); - INIT_LIST_HEAD(&vif->notify_list); vif->credit_bytes = vif->remaining_credit = ~0UL; vif->credit_usec = 0UL; @@ -290,6 +295,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, memset(dev->dev_addr, 0xFF, ETH_ALEN); dev->dev_addr[0] &= ~0x01; + netif_napi_add(dev, &vif->napi, xenvif_poll, XENVIF_NAPI_WEIGHT); + netif_carrier_off(dev); err = register_netdev(dev); @@ -326,7 +333,23 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, vif->irq = err; disable_irq(vif->irq); - xenvif_get(vif); + vif->netbk = xen_netbk_alloc_netbk(vif); + if (!vif->netbk) { + pr_warn("Could not allocate xen_netbk\n"); + err = -ENOMEM; + goto err_unbind; + } + + + init_waitqueue_head(&vif->wq); + vif->task = kthread_create(xen_netbk_kthread, + (void *)vif, + "vif%d.%d", vif->domid, vif->handle); + if (IS_ERR(vif->task)) { + pr_warn("Could not create kthread\n"); + err = PTR_ERR(vif->task); + goto err_free_netbk; + } rtnl_lock(); if (!vif->can_sg && vif->dev->mtu > ETH_DATA_LEN) @@ -337,7 +360,13 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, xenvif_up(vif); rtnl_unlock(); + wake_up_process(vif->task); + return 0; +err_free_netbk: + xen_netbk_free_netbk(vif->netbk); +err_unbind: + unbind_from_irqhandler(vif->irq, vif); err_unmap: xen_netbk_unmap_frontend_rings(vif); err: @@ -356,11 +385,15 @@ void xenvif_disconnect(struct xenvif *vif) if (netif_running(dev)) xenvif_down(vif); rtnl_unlock(); - xenvif_put(vif); } - atomic_dec(&vif->refcnt); - wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0); + if (vif->task) + kthread_stop(vif->task); + + if (vif->netbk) + xen_netbk_free_netbk(vif->netbk); + + netif_napi_del(&vif->napi); del_timer_sync(&vif->credit_timeout); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 3059684..8e4c9a9 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -61,24 +61,15 @@ struct netbk_rx_meta { #define MAX_BUFFER_OFFSET PAGE_SIZE struct xen_netbk { - wait_queue_head_t wq; - struct task_struct *task; - struct sk_buff_head rx_queue; struct sk_buff_head tx_queue; - struct timer_list net_timer; - idx_t mmap_pages[MAX_PENDING_REQS]; pending_ring_idx_t 
pending_prod; pending_ring_idx_t pending_cons; - struct list_head net_schedule_list; - - /* Protect the net_schedule_list in netif. */ - spinlock_t net_schedule_list_lock; - atomic_t netfront_count; + struct xenvif *vif; struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; @@ -93,42 +84,14 @@ struct xen_netbk { struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE]; }; -static struct xen_netbk *xen_netbk; -static int xen_netbk_group_nr; - -void xen_netbk_add_xenvif(struct xenvif *vif) -{ - int i; - int min_netfront_count; - int min_group = 0; - struct xen_netbk *netbk; - - min_netfront_count = atomic_read(&xen_netbk[0].netfront_count); - for (i = 0; i < xen_netbk_group_nr; i++) { - int netfront_count = atomic_read(&xen_netbk[i].netfront_count); - if (netfront_count < min_netfront_count) { - min_group = i; - min_netfront_count = netfront_count; - } - } - - netbk = &xen_netbk[min_group]; - - vif->netbk = netbk; - atomic_inc(&netbk->netfront_count); -} - -void xen_netbk_remove_xenvif(struct xenvif *vif) -{ - struct xen_netbk *netbk = vif->netbk; - vif->netbk = NULL; - atomic_dec(&netbk->netfront_count); -} - static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx); static void make_tx_response(struct xenvif *vif, struct xen_netif_tx_request *txp, s8 st); + +static inline int tx_work_todo(struct xen_netbk *netbk); +static inline int rx_work_todo(struct xen_netbk *netbk); + static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, u16 id, s8 st, @@ -179,11 +142,6 @@ static inline pending_ring_idx_t nr_pending_reqs(struct xen_netbk *netbk) netbk->pending_prod + netbk->pending_cons; } -static void xen_netbk_kick_thread(struct xen_netbk *netbk) -{ - wake_up(&netbk->wq); -} - static int max_required_rx_slots(struct xenvif *vif) { int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE); @@ -368,8 +326,9 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, copy_gop->flags = GNTCOPY_dest_gref; if (foreign) { struct pending_tx_info *src_pend = to_txinfo(idx); + struct xen_netbk *rnetbk = to_netbk(idx); - copy_gop->source.domid = src_pend->vif->domid; + copy_gop->source.domid = rnetbk->vif->domid; copy_gop->source.u.ref = src_pend->req.gref; copy_gop->flags |= GNTCOPY_source_gref; } else { @@ -527,11 +486,18 @@ struct skb_cb_overlay { int meta_slots_used; }; -static void xen_netbk_rx_action(struct xen_netbk *netbk) +static void xen_netbk_kick_thread(struct xen_netbk *netbk) +{ + struct xenvif *vif = netbk->vif; + + wake_up(&vif->wq); +} + +void xen_netbk_rx_action(struct xen_netbk *netbk) { - struct xenvif *vif = NULL, *tmp; + struct xenvif *vif = NULL; s8 status; - u16 irq, flags; + u16 flags; struct xen_netif_rx_response *resp; struct sk_buff_head rxq; struct sk_buff *skb; @@ -541,6 +507,7 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) int count; unsigned long offset; struct skb_cb_overlay *sco; + int need_to_notify = 0; struct netrx_pending_operations npo = { .copy = netbk->grant_copy_op, @@ -641,25 +608,19 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) sco->meta_slots_used); RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret); - irq = vif->irq; - if (ret && list_empty(&vif->notify_list)) - list_add_tail(&vif->notify_list, ¬ify); + if (ret) + need_to_notify = 1; xenvif_notify_tx_completion(vif); - xenvif_put(vif); npo.meta_cons += sco->meta_slots_used; dev_kfree_skb(skb); } - list_for_each_entry_safe(vif, tmp, ¬ify, notify_list) { + if (need_to_notify) notify_remote_via_irq(vif->irq); - list_del_init(&vif->notify_list); - } - /* 
More work to do? */ - if (!skb_queue_empty(&netbk->rx_queue) && - !timer_pending(&netbk->net_timer)) + if (!skb_queue_empty(&netbk->rx_queue)) xen_netbk_kick_thread(netbk); } @@ -672,86 +633,17 @@ void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) xen_netbk_kick_thread(netbk); } -static void xen_netbk_alarm(unsigned long data) -{ - struct xen_netbk *netbk = (struct xen_netbk *)data; - xen_netbk_kick_thread(netbk); -} - -static int __on_net_schedule_list(struct xenvif *vif) -{ - return !list_empty(&vif->schedule_list); -} - -/* Must be called with net_schedule_list_lock held */ -static void remove_from_net_schedule_list(struct xenvif *vif) -{ - if (likely(__on_net_schedule_list(vif))) { - list_del_init(&vif->schedule_list); - xenvif_put(vif); - } -} - -static struct xenvif *poll_net_schedule_list(struct xen_netbk *netbk) -{ - struct xenvif *vif = NULL; - - spin_lock_irq(&netbk->net_schedule_list_lock); - if (list_empty(&netbk->net_schedule_list)) - goto out; - - vif = list_first_entry(&netbk->net_schedule_list, - struct xenvif, schedule_list); - if (!vif) - goto out; - - xenvif_get(vif); - - remove_from_net_schedule_list(vif); -out: - spin_unlock_irq(&netbk->net_schedule_list_lock); - return vif; -} - -void xen_netbk_schedule_xenvif(struct xenvif *vif) -{ - unsigned long flags; - struct xen_netbk *netbk = vif->netbk; - - if (__on_net_schedule_list(vif)) - goto kick; - - spin_lock_irqsave(&netbk->net_schedule_list_lock, flags); - if (!__on_net_schedule_list(vif) && - likely(xenvif_schedulable(vif))) { - list_add_tail(&vif->schedule_list, &netbk->net_schedule_list); - xenvif_get(vif); - } - spin_unlock_irqrestore(&netbk->net_schedule_list_lock, flags); - -kick: - smp_mb(); - if ((nr_pending_reqs(netbk) < (MAX_PENDING_REQS/2)) && - !list_empty(&netbk->net_schedule_list)) - xen_netbk_kick_thread(netbk); -} - -void xen_netbk_deschedule_xenvif(struct xenvif *vif) -{ - struct xen_netbk *netbk = vif->netbk; - spin_lock_irq(&netbk->net_schedule_list_lock); - remove_from_net_schedule_list(vif); - spin_unlock_irq(&netbk->net_schedule_list_lock); -} - void xen_netbk_check_rx_xenvif(struct xenvif *vif) { int more_to_do; RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, more_to_do); + /* In this check function, we are supposed to do be''s rx, + * which means fe''s tx */ + if (more_to_do) - xen_netbk_schedule_xenvif(vif); + napi_schedule(&vif->napi); } static void tx_add_credit(struct xenvif *vif) @@ -793,8 +685,6 @@ static void netbk_tx_err(struct xenvif *vif, txp = RING_GET_REQUEST(&vif->tx, cons++); } while (1); vif->tx.req_cons = cons; - xen_netbk_check_rx_xenvif(vif); - xenvif_put(vif); } static int netbk_count_requests(struct xenvif *vif, @@ -894,8 +784,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, gop++; memcpy(&pending_tx_info->req, txp, sizeof(*txp)); - xenvif_get(vif); - pending_tx_info->vif = vif; + frag_set_pending_idx(&frags[i], pending_idx); } @@ -910,7 +799,8 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, u16 pending_idx = *((u16 *)skb->data); struct pending_tx_info *pending_tx_info; int idx; - struct xenvif *vif = NULL; + struct xenvif *vif = netbk->vif; + struct xen_netif_tx_request *txp; struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -924,10 +814,8 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, idx = netbk->mmap_pages[index]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; - vif = pending_tx_info->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); 
netbk->pending_ring[index] = pending_idx; - xenvif_put(vif); } /* Skip first skb fragment if it is on same page as header fragment. */ @@ -951,11 +839,9 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, /* Error on this fragment: respond to client with an error. */ idx = netbk->mmap_pages[pending_idx]; txp = &to_txinfo(idx)->req; - vif = to_txinfo(idx)->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); index = pending_index(netbk->pending_prod++); netbk->pending_ring[index] = pending_idx; - xenvif_put(vif); /* Not the first error? Preceding frags already invalidated. */ if (err) @@ -1171,10 +1057,9 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; struct sk_buff *skb; int ret; + struct xenvif *vif = netbk->vif; - while (((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) && - !list_empty(&netbk->net_schedule_list)) { - struct xenvif *vif; + while ((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) { struct xen_netif_tx_request txreq; struct xen_netif_tx_request txfrags[MAX_SKB_FRAGS]; struct page *page; @@ -1187,26 +1072,19 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) int pool_idx; struct pending_tx_info *pending_tx_info; - /* Get a netif from the list with work to do. */ - vif = poll_net_schedule_list(netbk); - if (!vif) - continue; - - RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, work_to_do); + work_to_do = RING_HAS_UNCONSUMED_REQUESTS(&vif->tx); if (!work_to_do) { - xenvif_put(vif); - continue; + break; } idx = vif->tx.req_cons; rmb(); /* Ensure that we see the request before we copy it. */ memcpy(&txreq, RING_GET_REQUEST(&vif->tx, idx), sizeof(txreq)); - /* Credit-based scheduling. */ + /* Credit-based traffic shaping. */ if (txreq.size > vif->remaining_credit && tx_credit_exceeded(vif, txreq.size)) { - xenvif_put(vif); - continue; + break; } vif->remaining_credit -= txreq.size; @@ -1221,14 +1099,14 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) idx = vif->tx.req_cons; if (unlikely(work_to_do < 0)) { netbk_tx_err(vif, &txreq, idx); - continue; + break; } } ret = netbk_count_requests(vif, &txreq, txfrags, work_to_do); if (unlikely(ret < 0)) { netbk_tx_err(vif, &txreq, idx - ret); - continue; + break; } idx += ret; @@ -1236,7 +1114,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) netdev_dbg(vif->dev, "Bad packet size: %d\n", txreq.size); netbk_tx_err(vif, &txreq, idx); - continue; + break; } /* No crossing a page as the payload mustn''t fragment. 
*/ @@ -1246,7 +1124,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) txreq.offset, txreq.size, (txreq.offset&~PAGE_MASK) + txreq.size); netbk_tx_err(vif, &txreq, idx); - continue; + break; } index = pending_index(netbk->pending_cons); @@ -1275,7 +1153,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) if (netbk_set_skb_gso(vif, skb, gso)) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); - continue; + break; } } @@ -1284,7 +1162,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) if (!page) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); - continue; + break; } gop->source.u.ref = txreq.gref; @@ -1305,7 +1183,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) memcpy(&pending_tx_info->req, &txreq, sizeof(txreq)); - pending_tx_info->vif = vif; + *((u16 *)skb->data) = pending_idx; __skb_put(skb, data_len); @@ -1329,12 +1207,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) if (request_gop == NULL) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); - continue; + break; } gop = request_gop; vif->tx.req_cons = idx; - xen_netbk_check_rx_xenvif(vif); if ((gop-netbk->tx_copy_ops) >= ARRAY_SIZE(netbk->tx_copy_ops)) break; @@ -1343,14 +1220,18 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) return gop - netbk->tx_copy_ops; } -static void xen_netbk_tx_submit(struct xen_netbk *netbk) +static int xen_netbk_tx_submit(struct xen_netbk *netbk, + struct gnttab_copy *tco, + int budget) { struct gnttab_copy *gop = netbk->tx_copy_ops; struct sk_buff *skb; + struct xenvif *vif = netbk->vif; + int work_done = 0; - while ((skb = __skb_dequeue(&netbk->tx_queue)) != NULL) { + while ((work_done < budget) && + (skb = __skb_dequeue(&netbk->tx_queue)) != NULL) { struct xen_netif_tx_request *txp; - struct xenvif *vif; u16 pending_idx; unsigned data_len; int idx; @@ -1361,7 +1242,6 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk) idx = netbk->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); - vif = pending_tx_info->vif; txp = &pending_tx_info->req; /* Check the remap error code. 
*/ @@ -1415,31 +1295,41 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk) vif->dev->stats.rx_bytes += skb->len; vif->dev->stats.rx_packets++; - xenvif_receive_skb(vif, skb); + work_done++; + + netif_receive_skb(skb); } + + return work_done; } /* Called after netfront has transmitted */ -static void xen_netbk_tx_action(struct xen_netbk *netbk) +int xen_netbk_tx_action(struct xen_netbk *netbk, int budget) { unsigned nr_gops; int ret; + int work_done; + + if (unlikely(!tx_work_todo(netbk))) + return 0; nr_gops = xen_netbk_tx_build_gops(netbk); if (nr_gops == 0) - return; + return 0; + ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, netbk->tx_copy_ops, nr_gops); BUG_ON(ret); - xen_netbk_tx_submit(netbk); + work_done = xen_netbk_tx_submit(netbk, netbk->tx_copy_ops, budget); + return work_done; } static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) { - struct xenvif *vif; + struct xenvif *vif = netbk->vif; struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; int idx; @@ -1451,15 +1341,11 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) idx = netbk->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); - vif = pending_tx_info->vif; - make_tx_response(vif, &pending_tx_info->req, XEN_NETIF_RSP_OKAY); index = pending_index(netbk->pending_prod++); netbk->pending_ring[index] = pending_idx; - xenvif_put(vif); - page_pool_put(netbk->mmap_pages[pending_idx]); netbk->mmap_pages[pending_idx] = INVALID_ENTRY; @@ -1516,37 +1402,13 @@ static inline int rx_work_todo(struct xen_netbk *netbk) static inline int tx_work_todo(struct xen_netbk *netbk) { - - if (((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) && - !list_empty(&netbk->net_schedule_list)) + if (likely(RING_HAS_UNCONSUMED_REQUESTS(&netbk->vif->tx)) && + (nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) return 1; return 0; } -static int xen_netbk_kthread(void *data) -{ - struct xen_netbk *netbk = data; - while (!kthread_should_stop()) { - wait_event_interruptible(netbk->wq, - rx_work_todo(netbk) || - tx_work_todo(netbk) || - kthread_should_stop()); - cond_resched(); - - if (kthread_should_stop()) - break; - - if (rx_work_todo(netbk)) - xen_netbk_rx_action(netbk); - - if (tx_work_todo(netbk)) - xen_netbk_tx_action(netbk); - } - - return 0; -} - void xen_netbk_unmap_frontend_rings(struct xenvif *vif) { if (vif->tx.sring) @@ -1592,78 +1454,74 @@ err: return err; } -static int __init netback_init(void) +struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif) { int i; - int rc = 0; - int group; - - if (!xen_domain()) - return -ENODEV; + struct xen_netbk *netbk; - xen_netbk_group_nr = num_online_cpus(); - xen_netbk = vzalloc(sizeof(struct xen_netbk) * xen_netbk_group_nr); - if (!xen_netbk) { + netbk = vzalloc(sizeof(struct xen_netbk)); + if (!netbk) { printk(KERN_ALERT "%s: out of memory\n", __func__); - return -ENOMEM; + return NULL; } - for (group = 0; group < xen_netbk_group_nr; group++) { - struct xen_netbk *netbk = &xen_netbk[group]; - skb_queue_head_init(&netbk->rx_queue); - skb_queue_head_init(&netbk->tx_queue); - - init_timer(&netbk->net_timer); - netbk->net_timer.data = (unsigned long)netbk; - netbk->net_timer.function = xen_netbk_alarm; - - netbk->pending_cons = 0; - netbk->pending_prod = MAX_PENDING_REQS; - for (i = 0; i < MAX_PENDING_REQS; i++) - netbk->pending_ring[i] = i; - - init_waitqueue_head(&netbk->wq); - netbk->task = kthread_create(xen_netbk_kthread, - (void *)netbk, - "netback/%u", group); - - if (IS_ERR(netbk->task)) { - 
printk(KERN_ALERT "kthread_create() fails at netback\n"); - del_timer(&netbk->net_timer); - rc = PTR_ERR(netbk->task); - goto failed_init; - } + netbk->vif = vif; + + skb_queue_head_init(&netbk->rx_queue); + skb_queue_head_init(&netbk->tx_queue); + + netbk->pending_cons = 0; + netbk->pending_prod = MAX_PENDING_REQS; + for (i = 0; i < MAX_PENDING_REQS; i++) + netbk->pending_ring[i] = i; + + for (i = 0; i < MAX_PENDING_REQS; i++) + netbk->mmap_pages[i] = INVALID_ENTRY; + + return netbk; +} - kthread_bind(netbk->task, group); +void xen_netbk_free_netbk(struct xen_netbk *netbk) +{ + vfree(netbk); +} - INIT_LIST_HEAD(&netbk->net_schedule_list); +int xen_netbk_kthread(void *data) +{ + struct xenvif *vif = data; + struct xen_netbk *netbk = vif->netbk; - spin_lock_init(&netbk->net_schedule_list_lock); + while (!kthread_should_stop()) { + wait_event_interruptible(vif->wq, + rx_work_todo(netbk) || + kthread_should_stop()); + cond_resched(); - atomic_set(&netbk->netfront_count, 0); + if (kthread_should_stop()) + break; - wake_up_process(netbk->task); + if (rx_work_todo(netbk)) + xen_netbk_rx_action(netbk); } + return 0; +} + + +static int __init netback_init(void) +{ + int rc = 0; + + if (!xen_domain()) + return -ENODEV; + rc = page_pool_init(); if (rc) goto failed_init; - rc = xenvif_xenbus_init(); - if (rc) - goto pool_failed_init; - - return 0; + return xenvif_xenbus_init(); -pool_failed_init: - page_pool_destroy(); failed_init: - while (--group >= 0) { - struct xen_netbk *netbk = &xen_netbk[group]; - del_timer(&netbk->net_timer); - kthread_stop(netbk->task); - } - vfree(xen_netbk); return rc; } @@ -1672,14 +1530,7 @@ module_init(netback_init); static void __exit netback_exit(void) { - int i; xenvif_xenbus_exit(); - for (i = 0; i < xen_netbk_group_nr; i++) { - struct xen_netbk *netbk = &xen_netbk[i]; - del_timer_sync(&netbk->net_timer); - kthread_stop(netbk->task); - } - vfree(xen_netbk); page_pool_destroy(); } module_exit(netback_exit); diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 65d14f2..f1e89ca 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -387,7 +387,6 @@ static void connect(struct backend_info *be) netif_wake_queue(be->vif->dev); } - static int connect_rings(struct backend_info *be) { struct xenvif *vif = be->vif; -- 1.7.2.5
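For orientation, here is a condensed view of the 1:1 model this patch introduces. The code below is distilled from the diff above and is illustration only; locking, the RING_FINAL_CHECK_FOR_REQUESTS re-check and error handling are omitted. Each vif owns one NAPI instance for guest TX (host-side RX) and one kernel thread for guest RX (host-side TX):

/* guest TX processing runs in NAPI context, budget-limited */
static int xenvif_poll(struct napi_struct *napi, int budget)
{
	struct xenvif *vif = container_of(napi, struct xenvif, napi);
	int work_done = xen_netbk_tx_action(vif->netbk, budget);

	if (work_done < budget)
		napi_complete(napi);	/* full version re-checks the ring first */

	return work_done;
}

/* guest RX processing runs in the per-vif kthread */
int xen_netbk_kthread(void *data)
{
	struct xenvif *vif = data;
	struct xen_netbk *netbk = vif->netbk;

	while (!kthread_should_stop()) {
		wait_event_interruptible(vif->wq,
					 rx_work_todo(netbk) ||
					 kthread_should_stop());
		if (rx_work_todo(netbk))
			xen_netbk_rx_action(netbk);
		cond_resched();
	}

	return 0;
}

The interrupt handler then only needs to wake the device queue and call napi_schedule() when the frontend has produced TX requests; the kthread is kicked separately whenever skbs are queued for the frontend.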
Wei Liu
2012-Feb-02 16:49 UTC
[RFC PATCH V4 05/13] netback: switch to per-cpu scratch space.
In the 1:1 model, given that there are maximum nr_online_cpus netbacks running, we can use per-cpu scratch space, thus shrinking size of struct xen_netbk. Changes in V4: Carefully guard against CPU hotplug race condition. NAPI and kthread will bail when scratch spaces are not available. Scratch space allocation is NUMA awared. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 15 ++ drivers/net/xen-netback/netback.c | 261 ++++++++++++++++++++++++++++++------- 2 files changed, 229 insertions(+), 47 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 1e4d462..65df480 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -45,6 +45,21 @@ #include <xen/grant_table.h> #include <xen/xenbus.h> +#define DRV_NAME "netback: " + +struct netbk_rx_meta { + int id; + int size; + int gso_size; +}; + +#define MAX_PENDING_REQS 256 + +/* Discriminate from any valid pending_idx value. */ +#define INVALID_PENDING_IDX 0xFFFF + +#define MAX_BUFFER_OFFSET PAGE_SIZE + struct pending_tx_info { struct xen_netif_tx_request req; }; diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 8e4c9a9..5584853 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1,3 +1,4 @@ + /* * Back-end of the driver for virtual network devices. This portion of the * driver exports a ''unified'' network-device interface that can be accessed @@ -38,6 +39,7 @@ #include <linux/kthread.h> #include <linux/if_vlan.h> #include <linux/udp.h> +#include <linux/cpu.h> #include <net/tcp.h> @@ -47,18 +49,17 @@ #include <asm/xen/hypercall.h> #include <asm/xen/page.h> -struct netbk_rx_meta { - int id; - int size; - int gso_size; -}; -#define MAX_PENDING_REQS 256 +DEFINE_PER_CPU(struct gnttab_copy *, tx_copy_ops); -/* Discriminate from any valid pending_idx value. */ -#define INVALID_PENDING_IDX 0xFFFF +/* + * Given MAX_BUFFER_OFFSET of 4096 the worst case is that each + * head/fragment page uses 2 copy operations because it + * straddles two buffers in the frontend. + */ +DEFINE_PER_CPU(struct gnttab_copy *, grant_copy_op); +DEFINE_PER_CPU(struct netbk_rx_meta *, meta); -#define MAX_BUFFER_OFFSET PAGE_SIZE struct xen_netbk { struct sk_buff_head rx_queue; @@ -71,17 +72,7 @@ struct xen_netbk { struct xenvif *vif; - struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; - u16 pending_ring[MAX_PENDING_REQS]; - - /* - * Given MAX_BUFFER_OFFSET of 4096 the worst case is that each - * head/fragment page uses 2 copy operations because it - * straddles two buffers in the frontend. 
- */ - struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE]; - struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE]; }; static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx); @@ -508,12 +499,29 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) unsigned long offset; struct skb_cb_overlay *sco; int need_to_notify = 0; + static int unusable_count; + + struct gnttab_copy *gco = get_cpu_var(grant_copy_op); + struct netbk_rx_meta *m = get_cpu_var(meta); struct netrx_pending_operations npo = { - .copy = netbk->grant_copy_op, - .meta = netbk->meta, + .copy = gco, + .meta = m, }; + if (gco == NULL || m == NULL) { + put_cpu_var(grant_copy_op); + put_cpu_var(meta); + if (unusable_count == 1000) { + pr_alert("CPU %x scratch space is not usable," + " not doing any TX work for vif%u.%u\n", + smp_processor_id(), + netbk->vif->domid, netbk->vif->handle); + unusable_count = 0; + } + return; + } + skb_queue_head_init(&rxq); count = 0; @@ -534,13 +542,16 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) break; } - BUG_ON(npo.meta_prod > ARRAY_SIZE(netbk->meta)); + BUG_ON(npo.meta_prod > MAX_PENDING_REQS); - if (!npo.copy_prod) + if (!npo.copy_prod) { + put_cpu_var(grant_copy_op); + put_cpu_var(meta); return; + } - BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op)); - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, &netbk->grant_copy_op, + BUG_ON(npo.copy_prod > (2 * XEN_NETIF_RX_RING_SIZE)); + ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, gco, npo.copy_prod); BUG_ON(ret != 0); @@ -549,14 +560,14 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) vif = netdev_priv(skb->dev); - if (netbk->meta[npo.meta_cons].gso_size && vif->gso_prefix) { + if (m[npo.meta_cons].gso_size && vif->gso_prefix) { resp = RING_GET_RESPONSE(&vif->rx, vif->rx.rsp_prod_pvt++); resp->flags = XEN_NETRXF_gso_prefix | XEN_NETRXF_more_data; - resp->offset = netbk->meta[npo.meta_cons].gso_size; - resp->id = netbk->meta[npo.meta_cons].id; + resp->offset = m[npo.meta_cons].gso_size; + resp->id = m[npo.meta_cons].id; resp->status = sco->meta_slots_used; npo.meta_cons++; @@ -581,12 +592,12 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) flags |= XEN_NETRXF_data_validated; offset = 0; - resp = make_rx_response(vif, netbk->meta[npo.meta_cons].id, + resp = make_rx_response(vif, m[npo.meta_cons].id, status, offset, - netbk->meta[npo.meta_cons].size, + m[npo.meta_cons].size, flags); - if (netbk->meta[npo.meta_cons].gso_size && !vif->gso_prefix) { + if (m[npo.meta_cons].gso_size && !vif->gso_prefix) { struct xen_netif_extra_info *gso (struct xen_netif_extra_info *) RING_GET_RESPONSE(&vif->rx, @@ -594,7 +605,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) resp->flags |= XEN_NETRXF_extra_info; - gso->u.gso.size = netbk->meta[npo.meta_cons].gso_size; + gso->u.gso.size = m[npo.meta_cons].gso_size; gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4; gso->u.gso.pad = 0; gso->u.gso.features = 0; @@ -604,7 +615,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) } netbk_add_frag_responses(vif, status, - netbk->meta + npo.meta_cons + 1, + m + npo.meta_cons + 1, sco->meta_slots_used); RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret); @@ -622,6 +633,9 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) if (!skb_queue_empty(&netbk->rx_queue)) xen_netbk_kick_thread(netbk); + + put_cpu_var(grant_copy_op); + put_cpu_var(meta); } void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) @@ -1052,9 +1066,10 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) return false; } -static unsigned 
xen_netbk_tx_build_gops(struct xen_netbk *netbk) +static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, + struct gnttab_copy *tco) { - struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; + struct gnttab_copy *gop = tco, *request_gop; struct sk_buff *skb; int ret; struct xenvif *vif = netbk->vif; @@ -1213,18 +1228,18 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) vif->tx.req_cons = idx; - if ((gop-netbk->tx_copy_ops) >= ARRAY_SIZE(netbk->tx_copy_ops)) + if ((gop - tco) >= MAX_PENDING_REQS) break; } - return gop - netbk->tx_copy_ops; + return gop - tco; } static int xen_netbk_tx_submit(struct xen_netbk *netbk, struct gnttab_copy *tco, int budget) { - struct gnttab_copy *gop = netbk->tx_copy_ops; + struct gnttab_copy *gop = tco; struct sk_buff *skb; struct xenvif *vif = netbk->vif; int work_done = 0; @@ -1309,20 +1324,42 @@ int xen_netbk_tx_action(struct xen_netbk *netbk, int budget) unsigned nr_gops; int ret; int work_done; + struct gnttab_copy *tco; + static int unusable_count; if (unlikely(!tx_work_todo(netbk))) return 0; - nr_gops = xen_netbk_tx_build_gops(netbk); + tco = get_cpu_var(tx_copy_ops); + + if (tco == NULL) { + put_cpu_var(tx_copy_ops); + unusable_count++; + if (unusable_count == 1000) { + pr_alert("CPU %x scratch space" + " is not usable," + " not doing any RX work for vif%u.%u\n", + smp_processor_id(), + netbk->vif->domid, netbk->vif->handle); + unusable_count = 0; + } + return -ENOMEM; + } + + nr_gops = xen_netbk_tx_build_gops(netbk, tco); - if (nr_gops == 0) + if (nr_gops == 0) { + put_cpu_var(tx_copy_ops); return 0; + } ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, - netbk->tx_copy_ops, nr_gops); + tco, nr_gops); BUG_ON(ret); - work_done = xen_netbk_tx_submit(netbk, netbk->tx_copy_ops, budget); + work_done = xen_netbk_tx_submit(netbk, tco, budget); + + put_cpu_var(tx_copy_ops); return work_done; } @@ -1461,7 +1498,7 @@ struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif) netbk = vzalloc(sizeof(struct xen_netbk)); if (!netbk) { - printk(KERN_ALERT "%s: out of memory\n", __func__); + pr_alert(DRV_NAME "%s: out of memory\n", __func__); return NULL; } @@ -1507,31 +1544,161 @@ int xen_netbk_kthread(void *data) return 0; } +static int __create_percpu_scratch_space(unsigned int cpu) +{ + /* Guard against race condition */ + if (per_cpu(tx_copy_ops, cpu) || + per_cpu(grant_copy_op, cpu) || + per_cpu(meta, cpu)) + return 0; + + per_cpu(tx_copy_ops, cpu) + vzalloc_node(sizeof(struct gnttab_copy) * MAX_PENDING_REQS, + cpu_to_node(cpu)); + + if (!per_cpu(tx_copy_ops, cpu)) + per_cpu(tx_copy_ops, cpu) = vzalloc(sizeof(struct gnttab_copy) + * MAX_PENDING_REQS); + + per_cpu(grant_copy_op, cpu) + vzalloc_node(sizeof(struct gnttab_copy) + * 2 * XEN_NETIF_RX_RING_SIZE, cpu_to_node(cpu)); + + if (!per_cpu(grant_copy_op, cpu)) + per_cpu(grant_copy_op, cpu) + vzalloc(sizeof(struct gnttab_copy) + * 2 * XEN_NETIF_RX_RING_SIZE); + + + per_cpu(meta, cpu) = vzalloc_node(sizeof(struct xenvif_rx_meta) + * 2 * XEN_NETIF_RX_RING_SIZE, + cpu_to_node(cpu)); + if (!per_cpu(meta, cpu)) + per_cpu(meta, cpu) = vzalloc(sizeof(struct xenvif_rx_meta) + * 2 * XEN_NETIF_RX_RING_SIZE); + + if (!per_cpu(tx_copy_ops, cpu) || + !per_cpu(grant_copy_op, cpu) || + !per_cpu(meta, cpu)) + return -ENOMEM; + + return 0; +} + +static void __free_percpu_scratch_space(unsigned int cpu) +{ + /* freeing NULL pointer is legit */ + /* carefully work around race condition */ + void *tmp; + tmp = per_cpu(tx_copy_ops, cpu); + per_cpu(tx_copy_ops, cpu) = NULL; + vfree(tmp); + + tmp = 
per_cpu(grant_copy_op, cpu); + per_cpu(grant_copy_op, cpu) = NULL; + vfree(tmp); + + tmp = per_cpu(meta, cpu); + per_cpu(meta, cpu) = NULL; + vfree(tmp); +} + +static int __netback_percpu_callback(struct notifier_block *nfb, + unsigned long action, void *hcpu) +{ + unsigned int cpu = (unsigned long)hcpu; + int rc = NOTIFY_DONE; + + switch (action) { + case CPU_ONLINE: + case CPU_ONLINE_FROZEN: + pr_info("CPU %x online, creating scratch space\n", cpu); + rc = __create_percpu_scratch_space(cpu); + if (rc) { + pr_alert("failed to create scratch space" + " for CPU %x\n", cpu); + /* FIXME: nothing more we can do here, we will + * print out warning message when thread or + * NAPI runs on this cpu. Also stop getting + * called in the future. + */ + __free_percpu_scratch_space(cpu); + rc = NOTIFY_BAD; + } else { + rc = NOTIFY_OK; + } + break; + case CPU_DEAD: + case CPU_DEAD_FROZEN: + pr_info("CPU %x offline, destroying scratch space\n", + cpu); + __free_percpu_scratch_space(cpu); + rc = NOTIFY_OK; + break; + default: + break; + } + + return rc; +} + +static struct notifier_block netback_notifier_block = { + .notifier_call = __netback_percpu_callback, +}; static int __init netback_init(void) { - int rc = 0; + int rc = -ENOMEM; + int cpu; if (!xen_domain()) return -ENODEV; + /* Don''t need to disable preempt here, since nobody else will + * touch these percpu areas during start up. */ + for_each_online_cpu(cpu) { + rc = __create_percpu_scratch_space(cpu); + + if (rc) + goto failed_init; + } + + register_hotcpu_notifier(&netback_notifier_block); + rc = page_pool_init(); if (rc) - goto failed_init; + goto failed_init_pool; - return xenvif_xenbus_init(); + rc = xenvif_xenbus_init(); + if (rc) + goto failed_init_xenbus; -failed_init: return rc; +failed_init_xenbus: + page_pool_destroy(); +failed_init_pool: + unregister_hotcpu_notifier(&netback_notifier_block); +failed_init: + for_each_online_cpu(cpu) + __free_percpu_scratch_space(cpu); + return rc; } module_init(netback_init); static void __exit netback_exit(void) { + int cpu; + xenvif_xenbus_exit(); page_pool_destroy(); + + unregister_hotcpu_notifier(&netback_notifier_block); + + /* Since we''re here, nobody else will touch per-cpu area. */ + for_each_online_cpu(cpu) + __free_percpu_scratch_space(cpu); } module_exit(netback_exit); -- 1.7.2.5
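A note on the per-cpu scratch space introduced above: every consumer has to follow the same contract — get_cpu_var(), a NULL check (the hotplug callback may have failed to allocate for this CPU), the real work, then put_cpu_var(). A minimal sketch of that contract follows; it is illustration only, not part of the series, and the function name is made up.

static int scratch_space_example(void)
{
	/* get_cpu_var() disables preemption and returns this CPU's pointer. */
	struct gnttab_copy *tco = get_cpu_var(tx_copy_ops);

	if (tco == NULL) {
		/* Scratch space never got allocated for this CPU: back off. */
		put_cpu_var(tx_copy_ops);
		return -ENOMEM;
	}

	/* ... build up to MAX_PENDING_REQS grant copy ops in tco ... */

	put_cpu_var(tx_copy_ops);	/* re-enable preemption */
	return 0;
}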
In the 1:1 model, there is no need to keep xen_netbk and xenvif separated. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 35 +++--- drivers/net/xen-netback/interface.c | 36 +++--- drivers/net/xen-netback/netback.c | 219 +++++++++++++---------------------- drivers/net/xen-netback/page_pool.c | 10 +- drivers/net/xen-netback/page_pool.h | 13 ++- 5 files changed, 124 insertions(+), 189 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 65df480..ea91bb6 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -46,6 +46,7 @@ #include <xen/xenbus.h> #define DRV_NAME "netback: " +#include "page_pool.h" struct netbk_rx_meta { int id; @@ -53,28 +54,21 @@ struct netbk_rx_meta { int gso_size; }; -#define MAX_PENDING_REQS 256 - /* Discriminate from any valid pending_idx value. */ #define INVALID_PENDING_IDX 0xFFFF #define MAX_BUFFER_OFFSET PAGE_SIZE -struct pending_tx_info { - struct xen_netif_tx_request req; -}; -typedef unsigned int pending_ring_idx_t; +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) -struct xen_netbk; +#define MAX_PENDING_REQS 256 struct xenvif { /* Unique identifier for this interface. */ domid_t domid; unsigned int handle; - /* Reference to netback processing backend. */ - struct xen_netbk *netbk; - /* Use NAPI for guest TX */ struct napi_struct napi; /* Use kthread for guest RX */ @@ -117,6 +111,16 @@ struct xenvif { /* Miscellaneous private stuff. */ struct net_device *dev; + + struct sk_buff_head rx_queue; + struct sk_buff_head tx_queue; + + idx_t mmap_pages[MAX_PENDING_REQS]; + + pending_ring_idx_t pending_prod; + pending_ring_idx_t pending_cons; + + u16 pending_ring[MAX_PENDING_REQS]; }; static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) @@ -124,9 +128,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) return to_xenbus_device(vif->dev->dev.parent); } -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) - struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, unsigned int handle); @@ -161,12 +162,8 @@ void xenvif_notify_tx_completion(struct xenvif *vif); /* Returns number of ring slots required to send an skb to the frontend */ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); -/* Allocate and free xen_netbk structure */ -struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif); -void xen_netbk_free_netbk(struct xen_netbk *netbk); - -int xen_netbk_tx_action(struct xen_netbk *netbk, int budget); -void xen_netbk_rx_action(struct xen_netbk *netbk); +int xen_netbk_tx_action(struct xenvif *vif, int budget); +void xen_netbk_rx_action(struct xenvif *vif); int xen_netbk_kthread(void *data); diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 1d9688a..9b7d596 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -55,9 +55,6 @@ static irqreturn_t xenvif_interrupt(int irq, void *dev_id) { struct xenvif *vif = dev_id; - if (vif->netbk == NULL) - return IRQ_NONE; - if (xenvif_rx_schedulable(vif)) netif_wake_queue(vif->dev); @@ -72,7 +69,7 @@ static int xenvif_poll(struct napi_struct *napi, int budget) struct xenvif *vif = container_of(napi, struct xenvif, napi); int work_done; - work_done = 
xen_netbk_tx_action(vif->netbk, budget); + work_done = xen_netbk_tx_action(vif, budget); if (work_done < budget) { int more_to_do = 0; @@ -96,7 +93,8 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) BUG_ON(skb->dev != dev); - if (vif->netbk == NULL) + /* Drop the packet if vif is not ready */ + if (vif->task == NULL) goto drop; /* Drop the packet if the target domain has no receive buffers. */ @@ -253,6 +251,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, int err; struct net_device *dev; struct xenvif *vif; + int i; char name[IFNAMSIZ] = {}; snprintf(name, IFNAMSIZ - 1, "vif%u.%u", domid, handle); @@ -267,7 +266,6 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, vif = netdev_priv(dev); vif->domid = domid; vif->handle = handle; - vif->netbk = NULL; vif->can_sg = 1; vif->csum = 1; @@ -286,6 +284,17 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, dev->tx_queue_len = XENVIF_QUEUE_LENGTH; + skb_queue_head_init(&vif->rx_queue); + skb_queue_head_init(&vif->tx_queue); + + vif->pending_cons = 0; + vif->pending_prod = MAX_PENDING_REQS; + for (i = 0; i < MAX_PENDING_REQS; i++) + vif->pending_ring[i] = i; + + for (i = 0; i < MAX_PENDING_REQS; i++) + vif->mmap_pages[i] = INVALID_ENTRY; + /* * Initialise a dummy MAC address. We choose the numerically * largest non-broadcast address to prevent the address getting @@ -333,14 +342,6 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, vif->irq = err; disable_irq(vif->irq); - vif->netbk = xen_netbk_alloc_netbk(vif); - if (!vif->netbk) { - pr_warn("Could not allocate xen_netbk\n"); - err = -ENOMEM; - goto err_unbind; - } - - init_waitqueue_head(&vif->wq); vif->task = kthread_create(xen_netbk_kthread, (void *)vif, @@ -348,7 +349,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, if (IS_ERR(vif->task)) { pr_warn("Could not create kthread\n"); err = PTR_ERR(vif->task); - goto err_free_netbk; + goto err_unbind; } rtnl_lock(); @@ -363,8 +364,6 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, wake_up_process(vif->task); return 0; -err_free_netbk: - xen_netbk_free_netbk(vif->netbk); err_unbind: unbind_from_irqhandler(vif->irq, vif); err_unmap: @@ -390,9 +389,6 @@ void xenvif_disconnect(struct xenvif *vif) if (vif->task) kthread_stop(vif->task); - if (vif->netbk) - xen_netbk_free_netbk(vif->netbk); - netif_napi_del(&vif->napi); del_timer_sync(&vif->credit_timeout); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 5584853..ef9cfbe 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -60,28 +60,13 @@ DEFINE_PER_CPU(struct gnttab_copy *, tx_copy_ops); DEFINE_PER_CPU(struct gnttab_copy *, grant_copy_op); DEFINE_PER_CPU(struct netbk_rx_meta *, meta); - -struct xen_netbk { - struct sk_buff_head rx_queue; - struct sk_buff_head tx_queue; - - idx_t mmap_pages[MAX_PENDING_REQS]; - - pending_ring_idx_t pending_prod; - pending_ring_idx_t pending_cons; - - struct xenvif *vif; - - u16 pending_ring[MAX_PENDING_REQS]; -}; - -static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx); +static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx); static void make_tx_response(struct xenvif *vif, struct xen_netif_tx_request *txp, s8 st); -static inline int tx_work_todo(struct xen_netbk *netbk); -static inline int rx_work_todo(struct xen_netbk *netbk); +static inline int tx_work_todo(struct xenvif *vif); +static inline int 
rx_work_todo(struct xenvif *vif); static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, u16 id, @@ -90,16 +75,16 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, u16 size, u16 flags); -static inline unsigned long idx_to_pfn(struct xen_netbk *netbk, +static inline unsigned long idx_to_pfn(struct xenvif *vif, u16 idx) { - return page_to_pfn(to_page(netbk->mmap_pages[idx])); + return page_to_pfn(to_page(vif->mmap_pages[idx])); } -static inline unsigned long idx_to_kaddr(struct xen_netbk *netbk, +static inline unsigned long idx_to_kaddr(struct xenvif *vif, u16 idx) { - return (unsigned long)pfn_to_kaddr(idx_to_pfn(netbk, idx)); + return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx)); } /* @@ -127,10 +112,10 @@ static inline pending_ring_idx_t pending_index(unsigned i) return i & (MAX_PENDING_REQS-1); } -static inline pending_ring_idx_t nr_pending_reqs(struct xen_netbk *netbk) +static inline pending_ring_idx_t nr_pending_reqs(struct xenvif *vif) { return MAX_PENDING_REQS - - netbk->pending_prod + netbk->pending_cons; + vif->pending_prod + vif->pending_cons; } static int max_required_rx_slots(struct xenvif *vif) @@ -317,9 +302,9 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, copy_gop->flags = GNTCOPY_dest_gref; if (foreign) { struct pending_tx_info *src_pend = to_txinfo(idx); - struct xen_netbk *rnetbk = to_netbk(idx); + struct xenvif *rvif = to_vif(idx); - copy_gop->source.domid = rnetbk->vif->domid; + copy_gop->source.domid = rvif->domid; copy_gop->source.u.ref = src_pend->req.gref; copy_gop->flags |= GNTCOPY_source_gref; } else { @@ -477,16 +462,13 @@ struct skb_cb_overlay { int meta_slots_used; }; -static void xen_netbk_kick_thread(struct xen_netbk *netbk) +static void xen_netbk_kick_thread(struct xenvif *vif) { - struct xenvif *vif = netbk->vif; - wake_up(&vif->wq); } -void xen_netbk_rx_action(struct xen_netbk *netbk) +void xen_netbk_rx_action(struct xenvif *vif) { - struct xenvif *vif = NULL; s8 status; u16 flags; struct xen_netif_rx_response *resp; @@ -516,7 +498,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) pr_alert("CPU %x scratch space is not usable," " not doing any TX work for vif%u.%u\n", smp_processor_id(), - netbk->vif->domid, netbk->vif->handle); + vif->domid, vif->handle); unusable_count = 0; } return; @@ -526,7 +508,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) count = 0; - while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) { + while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) { vif = netdev_priv(skb->dev); nr_frags = skb_shinfo(skb)->nr_frags; @@ -558,8 +540,6 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) while ((skb = __skb_dequeue(&rxq)) != NULL) { sco = (struct skb_cb_overlay *)skb->cb; - vif = netdev_priv(skb->dev); - if (m[npo.meta_cons].gso_size && vif->gso_prefix) { resp = RING_GET_RESPONSE(&vif->rx, vif->rx.rsp_prod_pvt++); @@ -631,8 +611,8 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) if (need_to_notify) notify_remote_via_irq(vif->irq); - if (!skb_queue_empty(&netbk->rx_queue)) - xen_netbk_kick_thread(netbk); + if (!skb_queue_empty(&vif->rx_queue)) + xen_netbk_kick_thread(vif); put_cpu_var(grant_copy_op); put_cpu_var(meta); @@ -640,11 +620,9 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) { - struct xen_netbk *netbk = vif->netbk; - - skb_queue_tail(&netbk->rx_queue, skb); + skb_queue_tail(&vif->rx_queue, skb); - xen_netbk_kick_thread(netbk); + xen_netbk_kick_thread(vif); } void 
xen_netbk_check_rx_xenvif(struct xenvif *vif) @@ -742,21 +720,20 @@ static int netbk_count_requests(struct xenvif *vif, return frags; } -static struct page *xen_netbk_alloc_page(struct xen_netbk *netbk, +static struct page *xen_netbk_alloc_page(struct xenvif *vif, struct sk_buff *skb, u16 pending_idx) { struct page *page; int idx; - page = page_pool_get(netbk, &idx); + page = page_pool_get(vif, &idx); if (!page) return NULL; - netbk->mmap_pages[pending_idx] = idx; + vif->mmap_pages[pending_idx] = idx; return page; } -static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, - struct xenvif *vif, +static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, struct sk_buff *skb, struct xen_netif_tx_request *txp, struct gnttab_copy *gop) @@ -775,13 +752,13 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, int idx; struct pending_tx_info *pending_tx_info; - index = pending_index(netbk->pending_cons++); - pending_idx = netbk->pending_ring[index]; - page = xen_netbk_alloc_page(netbk, skb, pending_idx); + index = pending_index(vif->pending_cons++); + pending_idx = vif->pending_ring[index]; + page = xen_netbk_alloc_page(vif, skb, pending_idx); if (!page) return NULL; - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); gop->source.u.ref = txp->gref; @@ -805,7 +782,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, return gop; } -static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, +static int xen_netbk_tx_check_gop(struct xenvif *vif, struct sk_buff *skb, struct gnttab_copy **gopp) { @@ -813,8 +790,6 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, u16 pending_idx = *((u16 *)skb->data); struct pending_tx_info *pending_tx_info; int idx; - struct xenvif *vif = netbk->vif; - struct xen_netif_tx_request *txp; struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -824,12 +799,12 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, err = gop->status; if (unlikely(err)) { pending_ring_idx_t index; - index = pending_index(netbk->pending_prod++); - idx = netbk->mmap_pages[index]; + index = pending_index(vif->pending_prod++); + idx = vif->mmap_pages[index]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); - netbk->pending_ring[index] = pending_idx; + vif->pending_ring[index] = pending_idx; } /* Skip first skb fragment if it is on same page as header fragment. */ @@ -846,16 +821,16 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, if (likely(!newerr)) { /* Had a previous error? Invalidate this fragment. */ if (unlikely(err)) - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); continue; } /* Error on this fragment: respond to client with an error. */ - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; txp = &to_txinfo(idx)->req; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); - index = pending_index(netbk->pending_prod++); - netbk->pending_ring[index] = pending_idx; + index = pending_index(vif->pending_prod++); + vif->pending_ring[index] = pending_idx; /* Not the first error? Preceding frags already invalidated. */ if (err) @@ -863,10 +838,10 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, /* First error: invalidate header and preceding fragments. 
*/ pending_idx = *((u16 *)skb->data); - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); for (j = start; j < i; j++) { pending_idx = frag_get_pending_idx(&shinfo->frags[j]); - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); } /* Remember the error: invalidate all subsequent fragments. */ @@ -877,7 +852,7 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, return err; } -static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) +static void xen_netbk_fill_frags(struct xenvif *vif, struct sk_buff *skb) { struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -893,11 +868,11 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) pending_idx = frag_get_pending_idx(frag); - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; - page = virt_to_page(idx_to_kaddr(netbk, pending_idx)); + page = virt_to_page(idx_to_kaddr(vif, pending_idx)); __skb_fill_page_desc(skb, i, page, txp->offset, txp->size); skb->len += txp->size; skb->data_len += txp->size; @@ -905,7 +880,7 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) /* Take an extra reference to offset xen_netbk_idx_release */ get_page(page); - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); } } @@ -1066,15 +1041,14 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) return false; } -static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, +static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, struct gnttab_copy *tco) { struct gnttab_copy *gop = tco, *request_gop; struct sk_buff *skb; int ret; - struct xenvif *vif = netbk->vif; - while ((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) { + while ((nr_pending_reqs(vif) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) { struct xen_netif_tx_request txreq; struct xen_netif_tx_request txfrags[MAX_SKB_FRAGS]; struct page *page; @@ -1142,8 +1116,8 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, break; } - index = pending_index(netbk->pending_cons); - pending_idx = netbk->pending_ring[index]; + index = pending_index(vif->pending_cons); + pending_idx = vif->pending_ring[index]; data_len = (txreq.size > PKT_PROT_LEN && ret < MAX_SKB_FRAGS) ? 
@@ -1173,7 +1147,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, } /* XXX could copy straight to head */ - page = xen_netbk_alloc_page(netbk, skb, pending_idx); + page = xen_netbk_alloc_page(vif, skb, pending_idx); if (!page) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); @@ -1193,7 +1167,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, gop++; - pool_idx = netbk->mmap_pages[pending_idx]; + pool_idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(pool_idx); memcpy(&pending_tx_info->req, @@ -1213,11 +1187,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, INVALID_PENDING_IDX); } - __skb_queue_tail(&netbk->tx_queue, skb); + __skb_queue_tail(&vif->tx_queue, skb); - netbk->pending_cons++; + vif->pending_cons++; - request_gop = xen_netbk_get_requests(netbk, vif, + request_gop = xen_netbk_get_requests(vif, skb, txfrags, gop); if (request_gop == NULL) { kfree_skb(skb); @@ -1235,17 +1209,16 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, return gop - tco; } -static int xen_netbk_tx_submit(struct xen_netbk *netbk, - struct gnttab_copy *tco, - int budget) +static int xen_netbk_tx_submit(struct xenvif *vif, + struct gnttab_copy *tco, + int budget) { struct gnttab_copy *gop = tco; struct sk_buff *skb; - struct xenvif *vif = netbk->vif; int work_done = 0; while ((work_done < budget) && - (skb = __skb_dequeue(&netbk->tx_queue)) != NULL) { + (skb = __skb_dequeue(&vif->tx_queue)) != NULL) { struct xen_netif_tx_request *txp; u16 pending_idx; unsigned data_len; @@ -1254,13 +1227,13 @@ static int xen_netbk_tx_submit(struct xen_netbk *netbk, pending_idx = *((u16 *)skb->data); - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; /* Check the remap error code. */ - if (unlikely(xen_netbk_tx_check_gop(netbk, skb, &gop))) { + if (unlikely(xen_netbk_tx_check_gop(vif, skb, &gop))) { netdev_dbg(vif->dev, "netback grant failed.\n"); skb_shinfo(skb)->nr_frags = 0; kfree_skb(skb); @@ -1269,7 +1242,7 @@ static int xen_netbk_tx_submit(struct xen_netbk *netbk, data_len = skb->len; memcpy(skb->data, - (void *)(idx_to_kaddr(netbk, pending_idx)|txp->offset), + (void *)(idx_to_kaddr(vif, pending_idx)|txp->offset), data_len); if (data_len < txp->size) { /* Append the packet payload as a fragment. */ @@ -1277,7 +1250,7 @@ static int xen_netbk_tx_submit(struct xen_netbk *netbk, txp->size -= data_len; } else { /* Schedule a response immediately. 
*/ - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); } if (txp->flags & XEN_NETTXF_csum_blank) @@ -1285,7 +1258,7 @@ static int xen_netbk_tx_submit(struct xen_netbk *netbk, else if (txp->flags & XEN_NETTXF_data_validated) skb->ip_summed = CHECKSUM_UNNECESSARY; - xen_netbk_fill_frags(netbk, skb); + xen_netbk_fill_frags(vif, skb); /* * If the initial fragment was < PKT_PROT_LEN then @@ -1319,7 +1292,7 @@ static int xen_netbk_tx_submit(struct xen_netbk *netbk, } /* Called after netfront has transmitted */ -int xen_netbk_tx_action(struct xen_netbk *netbk, int budget) +int xen_netbk_tx_action(struct xenvif *vif, int budget) { unsigned nr_gops; int ret; @@ -1327,7 +1300,7 @@ int xen_netbk_tx_action(struct xen_netbk *netbk, int budget) struct gnttab_copy *tco; static int unusable_count; - if (unlikely(!tx_work_todo(netbk))) + if (unlikely(!tx_work_todo(vif))) return 0; tco = get_cpu_var(tx_copy_ops); @@ -1340,13 +1313,13 @@ int xen_netbk_tx_action(struct xen_netbk *netbk, int budget) " is not usable," " not doing any RX work for vif%u.%u\n", smp_processor_id(), - netbk->vif->domid, netbk->vif->handle); + vif->domid, vif->handle); unusable_count = 0; } return -ENOMEM; } - nr_gops = xen_netbk_tx_build_gops(netbk, tco); + nr_gops = xen_netbk_tx_build_gops(vif, tco); if (nr_gops == 0) { put_cpu_var(tx_copy_ops); @@ -1357,35 +1330,34 @@ int xen_netbk_tx_action(struct xen_netbk *netbk, int budget) tco, nr_gops); BUG_ON(ret); - work_done = xen_netbk_tx_submit(netbk, tco, budget); + work_done = xen_netbk_tx_submit(vif, tco, budget); put_cpu_var(tx_copy_ops); return work_done; } -static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) +static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx) { - struct xenvif *vif = netbk->vif; struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; int idx; /* Already complete? 
*/ - if (netbk->mmap_pages[pending_idx] == INVALID_ENTRY) + if (vif->mmap_pages[pending_idx] == INVALID_ENTRY) return; - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); make_tx_response(vif, &pending_tx_info->req, XEN_NETIF_RSP_OKAY); - index = pending_index(netbk->pending_prod++); - netbk->pending_ring[index] = pending_idx; + index = pending_index(vif->pending_prod++); + vif->pending_ring[index] = pending_idx; - page_pool_put(netbk->mmap_pages[pending_idx]); + page_pool_put(vif->mmap_pages[pending_idx]); - netbk->mmap_pages[pending_idx] = INVALID_ENTRY; + vif->mmap_pages[pending_idx] = INVALID_ENTRY; } static void make_tx_response(struct xenvif *vif, @@ -1432,15 +1404,15 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, return resp; } -static inline int rx_work_todo(struct xen_netbk *netbk) +static inline int rx_work_todo(struct xenvif *vif) { - return !skb_queue_empty(&netbk->rx_queue); + return !skb_queue_empty(&vif->rx_queue); } -static inline int tx_work_todo(struct xen_netbk *netbk) +static inline int tx_work_todo(struct xenvif *vif) { - if (likely(RING_HAS_UNCONSUMED_REQUESTS(&netbk->vif->tx)) && - (nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) + if (likely(RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)) && + (nr_pending_reqs(vif) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) return 1; return 0; @@ -1491,54 +1463,21 @@ err: return err; } -struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif) -{ - int i; - struct xen_netbk *netbk; - - netbk = vzalloc(sizeof(struct xen_netbk)); - if (!netbk) { - pr_alert(DRV_NAME "%s: out of memory\n", __func__); - return NULL; - } - - netbk->vif = vif; - - skb_queue_head_init(&netbk->rx_queue); - skb_queue_head_init(&netbk->tx_queue); - - netbk->pending_cons = 0; - netbk->pending_prod = MAX_PENDING_REQS; - for (i = 0; i < MAX_PENDING_REQS; i++) - netbk->pending_ring[i] = i; - - for (i = 0; i < MAX_PENDING_REQS; i++) - netbk->mmap_pages[i] = INVALID_ENTRY; - - return netbk; -} - -void xen_netbk_free_netbk(struct xen_netbk *netbk) -{ - vfree(netbk); -} - int xen_netbk_kthread(void *data) { struct xenvif *vif = data; - struct xen_netbk *netbk = vif->netbk; while (!kthread_should_stop()) { wait_event_interruptible(vif->wq, - rx_work_todo(netbk) || + rx_work_todo(vif) || kthread_should_stop()); cond_resched(); if (kthread_should_stop()) break; - if (rx_work_todo(netbk)) - xen_netbk_rx_action(netbk); + if (rx_work_todo(vif)) + xen_netbk_rx_action(vif); } return 0; diff --git a/drivers/net/xen-netback/page_pool.c b/drivers/net/xen-netback/page_pool.c index 294f48b..ce00a93 100644 --- a/drivers/net/xen-netback/page_pool.c +++ b/drivers/net/xen-netback/page_pool.c @@ -102,7 +102,7 @@ int is_in_pool(struct page *page, int *pidx) return get_page_ext(page, pidx); } -struct page *page_pool_get(struct xen_netbk *netbk, int *pidx) +struct page *page_pool_get(struct xenvif *vif, int *pidx) { int idx; struct page *page; @@ -118,7 +118,7 @@ struct page *page_pool_get(struct xen_netbk *netbk, int *pidx) } set_page_ext(page, idx); - pool[idx].u.netbk = netbk; + pool[idx].u.vif = vif; pool[idx].page = page; *pidx = idx; @@ -131,7 +131,7 @@ void page_pool_put(int idx) struct page *page = pool[idx].page; pool[idx].page = NULL; - pool[idx].u.netbk = NULL; + pool[idx].u.vif = NULL; page->mapping = 0; put_page(page); put_free_entry(idx); @@ -174,9 +174,9 @@ struct page *to_page(int idx) return pool[idx].page; } -struct xen_netbk *to_netbk(int idx) +struct xenvif *to_vif(int idx) { - 
return pool[idx].u.netbk; + return pool[idx].u.vif; } struct pending_tx_info *to_txinfo(int idx) diff --git a/drivers/net/xen-netback/page_pool.h b/drivers/net/xen-netback/page_pool.h index 572b037..efae17c 100644 --- a/drivers/net/xen-netback/page_pool.h +++ b/drivers/net/xen-netback/page_pool.h @@ -27,7 +27,10 @@ #ifndef __PAGE_POOL_H__ #define __PAGE_POOL_H__ -#include "common.h" +struct pending_tx_info { + struct xen_netif_tx_request req; +}; +typedef unsigned int pending_ring_idx_t; typedef uint32_t idx_t; @@ -38,8 +41,8 @@ struct page_pool_entry { struct page *page; struct pending_tx_info tx_info; union { - struct xen_netbk *netbk; - idx_t fl; + struct xenvif *vif; + idx_t fl; } u; }; @@ -52,12 +55,12 @@ int page_pool_init(void); void page_pool_destroy(void); -struct page *page_pool_get(struct xen_netbk *netbk, int *pidx); +struct page *page_pool_get(struct xenvif *vif, int *pidx); void page_pool_put(int idx); int is_in_pool(struct page *page, int *pidx); struct page *to_page(int idx); -struct xen_netbk *to_netbk(int idx); +struct xenvif *to_vif(int idx); struct pending_tx_info *to_txinfo(int idx); #endif /* __PAGE_POOL_H__ */ -- 1.7.2.5
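With xen_netbk folded into xenvif as above, the guest RX path reduces to per-vif state plus the vif's own kthread: queueing an skb for the guest is a tail insert on vif->rx_queue followed by a wake-up of vif->wq, with no shared backend object to look up. A sketch of that wiring (illustration only; the helper name is hypothetical):

/* In the 1:1 model the per-vif queues and wait queue replace the old
 * shared xen_netbk object, so queueing guest RX work needs no lookup
 * of a separate backend structure.
 */
static void queue_for_guest_rx_example(struct xenvif *vif, struct sk_buff *skb)
{
	skb_queue_tail(&vif->rx_queue, skb);	/* per-vif queue */
	wake_up(&vif->wq);			/* kick this vif's own kthread */
}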
From: Wei Liu <wei.liu2@citrix.com>
Date: 2012-Feb-02 16:49 UTC
Subject: [RFC PATCH V4 07/13] netback: alter internal function/structure names.
Since we''ve melted xen_netbk into xenvif, so it is better to give functions clearer names. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 26 ++-- drivers/net/xen-netback/interface.c | 20 ++-- drivers/net/xen-netback/netback.c | 210 +++++++++++++++++----------------- 3 files changed, 128 insertions(+), 128 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index ea91bb6..b7d4442 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -48,7 +48,7 @@ #define DRV_NAME "netback: " #include "page_pool.h" -struct netbk_rx_meta { +struct xenvif_rx_meta { int id; int size; int gso_size; @@ -141,30 +141,30 @@ void xenvif_xenbus_exit(void); int xenvif_schedulable(struct xenvif *vif); -int xen_netbk_rx_ring_full(struct xenvif *vif); +int xenvif_rx_ring_full(struct xenvif *vif); -int xen_netbk_must_stop_queue(struct xenvif *vif); +int xenvif_must_stop_queue(struct xenvif *vif); /* (Un)Map communication rings. */ -void xen_netbk_unmap_frontend_rings(struct xenvif *vif); -int xen_netbk_map_frontend_rings(struct xenvif *vif, - grant_ref_t tx_ring_ref, - grant_ref_t rx_ring_ref); +void xenvif_unmap_frontend_rings(struct xenvif *vif); +int xenvif_map_frontend_rings(struct xenvif *vif, + grant_ref_t tx_ring_ref, + grant_ref_t rx_ring_ref); /* Check for SKBs from frontend and schedule backend processing */ -void xen_netbk_check_rx_xenvif(struct xenvif *vif); +void xenvif_check_rx_xenvif(struct xenvif *vif); /* Queue an SKB for transmission to the frontend */ -void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb); +void xenvif_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb); /* Notify xenvif that ring now has space to send an skb to the frontend */ void xenvif_notify_tx_completion(struct xenvif *vif); /* Returns number of ring slots required to send an skb to the frontend */ -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); +unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); -int xen_netbk_tx_action(struct xenvif *vif, int budget); -void xen_netbk_rx_action(struct xenvif *vif); +int xenvif_tx_action(struct xenvif *vif, int budget); +void xenvif_rx_action(struct xenvif *vif); -int xen_netbk_kthread(void *data); +int xenvif_kthread(void *data); #endif /* __XEN_NETBACK__COMMON_H__ */ diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 9b7d596..b2bde8f 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -48,7 +48,7 @@ int xenvif_schedulable(struct xenvif *vif) static int xenvif_rx_schedulable(struct xenvif *vif) { - return xenvif_schedulable(vif) && !xen_netbk_rx_ring_full(vif); + return xenvif_schedulable(vif) && !xenvif_rx_ring_full(vif); } static irqreturn_t xenvif_interrupt(int irq, void *dev_id) @@ -69,7 +69,7 @@ static int xenvif_poll(struct napi_struct *napi, int budget) struct xenvif *vif = container_of(napi, struct xenvif, napi); int work_done; - work_done = xen_netbk_tx_action(vif, budget); + work_done = xenvif_tx_action(vif, budget); if (work_done < budget) { int more_to_do = 0; @@ -102,12 +102,12 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) goto drop; /* Reserve ring slots for the worst-case number of fragments. 
*/ - vif->rx_req_cons_peek += xen_netbk_count_skb_slots(vif, skb); + vif->rx_req_cons_peek += xenvif_count_skb_slots(vif, skb); - if (vif->can_queue && xen_netbk_must_stop_queue(vif)) + if (vif->can_queue && xenvif_must_stop_queue(vif)) netif_stop_queue(dev); - xen_netbk_queue_tx_skb(vif, skb); + xenvif_queue_tx_skb(vif, skb); return NETDEV_TX_OK; @@ -133,7 +133,7 @@ static void xenvif_up(struct xenvif *vif) { napi_enable(&vif->napi); enable_irq(vif->irq); - xen_netbk_check_rx_xenvif(vif); + xenvif_check_rx_xenvif(vif); } static void xenvif_down(struct xenvif *vif) @@ -330,7 +330,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, __module_get(THIS_MODULE); - err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref); + err = xenvif_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref); if (err < 0) goto err; @@ -343,7 +343,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, disable_irq(vif->irq); init_waitqueue_head(&vif->wq); - vif->task = kthread_create(xen_netbk_kthread, + vif->task = kthread_create(xenvif_kthread, (void *)vif, "vif%d.%d", vif->domid, vif->handle); if (IS_ERR(vif->task)) { @@ -367,7 +367,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, err_unbind: unbind_from_irqhandler(vif->irq, vif); err_unmap: - xen_netbk_unmap_frontend_rings(vif); + xenvif_unmap_frontend_rings(vif); err: module_put(THIS_MODULE); return err; @@ -400,7 +400,7 @@ void xenvif_disconnect(struct xenvif *vif) unregister_netdev(vif->dev); - xen_netbk_unmap_frontend_rings(vif); + xenvif_unmap_frontend_rings(vif); free_netdev(vif->dev); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index ef9cfbe..384f4e5 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -58,9 +58,9 @@ DEFINE_PER_CPU(struct gnttab_copy *, tx_copy_ops); * straddles two buffers in the frontend. */ DEFINE_PER_CPU(struct gnttab_copy *, grant_copy_op); -DEFINE_PER_CPU(struct netbk_rx_meta *, meta); +DEFINE_PER_CPU(struct xenvif_rx_meta *, meta); -static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx); +static void xenvif_idx_release(struct xenvif *vif, u16 pending_idx); static void make_tx_response(struct xenvif *vif, struct xen_netif_tx_request *txp, s8 st); @@ -128,7 +128,7 @@ static int max_required_rx_slots(struct xenvif *vif) return max; } -int xen_netbk_rx_ring_full(struct xenvif *vif) +int xenvif_rx_ring_full(struct xenvif *vif) { RING_IDX peek = vif->rx_req_cons_peek; RING_IDX needed = max_required_rx_slots(vif); @@ -137,16 +137,16 @@ int xen_netbk_rx_ring_full(struct xenvif *vif) ((vif->rx.rsp_prod_pvt + XEN_NETIF_RX_RING_SIZE - peek) < needed); } -int xen_netbk_must_stop_queue(struct xenvif *vif) +int xenvif_must_stop_queue(struct xenvif *vif) { - if (!xen_netbk_rx_ring_full(vif)) + if (!xenvif_rx_ring_full(vif)) return 0; vif->rx.sring->req_event = vif->rx_req_cons_peek + max_required_rx_slots(vif); mb(); /* request notification /then/ check the queue */ - return xen_netbk_rx_ring_full(vif); + return xenvif_rx_ring_full(vif); } /* @@ -192,9 +192,9 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head) /* * Figure out how many ring slots we''re going to need to send @skb to * the guest. This function is essentially a dry run of - * netbk_gop_frag_copy. + * xenvif_gop_frag_copy. 
*/ -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb) +unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb) { unsigned int count; int i, copy_off; @@ -233,15 +233,15 @@ struct netrx_pending_operations { unsigned copy_prod, copy_cons; unsigned meta_prod, meta_cons; struct gnttab_copy *copy; - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; int copy_off; grant_ref_t copy_gref; }; -static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif, - struct netrx_pending_operations *npo) +static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif *vif, + struct netrx_pending_operations *npo) { - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; struct xen_netif_rx_request *req; req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++); @@ -261,13 +261,13 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif, * Set up the grant operations for this fragment. If it''s a flipping * interface, we also set up the unmap request from here. */ -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, - struct netrx_pending_operations *npo, - struct page *page, unsigned long size, - unsigned long offset, int *head) +static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, + struct netrx_pending_operations *npo, + struct page *page, unsigned long size, + unsigned long offset, int *head) { struct gnttab_copy *copy_gop; - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; /* * These variables are used iff get_page_ext returns true, * in which case they are guaranteed to be initialized. @@ -346,14 +346,14 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, * zero GSO descriptors (for non-GSO packets) or one descriptor (for * frontend-side LRO). */ -static int netbk_gop_skb(struct sk_buff *skb, - struct netrx_pending_operations *npo) +static int xenvif_gop_skb(struct sk_buff *skb, + struct netrx_pending_operations *npo) { struct xenvif *vif = netdev_priv(skb->dev); int nr_frags = skb_shinfo(skb)->nr_frags; int i; struct xen_netif_rx_request *req; - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; unsigned char *data; int head = 1; int old_meta_prod; @@ -390,30 +390,30 @@ static int netbk_gop_skb(struct sk_buff *skb, if (data + len > skb_tail_pointer(skb)) len = skb_tail_pointer(skb) - data; - netbk_gop_frag_copy(vif, skb, npo, - virt_to_page(data), len, offset, &head); + xenvif_gop_frag_copy(vif, skb, npo, + virt_to_page(data), len, offset, &head); data += len; } for (i = 0; i < nr_frags; i++) { - netbk_gop_frag_copy(vif, skb, npo, - skb_frag_page(&skb_shinfo(skb)->frags[i]), - skb_frag_size(&skb_shinfo(skb)->frags[i]), - skb_shinfo(skb)->frags[i].page_offset, - &head); + xenvif_gop_frag_copy(vif, skb, npo, + skb_frag_page(&skb_shinfo(skb)->frags[i]), + skb_frag_size(&skb_shinfo(skb)->frags[i]), + skb_shinfo(skb)->frags[i].page_offset, + &head); } return npo->meta_prod - old_meta_prod; } /* - * This is a twin to netbk_gop_skb. Assume that netbk_gop_skb was + * This is a twin to xenvif_gop_skb. Assume that xenvif_gop_skb was * used to set up the operations on the top of * netrx_pending_operations, which have since been done. Check that * they didn''t give any errors and advance over them. 
*/ -static int netbk_check_gop(struct xenvif *vif, int nr_meta_slots, - struct netrx_pending_operations *npo) +static int xenvif_check_gop(struct xenvif *vif, int nr_meta_slots, + struct netrx_pending_operations *npo) { struct gnttab_copy *copy_op; int status = XEN_NETIF_RSP_OKAY; @@ -432,9 +432,9 @@ static int netbk_check_gop(struct xenvif *vif, int nr_meta_slots, return status; } -static void netbk_add_frag_responses(struct xenvif *vif, int status, - struct netbk_rx_meta *meta, - int nr_meta_slots) +static void xenvif_add_frag_responses(struct xenvif *vif, int status, + struct xenvif_rx_meta *meta, + int nr_meta_slots) { int i; unsigned long offset; @@ -462,12 +462,12 @@ struct skb_cb_overlay { int meta_slots_used; }; -static void xen_netbk_kick_thread(struct xenvif *vif) +static void xenvif_kick_thread(struct xenvif *vif) { wake_up(&vif->wq); } -void xen_netbk_rx_action(struct xenvif *vif) +void xenvif_rx_action(struct xenvif *vif) { s8 status; u16 flags; @@ -484,7 +484,7 @@ void xen_netbk_rx_action(struct xenvif *vif) static int unusable_count; struct gnttab_copy *gco = get_cpu_var(grant_copy_op); - struct netbk_rx_meta *m = get_cpu_var(meta); + struct xenvif_rx_meta *m = get_cpu_var(meta); struct netrx_pending_operations npo = { .copy = gco, @@ -513,7 +513,7 @@ void xen_netbk_rx_action(struct xenvif *vif) nr_frags = skb_shinfo(skb)->nr_frags; sco = (struct skb_cb_overlay *)skb->cb; - sco->meta_slots_used = netbk_gop_skb(skb, &npo); + sco->meta_slots_used = xenvif_gop_skb(skb, &npo); count += nr_frags + 1; @@ -558,7 +558,7 @@ void xen_netbk_rx_action(struct xenvif *vif) vif->dev->stats.tx_bytes += skb->len; vif->dev->stats.tx_packets++; - status = netbk_check_gop(vif, sco->meta_slots_used, &npo); + status = xenvif_check_gop(vif, sco->meta_slots_used, &npo); if (sco->meta_slots_used == 1) flags = 0; @@ -594,9 +594,9 @@ void xen_netbk_rx_action(struct xenvif *vif) gso->flags = 0; } - netbk_add_frag_responses(vif, status, - m + npo.meta_cons + 1, - sco->meta_slots_used); + xenvif_add_frag_responses(vif, status, + m + npo.meta_cons + 1, + sco->meta_slots_used); RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret); if (ret) @@ -612,20 +612,20 @@ void xen_netbk_rx_action(struct xenvif *vif) notify_remote_via_irq(vif->irq); if (!skb_queue_empty(&vif->rx_queue)) - xen_netbk_kick_thread(vif); + xenvif_kick_thread(vif); put_cpu_var(grant_copy_op); put_cpu_var(meta); } -void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) +void xenvif_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) { skb_queue_tail(&vif->rx_queue, skb); - xen_netbk_kick_thread(vif); + xenvif_kick_thread(vif); } -void xen_netbk_check_rx_xenvif(struct xenvif *vif) +void xenvif_check_rx_xenvif(struct xenvif *vif) { int more_to_do; @@ -662,11 +662,11 @@ static void tx_credit_callback(unsigned long data) { struct xenvif *vif = (struct xenvif *)data; tx_add_credit(vif); - xen_netbk_check_rx_xenvif(vif); + xenvif_check_rx_xenvif(vif); } -static void netbk_tx_err(struct xenvif *vif, - struct xen_netif_tx_request *txp, RING_IDX end) +static void xenvif_tx_err(struct xenvif *vif, + struct xen_netif_tx_request *txp, RING_IDX end) { RING_IDX cons = vif->tx.req_cons; @@ -679,7 +679,7 @@ static void netbk_tx_err(struct xenvif *vif, vif->tx.req_cons = cons; } -static int netbk_count_requests(struct xenvif *vif, +static int xenvif_count_requests(struct xenvif *vif, struct xen_netif_tx_request *first, struct xen_netif_tx_request *txp, int work_to_do) @@ -720,9 +720,9 @@ static int netbk_count_requests(struct xenvif *vif, 
return frags; } -static struct page *xen_netbk_alloc_page(struct xenvif *vif, - struct sk_buff *skb, - u16 pending_idx) +static struct page *xenvif_alloc_page(struct xenvif *vif, + struct sk_buff *skb, + u16 pending_idx) { struct page *page; int idx; @@ -733,10 +733,10 @@ static struct page *xen_netbk_alloc_page(struct xenvif *vif, return page; } -static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, - struct sk_buff *skb, - struct xen_netif_tx_request *txp, - struct gnttab_copy *gop) +static struct gnttab_copy *xenvif_get_requests(struct xenvif *vif, + struct sk_buff *skb, + struct xen_netif_tx_request *txp, + struct gnttab_copy *gop) { struct skb_shared_info *shinfo = skb_shinfo(skb); skb_frag_t *frags = shinfo->frags; @@ -754,7 +754,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, index = pending_index(vif->pending_cons++); pending_idx = vif->pending_ring[index]; - page = xen_netbk_alloc_page(vif, skb, pending_idx); + page = xenvif_alloc_page(vif, skb, pending_idx); if (!page) return NULL; @@ -782,9 +782,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, return gop; } -static int xen_netbk_tx_check_gop(struct xenvif *vif, - struct sk_buff *skb, - struct gnttab_copy **gopp) +static int xenvif_tx_check_gop(struct xenvif *vif, + struct sk_buff *skb, + struct gnttab_copy **gopp) { struct gnttab_copy *gop = *gopp; u16 pending_idx = *((u16 *)skb->data); @@ -821,7 +821,7 @@ static int xen_netbk_tx_check_gop(struct xenvif *vif, if (likely(!newerr)) { /* Had a previous error? Invalidate this fragment. */ if (unlikely(err)) - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); continue; } @@ -838,10 +838,10 @@ static int xen_netbk_tx_check_gop(struct xenvif *vif, /* First error: invalidate header and preceding fragments. */ pending_idx = *((u16 *)skb->data); - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); for (j = start; j < i; j++) { pending_idx = frag_get_pending_idx(&shinfo->frags[j]); - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); } /* Remember the error: invalidate all subsequent fragments. 
*/ @@ -852,7 +852,7 @@ static int xen_netbk_tx_check_gop(struct xenvif *vif, return err; } -static void xen_netbk_fill_frags(struct xenvif *vif, struct sk_buff *skb) +static void xenvif_fill_frags(struct xenvif *vif, struct sk_buff *skb) { struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -878,15 +878,15 @@ static void xen_netbk_fill_frags(struct xenvif *vif, struct sk_buff *skb) skb->data_len += txp->size; skb->truesize += txp->size; - /* Take an extra reference to offset xen_netbk_idx_release */ + /* Take an extra reference to offset xenvif_idx_release */ get_page(page); - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); } } -static int xen_netbk_get_extras(struct xenvif *vif, - struct xen_netif_extra_info *extras, - int work_to_do) +static int xenvif_get_extras(struct xenvif *vif, + struct xen_netif_extra_info *extras, + int work_to_do) { struct xen_netif_extra_info extra; RING_IDX cons = vif->tx.req_cons; @@ -914,9 +914,9 @@ static int xen_netbk_get_extras(struct xenvif *vif, return work_to_do; } -static int netbk_set_skb_gso(struct xenvif *vif, - struct sk_buff *skb, - struct xen_netif_extra_info *gso) +static int xenvif_set_skb_gso(struct xenvif *vif, + struct sk_buff *skb, + struct xen_netif_extra_info *gso) { if (!gso->u.gso.size) { netdev_dbg(vif->dev, "GSO size must not be zero.\n"); @@ -1041,8 +1041,8 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) return false; } -static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, - struct gnttab_copy *tco) +static unsigned xenvif_tx_build_gops(struct xenvif *vif, + struct gnttab_copy *tco) { struct gnttab_copy *gop = tco, *request_gop; struct sk_buff *skb; @@ -1083,18 +1083,18 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, memset(extras, 0, sizeof(extras)); if (txreq.flags & XEN_NETTXF_extra_info) { - work_to_do = xen_netbk_get_extras(vif, extras, + work_to_do = xenvif_get_extras(vif, extras, work_to_do); idx = vif->tx.req_cons; if (unlikely(work_to_do < 0)) { - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } } - ret = netbk_count_requests(vif, &txreq, txfrags, work_to_do); + ret = xenvif_count_requests(vif, &txreq, txfrags, work_to_do); if (unlikely(ret < 0)) { - netbk_tx_err(vif, &txreq, idx - ret); + xenvif_tx_err(vif, &txreq, idx - ret); break; } idx += ret; @@ -1102,7 +1102,7 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, if (unlikely(txreq.size < ETH_HLEN)) { netdev_dbg(vif->dev, "Bad packet size: %d\n", txreq.size); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1112,7 +1112,7 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, "txreq.offset: %x, size: %u, end: %lu\n", txreq.offset, txreq.size, (txreq.offset&~PAGE_MASK) + txreq.size); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1128,7 +1128,7 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, if (unlikely(skb == NULL)) { netdev_dbg(vif->dev, "Can''t allocate a skb in start_xmit.\n"); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1139,18 +1139,18 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, struct xen_netif_extra_info *gso; gso = &extras[XEN_NETIF_EXTRA_TYPE_GSO - 1]; - if (netbk_set_skb_gso(vif, skb, gso)) { + if (xenvif_set_skb_gso(vif, skb, gso)) { kfree_skb(skb); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } } /* XXX could copy straight to head */ - 
page = xen_netbk_alloc_page(vif, skb, pending_idx); + page = xenvif_alloc_page(vif, skb, pending_idx); if (!page) { kfree_skb(skb); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1191,11 +1191,11 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, vif->pending_cons++; - request_gop = xen_netbk_get_requests(vif, + request_gop = xenvif_get_requests(vif, skb, txfrags, gop); if (request_gop == NULL) { kfree_skb(skb); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } gop = request_gop; @@ -1209,9 +1209,9 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, return gop - tco; } -static int xen_netbk_tx_submit(struct xenvif *vif, - struct gnttab_copy *tco, - int budget) +static int xenvif_tx_submit(struct xenvif *vif, + struct gnttab_copy *tco, + int budget) { struct gnttab_copy *gop = tco; struct sk_buff *skb; @@ -1233,7 +1233,7 @@ static int xen_netbk_tx_submit(struct xenvif *vif, txp = &pending_tx_info->req; /* Check the remap error code. */ - if (unlikely(xen_netbk_tx_check_gop(vif, skb, &gop))) { + if (unlikely(xenvif_tx_check_gop(vif, skb, &gop))) { netdev_dbg(vif->dev, "netback grant failed.\n"); skb_shinfo(skb)->nr_frags = 0; kfree_skb(skb); @@ -1250,7 +1250,7 @@ static int xen_netbk_tx_submit(struct xenvif *vif, txp->size -= data_len; } else { /* Schedule a response immediately. */ - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); } if (txp->flags & XEN_NETTXF_csum_blank) @@ -1258,7 +1258,7 @@ static int xen_netbk_tx_submit(struct xenvif *vif, else if (txp->flags & XEN_NETTXF_data_validated) skb->ip_summed = CHECKSUM_UNNECESSARY; - xen_netbk_fill_frags(vif, skb); + xenvif_fill_frags(vif, skb); /* * If the initial fragment was < PKT_PROT_LEN then @@ -1292,7 +1292,8 @@ static int xen_netbk_tx_submit(struct xenvif *vif, } /* Called after netfront has transmitted */ -int xen_netbk_tx_action(struct xenvif *vif, int budget) + +int xenvif_tx_action(struct xenvif *vif, int budget) { unsigned nr_gops; int ret; @@ -1319,7 +1320,7 @@ int xen_netbk_tx_action(struct xenvif *vif, int budget) return -ENOMEM; } - nr_gops = xen_netbk_tx_build_gops(vif, tco); + nr_gops = xenvif_tx_build_gops(vif, tco); if (nr_gops == 0) { put_cpu_var(tx_copy_ops); @@ -1330,14 +1331,14 @@ int xen_netbk_tx_action(struct xenvif *vif, int budget) tco, nr_gops); BUG_ON(ret); - work_done = xen_netbk_tx_submit(vif, tco, budget); + work_done = xenvif_tx_submit(vif, tco, budget); put_cpu_var(tx_copy_ops); return work_done; } -static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx) +static void xenvif_idx_release(struct xenvif *vif, u16 pending_idx) { struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; @@ -1418,7 +1419,7 @@ static inline int tx_work_todo(struct xenvif *vif) return 0; } -void xen_netbk_unmap_frontend_rings(struct xenvif *vif) +void xenvif_unmap_frontend_rings(struct xenvif *vif) { if (vif->tx.sring) xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), @@ -1428,9 +1429,9 @@ void xen_netbk_unmap_frontend_rings(struct xenvif *vif) vif->rx.sring); } -int xen_netbk_map_frontend_rings(struct xenvif *vif, - grant_ref_t tx_ring_ref, - grant_ref_t rx_ring_ref) +int xenvif_map_frontend_rings(struct xenvif *vif, + grant_ref_t tx_ring_ref, + grant_ref_t rx_ring_ref) { void *addr; struct xen_netif_tx_sring *txs; @@ -1459,11 +1460,11 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, return 0; err: - xen_netbk_unmap_frontend_rings(vif); + xenvif_unmap_frontend_rings(vif); return 
err; } -int xen_netbk_kthread(void *data) +int xenvif_kthread(void *data) { struct xenvif *vif = data; @@ -1477,7 +1478,7 @@ int xen_netbk_kthread(void *data) break; if (rx_work_todo(vif)) - xen_netbk_rx_action(vif); + xenvif_rx_action(vif); } return 0; @@ -1508,7 +1509,6 @@ static int __create_percpu_scratch_space(unsigned int cpu) vzalloc(sizeof(struct gnttab_copy) * 2 * XEN_NETIF_RX_RING_SIZE); - per_cpu(meta, cpu) = vzalloc_node(sizeof(struct xenvif_rx_meta) * 2 * XEN_NETIF_RX_RING_SIZE, cpu_to_node(cpu)); -- 1.7.2.5
From: Wei Liu <wei.liu2@citrix.com>
Date: 2012-Feb-02 16:49 UTC
Subject: [RFC PATCH V4 08/13] xenbus_client: extend interface to support mapping / unmapping of multi page ring.
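For illustration only (not part of the diff below), a frontend/backend pair would use the extended interface roughly as in the following sketch. The page count and function name are hypothetical; nr_grefs must not exceed XENBUS_MAX_RING_PAGES, and the two sides are compressed into one function purely for brevity.

#define EXAMPLE_RING_PAGES 2	/* must not exceed XENBUS_MAX_RING_PAGES */

static int example_map_multi_page_ring(struct xenbus_device *fe_dev,
					struct xenbus_device *be_dev,
					void *ring)
{
	int grefs[EXAMPLE_RING_PAGES];
	void *be_addr;
	int err;

	/* Frontend side: grant every page of the ring in one call. */
	err = xenbus_grant_ring(fe_dev, ring, EXAMPLE_RING_PAGES, grefs);
	if (err)
		return err;

	/* Backend side: map the whole ring into its address space. */
	return xenbus_map_ring_valloc(be_dev, grefs, EXAMPLE_RING_PAGES, &be_addr);
}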
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/xenbus/xenbus_client.c | 282 +++++++++++++++++++++++++----------- include/xen/xenbus.h | 15 ++- 2 files changed, 206 insertions(+), 91 deletions(-) diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c index 566d2ad..d73b9c6 100644 --- a/drivers/xen/xenbus/xenbus_client.c +++ b/drivers/xen/xenbus/xenbus_client.c @@ -53,14 +53,16 @@ struct xenbus_map_node { struct vm_struct *area; /* PV */ struct page *page; /* HVM */ }; - grant_handle_t handle; + grant_handle_t handle[XENBUS_MAX_RING_PAGES]; + unsigned int nr_handles; }; static DEFINE_SPINLOCK(xenbus_valloc_lock); static LIST_HEAD(xenbus_valloc_pages); struct xenbus_ring_ops { - int (*map)(struct xenbus_device *dev, int gnt, void **vaddr); + int (*map)(struct xenbus_device *dev, int gnt[], int nr_gnts, + void **vaddr); int (*unmap)(struct xenbus_device *dev, void *vaddr); }; @@ -356,17 +358,38 @@ static void xenbus_switch_fatal(struct xenbus_device *dev, int depth, int err, /** * xenbus_grant_ring * @dev: xenbus device - * @ring_mfn: mfn of ring to grant - - * Grant access to the given @ring_mfn to the peer of the given device. Return - * 0 on success, or -errno on error. On error, the device will switch to - * XenbusStateClosing, and the error will be saved in the store. + * @vaddr: starting virtual address of the ring + * @nr_pages: number of page to be granted + * @grefs: grant reference array to be filled in + * Grant access to the given @vaddr to the peer of the given device. + * Then fill in @grefs with grant references. Return 0 on success, or + * -errno on error. On error, the device will switch to + * XenbusStateClosing, and the first error will be saved in the store. */ -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn) +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr, + int nr_pages, int grefs[]) { - int err = gnttab_grant_foreign_access(dev->otherend_id, ring_mfn, 0); - if (err < 0) - xenbus_dev_fatal(dev, err, "granting access to ring page"); + int i; + int err; + + for (i = 0; i < nr_pages; i++) { + unsigned long addr = (unsigned long)vaddr + + (PAGE_SIZE * i); + err = gnttab_grant_foreign_access(dev->otherend_id, + virt_to_mfn(addr), 0); + if (err < 0) { + xenbus_dev_fatal(dev, err, + "granting access to ring page"); + goto fail; + } + grefs[i] = err; + } + + return 0; + +fail: + for ( ; i >= 0; i--) + gnttab_end_foreign_access_ref(grefs[i], 0); return err; } EXPORT_SYMBOL_GPL(xenbus_grant_ring); @@ -447,7 +470,8 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn); /** * xenbus_map_ring_valloc * @dev: xenbus device - * @gnt_ref: grant reference + * @gnt_ref: grant reference array + * @nr_grefs: number of grant reference * @vaddr: pointer to address to be filled out by mapping * * Based on Rusty Russell''s skeleton driver''s map_page. @@ -458,52 +482,74 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn); * or -ENOMEM on error. If an error is returned, device will switch to * XenbusStateClosing and the error message will be saved in XenStore. 
*/ -int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr) +int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref[], + int nr_grefs, void **vaddr) { - return ring_ops->map(dev, gnt_ref, vaddr); + return ring_ops->map(dev, gnt_ref, nr_grefs, vaddr); } EXPORT_SYMBOL_GPL(xenbus_map_ring_valloc); +static int __xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, + struct xenbus_map_node *node); + static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev, - int gnt_ref, void **vaddr) + int gnt_ref[], int nr_grefs, void **vaddr) { - struct gnttab_map_grant_ref op = { - .flags = GNTMAP_host_map | GNTMAP_contains_pte, - .ref = gnt_ref, - .dom = dev->otherend_id, - }; + struct gnttab_map_grant_ref op[XENBUS_MAX_RING_PAGES]; struct xenbus_map_node *node; struct vm_struct *area; - pte_t *pte; + pte_t *pte[XENBUS_MAX_RING_PAGES]; + int i; + int err = 0; *vaddr = NULL; + if (nr_grefs > XENBUS_MAX_RING_PAGES) + return -EINVAL; + node = kzalloc(sizeof(*node), GFP_KERNEL); if (!node) return -ENOMEM; - area = alloc_vm_area(PAGE_SIZE, &pte); + area = alloc_vm_area(PAGE_SIZE * nr_grefs, pte); if (!area) { kfree(node); return -ENOMEM; } - op.host_addr = arbitrary_virt_to_machine(pte).maddr; + for (i = 0; i < nr_grefs; i++) { + op[i].flags = GNTMAP_host_map | GNTMAP_contains_pte, + op[i].ref = gnt_ref[i], + op[i].dom = dev->otherend_id, + op[i].host_addr = arbitrary_virt_to_machine(pte[i]).maddr; + }; - if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1)) + if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, op, nr_grefs)) BUG(); - if (op.status != GNTST_okay) { - free_vm_area(area); - kfree(node); - xenbus_dev_fatal(dev, op.status, + node->nr_handles = nr_grefs; + node->area = area; + + for (i = 0; i < nr_grefs; i++) { + if (op[i].status != GNTST_okay) { + err = op[i].status; + node->handle[i] = INVALID_GRANT_HANDLE; + continue; + } + node->handle[i] = op[i].handle; + } + + if (err != 0) { + for (i = 0; i < nr_grefs; i++) + xenbus_dev_fatal(dev, op[i].status, "mapping in shared page %d from domain %d", - gnt_ref, dev->otherend_id); - return op.status; + gnt_ref[i], dev->otherend_id); + + __xenbus_unmap_ring_vfree_pv(dev, node); + + return err; } - node->handle = op.handle; - node->area = area; spin_lock(&xenbus_valloc_lock); list_add(&node->next, &xenbus_valloc_pages); @@ -514,28 +560,34 @@ static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev, } static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev, - int gnt_ref, void **vaddr) + int gnt_ref[], int nr_grefs, void **vaddr) { struct xenbus_map_node *node; int err; void *addr; + if (nr_grefs > XENBUS_MAX_RING_PAGES) + return -EINVAL; + *vaddr = NULL; node = kzalloc(sizeof(*node), GFP_KERNEL); if (!node) return -ENOMEM; - err = alloc_xenballooned_pages(1, &node->page, false /* lowmem */); + err = alloc_xenballooned_pages(nr_grefs, &node->page, + false /* lowmem */); if (err) goto out_err; addr = pfn_to_kaddr(page_to_pfn(node->page)); - err = xenbus_map_ring(dev, gnt_ref, &node->handle, addr); + err = xenbus_map_ring(dev, gnt_ref, nr_grefs, node->handle, addr); if (err) goto out_err; + node->nr_handles = nr_grefs; + spin_lock(&xenbus_valloc_lock); list_add(&node->next, &xenbus_valloc_pages); spin_unlock(&xenbus_valloc_lock); @@ -544,7 +596,7 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev, return 0; out_err: - free_xenballooned_pages(1, &node->page); + free_xenballooned_pages(nr_grefs, &node->page); kfree(node); return err; } @@ -553,36 +605,52 @@ static int 
xenbus_map_ring_valloc_hvm(struct xenbus_device *dev, /** * xenbus_map_ring * @dev: xenbus device - * @gnt_ref: grant reference - * @handle: pointer to grant handle to be filled + * @gnt_ref: grant reference array + * @nr_grefs: number of grant reference + * @handle: pointer to grant handle array to be filled, mind the size * @vaddr: address to be mapped to * - * Map a page of memory into this domain from another domain''s grant table. + * Map pages of memory into this domain from another domain''s grant table. * xenbus_map_ring does not allocate the virtual address space (you must do - * this yourself!). It only maps in the page to the specified address. + * this yourself!). It only maps in the pages to the specified address. * Returns 0 on success, and GNTST_* (see xen/include/interface/grant_table.h) * or -ENOMEM on error. If an error is returned, device will switch to - * XenbusStateClosing and the error message will be saved in XenStore. + * XenbusStateClosing and the last error message will be saved in XenStore. */ -int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref, - grant_handle_t *handle, void *vaddr) +int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref[], int nr_grefs, + grant_handle_t handle[], void *vaddr) { - struct gnttab_map_grant_ref op; - - gnttab_set_map_op(&op, (phys_addr_t)vaddr, GNTMAP_host_map, gnt_ref, - dev->otherend_id); + struct gnttab_map_grant_ref op[XENBUS_MAX_RING_PAGES]; + int i; + int err = GNTST_okay; /* 0 */ + + for (i = 0; i < nr_grefs; i++) { + unsigned long addr = (unsigned long)vaddr + + (PAGE_SIZE * i); + gnttab_set_map_op(&op[i], (phys_addr_t)addr, + GNTMAP_host_map, gnt_ref[i], + dev->otherend_id); + } - if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1)) + if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, op, nr_grefs)) BUG(); - if (op.status != GNTST_okay) { - xenbus_dev_fatal(dev, op.status, + + for (i = 0; i < nr_grefs; i++) { + if (op[i].status != GNTST_okay) { + err = op[i].status; + xenbus_dev_fatal(dev, err, "mapping in shared page %d from domain %d", - gnt_ref, dev->otherend_id); - } else - *handle = op.handle; + gnt_ref[i], dev->otherend_id); + handle[i] = INVALID_GRANT_HANDLE; + } else + handle[i] = op[i].handle; + } + + if (err != GNTST_okay) + xenbus_unmap_ring(dev, handle, nr_grefs, vaddr); - return op.status; + return err; } EXPORT_SYMBOL_GPL(xenbus_map_ring); @@ -605,13 +673,53 @@ int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr) } EXPORT_SYMBOL_GPL(xenbus_unmap_ring_vfree); +static int __xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, + struct xenbus_map_node *node) +{ + struct gnttab_unmap_grant_ref op[XENBUS_MAX_RING_PAGES]; + unsigned int level; + int i, j; + int err = GNTST_okay; + + j = 0; + for (i = 0; i < node->nr_handles; i++) { + unsigned long vaddr = (unsigned long)node->area->addr + + (PAGE_SIZE * i); + if (node->handle[i] != INVALID_GRANT_HANDLE) { + memset(&op[j], 0, sizeof(op[0])); + op[j].host_addr = arbitrary_virt_to_machine( + lookup_address(vaddr, &level)).maddr; + op[j].handle = node->handle[i]; + j++; + node->handle[i] = INVALID_GRANT_HANDLE; + } + } + + if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, op, j)) + BUG(); + + node->nr_handles = 0; + + for (i = 0; i < j; i++) { + if (op[i].status != GNTST_okay) { + err = op[i].status; + xenbus_dev_error(dev, err, + "unmapping page %d at handle %d error %d", + i, op[i].handle, err); + } + } + + if (err == GNTST_okay) + free_vm_area(node->area); + + kfree(node); + + return err; +} + static int 
xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr) { struct xenbus_map_node *node; - struct gnttab_unmap_grant_ref op = { - .host_addr = (unsigned long)vaddr, - }; - unsigned int level; spin_lock(&xenbus_valloc_lock); list_for_each_entry(node, &xenbus_valloc_pages, next) { @@ -630,29 +738,14 @@ static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr) return GNTST_bad_virt_addr; } - op.handle = node->handle; - op.host_addr = arbitrary_virt_to_machine( - lookup_address((unsigned long)vaddr, &level)).maddr; - - if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1)) - BUG(); - - if (op.status == GNTST_okay) - free_vm_area(node->area); - else - xenbus_dev_error(dev, op.status, - "unmapping page at handle %d error %d", - node->handle, op.status); - - kfree(node); - return op.status; + return __xenbus_unmap_ring_vfree_pv(dev, node); } static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr) { int rv; struct xenbus_map_node *node; - void *addr; + void *addr = NULL; spin_lock(&xenbus_valloc_lock); list_for_each_entry(node, &xenbus_valloc_pages, next) { @@ -672,10 +765,10 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr) return GNTST_bad_virt_addr; } - rv = xenbus_unmap_ring(dev, node->handle, addr); + rv = xenbus_unmap_ring(dev, node->handle, node->nr_handles, addr); if (!rv) - free_xenballooned_pages(1, &node->page); + free_xenballooned_pages(node->nr_handles, &node->page); else WARN(1, "Leaking %p\n", vaddr); @@ -687,6 +780,7 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr) * xenbus_unmap_ring * @dev: xenbus device * @handle: grant handle + * @nr_handles: number of grant handle * @vaddr: addr to unmap * * Unmap a page of memory in this domain that was imported from another domain. @@ -694,21 +788,37 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr) * (see xen/include/interface/grant_table.h). */ int xenbus_unmap_ring(struct xenbus_device *dev, - grant_handle_t handle, void *vaddr) + grant_handle_t handle[], int nr_handles, + void *vaddr) { - struct gnttab_unmap_grant_ref op; - - gnttab_set_unmap_op(&op, (phys_addr_t)vaddr, GNTMAP_host_map, handle); + struct gnttab_unmap_grant_ref op[XENBUS_MAX_RING_PAGES]; + int i, j; + int err = GNTST_okay; + + j = 0; + for (i = 0; i < nr_handles; i++) { + unsigned long addr = (unsigned long)vaddr + + (PAGE_SIZE * i); + if (handle[i] != INVALID_GRANT_HANDLE) { + gnttab_set_unmap_op(&op[j++], (phys_addr_t)addr, + GNTMAP_host_map, handle[i]); + handle[i] = INVALID_GRANT_HANDLE; + } + } - if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1)) + if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, op, j)) BUG(); - if (op.status != GNTST_okay) - xenbus_dev_error(dev, op.status, + for (i = 0; i < j; i++) { + if (op[i].status != GNTST_okay) { + err = op[i].status; + xenbus_dev_error(dev, err, "unmapping page at handle %d error %d", - handle, op.status); + handle[i], err); + } + } - return op.status; + return err; } EXPORT_SYMBOL_GPL(xenbus_unmap_ring); diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h index e8c599b..284647f 100644 --- a/include/xen/xenbus.h +++ b/include/xen/xenbus.h @@ -46,6 +46,10 @@ #include <xen/interface/io/xenbus.h> #include <xen/interface/io/xs_wire.h> +#define XENBUS_MAX_RING_PAGE_ORDER 2 +#define XENBUS_MAX_RING_PAGES 4 +#define INVALID_GRANT_HANDLE (~0U) + /* Register callback to watch this node. 
*/ struct xenbus_watch { @@ -195,15 +199,16 @@ int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch, const char *pathfmt, ...); int xenbus_switch_state(struct xenbus_device *dev, enum xenbus_state new_state); -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn); +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr, + int nr_pages, int grefs[]); int xenbus_map_ring_valloc(struct xenbus_device *dev, - int gnt_ref, void **vaddr); -int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref, - grant_handle_t *handle, void *vaddr); + int gnt_ref[], int nr_grefs, void **vaddr); +int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref[], int nr_grefs, + grant_handle_t handle[], void *vaddr); int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr); int xenbus_unmap_ring(struct xenbus_device *dev, - grant_handle_t handle, void *vaddr); + grant_handle_t handle[], int nr_handles, void *vaddr); int xenbus_alloc_evtchn(struct xenbus_device *dev, int *port); int xenbus_bind_evtchn(struct xenbus_device *dev, int remote_port, int *port); -- 1.7.2.5
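To make the extended interface concrete, here is a minimal sketch of a frontend granting a two-page ring and a backend mapping it. The helper names are hypothetical and error handling is trimmed; only the xenbus calls follow the prototypes introduced above.

/* Frontend: sring points at two zeroed, virtually contiguous pages it
 * allocated; on success grefs[0] and grefs[1] hold the grant references
 * to publish in xenstore. */
static int example_front_grant_ring(struct xenbus_device *dev,
                                    void *sring, int grefs[2])
{
        return xenbus_grant_ring(dev, sring, 2, grefs);
}

/* Backend: map both references into one virtually contiguous area.
 * nr_grefs must not exceed XENBUS_MAX_RING_PAGES (4). */
static int example_back_map_ring(struct xenbus_device *dev,
                                 int gnt_refs[2], void **addr)
{
        return xenbus_map_ring_valloc(dev, gnt_refs, 2, addr);
}

/* Teardown on the backend is unchanged: xenbus_unmap_ring_vfree(dev, addr). */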
Wei Liu
2012-Feb-02 16:49 UTC
[RFC PATCH V4 09/13] Bundle fix for xen backends and frontends
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/block/xen-blkback/xenbus.c | 8 +++++--- drivers/block/xen-blkfront.c | 5 +++-- drivers/net/xen-netback/netback.c | 4 ++-- drivers/net/xen-netfront.c | 9 +++++---- drivers/pci/xen-pcifront.c | 5 +++-- drivers/scsi/xen-scsiback/common.h | 3 ++- drivers/scsi/xen-scsiback/interface.c | 6 ++++-- drivers/scsi/xen-scsiback/xenbus.c | 4 ++-- drivers/scsi/xen-scsifront/xenbus.c | 5 +++-- drivers/xen/xen-pciback/xenbus.c | 11 ++++++----- 10 files changed, 35 insertions(+), 25 deletions(-) diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 9e9c8a1..ef7e88b 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -122,7 +122,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid) return blkif; } -static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page, +static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page[], + int nr_pages, unsigned int evtchn) { int err; @@ -131,7 +132,8 @@ static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page, if (blkif->irq) return 0; - err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, &blkif->blk_ring); + err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, + nr_pages, &blkif->blk_ring); if (err < 0) return err; @@ -779,7 +781,7 @@ static int connect_ring(struct backend_info *be) ring_ref, evtchn, be->blkif->blk_protocol, protocol); /* Map the shared frame, irq etc. */ - err = xen_blkif_map(be->blkif, ring_ref, evtchn); + err = xen_blkif_map(be->blkif, &ring_ref, 1, evtchn); if (err) { xenbus_dev_fatal(dev, err, "mapping ring-ref %lu port %u", ring_ref, evtchn); diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 2f22874..2c6443a 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -827,6 +827,7 @@ static int setup_blkring(struct xenbus_device *dev, { struct blkif_sring *sring; int err; + int grefs[1]; info->ring_ref = GRANT_INVALID_REF; @@ -840,13 +841,13 @@ static int setup_blkring(struct xenbus_device *dev, sg_init_table(info->sg, BLKIF_MAX_SEGMENTS_PER_REQUEST); - err = xenbus_grant_ring(dev, virt_to_mfn(info->ring.sring)); + err = xenbus_grant_ring(dev, info->ring.sring, 1, grefs); if (err < 0) { free_page((unsigned long)sring); info->ring.sring = NULL; goto fail; } - info->ring_ref = err; + info->ring_ref = grefs[0]; err = xenbus_alloc_evtchn(dev, &info->evtchn); if (err) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 384f4e5..cb1a661 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1440,7 +1440,7 @@ int xenvif_map_frontend_rings(struct xenvif *vif, int err = -ENOMEM; err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif), - tx_ring_ref, &addr); + &tx_ring_ref, 1, &addr); if (err) goto err; @@ -1448,7 +1448,7 @@ int xenvif_map_frontend_rings(struct xenvif *vif, BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE); err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif), - rx_ring_ref, &addr); + &rx_ring_ref, 1, &addr); if (err) goto err; diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 01f589d..b7ff815 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -1482,6 +1482,7 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) struct xen_netif_tx_sring *txs; struct xen_netif_rx_sring *rxs; int err; + int grefs[1]; struct net_device *netdev = info->netdev; 
info->tx_ring_ref = GRANT_INVALID_REF; @@ -1505,13 +1506,13 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) SHARED_RING_INIT(txs); FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE); - err = xenbus_grant_ring(dev, virt_to_mfn(txs)); + err = xenbus_grant_ring(dev, txs, 1, grefs); if (err < 0) { free_page((unsigned long)txs); goto fail; } - info->tx_ring_ref = err; + info->tx_ring_ref = grefs[0]; rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH); if (!rxs) { err = -ENOMEM; @@ -1521,12 +1522,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) SHARED_RING_INIT(rxs); FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE); - err = xenbus_grant_ring(dev, virt_to_mfn(rxs)); + err = xenbus_grant_ring(dev, rxs, 1, grefs); if (err < 0) { free_page((unsigned long)rxs); goto fail; } - info->rx_ring_ref = err; + info->rx_ring_ref = grefs[0]; err = xenbus_alloc_evtchn(dev, &info->evtchn); if (err) diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c index 7cf3d2f..394c926 100644 --- a/drivers/pci/xen-pcifront.c +++ b/drivers/pci/xen-pcifront.c @@ -767,12 +767,13 @@ static int pcifront_publish_info(struct pcifront_device *pdev) { int err = 0; struct xenbus_transaction trans; + int grefs[1]; - err = xenbus_grant_ring(pdev->xdev, virt_to_mfn(pdev->sh_info)); + err = xenbus_grant_ring(pdev->xdev, pdev->sh_info, 1, grefs); if (err < 0) goto out; - pdev->gnt_ref = err; + pdev->gnt_ref = grefs[0]; err = xenbus_alloc_evtchn(pdev->xdev, &pdev->evtchn); if (err) diff --git a/drivers/scsi/xen-scsiback/common.h b/drivers/scsi/xen-scsiback/common.h index dafa79e..4d13617 100644 --- a/drivers/scsi/xen-scsiback/common.h +++ b/drivers/scsi/xen-scsiback/common.h @@ -150,7 +150,8 @@ typedef struct { irqreturn_t scsiback_intr(int, void *); int scsiback_init_sring(struct vscsibk_info *info, - unsigned long ring_ref, unsigned int evtchn); + int ring_ref[], int nr_refs, + unsigned int evtchn); int scsiback_schedule(void *data); diff --git a/drivers/scsi/xen-scsiback/interface.c b/drivers/scsi/xen-scsiback/interface.c index 663568e..fad0a63 100644 --- a/drivers/scsi/xen-scsiback/interface.c +++ b/drivers/scsi/xen-scsiback/interface.c @@ -60,7 +60,8 @@ struct vscsibk_info *vscsibk_info_alloc(domid_t domid) } int scsiback_init_sring(struct vscsibk_info *info, - unsigned long ring_ref, unsigned int evtchn) + int ring_ref[], int nr_refs, + unsigned int evtchn) { struct vscsiif_sring *sring; int err; @@ -73,7 +74,8 @@ int scsiback_init_sring(struct vscsibk_info *info, return -1; } - err = xenbus_map_ring_valloc(info->dev, ring_ref, &info->ring_area); + err = xenbus_map_ring_valloc(info->dev, ring_ref, nr_refs, + &info->ring_area); if (err < 0) return -ENOMEM; diff --git a/drivers/scsi/xen-scsiback/xenbus.c b/drivers/scsi/xen-scsiback/xenbus.c index 2869f89..81d5598 100644 --- a/drivers/scsi/xen-scsiback/xenbus.c +++ b/drivers/scsi/xen-scsiback/xenbus.c @@ -60,7 +60,7 @@ static int __vscsiif_name(struct backend_info *be, char *buf) static int scsiback_map(struct backend_info *be) { struct xenbus_device *dev = be->dev; - unsigned long ring_ref = 0; + int ring_ref = 0; unsigned int evtchn = 0; int err; char name[TASK_COMM_LEN]; @@ -72,7 +72,7 @@ static int scsiback_map(struct backend_info *be) xenbus_dev_fatal(dev, err, "reading %s ring", dev->otherend); return err; } - err = scsiback_init_sring(be->info, ring_ref, evtchn); + err = scsiback_init_sring(be->info, &ring_ref, 1, evtchn); if (err) return err; diff --git 
a/drivers/scsi/xen-scsifront/xenbus.c b/drivers/scsi/xen-scsifront/xenbus.c index bc5c289..8726410 100644 --- a/drivers/scsi/xen-scsifront/xenbus.c +++ b/drivers/scsi/xen-scsifront/xenbus.c @@ -60,6 +60,7 @@ static int scsifront_alloc_ring(struct vscsifrnt_info *info) struct xenbus_device *dev = info->dev; struct vscsiif_sring *sring; int err = -ENOMEM; + int grefs[1]; info->ring_ref = GRANT_INVALID_REF; @@ -73,14 +74,14 @@ static int scsifront_alloc_ring(struct vscsifrnt_info *info) SHARED_RING_INIT(sring); FRONT_RING_INIT(&info->ring, sring, PAGE_SIZE); - err = xenbus_grant_ring(dev, virt_to_mfn(sring)); + err = xenbus_grant_ring(dev, sring, 1, grefs); if (err < 0) { free_page((unsigned long) sring); info->ring.sring = NULL; xenbus_dev_fatal(dev, err, "fail to grant shared ring (Front to Back)"); goto free_sring; } - info->ring_ref = err; + info->ring_ref = grefs[0]; err = xenbus_alloc_evtchn(dev, &info->evtchn); if (err) diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c index 5a42ae7..0d8a98c 100644 --- a/drivers/xen/xen-pciback/xenbus.c +++ b/drivers/xen/xen-pciback/xenbus.c @@ -98,17 +98,18 @@ static void free_pdev(struct xen_pcibk_device *pdev) kfree(pdev); } -static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref, - int remote_evtchn) +static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref[], + int nr_grefs, + int remote_evtchn) { int err = 0; void *vaddr; dev_dbg(&pdev->xdev->dev, "Attaching to frontend resources - gnt_ref=%d evtchn=%d\n", - gnt_ref, remote_evtchn); + gnt_ref[0], remote_evtchn); - err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, &vaddr); + err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, nr_grefs, &vaddr); if (err < 0) { xenbus_dev_fatal(pdev->xdev, err, "Error mapping other domain page in ours."); @@ -172,7 +173,7 @@ static int xen_pcibk_attach(struct xen_pcibk_device *pdev) goto out; } - err = xen_pcibk_do_attach(pdev, gnt_ref, remote_evtchn); + err = xen_pcibk_do_attach(pdev, &gnt_ref, 1, remote_evtchn); if (err) goto out; -- 1.7.2.5
Extend netback to support multi page ring. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 35 +++++++--- drivers/net/xen-netback/interface.c | 43 ++++++++++-- drivers/net/xen-netback/netback.c | 74 ++++++++------------ drivers/net/xen-netback/xenbus.c | 129 +++++++++++++++++++++++++++++++++-- 4 files changed, 213 insertions(+), 68 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index b7d4442..1bb16ec 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -59,10 +59,18 @@ struct xenvif_rx_meta { #define MAX_BUFFER_OFFSET PAGE_SIZE -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) +#define NETBK_TX_RING_SIZE(_nr_pages) \ + (__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))) +#define NETBK_RX_RING_SIZE(_nr_pages) \ + (__CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))) -#define MAX_PENDING_REQS 256 +#define NETBK_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER +#define NETBK_MAX_RING_PAGES (1U << NETBK_MAX_RING_PAGE_ORDER) + +#define NETBK_MAX_TX_RING_SIZE NETBK_TX_RING_SIZE(NETBK_MAX_RING_PAGES) +#define NETBK_MAX_RX_RING_SIZE NETBK_RX_RING_SIZE(NETBK_MAX_RING_PAGES) + +#define MAX_PENDING_REQS NETBK_MAX_TX_RING_SIZE struct xenvif { /* Unique identifier for this interface. */ @@ -83,6 +91,8 @@ struct xenvif { /* The shared rings and indexes. */ struct xen_netif_tx_back_ring tx; struct xen_netif_rx_back_ring rx; + int nr_tx_handles; + int nr_rx_handles; /* Frontend feature information. */ u8 can_sg:1; @@ -132,8 +142,10 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, unsigned int handle); -int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, - unsigned long rx_ring_ref, unsigned int evtchn); +int xenvif_connect(struct xenvif *vif, + unsigned long tx_ring_ref[], unsigned int tx_ring_order, + unsigned long rx_ring_ref[], unsigned int rx_ring_order, + unsigned int evtchn); void xenvif_disconnect(struct xenvif *vif); int xenvif_xenbus_init(void); @@ -146,10 +158,12 @@ int xenvif_rx_ring_full(struct xenvif *vif); int xenvif_must_stop_queue(struct xenvif *vif); /* (Un)Map communication rings. 
*/ -void xenvif_unmap_frontend_rings(struct xenvif *vif); -int xenvif_map_frontend_rings(struct xenvif *vif, - grant_ref_t tx_ring_ref, - grant_ref_t rx_ring_ref); +void xenvif_unmap_frontend_ring(struct xenvif *vif, void *addr); +int xenvif_map_frontend_ring(struct xenvif *vif, + void **vaddr, + int domid, + int ring_ref[], + unsigned int ring_ref_count); /* Check for SKBs from frontend and schedule backend processing */ void xenvif_check_rx_xenvif(struct xenvif *vif); @@ -167,4 +181,7 @@ void xenvif_rx_action(struct xenvif *vif); int xenvif_kthread(void *data); +extern unsigned int MODPARM_netback_max_tx_ring_page_order; +extern unsigned int MODPARM_netback_max_rx_ring_page_order; + #endif /* __XEN_NETBACK__COMMON_H__ */ diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index b2bde8f..e1aa003 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -319,10 +319,16 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, return vif; } -int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, - unsigned long rx_ring_ref, unsigned int evtchn) +int xenvif_connect(struct xenvif *vif, + unsigned long tx_ring_ref[], unsigned int tx_ring_ref_count, + unsigned long rx_ring_ref[], unsigned int rx_ring_ref_count, + unsigned int evtchn) { int err = -ENOMEM; + void *addr; + struct xen_netif_tx_sring *txs; + struct xen_netif_rx_sring *rxs; + int tmp[NETBK_MAX_RING_PAGES], i; /* Already connected through? */ if (vif->irq) @@ -330,15 +336,33 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, __module_get(THIS_MODULE); - err = xenvif_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref); - if (err < 0) + for (i = 0; i < tx_ring_ref_count; i++) + tmp[i] = tx_ring_ref[i]; + + err = xenvif_map_frontend_ring(vif, &addr, vif->domid, + tmp, tx_ring_ref_count); + if (err) goto err; + txs = addr; + BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE * tx_ring_ref_count); + vif->nr_tx_handles = tx_ring_ref_count; + + for (i = 0; i < rx_ring_ref_count; i++) + tmp[i] = rx_ring_ref[i]; + + err = xenvif_map_frontend_ring(vif, &addr, vif->domid, + tmp, rx_ring_ref_count); + if (err) + goto err_tx_unmap; + rxs = addr; + BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE * rx_ring_ref_count); + vif->nr_rx_handles = rx_ring_ref_count; err = bind_interdomain_evtchn_to_irqhandler( vif->domid, evtchn, xenvif_interrupt, 0, vif->dev->name, vif); if (err < 0) - goto err_unmap; + goto err_rx_unmap; vif->irq = err; disable_irq(vif->irq); @@ -366,8 +390,10 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, return 0; err_unbind: unbind_from_irqhandler(vif->irq, vif); -err_unmap: - xenvif_unmap_frontend_rings(vif); +err_rx_unmap: + xenvif_unmap_frontend_ring(vif, (void *)vif->tx.sring); +err_tx_unmap: + xenvif_unmap_frontend_ring(vif, (void *)vif->rx.sring); err: module_put(THIS_MODULE); return err; @@ -400,7 +426,8 @@ void xenvif_disconnect(struct xenvif *vif) unregister_netdev(vif->dev); - xenvif_unmap_frontend_rings(vif); + xenvif_unmap_frontend_ring(vif, (void *)vif->tx.sring); + xenvif_unmap_frontend_ring(vif, (void *)vif->rx.sring); free_netdev(vif->dev); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index cb1a661..60c8951 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -49,6 +49,17 @@ #include <asm/xen/hypercall.h> #include <asm/xen/page.h> +unsigned int MODPARM_netback_max_rx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER; 
+module_param_named(netback_max_rx_ring_page_order, + MODPARM_netback_max_rx_ring_page_order, uint, 0); +MODULE_PARM_DESC(netback_max_rx_ring_page_order, + "Maximum supported receiver ring page order"); + +unsigned int MODPARM_netback_max_tx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER; +module_param_named(netback_max_tx_ring_page_order, + MODPARM_netback_max_tx_ring_page_order, uint, 0); +MODULE_PARM_DESC(netback_max_tx_ring_page_order, + "Maximum supported transmitter ring page order"); DEFINE_PER_CPU(struct gnttab_copy *, tx_copy_ops); @@ -134,7 +145,8 @@ int xenvif_rx_ring_full(struct xenvif *vif) RING_IDX needed = max_required_rx_slots(vif); return ((vif->rx.sring->req_prod - peek) < needed) || - ((vif->rx.rsp_prod_pvt + XEN_NETIF_RX_RING_SIZE - peek) < needed); + ((vif->rx.rsp_prod_pvt + + NETBK_RX_RING_SIZE(vif->nr_rx_handles) - peek) < needed); } int xenvif_must_stop_queue(struct xenvif *vif) @@ -520,7 +532,8 @@ void xenvif_rx_action(struct xenvif *vif) __skb_queue_tail(&rxq, skb); /* Filled the batch queue? */ - if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE) + if (count + MAX_SKB_FRAGS >+ NETBK_RX_RING_SIZE(vif->nr_rx_handles)) break; } @@ -532,7 +545,7 @@ void xenvif_rx_action(struct xenvif *vif) return; } - BUG_ON(npo.copy_prod > (2 * XEN_NETIF_RX_RING_SIZE)); + BUG_ON(npo.copy_prod > (2 * NETBK_MAX_RX_RING_SIZE)); ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, gco, npo.copy_prod); BUG_ON(ret != 0); @@ -1419,48 +1432,22 @@ static inline int tx_work_todo(struct xenvif *vif) return 0; } -void xenvif_unmap_frontend_rings(struct xenvif *vif) +void xenvif_unmap_frontend_ring(struct xenvif *vif, void *addr) { - if (vif->tx.sring) - xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), - vif->tx.sring); - if (vif->rx.sring) - xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), - vif->rx.sring); + if (addr) + xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), addr); } -int xenvif_map_frontend_rings(struct xenvif *vif, - grant_ref_t tx_ring_ref, - grant_ref_t rx_ring_ref) +int xenvif_map_frontend_ring(struct xenvif *vif, + void **vaddr, + int domid, + int ring_ref[], + unsigned int ring_ref_count) { - void *addr; - struct xen_netif_tx_sring *txs; - struct xen_netif_rx_sring *rxs; - - int err = -ENOMEM; + int err = 0; err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif), - &tx_ring_ref, 1, &addr); - if (err) - goto err; - - txs = (struct xen_netif_tx_sring *)addr; - BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE); - - err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif), - &rx_ring_ref, 1, &addr); - if (err) - goto err; - - rxs = (struct xen_netif_rx_sring *)addr; - BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE); - - vif->rx_req_cons_peek = 0; - - return 0; - -err: - xenvif_unmap_frontend_rings(vif); + ring_ref, ring_ref_count, vaddr); return err; } @@ -1486,7 +1473,6 @@ int xenvif_kthread(void *data) static int __create_percpu_scratch_space(unsigned int cpu) { - /* Guard against race condition */ if (per_cpu(tx_copy_ops, cpu) || per_cpu(grant_copy_op, cpu) || per_cpu(meta, cpu)) @@ -1502,19 +1488,19 @@ static int __create_percpu_scratch_space(unsigned int cpu) per_cpu(grant_copy_op, cpu) vzalloc_node(sizeof(struct gnttab_copy) - * 2 * XEN_NETIF_RX_RING_SIZE, cpu_to_node(cpu)); + * 2 * NETBK_MAX_RX_RING_SIZE, cpu_to_node(cpu)); if (!per_cpu(grant_copy_op, cpu)) per_cpu(grant_copy_op, cpu) vzalloc(sizeof(struct gnttab_copy) - * 2 * XEN_NETIF_RX_RING_SIZE); + * 2 * NETBK_MAX_RX_RING_SIZE); per_cpu(meta, cpu) = vzalloc_node(sizeof(struct xenvif_rx_meta) - * 2 * XEN_NETIF_RX_RING_SIZE, 
+ * 2 * NETBK_MAX_RX_RING_SIZE, cpu_to_node(cpu)); if (!per_cpu(meta, cpu)) per_cpu(meta, cpu) = vzalloc(sizeof(struct xenvif_rx_meta) - * 2 * XEN_NETIF_RX_RING_SIZE); + * 2 * NETBK_MAX_RX_RING_SIZE); if (!per_cpu(tx_copy_ops, cpu) || !per_cpu(grant_copy_op, cpu) || diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index f1e89ca..79499fc 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -113,6 +113,23 @@ static int netback_probe(struct xenbus_device *dev, message = "writing feature-rx-flip"; goto abort_transaction; } + err = xenbus_printf(xbt, dev->nodename, + "max-tx-ring-page-order", + "%u", + MODPARM_netback_max_tx_ring_page_order); + if (err) { + message = "writing max-tx-ring-page-order"; + goto abort_transaction; + } + + err = xenbus_printf(xbt, dev->nodename, + "max-rx-ring-page-order", + "%u", + MODPARM_netback_max_rx_ring_page_order); + if (err) { + message = "writing max-rx-ring-page-order"; + goto abort_transaction; + } err = xenbus_transaction_end(xbt, 0); } while (err == -EAGAIN); @@ -391,22 +408,108 @@ static int connect_rings(struct backend_info *be) { struct xenvif *vif = be->vif; struct xenbus_device *dev = be->dev; - unsigned long tx_ring_ref, rx_ring_ref; unsigned int evtchn, rx_copy; int err; int val; + unsigned long tx_ring_ref[NETBK_MAX_RING_PAGES]; + unsigned long rx_ring_ref[NETBK_MAX_RING_PAGES]; + unsigned int tx_ring_order; + unsigned int rx_ring_order; err = xenbus_gather(XBT_NIL, dev->otherend, - "tx-ring-ref", "%lu", &tx_ring_ref, - "rx-ring-ref", "%lu", &rx_ring_ref, "event-channel", "%u", &evtchn, NULL); if (err) { xenbus_dev_fatal(dev, err, - "reading %s/ring-ref and event-channel", + "reading %s/event-channel", dev->otherend); return err; } + err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u", + &tx_ring_order); + if (err < 0) { + tx_ring_order = 0; + + err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-ref", "%lu", + &tx_ring_ref[0]); + if (err < 0) { + xenbus_dev_fatal(dev, err, "reading %s/tx-ring-ref", + dev->otherend); + return err; + } + } else { + unsigned int i; + + if (tx_ring_order > MODPARM_netback_max_tx_ring_page_order) { + err = -EINVAL; + + xenbus_dev_fatal(dev, err, + "%s/tx-ring-page-order too big", + dev->otherend); + return err; + } + + for (i = 0; i < (1U << tx_ring_order); i++) { + char ring_ref_name[sizeof("tx-ring-ref") + 2]; + + snprintf(ring_ref_name, sizeof(ring_ref_name), + "tx-ring-ref%u", i); + + err = xenbus_scanf(XBT_NIL, dev->otherend, + ring_ref_name, "%lu", + &tx_ring_ref[i]); + if (err < 0) { + xenbus_dev_fatal(dev, err, + "reading %s/%s", + dev->otherend, + ring_ref_name); + return err; + } + } + } + + err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-order", "%u", + &rx_ring_order); + if (err < 0) { + rx_ring_order = 0; + err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-ref", "%lu", + &rx_ring_ref[0]); + if (err < 0) { + xenbus_dev_fatal(dev, err, "reading %s/rx-ring-ref", + dev->otherend); + return err; + } + } else { + unsigned int i; + + if (rx_ring_order > MODPARM_netback_max_rx_ring_page_order) { + err = -EINVAL; + + xenbus_dev_fatal(dev, err, + "%s/rx-ring-page-order too big", + dev->otherend); + return err; + } + + for (i = 0; i < (1U << rx_ring_order); i++) { + char ring_ref_name[sizeof("rx-ring-ref") + 2]; + + snprintf(ring_ref_name, sizeof(ring_ref_name), + "rx-ring-ref%u", i); + + err = xenbus_scanf(XBT_NIL, dev->otherend, + ring_ref_name, "%lu", + &rx_ring_ref[i]); + if (err < 0) { + xenbus_dev_fatal(dev, err, + "reading 
%s/%s", + dev->otherend, + ring_ref_name); + return err; + } + } + } + err = xenbus_scanf(XBT_NIL, dev->otherend, "request-rx-copy", "%u", &rx_copy); if (err == -ENOENT) { @@ -453,11 +556,23 @@ static int connect_rings(struct backend_info *be) vif->csum = !val; /* Map the shared frame, irq etc. */ - err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn); + err = xenvif_connect(vif, + tx_ring_ref, (1U << tx_ring_order), + rx_ring_ref, (1U << rx_ring_order), + evtchn); if (err) { + int i; xenbus_dev_fatal(dev, err, - "mapping shared-frames %lu/%lu port %u", - tx_ring_ref, rx_ring_ref, evtchn); + "binding port %u", + evtchn); + for (i = 0; i < (1U << tx_ring_order); i++) + xenbus_dev_fatal(dev, err, + "mapping tx ring handle: %lu", + tx_ring_ref[i]); + for (i = 0; i < (1U << rx_ring_order); i++) + xenbus_dev_fatal(dev, err, + "mapping rx ring handle: %lu", + tx_ring_ref[i]); return err; } return 0; -- 1.7.2.5
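As a sketch of the negotiation this enables (xenstore paths abbreviated, values illustrative): netback advertises the largest ring order it accepts in its own backend node, and a frontend that understands the feature replies with a ring order plus one grant reference per page; an old frontend simply keeps writing the single tx-ring-ref / rx-ring-ref keys.

    in the backend node:
        max-tx-ring-page-order = "2"
        max-rx-ring-page-order = "2"

    in the frontend node:
        tx-ring-order = "1"
        tx-ring-ref0  = "<gref>"
        tx-ring-ref1  = "<gref>"
        (rx-ring-order and rx-ring-ref%u are written analogously)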
Originally, netback and netfront only use one event channel to do tx / rx notification. This may cause unnecessary wake-up of NAPI / kthread. When guest tx is completed, netback will only notify tx_irq. If feature-split-event-channels ==0, rx_irq = tx_irq, so RX protocol will just work as expected. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 7 ++- drivers/net/xen-netback/interface.c | 82 +++++++++++++++++++++++++++-------- drivers/net/xen-netback/netback.c | 4 +- drivers/net/xen-netback/xenbus.c | 51 +++++++++++++++++---- 4 files changed, 111 insertions(+), 33 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 1bb16ec..a0497fc 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -85,8 +85,9 @@ struct xenvif { u8 fe_dev_addr[6]; - /* Physical parameters of the comms window. */ - unsigned int irq; + /* When feature-split-event-channels = 0, tx_irq = rx_irq */ + unsigned int tx_irq; + unsigned int rx_irq; /* The shared rings and indexes. */ struct xen_netif_tx_back_ring tx; @@ -145,7 +146,7 @@ struct xenvif *xenvif_alloc(struct device *parent, int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref[], unsigned int tx_ring_order, unsigned long rx_ring_ref[], unsigned int rx_ring_order, - unsigned int evtchn); + unsigned int tx_evtchn, unsigned int rx_evtchn); void xenvif_disconnect(struct xenvif *vif); int xenvif_xenbus_init(void); diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index e1aa003..c6dbd50 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -51,19 +51,35 @@ static int xenvif_rx_schedulable(struct xenvif *vif) return xenvif_schedulable(vif) && !xenvif_rx_ring_full(vif); } -static irqreturn_t xenvif_interrupt(int irq, void *dev_id) +static irqreturn_t xenvif_tx_interrupt(int irq, void *dev_id) { struct xenvif *vif = dev_id; - if (xenvif_rx_schedulable(vif)) - netif_wake_queue(vif->dev); - if (RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)) napi_schedule(&vif->napi); return IRQ_HANDLED; } +static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id) +{ + struct xenvif *vif = dev_id; + + if (xenvif_schedulable(vif) && !xenvif_rx_ring_full(vif)) + netif_wake_queue(vif->dev); + + return IRQ_HANDLED; +} + +static irqreturn_t xenvif_interrupt(int irq, void *dev_id) +{ + xenvif_tx_interrupt(irq, dev_id); + + xenvif_rx_interrupt(irq, dev_id); + + return IRQ_HANDLED; +} + static int xenvif_poll(struct napi_struct *napi, int budget) { struct xenvif *vif = container_of(napi, struct xenvif, napi); @@ -132,14 +148,16 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev) static void xenvif_up(struct xenvif *vif) { napi_enable(&vif->napi); - enable_irq(vif->irq); + enable_irq(vif->tx_irq); + enable_irq(vif->rx_irq); xenvif_check_rx_xenvif(vif); } static void xenvif_down(struct xenvif *vif) { napi_disable(&vif->napi); - disable_irq(vif->irq); + disable_irq(vif->tx_irq); + disable_irq(vif->rx_irq); } static int xenvif_open(struct net_device *dev) @@ -322,7 +340,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref[], unsigned int tx_ring_ref_count, unsigned long rx_ring_ref[], unsigned int rx_ring_ref_count, - unsigned int evtchn) + unsigned int tx_evtchn, unsigned int rx_evtchn) { int err = -ENOMEM; void *addr; @@ -331,7 +349,7 @@ int xenvif_connect(struct xenvif *vif, int 
tmp[NETBK_MAX_RING_PAGES], i; /* Already connected through? */ - if (vif->irq) + if (vif->tx_irq) return 0; __module_get(THIS_MODULE); @@ -358,13 +376,34 @@ int xenvif_connect(struct xenvif *vif, BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE * rx_ring_ref_count); vif->nr_rx_handles = rx_ring_ref_count; - err = bind_interdomain_evtchn_to_irqhandler( - vif->domid, evtchn, xenvif_interrupt, 0, - vif->dev->name, vif); - if (err < 0) - goto err_rx_unmap; - vif->irq = err; - disable_irq(vif->irq); + if (tx_evtchn == rx_evtchn) { /* feature-split-event-channels == 0 */ + err = bind_interdomain_evtchn_to_irqhandler( + vif->domid, tx_evtchn, xenvif_interrupt, 0, + vif->dev->name, vif); + if (err < 0) + goto err_rx_unmap; + vif->tx_irq = vif->rx_irq = err; + disable_irq(vif->tx_irq); + disable_irq(vif->rx_irq); + } else { + err = bind_interdomain_evtchn_to_irqhandler( + vif->domid, tx_evtchn, xenvif_tx_interrupt, 0, + vif->dev->name, vif); + if (err < 0) + goto err_rx_unmap; + vif->tx_irq = err; + disable_irq(vif->tx_irq); + + err = bind_interdomain_evtchn_to_irqhandler( + vif->domid, rx_evtchn, xenvif_rx_interrupt, 0, + vif->dev->name, vif); + if (err < 0) { + unbind_from_irqhandler(vif->tx_irq, vif); + goto err_rx_unmap; + } + vif->rx_irq = err; + disable_irq(vif->rx_irq); + } init_waitqueue_head(&vif->wq); vif->task = kthread_create(xenvif_kthread, @@ -389,7 +428,12 @@ int xenvif_connect(struct xenvif *vif, return 0; err_unbind: - unbind_from_irqhandler(vif->irq, vif); + if (vif->tx_irq == vif->rx_irq) + unbind_from_irqhandler(vif->tx_irq, vif); + else { + unbind_from_irqhandler(vif->tx_irq, vif); + unbind_from_irqhandler(vif->rx_irq, vif); + } err_rx_unmap: xenvif_unmap_frontend_ring(vif, (void *)vif->tx.sring); err_tx_unmap: @@ -419,10 +463,12 @@ void xenvif_disconnect(struct xenvif *vif) del_timer_sync(&vif->credit_timeout); - if (vif->irq) { - unbind_from_irqhandler(vif->irq, vif); + if (vif->tx_irq) { + unbind_from_irqhandler(vif->tx_irq, vif); need_module_put = 1; } + if (vif->tx_irq != vif->rx_irq) + unbind_from_irqhandler(vif->rx_irq, vif); unregister_netdev(vif->dev); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 60c8951..957cf9d 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -622,7 +622,7 @@ void xenvif_rx_action(struct xenvif *vif) } if (need_to_notify) - notify_remote_via_irq(vif->irq); + notify_remote_via_irq(vif->rx_irq); if (!skb_queue_empty(&vif->rx_queue)) xenvif_kick_thread(vif); @@ -1392,7 +1392,7 @@ static void make_tx_response(struct xenvif *vif, vif->tx.rsp_prod_pvt = ++i; RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->tx, notify); if (notify) - notify_remote_via_irq(vif->irq); + notify_remote_via_irq(vif->tx_irq); } static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 79499fc..3772e0c 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -131,6 +131,14 @@ static int netback_probe(struct xenbus_device *dev, goto abort_transaction; } + err = xenbus_printf(xbt, dev->nodename, + "feature-split-event-channels", + "%u", 1); + if (err) { + message = "writing feature-split-event-channels"; + goto abort_transaction; + } + err = xenbus_transaction_end(xbt, 0); } while (err == -EAGAIN); @@ -408,7 +416,7 @@ static int connect_rings(struct backend_info *be) { struct xenvif *vif = be->vif; struct xenbus_device *dev = be->dev; - unsigned int evtchn, rx_copy; + unsigned int 
tx_evtchn, rx_evtchn, rx_copy; int err; int val; unsigned long tx_ring_ref[NETBK_MAX_RING_PAGES]; @@ -417,12 +425,30 @@ static int connect_rings(struct backend_info *be) unsigned int rx_ring_order; err = xenbus_gather(XBT_NIL, dev->otherend, - "event-channel", "%u", &evtchn, NULL); + "event-channel", "%u", &tx_evtchn, NULL); if (err) { - xenbus_dev_fatal(dev, err, - "reading %s/event-channel", - dev->otherend); - return err; + err = xenbus_gather(XBT_NIL, dev->otherend, + "event-channel-tx", "%u", &tx_evtchn, + NULL); + if (err) { + xenbus_dev_fatal(dev, err, + "reading %s/event-channel-tx", + dev->otherend); + return err; + } + err = xenbus_gather(XBT_NIL, dev->otherend, + "event-channel-rx", "%u", &rx_evtchn, + NULL); + if (err) { + xenbus_dev_fatal(dev, err, + "reading %s/event-channel-rx", + dev->otherend); + return err; + } + dev_info(&dev->dev, "split event channels\n"); + } else { + rx_evtchn = tx_evtchn; + dev_info(&dev->dev, "single event channel\n"); } err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u", @@ -559,12 +585,17 @@ static int connect_rings(struct backend_info *be) err = xenvif_connect(vif, tx_ring_ref, (1U << tx_ring_order), rx_ring_ref, (1U << rx_ring_order), - evtchn); + tx_evtchn, rx_evtchn); if (err) { int i; - xenbus_dev_fatal(dev, err, - "binding port %u", - evtchn); + if (tx_evtchn == rx_evtchn) + xenbus_dev_fatal(dev, err, + "binding port %u", + tx_evtchn); + else + xenbus_dev_fatal(dev, err, + "binding tx port %u, rx port %u", + tx_evtchn, rx_evtchn); for (i = 0; i < (1U << tx_ring_order); i++) xenbus_dev_fatal(dev, err, "mapping tx ring handle: %lu", -- 1.7.2.5
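For comparison, the event-channel side of the handshake looks roughly like this (paths abbreviated, port values illustrative): netback writes feature-split-event-channels = 1 into its backend node; a frontend that takes the offer publishes two ports, and netback then notifies tx_irq only for transmit completions and rx_irq only when it has placed packets on the receive ring.

    split channels, in the frontend node:
        event-channel-tx = "<port>"
        event-channel-rx = "<port>"

    single channel (legacy frontend):
        event-channel = "<port>"
        (netback then sets rx_irq = tx_irq, so behaviour is unchanged)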
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netfront.c | 228 ++++++++++++++++++++++++++++++-------------- 1 files changed, 156 insertions(+), 72 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index b7ff815..a1cfb24 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -66,9 +66,18 @@ struct netfront_cb { #define GRANT_INVALID_REF 0 -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256) +#define XENNET_MAX_RING_PAGE_ORDER 2 +#define XENNET_MAX_RING_PAGES (1U << XENNET_MAX_RING_PAGE_ORDER) + +#define NET_TX_RING_SIZE(_nr_pages) \ + __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages)) +#define NET_RX_RING_SIZE(_nr_pages) \ + __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages)) + +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES) +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES) + +#define TX_MAX_TARGET XENNET_MAX_TX_RING_SIZE struct netfront_stats { u64 rx_packets; @@ -84,12 +93,19 @@ struct netfront_info { struct napi_struct napi; + /* Statistics */ + struct netfront_stats __percpu *stats; + + unsigned long rx_gso_checksum_fixup; + unsigned int evtchn; struct xenbus_device *xbdev; spinlock_t tx_lock; struct xen_netif_tx_front_ring tx; - int tx_ring_ref; + int tx_ring_ref[XENNET_MAX_RING_PAGES]; + int tx_ring_page_order; + int tx_ring_pages; /* * {tx,rx}_skbs store outstanding skbuffs. Free tx_skb entries @@ -103,36 +119,33 @@ struct netfront_info { union skb_entry { struct sk_buff *skb; unsigned long link; - } tx_skbs[NET_TX_RING_SIZE]; + } tx_skbs[XENNET_MAX_TX_RING_SIZE]; grant_ref_t gref_tx_head; - grant_ref_t grant_tx_ref[NET_TX_RING_SIZE]; + grant_ref_t grant_tx_ref[XENNET_MAX_TX_RING_SIZE]; unsigned tx_skb_freelist; spinlock_t rx_lock ____cacheline_aligned_in_smp; struct xen_netif_rx_front_ring rx; - int rx_ring_ref; + int rx_ring_ref[XENNET_MAX_RING_PAGES]; + int rx_ring_page_order; + int rx_ring_pages; /* Receive-ring batched refills. 
*/ #define RX_MIN_TARGET 8 #define RX_DFL_MIN_TARGET 64 -#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE, 256) +#define RX_MAX_TARGET XENNET_MAX_RX_RING_SIZE unsigned rx_min_target, rx_max_target, rx_target; struct sk_buff_head rx_batch; struct timer_list rx_refill_timer; - struct sk_buff *rx_skbs[NET_RX_RING_SIZE]; + struct sk_buff *rx_skbs[XENNET_MAX_RX_RING_SIZE]; grant_ref_t gref_rx_head; - grant_ref_t grant_rx_ref[NET_RX_RING_SIZE]; - - unsigned long rx_pfn_array[NET_RX_RING_SIZE]; - struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1]; - struct mmu_update rx_mmu[NET_RX_RING_SIZE]; - - /* Statistics */ - struct netfront_stats __percpu *stats; + grant_ref_t grant_rx_ref[XENNET_MAX_RX_RING_SIZE]; - unsigned long rx_gso_checksum_fixup; + unsigned long rx_pfn_array[XENNET_MAX_RX_RING_SIZE]; + struct multicall_entry rx_mcl[XENNET_MAX_RX_RING_SIZE+1]; + struct mmu_update rx_mmu[XENNET_MAX_RX_RING_SIZE]; }; struct netfront_rx_info { @@ -170,15 +183,15 @@ static unsigned short get_id_from_freelist(unsigned *head, return id; } -static int xennet_rxidx(RING_IDX idx) +static int xennet_rxidx(RING_IDX idx, struct netfront_info *info) { - return idx & (NET_RX_RING_SIZE - 1); + return idx & (NET_RX_RING_SIZE(info->rx_ring_pages) - 1); } static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np, RING_IDX ri) { - int i = xennet_rxidx(ri); + int i = xennet_rxidx(ri, np); struct sk_buff *skb = np->rx_skbs[i]; np->rx_skbs[i] = NULL; return skb; @@ -187,7 +200,7 @@ static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np, static grant_ref_t xennet_get_rx_ref(struct netfront_info *np, RING_IDX ri) { - int i = xennet_rxidx(ri); + int i = xennet_rxidx(ri, np); grant_ref_t ref = np->grant_rx_ref[i]; np->grant_rx_ref[i] = GRANT_INVALID_REF; return ref; @@ -300,7 +313,7 @@ no_skb: skb->dev = dev; - id = xennet_rxidx(req_prod + i); + id = xennet_rxidx(req_prod + i, np); BUG_ON(np->rx_skbs[id]); np->rx_skbs[id] = skb; @@ -596,7 +609,7 @@ static int xennet_close(struct net_device *dev) static void xennet_move_rx_slot(struct netfront_info *np, struct sk_buff *skb, grant_ref_t ref) { - int new = xennet_rxidx(np->rx.req_prod_pvt); + int new = xennet_rxidx(np->rx.req_prod_pvt, np); BUG_ON(np->rx_skbs[new]); np->rx_skbs[new] = skb; @@ -1089,7 +1102,7 @@ static void xennet_release_tx_bufs(struct netfront_info *np) struct sk_buff *skb; int i; - for (i = 0; i < NET_TX_RING_SIZE; i++) { + for (i = 0; i < NET_TX_RING_SIZE(np->tx_ring_pages); i++) { /* Skip over entries which are actually freelist references */ if (skb_entry_is_link(&np->tx_skbs[i])) continue; @@ -1123,7 +1136,7 @@ static void xennet_release_rx_bufs(struct netfront_info *np) spin_lock_bh(&np->rx_lock); - for (id = 0; id < NET_RX_RING_SIZE; id++) { + for (id = 0; id < NET_RX_RING_SIZE(np->rx_ring_pages); id++) { ref = np->grant_rx_ref[id]; if (ref == GRANT_INVALID_REF) { unused++; @@ -1305,13 +1318,13 @@ static struct net_device * __devinit xennet_create_dev(struct xenbus_device *dev /* Initialise tx_skbs as a free chain containing every entry. 
*/ np->tx_skb_freelist = 0; - for (i = 0; i < NET_TX_RING_SIZE; i++) { + for (i = 0; i < XENNET_MAX_TX_RING_SIZE; i++) { skb_entry_set_link(&np->tx_skbs[i], i+1); np->grant_tx_ref[i] = GRANT_INVALID_REF; } /* Clear out rx_skbs */ - for (i = 0; i < NET_RX_RING_SIZE; i++) { + for (i = 0; i < XENNET_MAX_RX_RING_SIZE; i++) { np->rx_skbs[i] = NULL; np->grant_rx_ref[i] = GRANT_INVALID_REF; } @@ -1409,13 +1422,6 @@ static int __devinit netfront_probe(struct xenbus_device *dev, return err; } -static void xennet_end_access(int ref, void *page) -{ - /* This frees the page as a side-effect */ - if (ref != GRANT_INVALID_REF) - gnttab_end_foreign_access(ref, 0, (unsigned long)page); -} - static void xennet_disconnect_backend(struct netfront_info *info) { /* Stop old i/f to prevent errors whilst we rebuild the state. */ @@ -1429,12 +1435,12 @@ static void xennet_disconnect_backend(struct netfront_info *info) unbind_from_irqhandler(info->netdev->irq, info->netdev); info->evtchn = info->netdev->irq = 0; - /* End access and free the pages */ - xennet_end_access(info->tx_ring_ref, info->tx.sring); - xennet_end_access(info->rx_ring_ref, info->rx.sring); + xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring); + free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order); + + xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring); + free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order); - info->tx_ring_ref = GRANT_INVALID_REF; - info->rx_ring_ref = GRANT_INVALID_REF; info->tx.sring = NULL; info->rx.sring = NULL; } @@ -1482,11 +1488,14 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) struct xen_netif_tx_sring *txs; struct xen_netif_rx_sring *rxs; int err; - int grefs[1]; struct net_device *netdev = info->netdev; + unsigned int max_tx_ring_page_order, max_rx_ring_page_order; + int i; - info->tx_ring_ref = GRANT_INVALID_REF; - info->rx_ring_ref = GRANT_INVALID_REF; + for (i = 0; i < XENNET_MAX_RING_PAGES; i++) { + info->tx_ring_ref[i] = GRANT_INVALID_REF; + info->rx_ring_ref[i] = GRANT_INVALID_REF; + } info->rx.sring = NULL; info->tx.sring = NULL; netdev->irq = 0; @@ -1497,50 +1506,91 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) goto fail; } - txs = (struct xen_netif_tx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH); + err = xenbus_scanf(XBT_NIL, info->xbdev->otherend, + "max-tx-ring-page-order", "%u", + &max_tx_ring_page_order); + if (err < 0) { + info->tx_ring_page_order = 0; + dev_info(&dev->dev, "single tx ring\n"); + } else { + info->tx_ring_page_order = max_tx_ring_page_order; + dev_info(&dev->dev, "multi page tx ring, order = %d\n", + max_tx_ring_page_order); + } + info->tx_ring_pages = (1U << info->tx_ring_page_order); + + txs = (struct xen_netif_tx_sring *) + __get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH, + info->tx_ring_page_order); if (!txs) { err = -ENOMEM; xenbus_dev_fatal(dev, err, "allocating tx ring page"); goto fail; } SHARED_RING_INIT(txs); - FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE); + FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE * info->tx_ring_pages); + + err = xenbus_grant_ring(dev, txs, info->tx_ring_pages, + info->tx_ring_ref); + + if (err < 0) + goto grant_tx_ring_fail; - err = xenbus_grant_ring(dev, txs, 1, grefs); + err = xenbus_scanf(XBT_NIL, info->xbdev->otherend, + "max-rx-ring-page-order", "%u", + &max_rx_ring_page_order); if (err < 0) { - free_page((unsigned long)txs); - goto fail; + info->rx_ring_page_order = 0; + dev_info(&dev->dev, "single rx ring\n"); + } else { + 
info->rx_ring_page_order = max_rx_ring_page_order; + dev_info(&dev->dev, "multi page rx ring, order = %d\n", + max_rx_ring_page_order); } + info->rx_ring_pages = (1U << info->rx_ring_page_order); - info->tx_ring_ref = grefs[0]; - rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH); + rxs = (struct xen_netif_rx_sring *) + __get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH, + info->rx_ring_page_order); if (!rxs) { err = -ENOMEM; xenbus_dev_fatal(dev, err, "allocating rx ring page"); - goto fail; + goto alloc_rx_ring_fail; } SHARED_RING_INIT(rxs); - FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE); + FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE * info->rx_ring_pages); - err = xenbus_grant_ring(dev, rxs, 1, grefs); - if (err < 0) { - free_page((unsigned long)rxs); - goto fail; - } - info->rx_ring_ref = grefs[0]; + err = xenbus_grant_ring(dev, rxs, info->rx_ring_pages, + info->rx_ring_ref); + + if (err < 0) + goto grant_rx_ring_fail; + + FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE * info->rx_ring_pages); err = xenbus_alloc_evtchn(dev, &info->evtchn); if (err) - goto fail; + goto alloc_evtchn_fail; err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt, 0, netdev->name, netdev); if (err < 0) - goto fail; + goto bind_fail; netdev->irq = err; + return 0; - fail: +bind_fail: + xenbus_free_evtchn(dev, info->evtchn); +alloc_evtchn_fail: + xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring); +grant_rx_ring_fail: + free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order); +alloc_rx_ring_fail: + xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring); +grant_tx_ring_fail: + free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order); +fail: return err; } @@ -1551,6 +1601,7 @@ static int talk_to_netback(struct xenbus_device *dev, const char *message; struct xenbus_transaction xbt; int err; + int i; /* Create shared ring, alloc event channel. 
*/ err = setup_netfront(dev, info); @@ -1564,18 +1615,50 @@ again: goto destroy_ring; } - err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u", - info->tx_ring_ref); - if (err) { - message = "writing tx ring-ref"; - goto abort_transaction; + if (info->tx_ring_page_order == 0) + err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u", + info->tx_ring_ref[0]); + else { + err = xenbus_printf(xbt, dev->nodename, "tx-ring-order", "%u", + info->tx_ring_page_order); + if (err) { + message = "writing tx ring-ref"; + goto abort_transaction; + } + for (i = 0; i < info->tx_ring_pages; i++) { + char name[sizeof("tx-ring-ref")+2]; + snprintf(name, sizeof(name), "tx-ring-ref%u", i); + err = xenbus_printf(xbt, dev->nodename, name, "%u", + info->tx_ring_ref[i]); + if (err) { + message = "writing tx ring-ref"; + goto abort_transaction; + } + } } - err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u", - info->rx_ring_ref); - if (err) { - message = "writing rx ring-ref"; - goto abort_transaction; + + if (info->rx_ring_page_order == 0) + err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u", + info->rx_ring_ref[0]); + else { + err = xenbus_printf(xbt, dev->nodename, "rx-ring-order", "%u", + info->rx_ring_page_order); + if (err) { + message = "writing tx ring-ref"; + goto abort_transaction; + } + for (i = 0; i < info->rx_ring_pages; i++) { + char name[sizeof("rx-ring-ref")+2]; + snprintf(name, sizeof(name), "rx-ring-ref%u", i); + err = xenbus_printf(xbt, dev->nodename, name, "%u", + info->rx_ring_ref[i]); + if (err) { + message = "writing rx ring-ref"; + goto abort_transaction; + } + } } + err = xenbus_printf(xbt, dev->nodename, "event-channel", "%u", info->evtchn); if (err) { @@ -1662,7 +1745,8 @@ static int xennet_connect(struct net_device *dev) xennet_release_tx_bufs(np); /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */ - for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) { + for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE(np->rx_ring_pages); + i++) { skb_frag_t *frag; const struct page *page; if (!np->rx_skbs[i]) -- 1.7.2.5
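A back-of-the-envelope check of what the new macros buy, assuming the standard shared-ring layout (roughly 64 bytes of header, 12-byte tx and 8-byte rx request/response unions) and the power-of-two rounding done by __CONST_RING_SIZE:

    NET_TX_RING_SIZE(1) ~ 256 entries    NET_TX_RING_SIZE(4) ~ 1024 entries
    NET_RX_RING_SIZE(1) ~ 256 entries    NET_RX_RING_SIZE(4) ~ 1024 entries

So with XENNET_MAX_RING_PAGE_ORDER = 2 the frontend can keep roughly four times as many requests in flight as before, and TX_MAX_TARGET / RX_MAX_TARGET grow accordingly.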
Wei Liu
2012-Feb-02 16:49 UTC
[RFC PATCH V4 13/13] netfront: split event channel support.
If this feature is not activated, rx_irq = tx_irq. See corresponding netback change log for details. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netfront.c | 178 +++++++++++++++++++++++++++++++++++-------- 1 files changed, 145 insertions(+), 33 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index a1cfb24..9d70665 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -98,7 +98,9 @@ struct netfront_info { unsigned long rx_gso_checksum_fixup; - unsigned int evtchn; + unsigned int tx_evtchn, rx_evtchn; + unsigned int tx_irq, rx_irq; + struct xenbus_device *xbdev; spinlock_t tx_lock; @@ -342,7 +344,7 @@ no_skb: push: RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->rx, notify); if (notify) - notify_remote_via_irq(np->netdev->irq); + notify_remote_via_irq(np->rx_irq); } static int xennet_open(struct net_device *dev) @@ -575,7 +577,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->tx, notify); if (notify) - notify_remote_via_irq(np->netdev->irq); + notify_remote_via_irq(np->tx_irq); u64_stats_update_begin(&stats->syncp); stats->tx_bytes += skb->len; @@ -1240,22 +1242,36 @@ static int xennet_set_features(struct net_device *dev, u32 features) return 0; } -static irqreturn_t xennet_interrupt(int irq, void *dev_id) +static irqreturn_t xennet_tx_interrupt(int irq, void *dev_id) { - struct net_device *dev = dev_id; - struct netfront_info *np = netdev_priv(dev); + struct netfront_info *np = dev_id; + struct net_device *dev = np->netdev; unsigned long flags; spin_lock_irqsave(&np->tx_lock, flags); + xennet_tx_buf_gc(dev); + spin_unlock_irqrestore(&np->tx_lock, flags); - if (likely(netif_carrier_ok(dev))) { - xennet_tx_buf_gc(dev); - /* Under tx_lock: protects access to rx shared-ring indexes. 
*/ - if (RING_HAS_UNCONSUMED_RESPONSES(&np->rx)) - napi_schedule(&np->napi); - } + return IRQ_HANDLED; +} - spin_unlock_irqrestore(&np->tx_lock, flags); +static irqreturn_t xennet_rx_interrupt(int irq, void *dev_id) +{ + struct netfront_info *np = dev_id; + struct net_device *dev = np->netdev; + + if (likely(netif_carrier_ok(dev) && + RING_HAS_UNCONSUMED_RESPONSES(&np->rx))) + napi_schedule(&np->napi); + + return IRQ_HANDLED; +} + +static irqreturn_t xennet_interrupt(int irq, void *dev_id) +{ + xennet_tx_interrupt(irq, dev_id); + + xennet_rx_interrupt(irq, dev_id); return IRQ_HANDLED; } @@ -1431,9 +1447,15 @@ static void xennet_disconnect_backend(struct netfront_info *info) spin_unlock_irq(&info->tx_lock); spin_unlock_bh(&info->rx_lock); - if (info->netdev->irq) - unbind_from_irqhandler(info->netdev->irq, info->netdev); - info->evtchn = info->netdev->irq = 0; + if (info->tx_irq && (info->tx_irq == info->rx_irq)) + unbind_from_irqhandler(info->tx_irq, info); + if (info->tx_irq && (info->tx_irq != info->rx_irq)) { + unbind_from_irqhandler(info->tx_irq, info); + unbind_from_irqhandler(info->rx_irq, info); + } + + info->tx_evtchn = info->tx_irq = 0; + info->rx_evtchn = info->rx_irq = 0; xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring); free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order); @@ -1483,11 +1505,80 @@ static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[]) return 0; } +static int setup_netfront_single(struct netfront_info *info) +{ + int err; + + err = xenbus_alloc_evtchn(info->xbdev, &info->tx_evtchn); + + if (err < 0) + goto fail; + + err = bind_evtchn_to_irqhandler(info->tx_evtchn, + xennet_interrupt, + 0, info->netdev->name, info); + if (err < 0) + goto bind_fail; + info->rx_evtchn = info->tx_evtchn; + info->rx_irq = info->tx_irq = err; + dev_info(&info->xbdev->dev, "single event channel, irq = %d\n", + info->tx_irq); + + return 0; + +bind_fail: + xenbus_free_evtchn(info->xbdev, info->tx_evtchn); +fail: + return err; +} + +static int setup_netfront_split(struct netfront_info *info) +{ + int err; + + err = xenbus_alloc_evtchn(info->xbdev, &info->tx_evtchn); + if (err) + goto fail; + err = xenbus_alloc_evtchn(info->xbdev, &info->rx_evtchn); + if (err) + goto alloc_rx_evtchn_fail; + + err = bind_evtchn_to_irqhandler(info->tx_evtchn, + xennet_tx_interrupt, + 0, info->netdev->name, info); + if (err < 0) + goto bind_tx_fail; + info->tx_irq = err; + err = bind_evtchn_to_irqhandler(info->rx_evtchn, + xennet_rx_interrupt, + 0, info->netdev->name, info); + if (err < 0) + goto bind_rx_fail; + + info->rx_irq = err; + dev_info(&info->xbdev->dev, "split event channels," + " tx_irq = %d, rx_irq = %d\n", + info->tx_irq, info->rx_irq); + + + return 0; + +bind_rx_fail: + unbind_from_irqhandler(info->tx_irq, info); +bind_tx_fail: + xenbus_free_evtchn(info->xbdev, info->rx_evtchn); +alloc_rx_evtchn_fail: + xenbus_free_evtchn(info->xbdev, info->tx_evtchn); +fail: + return err; +} + static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) { struct xen_netif_tx_sring *txs; struct xen_netif_rx_sring *rxs; int err; + unsigned int feature_split_evtchn; struct net_device *netdev = info->netdev; unsigned int max_tx_ring_page_order, max_rx_ring_page_order; int i; @@ -1507,6 +1598,13 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) } err = xenbus_scanf(XBT_NIL, info->xbdev->otherend, + "feature-split-event-channels", "%u", + &feature_split_evtchn); + + if (err < 0) + feature_split_evtchn = 0; + + err = 
xenbus_scanf(XBT_NIL, info->xbdev->otherend, "max-tx-ring-page-order", "%u", &max_tx_ring_page_order); if (err < 0) { @@ -1568,21 +1666,17 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE * info->rx_ring_pages); - err = xenbus_alloc_evtchn(dev, &info->evtchn); - if (err) - goto alloc_evtchn_fail; + if (!feature_split_evtchn) + err = setup_netfront_single(info); + else + err = setup_netfront_split(info); - err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt, - 0, netdev->name, netdev); - if (err < 0) - goto bind_fail; - netdev->irq = err; + if (err) + goto setup_evtchn_failed; return 0; -bind_fail: - xenbus_free_evtchn(dev, info->evtchn); -alloc_evtchn_fail: +setup_evtchn_failed: xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring); grant_rx_ring_fail: free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order); @@ -1659,11 +1753,27 @@ again: } } - err = xenbus_printf(xbt, dev->nodename, - "event-channel", "%u", info->evtchn); - if (err) { - message = "writing event-channel"; - goto abort_transaction; + + if (info->tx_evtchn == info->rx_evtchn) { + err = xenbus_printf(xbt, dev->nodename, + "event-channel", "%u", info->tx_evtchn); + if (err) { + message = "writing event-channel"; + goto abort_transaction; + } + } else { + err = xenbus_printf(xbt, dev->nodename, + "event-channel-tx", "%u", info->tx_evtchn); + if (err) { + message = "writing event-channel-tx"; + goto abort_transaction; + } + err = xenbus_printf(xbt, dev->nodename, + "event-channel-rx", "%u", info->rx_evtchn); + if (err) { + message = "writing event-channel-rx"; + goto abort_transaction; + } } err = xenbus_printf(xbt, dev->nodename, "request-rx-copy", "%u", @@ -1777,7 +1887,9 @@ static int xennet_connect(struct net_device *dev) * packets. */ netif_carrier_on(np->netdev); - notify_remote_via_irq(np->netdev->irq); + notify_remote_via_irq(np->tx_irq); + if (np->tx_irq != np->rx_irq) + notify_remote_via_irq(np->rx_irq); xennet_tx_buf_gc(dev); xennet_alloc_rx_buffers(dev); -- 1.7.2.5
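For reference, the xenbus protocol this patch adds boils down to one backend feature flag and two frontend keys. With split event channels enabled, the frontend writes something like the following instead of the single "event-channel" key (the paths and values are invented for illustration):

/local/domain/0/backend/vif/3/0/feature-split-event-channels = "1"
/local/domain/3/device/vif/0/event-channel-tx = "17"
/local/domain/3/device/vif/0/event-channel-rx = "18"

If the backend does not advertise the feature, or the xenbus_scanf() above fails, the frontend falls back to setup_netfront_single() and the existing single-channel behaviour.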
Eric Dumazet
2012-Feb-02 17:08 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
Le jeudi 02 février 2012 à 16:49 +0000, Wei Liu a écrit :> Enables users to unload netback module. > > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/net/xen-netback/common.h | 1 + > drivers/net/xen-netback/netback.c | 14 ++++++++++++++ > drivers/net/xen-netback/xenbus.c | 5 +++++ > 3 files changed, 20 insertions(+), 0 deletions(-) > > diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h > index 288b2f3..372c7f5 100644 > --- a/drivers/net/xen-netback/common.h > +++ b/drivers/net/xen-netback/common.h > @@ -126,6 +126,7 @@ void xenvif_get(struct xenvif *vif); > void xenvif_put(struct xenvif *vif); > > int xenvif_xenbus_init(void); > +void xenvif_xenbus_exit(void); > > int xenvif_schedulable(struct xenvif *vif); > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c > index d11205f..3059684 100644 > --- a/drivers/net/xen-netback/netback.c > +++ b/drivers/net/xen-netback/netback.c > @@ -1670,5 +1670,19 @@ failed_init: > > module_init(netback_init); >While reviewing this code, I can see current netback_init() is buggy. It assumes all online cpus xen_netbk_group_nr are numbered from 0 to xen_netbk_group_nr-1 This is not right. Instead of using : xen_netbk = vzalloc(sizeof(struct xen_netbk) * xen_netbk_group_nr); You should use a percpu variable to get proper NUMA properties. And instead of looping like : for (group = 0; group < xen_netbk_group_nr; group++) { You must use : for_each_online_cpu(cpu) { ... } [ and also use kthread_create_on_node() instead of kthread_create() ]
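A rough sketch of what Eric is describing (per-CPU state plus NUMA-aware allocation and thread creation) could look like the following. This is illustrative only, reuses the existing xen_netbk_kthread worker, and is not a patch from this series:

static DEFINE_PER_CPU(struct xen_netbk *, netbk_percpu);

static int __init netback_init_sketch(void)
{
	int cpu;

	for_each_online_cpu(cpu) {
		struct xen_netbk *netbk;

		/* Allocate the per-CPU state on the CPU's own NUMA node. */
		netbk = vzalloc_node(sizeof(*netbk), cpu_to_node(cpu));
		if (!netbk)
			return -ENOMEM;

		netbk->task = kthread_create_on_node(xen_netbk_kthread, netbk,
						     cpu_to_node(cpu),
						     "netback/%u", cpu);
		if (IS_ERR(netbk->task)) {
			vfree(netbk);
			return PTR_ERR(netbk->task);
		}

		/*
		 * Binding to an online CPU id is fine; binding to a group
		 * index 0..N-1 is what breaks on sparse CPU numbering.
		 */
		kthread_bind(netbk->task, cpu);

		per_cpu(netbk_percpu, cpu) = netbk;
		wake_up_process(netbk->task);
	}

	return 0;
}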
Le jeudi 02 février 2012 à 16:49 +0000, Wei Liu a écrit :> A global page pool. Since we are moving to 1:1 model netback, it is > better to limit total RAM consumed by all the vifs. > > With this patch, each vif gets page from the pool and puts the page > back when it is finished with the page. > > This pool is only meant to access via exported interfaces. Internals > are subject to change when we discover new requirements for the pool. > > Current exported interfaces include: > > page_pool_init: pool init > page_pool_destroy: pool destruction > page_pool_get: get a page from pool > page_pool_put: put page back to pool > is_in_pool: tell whether a page belongs to the pool > > Current implementation has following defects: > - Global locking > - No starve prevention mechanism / reservation logic > > Global locking tends to cause contention on the pool. No reservation > logic may cause vif to starve. A possible solution to these two > problems will be each vif maintains its local cache and claims a > portion of the pool. However the implementation will be tricky when > coming to pool management, so let''s worry about that later. > > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > ---Hmm, this kind of stuff should be discussed on lkml. I doubt we want yet another memory allocator, with a global lock (contended), and no NUMA properties.> +int page_pool_init() > +{ > + int cpus = 0; > + int i; > + > + cpus = num_online_cpus(); > + pool_size = cpus * ENTRIES_PER_CPU; > + > + pool = vzalloc(sizeof(struct page_pool_entry) * pool_size); > + > + if (!pool) > + return -ENOMEM; > + > + for (i = 0; i < pool_size - 1; i++) > + pool[i].u.fl = i+1; > + pool[pool_size-1].u.fl = INVALID_ENTRY; > + free_count = pool_size; > + free_head = 0; > + > + return 0; > +} > +num_online_cpus() disease once again. code depending on num_online_cpus() is always suspicious.
Wei Liu
2012-Feb-02 17:28 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Thu, 2012-02-02 at 17:08 +0000, Eric Dumazet wrote:> Le jeudi 02 février 2012 à 16:49 +0000, Wei Liu a écrit : > > Enables users to unload netback module. > > > > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > drivers/net/xen-netback/common.h | 1 + > > drivers/net/xen-netback/netback.c | 14 ++++++++++++++ > > drivers/net/xen-netback/xenbus.c | 5 +++++ > > 3 files changed, 20 insertions(+), 0 deletions(-) > > > > diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h > > index 288b2f3..372c7f5 100644 > > --- a/drivers/net/xen-netback/common.h > > +++ b/drivers/net/xen-netback/common.h > > @@ -126,6 +126,7 @@ void xenvif_get(struct xenvif *vif); > > void xenvif_put(struct xenvif *vif); > > > > int xenvif_xenbus_init(void); > > +void xenvif_xenbus_exit(void); > > > > int xenvif_schedulable(struct xenvif *vif); > > > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c > > index d11205f..3059684 100644 > > --- a/drivers/net/xen-netback/netback.c > > +++ b/drivers/net/xen-netback/netback.c > > @@ -1670,5 +1670,19 @@ failed_init: > > > > module_init(netback_init); > > > > While reviewing this code, I can see current netback_init() is buggy. > > It assumes all online cpus xen_netbk_group_nr are numbered from 0 to > xen_netbk_group_nr-1 > > This is not right. >You''re right about this. But this part is destined to get wiped out (in the very near future?) -- see following patches. So I don''t think it is worthy to fix this. Wei.> Instead of using : > > xen_netbk = vzalloc(sizeof(struct xen_netbk) * xen_netbk_group_nr); > > You should use a percpu variable to get proper NUMA properties. > > And instead of looping like : > > for (group = 0; group < xen_netbk_group_nr; group++) { > > You must use : > > for_each_online_cpu(cpu) { > ... > } > > [ and also use kthread_create_on_node() instead of kthread_create() ] > > >
Eric Dumazet
2012-Feb-02 17:48 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Thursday, 02 February 2012 at 17:28 +0000, Wei Liu wrote:

> You're right about this.
>
> But this part is destined to get wiped out (in the very near future?) --
> see following patches. So I don't think it is worthy to fix this.
>

Before adding new bugs, you must fix previous ones.

Do you think stable teams will do the backport of all your upcoming patches?
Ian Campbell
2012-Feb-02 19:59 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Thu, 2012-02-02 at 17:48 +0000, Eric Dumazet wrote:
> On Thursday, 02 February 2012 at 17:28 +0000, Wei Liu wrote:
>
> > You're right about this.
> >
> > But this part is destined to get wiped out (in the very near future?) --
> > see following patches. So I don't think it is worthy to fix this.
>
> Before adding new bugs, you must fix previous ones.

I've never heard of this requirement before! It's a wonder anyone ever gets
anything done.

Anyway, I think it would be reasonable to just remove the kthread_bind call
from this loop. We don't actually want/need a thread per online CPU in any
strict sense; we just want there to be some number of worker threads
available, and ~numcpus at start of day is a good enough approximation for
that number. There have been patches floating around in the past which make
the number of groups a module parameter, which would also be a reasonable
thing to dig out if we weren't just about to remove all this code anyway.

So removing the kthread_bind is good enough for the short term, and for
stable if people feel that is necessary, and we can continue in mainline
with the direction Wei's patches are taking us.

Ian.
Eric Dumazet
2012-Feb-02 20:34 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Thursday, 02 February 2012 at 19:59 +0000, Ian Campbell wrote:
> On Thu, 2012-02-02 at 17:48 +0000, Eric Dumazet wrote:
> > On Thursday, 02 February 2012 at 17:28 +0000, Wei Liu wrote:
> >
> > > You're right about this.
> > >
> > > But this part is destined to get wiped out (in the very near future?) --
> > > see following patches. So I don't think it is worthy to fix this.
> >
> > Before adding new bugs, you must fix previous ones.
>
> I've never heard of this requirement before! It's a wonder anyone ever
> gets anything done.
>
> Anyway, I think it would be reasonable to just remove the kthread_bind
> call from this loop. We don't actually want/need a thread per online CPU
> in any strict sense, we just want there to be some number of worker
> threads available and ~numcpus at start of day is a good enough
> approximation for that number. There have been patches floating around
> in the past which make the number of groups a module parameter which
> would also be a reasonable thing to dig out if we weren't just about to
> remove all this code anyway.
>
> So removing the kthread_bind is good enough for the short term, and for
> stable if people feel that is necessary, and we can continue in mainline
> with the direction Wei's patches are taking us.

That sounds like the right fix.

Why do you think it's not reasonable that I ask for a bug fix?

Next time, don't bother sending patches for review if you don't want
reviewers.
Eric Dumazet
2012-Feb-02 20:37 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
Le jeudi 02 février 2012 à 21:34 +0100, Eric Dumazet a écrit :> That sounds a right fix. > > Why do think its not reasonable that I ask a bug fix ? > > Next time, dont bother send patches for review if you dont want > reviewers.FYI, here the trace you can get right now with this bug : [ 1180.114682] WARNING: at arch/x86/kernel/smp.c:120 native_smp_send_reschedule+0x5b/0x60() [ 1180.114685] Hardware name: ProLiant BL460c G6 [ 1180.114686] Modules linked in: ipmi_devintf nfsd exportfs ipmi_si hpilo bnx2x crc32c libcrc32c mdio [last unloaded: scsi_wait_scan] [ 1180.114697] Pid: 7, comm: migration/1 Not tainted 3.3.0-rc2+ #609 [ 1180.114699] Call Trace: [ 1180.114701] <IRQ> [<ffffffff81037adf>] warn_slowpath_common+0x7f/0xc0 [ 1180.114708] [<ffffffff81037b3a>] warn_slowpath_null+0x1a/0x20 [ 1180.114711] [<ffffffff8101ecfb>] native_smp_send_reschedule+0x5b/0x60 [ 1180.114715] [<ffffffff810744ec>] trigger_load_balance+0x28c/0x370 [ 1180.114720] [<ffffffff8106be14>] scheduler_tick+0x114/0x160 [ 1180.114724] [<ffffffff8104956e>] update_process_times+0x6e/0x90 [ 1180.114729] [<ffffffff8108c614>] tick_sched_timer+0x64/0xc0 [ 1180.114733] [<ffffffff8105fe54>] __run_hrtimer+0x84/0x1f0 [ 1180.114736] [<ffffffff8108c5b0>] ? tick_nohz_handler+0xf0/0xf0 [ 1180.114739] [<ffffffff8103f271>] ? __do_softirq+0x101/0x240 [ 1180.114742] [<ffffffff81060783>] hrtimer_interrupt+0xf3/0x220 [ 1180.114747] [<ffffffff810a90b0>] ? queue_stop_cpus_work+0x100/0x100 [ 1180.114751] [<ffffffff8171ee09>] smp_apic_timer_interrupt+0x69/0x99 [ 1180.114754] [<ffffffff8171dd4b>] apic_timer_interrupt+0x6b/0x70 [ 1180.114756] <EOI> [<ffffffff810a9143>] ? stop_machine_cpu_stop+0x93/0xc0 [ 1180.114761] [<ffffffff810a8da7>] cpu_stopper_thread+0xd7/0x1a0 [ 1180.114766] [<ffffffff81713dc7>] ? __schedule+0x3a7/0x7e0 [ 1180.114768] [<ffffffff81064058>] ? __wake_up_common+0x58/0x90 [ 1180.114771] [<ffffffff810a8cd0>] ? cpu_stop_signal_done+0x40/0x40 [ 1180.114773] [<ffffffff8105b5c3>] kthread+0x93/0xa0 [ 1180.114776] [<ffffffff8171e594>] kernel_thread_helper+0x4/0x10 [ 1180.114779] [<ffffffff8105b530>] ? kthread_freezable_should_stop+0x80/0x80 [ 1180.114781] [<ffffffff8171e590>] ? gs_change+0xb/0xb
Ian Campbell
2012-Feb-02 20:50 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Thu, 2012-02-02 at 20:34 +0000, Eric Dumazet wrote:> Le jeudi 02 février 2012 à 19:59 +0000, Ian Campbell a écrit : > > On Thu, 2012-02-02 at 17:48 +0000, Eric Dumazet wrote: > > > Le jeudi 02 février 2012 à 17:28 +0000, Wei Liu a écrit : > > > > > > > You''re right about this. > > > > > > > > But this part is destined to get wiped out (in the very near future?) -- > > > > see following patches. So I don''t think it is worthy to fix this. > > > > > > > > > > Before adding new bugs, you must fix previous ones. > > > > I''ve never heard of this requirement before! It''s a wonder anyone ever > > gets anything done. > > > > Anyway, I think it would be reasonable to just remove the kthread_bind > > call from this loop. We don''t actually want/need a thread per online CPU > > in any strict sense, we just want there to be some number of worker > > threads available and ~numcpus at start of day is a good enough > > approximation for that number. There have been patches floating around > > in the past which make the number of groups a module parameter which > > would also be a reasonable thing to dig out if we weren''t just about to > > remove all this code anyway. > > > > So removing the kthread_bind is good enough for the short term, and for > > stable if people feel that is necessary, and we can continue in mainline > > with the direction Wei''s patches are taking us. > > > > That sounds a right fix. > > Why do think its not reasonable that I ask a bug fix ?I don''t think it is at all unreasonable to ask for bug fixes but in this case Wei''s series is removing the code in question (which would also undoubtedly fix the bug). As it happens the fix turns out to be simple but if it were complex I would perhaps have disagreed more strongly about spending effort fixing code that is removed 2 patches later, although obviously that would have depended on the specifics of the fix in that case.> Next time, dont bother send patches for review if you dont want > reviewers.The review which you are doing is certainly very much appreciated, I''m sorry if my disagreement over this one point gave/gives the impression that it is not. Ian.
Paul Gortmaker
2012-Feb-02 22:52 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Thu, Feb 2, 2012 at 3:50 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:> On Thu, 2012-02-02 at 20:34 +0000, Eric Dumazet wrote:[...]> > I don''t think it is at all unreasonable to ask for bug fixes but in this > case Wei''s series is removing the code in question (which would also > undoubtedly fix the bug). > > As it happens the fix turns out to be simple but if it were complex I > would perhaps have disagreed more strongly about spending effort fixing > code that is removed 2 patches later, although obviously that would have > depended on the specifics of the fix in that case.Lots of people are relying on git bisect. If you introduce build failures or known bugs into any point in history, you take away from the value in git bisect. Sure, it happens by accident, but it shouldn''t ever be done knowingly. Paul.> >> Next time, dont bother send patches for review if you dont want >> reviewers. > > The review which you are doing is certainly very much appreciated, I''m > sorry if my disagreement over this one point gave/gives the impression > that it is not. > > Ian. > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Konrad Rzeszutek Wilk
2012-Feb-03 02:34 UTC
Re: [RFC PATCH V4 09/13] Bundle fix for xen backends and frontends
On Thu, Feb 02, 2012 at 04:49:19PM +0000, Wei Liu wrote:> > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/block/xen-blkback/xenbus.c | 8 +++++--- > drivers/block/xen-blkfront.c | 5 +++-- > drivers/net/xen-netback/netback.c | 4 ++-- > drivers/net/xen-netfront.c | 9 +++++---- > drivers/pci/xen-pcifront.c | 5 +++-- > drivers/scsi/xen-scsiback/common.h | 3 ++- > drivers/scsi/xen-scsiback/interface.c | 6 ++++-- > drivers/scsi/xen-scsiback/xenbus.c | 4 ++-- > drivers/scsi/xen-scsifront/xenbus.c | 5 +++--Heheh. If you could seperate the scsi[back|front] from this patchset that would be great. The reason is that SCSI front/back aren''t yet ready for upstream.> drivers/xen/xen-pciback/xenbus.c | 11 ++++++----- > 10 files changed, 35 insertions(+), 25 deletions(-) > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c > index 9e9c8a1..ef7e88b 100644 > --- a/drivers/block/xen-blkback/xenbus.c > +++ b/drivers/block/xen-blkback/xenbus.c > @@ -122,7 +122,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid) > return blkif; > } > > -static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page, > +static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page[], > + int nr_pages, > unsigned int evtchn) > { > int err; > @@ -131,7 +132,8 @@ static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page, > if (blkif->irq) > return 0; > > - err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, &blkif->blk_ring); > + err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, > + nr_pages, &blkif->blk_ring); > if (err < 0) > return err; > > @@ -779,7 +781,7 @@ static int connect_ring(struct backend_info *be) > ring_ref, evtchn, be->blkif->blk_protocol, protocol); > > /* Map the shared frame, irq etc. 
*/ > - err = xen_blkif_map(be->blkif, ring_ref, evtchn); > + err = xen_blkif_map(be->blkif, &ring_ref, 1, evtchn); > if (err) { > xenbus_dev_fatal(dev, err, "mapping ring-ref %lu port %u", > ring_ref, evtchn); > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c > index 2f22874..2c6443a 100644 > --- a/drivers/block/xen-blkfront.c > +++ b/drivers/block/xen-blkfront.c > @@ -827,6 +827,7 @@ static int setup_blkring(struct xenbus_device *dev, > { > struct blkif_sring *sring; > int err; > + int grefs[1]; > > info->ring_ref = GRANT_INVALID_REF; > > @@ -840,13 +841,13 @@ static int setup_blkring(struct xenbus_device *dev, > > sg_init_table(info->sg, BLKIF_MAX_SEGMENTS_PER_REQUEST); > > - err = xenbus_grant_ring(dev, virt_to_mfn(info->ring.sring)); > + err = xenbus_grant_ring(dev, info->ring.sring, 1, grefs); > if (err < 0) { > free_page((unsigned long)sring); > info->ring.sring = NULL; > goto fail; > } > - info->ring_ref = err; > + info->ring_ref = grefs[0]; > > err = xenbus_alloc_evtchn(dev, &info->evtchn); > if (err) > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c > index 384f4e5..cb1a661 100644 > --- a/drivers/net/xen-netback/netback.c > +++ b/drivers/net/xen-netback/netback.c > @@ -1440,7 +1440,7 @@ int xenvif_map_frontend_rings(struct xenvif *vif, > int err = -ENOMEM; > > err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif), > - tx_ring_ref, &addr); > + &tx_ring_ref, 1, &addr); > if (err) > goto err; > > @@ -1448,7 +1448,7 @@ int xenvif_map_frontend_rings(struct xenvif *vif, > BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE); > > err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif), > - rx_ring_ref, &addr); > + &rx_ring_ref, 1, &addr); > if (err) > goto err; > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index 01f589d..b7ff815 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -1482,6 +1482,7 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) > struct xen_netif_tx_sring *txs; > struct xen_netif_rx_sring *rxs; > int err; > + int grefs[1]; > struct net_device *netdev = info->netdev; > > info->tx_ring_ref = GRANT_INVALID_REF; > @@ -1505,13 +1506,13 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) > SHARED_RING_INIT(txs); > FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE); > > - err = xenbus_grant_ring(dev, virt_to_mfn(txs)); > + err = xenbus_grant_ring(dev, txs, 1, grefs); > if (err < 0) { > free_page((unsigned long)txs); > goto fail; > } > > - info->tx_ring_ref = err; > + info->tx_ring_ref = grefs[0]; > rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH); > if (!rxs) { > err = -ENOMEM; > @@ -1521,12 +1522,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) > SHARED_RING_INIT(rxs); > FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE); > > - err = xenbus_grant_ring(dev, virt_to_mfn(rxs)); > + err = xenbus_grant_ring(dev, rxs, 1, grefs); > if (err < 0) { > free_page((unsigned long)rxs); > goto fail; > } > - info->rx_ring_ref = err; > + info->rx_ring_ref = grefs[0]; > > err = xenbus_alloc_evtchn(dev, &info->evtchn); > if (err) > diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c > index 7cf3d2f..394c926 100644 > --- a/drivers/pci/xen-pcifront.c > +++ b/drivers/pci/xen-pcifront.c > @@ -767,12 +767,13 @@ static int pcifront_publish_info(struct pcifront_device *pdev) > { > int err = 0; > struct xenbus_transaction trans; > + int grefs[1]; > > - err = 
xenbus_grant_ring(pdev->xdev, virt_to_mfn(pdev->sh_info)); > + err = xenbus_grant_ring(pdev->xdev, pdev->sh_info, 1, grefs); > if (err < 0) > goto out; > > - pdev->gnt_ref = err; > + pdev->gnt_ref = grefs[0]; > > err = xenbus_alloc_evtchn(pdev->xdev, &pdev->evtchn); > if (err) > diff --git a/drivers/scsi/xen-scsiback/common.h b/drivers/scsi/xen-scsiback/common.h > index dafa79e..4d13617 100644 > --- a/drivers/scsi/xen-scsiback/common.h > +++ b/drivers/scsi/xen-scsiback/common.h > @@ -150,7 +150,8 @@ typedef struct { > > irqreturn_t scsiback_intr(int, void *); > int scsiback_init_sring(struct vscsibk_info *info, > - unsigned long ring_ref, unsigned int evtchn); > + int ring_ref[], int nr_refs, > + unsigned int evtchn); > int scsiback_schedule(void *data); > > > diff --git a/drivers/scsi/xen-scsiback/interface.c b/drivers/scsi/xen-scsiback/interface.c > index 663568e..fad0a63 100644 > --- a/drivers/scsi/xen-scsiback/interface.c > +++ b/drivers/scsi/xen-scsiback/interface.c > @@ -60,7 +60,8 @@ struct vscsibk_info *vscsibk_info_alloc(domid_t domid) > } > > int scsiback_init_sring(struct vscsibk_info *info, > - unsigned long ring_ref, unsigned int evtchn) > + int ring_ref[], int nr_refs, > + unsigned int evtchn) > { > struct vscsiif_sring *sring; > int err; > @@ -73,7 +74,8 @@ int scsiback_init_sring(struct vscsibk_info *info, > return -1; > } > > - err = xenbus_map_ring_valloc(info->dev, ring_ref, &info->ring_area); > + err = xenbus_map_ring_valloc(info->dev, ring_ref, nr_refs, > + &info->ring_area); > if (err < 0) > return -ENOMEM; > > diff --git a/drivers/scsi/xen-scsiback/xenbus.c b/drivers/scsi/xen-scsiback/xenbus.c > index 2869f89..81d5598 100644 > --- a/drivers/scsi/xen-scsiback/xenbus.c > +++ b/drivers/scsi/xen-scsiback/xenbus.c > @@ -60,7 +60,7 @@ static int __vscsiif_name(struct backend_info *be, char *buf) > static int scsiback_map(struct backend_info *be) > { > struct xenbus_device *dev = be->dev; > - unsigned long ring_ref = 0; > + int ring_ref = 0; > unsigned int evtchn = 0; > int err; > char name[TASK_COMM_LEN]; > @@ -72,7 +72,7 @@ static int scsiback_map(struct backend_info *be) > xenbus_dev_fatal(dev, err, "reading %s ring", dev->otherend); > return err; > } > - err = scsiback_init_sring(be->info, ring_ref, evtchn); > + err = scsiback_init_sring(be->info, &ring_ref, 1, evtchn); > if (err) > return err; > > diff --git a/drivers/scsi/xen-scsifront/xenbus.c b/drivers/scsi/xen-scsifront/xenbus.c > index bc5c289..8726410 100644 > --- a/drivers/scsi/xen-scsifront/xenbus.c > +++ b/drivers/scsi/xen-scsifront/xenbus.c > @@ -60,6 +60,7 @@ static int scsifront_alloc_ring(struct vscsifrnt_info *info) > struct xenbus_device *dev = info->dev; > struct vscsiif_sring *sring; > int err = -ENOMEM; > + int grefs[1]; > > > info->ring_ref = GRANT_INVALID_REF; > @@ -73,14 +74,14 @@ static int scsifront_alloc_ring(struct vscsifrnt_info *info) > SHARED_RING_INIT(sring); > FRONT_RING_INIT(&info->ring, sring, PAGE_SIZE); > > - err = xenbus_grant_ring(dev, virt_to_mfn(sring)); > + err = xenbus_grant_ring(dev, sring, 1, grefs); > if (err < 0) { > free_page((unsigned long) sring); > info->ring.sring = NULL; > xenbus_dev_fatal(dev, err, "fail to grant shared ring (Front to Back)"); > goto free_sring; > } > - info->ring_ref = err; > + info->ring_ref = grefs[0]; > > err = xenbus_alloc_evtchn(dev, &info->evtchn); > if (err) > diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c > index 5a42ae7..0d8a98c 100644 > --- a/drivers/xen/xen-pciback/xenbus.c > +++ 
b/drivers/xen/xen-pciback/xenbus.c > @@ -98,17 +98,18 @@ static void free_pdev(struct xen_pcibk_device *pdev) > kfree(pdev); > } > > -static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref, > - int remote_evtchn) > +static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref[], > + int nr_grefs, > + int remote_evtchn) > { > int err = 0; > void *vaddr; > > dev_dbg(&pdev->xdev->dev, > "Attaching to frontend resources - gnt_ref=%d evtchn=%d\n", > - gnt_ref, remote_evtchn); > + gnt_ref[0], remote_evtchn); > > - err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, &vaddr); > + err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, nr_grefs, &vaddr); > if (err < 0) { > xenbus_dev_fatal(pdev->xdev, err, > "Error mapping other domain page in ours."); > @@ -172,7 +173,7 @@ static int xen_pcibk_attach(struct xen_pcibk_device *pdev) > goto out; > } > > - err = xen_pcibk_do_attach(pdev, gnt_ref, remote_evtchn); > + err = xen_pcibk_do_attach(pdev, &gnt_ref, 1, remote_evtchn); > if (err) > goto out; > > -- > 1.7.2.5
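As a usage note for the extended interface being reviewed here, granting a multi-page ring from a frontend collapses to a single call. The sketch below assumes a made-up my_front_info structure with an sring pointer and a ring_ref[] array; it only shows the calling convention, not real driver code:

#define MY_RING_ORDER	2
#define MY_RING_PAGES	(1U << MY_RING_ORDER)

static int my_front_grant_ring(struct xenbus_device *dev,
			       struct my_front_info *info)
{
	int err;

	info->sring = (void *)__get_free_pages(GFP_NOIO | __GFP_HIGH |
					       __GFP_ZERO, MY_RING_ORDER);
	if (!info->sring)
		return -ENOMEM;

	/* One grant reference per page is returned in info->ring_ref[]. */
	err = xenbus_grant_ring(dev, info->sring, MY_RING_PAGES,
				info->ring_ref);
	if (err < 0) {
		free_pages((unsigned long)info->sring, MY_RING_ORDER);
		info->sring = NULL;
	}

	return err;
}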
Ian Campbell
2012-Feb-03 06:38 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Thu, 2012-02-02 at 22:52 +0000, Paul Gortmaker wrote:
> On Thu, Feb 2, 2012 at 3:50 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Thu, 2012-02-02 at 20:34 +0000, Eric Dumazet wrote:
>
> [...]
>
> > I don't think it is at all unreasonable to ask for bug fixes but in this
> > case Wei's series is removing the code in question (which would also
> > undoubtedly fix the bug).
> >
> > As it happens the fix turns out to be simple but if it were complex I
> > would perhaps have disagreed more strongly about spending effort fixing
> > code that is removed 2 patches later, although obviously that would have
> > depended on the specifics of the fix in that case.
>
> Lots of people are relying on git bisect. If you introduce build failures
> or known bugs into any point in history, you take away from the value
> in git bisect. Sure, it happens by accident, but it shouldn't ever be
> done knowingly.

Sure. In this case the bug has been there since 2.6.39, it isn't introduced
by this series.

Ian.
Eric Dumazet
2012-Feb-03 07:25 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Friday, 03 February 2012 at 06:38 +0000, Ian Campbell wrote:
> On Thu, 2012-02-02 at 22:52 +0000, Paul Gortmaker wrote:
> > On Thu, Feb 2, 2012 at 3:50 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > On Thu, 2012-02-02 at 20:34 +0000, Eric Dumazet wrote:
> >
> > [...]
> >
> > > I don't think it is at all unreasonable to ask for bug fixes but in this
> > > case Wei's series is removing the code in question (which would also
> > > undoubtedly fix the bug).
> > >
> > > As it happens the fix turns out to be simple but if it were complex I
> > > would perhaps have disagreed more strongly about spending effort fixing
> > > code that is removed 2 patches later, although obviously that would have
> > > depended on the specifics of the fix in that case.
> >
> > Lots of people are relying on git bisect. If you introduce build failures
> > or known bugs into any point in history, you take away from the value
> > in git bisect. Sure, it happens by accident, but it shouldn't ever be
> > done knowingly.
>
> Sure. In this case the bug has been there since 2.6.39, it isn't
> introduced by this series.

We are stuck right now with a bug introduced in 2.6.39 (IP redirects), and
because the fix was done in 3.1, we are unable to provide a fix for the
stable 3.0 kernel.

Something that takes 15 minutes to fix now can take several days of work
later.
Ian Campbell
2012-Feb-03 08:02 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Fri, 2012-02-03 at 07:25 +0000, Eric Dumazet wrote:> Le vendredi 03 février 2012 à 06:38 +0000, Ian Campbell a écrit : > > On Thu, 2012-02-02 at 22:52 +0000, Paul Gortmaker wrote: > > > On Thu, Feb 2, 2012 at 3:50 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > > > On Thu, 2012-02-02 at 20:34 +0000, Eric Dumazet wrote: > > > > > > [...] > > > > > > > > > > > I don''t think it is at all unreasonable to ask for bug fixes but in this > > > > case Wei''s series is removing the code in question (which would also > > > > undoubtedly fix the bug). > > > > > > > > As it happens the fix turns out to be simple but if it were complex I > > > > would perhaps have disagreed more strongly about spending effort fixing > > > > code that is removed 2 patches later, although obviously that would have > > > > depended on the specifics of the fix in that case. > > > > > > Lots of people are relying on git bisect. If you introduce build failures > > > or known bugs into any point in history, you take away from the value > > > in git bisect. Sure, it happens by accident, but it shouldn''t ever be > > > done knowingly. > > > > Sure. In this case the bug has been there since 2.6.39, it isn''t > > introduced by this series. > > > > We are stuck right now with a bug introduced in 2.6.39, (IP redirects), > and because fix was done in 3.1, we are unable to provide a fix fo > stable 3.0 kernel. > > Something that takes 15 minutes to fix now, can take several days of > work later.Sure. Here is the patch. I''ve compile tested it but not run it yet since I''m supposed to be packing for a trip, I''ll be back on Wednesday. It seems straight forward enough though. 8<-------------------------------- From 6f3d3068f6e049c2d810f9fc667d57667bea77dc Mon Sep 17 00:00:00 2001 From: Ian Campbell <ian.campbell@citrix.com> Date: Fri, 3 Feb 2012 07:47:23 +0000 Subject: [PATCH] xen: netback: do not bind netback threads to specific CPUs netback_init does not take proper account of which CPUs is online. However we don''t require a thread per CPU, just a pool of worker threads, of which the number of CPUs at start of day is as good a number as any. Therefore do not bind netback threads to particular CPUs. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Wei Lui <wei.lui2@citrix.com> --- drivers/net/xen-netback/netback.c | 2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 59effac..31ad3ee 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1670,8 +1670,6 @@ static int __init netback_init(void) goto failed_init; } - kthread_bind(netbk->task, group); - INIT_LIST_HEAD(&netbk->net_schedule_list); spin_lock_init(&netbk->net_schedule_list_lock); -- 1.7.2.5
Wei Liu
2012-Feb-03 11:27 UTC
Re: [RFC PATCH V4 02/13] netback: add module unload function.
On Fri, 2012-02-03 at 07:25 +0000, Eric Dumazet wrote:
>
> We are stuck right now with a bug introduced in 2.6.39 (IP redirects), and
> because the fix was done in 3.1, we are unable to provide a fix for the
> stable 3.0 kernel.
>
> Something that takes 15 minutes to fix now can take several days of work
> later.

You're right. Will stick Ian's patch in front of my next series.

Thanks for your effort in reviewing.

Wei.
Konrad Rzeszutek Wilk
2012-Feb-03 16:55 UTC
Re: [RFC PATCH V4 08/13] xenbus_client: extend interface to support mapping / unmapping of multi page ring.
On Thu, Feb 02, 2012 at 04:49:18PM +0000, Wei Liu wrote:
>

So this does the job, but with this patch you introduce a compile bisection
bug, which is not good. The way around it is that in this patch you also
introduce temporary scaffolding so that the drivers can build. Something as
simple as a function that calls the new version, but with the right
arguments, would do. Then the next patch (the one that actually changes the
backends) will back that wrapper out.
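A minimal sketch of that scaffolding (an assumption, not code from the series): keep a helper with the old single-ref calling convention that forwards to the new multi-page xenbus_map_ring_valloc(), so unconverted drivers keep compiling, and delete it in the patch that converts them. The _compat name here is made up; a similar one-liner could wrap xenbus_grant_ring() for the frontends.

static inline int xenbus_map_ring_valloc_compat(struct xenbus_device *dev,
						int gnt_ref, void **vaddr)
{
	/* Old single-page convention implemented on top of the new call. */
	return xenbus_map_ring_valloc(dev, &gnt_ref, 1, vaddr);
}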
Wei Liu
2012-Feb-03 17:20 UTC
Re: [RFC PATCH V4 08/13] xenbus_client: extend interface to support mapping / unmapping of multi page ring.
On Fri, 2012-02-03 at 16:55 +0000, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 02, 2012 at 04:49:18PM +0000, Wei Liu wrote:
> >
>
> So this does the job, but with this patch you introduce a compile bisection
> bug, which is not good. The way around it is that in this patch you also
> introduce temporary scaffolding so that the drivers can build. Something as
> simple as a function that calls the new version, but with the right
> arguments, would do. Then the next patch (the one that actually changes the
> backends) will back that wrapper out.

How about squashing these two patches? The changes in the backends are
trivial.

Wei.
Konrad Rzeszutek Wilk
2012-Feb-03 17:35 UTC
Re: [RFC PATCH V4 08/13] xenbus_client: extend interface to support mapping / unmapping of multi page ring.
On Fri, Feb 03, 2012 at 05:20:25PM +0000, Wei Liu wrote:
> On Fri, 2012-02-03 at 16:55 +0000, Konrad Rzeszutek Wilk wrote:
> > On Thu, Feb 02, 2012 at 04:49:18PM +0000, Wei Liu wrote:
> > >
> >
> > So this does the job, but with this patch you introduce a compile bisection
> > bug, which is not good. The way around it is that in this patch you also
> > introduce temporary scaffolding so that the drivers can build. Something as
> > simple as a function that calls the new version, but with the right
> > arguments, would do. Then the next patch (the one that actually changes the
> > backends) will back that wrapper out.
>
> How about squashing these two patches? The changes in the backends are
> trivial.

That could be done as well.
Konrad Rzeszutek Wilk
2012-Feb-06 17:21 UTC
Re: [RFC PATCH V4 08/13] xenbus_client: extend interface to support mapping / unmapping of multi page ring.
On Fri, Feb 03, 2012 at 05:20:25PM +0000, Wei Liu wrote:
> On Fri, 2012-02-03 at 16:55 +0000, Konrad Rzeszutek Wilk wrote:
> > On Thu, Feb 02, 2012 at 04:49:18PM +0000, Wei Liu wrote:
> > >
> >
> > So this does the job, but with this patch you introduce a compile bisection
> > bug, which is not good. The way around it is that in this patch you also
> > introduce temporary scaffolding so that the drivers can build. Something as
> > simple as a function that calls the new version, but with the right
> > arguments, would do. Then the next patch (the one that actually changes the
> > backends) will back that wrapper out.
>
> How about squashing these two patches? The changes in the backends are
> trivial.

One thing I forgot to mention is that since the backends touch different
subsystem maintainers' trees, you will need to get Acks from all of them on
a single patch. That should not be a technical issue - except that some
maintainers can take longer to respond - so your whole patchset might be
delayed by that.
Wei Liu
2012-Feb-06 17:30 UTC
Re: [RFC PATCH V4 08/13] xenbus_client: extend interface to support mapping / unmapping of multi page ring.
On Mon, 2012-02-06 at 17:21 +0000, Konrad Rzeszutek Wilk wrote:
> One thing I forgot to mention is that since the backends touch different
> subsystem maintainers' trees, you will need to get Acks from all of them on
> a single patch. That should not be a technical issue - except that some
> maintainers can take longer to respond - so your whole patchset might be
> delayed by that.

Sure, that's not a big problem.

Wei.
Konrad Rzeszutek Wilk
2012-Feb-15 22:42 UTC
Re: [RFC PATCH V4 12/13] netfront: multi page ring support.
On Thu, Feb 02, 2012 at 04:49:22PM +0000, Wei Liu wrote:> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>It also needs this: From 4cf97c025792cf073edc4d312b962ecc0b3b67ab Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Wed, 15 Feb 2012 17:39:46 -0500 Subject: [PATCH] xen/net: Don''t try to use all of the rings if we are not built for it. Otherwise we end up: BUG: unable to handle kernel paging request at ffff88004000c0c8 IP: [<ffffffff810f1ee4>] free_one_page+0x144/0x410 PGD 1806063 PUD 0 22:22:34 tst007 logger: /etc/xen/scripts/vif-bridge: offline XENBUS_PATH=backend/vif/1/0 00 [#1] SMP CPU 0 Modules linked in: Pid: 17, comm: xenwatch Not tainted 3.2.0upstream #2 Xen HVM domU RIP: 0010:[<ffffffff810f1ee4>] [<ffffffff810f1ee4>] free_one_page+0x144/0x410 RSP: 0018:ffff88003bea3c40 EFLAGS: 00010046 .. snip. Call Trace: [<ffffffff810f2c7f>] __free_pages_ok+0x9f/0xe0 [<ffffffff810f4eab>] __free_pages+0x1b/0x40 [<ffffffff810f4f1a>] free_pages+0x4a/0x60 [<ffffffff8138b33d>] xennet_disconnect_backend+0xbd/0x130 [<ffffffff8138bd88>] talk_to_netback+0x8e8/0x1160 [<ffffffff812f4e28>] ? xenbus_gather+0xd8/0x170 [<ffffffff8138e3bd>] netback_changed+0xcd/0x550 [<ffffffff812f5bb8>] xenbus_otherend_changed+0xa8/0xb0 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/net/xen-netfront.c | 14 +++++++++++++- 1 files changed, 13 insertions(+), 1 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 0223552..1eadd90 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -29,6 +29,8 @@ * IN THE SOFTWARE. */ +#define DEBUG 1 + #include <linux/module.h> #include <linux/kernel.h> #include <linux/netdevice.h> @@ -66,7 +68,7 @@ struct netfront_cb { #define GRANT_INVALID_REF 0 -#define XENNET_MAX_RING_PAGE_ORDER 2 +#define XENNET_MAX_RING_PAGE_ORDER 4 #define XENNET_MAX_RING_PAGES (1U << XENNET_MAX_RING_PAGE_ORDER) #define NET_TX_RING_SIZE(_nr_pages) \ @@ -1611,6 +1613,11 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) info->tx_ring_page_order = 0; dev_info(&dev->dev, "single tx ring\n"); } else { + if (max_tx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) { + dev_warn(&dev->dev, "Backend can do %d pages but we can only do %d!\n", + max_tx_ring_page_order, XENNET_MAX_RING_PAGE_ORDER); + max_tx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER; + } info->tx_ring_page_order = max_tx_ring_page_order; dev_info(&dev->dev, "multi page tx ring, order = %d\n", max_tx_ring_page_order); @@ -1642,6 +1649,11 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) dev_info(&dev->dev, "single rx ring\n"); } else { info->rx_ring_page_order = max_rx_ring_page_order; + if (max_rx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) { + dev_warn(&dev->dev, "Backend can do %d pages but we can only do %d!\n", + max_rx_ring_page_order, XENNET_MAX_RING_PAGE_ORDER); + max_rx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER; + } dev_info(&dev->dev, "multi page rx ring, order = %d\n", max_rx_ring_page_order); } -- 1.7.9.48.g85da4d
David Miller
2012-Feb-15 22:52 UTC
Re: [RFC PATCH V4 12/13] netfront: multi page ring support.
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Wed, 15 Feb 2012 17:42:53 -0500> @@ -29,6 +29,8 @@ > * IN THE SOFTWARE. > */ > > +#define DEBUG 1 > + > #include <linux/module.h> > #include <linux/kernel.h> > #include <linux/netdevice.h>This is never appropriate.
Konrad Rzeszutek Wilk
2012-Feb-15 23:53 UTC
Re: [RFC PATCH V4 12/13] netfront: multi page ring support.
On Wed, Feb 15, 2012 at 05:52:12PM -0500, David Miller wrote:
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Wed, 15 Feb 2012 17:42:53 -0500
>
> > @@ -29,6 +29,8 @@
> >  * IN THE SOFTWARE.
> >  */
> >
> > +#define DEBUG 1
> > +
> >  #include <linux/module.h>
> >  #include <linux/kernel.h>
> >  #include <linux/netdevice.h>
>
> This is never appropriate.

HA! No it is not. Thanks for spotting it. I was thinking that Liu would
actually squash the fix into the patch I responded to.
On Wed, 2012-02-15 at 22:42 +0000, Konrad Rzeszutek Wilk wrote:> On Thu, Feb 02, 2012 at 04:49:22PM +0000, Wei Liu wrote: > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > It also needs this: > > From 4cf97c025792cf073edc4d312b962ecc0b3b67ab Mon Sep 17 00:00:00 2001 > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Date: Wed, 15 Feb 2012 17:39:46 -0500 > Subject: [PATCH] xen/net: Don''t try to use all of the rings if we are not > built for it. > > Otherwise we end up: > > BUG: unable to handle kernel paging request at ffff88004000c0c8 > IP: [<ffffffff810f1ee4>] free_one_page+0x144/0x410 > PGD 1806063 PUD 0 > 22:22:34 tst007 logger: /etc/xen/scripts/vif-bridge: offline XENBUS_PATH=backend/vif/1/0 > 00 [#1] SMP > CPU 0 > Modules linked in: > > Pid: 17, comm: xenwatch Not tainted 3.2.0upstream #2 Xen HVM domU > RIP: 0010:[<ffffffff810f1ee4>] [<ffffffff810f1ee4>] free_one_page+0x144/0x410 > RSP: 0018:ffff88003bea3c40 EFLAGS: 00010046 > .. snip. > Call Trace: > [<ffffffff810f2c7f>] __free_pages_ok+0x9f/0xe0 > [<ffffffff810f4eab>] __free_pages+0x1b/0x40 > [<ffffffff810f4f1a>] free_pages+0x4a/0x60 > [<ffffffff8138b33d>] xennet_disconnect_backend+0xbd/0x130 > [<ffffffff8138bd88>] talk_to_netback+0x8e8/0x1160 > [<ffffffff812f4e28>] ? xenbus_gather+0xd8/0x170 > [<ffffffff8138e3bd>] netback_changed+0xcd/0x550 > [<ffffffff812f5bb8>] xenbus_otherend_changed+0xa8/0xb0 > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > drivers/net/xen-netfront.c | 14 +++++++++++++- > 1 files changed, 13 insertions(+), 1 deletions(-) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index 0223552..1eadd90 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -29,6 +29,8 @@ > * IN THE SOFTWARE. > */ > > +#define DEBUG 1 > + > #include <linux/module.h> > #include <linux/kernel.h> > #include <linux/netdevice.h> > @@ -66,7 +68,7 @@ struct netfront_cb { > > #define GRANT_INVALID_REF 0 > > -#define XENNET_MAX_RING_PAGE_ORDER 2 > +#define XENNET_MAX_RING_PAGE_ORDER 4I guess this is you tuning with page order? And here is not the only one place you changed? As a matter of fact, in the previous patch 8 I encode hard limit 2 on the ring page order, your change here will stop FE / BE from connecting. 
I think I will also need to change this to something like #define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER to remind people to modify that value.> #define XENNET_MAX_RING_PAGES (1U << XENNET_MAX_RING_PAGE_ORDER) > > #define NET_TX_RING_SIZE(_nr_pages) \ > @@ -1611,6 +1613,11 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) > info->tx_ring_page_order = 0; > dev_info(&dev->dev, "single tx ring\n"); > } else { > + if (max_tx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) { > + dev_warn(&dev->dev, "Backend can do %d pages but we can only do %d!\n", > + max_tx_ring_page_order, XENNET_MAX_RING_PAGE_ORDER); > + max_tx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER; > + } > info->tx_ring_page_order = max_tx_ring_page_order; > dev_info(&dev->dev, "multi page tx ring, order = %d\n", > max_tx_ring_page_order); > @@ -1642,6 +1649,11 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) > dev_info(&dev->dev, "single rx ring\n"); > } else { > info->rx_ring_page_order = max_rx_ring_page_order; > + if (max_rx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) { > + dev_warn(&dev->dev, "Backend can do %d pages but we can only do %d!\n", > + max_rx_ring_page_order, XENNET_MAX_RING_PAGE_ORDER); > + max_rx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER; > + } > dev_info(&dev->dev, "multi page rx ring, order = %d\n", > max_rx_ring_page_order); > }Thanks for this, I will squash it into my patch. Wei.
On Thu, 2012-02-16 at 10:02 +0000, Wei Liu (Intern) wrote:> On Wed, 2012-02-15 at 22:42 +0000, Konrad Rzeszutek Wilk wrote: > > On Thu, Feb 02, 2012 at 04:49:22PM +0000, Wei Liu wrote: > > > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > > > It also needs this: > > > > From 4cf97c025792cf073edc4d312b962ecc0b3b67ab Mon Sep 17 00:00:00 2001 > > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Date: Wed, 15 Feb 2012 17:39:46 -0500 > > Subject: [PATCH] xen/net: Don''t try to use all of the rings if we are not > > built for it. > > > > Otherwise we end up: > > > > BUG: unable to handle kernel paging request at ffff88004000c0c8 > > IP: [<ffffffff810f1ee4>] free_one_page+0x144/0x410 > > PGD 1806063 PUD 0 > > 22:22:34 tst007 logger: /etc/xen/scripts/vif-bridge: offline XENBUS_PATH=backend/vif/1/0 > > 00 [#1] SMP > > CPU 0 > > Modules linked in: > > > > Pid: 17, comm: xenwatch Not tainted 3.2.0upstream #2 Xen HVM domU > > RIP: 0010:[<ffffffff810f1ee4>] [<ffffffff810f1ee4>] free_one_page+0x144/0x410 > > RSP: 0018:ffff88003bea3c40 EFLAGS: 00010046 > > .. snip. > > Call Trace: > > [<ffffffff810f2c7f>] __free_pages_ok+0x9f/0xe0 > > [<ffffffff810f4eab>] __free_pages+0x1b/0x40 > > [<ffffffff810f4f1a>] free_pages+0x4a/0x60 > > [<ffffffff8138b33d>] xennet_disconnect_backend+0xbd/0x130 > > [<ffffffff8138bd88>] talk_to_netback+0x8e8/0x1160 > > [<ffffffff812f4e28>] ? xenbus_gather+0xd8/0x170 > > [<ffffffff8138e3bd>] netback_changed+0xcd/0x550 > > [<ffffffff812f5bb8>] xenbus_otherend_changed+0xa8/0xb0 > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > drivers/net/xen-netfront.c | 14 +++++++++++++- > > 1 files changed, 13 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > > index 0223552..1eadd90 100644 > > --- a/drivers/net/xen-netfront.c > > +++ b/drivers/net/xen-netfront.c > > @@ -29,6 +29,8 @@ > > * IN THE SOFTWARE. > > */ > > > > +#define DEBUG 1 > > + > > #include <linux/module.h> > > #include <linux/kernel.h> > > #include <linux/netdevice.h> > > @@ -66,7 +68,7 @@ struct netfront_cb { > > > > #define GRANT_INVALID_REF 0 > > > > -#define XENNET_MAX_RING_PAGE_ORDER 2 > > +#define XENNET_MAX_RING_PAGE_ORDER 4 > > I guess this is you tuning with page order? And here is not the only one > place you changed? > > As a matter of fact, in the previous patch 8 I encode hard limit 2 on > the ring page order, your change here will stop FE / BE from connecting. > > I think I will also need to change this to something like > > #define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER > > to remind people to modify that value. >To be more precise on this, tangling with RING_PAGE_ORDER will not affect FE, because the mapping is done in BE. However if you make RING_PAGE_ORDER larger than BE limit, it will fail. So the above #define is actually asking people playing with FE to check BE limit. :-( Wei.
Konrad Rzeszutek Wilk
2012-Feb-16 22:57 UTC
Re: [RFC PATCH V4 12/13] netfront: multi page ring support.
On Thu, Feb 16, 2012 at 10:02:51AM +0000, Wei Liu wrote:> On Wed, 2012-02-15 at 22:42 +0000, Konrad Rzeszutek Wilk wrote: > > On Thu, Feb 02, 2012 at 04:49:22PM +0000, Wei Liu wrote: > > > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > > > It also needs this: > > > > From 4cf97c025792cf073edc4d312b962ecc0b3b67ab Mon Sep 17 00:00:00 2001 > > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Date: Wed, 15 Feb 2012 17:39:46 -0500 > > Subject: [PATCH] xen/net: Don''t try to use all of the rings if we are not > > built for it. > > > > Otherwise we end up: > > > > BUG: unable to handle kernel paging request at ffff88004000c0c8 > > IP: [<ffffffff810f1ee4>] free_one_page+0x144/0x410 > > PGD 1806063 PUD 0 > > 22:22:34 tst007 logger: /etc/xen/scripts/vif-bridge: offline XENBUS_PATH=backend/vif/1/0 > > 00 [#1] SMP > > CPU 0 > > Modules linked in: > > > > Pid: 17, comm: xenwatch Not tainted 3.2.0upstream #2 Xen HVM domU > > RIP: 0010:[<ffffffff810f1ee4>] [<ffffffff810f1ee4>] free_one_page+0x144/0x410 > > RSP: 0018:ffff88003bea3c40 EFLAGS: 00010046 > > .. snip. > > Call Trace: > > [<ffffffff810f2c7f>] __free_pages_ok+0x9f/0xe0 > > [<ffffffff810f4eab>] __free_pages+0x1b/0x40 > > [<ffffffff810f4f1a>] free_pages+0x4a/0x60 > > [<ffffffff8138b33d>] xennet_disconnect_backend+0xbd/0x130 > > [<ffffffff8138bd88>] talk_to_netback+0x8e8/0x1160 > > [<ffffffff812f4e28>] ? xenbus_gather+0xd8/0x170 > > [<ffffffff8138e3bd>] netback_changed+0xcd/0x550 > > [<ffffffff812f5bb8>] xenbus_otherend_changed+0xa8/0xb0 > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > drivers/net/xen-netfront.c | 14 +++++++++++++- > > 1 files changed, 13 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > > index 0223552..1eadd90 100644 > > --- a/drivers/net/xen-netfront.c > > +++ b/drivers/net/xen-netfront.c > > @@ -29,6 +29,8 @@ > > * IN THE SOFTWARE. > > */ > > > > +#define DEBUG 1 > > + > > #include <linux/module.h> > > #include <linux/kernel.h> > > #include <linux/netdevice.h> > > @@ -66,7 +68,7 @@ struct netfront_cb { > > > > #define GRANT_INVALID_REF 0 > > > > -#define XENNET_MAX_RING_PAGE_ORDER 2 > > +#define XENNET_MAX_RING_PAGE_ORDER 4 > > I guess this is you tuning with page order? And here is not the only one > place you changed?Yup. 
Was playing with it and saw this blow up.> > As a matter of fact, in the previous patch 8 I encode hard limit 2 on > the ring page order, your change here will stop FE / BE from connecting.I think it will work OK - it will just use up to 4 instead of the default two.> > I think I will also need to change this to something like > > #define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER > > to remind people to modify that value.<nods>> > > #define XENNET_MAX_RING_PAGES (1U << XENNET_MAX_RING_PAGE_ORDER) > > > > #define NET_TX_RING_SIZE(_nr_pages) \ > > @@ -1611,6 +1613,11 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) > > info->tx_ring_page_order = 0; > > dev_info(&dev->dev, "single tx ring\n"); > > } else { > > + if (max_tx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) { > > + dev_warn(&dev->dev, "Backend can do %d pages but we can only do %d!\n", > > + max_tx_ring_page_order, XENNET_MAX_RING_PAGE_ORDER); > > + max_tx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER; > > + } > > info->tx_ring_page_order = max_tx_ring_page_order; > > dev_info(&dev->dev, "multi page tx ring, order = %d\n", > > max_tx_ring_page_order); > > @@ -1642,6 +1649,11 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info) > > dev_info(&dev->dev, "single rx ring\n"); > > } else { > > info->rx_ring_page_order = max_rx_ring_page_order; > > + if (max_rx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) { > > + dev_warn(&dev->dev, "Backend can do %d pages but we can only do %d!\n", > > + max_rx_ring_page_order, XENNET_MAX_RING_PAGE_ORDER); > > + max_rx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER; > > + } > > dev_info(&dev->dev, "multi page rx ring, order = %d\n", > > max_rx_ring_page_order); > > } > > Thanks for this, I will squash it into my patch.Thanks.> > > Wei.
Konrad Rzeszutek Wilk
2012-Feb-17 15:10 UTC
Re: [RFC PATCH V4 12/13] netfront: multi page ring support.
> > > -#define XENNET_MAX_RING_PAGE_ORDER 2
> > > +#define XENNET_MAX_RING_PAGE_ORDER 4
> >
> > I guess this is you tuning with page order? And here is not the only one
> > place you changed?
> >
> > As a matter of fact, in the previous patch 8 I encode hard limit 2 on
> > the ring page order, your change here will stop FE / BE from connecting.
> >
> > I think I will also need to change this to something like
> >
> > #define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> >
> > to remind people to modify that value.
>
> To be more precise on this, tangling with RING_PAGE_ORDER will not
> affect FE, because the mapping is done in BE. However if you make
> RING_PAGE_ORDER larger than BE limit, it will fail.
>
> So the above #define is actually asking people playing with FE to check
> BE limit. :-(

Say that in two years we decide that the ring order in the FE should be 256,
and we also change that in the backend. Some customers might still be
running with the old backends which advertise only 4. Or vice-versa: the
users run a brand new BE which advertises 256 and the user is running a
frontend that can only do 4. It (the frontend) should be able to safely
negotiate the proper minimum value.
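A sketch of that negotiation on the frontend side might look like the following (illustrative only, and assuming the xenbus keys used elsewhere in this series): read the backend's advertised maximum, then settle on the minimum of the two orders so that old and new ends keep interoperating.

static void xennet_negotiate_tx_order(struct netfront_info *info)
{
	unsigned int be_max_order;
	int err;

	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
			   "max-tx-ring-page-order", "%u", &be_max_order);
	if (err < 0)
		be_max_order = 0;	/* old backend: single-page ring only */

	info->tx_ring_page_order = min_t(unsigned int, be_max_order,
					 XENNET_MAX_RING_PAGE_ORDER);
	info->tx_ring_pages = 1U << info->tx_ring_page_order;
}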
Konrad Rzeszutek Wilk
2012-Feb-17 19:19 UTC
Re: [RFC PATCH V4 01/13] netback: page pool version 1
> Hmm, this kind of stuff should be discussed on lkml.
>
> I doubt we want yet another memory allocator, with a global lock
> (contended), and no NUMA properties.

That should be fixed. Are there any existing memory pools that could be used
instead? I (and I think everybody) am all for using the existing APIs if
they can do the job. I was looking a bit at the dmapool code, but that
requires something we don't have - the 'struct device'. We could manufacture
a fake one, but that just stinks of hack.

It [pagepool] also should use the shrinker API, I think.
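For what hooking the pool into the shrinker API could look like, here is a rough sketch against the shrinker interface of this era (single ->shrink callback). page_pool_reclaim() and page_pool_free_count() are hypothetical helpers, not functions from the series:

static int page_pool_shrink(struct shrinker *s, struct shrink_control *sc)
{
	if (sc->nr_to_scan)
		page_pool_reclaim(sc->nr_to_scan);	/* hand pages back to the kernel */

	return page_pool_free_count();			/* objects still available to free */
}

static struct shrinker page_pool_shrinker = {
	.shrink	= page_pool_shrink,
	.seeks	= DEFAULT_SEEKS,
};

/* register_shrinker(&page_pool_shrinker) in page_pool_init(),
 * unregister_shrinker(&page_pool_shrinker) in page_pool_destroy(). */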
On Fri, 2012-02-17 at 19:19 +0000, Konrad Rzeszutek Wilk wrote:
> > Hmm, this kind of stuff should be discussed on lkml.
> >
> > I doubt we want yet another memory allocator, with a global lock
> > (contended), and no NUMA properties.
>
> That should be fixed. Are there any existing memory pools that could be used
> instead? I (and I think everybody) am all for using the existing APIs if
> they can do the job. I was looking a bit at the dmapool code, but that
> requires something we don't have - the 'struct device'. We could manufacture
> a fake one, but that just stinks of hack.

I've been thinking about this for a long time, so any recommendation is
welcome. It is not my intention to write yet another memory allocator. What
I need is a data structure to track pages owned by netback. Let me state the
requirements of this data structure:

1. limits overall memory used by all vifs (this could also be met by the
   underlying allocator)
2. provides a function to tell whether a particular page is mapped from a
   foreign domain -- is_in_pool() is a surrogate for that
3. provides a function to back-reference the owner vif of the page

To achieve requirement 2, the page pool manipulates the page->mapping field.
To achieve requirement 3, the page pool maintains the idx <-> vif
relationship internally. I think I can use mempool internally for page
allocation, but I still need a data structure to meet the other
requirements.

> It [pagepool] also should use the shrinker API, I think.

This is doable, but let's make everybody happy with the page pool design and
implementation first.

Wei.
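As an illustration of those requirements (and only that; this is not the code in the patch), the bookkeeping can be as small as an entry array indexed by pool slot, with the slot index tagged into page->mapping so that is_in_pool() and the owner lookup stay O(1); requirement 1 is then just a cap on the number of entries, while the pages themselves could come from mempool:

struct page_pool_entry {
	struct page *page;
	struct xenvif *vif;		/* requirement 3: owner back-reference */
};

static struct page_pool_entry *pool;	/* sized to the pool's page limit */

static inline void page_pool_tag(struct page *pg, unsigned int idx)
{
	/* requirement 2: mark the page as pool-owned via page->mapping */
	pg->mapping = (struct address_space *)(unsigned long)((idx << 1) | 1);
}

static inline int is_in_pool(struct page *pg, unsigned int *pidx)
{
	unsigned long v = (unsigned long)pg->mapping;

	if (!(v & 1))
		return 0;
	*pidx = v >> 1;
	return 1;
}

static inline struct xenvif *page_to_vif(struct page *pg)
{
	unsigned int idx;

	return is_in_pool(pg, &idx) ? pool[idx].vif : NULL;
}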