A new netback implementation which includes three major features:

 - Global page pool support
 - NAPI + kthread 1:1 model
 - Netback internal name changes

Changes in V2:
 - Fix minor bugs in V1
 - Embed pending_tx_info into page pool
 - Per-cpu scratch space
 - Notification code path clean up

This patch series is the foundation of future work, so it is better to
get it right first. Patches 1 and 3 have the real meat.

The first benefit of the 1:1 model will be scheduling fairness.

The rationale behind a global page pool is that we need to limit the
overall memory consumed by all vifs.

Using NAPI makes it possible to mitigate interrupts/events; the code
path is cleaned up in a separate patch.

The netback internal changes clean up the code structure after
switching to the 1:1 model. They also prepare netback for further code
layout changes.

---
 drivers/net/xen-netback/Makefile    |    2 +-
 drivers/net/xen-netback/common.h    |   78 ++--
 drivers/net/xen-netback/interface.c |  117 ++++--
 drivers/net/xen-netback/netback.c   |  836 ++++++++++++++---------------------
 drivers/net/xen-netback/page_pool.c |  185 ++++++++
 drivers/net/xen-netback/page_pool.h |   66 +++
 drivers/net/xen-netback/xenbus.c    |    6 +-
 7 files changed, 704 insertions(+), 586 deletions(-)

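To make the memory bound concrete: patch 1 sizes the pool as the number
of online CPUs times ENTRIES_PER_CPU (1024), with one page per entry. A
rough sketch of the resulting cap follows. The helper name
demo_pool_bound_bytes is hypothetical and the page size is passed in as
an assumption; neither is part of the series.

```c
#include <assert.h>
#include <stddef.h>

/* ENTRIES_PER_CPU matches page_pool.h in patch 1; the rest is assumed. */
#define DEMO_ENTRIES_PER_CPU 1024u

/* Upper bound, in bytes, on the pages the pool can hand out at once. */
static size_t demo_pool_bound_bytes(unsigned int online_cpus, size_t page_size)
{
	return (size_t)online_cpus * DEMO_ENTRIES_PER_CPU * page_size;
}
```

For example, with 4 online CPUs and 4 KiB pages this gives
4 * 1024 * 4096 bytes = 16 MiB, not counting the pool's own bookkeeping
array.
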
A global page pool. Since we are moving to a 1:1 model netback, it is
better to limit the total RAM consumed by all the vifs.

With this patch, each vif gets a page from the pool and puts the page
back when it is finished with it.

The pool is only meant to be accessed via the exported interfaces.
Internals are subject to change when we discover new requirements for
the pool.

Currently exported interfaces include:

page_pool_init: pool init
page_pool_destroy: pool destruction
page_pool_get: get a page from pool
page_pool_put: put page back to pool
is_in_pool: tell whether a page belongs to the pool

The current implementation has the following defects:
 - Global locking
 - No starvation prevention mechanism / reservation logic

Global locking tends to cause contention on the pool. The lack of
reservation logic may cause a vif to starve. A possible solution to
both problems is for each vif to maintain a local cache and claim a
portion of the pool. However, the implementation will be tricky when it
comes to pool management, so let's worry about that later.

Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/Makefile | 2 +- drivers/net/xen-netback/common.h | 6 + drivers/net/xen-netback/netback.c | 158 ++++++++++++------------------ drivers/net/xen-netback/page_pool.c | 185 +++++++++++++++++++++++++++++++++++ drivers/net/xen-netback/page_pool.h | 63 ++++++++++++ 5 files changed, 317 insertions(+), 97 deletions(-) create mode 100644 drivers/net/xen-netback/page_pool.c create mode 100644 drivers/net/xen-netback/page_pool.h diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile index e346e81..dc4b8b1 100644 --- a/drivers/net/xen-netback/Makefile +++ b/drivers/net/xen-netback/Makefile @@ -1,3 +1,3 @@ obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o -xen-netback-y := netback.o xenbus.o interface.o +xen-netback-y := netback.o xenbus.o interface.o page_pool.o diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 94b79c3..288b2f3 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -45,6 +45,12 @@ #include <xen/grant_table.h> #include <xen/xenbus.h> +struct pending_tx_info { + struct xen_netif_tx_request req; + struct xenvif *vif; +}; +typedef unsigned int pending_ring_idx_t; + struct xen_netbk; struct xenvif { diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 59effac..d11205f 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -33,6 +33,7 @@ */ #include "common.h" +#include "page_pool.h" #include <linux/kthread.h> #include <linux/if_vlan.h> @@ -46,12 +47,6 @@ #include <asm/xen/hypercall.h> #include <asm/xen/page.h> -struct pending_tx_info { - struct xen_netif_tx_request req; - struct xenvif *vif; -}; -typedef unsigned int pending_ring_idx_t; - struct netbk_rx_meta { int id; int size; @@ -65,21 +60,6 @@ struct netbk_rx_meta { #define MAX_BUFFER_OFFSET PAGE_SIZE -/* extra field used in struct page */ -union page_ext { - struct { -#if 
BITS_PER_LONG < 64 -#define IDX_WIDTH 8 -#define GROUP_WIDTH (BITS_PER_LONG - IDX_WIDTH) - unsigned int group:GROUP_WIDTH; - unsigned int idx:IDX_WIDTH; -#else - unsigned int group, idx; -#endif - } e; - void *mapping; -}; - struct xen_netbk { wait_queue_head_t wq; struct task_struct *task; @@ -89,7 +69,7 @@ struct xen_netbk { struct timer_list net_timer; - struct page *mmap_pages[MAX_PENDING_REQS]; + idx_t mmap_pages[MAX_PENDING_REQS]; pending_ring_idx_t pending_prod; pending_ring_idx_t pending_cons; @@ -100,7 +80,6 @@ struct xen_netbk { atomic_t netfront_count; - struct pending_tx_info pending_tx_info[MAX_PENDING_REQS]; struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; u16 pending_ring[MAX_PENDING_REQS]; @@ -160,7 +139,7 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, static inline unsigned long idx_to_pfn(struct xen_netbk *netbk, u16 idx) { - return page_to_pfn(netbk->mmap_pages[idx]); + return page_to_pfn(to_page(netbk->mmap_pages[idx])); } static inline unsigned long idx_to_kaddr(struct xen_netbk *netbk, @@ -169,45 +148,6 @@ static inline unsigned long idx_to_kaddr(struct xen_netbk *netbk, return (unsigned long)pfn_to_kaddr(idx_to_pfn(netbk, idx)); } -/* extra field used in struct page */ -static inline void set_page_ext(struct page *pg, struct xen_netbk *netbk, - unsigned int idx) -{ - unsigned int group = netbk - xen_netbk; - union page_ext ext = { .e = { .group = group + 1, .idx = idx } }; - - BUILD_BUG_ON(sizeof(ext) > sizeof(ext.mapping)); - pg->mapping = ext.mapping; -} - -static int get_page_ext(struct page *pg, - unsigned int *pgroup, unsigned int *pidx) -{ - union page_ext ext = { .mapping = pg->mapping }; - struct xen_netbk *netbk; - unsigned int group, idx; - - group = ext.e.group - 1; - - if (group < 0 || group >= xen_netbk_group_nr) - return 0; - - netbk = &xen_netbk[group]; - - idx = ext.e.idx; - - if ((idx < 0) || (idx >= MAX_PENDING_REQS)) - return 0; - - if (netbk->mmap_pages[idx] != pg) - return 0; - - *pgroup = 
group; - *pidx = idx; - - return 1; -} - /* * This is the amount of packet we copy rather than map, so that the * guest can't fiddle with the contents of the headers while we do @@ -398,8 +338,8 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, * These variables are used iff get_page_ext returns true, * in which case they are guaranteed to be initialized. */ - unsigned int uninitialized_var(group), uninitialized_var(idx); - int foreign = get_page_ext(page, &group, &idx); + unsigned int uninitialized_var(idx); + int foreign = is_in_pool(page, &idx); unsigned long bytes; /* Data must not cross a page boundary. */ @@ -427,10 +367,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, copy_gop = npo->copy + npo->copy_prod++; copy_gop->flags = GNTCOPY_dest_gref; if (foreign) { - struct xen_netbk *netbk = &xen_netbk[group]; - struct pending_tx_info *src_pend; - - src_pend = &netbk->pending_tx_info[idx]; + struct pending_tx_info *src_pend = to_txinfo(idx); copy_gop->source.domid = src_pend->vif->domid; copy_gop->source.u.ref = src_pend->req.gref; @@ -906,11 +843,11 @@ static struct page *xen_netbk_alloc_page(struct xen_netbk *netbk, u16 pending_idx) { struct page *page; - page = alloc_page(GFP_KERNEL|__GFP_COLD); + int idx; + page = page_pool_get(netbk, &idx); if (!page) return NULL; - set_page_ext(page, netbk, pending_idx); - netbk->mmap_pages[pending_idx] = page; + netbk->mmap_pages[pending_idx] = idx; return page; } @@ -931,8 +868,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, for (i = start; i < shinfo->nr_frags; i++, txp++) { struct page *page; pending_ring_idx_t index; - struct pending_tx_info *pending_tx_info - netbk->pending_tx_info; + int idx; + struct pending_tx_info *pending_tx_info; index = pending_index(netbk->pending_cons++); pending_idx = netbk->pending_ring[index]; @@ -940,6 +877,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, if (!page) 
return NULL; + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); + gop->source.u.ref = txp->gref; gop->source.domid = vif->domid; gop->source.offset = txp->offset; @@ -953,9 +893,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, gop++; - memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp)); + memcpy(&pending_tx_info->req, txp, sizeof(*txp)); xenvif_get(vif); - pending_tx_info[pending_idx].vif = vif; + pending_tx_info->vif = vif; frag_set_pending_idx(&frags[i], pending_idx); } @@ -968,8 +908,9 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, { struct gnttab_copy *gop = *gopp; u16 pending_idx = *((u16 *)skb->data); - struct pending_tx_info *pending_tx_info = netbk->pending_tx_info; - struct xenvif *vif = pending_tx_info[pending_idx].vif; + struct pending_tx_info *pending_tx_info; + int idx; + struct xenvif *vif = NULL; struct xen_netif_tx_request *txp; struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -980,7 +921,10 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, if (unlikely(err)) { pending_ring_idx_t index; index = pending_index(netbk->pending_prod++); - txp = &pending_tx_info[pending_idx].req; + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); + txp = &pending_tx_info->req; + vif = pending_tx_info->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); netbk->pending_ring[index] = pending_idx; xenvif_put(vif); @@ -1005,7 +949,9 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, } /* Error on this fragment: respond to client with an error. 
*/ - txp = &netbk->pending_tx_info[pending_idx].req; + idx = netbk->mmap_pages[pending_idx]; + txp = &to_txinfo(idx)->req; + vif = to_txinfo(idx)->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); index = pending_index(netbk->pending_prod++); netbk->pending_ring[index] = pending_idx; @@ -1042,10 +988,15 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) struct xen_netif_tx_request *txp; struct page *page; u16 pending_idx; + int idx; + struct pending_tx_info *pending_tx_info; pending_idx = frag_get_pending_idx(frag); - txp = &netbk->pending_tx_info[pending_idx].req; + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); + + txp = &pending_tx_info->req; page = virt_to_page(idx_to_kaddr(netbk, pending_idx)); __skb_fill_page_desc(skb, i, page, txp->offset, txp->size); skb->len += txp->size; @@ -1053,7 +1004,7 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) skb->truesize += txp->size; /* Take an extra reference to offset xen_netbk_idx_release */ - get_page(netbk->mmap_pages[pending_idx]); + get_page(page); xen_netbk_idx_release(netbk, pending_idx); } } @@ -1233,6 +1184,8 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) int work_to_do; unsigned int data_len; pending_ring_idx_t index; + int pool_idx; + struct pending_tx_info *pending_tx_info; /* Get a netif from the list with work to do. 
*/ vif = poll_net_schedule_list(netbk); @@ -1347,9 +1300,12 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) gop++; - memcpy(&netbk->pending_tx_info[pending_idx].req, + pool_idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(pool_idx); + + memcpy(&pending_tx_info->req, &txreq, sizeof(txreq)); - netbk->pending_tx_info[pending_idx].vif = vif; + pending_tx_info->vif = vif; *((u16 *)skb->data) = pending_idx; __skb_put(skb, data_len); @@ -1397,10 +1353,16 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk) struct xenvif *vif; u16 pending_idx; unsigned data_len; + int idx; + struct pending_tx_info *pending_tx_info; pending_idx = *((u16 *)skb->data); - vif = netbk->pending_tx_info[pending_idx].vif; - txp = &netbk->pending_tx_info[pending_idx].req; + + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); + + vif = pending_tx_info->vif; + txp = &pending_tx_info->req; /* Check the remap error code. */ if (unlikely(xen_netbk_tx_check_gop(netbk, skb, &gop))) { @@ -1480,12 +1442,14 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) struct xenvif *vif; struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; + int idx; /* Already complete? 
*/ - if (netbk->mmap_pages[pending_idx] == NULL) + if (netbk->mmap_pages[pending_idx] == INVALID_ENTRY) return; - pending_tx_info = &netbk->pending_tx_info[pending_idx]; + idx = netbk->mmap_pages[pending_idx]; + pending_tx_info = to_txinfo(idx); vif = pending_tx_info->vif; @@ -1496,9 +1460,9 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) xenvif_put(vif); - netbk->mmap_pages[pending_idx]->mapping = 0; - put_page(netbk->mmap_pages[pending_idx]); - netbk->mmap_pages[pending_idx] = NULL; + page_pool_put(netbk->mmap_pages[pending_idx]); + + netbk->mmap_pages[pending_idx] = INVALID_ENTRY; } static void make_tx_response(struct xenvif *vif, @@ -1681,19 +1645,21 @@ static int __init netback_init(void) wake_up_process(netbk->task); } - rc = xenvif_xenbus_init(); + rc = page_pool_init(); if (rc) goto failed_init; + rc = xenvif_xenbus_init(); + if (rc) + goto pool_failed_init; + return 0; +pool_failed_init: + page_pool_destroy(); failed_init: while (--group >= 0) { struct xen_netbk *netbk = &xen_netbk[group]; - for (i = 0; i < MAX_PENDING_REQS; i++) { - if (netbk->mmap_pages[i]) - __free_page(netbk->mmap_pages[i]); - } del_timer(&netbk->net_timer); kthread_stop(netbk->task); } diff --git a/drivers/net/xen-netback/page_pool.c b/drivers/net/xen-netback/page_pool.c new file mode 100644 index 0000000..294f48b --- /dev/null +++ b/drivers/net/xen-netback/page_pool.c @@ -0,0 +1,185 @@ +/* + * Global page pool for netback. 
+ * + * Wei Liu <wei.liu2@citrix.com> + * Copyright (c) Citrix Systems + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#include "common.h" +#include "page_pool.h" +#include <asm/xen/page.h> + +static idx_t free_head; +static int free_count; +static unsigned long pool_size; +static DEFINE_SPINLOCK(pool_lock); +static struct page_pool_entry *pool; + +static int get_free_entry(void) +{ + int idx; + + spin_lock(&pool_lock); + + if (free_count == 0) { + spin_unlock(&pool_lock); + return -ENOSPC; + } + + idx = free_head; + free_count--; + free_head = pool[idx].u.fl; + pool[idx].u.fl = INVALID_ENTRY; + + spin_unlock(&pool_lock); + + return idx; +} + +static void put_free_entry(idx_t idx) +{ + spin_lock(&pool_lock); + + pool[idx].u.fl = free_head; + free_head = idx; + free_count++; + + spin_unlock(&pool_lock); +} + +static inline void set_page_ext(struct page *pg, unsigned int idx) +{ + union page_ext ext = { .idx = idx }; + + BUILD_BUG_ON(sizeof(ext) > sizeof(ext.mapping)); + pg->mapping = ext.mapping; +} + +static int get_page_ext(struct page *pg, unsigned int *pidx) +{ + union page_ext ext = { .mapping = pg->mapping }; + int idx; + + idx = ext.idx; + + if ((idx < 0) || (idx >= pool_size)) + return 0; + + if (pool[idx].page != pg) + return 0; + + *pidx = idx; + + return 1; +} + +int is_in_pool(struct page *page, int *pidx) +{ + return get_page_ext(page, pidx); +} + +struct page *page_pool_get(struct xen_netbk *netbk, int *pidx) +{ + int idx; + struct page *page; + + idx = get_free_entry(); + if (idx < 0) + return NULL; + page = alloc_page(GFP_ATOMIC); + + if (page == NULL) { + put_free_entry(idx); + return NULL; + } + + set_page_ext(page, idx); + pool[idx].u.netbk = netbk; + pool[idx].page = page; + + *pidx = idx; + + return page; +} + +void page_pool_put(int idx) +{ + struct page *page = pool[idx].page; + + pool[idx].page = NULL; + pool[idx].u.netbk = NULL; + page->mapping = 0; + put_page(page); + put_free_entry(idx); +} + +int page_pool_init() +{ + int cpus = 0; + int i; + + cpus = num_online_cpus(); + pool_size = cpus * ENTRIES_PER_CPU; + + pool = vzalloc(sizeof(struct 
page_pool_entry) * pool_size); + + if (!pool) + return -ENOMEM; + + for (i = 0; i < pool_size - 1; i++) + pool[i].u.fl = i+1; + pool[pool_size-1].u.fl = INVALID_ENTRY; + free_count = pool_size; + free_head = 0; + + return 0; +} + +void page_pool_destroy() +{ + int i; + for (i = 0; i < pool_size; i++) + if (pool[i].page) + put_page(pool[i].page); + + vfree(pool); +} + +struct page *to_page(int idx) +{ + return pool[idx].page; +} + +struct xen_netbk *to_netbk(int idx) +{ + return pool[idx].u.netbk; +} + +struct pending_tx_info *to_txinfo(int idx) +{ + return &pool[idx].tx_info; +} diff --git a/drivers/net/xen-netback/page_pool.h b/drivers/net/xen-netback/page_pool.h new file mode 100644 index 0000000..572b037 --- /dev/null +++ b/drivers/net/xen-netback/page_pool.h @@ -0,0 +1,63 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef __PAGE_POOL_H__ +#define __PAGE_POOL_H__ + +#include "common.h" + +typedef uint32_t idx_t; + +#define ENTRIES_PER_CPU (1024) +#define INVALID_ENTRY 0xffffffff + +struct page_pool_entry { + struct page *page; + struct pending_tx_info tx_info; + union { + struct xen_netbk *netbk; + idx_t fl; + } u; +}; + +union page_ext { + idx_t idx; + void *mapping; +}; + +int page_pool_init(void); +void page_pool_destroy(void); + + +struct page *page_pool_get(struct xen_netbk *netbk, int *pidx); +void page_pool_put(int idx); +int is_in_pool(struct page *page, int *pidx); + +struct page *to_page(int idx); +struct xen_netbk *to_netbk(int idx); +struct pending_tx_info *to_txinfo(int idx); + +#endif /* __PAGE_POOL_H__ */ -- 1.7.2.5
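
The index-linked free list that page_pool.c builds on can be
illustrated in isolation. The following is a minimal userspace sketch
under stated assumptions, not the kernel code: every demo_* name is
hypothetical, the pool size is fixed at compile time, an in_use flag
stands in for the struct page pointer, and the spinlock is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; this mirrors only the free-list idea. */
#define DEMO_POOL_SIZE 4u
#define DEMO_INVALID   0xffffffffu

typedef uint32_t demo_idx_t;

struct demo_entry {
	int in_use;       /* stands in for the struct page pointer */
	demo_idx_t next;  /* next free entry while on the free list */
};

static struct demo_entry demo_pool[DEMO_POOL_SIZE];
static demo_idx_t demo_free_head;
static unsigned int demo_free_count;

/* Chain every entry onto the free list: 0 -> 1 -> ... -> INVALID. */
static void demo_pool_init(void)
{
	demo_idx_t i;

	for (i = 0; i < DEMO_POOL_SIZE; i++) {
		demo_pool[i].in_use = 0;
		demo_pool[i].next =
			(i + 1 < DEMO_POOL_SIZE) ? i + 1 : DEMO_INVALID;
	}
	demo_free_head = 0;
	demo_free_count = DEMO_POOL_SIZE;
}

/* Pop the head of the free list; DEMO_INVALID means the pool is empty. */
static demo_idx_t demo_pool_get(void)
{
	demo_idx_t idx;

	if (demo_free_count == 0)
		return DEMO_INVALID;

	idx = demo_free_head;
	demo_free_head = demo_pool[idx].next;
	demo_pool[idx].next = DEMO_INVALID;
	demo_pool[idx].in_use = 1;
	demo_free_count--;
	return idx;
}

/* Push an entry back onto the head of the free list. */
static void demo_pool_put(demo_idx_t idx)
{
	assert(idx < DEMO_POOL_SIZE && demo_pool[idx].in_use);

	demo_pool[idx].in_use = 0;
	demo_pool[idx].next = demo_free_head;
	demo_free_head = idx;
	demo_free_count++;
}
```

Reuse is LIFO: an index handed back with demo_pool_put() is the next
one returned by demo_pool_get(). Once DEMO_POOL_SIZE entries are
outstanding, further gets fail, which is the property that caps the
memory consumed by all vifs together.
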
Enable users to unload the netback module.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h  |    1 +
 drivers/net/xen-netback/netback.c |   14 ++++++++++++++
 drivers/net/xen-netback/xenbus.c  |    5 +++++
 3 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 288b2f3..372c7f5 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -126,6 +126,7 @@ void xenvif_get(struct xenvif *vif);
 void xenvif_put(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
+void xenvif_xenbus_exit(void);
 
 int xenvif_schedulable(struct xenvif *vif);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index d11205f..3059684 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1670,5 +1670,19 @@ failed_init:
 
 module_init(netback_init);
 
+static void __exit netback_exit(void)
+{
+	int i;
+	xenvif_xenbus_exit();
+	for (i = 0; i < xen_netbk_group_nr; i++) {
+		struct xen_netbk *netbk = &xen_netbk[i];
+		del_timer_sync(&netbk->net_timer);
+		kthread_stop(netbk->task);
+	}
+	vfree(xen_netbk);
+	page_pool_destroy();
+}
+module_exit(netback_exit);
+
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_ALIAS("xen-backend:vif");
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 410018c..65d14f2 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -485,3 +485,8 @@ int xenvif_xenbus_init(void)
 {
 	return xenbus_register_backend(&netback_driver);
 }
+
+void xenvif_xenbus_exit(void)
+{
+	xenbus_unregister_driver(&netback_driver);
+}
-- 
1.7.2.5

This patch implements the 1:1 model netback. We use NAPI and a kthread
to do the heavy lifting:

 - NAPI is used for guest side TX (host side RX)
 - kthread is used for guest side RX (host side TX)

This model provides better scheduling fairness among vifs. It also lays
the foundation for future work.

The major defect of the current implementation is that in the NAPI poll
handler we don't actually disable the interrupt. Xen differs from real
hardware here; it requires some additional tuning of the ring macros.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h    |   34 ++--
 drivers/net/xen-netback/interface.c |   92 ++++++---
 drivers/net/xen-netback/netback.c   |  366 ++++++++++-------------------------
 drivers/net/xen-netback/xenbus.c    |    1 -
 4 files changed, 185 insertions(+), 308 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 372c7f5..31c331c 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -47,7 +47,6 @@ struct pending_tx_info { struct xen_netif_tx_request req; - struct xenvif *vif; }; typedef unsigned int pending_ring_idx_t; @@ -61,14 +60,17 @@ struct xenvif { /* Reference to netback processing backend. */ struct xen_netbk *netbk; + /* Use NAPI for guest TX */ + struct napi_struct napi; + /* Use kthread for guest RX */ + struct task_struct *task; + wait_queue_head_t wq; + u8 fe_dev_addr[6]; /* Physical parameters of the comms window. */ unsigned int irq; - /* List of frontends to notify after a batch of frames sent. */ - struct list_head notify_list; - /* The shared rings and indexes. */ struct xen_netif_tx_back_ring tx; struct xen_netif_rx_back_ring rx; @@ -99,11 +101,7 @@ struct xenvif { unsigned long rx_gso_checksum_fixup; /* Miscellaneous private stuff. 
*/ - struct list_head schedule_list; - atomic_t refcnt; struct net_device *dev; - - wait_queue_head_t waiting_to_free; }; static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) @@ -122,9 +120,6 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, unsigned long rx_ring_ref, unsigned int evtchn); void xenvif_disconnect(struct xenvif *vif); -void xenvif_get(struct xenvif *vif); -void xenvif_put(struct xenvif *vif); - int xenvif_xenbus_init(void); void xenvif_xenbus_exit(void); @@ -140,14 +135,6 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, grant_ref_t tx_ring_ref, grant_ref_t rx_ring_ref); -/* (De)Register a xenvif with the netback backend. */ -void xen_netbk_add_xenvif(struct xenvif *vif); -void xen_netbk_remove_xenvif(struct xenvif *vif); - -/* (De)Schedule backend processing for a xenvif */ -void xen_netbk_schedule_xenvif(struct xenvif *vif); -void xen_netbk_deschedule_xenvif(struct xenvif *vif); - /* Check for SKBs from frontend and schedule backend processing */ void xen_netbk_check_rx_xenvif(struct xenvif *vif); /* Receive an SKB from the frontend */ @@ -161,4 +148,13 @@ void xenvif_notify_tx_completion(struct xenvif *vif); /* Returns number of ring slots required to send an skb to the frontend */ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); +/* Allocate and free xen_netbk structure */ +struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif); +void xen_netbk_free_netbk(struct xen_netbk *netbk); + +void xen_netbk_tx_action(struct xen_netbk *netbk, int *work_done, int budget); +void xen_netbk_rx_action(struct xen_netbk *netbk); + +int xen_netbk_kthread(void *data); + #endif /* __XEN_NETBACK__COMMON_H__ */ diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 1825629..dfc04f8 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -30,6 +30,7 @@ #include "common.h" +#include <linux/kthread.h> 
#include <linux/ethtool.h> #include <linux/rtnetlink.h> #include <linux/if_vlan.h> @@ -38,17 +39,7 @@ #include <asm/xen/hypercall.h> #define XENVIF_QUEUE_LENGTH 32 - -void xenvif_get(struct xenvif *vif) -{ - atomic_inc(&vif->refcnt); -} - -void xenvif_put(struct xenvif *vif) -{ - if (atomic_dec_and_test(&vif->refcnt)) - wake_up(&vif->waiting_to_free); -} +#define XENVIF_NAPI_WEIGHT 64 int xenvif_schedulable(struct xenvif *vif) { @@ -67,14 +58,37 @@ static irqreturn_t xenvif_interrupt(int irq, void *dev_id) if (vif->netbk == NULL) return IRQ_NONE; - xen_netbk_schedule_xenvif(vif); - if (xenvif_rx_schedulable(vif)) netif_wake_queue(vif->dev); + if (likely(napi_schedule_prep(&vif->napi))) + __napi_schedule(&vif->napi); + return IRQ_HANDLED; } +static int xenvif_poll(struct napi_struct *napi, int budget) +{ + struct xenvif *vif = container_of(napi, struct xenvif, napi); + int work_done = 0; + + xen_netbk_tx_action(vif->netbk, &work_done, budget); + + if (work_done < budget) { + int more_to_do = 0; + unsigned long flag; + local_irq_save(flag); + + RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, more_to_do); + if (!more_to_do) + __napi_complete(napi); + + local_irq_restore(flag); + } + + return work_done; +} + static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct xenvif *vif = netdev_priv(dev); @@ -90,7 +104,6 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) /* Reserve ring slots for the worst-case number of fragments. 
*/ vif->rx_req_cons_peek += xen_netbk_count_skb_slots(vif, skb); - xenvif_get(vif); if (vif->can_queue && xen_netbk_must_stop_queue(vif)) netif_stop_queue(dev); @@ -107,7 +120,7 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) void xenvif_receive_skb(struct xenvif *vif, struct sk_buff *skb) { - netif_rx_ni(skb); + netif_receive_skb(skb); } void xenvif_notify_tx_completion(struct xenvif *vif) @@ -124,16 +137,15 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev) static void xenvif_up(struct xenvif *vif) { - xen_netbk_add_xenvif(vif); + napi_enable(&vif->napi); enable_irq(vif->irq); xen_netbk_check_rx_xenvif(vif); } static void xenvif_down(struct xenvif *vif) { + napi_disable(&vif->napi); disable_irq(vif->irq); - xen_netbk_deschedule_xenvif(vif); - xen_netbk_remove_xenvif(vif); } static int xenvif_open(struct net_device *dev) @@ -259,14 +271,11 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, vif = netdev_priv(dev); vif->domid = domid; vif->handle = handle; - vif->netbk = NULL; + vif->netbk = NULL; + vif->can_sg = 1; vif->csum = 1; - atomic_set(&vif->refcnt, 1); - init_waitqueue_head(&vif->waiting_to_free); vif->dev = dev; - INIT_LIST_HEAD(&vif->schedule_list); - INIT_LIST_HEAD(&vif->notify_list); vif->credit_bytes = vif->remaining_credit = ~0UL; vif->credit_usec = 0UL; @@ -290,6 +299,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, memset(dev->dev_addr, 0xFF, ETH_ALEN); dev->dev_addr[0] &= ~0x01; + netif_napi_add(dev, &vif->napi, xenvif_poll, XENVIF_NAPI_WEIGHT); + netif_carrier_off(dev); err = register_netdev(dev); @@ -324,7 +335,23 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, vif->irq = err; disable_irq(vif->irq); - xenvif_get(vif); + vif->netbk = xen_netbk_alloc_netbk(vif); + if (!vif->netbk) { + pr_warn("Could not allocate xen_netbk\n"); + err = -ENOMEM; + goto err_unbind; + } + + + init_waitqueue_head(&vif->wq); + vif->task = 
kthread_create(xen_netbk_kthread, + (void *)vif, + "vif%d.%d", vif->domid, vif->handle); + if (IS_ERR(vif->task)) { + pr_warn("Could not create kthread\n"); + err = PTR_ERR(vif->task); + goto err_free_netbk; + } rtnl_lock(); if (!vif->can_sg && vif->dev->mtu > ETH_DATA_LEN) @@ -335,7 +362,13 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, xenvif_up(vif); rtnl_unlock(); + wake_up_process(vif->task); + return 0; +err_free_netbk: + xen_netbk_free_netbk(vif->netbk); +err_unbind: + unbind_from_irqhandler(vif->irq, vif); err_unmap: xen_netbk_unmap_frontend_rings(vif); err: @@ -345,17 +378,22 @@ err: void xenvif_disconnect(struct xenvif *vif) { struct net_device *dev = vif->dev; + if (netif_carrier_ok(dev)) { rtnl_lock(); netif_carrier_off(dev); /* discard queued packets */ if (netif_running(dev)) xenvif_down(vif); rtnl_unlock(); - xenvif_put(vif); } - atomic_dec(&vif->refcnt); - wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0); + if (vif->task) + kthread_stop(vif->task); + + if (vif->netbk) + xen_netbk_free_netbk(vif->netbk); + + netif_napi_del(&vif->napi); del_timer_sync(&vif->credit_timeout); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 3059684..7378d63 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -61,24 +61,15 @@ struct netbk_rx_meta { #define MAX_BUFFER_OFFSET PAGE_SIZE struct xen_netbk { - wait_queue_head_t wq; - struct task_struct *task; - struct sk_buff_head rx_queue; struct sk_buff_head tx_queue; - struct timer_list net_timer; - idx_t mmap_pages[MAX_PENDING_REQS]; pending_ring_idx_t pending_prod; pending_ring_idx_t pending_cons; - struct list_head net_schedule_list; - - /* Protect the net_schedule_list in netif. 
*/ - spinlock_t net_schedule_list_lock; - atomic_t netfront_count; + struct xenvif *vif; struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; @@ -93,42 +84,14 @@ struct xen_netbk { struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE]; }; -static struct xen_netbk *xen_netbk; -static int xen_netbk_group_nr; - -void xen_netbk_add_xenvif(struct xenvif *vif) -{ - int i; - int min_netfront_count; - int min_group = 0; - struct xen_netbk *netbk; - - min_netfront_count = atomic_read(&xen_netbk[0].netfront_count); - for (i = 0; i < xen_netbk_group_nr; i++) { - int netfront_count = atomic_read(&xen_netbk[i].netfront_count); - if (netfront_count < min_netfront_count) { - min_group = i; - min_netfront_count = netfront_count; - } - } - - netbk = &xen_netbk[min_group]; - - vif->netbk = netbk; - atomic_inc(&netbk->netfront_count); -} - -void xen_netbk_remove_xenvif(struct xenvif *vif) -{ - struct xen_netbk *netbk = vif->netbk; - vif->netbk = NULL; - atomic_dec(&netbk->netfront_count); -} - static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx); static void make_tx_response(struct xenvif *vif, struct xen_netif_tx_request *txp, s8 st); + +static inline int tx_work_todo(struct xen_netbk *netbk); +static inline int rx_work_todo(struct xen_netbk *netbk); + static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, u16 id, s8 st, @@ -179,11 +142,6 @@ static inline pending_ring_idx_t nr_pending_reqs(struct xen_netbk *netbk) netbk->pending_prod + netbk->pending_cons; } -static void xen_netbk_kick_thread(struct xen_netbk *netbk) -{ - wake_up(&netbk->wq); -} - static int max_required_rx_slots(struct xenvif *vif) { int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE); @@ -369,7 +327,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, if (foreign) { struct pending_tx_info *src_pend = to_txinfo(idx); - copy_gop->source.domid = src_pend->vif->domid; + copy_gop->source.domid = vif->domid; copy_gop->source.u.ref = src_pend->req.gref; 
copy_gop->flags |= GNTCOPY_source_gref; } else { @@ -527,11 +485,18 @@ struct skb_cb_overlay { int meta_slots_used; }; -static void xen_netbk_rx_action(struct xen_netbk *netbk) +static void xen_netbk_kick_thread(struct xen_netbk *netbk) { - struct xenvif *vif = NULL, *tmp; + struct xenvif *vif = netbk->vif; + + wake_up(&vif->wq); +} + +void xen_netbk_rx_action(struct xen_netbk *netbk) +{ + struct xenvif *vif = NULL; s8 status; - u16 irq, flags; + u16 flags; struct xen_netif_rx_response *resp; struct sk_buff_head rxq; struct sk_buff *skb; @@ -541,6 +506,7 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) int count; unsigned long offset; struct skb_cb_overlay *sco; + int need_to_notify = 0; struct netrx_pending_operations npo = { .copy = netbk->grant_copy_op, @@ -641,25 +607,19 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk) sco->meta_slots_used); RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret); - irq = vif->irq; - if (ret && list_empty(&vif->notify_list)) - list_add_tail(&vif->notify_list, &notify); + if (ret) + need_to_notify = 1; xenvif_notify_tx_completion(vif); - xenvif_put(vif); npo.meta_cons += sco->meta_slots_used; dev_kfree_skb(skb); } - list_for_each_entry_safe(vif, tmp, &notify, notify_list) { + if (need_to_notify) notify_remote_via_irq(vif->irq); - list_del_init(&vif->notify_list); - } - /* More work to do? 
*/ - if (!skb_queue_empty(&netbk->rx_queue) && - !timer_pending(&netbk->net_timer)) + if (!skb_queue_empty(&netbk->rx_queue)) xen_netbk_kick_thread(netbk); } @@ -672,86 +632,17 @@ void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) xen_netbk_kick_thread(netbk); } -static void xen_netbk_alarm(unsigned long data) -{ - struct xen_netbk *netbk = (struct xen_netbk *)data; - xen_netbk_kick_thread(netbk); -} - -static int __on_net_schedule_list(struct xenvif *vif) -{ - return !list_empty(&vif->schedule_list); -} - -/* Must be called with net_schedule_list_lock held */ -static void remove_from_net_schedule_list(struct xenvif *vif) -{ - if (likely(__on_net_schedule_list(vif))) { - list_del_init(&vif->schedule_list); - xenvif_put(vif); - } -} - -static struct xenvif *poll_net_schedule_list(struct xen_netbk *netbk) -{ - struct xenvif *vif = NULL; - - spin_lock_irq(&netbk->net_schedule_list_lock); - if (list_empty(&netbk->net_schedule_list)) - goto out; - - vif = list_first_entry(&netbk->net_schedule_list, - struct xenvif, schedule_list); - if (!vif) - goto out; - - xenvif_get(vif); - - remove_from_net_schedule_list(vif); -out: - spin_unlock_irq(&netbk->net_schedule_list_lock); - return vif; -} - -void xen_netbk_schedule_xenvif(struct xenvif *vif) -{ - unsigned long flags; - struct xen_netbk *netbk = vif->netbk; - - if (__on_net_schedule_list(vif)) - goto kick; - - spin_lock_irqsave(&netbk->net_schedule_list_lock, flags); - if (!__on_net_schedule_list(vif) && - likely(xenvif_schedulable(vif))) { - list_add_tail(&vif->schedule_list, &netbk->net_schedule_list); - xenvif_get(vif); - } - spin_unlock_irqrestore(&netbk->net_schedule_list_lock, flags); - -kick: - smp_mb(); - if ((nr_pending_reqs(netbk) < (MAX_PENDING_REQS/2)) && - !list_empty(&netbk->net_schedule_list)) - xen_netbk_kick_thread(netbk); -} - -void xen_netbk_deschedule_xenvif(struct xenvif *vif) -{ - struct xen_netbk *netbk = vif->netbk; - spin_lock_irq(&netbk->net_schedule_list_lock); - 
remove_from_net_schedule_list(vif); - spin_unlock_irq(&netbk->net_schedule_list_lock); -} - void xen_netbk_check_rx_xenvif(struct xenvif *vif) { int more_to_do; RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, more_to_do); + /* In this check function, we are supposed to do be's rx, + * which means fe's tx */ + if (more_to_do) - xen_netbk_schedule_xenvif(vif); + napi_schedule(&vif->napi); } static void tx_add_credit(struct xenvif *vif) @@ -794,7 +685,6 @@ static void netbk_tx_err(struct xenvif *vif, } while (1); vif->tx.req_cons = cons; xen_netbk_check_rx_xenvif(vif); - xenvif_put(vif); } static int netbk_count_requests(struct xenvif *vif, @@ -894,8 +784,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, gop++; memcpy(&pending_tx_info->req, txp, sizeof(*txp)); - xenvif_get(vif); - pending_tx_info->vif = vif; + frag_set_pending_idx(&frags[i], pending_idx); } @@ -910,7 +799,8 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, u16 pending_idx = *((u16 *)skb->data); struct pending_tx_info *pending_tx_info; int idx; - struct xenvif *vif = NULL; + struct xenvif *vif = netbk->vif; + struct xen_netif_tx_request *txp; struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -924,10 +814,8 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, idx = netbk->mmap_pages[index]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; - vif = pending_tx_info->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); netbk->pending_ring[index] = pending_idx; - xenvif_put(vif); } /* Skip first skb fragment if it is on same page as header fragment. */ @@ -951,11 +839,9 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, /* Error on this fragment: respond to client with an error. 
*/ idx = netbk->mmap_pages[pending_idx]; txp = &to_txinfo(idx)->req; - vif = to_txinfo(idx)->vif; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); index = pending_index(netbk->pending_prod++); netbk->pending_ring[index] = pending_idx; - xenvif_put(vif); /* Not the first error? Preceding frags already invalidated. */ if (err) @@ -1171,10 +1057,9 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; struct sk_buff *skb; int ret; + struct xenvif *vif = netbk->vif; - while (((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) && - !list_empty(&netbk->net_schedule_list)) { - struct xenvif *vif; + while ((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) { struct xen_netif_tx_request txreq; struct xen_netif_tx_request txfrags[MAX_SKB_FRAGS]; struct page *page; @@ -1187,26 +1072,19 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) int pool_idx; struct pending_tx_info *pending_tx_info; - /* Get a netif from the list with work to do. */ - vif = poll_net_schedule_list(netbk); - if (!vif) - continue; - RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, work_to_do); if (!work_to_do) { - xenvif_put(vif); - continue; + break; } idx = vif->tx.req_cons; rmb(); /* Ensure that we see the request before we copy it. */ memcpy(&txreq, RING_GET_REQUEST(&vif->tx, idx), sizeof(txreq)); - /* Credit-based scheduling. */ + /* Credit-based traffic shaping. 
*/ if (txreq.size > vif->remaining_credit && tx_credit_exceeded(vif, txreq.size)) { - xenvif_put(vif); - continue; + break; } vif->remaining_credit -= txreq.size; @@ -1221,14 +1099,14 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) idx = vif->tx.req_cons; if (unlikely(work_to_do < 0)) { netbk_tx_err(vif, &txreq, idx); - continue; + break; } } ret = netbk_count_requests(vif, &txreq, txfrags, work_to_do); if (unlikely(ret < 0)) { netbk_tx_err(vif, &txreq, idx - ret); - continue; + break; } idx += ret; @@ -1236,7 +1114,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) netdev_dbg(vif->dev, "Bad packet size: %d\n", txreq.size); netbk_tx_err(vif, &txreq, idx); - continue; + break; } /* No crossing a page as the payload mustn't fragment. */ @@ -1246,7 +1124,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) txreq.offset, txreq.size, (txreq.offset&~PAGE_MASK) + txreq.size); netbk_tx_err(vif, &txreq, idx); - continue; + break; } index = pending_index(netbk->pending_cons); @@ -1275,7 +1153,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) if (netbk_set_skb_gso(vif, skb, gso)) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); - continue; + break; } } @@ -1284,7 +1162,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) if (!page) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); - continue; + break; } gop->source.u.ref = txreq.gref; @@ -1305,7 +1183,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) memcpy(&pending_tx_info->req, &txreq, sizeof(txreq)); - pending_tx_info->vif = vif; + *((u16 *)skb->data) = pending_idx; __skb_put(skb, data_len); @@ -1329,7 +1207,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) if (request_gop == NULL) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); - continue; + break; } gop = request_gop; @@ -1343,14 +1221,16 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) return gop - 
netbk->tx_copy_ops; } -static void xen_netbk_tx_submit(struct xen_netbk *netbk) +static void xen_netbk_tx_submit(struct xen_netbk *netbk, + int *work_done, int budget) { struct gnttab_copy *gop = netbk->tx_copy_ops; struct sk_buff *skb; + struct xenvif *vif = netbk->vif; - while ((skb = __skb_dequeue(&netbk->tx_queue)) != NULL) { + while ((*work_done < budget) && + (skb = __skb_dequeue(&netbk->tx_queue)) != NULL) { struct xen_netif_tx_request *txp; - struct xenvif *vif; u16 pending_idx; unsigned data_len; int idx; @@ -1361,7 +1241,6 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk) idx = netbk->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); - vif = pending_tx_info->vif; txp = &pending_tx_info->req; /* Check the remap error code. */ @@ -1415,16 +1294,21 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk) vif->dev->stats.rx_bytes += skb->len; vif->dev->stats.rx_packets++; + (*work_done)++; + xenvif_receive_skb(vif, skb); } } /* Called after netfront has transmitted */ -static void xen_netbk_tx_action(struct xen_netbk *netbk) +void xen_netbk_tx_action(struct xen_netbk *netbk, int *work_done, int budget) { unsigned nr_gops; int ret; + if (unlikely(!tx_work_todo(netbk))) + return; + nr_gops = xen_netbk_tx_build_gops(netbk); if (nr_gops == 0) @@ -1433,13 +1317,12 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk) netbk->tx_copy_ops, nr_gops); BUG_ON(ret); - xen_netbk_tx_submit(netbk); - + xen_netbk_tx_submit(netbk, work_done, budget); } static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) { - struct xenvif *vif; + struct xenvif *vif = netbk->vif; struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; int idx; @@ -1451,15 +1334,11 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) idx = netbk->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); - vif = pending_tx_info->vif; - make_tx_response(vif, &pending_tx_info->req, XEN_NETIF_RSP_OKAY); index = 
pending_index(netbk->pending_prod++); netbk->pending_ring[index] = pending_idx; - xenvif_put(vif); - page_pool_put(netbk->mmap_pages[pending_idx]); netbk->mmap_pages[pending_idx] = INVALID_ENTRY; @@ -1516,37 +1395,13 @@ static inline int rx_work_todo(struct xen_netbk *netbk) static inline int tx_work_todo(struct xen_netbk *netbk) { - - if (((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) && - !list_empty(&netbk->net_schedule_list)) + if (likely(RING_HAS_UNCONSUMED_REQUESTS(&netbk->vif->tx)) && + (nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) return 1; return 0; } -static int xen_netbk_kthread(void *data) -{ - struct xen_netbk *netbk = data; - while (!kthread_should_stop()) { - wait_event_interruptible(netbk->wq, - rx_work_todo(netbk) || - tx_work_todo(netbk) || - kthread_should_stop()); - cond_resched(); - - if (kthread_should_stop()) - break; - - if (rx_work_todo(netbk)) - xen_netbk_rx_action(netbk); - - if (tx_work_todo(netbk)) - xen_netbk_tx_action(netbk); - } - - return 0; -} - void xen_netbk_unmap_frontend_rings(struct xenvif *vif) { if (vif->tx.sring) @@ -1592,78 +1447,74 @@ err: return err; } -static int __init netback_init(void) +struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif) { int i; - int rc = 0; - int group; - - if (!xen_domain()) - return -ENODEV; + struct xen_netbk *netbk; - xen_netbk_group_nr = num_online_cpus(); - xen_netbk = vzalloc(sizeof(struct xen_netbk) * xen_netbk_group_nr); - if (!xen_netbk) { + netbk = vzalloc(sizeof(struct xen_netbk)); + if (!netbk) { printk(KERN_ALERT "%s: out of memory\n", __func__); - return -ENOMEM; + return NULL; } - for (group = 0; group < xen_netbk_group_nr; group++) { - struct xen_netbk *netbk = &xen_netbk[group]; - skb_queue_head_init(&netbk->rx_queue); - skb_queue_head_init(&netbk->tx_queue); - - init_timer(&netbk->net_timer); - netbk->net_timer.data = (unsigned long)netbk; - netbk->net_timer.function = xen_netbk_alarm; - - netbk->pending_cons = 0; - netbk->pending_prod = 
MAX_PENDING_REQS; - for (i = 0; i < MAX_PENDING_REQS; i++) - netbk->pending_ring[i] = i; - - init_waitqueue_head(&netbk->wq); - netbk->task = kthread_create(xen_netbk_kthread, - (void *)netbk, - "netback/%u", group); - - if (IS_ERR(netbk->task)) { - printk(KERN_ALERT "kthread_create() fails at netback\n"); - del_timer(&netbk->net_timer); - rc = PTR_ERR(netbk->task); - goto failed_init; - } + netbk->vif = vif; - kthread_bind(netbk->task, group); + skb_queue_head_init(&netbk->rx_queue); + skb_queue_head_init(&netbk->tx_queue); - INIT_LIST_HEAD(&netbk->net_schedule_list); + netbk->pending_cons = 0; + netbk->pending_prod = MAX_PENDING_REQS; + for (i = 0; i < MAX_PENDING_REQS; i++) + netbk->pending_ring[i] = i; - spin_lock_init(&netbk->net_schedule_list_lock); + for (i = 0; i < MAX_PENDING_REQS; i++) + netbk->mmap_pages[i] = INVALID_ENTRY; - atomic_set(&netbk->netfront_count, 0); + return netbk; +} - wake_up_process(netbk->task); +void xen_netbk_free_netbk(struct xen_netbk *netbk) +{ + vfree(netbk); +} + +int xen_netbk_kthread(void *data) +{ + struct xenvif *vif = data; + struct xen_netbk *netbk = vif->netbk; + + while (!kthread_should_stop()) { + wait_event_interruptible(vif->wq, + rx_work_todo(netbk) || + kthread_should_stop()); + cond_resched(); + + if (kthread_should_stop()) + break; + + if (rx_work_todo(netbk)) + xen_netbk_rx_action(netbk); } + return 0; +} + + +static int __init netback_init(void) +{ + int rc = 0; + + if (!xen_domain()) + return -ENODEV; + rc = page_pool_init(); if (rc) goto failed_init; - rc = xenvif_xenbus_init(); - if (rc) - goto pool_failed_init; - - return 0; + return xenvif_xenbus_init(); -pool_failed_init: - page_pool_destroy(); failed_init: - while (--group >= 0) { - struct xen_netbk *netbk = &xen_netbk[group]; - del_timer(&netbk->net_timer); - kthread_stop(netbk->task); - } - vfree(xen_netbk); return rc; } @@ -1672,14 +1523,7 @@ module_init(netback_init); static void __exit netback_exit(void) { - int i; xenvif_xenbus_exit(); - for (i = 0; 
i < xen_netbk_group_nr; i++) { - struct xen_netbk *netbk = &xen_netbk[i]; - del_timer_sync(&netbk->net_timer); - kthread_stop(netbk->task); - } - vfree(xen_netbk); page_pool_destroy(); } module_exit(netback_exit); diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 65d14f2..f1e89ca 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -387,7 +387,6 @@ static void connect(struct backend_info *be) netif_wake_queue(be->vif->dev); } - static int connect_rings(struct backend_info *be) { struct xenvif *vif = be->vif; -- 1.7.2.5
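For readers skimming the diff, the budget handling that this patch threads through xen_netbk_tx_action() and xen_netbk_tx_submit() can be sketched in plain user-space C. This is only an illustration under stated assumptions: toy_vif, toy_tx_submit and toy_poll are hypothetical names, the queue is reduced to a counter, and no real NAPI call is made. The "completed" flag marks the point where a real poll callback would finish polling and re-check the ring.

```c
#include <assert.h>

/* Hypothetical stand-in for a vif: the tx_queue is reduced to a counter. */
struct toy_vif {
	int queued;     /* packets waiting, like skbs on tx_queue */
	int delivered;  /* packets already handed to the stack */
};

/* Analogue of xen_netbk_tx_submit(): dequeue until the budget is used up. */
static void toy_tx_submit(struct toy_vif *vif, int *work_done, int budget)
{
	while (*work_done < budget && vif->queued > 0) {
		vif->queued--;
		vif->delivered++;
		(*work_done)++;
	}
}

/*
 * Analogue of a NAPI poll callback. Returns the amount of work done and
 * sets *completed when less than the full budget was consumed, which is
 * when a real driver would stop polling and re-enable notifications.
 */
static int toy_poll(struct toy_vif *vif, int budget, int *completed)
{
	int work_done = 0;

	assert(budget > 0);
	toy_tx_submit(vif, &work_done, budget);
	*completed = (work_done < budget);
	return work_done;
}
```

With 100 queued packets and a budget of 64, the first call consumes the whole budget and stays in polling mode; the second call drains the remaining 36 and reports completion.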
Wei Liu
2012-Jan-17 13:47 UTC
[RFC PATCH V2 4/8] netback: switch to per-cpu scratch space.
In the 1:1 model, at most nr_online_cpus netbacks run at any given time, so we can use per-cpu scratch space and shrink the size of struct xen_netbk. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 13 ++++ drivers/net/xen-netback/netback.c | 134 ++++++++++++++++++++++++------------- 2 files changed, 100 insertions(+), 47 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 31c331c..3b85563 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -45,6 +45,19 @@ #include <xen/grant_table.h> #include <xen/xenbus.h> +struct netbk_rx_meta { + int id; + int size; + int gso_size; +}; + +#define MAX_PENDING_REQS 256 + +/* Discriminate from any valid pending_idx value. */ +#define INVALID_PENDING_IDX 0xFFFF + +#define MAX_BUFFER_OFFSET PAGE_SIZE + struct pending_tx_info { struct xen_netif_tx_request req; }; diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 7378d63..714f508 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1,3 +1,4 @@ + /* * Back-end of the driver for virtual network devices. This portion of the * driver exports a 'unified' network-device interface that can be accessed @@ -47,18 +48,17 @@ #include <asm/xen/hypercall.h> #include <asm/xen/page.h> -struct netbk_rx_meta { - int id; - int size; - int gso_size; -}; -#define MAX_PENDING_REQS 256 +struct gnttab_copy *tx_copy_ops; -/* Discriminate from any valid pending_idx value. */ -#define INVALID_PENDING_IDX 0xFFFF +/* + * Given MAX_BUFFER_OFFSET of 4096 the worst case is that each + * head/fragment page uses 2 copy operations because it + * straddles two buffers in the frontend. 
+ */ +struct gnttab_copy *grant_copy_op; +struct netbk_rx_meta *meta; -#define MAX_BUFFER_OFFSET PAGE_SIZE struct xen_netbk { struct sk_buff_head rx_queue; @@ -71,17 +71,7 @@ struct xen_netbk { struct xenvif *vif; - struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; - u16 pending_ring[MAX_PENDING_REQS]; - - /* - * Given MAX_BUFFER_OFFSET of 4096 the worst case is that each - * head/fragment page uses 2 copy operations because it - * straddles two buffers in the frontend. - */ - struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE]; - struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE]; }; static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx); @@ -508,9 +498,12 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) struct skb_cb_overlay *sco; int need_to_notify = 0; + struct gnttab_copy *gco = get_cpu_ptr(grant_copy_op); + struct netbk_rx_meta *m = get_cpu_ptr(meta); + struct netrx_pending_operations npo = { - .copy = netbk->grant_copy_op, - .meta = netbk->meta, + .copy = gco, + .meta = m, }; skb_queue_head_init(&rxq); @@ -533,13 +526,16 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) break; } - BUG_ON(npo.meta_prod > ARRAY_SIZE(netbk->meta)); + BUG_ON(npo.meta_prod > MAX_PENDING_REQS); - if (!npo.copy_prod) + if (!npo.copy_prod) { + put_cpu_ptr(gco); + put_cpu_ptr(m); return; + } - BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op)); - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, &netbk->grant_copy_op, + BUG_ON(npo.copy_prod > (2 * XEN_NETIF_RX_RING_SIZE)); + ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, gco, npo.copy_prod); BUG_ON(ret != 0); @@ -548,14 +544,14 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) vif = netdev_priv(skb->dev); - if (netbk->meta[npo.meta_cons].gso_size && vif->gso_prefix) { + if (m[npo.meta_cons].gso_size && vif->gso_prefix) { resp = RING_GET_RESPONSE(&vif->rx, vif->rx.rsp_prod_pvt++); resp->flags = XEN_NETRXF_gso_prefix | XEN_NETRXF_more_data; - resp->offset = netbk->meta[npo.meta_cons].gso_size; 
- resp->id = netbk->meta[npo.meta_cons].id; + resp->offset = m[npo.meta_cons].gso_size; + resp->id = m[npo.meta_cons].id; resp->status = sco->meta_slots_used; npo.meta_cons++; @@ -580,12 +576,12 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) flags |= XEN_NETRXF_data_validated; offset = 0; - resp = make_rx_response(vif, netbk->meta[npo.meta_cons].id, + resp = make_rx_response(vif, m[npo.meta_cons].id, status, offset, - netbk->meta[npo.meta_cons].size, + m[npo.meta_cons].size, flags); - if (netbk->meta[npo.meta_cons].gso_size && !vif->gso_prefix) { + if (m[npo.meta_cons].gso_size && !vif->gso_prefix) { struct xen_netif_extra_info *gso = (struct xen_netif_extra_info *) RING_GET_RESPONSE(&vif->rx, @@ -593,7 +589,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) resp->flags |= XEN_NETRXF_extra_info; - gso->u.gso.size = netbk->meta[npo.meta_cons].gso_size; + gso->u.gso.size = m[npo.meta_cons].gso_size; gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4; gso->u.gso.pad = 0; gso->u.gso.features = 0; @@ -603,7 +599,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) } netbk_add_frag_responses(vif, status, - netbk->meta + npo.meta_cons + 1, + m + npo.meta_cons + 1, sco->meta_slots_used); RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret); @@ -621,6 +617,9 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) if (!skb_queue_empty(&netbk->rx_queue)) xen_netbk_kick_thread(netbk); + + put_cpu_ptr(gco); + put_cpu_ptr(m); } void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) @@ -1052,9 +1051,10 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) return false; } -static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) +static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, + struct gnttab_copy *tco) { - struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; + struct gnttab_copy *gop = tco, *request_gop; struct sk_buff *skb; int ret; struct xenvif *vif = netbk->vif; @@ -1214,17 +1214,18 @@ static unsigned 
xen_netbk_tx_build_gops(struct xen_netbk *netbk) vif->tx.req_cons = idx; xen_netbk_check_rx_xenvif(vif); - if ((gop-netbk->tx_copy_ops) >= ARRAY_SIZE(netbk->tx_copy_ops)) + if ((gop - tco) >= MAX_PENDING_REQS) break; } - return gop - netbk->tx_copy_ops; + return gop - tco; } static void xen_netbk_tx_submit(struct xen_netbk *netbk, + struct gnttab_copy *tco, int *work_done, int budget) { - struct gnttab_copy *gop = netbk->tx_copy_ops; + struct gnttab_copy *gop = tco; struct sk_buff *skb; struct xenvif *vif = netbk->vif; @@ -1305,19 +1306,25 @@ void xen_netbk_tx_action(struct xen_netbk *netbk, int *work_done, int budget) { unsigned nr_gops; int ret; + struct gnttab_copy *tco; if (unlikely(!tx_work_todo(netbk))) return; - nr_gops = xen_netbk_tx_build_gops(netbk); + tco = get_cpu_ptr(tx_copy_ops); - if (nr_gops == 0) - return; - ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, - netbk->tx_copy_ops, nr_gops); + nr_gops = xen_netbk_tx_build_gops(netbk, tco); + + if (nr_gops == 0) { + put_cpu_ptr(tco); + return 0; + } + + ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, tco, nr_gops); BUG_ON(ret); - xen_netbk_tx_submit(netbk, work_done, budget); + xen_netbk_tx_submit(netbk, tco, work_done, budget); + put_cpu_ptr(tco); } static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) @@ -1503,17 +1510,47 @@ int xen_netbk_kthread(void *data) static int __init netback_init(void) { - int rc = 0; + int rc = -ENOMEM; if (!xen_domain()) return -ENODEV; + tx_copy_ops = __alloc_percpu(sizeof(struct gnttab_copy) + * MAX_PENDING_REQS, + __alignof__(struct gnttab_copy)); + if (!tx_copy_ops) + goto failed_init; + + grant_copy_op = __alloc_percpu(sizeof(struct gnttab_copy) + * 2 * XEN_NETIF_RX_RING_SIZE, + __alignof__(struct gnttab_copy)); + if (!grant_copy_op) + goto failed_init_gco; + + meta = __alloc_percpu(sizeof(struct netbk_rx_meta) + * 2 * XEN_NETIF_RX_RING_SIZE, + __alignof__(struct netbk_rx_meta)); + if (!meta) + goto failed_init_meta; + rc = page_pool_init(); if (rc) 
- goto failed_init; + goto failed_init_pool; + + rc = xenvif_xenbus_init(); + if (rc) + goto failed_init_xenbus; - return xenvif_xenbus_init(); + return rc; +failed_init_xenbus: + page_pool_destroy(); +failed_init_pool: + free_percpu(meta); +failed_init_meta: + free_percpu(grant_copy_op); +failed_init_gco: + free_percpu(tx_copy_ops); failed_init: return rc; @@ -1525,6 +1562,9 @@ static void __exit netback_exit(void) { xenvif_xenbus_exit(); page_pool_destroy(); + free_percpu(meta); + free_percpu(grant_copy_op); + free_percpu(tx_copy_ops); } module_exit(netback_exit); -- 1.7.2.5
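The unwind order in netback_init() above (three scratch buffers allocated in sequence, released in reverse through a ladder of goto labels) can be tried out in user space with a rough sketch like the one below. It assumes malloc() in place of __alloc_percpu(), arbitrary buffer sizes, and a hypothetical fail_step parameter that injects an allocation failure; none of these come from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container for the three scratch buffers. */
struct scratch {
	void *tx_copy_ops;
	void *grant_copy_op;
	void *meta;
};

/* malloc() wrapper with failure injection, for illustration only. */
static void *toy_alloc(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

/*
 * Allocate in order and unwind in reverse on failure.
 * fail_step: 0 = no failure, 1..3 = make that allocation fail.
 */
static int scratch_init(struct scratch *s, int fail_step)
{
	assert(s != NULL);
	s->tx_copy_ops = s->grant_copy_op = s->meta = NULL;

	s->tx_copy_ops = toy_alloc(256, fail_step == 1);
	if (!s->tx_copy_ops)
		goto failed_init;

	s->grant_copy_op = toy_alloc(512, fail_step == 2);
	if (!s->grant_copy_op)
		goto failed_init_gco;

	s->meta = toy_alloc(512, fail_step == 3);
	if (!s->meta)
		goto failed_init_meta;

	return 0;

failed_init_meta:
	free(s->grant_copy_op);
	s->grant_copy_op = NULL;
failed_init_gco:
	free(s->tx_copy_ops);
	s->tx_copy_ops = NULL;
failed_init:
	return -ENOMEM;
}

/* Release everything in reverse allocation order, as netback_exit() does. */
static void scratch_exit(struct scratch *s)
{
	assert(s != NULL);
	free(s->meta);
	free(s->grant_copy_op);
	free(s->tx_copy_ops);
	s->meta = s->grant_copy_op = s->tx_copy_ops = NULL;
}
```

Each label frees only what was successfully allocated before the failing step, so a failure at any step leaves no buffer behind.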
Wei Liu
2012-Jan-17 13:47 UTC
[RFC PATCH V2 5/8] netback: add module get/put operations along with vif connect/disconnect.
If a vif is running and the user unloads netback, it will certainly cause problems: the guest's network interface just mysteriously stops working. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/interface.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index dfc04f8..7c86187 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -323,6 +323,8 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, if (vif->irq) return 0; + __module_get(THIS_MODULE); + err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref); if (err < 0) goto err; @@ -405,4 +407,6 @@ void xenvif_disconnect(struct xenvif *vif) xen_netbk_unmap_frontend_rings(vif); free_netdev(vif->dev); + + module_put(THIS_MODULE); } -- 1.7.2.5
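The idea behind pairing a module reference with connect/disconnect can be shown with a small user-space model. This is a sketch only: toy_module and the toy_* helpers are hypothetical and merely mimic a reference count; they are not the kernel's module API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a module's reference count. */
struct toy_module {
	int refcnt;
};

static void toy_module_get(struct toy_module *m)
{
	m->refcnt++;
}

static void toy_module_put(struct toy_module *m)
{
	assert(m->refcnt > 0); /* every put must match an earlier get */
	m->refcnt--;
}

/* Unloading is only permitted while nobody holds a reference. */
static bool toy_module_can_unload(const struct toy_module *m)
{
	return m->refcnt == 0;
}

/* Each connected vif pins the module; disconnecting releases it. */
static void toy_vif_connect(struct toy_module *m)
{
	toy_module_get(m);
}

static void toy_vif_disconnect(struct toy_module *m)
{
	toy_module_put(m);
}
```

As long as at least one vif is connected the count stays above zero, so an unload request would be refused rather than pulling the backend out from under a running guest.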
In the 1:1 model, there is no need to keep xen_netbk and xenvif separated. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netback/common.h | 36 +++--- drivers/net/xen-netback/interface.c | 36 +++---- drivers/net/xen-netback/netback.c | 207 +++++++++++++---------------------- drivers/net/xen-netback/page_pool.c | 10 +- drivers/net/xen-netback/page_pool.h | 13 ++- 5 files changed, 120 insertions(+), 182 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 3b85563..17d4e1a 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -45,34 +45,29 @@ #include <xen/grant_table.h> #include <xen/xenbus.h> +#include "page_pool.h" + struct netbk_rx_meta { int id; int size; int gso_size; }; -#define MAX_PENDING_REQS 256 - /* Discriminate from any valid pending_idx value. */ #define INVALID_PENDING_IDX 0xFFFF #define MAX_BUFFER_OFFSET PAGE_SIZE -struct pending_tx_info { - struct xen_netif_tx_request req; -}; -typedef unsigned int pending_ring_idx_t; +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) -struct xen_netbk; +#define MAX_PENDING_REQS 256 struct xenvif { /* Unique identifier for this interface. */ domid_t domid; unsigned int handle; - /* Reference to netback processing backend. */ - struct xen_netbk *netbk; - /* Use NAPI for guest TX */ struct napi_struct napi; /* Use kthread for guest RX */ @@ -115,6 +110,16 @@ struct xenvif { /* Miscellaneous private stuff. 
*/ struct net_device *dev; + + struct sk_buff_head rx_queue; + struct sk_buff_head tx_queue; + + idx_t mmap_pages[MAX_PENDING_REQS]; + + pending_ring_idx_t pending_prod; + pending_ring_idx_t pending_cons; + + u16 pending_ring[MAX_PENDING_REQS]; }; static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) @@ -122,9 +127,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif) return to_xenbus_device(vif->dev->dev.parent); } -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) - struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, unsigned int handle); @@ -161,12 +163,8 @@ void xenvif_notify_tx_completion(struct xenvif *vif); /* Returns number of ring slots required to send an skb to the frontend */ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); -/* Allocate and free xen_netbk structure */ -struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif); -void xen_netbk_free_netbk(struct xen_netbk *netbk); - -void xen_netbk_tx_action(struct xen_netbk *netbk, int *work_done, int budget); -void xen_netbk_rx_action(struct xen_netbk *netbk); +void xen_netbk_tx_action(struct xenvif *vif, int *work_done, int budget); +void xen_netbk_rx_action(struct xenvif *vif); int xen_netbk_kthread(void *data); diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 7c86187..11e638b 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -55,9 +55,6 @@ static irqreturn_t xenvif_interrupt(int irq, void *dev_id) { struct xenvif *vif = dev_id; - if (vif->netbk == NULL) - return IRQ_NONE; - if (xenvif_rx_schedulable(vif)) netif_wake_queue(vif->dev); @@ -72,7 +69,7 @@ static int xenvif_poll(struct napi_struct *napi, int budget) struct xenvif *vif = container_of(napi, struct xenvif, napi); int work_done = 0; - 
xen_netbk_tx_action(vif->netbk, &work_done, budget); + xen_netbk_tx_action(vif, &work_done, budget); if (work_done < budget) { int more_to_do = 0; @@ -95,7 +92,8 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) BUG_ON(skb->dev != dev); - if (vif->netbk == NULL) + /* Drop the packet if vif is not ready */ + if (vif->task == NULL) goto drop; /* Drop the packet if the target domain has no receive buffers. */ @@ -257,6 +255,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, int err; struct net_device *dev; struct xenvif *vif; + int i; char name[IFNAMSIZ] = {}; snprintf(name, IFNAMSIZ - 1, "vif%u.%u", domid, handle); @@ -271,7 +270,6 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, vif = netdev_priv(dev); vif->domid = domid; vif->handle = handle; - vif->netbk = NULL; vif->can_sg = 1; vif->csum = 1; @@ -290,6 +288,17 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, dev->tx_queue_len = XENVIF_QUEUE_LENGTH; + skb_queue_head_init(&vif->rx_queue); + skb_queue_head_init(&vif->tx_queue); + + vif->pending_cons = 0; + vif->pending_prod = MAX_PENDING_REQS; + for (i = 0; i < MAX_PENDING_REQS; i++) + vif->pending_ring[i] = i; + + for (i = 0; i < MAX_PENDING_REQS; i++) + vif->mmap_pages[i] = INVALID_ENTRY; + /* * Initialise a dummy MAC address. 
We choose the numerically * largest non-broadcast address to prevent the address getting @@ -337,14 +346,6 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, vif->irq = err; disable_irq(vif->irq); - vif->netbk = xen_netbk_alloc_netbk(vif); - if (!vif->netbk) { - pr_warn("Could not allocate xen_netbk\n"); - err = -ENOMEM; - goto err_unbind; - } - - init_waitqueue_head(&vif->wq); vif->task = kthread_create(xen_netbk_kthread, (void *)vif, @@ -352,7 +353,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, if (IS_ERR(vif->task)) { pr_warn("Could not create kthread\n"); err = PTR_ERR(vif->task); - goto err_free_netbk; + goto err_unbind; } rtnl_lock(); @@ -367,8 +368,6 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, wake_up_process(vif->task); return 0; -err_free_netbk: - xen_netbk_free_netbk(vif->netbk); err_unbind: unbind_from_irqhandler(vif->irq, vif); err_unmap: @@ -392,9 +391,6 @@ void xenvif_disconnect(struct xenvif *vif) if (vif->task) kthread_stop(vif->task); - if (vif->netbk) - xen_netbk_free_netbk(vif->netbk); - netif_napi_del(&vif->napi); del_timer_sync(&vif->credit_timeout); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 714f508..1842e4e 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -59,28 +59,13 @@ struct gnttab_copy *tx_copy_ops; struct gnttab_copy *grant_copy_op; struct netbk_rx_meta *meta; - -struct xen_netbk { - struct sk_buff_head rx_queue; - struct sk_buff_head tx_queue; - - idx_t mmap_pages[MAX_PENDING_REQS]; - - pending_ring_idx_t pending_prod; - pending_ring_idx_t pending_cons; - - struct xenvif *vif; - - u16 pending_ring[MAX_PENDING_REQS]; -}; - -static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx); +static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx); static void make_tx_response(struct xenvif *vif, struct xen_netif_tx_request *txp, s8 st); -static inline int 
tx_work_todo(struct xen_netbk *netbk); -static inline int rx_work_todo(struct xen_netbk *netbk); +static inline int tx_work_todo(struct xenvif *vif); +static inline int rx_work_todo(struct xenvif *vif); static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, u16 id, @@ -89,16 +74,16 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, u16 size, u16 flags); -static inline unsigned long idx_to_pfn(struct xen_netbk *netbk, +static inline unsigned long idx_to_pfn(struct xenvif *vif, u16 idx) { - return page_to_pfn(to_page(netbk->mmap_pages[idx])); + return page_to_pfn(to_page(vif->mmap_pages[idx])); } -static inline unsigned long idx_to_kaddr(struct xen_netbk *netbk, +static inline unsigned long idx_to_kaddr(struct xenvif *vif, u16 idx) { - return (unsigned long)pfn_to_kaddr(idx_to_pfn(netbk, idx)); + return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx)); } /* @@ -126,10 +111,10 @@ static inline pending_ring_idx_t pending_index(unsigned i) return i & (MAX_PENDING_REQS-1); } -static inline pending_ring_idx_t nr_pending_reqs(struct xen_netbk *netbk) +static inline pending_ring_idx_t nr_pending_reqs(struct xenvif *vif) { return MAX_PENDING_REQS - - netbk->pending_prod + netbk->pending_cons; + vif->pending_prod + vif->pending_cons; } static int max_required_rx_slots(struct xenvif *vif) @@ -475,16 +460,13 @@ struct skb_cb_overlay { int meta_slots_used; }; -static void xen_netbk_kick_thread(struct xen_netbk *netbk) +static void xen_netbk_kick_thread(struct xenvif *vif) { - struct xenvif *vif = netbk->vif; - wake_up(&vif->wq); } -void xen_netbk_rx_action(struct xen_netbk *netbk) +void xen_netbk_rx_action(struct xenvif *vif) { - struct xenvif *vif = NULL; s8 status; u16 flags; struct xen_netif_rx_response *resp; @@ -510,7 +492,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) count = 0; - while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) { + while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) { vif = netdev_priv(skb->dev); 
nr_frags = skb_shinfo(skb)->nr_frags; @@ -542,7 +524,7 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) while ((skb = __skb_dequeue(&rxq)) != NULL) { sco = (struct skb_cb_overlay *)skb->cb; - vif = netdev_priv(skb->dev); + /* vif = netdev_priv(skb->dev); */ if (m[npo.meta_cons].gso_size && vif->gso_prefix) { resp = RING_GET_RESPONSE(&vif->rx, @@ -615,8 +597,8 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) if (need_to_notify) notify_remote_via_irq(vif->irq); - if (!skb_queue_empty(&netbk->rx_queue)) - xen_netbk_kick_thread(netbk); + if (!skb_queue_empty(&vif->rx_queue)) + xen_netbk_kick_thread(vif); put_cpu_ptr(gco); put_cpu_ptr(m); @@ -624,11 +606,9 @@ void xen_netbk_rx_action(struct xen_netbk *netbk) void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) { - struct xen_netbk *netbk = vif->netbk; - - skb_queue_tail(&netbk->rx_queue, skb); + skb_queue_tail(&vif->rx_queue, skb); - xen_netbk_kick_thread(netbk); + xen_netbk_kick_thread(vif); } void xen_netbk_check_rx_xenvif(struct xenvif *vif) @@ -727,21 +707,20 @@ static int netbk_count_requests(struct xenvif *vif, return frags; } -static struct page *xen_netbk_alloc_page(struct xen_netbk *netbk, +static struct page *xen_netbk_alloc_page(struct xenvif *vif, struct sk_buff *skb, u16 pending_idx) { struct page *page; int idx; - page = page_pool_get(netbk, &idx); + page = page_pool_get(vif, &idx); if (!page) return NULL; - netbk->mmap_pages[pending_idx] = idx; + vif->mmap_pages[pending_idx] = idx; return page; } -static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, - struct xenvif *vif, +static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, struct sk_buff *skb, struct xen_netif_tx_request *txp, struct gnttab_copy *gop) @@ -760,13 +739,13 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, int idx; struct pending_tx_info *pending_tx_info; - index = pending_index(netbk->pending_cons++); - pending_idx = netbk->pending_ring[index]; - page 
= xen_netbk_alloc_page(netbk, skb, pending_idx); + index = pending_index(vif->pending_cons++); + pending_idx = vif->pending_ring[index]; + page = xen_netbk_alloc_page(vif, skb, pending_idx); if (!page) return NULL; - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); gop->source.u.ref = txp->gref; @@ -790,7 +769,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk, return gop; } -static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, +static int xen_netbk_tx_check_gop(struct xenvif *vif, struct sk_buff *skb, struct gnttab_copy **gopp) { @@ -798,8 +777,6 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, u16 pending_idx = *((u16 *)skb->data); struct pending_tx_info *pending_tx_info; int idx; - struct xenvif *vif = netbk->vif; - struct xen_netif_tx_request *txp; struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -809,12 +786,12 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, err = gop->status; if (unlikely(err)) { pending_ring_idx_t index; - index = pending_index(netbk->pending_prod++); - idx = netbk->mmap_pages[index]; + index = pending_index(vif->pending_prod++); + idx = vif->mmap_pages[index]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); - netbk->pending_ring[index] = pending_idx; + vif->pending_ring[index] = pending_idx; } /* Skip first skb fragment if it is on same page as header fragment. */ @@ -831,16 +808,16 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, if (likely(!newerr)) { /* Had a previous error? Invalidate this fragment. */ if (unlikely(err)) - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); continue; } /* Error on this fragment: respond to client with an error. 
*/ - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; txp = &to_txinfo(idx)->req; make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR); - index = pending_index(netbk->pending_prod++); - netbk->pending_ring[index] = pending_idx; + index = pending_index(vif->pending_prod++); + vif->pending_ring[index] = pending_idx; /* Not the first error? Preceding frags already invalidated. */ if (err) @@ -848,10 +825,10 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, /* First error: invalidate header and preceding fragments. */ pending_idx = *((u16 *)skb->data); - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); for (j = start; j < i; j++) { pending_idx = frag_get_pending_idx(&shinfo->frags[j]); - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); } /* Remember the error: invalidate all subsequent fragments. */ @@ -862,7 +839,7 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk, return err; } -static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) +static void xen_netbk_fill_frags(struct xenvif *vif, struct sk_buff *skb) { struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -878,11 +855,11 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) pending_idx = frag_get_pending_idx(frag); - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; - page = virt_to_page(idx_to_kaddr(netbk, pending_idx)); + page = virt_to_page(idx_to_kaddr(vif, pending_idx)); __skb_fill_page_desc(skb, i, page, txp->offset, txp->size); skb->len += txp->size; skb->data_len += txp->size; @@ -890,7 +867,7 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb) /* Take an extra reference to offset xen_netbk_idx_release */ get_page(page); - xen_netbk_idx_release(netbk, pending_idx); + 
xen_netbk_idx_release(vif, pending_idx); } } @@ -1051,15 +1028,14 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) return false; } -static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, +static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, struct gnttab_copy *tco) { struct gnttab_copy *gop = tco, *request_gop; struct sk_buff *skb; int ret; - struct xenvif *vif = netbk->vif; - while ((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) { + while ((nr_pending_reqs(vif) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) { struct xen_netif_tx_request txreq; struct xen_netif_tx_request txfrags[MAX_SKB_FRAGS]; struct page *page; @@ -1127,8 +1103,8 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, break; } - index = pending_index(netbk->pending_cons); - pending_idx = netbk->pending_ring[index]; + index = pending_index(vif->pending_cons); + pending_idx = vif->pending_ring[index]; data_len = (txreq.size > PKT_PROT_LEN && ret < MAX_SKB_FRAGS) ? @@ -1158,7 +1134,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, } /* XXX could copy straight to head */ - page = xen_netbk_alloc_page(netbk, skb, pending_idx); + page = xen_netbk_alloc_page(vif, skb, pending_idx); if (!page) { kfree_skb(skb); netbk_tx_err(vif, &txreq, idx); @@ -1178,7 +1154,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, gop++; - pool_idx = netbk->mmap_pages[pending_idx]; + pool_idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(pool_idx); memcpy(&pending_tx_info->req, @@ -1198,11 +1174,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk, INVALID_PENDING_IDX); } - __skb_queue_tail(&netbk->tx_queue, skb); + __skb_queue_tail(&vif->tx_queue, skb); - netbk->pending_cons++; + vif->pending_cons++; - request_gop = xen_netbk_get_requests(netbk, vif, + request_gop = xen_netbk_get_requests(vif, skb, txfrags, gop); if (request_gop == NULL) { kfree_skb(skb); @@ -1221,16 +1197,15 @@ static unsigned 
xen_netbk_tx_build_gops(struct xen_netbk *netbk, return gop - tco; } -static void xen_netbk_tx_submit(struct xen_netbk *netbk, +static void xen_netbk_tx_submit(struct xenvif *vif, struct gnttab_copy *tco, int *work_done, int budget) { struct gnttab_copy *gop = tco; struct sk_buff *skb; - struct xenvif *vif = netbk->vif; while ((*work_done < budget) && - (skb = __skb_dequeue(&netbk->tx_queue)) != NULL) { + (skb = __skb_dequeue(&vif->tx_queue)) != NULL) { struct xen_netif_tx_request *txp; u16 pending_idx; unsigned data_len; @@ -1239,13 +1214,13 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk, pending_idx = *((u16 *)skb->data); - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); txp = &pending_tx_info->req; /* Check the remap error code. */ - if (unlikely(xen_netbk_tx_check_gop(netbk, skb, &gop))) { + if (unlikely(xen_netbk_tx_check_gop(vif, skb, &gop))) { netdev_dbg(vif->dev, "netback grant failed.\n"); skb_shinfo(skb)->nr_frags = 0; kfree_skb(skb); @@ -1254,7 +1229,7 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk, data_len = skb->len; memcpy(skb->data, - (void *)(idx_to_kaddr(netbk, pending_idx)|txp->offset), + (void *)(idx_to_kaddr(vif, pending_idx)|txp->offset), data_len); if (data_len < txp->size) { /* Append the packet payload as a fragment. */ @@ -1262,7 +1237,7 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk, txp->size -= data_len; } else { /* Schedule a response immediately. 
*/ - xen_netbk_idx_release(netbk, pending_idx); + xen_netbk_idx_release(vif, pending_idx); } if (txp->flags & XEN_NETTXF_csum_blank) @@ -1270,7 +1245,7 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk, else if (txp->flags & XEN_NETTXF_data_validated) skb->ip_summed = CHECKSUM_UNNECESSARY; - xen_netbk_fill_frags(netbk, skb); + xen_netbk_fill_frags(vif, skb); /* * If the initial fragment was < PKT_PROT_LEN then @@ -1302,18 +1277,18 @@ static void xen_netbk_tx_submit(struct xen_netbk *netbk, } /* Called after netfront has transmitted */ -void xen_netbk_tx_action(struct xen_netbk *netbk, int *work_done, int budget) +void xen_netbk_tx_action(struct xenvif *vif, int *work_done, int budget) { unsigned nr_gops; int ret; struct gnttab_copy *tco; - if (unlikely(!tx_work_todo(netbk))) + if (unlikely(!tx_work_todo(vif))) return; tco = get_cpu_ptr(tx_copy_ops); - nr_gops = xen_netbk_tx_build_gops(netbk, tco); + nr_gops = xen_netbk_tx_build_gops(vif, tco); if (nr_gops == 0) { put_cpu_ptr(tco); @@ -1323,32 +1298,31 @@ void xen_netbk_tx_action(struct xen_netbk *netbk, int *work_done, int budget) ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, tco, nr_gops); BUG_ON(ret); - xen_netbk_tx_submit(netbk, tco, work_done, budget); + xen_netbk_tx_submit(vif, tco, work_done, budget); put_cpu_ptr(tco); } -static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx) +static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx) { - struct xenvif *vif = netbk->vif; struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; int idx; /* Already complete? 
*/ - if (netbk->mmap_pages[pending_idx] == INVALID_ENTRY) + if (vif->mmap_pages[pending_idx] == INVALID_ENTRY) return; - idx = netbk->mmap_pages[pending_idx]; + idx = vif->mmap_pages[pending_idx]; pending_tx_info = to_txinfo(idx); make_tx_response(vif, &pending_tx_info->req, XEN_NETIF_RSP_OKAY); - index = pending_index(netbk->pending_prod++); - netbk->pending_ring[index] = pending_idx; + index = pending_index(vif->pending_prod++); + vif->pending_ring[index] = pending_idx; - page_pool_put(netbk->mmap_pages[pending_idx]); + page_pool_put(vif->mmap_pages[pending_idx]); - netbk->mmap_pages[pending_idx] = INVALID_ENTRY; + vif->mmap_pages[pending_idx] = INVALID_ENTRY; } static void make_tx_response(struct xenvif *vif, @@ -1395,15 +1369,15 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, return resp; } -static inline int rx_work_todo(struct xen_netbk *netbk) +static inline int rx_work_todo(struct xenvif *vif) { - return !skb_queue_empty(&netbk->rx_queue); + return !skb_queue_empty(&vif->rx_queue); } -static inline int tx_work_todo(struct xen_netbk *netbk) +static inline int tx_work_todo(struct xenvif *vif) { - if (likely(RING_HAS_UNCONSUMED_REQUESTS(&netbk->vif->tx)) && - (nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) + if (likely(RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)) && + (nr_pending_reqs(vif) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) return 1; return 0; @@ -1454,54 +1428,21 @@ err: return err; } -struct xen_netbk *xen_netbk_alloc_netbk(struct xenvif *vif) -{ - int i; - struct xen_netbk *netbk; - - netbk = vzalloc(sizeof(struct xen_netbk)); - if (!netbk) { - printk(KERN_ALERT "%s: out of memory\n", __func__); - return NULL; - } - - netbk->vif = vif; - - skb_queue_head_init(&netbk->rx_queue); - skb_queue_head_init(&netbk->tx_queue); - - netbk->pending_cons = 0; - netbk->pending_prod = MAX_PENDING_REQS; - for (i = 0; i < MAX_PENDING_REQS; i++) - netbk->pending_ring[i] = i; - - for (i = 0; i < MAX_PENDING_REQS; i++) - 
netbk->mmap_pages[i] = INVALID_ENTRY; - - return netbk; -} - -void xen_netbk_free_netbk(struct xen_netbk *netbk) -{ - vfree(netbk); -} - int xen_netbk_kthread(void *data) { struct xenvif *vif = data; - struct xen_netbk *netbk = vif->netbk; while (!kthread_should_stop()) { wait_event_interruptible(vif->wq, - rx_work_todo(netbk) || + rx_work_todo(vif) || kthread_should_stop()); cond_resched(); if (kthread_should_stop()) break; - if (rx_work_todo(netbk)) - xen_netbk_rx_action(netbk); + if (rx_work_todo(vif)) + xen_netbk_rx_action(vif); } return 0; diff --git a/drivers/net/xen-netback/page_pool.c b/drivers/net/xen-netback/page_pool.c index 294f48b..ce00a93 100644 --- a/drivers/net/xen-netback/page_pool.c +++ b/drivers/net/xen-netback/page_pool.c @@ -102,7 +102,7 @@ int is_in_pool(struct page *page, int *pidx) return get_page_ext(page, pidx); } -struct page *page_pool_get(struct xen_netbk *netbk, int *pidx) +struct page *page_pool_get(struct xenvif *vif, int *pidx) { int idx; struct page *page; @@ -118,7 +118,7 @@ struct page *page_pool_get(struct xen_netbk *netbk, int *pidx) } set_page_ext(page, idx); - pool[idx].u.netbk = netbk; + pool[idx].u.vif = vif; pool[idx].page = page; *pidx = idx; @@ -131,7 +131,7 @@ void page_pool_put(int idx) struct page *page = pool[idx].page; pool[idx].page = NULL; - pool[idx].u.netbk = NULL; + pool[idx].u.vif = NULL; page->mapping = 0; put_page(page); put_free_entry(idx); @@ -174,9 +174,9 @@ struct page *to_page(int idx) return pool[idx].page; } -struct xen_netbk *to_netbk(int idx) +struct xenvif *to_vif(int idx) { - return pool[idx].u.netbk; + return pool[idx].u.vif; } struct pending_tx_info *to_txinfo(int idx) diff --git a/drivers/net/xen-netback/page_pool.h b/drivers/net/xen-netback/page_pool.h index 572b037..efae17c 100644 --- a/drivers/net/xen-netback/page_pool.h +++ b/drivers/net/xen-netback/page_pool.h @@ -27,7 +27,10 @@ #ifndef __PAGE_POOL_H__ #define __PAGE_POOL_H__ -#include "common.h" +struct pending_tx_info { + struct 
xen_netif_tx_request req; +}; +typedef unsigned int pending_ring_idx_t; typedef uint32_t idx_t; @@ -38,8 +41,8 @@ struct page_pool_entry { struct page *page; struct pending_tx_info tx_info; union { - struct xen_netbk *netbk; - idx_t fl; + struct xenvif *vif; + idx_t fl; } u; }; @@ -52,12 +55,12 @@ int page_pool_init(void); void page_pool_destroy(void); -struct page *page_pool_get(struct xen_netbk *netbk, int *pidx); +struct page *page_pool_get(struct xenvif *vif, int *pidx); void page_pool_put(int idx); int is_in_pool(struct page *page, int *pidx); struct page *to_page(int idx); -struct xen_netbk *to_netbk(int idx); +struct xenvif *to_vif(int idx); struct pending_tx_info *to_txinfo(int idx); #endif /* __PAGE_POOL_H__ */ -- 1.7.2.5
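For readers following the pending-ring bookkeeping that this patch moves from xen_netbk into xenvif, here is a minimal userspace sketch of the index arithmetic. It is an illustration only, not code from the series: struct vif_ring, ring_init, ring_take and ring_release are hypothetical names, and the value 256 for MAX_PENDING_REQS is an assumption (the mask in pending_index only requires a power of two). The initial state mirrors the removed xen_netbk_alloc_netbk: pending_cons = 0, pending_prod = MAX_PENDING_REQS, pending_ring[i] = i.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; must be a power of two for the mask below. */
#define MAX_PENDING_REQS 256

typedef unsigned int pending_ring_idx_t;

/* Hypothetical stand-in for the ring fields now held in struct xenvif. */
struct vif_ring {
	uint16_t pending_ring[MAX_PENDING_REQS];
	pending_ring_idx_t pending_prod;
	pending_ring_idx_t pending_cons;
};

/* Map a free-running counter onto a ring slot. */
static inline pending_ring_idx_t pending_index(unsigned i)
{
	return i & (MAX_PENDING_REQS - 1);
}

/* Number of slots currently handed out (same formula as in the patch). */
static inline pending_ring_idx_t nr_pending_reqs(const struct vif_ring *vif)
{
	return MAX_PENDING_REQS - vif->pending_prod + vif->pending_cons;
}

/* Same initial state as the removed xen_netbk_alloc_netbk(). */
static inline void ring_init(struct vif_ring *vif)
{
	int i;

	vif->pending_cons = 0;
	vif->pending_prod = MAX_PENDING_REQS;
	for (i = 0; i < MAX_PENDING_REQS; i++)
		vif->pending_ring[i] = i;
}

/* Hypothetical helper: take the next free pending_idx (consumer side). */
static inline uint16_t ring_take(struct vif_ring *vif)
{
	return vif->pending_ring[pending_index(vif->pending_cons++)];
}

/* Hypothetical helper: hand a pending_idx back (producer side). */
static inline void ring_release(struct vif_ring *vif, uint16_t pending_idx)
{
	vif->pending_ring[pending_index(vif->pending_prod++)] = pending_idx;
}
```

Because the counters are unsigned and free-running, the subtraction in nr_pending_reqs stays correct after pending_prod passes MAX_PENDING_REQS; only pending_index ever reduces a counter to a slot number.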
Wei Liu
2012-Jan-17 13:47 UTC
[RFC PATCH V2 7/8] netback: alter internal function/structure names.
Since xen_netbk has been merged into xenvif, it is better to give the functions clearer names. Also alter the NAPI poll handler function prototypes a bit: xenvif_tx_action now returns the amount of work done instead of writing it through a work_done pointer.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h | 26 ++--
 drivers/net/xen-netback/interface.c | 20 ++--
 drivers/net/xen-netback/netback.c | 229 ++++++++++++++++++-----------------
 3 files changed, 141 insertions(+), 134 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 17d4e1a..53141c7 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -47,7 +47,7 @@ #include "page_pool.h" -struct netbk_rx_meta { +struct xenvif_rx_meta { int id; int size; int gso_size; @@ -140,32 +140,32 @@ void xenvif_xenbus_exit(void); int xenvif_schedulable(struct xenvif *vif); -int xen_netbk_rx_ring_full(struct xenvif *vif); +int xenvif_rx_ring_full(struct xenvif *vif); -int xen_netbk_must_stop_queue(struct xenvif *vif); +int xenvif_must_stop_queue(struct xenvif *vif); /* (Un)Map communication rings.
*/ -void xen_netbk_unmap_frontend_rings(struct xenvif *vif); -int xen_netbk_map_frontend_rings(struct xenvif *vif, - grant_ref_t tx_ring_ref, - grant_ref_t rx_ring_ref); +void xenvif_unmap_frontend_rings(struct xenvif *vif); +int xenvif_map_frontend_rings(struct xenvif *vif, + grant_ref_t tx_ring_ref, + grant_ref_t rx_ring_ref); /* Check for SKBs from frontend and schedule backend processing */ -void xen_netbk_check_rx_xenvif(struct xenvif *vif); +void xenvif_check_rx_xenvif(struct xenvif *vif); /* Receive an SKB from the frontend */ void xenvif_receive_skb(struct xenvif *vif, struct sk_buff *skb); /* Queue an SKB for transmission to the frontend */ -void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb); +void xenvif_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb); /* Notify xenvif that ring now has space to send an skb to the frontend */ void xenvif_notify_tx_completion(struct xenvif *vif); /* Returns number of ring slots required to send an skb to the frontend */ -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); +unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb); -void xen_netbk_tx_action(struct xenvif *vif, int *work_done, int budget); -void xen_netbk_rx_action(struct xenvif *vif); +int xenvif_tx_action(struct xenvif *vif, int budget); +void xenvif_rx_action(struct xenvif *vif); -int xen_netbk_kthread(void *data); +int xenvif_kthread(void *data); #endif /* __XEN_NETBACK__COMMON_H__ */ diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 11e638b..05caccc 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -48,7 +48,7 @@ int xenvif_schedulable(struct xenvif *vif) static int xenvif_rx_schedulable(struct xenvif *vif) { - return xenvif_schedulable(vif) && !xen_netbk_rx_ring_full(vif); + return xenvif_schedulable(vif) && !xenvif_rx_ring_full(vif); } static irqreturn_t xenvif_interrupt(int irq, void *dev_id) 
@@ -69,7 +69,7 @@ static int xenvif_poll(struct napi_struct *napi, int budget) struct xenvif *vif = container_of(napi, struct xenvif, napi); int work_done = 0; - xen_netbk_tx_action(vif, &work_done, budget); + work_done = xenvif_tx_action(vif, budget); if (work_done < budget) { int more_to_do = 0; @@ -101,12 +101,12 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev) goto drop; /* Reserve ring slots for the worst-case number of fragments. */ - vif->rx_req_cons_peek += xen_netbk_count_skb_slots(vif, skb); + vif->rx_req_cons_peek += xenvif_count_skb_slots(vif, skb); - if (vif->can_queue && xen_netbk_must_stop_queue(vif)) + if (vif->can_queue && xenvif_must_stop_queue(vif)) netif_stop_queue(dev); - xen_netbk_queue_tx_skb(vif, skb); + xenvif_queue_tx_skb(vif, skb); return NETDEV_TX_OK; @@ -137,7 +137,7 @@ static void xenvif_up(struct xenvif *vif) { napi_enable(&vif->napi); enable_irq(vif->irq); - xen_netbk_check_rx_xenvif(vif); + xenvif_check_rx_xenvif(vif); } static void xenvif_down(struct xenvif *vif) @@ -334,7 +334,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, __module_get(THIS_MODULE); - err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref); + err = xenvif_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref); if (err < 0) goto err; @@ -347,7 +347,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, disable_irq(vif->irq); init_waitqueue_head(&vif->wq); - vif->task = kthread_create(xen_netbk_kthread, + vif->task = kthread_create(xenvif_kthread, (void *)vif, "vif%d.%d", vif->domid, vif->handle); if (IS_ERR(vif->task)) { @@ -371,7 +371,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref, err_unbind: unbind_from_irqhandler(vif->irq, vif); err_unmap: - xen_netbk_unmap_frontend_rings(vif); + xenvif_unmap_frontend_rings(vif); err: return err; } @@ -400,7 +400,7 @@ void xenvif_disconnect(struct xenvif *vif) unregister_netdev(vif->dev); - xen_netbk_unmap_frontend_rings(vif); + 
xenvif_unmap_frontend_rings(vif); free_netdev(vif->dev); diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 1842e4e..fa864f4 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -57,9 +57,9 @@ struct gnttab_copy *tx_copy_ops; * straddles two buffers in the frontend. */ struct gnttab_copy *grant_copy_op; -struct netbk_rx_meta *meta; +struct xenvif_rx_meta *meta; -static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx); +static void xenvif_idx_release(struct xenvif *vif, u16 pending_idx); static void make_tx_response(struct xenvif *vif, struct xen_netif_tx_request *txp, s8 st); @@ -127,7 +127,7 @@ static int max_required_rx_slots(struct xenvif *vif) return max; } -int xen_netbk_rx_ring_full(struct xenvif *vif) +int xenvif_rx_ring_full(struct xenvif *vif) { RING_IDX peek = vif->rx_req_cons_peek; RING_IDX needed = max_required_rx_slots(vif); @@ -136,16 +136,16 @@ int xen_netbk_rx_ring_full(struct xenvif *vif) ((vif->rx.rsp_prod_pvt + XEN_NETIF_RX_RING_SIZE - peek) < needed); } -int xen_netbk_must_stop_queue(struct xenvif *vif) +int xenvif_must_stop_queue(struct xenvif *vif) { - if (!xen_netbk_rx_ring_full(vif)) + if (!xenvif_rx_ring_full(vif)) return 0; vif->rx.sring->req_event = vif->rx_req_cons_peek + max_required_rx_slots(vif); mb(); /* request notification /then/ check the queue */ - return xen_netbk_rx_ring_full(vif); + return xenvif_rx_ring_full(vif); } /* @@ -191,9 +191,9 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head) /* * Figure out how many ring slots we're going to need to send @skb to * the guest. This function is essentially a dry run of - * netbk_gop_frag_copy. + * xenvif_gop_frag_copy.
*/ -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb) +unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb) { unsigned int count; int i, copy_off; @@ -232,15 +232,15 @@ struct netrx_pending_operations { unsigned copy_prod, copy_cons; unsigned meta_prod, meta_cons; struct gnttab_copy *copy; - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; int copy_off; grant_ref_t copy_gref; }; -static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif, - struct netrx_pending_operations *npo) +static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif *vif, + struct netrx_pending_operations *npo) { - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; struct xen_netif_rx_request *req; req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++); @@ -260,13 +260,13 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif, * Set up the grant operations for this fragment. If it's a flipping * interface, we also set up the unmap request from here. */ -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, - struct netrx_pending_operations *npo, - struct page *page, unsigned long size, - unsigned long offset, int *head) +static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, + struct netrx_pending_operations *npo, + struct page *page, unsigned long size, + unsigned long offset, int *head) { struct gnttab_copy *copy_gop; - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; /* * These variables are used iff get_page_ext returns true, * in which case they are guaranteed to be initialized. @@ -344,14 +344,14 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb, * zero GSO descriptors (for non-GSO packets) or one descriptor (for * frontend-side LRO).
*/ -static int netbk_gop_skb(struct sk_buff *skb, - struct netrx_pending_operations *npo) +static int xenvif_gop_skb(struct sk_buff *skb, + struct netrx_pending_operations *npo) { struct xenvif *vif = netdev_priv(skb->dev); int nr_frags = skb_shinfo(skb)->nr_frags; int i; struct xen_netif_rx_request *req; - struct netbk_rx_meta *meta; + struct xenvif_rx_meta *meta; unsigned char *data; int head = 1; int old_meta_prod; @@ -388,30 +388,30 @@ static int netbk_gop_skb(struct sk_buff *skb, if (data + len > skb_tail_pointer(skb)) len = skb_tail_pointer(skb) - data; - netbk_gop_frag_copy(vif, skb, npo, - virt_to_page(data), len, offset, &head); + xenvif_gop_frag_copy(vif, skb, npo, + virt_to_page(data), len, offset, &head); data += len; } for (i = 0; i < nr_frags; i++) { - netbk_gop_frag_copy(vif, skb, npo, - skb_frag_page(&skb_shinfo(skb)->frags[i]), - skb_frag_size(&skb_shinfo(skb)->frags[i]), - skb_shinfo(skb)->frags[i].page_offset, - &head); + xenvif_gop_frag_copy(vif, skb, npo, + skb_frag_page(&skb_shinfo(skb)->frags[i]), + skb_frag_size(&skb_shinfo(skb)->frags[i]), + skb_shinfo(skb)->frags[i].page_offset, + &head); } return npo->meta_prod - old_meta_prod; } /* - * This is a twin to netbk_gop_skb. Assume that netbk_gop_skb was + * This is a twin to xenvif_gop_skb. Assume that xenvif_gop_skb was * used to set up the operations on the top of * netrx_pending_operations, which have since been done. Check that * they didn't give any errors and advance over them.
*/ -static int netbk_check_gop(struct xenvif *vif, int nr_meta_slots, - struct netrx_pending_operations *npo) +static int xenvif_check_gop(struct xenvif *vif, int nr_meta_slots, + struct netrx_pending_operations *npo) { struct gnttab_copy *copy_op; int status = XEN_NETIF_RSP_OKAY; @@ -430,9 +430,9 @@ static int netbk_check_gop(struct xenvif *vif, int nr_meta_slots, return status; } -static void netbk_add_frag_responses(struct xenvif *vif, int status, - struct netbk_rx_meta *meta, - int nr_meta_slots) +static void xenvif_add_frag_responses(struct xenvif *vif, int status, + struct xenvif_rx_meta *meta, + int nr_meta_slots) { int i; unsigned long offset; @@ -460,12 +460,12 @@ struct skb_cb_overlay { int meta_slots_used; }; -static void xen_netbk_kick_thread(struct xenvif *vif) +static void xenvif_kick_thread(struct xenvif *vif) { wake_up(&vif->wq); } -void xen_netbk_rx_action(struct xenvif *vif) +void xenvif_rx_action(struct xenvif *vif) { s8 status; u16 flags; @@ -481,7 +481,7 @@ void xen_netbk_rx_action(struct xenvif *vif) int need_to_notify = 0; struct gnttab_copy *gco = get_cpu_ptr(grant_copy_op); - struct netbk_rx_meta *m = get_cpu_ptr(meta); + struct xenvif_rx_meta *m = get_cpu_ptr(meta); struct netrx_pending_operations npo = { .copy = gco, @@ -497,7 +497,7 @@ void xen_netbk_rx_action(struct xenvif *vif) nr_frags = skb_shinfo(skb)->nr_frags; sco = (struct skb_cb_overlay *)skb->cb; - sco->meta_slots_used = netbk_gop_skb(skb, &npo); + sco->meta_slots_used = xenvif_gop_skb(skb, &npo); count += nr_frags + 1; @@ -544,7 +544,7 @@ void xen_netbk_rx_action(struct xenvif *vif) vif->dev->stats.tx_bytes += skb->len; vif->dev->stats.tx_packets++; - status = netbk_check_gop(vif, sco->meta_slots_used, &npo); + status = xenvif_check_gop(vif, sco->meta_slots_used, &npo); if (sco->meta_slots_used == 1) flags = 0; @@ -580,9 +580,9 @@ void xen_netbk_rx_action(struct xenvif *vif) gso->flags = 0; } - netbk_add_frag_responses(vif, status, - m + npo.meta_cons + 1, - 
sco->meta_slots_used); + xenvif_add_frag_responses(vif, status, + m + npo.meta_cons + 1, + sco->meta_slots_used); RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret); if (ret) @@ -598,20 +598,20 @@ void xen_netbk_rx_action(struct xenvif *vif) notify_remote_via_irq(vif->irq); if (!skb_queue_empty(&vif->rx_queue)) - xen_netbk_kick_thread(vif); + xenvif_kick_thread(vif); put_cpu_ptr(gco); put_cpu_ptr(m); } -void xen_netbk_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) +void xenvif_queue_tx_skb(struct xenvif *vif, struct sk_buff *skb) { skb_queue_tail(&vif->rx_queue, skb); - xen_netbk_kick_thread(vif); + xenvif_kick_thread(vif); } -void xen_netbk_check_rx_xenvif(struct xenvif *vif) +void xenvif_check_rx_xenvif(struct xenvif *vif) { int more_to_do; @@ -648,11 +648,11 @@ static void tx_credit_callback(unsigned long data) { struct xenvif *vif = (struct xenvif *)data; tx_add_credit(vif); - xen_netbk_check_rx_xenvif(vif); + xenvif_check_rx_xenvif(vif); } -static void netbk_tx_err(struct xenvif *vif, - struct xen_netif_tx_request *txp, RING_IDX end) +static void xenvif_tx_err(struct xenvif *vif, + struct xen_netif_tx_request *txp, RING_IDX end) { RING_IDX cons = vif->tx.req_cons; @@ -663,10 +663,10 @@ static void netbk_tx_err(struct xenvif *vif, txp = RING_GET_REQUEST(&vif->tx, cons++); } while (1); vif->tx.req_cons = cons; - xen_netbk_check_rx_xenvif(vif); + xenvif_check_rx_xenvif(vif); } -static int netbk_count_requests(struct xenvif *vif, +static int xenvif_count_requests(struct xenvif *vif, struct xen_netif_tx_request *first, struct xen_netif_tx_request *txp, int work_to_do) @@ -707,9 +707,9 @@ static int netbk_count_requests(struct xenvif *vif, return frags; } -static struct page *xen_netbk_alloc_page(struct xenvif *vif, - struct sk_buff *skb, - u16 pending_idx) +static struct page *xenvif_alloc_page(struct xenvif *vif, + struct sk_buff *skb, + u16 pending_idx) { struct page *page; int idx; @@ -720,10 +720,10 @@ static struct page *xen_netbk_alloc_page(struct 
xenvif *vif, return page; } -static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, - struct sk_buff *skb, - struct xen_netif_tx_request *txp, - struct gnttab_copy *gop) +static struct gnttab_copy *xenvif_get_requests(struct xenvif *vif, + struct sk_buff *skb, + struct xen_netif_tx_request *txp, + struct gnttab_copy *gop) { struct skb_shared_info *shinfo = skb_shinfo(skb); skb_frag_t *frags = shinfo->frags; @@ -741,7 +741,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, index = pending_index(vif->pending_cons++); pending_idx = vif->pending_ring[index]; - page = xen_netbk_alloc_page(vif, skb, pending_idx); + page = xenvif_alloc_page(vif, skb, pending_idx); if (!page) return NULL; @@ -769,9 +769,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xenvif *vif, return gop; } -static int xen_netbk_tx_check_gop(struct xenvif *vif, - struct sk_buff *skb, - struct gnttab_copy **gopp) +static int xenvif_tx_check_gop(struct xenvif *vif, + struct sk_buff *skb, + struct gnttab_copy **gopp) { struct gnttab_copy *gop = *gopp; u16 pending_idx = *((u16 *)skb->data); @@ -808,7 +808,7 @@ static int xen_netbk_tx_check_gop(struct xenvif *vif, if (likely(!newerr)) { /* Had a previous error? Invalidate this fragment. */ if (unlikely(err)) - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); continue; } @@ -825,10 +825,10 @@ static int xen_netbk_tx_check_gop(struct xenvif *vif, /* First error: invalidate header and preceding fragments. */ pending_idx = *((u16 *)skb->data); - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); for (j = start; j < i; j++) { pending_idx = frag_get_pending_idx(&shinfo->frags[j]); - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); } /* Remember the error: invalidate all subsequent fragments. 
*/ @@ -839,7 +839,7 @@ static int xen_netbk_tx_check_gop(struct xenvif *vif, return err; } -static void xen_netbk_fill_frags(struct xenvif *vif, struct sk_buff *skb) +static void xenvif_fill_frags(struct xenvif *vif, struct sk_buff *skb) { struct skb_shared_info *shinfo = skb_shinfo(skb); int nr_frags = shinfo->nr_frags; @@ -865,15 +865,15 @@ static void xen_netbk_fill_frags(struct xenvif *vif, struct sk_buff *skb) skb->data_len += txp->size; skb->truesize += txp->size; - /* Take an extra reference to offset xen_netbk_idx_release */ + /* Take an extra reference to offset xenvif_idx_release */ get_page(page); - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); } } -static int xen_netbk_get_extras(struct xenvif *vif, - struct xen_netif_extra_info *extras, - int work_to_do) +static int xenvif_get_extras(struct xenvif *vif, + struct xen_netif_extra_info *extras, + int work_to_do) { struct xen_netif_extra_info extra; RING_IDX cons = vif->tx.req_cons; @@ -901,9 +901,9 @@ static int xen_netbk_get_extras(struct xenvif *vif, return work_to_do; } -static int netbk_set_skb_gso(struct xenvif *vif, - struct sk_buff *skb, - struct xen_netif_extra_info *gso) +static int xenvif_set_skb_gso(struct xenvif *vif, + struct sk_buff *skb, + struct xen_netif_extra_info *gso) { if (!gso->u.gso.size) { netdev_dbg(vif->dev, "GSO size must not be zero.\n"); @@ -1028,8 +1028,8 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) return false; } -static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, - struct gnttab_copy *tco) +static unsigned xenvif_tx_build_gops(struct xenvif *vif, + struct gnttab_copy *tco) { struct gnttab_copy *gop = tco, *request_gop; struct sk_buff *skb; @@ -1070,18 +1070,18 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, memset(extras, 0, sizeof(extras)); if (txreq.flags & XEN_NETTXF_extra_info) { - work_to_do = xen_netbk_get_extras(vif, extras, + work_to_do = xenvif_get_extras(vif, extras, work_to_do); 
idx = vif->tx.req_cons; if (unlikely(work_to_do < 0)) { - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } } - ret = netbk_count_requests(vif, &txreq, txfrags, work_to_do); + ret = xenvif_count_requests(vif, &txreq, txfrags, work_to_do); if (unlikely(ret < 0)) { - netbk_tx_err(vif, &txreq, idx - ret); + xenvif_tx_err(vif, &txreq, idx - ret); break; } idx += ret; @@ -1089,7 +1089,7 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, if (unlikely(txreq.size < ETH_HLEN)) { netdev_dbg(vif->dev, "Bad packet size: %d\n", txreq.size); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1099,7 +1099,7 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, "txreq.offset: %x, size: %u, end: %lu\n", txreq.offset, txreq.size, (txreq.offset&~PAGE_MASK) + txreq.size); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1115,7 +1115,7 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, if (unlikely(skb == NULL)) { netdev_dbg(vif->dev, "Can''t allocate a skb in start_xmit.\n"); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1126,18 +1126,18 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, struct xen_netif_extra_info *gso; gso = &extras[XEN_NETIF_EXTRA_TYPE_GSO - 1]; - if (netbk_set_skb_gso(vif, skb, gso)) { + if (xenvif_set_skb_gso(vif, skb, gso)) { kfree_skb(skb); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } } /* XXX could copy straight to head */ - page = xen_netbk_alloc_page(vif, skb, pending_idx); + page = xenvif_alloc_page(vif, skb, pending_idx); if (!page) { kfree_skb(skb); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } @@ -1178,17 +1178,17 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, vif->pending_cons++; - request_gop = xen_netbk_get_requests(vif, + request_gop = xenvif_get_requests(vif, skb, txfrags, gop); if (request_gop == 
NULL) { kfree_skb(skb); - netbk_tx_err(vif, &txreq, idx); + xenvif_tx_err(vif, &txreq, idx); break; } gop = request_gop; vif->tx.req_cons = idx; - xen_netbk_check_rx_xenvif(vif); + xenvif_check_rx_xenvif(vif); if ((gop - tco) >= MAX_PENDING_REQS) break; @@ -1197,14 +1197,15 @@ static unsigned xen_netbk_tx_build_gops(struct xenvif *vif, return gop - tco; } -static void xen_netbk_tx_submit(struct xenvif *vif, - struct gnttab_copy *tco, - int *work_done, int budget) +static int xenvif_tx_submit(struct xenvif *vif, + struct gnttab_copy *tco, + int budget) { struct gnttab_copy *gop = tco; struct sk_buff *skb; + int work_done = 0; - while ((*work_done < budget) && + while ((work_done < budget) && (skb = __skb_dequeue(&vif->tx_queue)) != NULL) { struct xen_netif_tx_request *txp; u16 pending_idx; @@ -1220,7 +1221,7 @@ static void xen_netbk_tx_submit(struct xenvif *vif, txp = &pending_tx_info->req; /* Check the remap error code. */ - if (unlikely(xen_netbk_tx_check_gop(vif, skb, &gop))) { + if (unlikely(xenvif_tx_check_gop(vif, skb, &gop))) { netdev_dbg(vif->dev, "netback grant failed.\n"); skb_shinfo(skb)->nr_frags = 0; kfree_skb(skb); @@ -1237,7 +1238,7 @@ static void xen_netbk_tx_submit(struct xenvif *vif, txp->size -= data_len; } else { /* Schedule a response immediately. 
*/ - xen_netbk_idx_release(vif, pending_idx); + xenvif_idx_release(vif, pending_idx); } if (txp->flags & XEN_NETTXF_csum_blank) @@ -1245,7 +1246,7 @@ static void xen_netbk_tx_submit(struct xenvif *vif, else if (txp->flags & XEN_NETTXF_data_validated) skb->ip_summed = CHECKSUM_UNNECESSARY; - xen_netbk_fill_frags(vif, skb); + xenvif_fill_frags(vif, skb); /* * If the initial fragment was < PKT_PROT_LEN then @@ -1270,25 +1271,28 @@ static void xen_netbk_tx_submit(struct xenvif *vif, vif->dev->stats.rx_bytes += skb->len; vif->dev->stats.rx_packets++; - (*work_done)++; + work_done++; xenvif_receive_skb(vif, skb); } + + return work_done; } /* Called after netfront has transmitted */ -void xen_netbk_tx_action(struct xenvif *vif, int *work_done, int budget) +int xenvif_tx_action(struct xenvif *vif, int budget) { unsigned nr_gops; int ret; struct gnttab_copy *tco; + int work_done; if (unlikely(!tx_work_todo(vif))) - return; + return 0; tco = get_cpu_ptr(tx_copy_ops); - nr_gops = xen_netbk_tx_build_gops(vif, tco); + nr_gops = xenvif_tx_build_gops(vif, tco); if (nr_gops == 0) { put_cpu_ptr(tco); @@ -1298,11 +1302,14 @@ void xen_netbk_tx_action(struct xenvif *vif, int *work_done, int budget) ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, tco, nr_gops); BUG_ON(ret); - xen_netbk_tx_submit(vif, tco, work_done, budget); + work_done = xenvif_tx_submit(vif, tco, budget); + put_cpu_ptr(tco); + + return work_done; } -static void xen_netbk_idx_release(struct xenvif *vif, u16 pending_idx) +static void xenvif_idx_release(struct xenvif *vif, u16 pending_idx) { struct pending_tx_info *pending_tx_info; pending_ring_idx_t index; @@ -1383,7 +1390,7 @@ static inline int tx_work_todo(struct xenvif *vif) return 0; } -void xen_netbk_unmap_frontend_rings(struct xenvif *vif) +void xenvif_unmap_frontend_rings(struct xenvif *vif) { if (vif->tx.sring) xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), @@ -1393,9 +1400,9 @@ void xen_netbk_unmap_frontend_rings(struct xenvif *vif) vif->rx.sring); } 
-int xen_netbk_map_frontend_rings(struct xenvif *vif, - grant_ref_t tx_ring_ref, - grant_ref_t rx_ring_ref) +int xenvif_map_frontend_rings(struct xenvif *vif, + grant_ref_t tx_ring_ref, + grant_ref_t rx_ring_ref) { void *addr; struct xen_netif_tx_sring *txs; @@ -1424,11 +1431,11 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, return 0; err: - xen_netbk_unmap_frontend_rings(vif); + xenvif_unmap_frontend_rings(vif); return err; } -int xen_netbk_kthread(void *data) +int xenvif_kthread(void *data) { struct xenvif *vif = data; @@ -1442,7 +1449,7 @@ int xen_netbk_kthread(void *data) break; if (rx_work_todo(vif)) - xen_netbk_rx_action(vif); + xenvif_rx_action(vif); } return 0; @@ -1468,9 +1475,9 @@ static int __init netback_init(void) if (!grant_copy_op) goto failed_init_gco; - meta = __alloc_percpu(sizeof(struct netbk_rx_meta) + meta = __alloc_percpu(sizeof(struct xenvif_rx_meta) * 2 * XEN_NETIF_RX_RING_SIZE, - __alignof__(struct netbk_rx_meta)); + __alignof__(struct xenvif_rx_meta)); if (!meta) goto failed_init_meta; -- 1.7.2.5
Wei Liu
2012-Jan-17 13:47 UTC
[RFC PATCH V2 8/8] netback: remove unwanted notification generation during NAPI processing.
In the original implementation, tx_build_gops updates the req_event
pointer every time it sees a tx error or finishes one batch. Remove that
code so that the req_event pointer is only updated when we really want
to shut down NAPI.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/interface.c |    5 +++--
 drivers/net/xen-netback/netback.c   |    4 +---
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 05caccc..7cf0947 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -58,8 +58,8 @@ static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
 	if (xenvif_rx_schedulable(vif))
 		netif_wake_queue(vif->dev);
 
-	if (likely(napi_schedule_prep(&vif->napi)))
-		__napi_schedule(&vif->napi);
+	if (RING_HAS_UNCONSUMED_REQUESTS(&vif->tx))
+		napi_schedule(&vif->napi);
 
 	return IRQ_HANDLED;
 }
@@ -74,6 +74,7 @@ static int xenvif_poll(struct napi_struct *napi, int budget)
 	if (work_done < budget) {
 		int more_to_do = 0;
 		unsigned long flag;
+
 		local_irq_save(flag);
 
 		RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, more_to_do);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index fa864f4..34f34f5 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -663,7 +663,6 @@ static void xenvif_tx_err(struct xenvif *vif,
 		txp = RING_GET_REQUEST(&vif->tx, cons++);
 	} while (1);
 	vif->tx.req_cons = cons;
-	xenvif_check_rx_xenvif(vif);
 }
 
 static int xenvif_count_requests(struct xenvif *vif,
@@ -1048,7 +1047,7 @@ static unsigned xenvif_tx_build_gops(struct xenvif *vif,
 		int pool_idx;
 		struct pending_tx_info *pending_tx_info;
 
-		RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, work_to_do);
+		work_to_do = RING_HAS_UNCONSUMED_REQUESTS(&vif->tx);
 		if (!work_to_do) {
 			break;
 		}
@@ -1188,7 +1187,6 @@ static unsigned xenvif_tx_build_gops(struct xenvif *vif,
 
 		gop = request_gop;
 		vif->tx.req_cons = idx;
-		xenvif_check_rx_xenvif(vif);
 
 		if ((gop - tco) >= MAX_PENDING_REQS)
 			break;
-- 
1.7.2.5
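The difference between the two ring checks used in this patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: the types and functions below (shared_ring, back_ring, has_unconsumed_requests, final_check_for_requests, frontend_push) are hypothetical stand-ins modelled on RING_HAS_UNCONSUMED_REQUESTS, RING_FINAL_CHECK_FOR_REQUESTS and the frontend's push-and-check-notify step. The real macros also account for free response slots, which is left out here.

```c
#include <stdint.h>

/* Hypothetical, simplified model of a Xen shared ring (requests only). */
struct shared_ring {
	uint32_t req_prod;   /* advanced by the frontend */
	uint32_t req_event;  /* frontend sends an event when req_prod passes this */
};

struct back_ring {
	struct shared_ring *sring;
	uint32_t req_cons;   /* backend's private consumer index */
};

/* Cheap check: never touches req_event, so it never re-arms notification. */
static unsigned int has_unconsumed_requests(const struct back_ring *r)
{
	return r->sring->req_prod - r->req_cons;
}

/*
 * Final check: if the ring looks empty, re-arm req_event and look again,
 * so a request produced in between is not missed.
 */
static unsigned int final_check_for_requests(struct back_ring *r)
{
	unsigned int todo = has_unconsumed_requests(r);

	if (todo)
		return todo;
	r->sring->req_event = r->req_cons + 1;
	__sync_synchronize();	/* stands in for the kernel's mb() */
	return has_unconsumed_requests(r);
}

/* Frontend side: publish n requests; returns 1 if an event must be sent. */
static int frontend_push(struct shared_ring *s, uint32_t n)
{
	uint32_t old = s->req_prod;
	uint32_t new = old + n;

	s->req_prod = new;
	__sync_synchronize();
	/* Notify only if req_event lies in the range (old, new]. */
	return (uint32_t)(new - s->req_event) < (uint32_t)(new - old);
}
```

In this model, a backend that only calls has_unconsumed_requests() while it is busy leaves req_event behind req_prod, so further pushes from the frontend generate no events. Events resume only after final_check_for_requests() re-arms req_event, which is the behaviour the commit message describes for the point where NAPI is shut down.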
Stephen Hemminger
2012-Jan-17 17:07 UTC
Re: [RFC PATCH V2 3/8] netback: switch to NAPI + kthread model
On Tue, 17 Jan 2012 13:46:59 +0000 Wei Liu <wei.liu2@citrix.com> wrote:

> This patch implements 1:1 model netback. We utilize NAPI and kthread
> to do the weight-lifting job:
>
> - NAPI is used for guest side TX (host side RX)
> - kthread is used for guest side RX (host side TX)
>
> This model provides better scheduling fairness among vifs. It also
> lays the foundation for future work.
>
> The major defect for the current implementation is that in the NAPI
> poll handler we don't actually disable interrupt. Xen stuff is
> different from real hardware, it requires some other tuning of ring
> macros.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

The network receive processing is sensitive to the context it is run in.
Normally it is run in softirq with interrupts enabled. With your code,
the poll routine disables IRQs, which shouldn't be necessary.

Why does xenvif_receive_skb() still need to exist? Couldn't it just be
replaced with a call to netif_receive_skb() in the one place it is called?
Wei Liu
2012-Jan-17 17:11 UTC
Re: [RFC PATCH V2 3/8] netback: switch to NAPI + kthread model
On Tue, 2012-01-17 at 17:07 +0000, Stephen Hemminger wrote:
> On Tue, 17 Jan 2012 13:46:59 +0000
> Wei Liu <wei.liu2@citrix.com> wrote:
>
> > This patch implements 1:1 model netback. We utilize NAPI and kthread
> > to do the weight-lifting job:
> >
> > - NAPI is used for guest side TX (host side RX)
> > - kthread is used for guest side RX (host side TX)
> >
> > This model provides better scheduling fairness among vifs. It also
> > lays the foundation for future work.
> >
> > The major defect for the current implementation is that in the NAPI
> > poll handler we don't actually disable interrupt. Xen stuff is
> > different from real hardware, it requires some other tuning of ring
> > macros.
> >
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>
> The network receive processing is sensitive to the context it is run in.
> Normally it is run in softirq with interrupts enabled. With your code,
> the poll routine disables IRQs, which shouldn't be necessary.
>

Misunderstanding here. I should rewrite my commit message. By "disabling
interrupt" I mean stopping the other end from generating events, not
disabling interrupts system-wide.

> Why does xenvif_receive_skb() still need to exist? Couldn't it just be
> replaced with a call to netif_receive_skb() in the one place it is called?

Sure.

Wei.
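The distinction drawn above (stopping the other end from generating events, as opposed to masking interrupts on the host) can be sketched with a toy budgeted poll loop. This is a minimal userspace model under stated assumptions, not the netback code: model_vif, model_frontend_send and model_poll are hypothetical names, and the one-shot event_armed flag stands in for the ring's req_event mechanism.

```c
#include <stdbool.h>

/* Hypothetical model of one vif's TX path as seen by the backend. */
struct model_vif {
	unsigned int pending;  /* requests waiting on the ring */
	bool event_armed;      /* will the frontend's next push raise an event? */
	bool scheduled;        /* is the poll routine queued to run? */
};

/*
 * Frontend queues n requests. An event is delivered only if the backend
 * has armed it; delivery is one-shot and schedules the poll routine.
 * Returns true if an event was actually sent.
 */
static bool model_frontend_send(struct model_vif *vif, unsigned int n)
{
	vif->pending += n;
	if (!vif->event_armed)
		return false;
	vif->event_armed = false;
	vif->scheduled = true;	/* what the event handler would do */
	return true;
}

/*
 * Budgeted poll: consume at most 'budget' requests. Only when the ring
 * was drained within the budget do we re-arm events and stop polling.
 */
static int model_poll(struct model_vif *vif, int budget)
{
	int work_done = 0;

	while (work_done < budget && vif->pending > 0) {
		vif->pending--;
		work_done++;
	}

	if (work_done < budget) {
		vif->event_armed = true;
		vif->scheduled = false;
	}

	return work_done;
}
```

Nothing in this model touches CPU interrupt state. While the poll routine keeps using its full budget, the frontend simply has no armed event to fire, which is the sense of "disabling interrupt" intended in the reply above.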
Konrad Rzeszutek Wilk
2012-Jan-27 19:22 UTC
Re: [RFC PATCH V2] New Xen netback implementation
On Tue, Jan 17, 2012 at 01:46:56PM +0000, Wei Liu wrote:
> A new netback implementation which includes three major features:
>
>  - Global page pool support
>  - NAPI + kthread 1:1 model
>  - Netback internal name changes
>
> Changes in V2:
>  - Fix minor bugs in V1
>  - Embed pending_tx_info into page pool
>  - Per-cpu scratch space
>  - Notification code path clean up
>
> This patch series is the foundation of future work. So it is better
> to get it right first. Patch 1 and 3 have the real meat.

I've been playing with these patches and a couple of things
came to my mind:
 - Would it make sense to also register with the shrinker API? This way,
   if the host is running low on memory, it can squeeze it out of the
   pool code. Perhaps a future TODO.
 - I like the pool code. I was thinking that perhaps (in the future)
   it could be used by blkback as well, as it runs into "not enough
   request structure" with the default setting. And making this dynamic
   would be pretty sweet.
 - This patch set solves the CPU banding problem I've seen with the
   older netback. With the older one I could see X netback threads eating
   80% of CPU. With this one, the number is down to 13-14%.

So you can definitely stick 'Tested-by: Konrad..' on them. And definitely
Reviewed-by on the first two; I haven't had a chance to look at the rest.

>
> The first benefit of 1:1 model will be scheduling fairness.
>
> The rationale behind a global page pool is that we need to limit
> overall memory consumed by all vifs.
>
> Utilization of NAPI enables the possibility to mitigate
> interrupts/events, the code path is cleaned up in a separated patch.
>
> Netback internal changes cleans up the code structure after switching
> to 1:1 model. It also prepares netback for further code layout
> changes.
>
> ---
>  drivers/net/xen-netback/Makefile    |    2 +-
>  drivers/net/xen-netback/common.h    |   78 ++--
>  drivers/net/xen-netback/interface.c |  117 ++++--
>  drivers/net/xen-netback/netback.c   |  836 ++++++++++++++---------------------
>  drivers/net/xen-netback/page_pool.c |  185 ++++++++
>  drivers/net/xen-netback/page_pool.h |   66 +++
>  drivers/net/xen-netback/xenbus.c    |    6 +-
>  7 files changed, 704 insertions(+), 586 deletions(-)
>
On Fri, 2012-01-27 at 19:22 +0000, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 17, 2012 at 01:46:56PM +0000, Wei Liu wrote:
> > A new netback implementation which includes three major features:
> >
> >  - Global page pool support
> >  - NAPI + kthread 1:1 model
> >  - Netback internal name changes
> >
> > Changes in V2:
> >  - Fix minor bugs in V1
> >  - Embed pending_tx_info into page pool
> >  - Per-cpu scratch space
> >  - Notification code path clean up
> >
> > This patch series is the foundation of future work. So it is better
> > to get it right first. Patch 1 and 3 have the real meat.
>
> I've been playing with these patches and a couple of things
> came to my mind:
>  - Would it make sense to also register with the shrinker API? This way,
>    if the host is running low on memory, it can squeeze it out of the
>    pool code. Perhaps a future TODO.
>  - I like the pool code. I was thinking that perhaps (in the future)
>    it could be used by blkback as well, as it runs into "not enough
>    request structure" with the default setting. And making this dynamic
>    would be pretty sweet.

Interesting thoughts, worth adding to the TODO list. But I'm focusing on
multi-page ring support and split event channels at the moment, which
should help improve performance on 10G networks. Hopefully I can submit
RFC patch V3 in a few days. ;-)

>  - This patch set solves the CPU banding problem I've seen with the
>    older netback. With the older one I could see X netback threads eating
>    80% of CPU. With this one, the number is down to 13-14%.
>
> So you can definitely stick 'Tested-by: Konrad..' on them. And definitely
> Reviewed-by on the first two; I haven't had a chance to look at the rest.
>

Thanks for your extensive testing and review.

Wei.
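The shrinker idea raised above can be sketched roughly as follows. This is a userspace illustration under stated assumptions, not the page_pool.c from this series and not the kernel shrinker API: the pool structure, POOL_MAX, PAGE_BYTES and the pool_get/pool_put/pool_shrink functions are all hypothetical, and malloc/free stand in for page allocation.

```c
#include <stdlib.h>

#define POOL_MAX   4     /* hypothetical global cap on pages */
#define PAGE_BYTES 4096  /* hypothetical page size */

/* Hypothetical pool: a capped set of pages shared by all users. */
struct pool {
	void *idle[POOL_MAX];   /* allocated pages not currently handed out */
	unsigned int nr_idle;
	unsigned int nr_total;  /* idle pages plus pages handed out */
};

/* Hand out an idle page, or allocate a new one while under the cap. */
static void *pool_get(struct pool *p)
{
	void *page;

	if (p->nr_idle > 0)
		return p->idle[--p->nr_idle];
	if (p->nr_total >= POOL_MAX)
		return NULL;	/* cap reached: caller must wait or drop */
	page = malloc(PAGE_BYTES);
	if (page)
		p->nr_total++;
	return page;
}

/* Return a page previously obtained from pool_get(). */
static void pool_put(struct pool *p, void *page)
{
	p->idle[p->nr_idle++] = page;
}

/*
 * Shrink callback: under memory pressure, release up to nr_to_scan idle
 * pages back to the system. Pages still in use are never touched.
 * Returns the number of pages actually freed.
 */
static unsigned int pool_shrink(struct pool *p, unsigned int nr_to_scan)
{
	unsigned int freed = 0;

	while (freed < nr_to_scan && p->nr_idle > 0) {
		free(p->idle[--p->nr_idle]);
		p->nr_total--;
		freed++;
	}
	return freed;
}
```

The design point is that only idle pages are reclaimable, so a shrink request can return fewer pages than asked for. How such a callback would be registered with the kernel, and under which lock it would run, is left open here.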
Konrad Rzeszutek Wilk
2012-Jan-29 21:37 UTC
Re: [RFC PATCH V2] New Xen netback implementation
On Sun, Jan 29, 2012 at 01:42:41PM +0000, Wei Liu wrote:
> On Fri, 2012-01-27 at 19:22 +0000, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jan 17, 2012 at 01:46:56PM +0000, Wei Liu wrote:
> > > A new netback implementation which includes three major features:
> > >
> > >  - Global page pool support
> > >  - NAPI + kthread 1:1 model
> > >  - Netback internal name changes
> > >
> > > Changes in V2:
> > >  - Fix minor bugs in V1
> > >  - Embed pending_tx_info into page pool
> > >  - Per-cpu scratch space
> > >  - Notification code path clean up
> > >
> > > This patch series is the foundation of future work. So it is better
> > > to get it right first. Patch 1 and 3 have the real meat.
> >
> > I've been playing with these patches and a couple of things
> > came to my mind:
> >  - Would it make sense to also register with the shrinker API? This way,
> >    if the host is running low on memory, it can squeeze it out of the
> >    pool code. Perhaps a future TODO.
> >  - I like the pool code. I was thinking that perhaps (in the future)
> >    it could be used by blkback as well, as it runs into "not enough
> >    request structure" with the default setting. And making this dynamic
> >    would be pretty sweet.
>
> Interesting thoughts, worth adding to the TODO list. But I'm focusing on
> multi-page ring support and split event channels at the moment, which
> should help improve performance on 10G networks. Hopefully I can submit
> RFC patch V3 in a few days. ;-)
>
> >  - This patch set solves the CPU banding problem I've seen with the
> >    older netback. With the older one I could see X netback threads eating
> >    80% of CPU. With this one, the number is down to 13-14%.
> >
> > So you can definitely stick 'Tested-by: Konrad..' on them. And definitely
> > Reviewed-by on the first two; I haven't had a chance to look at the rest.
> >
>
> Thanks for your extensive testing and review.

Sure. I also did some testing with limiting the number of CPUs and found
that 'xl vcpu-set 0 N' makes netback not work anymore :-(

>
>
> Wei.
On Sun, 2012-01-29 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> Sure. I also did some testing with limiting the number of CPUs and found
> that 'xl vcpu-set 0 N' makes netback not work anymore :-(
>
>

Any stack trace? Oops message? Did you increase or decrease the number
of CPUs?

I didn't pay much attention to the CPU hotplug path, TBH. And the V3
series, which I just posted, has a completely different per-cpu scratch
space implementation, which takes care of CPU hotplug events.

Wei.
On Fri, 2012-01-27 at 19:22 +0000, Konrad Rzeszutek Wilk wrote:
>
>  - This patch set solves the CPU banding problem I've seen with the
>    older netback. With the older one I could see X netback threads eating
>    80% of CPU. With this one, the number is down to 13-14%.

"CPU banding problem"?

If you had X threads using 80% before, do you now see Y threads using
13-14%, where Y is bigger or smaller than X? Is 80*X ~= 13*Y?

Ian.
Konrad Rzeszutek Wilk
2012-Jan-30 15:21 UTC
Re: [RFC PATCH V2] New Xen netback implementation
On Mon, Jan 30, 2012 at 03:07:16PM +0000, Ian Campbell wrote:
> On Fri, 2012-01-27 at 19:22 +0000, Konrad Rzeszutek Wilk wrote:
> >
> >  - This patch set solves the CPU banding problem I've seen with the
> >    older netback. With the older one I could see X netback threads eating
> >    80% of CPU. With this one, the number is down to 13-14%.
>
> "CPU banding problem"?
>
> If you had X threads using 80% before, do you now see Y threads using
> 13-14%, where Y is bigger or smaller than X? Is 80*X ~= 13*Y?

Yes. ~=. The count looked to be the same.

>
> Ian.
On Mon, 2012-01-30 at 15:21 +0000, Konrad Rzeszutek Wilk wrote:
> On Mon, Jan 30, 2012 at 03:07:16PM +0000, Ian Campbell wrote:
> > On Fri, 2012-01-27 at 19:22 +0000, Konrad Rzeszutek Wilk wrote:
> > >
> > >  - This patch set solves the CPU banding problem I've seen with the
> > >    older netback. With the older one I could see X netback threads eating
> > >    80% of CPU. With this one, the number is down to 13-14%.
> >
> > "CPU banding problem"?
> >
> > If you had X threads using 80% before, do you now see Y threads using
> > 13-14%, where Y is bigger or smaller than X? Is 80*X ~= 13*Y?
>
> Yes. ~=. The count looked to be the same.

Great, that's as expected, the same work is more fairly distributed -- I
thought you might be suggesting the total had gone down from
X*80% -> X*13%, which is not what I thought this series would be doing!

Ian.
On Sun, 2012-01-29 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
>
> Sure. I also did some testing with limiting the number of CPUs and found
> that 'xl vcpu-set 0 N' makes netback not work anymore :-(
>
> >
>

I just played with vcpu-set a bit, and I can reproduce this problem.
It is a race condition.

One possible fix is to remove cond_resched() in the kernel thread.
Removing it fixes the problem (at least for me).

Wei.

--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -994,7 +994,7 @@ int xenvif_kthread(void *data)
 		wait_event_interruptible(vif->wq,
 					 rx_work_todo(vif) ||
 					 kthread_should_stop());
-		cond_resched();
+		/* cond_resched(); */
 
 		if (kthread_should_stop())
 			break;
On Mon, 2012-01-30 at 18:27 +0000, Wei Liu (Intern) wrote:
> On Sun, 2012-01-29 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> >
> > Sure. I also did some testing with limiting the number of CPUs and found
> > that 'xl vcpu-set 0 N' makes netback not work anymore :-(
> >
> > >
> >
>
> I just played with vcpu-set a bit, and I can reproduce this problem.
> It is a race condition.
>
> One possible fix is to remove cond_resched() in the kernel thread.
> Removing it fixes the problem (at least for me).
>
>
> Wei.
>
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -994,7 +994,7 @@ int xenvif_kthread(void *data)
>  		wait_event_interruptible(vif->wq,
>  					 rx_work_todo(vif) ||
>  					 kthread_should_stop());
> -		cond_resched();
> +		/* cond_resched(); */
>
>  		if (kthread_should_stop())
>  			break;
>
>

Hmm... Here it comes again. Ignore this fix. It's more complicated than
I thought.

Wei.
On Sun, 2012-01-29 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> Sure. I also did some testing with limiting the number of CPUs and found
> that 'xl vcpu-set 0 N' makes netback not work anymore :-(
>
>

I think I have found a way to work around this problem. In fact, it is
my fault: I was not guarding memory safely enough.

I'm torturing my box at the moment, will see if it can survive the
night. :-)

Wei.