Herbert Xu
2007-May-09 11:30 UTC
[Xen-devel] [0/2] Remove netloop by lazy copying in netback
Hi Keir:

Here is a repost of the patches to remove the need for netloop by copying in netback, and only when necessary. Here's the original description:

The rationale is that most packets will be processed without delay, allowing them to be freed without copying at all. So instead of copying every packet destined for dom0, we'll only copy those that linger longer than a specified amount of time (currently 0.5s). As it is, netloop doesn't take care of all delays anyway. For instance, packets delayed by qdisc or netfilter can hold up resources without any limits. Also, if bridging isn't used then traffic to dom0 does not even go through netloop.

Testing shows that these patches do eliminate the copying for bulk transfers. In fact, bulk transfer throughput from domU to dom0 is increased by around 50%. Even when the copying path is taken, the performance is roughly equal to that of netloop despite the unoptimised copying path.

The copying is achieved through a new grant table operation. I've only implemented it for x86. However, there is a fallback path for other platforms so they should continue to work. It shouldn't be too hard to implement this for ia64/ppc (for someone with access to them).

In future I intend to extend this idea to support lazy copying for dom0 to domU as well, which should give us a complete zero-copy path from one domU to another.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
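The delayed-copy policy described above reduces to a short scan of an age-ordered list. A minimal sketch, assuming the pending list is kept in arrival order; netbk_age_pending_list() and netbk_copy_entry() are hypothetical names, while the real logic lives in net_tx_action_dealloc() in the netback patch below:

#include <linux/jiffies.h>
#include <linux/list.h>

#define COPY_THRESHOLD (HZ / 2)	/* the 0.5s threshold mentioned above */

static void netbk_age_pending_list(struct list_head *pending)
{
	struct netbk_tx_pending_inuse *e, *n;

	list_for_each_entry_safe(e, n, pending, list) {
		/* Entries arrive in order, so the first entry younger
		 * than the threshold ends the scan. */
		if (time_before(jiffies, e->alloc_time + COPY_THRESHOLD))
			break;
		netbk_copy_entry(e);	/* copy the page, free the grant */
	}
}

Packets that are processed promptly never reach the threshold and are freed without copying; only lingering packets pay for a copy.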
Herbert Xu
2007-May-09 11:31 UTC
[Xen-devel] [1/2] [XEN] gnttab: Add new op unmap_and_replace
Hi: [XEN] gnttab: Add new op unmap_and_replace The operation unmap_and_replace is an extension of unmap_grant_ref. A new argument in the form of a virtual address (for PV) is given. Instead of modifying the PTE for the mapped grant table entry to null, we change it to the PTE for the new address. In turn we point the new address to null. As it stands grant table entries once mapped cannot be remapped by the guest OS (it can however perform a new mapping on the same entry but that is within our control). Therefore it''s safe to manipulate the mapped PTE entry to redirect it to a normal page where we''ve copied the contents. It''s intended to be used as follows: 1) map_grant_ref to v1 2) ... 3) alloc page at v2 4) copy the page at v1 to v2 5) unmap_and_replace v1 with v2 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- diff -r 3ef0510e44d0 linux-2.6-xen-sparse/include/xen/gnttab.h --- a/linux-2.6-xen-sparse/include/xen/gnttab.h Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/include/xen/gnttab.h Wed May 09 21:16:57 2007 +1000 @@ -135,4 +135,19 @@ gnttab_set_unmap_op(struct gnttab_unmap_ unmap->dev_bus_addr = 0; } +static inline void +gnttab_set_replace_op(struct gnttab_unmap_and_replace *unmap, maddr_t addr, + maddr_t new_addr, grant_handle_t handle) +{ + if (xen_feature(XENFEAT_auto_translated_physmap)) { + unmap->host_addr = __pa(addr); + unmap->new_addr = __pa(new_addr); + } else { + unmap->host_addr = addr; + unmap->new_addr = new_addr; + } + + unmap->handle = handle; +} + #endif /* __ASM_GNTTAB_H__ */ diff -r 3ef0510e44d0 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c Tue May 08 10:21:23 2007 +0100 +++ b/xen/arch/x86/mm.c Wed May 09 21:16:57 2007 +1000 @@ -2619,8 +2619,8 @@ static int create_grant_va_mapping( return GNTST_okay; } -static int destroy_grant_va_mapping( - unsigned long addr, unsigned long frame, struct vcpu *v) +static int replace_grant_va_mapping( + unsigned long addr, unsigned long frame, l1_pgentry_t nl1e, struct vcpu *v) { l1_pgentry_t *pl1e, ol1e; unsigned long gl1mfn; @@ -2644,7 +2644,7 @@ static int destroy_grant_va_mapping( } /* Delete pagetable entry. 
*/ - if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, l1e_empty(), gl1mfn, v)) ) + if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v)) ) { MEM_LOG("Cannot delete PTE entry at %p", (unsigned long *)pl1e); rc = GNTST_general_error; @@ -2654,6 +2654,12 @@ static int destroy_grant_va_mapping( out: guest_unmap_l1e(v, pl1e); return rc; +} + +static int destroy_grant_va_mapping( + unsigned long addr, unsigned long frame, struct vcpu *v) +{ + return replace_grant_va_mapping(addr, frame, l1e_empty(), v); } int create_grant_host_mapping( @@ -2677,6 +2683,44 @@ int destroy_grant_host_mapping( if ( flags & GNTMAP_contains_pte ) return destroy_grant_pte_mapping(addr, frame, current->domain); return destroy_grant_va_mapping(addr, frame, current); +} + +int replace_grant_host_mapping( + uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags) +{ + l1_pgentry_t *pl1e, ol1e; + unsigned long gl1mfn; + int rc; + + if ( flags & GNTMAP_contains_pte ) + { + MEM_LOG("Unsupported grant table operation"); + return GNTST_general_error; + } + + pl1e = guest_map_l1e(current, new_addr, &gl1mfn); + if ( !pl1e ) + { + MEM_LOG("Could not find L1 PTE for address %lx", + (unsigned long)new_addr); + return GNTST_general_error; + } + ol1e = *pl1e; + + if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, l1e_empty(), gl1mfn, current)) ) + { + MEM_LOG("Cannot delete PTE entry at %p", (unsigned long *)pl1e); + guest_unmap_l1e(current, pl1e); + return GNTST_general_error; + } + + guest_unmap_l1e(current, pl1e); + + rc = replace_grant_va_mapping(addr, frame, ol1e, current); + if ( rc && !paging_mode_refcounts(current->domain) ) + put_page_from_l1e(ol1e, current->domain); + + return rc; } int steal_page( diff -r 3ef0510e44d0 xen/common/grant_table.c --- a/xen/common/grant_table.c Tue May 08 10:21:23 2007 +0100 +++ b/xen/common/grant_table.c Wed May 09 21:16:57 2007 +1000 @@ -542,6 +542,126 @@ fault: return -EFAULT; } +static void +__gnttab_unmap_and_replace( + struct gnttab_unmap_and_replace *op) +{ + domid_t dom; + grant_ref_t ref; + struct domain *ld, *rd; + struct active_grant_entry *act; + grant_entry_t *sha; + struct grant_mapping *map; + u16 flags; + s16 rc = 0; + unsigned long frame; + + ld = current->domain; + + if ( unlikely(op->handle >= ld->grant_table->maptrack_limit) ) + { + gdprintk(XENLOG_INFO, "Bad handle (%d).\n", op->handle); + op->status = GNTST_bad_handle; + return; + } + + map = &maptrack_entry(ld->grant_table, op->handle); + + if ( unlikely(!map->flags) ) + { + gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle); + op->status = GNTST_bad_handle; + return; + } + + dom = map->domid; + ref = map->ref; + flags = map->flags; + + if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) ) + { + /* This can happen when a grant is implicitly unmapped. */ + gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom); + domain_crash(ld); /* naughty... 
*/ + return; + } + + TRACE_1D(TRC_MEM_PAGE_GRANT_UNMAP, dom); + + spin_lock(&rd->grant_table->lock); + + act = &active_entry(rd->grant_table, ref); + sha = &shared_entry(rd->grant_table, ref); + + frame = act->frame; + + if ( flags & GNTMAP_host_map ) + { + if ( (rc = replace_grant_host_mapping(op->host_addr, + frame, op->new_addr, flags)) < 0 ) + goto unmap_out; + + ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask)); + map->flags &= ~GNTMAP_host_map; + if ( flags & GNTMAP_readonly ) + { + act->pin -= GNTPIN_hstr_inc; + put_page(mfn_to_page(frame)); + } + else + { + act->pin -= GNTPIN_hstw_inc; + put_page_and_type(mfn_to_page(frame)); + } + } + + if ( (map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 ) + { + map->flags = 0; + put_maptrack_handle(ld->grant_table, op->handle); + } + + /* If just unmapped a writable mapping, mark as dirtied */ + if ( !(flags & GNTMAP_readonly) ) + gnttab_mark_dirty(rd, frame); + + if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) && + !(flags & GNTMAP_readonly) ) + gnttab_clear_flag(_GTF_writing, &sha->flags); + + if ( act->pin == 0 ) + gnttab_clear_flag(_GTF_reading, &sha->flags); + + unmap_out: + op->status = rc; + spin_unlock(&rd->grant_table->lock); + rcu_unlock_domain(rd); +} + +static long +gnttab_unmap_and_replace( + XEN_GUEST_HANDLE(gnttab_unmap_and_replace_t) uop, unsigned int count) +{ + int i; + struct gnttab_unmap_and_replace op; + + for ( i = 0; i < count; i++ ) + { + if ( unlikely(__copy_from_guest_offset(&op, uop, i, 1)) ) + goto fault; + __gnttab_unmap_and_replace(&op); + if ( unlikely(__copy_to_guest_offset(uop, i, &op, 1)) ) + goto fault; + } + + flush_tlb_mask(current->domain->domain_dirty_cpumask); + return 0; + +fault: + flush_tlb_mask(current->domain->domain_dirty_cpumask); + return -EFAULT; +} + int gnttab_grow_table(struct domain *d, unsigned int req_nr_frames) { @@ -1203,6 +1323,21 @@ do_grant_table_op( if ( unlikely(!grant_operation_permitted(d)) ) goto out; rc = gnttab_unmap_grant_ref(unmap, count); + break; + } + case GNTTABOP_unmap_and_replace: + { + XEN_GUEST_HANDLE(gnttab_unmap_and_replace_t) unmap + guest_handle_cast(uop, gnttab_unmap_and_replace_t); + if ( unlikely(!guest_handle_okay(unmap, count)) ) + goto out; + rc = -EPERM; + if ( unlikely(!grant_operation_permitted(d)) ) + goto out; + rc = -ENOSYS; + if ( unlikely(!replace_grant_supported()) ) + goto out; + rc = gnttab_unmap_and_replace(unmap, count); break; } case GNTTABOP_setup_table: diff -r 3ef0510e44d0 xen/include/asm-ia64/grant_table.h --- a/xen/include/asm-ia64/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/asm-ia64/grant_table.h Wed May 09 21:16:57 2007 +1000 @@ -67,4 +67,14 @@ static inline void gnttab_clear_flag(uns #define gnttab_release_put_page(page) put_page((page)) #define gnttab_release_put_page_and_type(page) put_page_and_type((page)) +static inline int replace_grant_host_mapping(unsigned long gpaddr, unsigned long mfn, unsigned long new_gpaddr, unsigned int flags) +{ + return 0; +} + +static inline int replace_grant_supported(void) +{ + return 0; +} + #endif /* __ASM_GRANT_TABLE_H__ */ diff -r 3ef0510e44d0 xen/include/asm-powerpc/grant_table.h --- a/xen/include/asm-powerpc/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/asm-powerpc/grant_table.h Wed May 09 21:16:57 2007 +1000 @@ -82,4 +82,13 @@ static inline uint cpu_foreign_map_order #define gnttab_release_put_page_and_type(page) do { } while (0) #endif +static inline int replace_grant_host_mapping(unsigned long addr, unsigned long frame, unsigned 
long new_addr, unsigned int flags) +{ + return 0; +} + +static inline int replace_grant_supported(void) +{ + return 0; +} #endif /* __ASM_PPC_GRANT_TABLE_H__ */ diff -r 3ef0510e44d0 xen/include/asm-x86/grant_table.h --- a/xen/include/asm-x86/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/asm-x86/grant_table.h Wed May 09 21:16:57 2007 +1000 @@ -17,6 +17,8 @@ int create_grant_host_mapping( uint64_t addr, unsigned long frame, unsigned int flags); int destroy_grant_host_mapping( uint64_t addr, unsigned long frame, unsigned int flags); +int replace_grant_host_mapping( + uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags); #define gnttab_create_shared_page(d, t, i) \ do { \ @@ -48,4 +50,9 @@ static inline void gnttab_clear_flag(uns /* Done implicitly when page tables are destroyed. */ \ } while (0) +static inline int replace_grant_supported(void) +{ + return 1; +} + #endif /* __ASM_GRANT_TABLE_H__ */ diff -r 3ef0510e44d0 xen/include/public/grant_table.h --- a/xen/include/public/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/public/grant_table.h Wed May 09 21:16:57 2007 +1000 @@ -327,6 +327,29 @@ struct gnttab_query_size { }; typedef struct gnttab_query_size gnttab_query_size_t; DEFINE_XEN_GUEST_HANDLE(gnttab_query_size_t); + +/* + * GNTTABOP_unmap_and_replace: Destroy one or more grant-reference mappings + * tracked by <handle> but atomically replace the page table entry with one + * pointing to the machine address under <new_addr>. <new_addr> will be + * redirected to the null entry. + * NOTES: + * 1. The call may fail in an undefined manner if either mapping is not + * tracked by <handle>. + * 2. After executing a batch of unmaps, it is guaranteed that no stale + * mappings will remain in the device or host TLBs. + */ +#define GNTTABOP_unmap_and_replace 7 +struct gnttab_unmap_and_replace { + /* IN parameters. */ + uint64_t host_addr; + uint64_t new_addr; + grant_handle_t handle; + /* OUT parameters. */ + int16_t status; /* GNTST_* */ +}; +typedef struct gnttab_unmap_and_replace gnttab_unmap_and_replace_t; +DEFINE_XEN_GUEST_HANDLE(gnttab_unmap_and_replace_t); /* _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
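The five-step sequence in the description corresponds roughly to the following guest-side code. This is a condensed sketch adapted from the delayed-copy path in the companion netback patch: hypercall error handling, the p2m update, and the auto-translated-physmap case are elided, and replace_mapped_grant() is a hypothetical name:

/* addr (v1) is where map_grant_ref mapped the foreign page (steps 1-2). */
static int replace_mapped_grant(void *addr, grant_handle_t handle)
{
	struct gnttab_unmap_and_replace unmap;
	struct page *new_page;
	void *new_addr;
	int err;

	/* 3) allocate a local page at v2 */
	new_page = alloc_page(GFP_ATOMIC | __GFP_NOWARN);
	if (!new_page)
		return -ENOMEM;
	new_addr = page_address(new_page);

	/* 4) copy the granted page's contents into it */
	memcpy(new_addr, addr, PAGE_SIZE);

	/* 5) atomically swap the PTEs: v1 now maps the local copy,
	 *    v2 maps nothing, and the grant mapping is released */
	gnttab_set_replace_op(&unmap, (maddr_t)(unsigned long)addr,
			      (maddr_t)(unsigned long)new_addr, handle);
	err = HYPERVISOR_grant_table_op(GNTTABOP_unmap_and_replace,
					&unmap, 1);
	return err ? err : unmap.status;
}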
Hi:

[NET] back: Add lazy copying

This patch adds lazy copying using the new unmap_and_replace grant table operation. We keep a list of pending entries sorted by arrival order. We'll process this list every time net_tx_action is invoked. We ensure that net_tx_action is invoked within one second of the arrival of the first packet in the list. When we process the list, any entry that has been around for more than half a second is copied. This allows us to free the grant table entry and return it to domU.

If the new grant table operation is not available (e.g., old HV or architectures that don't support it yet) we simply copy each packet as we receive it, using skb_linearize. We also disable SG/TSO if this is the case.

By default the new code is disabled. In order to enable it, the module needs to be loaded with the argument copy_skb=1.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff -r 3ef0510e44d0 linux-2.6-xen-sparse/drivers/xen/netback/common.h --- a/linux-2.6-xen-sparse/drivers/xen/netback/common.h Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/common.h Wed May 09 21:19:06 2007 +1000 @@ -114,6 +114,14 @@ typedef struct netif_st { #define netback_carrier_off(netif) ((netif)->carrier = 0) #define netback_carrier_ok(netif) ((netif)->carrier) +enum { + NETBK_DONT_COPY_SKB, + NETBK_DELAYED_COPY_SKB, + NETBK_ALWAYS_COPY_SKB, +}; + +extern int netbk_copy_skb_mode; + #define NET_TX_RING_SIZE __RING_SIZE((netif_tx_sring_t *)0, PAGE_SIZE) #define NET_RX_RING_SIZE __RING_SIZE((netif_rx_sring_t *)0, PAGE_SIZE) diff -r 3ef0510e44d0 linux-2.6-xen-sparse/drivers/xen/netback/netback.c --- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Wed May 09 21:19:06 2007 +1000 @@ -49,6 +49,11 @@ struct netbk_rx_meta { int copy:1; }; +struct netbk_tx_pending_inuse { + struct list_head list; + unsigned long alloc_time; +}; + static void netif_idx_release(u16 pending_idx); static void netif_page_release(struct page *page); static void make_tx_response(netif_t *netif, @@ -68,6 +73,7 @@ static DECLARE_TASKLET(net_rx_tasklet, n static DECLARE_TASKLET(net_rx_tasklet, net_rx_action, 0); static struct timer_list net_timer; +static struct timer_list netbk_tx_pending_timer; #define MAX_PENDING_REQS 256 @@ -95,6 +101,10 @@ static u16 dealloc_ring[MAX_PENDING_REQS static u16 dealloc_ring[MAX_PENDING_REQS]; static PEND_RING_IDX dealloc_prod, dealloc_cons; +/* Doubly-linked list of in-use pending entries. */ +static struct netbk_tx_pending_inuse pending_inuse[MAX_PENDING_REQS]; +static LIST_HEAD(pending_inuse_head); + static struct sk_buff_head tx_queue; static grant_handle_t grant_tx_handle[MAX_PENDING_REQS]; @@ -107,6 +117,13 @@ static spinlock_t net_schedule_list_lock #define MAX_MFN_ALLOC 64 static unsigned long mfn_list[MAX_MFN_ALLOC]; static unsigned int alloc_index = 0; + +/* Setting this allows the safe use of this driver without netloop. 
*/ +static int MODPARM_copy_skb; +module_param_named(copy_skb, MODPARM_copy_skb, bool, 0); +MODULE_PARM_DESC(copy_skb, "Copy data received from netfront without netloop"); + +int netbk_copy_skb_mode; static inline unsigned long alloc_mfn(void) { @@ -719,6 +736,11 @@ static void net_alarm(unsigned long unus tasklet_schedule(&net_rx_tasklet); } +static void netbk_tx_pending_timeout(unsigned long unused) +{ + tasklet_schedule(&net_tx_tasklet); +} + struct net_device_stats *netif_be_get_stats(struct net_device *dev) { netif_t *netif = netdev_priv(dev); @@ -812,46 +834,140 @@ static void tx_credit_callback(unsigned netif_schedule_work(netif); } +/* Perform a delayed copy. This is slow-path only. */ +static int copy_pending_req(PEND_RING_IDX pending_idx) +{ + struct gnttab_unmap_and_replace unmap; + mmu_update_t mmu; + struct page *page; + struct page *new_page; + void *new_addr; + void *addr; + unsigned long pfn; + unsigned long new_mfn; + int err; + + page = mmap_pages[pending_idx]; + if (!get_page_unless_zero(page)) + return -ENOENT; + + new_page = alloc_page(GFP_ATOMIC | __GFP_NOWARN); + if (!new_page) + return -ENOMEM; + + new_addr = page_address(new_page); + addr = page_address(page); + memcpy(new_addr, addr, PAGE_SIZE); + + pfn = page_to_pfn(page); + new_mfn = virt_to_mfn(new_addr); + + if (!xen_feature(XENFEAT_auto_translated_physmap)) { + set_phys_to_machine(pfn, new_mfn); + set_phys_to_machine(page_to_pfn(new_page), INVALID_P2M_ENTRY); + } + + gnttab_set_replace_op(&unmap, (unsigned long)addr, + (unsigned long)new_addr, + grant_tx_handle[pending_idx]); + + err = HYPERVISOR_grant_table_op(GNTTABOP_unmap_and_replace, + &unmap, 1); + BUG_ON(err); + BUG_ON(unmap.status); + + if (!xen_feature(XENFEAT_auto_translated_physmap)) { + mmu.ptr = ((maddr_t)new_mfn << PAGE_SHIFT) | + MMU_MACHPHYS_UPDATE; + mmu.val = pfn; + err = HYPERVISOR_mmu_update(&mmu, 1, NULL, DOMID_SELF); + BUG_ON(err); + } + + ClearPageForeign(page); + put_page(page); + + SetPageForeign(new_page, netif_page_release); + netif_page_index(new_page) = pending_idx; + mmap_pages[pending_idx] = new_page; + + return 0; +} + inline static void net_tx_action_dealloc(void) { + struct netbk_tx_pending_inuse *inuse, *n; gnttab_unmap_grant_ref_t *gop; u16 pending_idx; PEND_RING_IDX dc, dp; netif_t *netif; int ret; + LIST_HEAD(list); dc = dealloc_cons; - dp = dealloc_prod; - - /* Ensure we see all indexes enqueued by netif_idx_release(). */ - smp_rmb(); + gop = tx_unmap_ops; /* * Free up any grants we have finished using */ - gop = tx_unmap_ops; - while (dc != dp) { - pending_idx = dealloc_ring[MASK_PEND_IDX(dc++)]; - gnttab_set_unmap_op(gop, idx_to_kaddr(pending_idx), - GNTMAP_host_map, - grant_tx_handle[pending_idx]); - gop++; - } + do { + dp = dealloc_prod; + + /* Ensure we see all indices enqueued by netif_idx_release(). */ + smp_rmb(); + + while (dc != dp) { + pending_idx = dealloc_ring[MASK_PEND_IDX(dc++)]; + list_move_tail(&pending_inuse[pending_idx].list, &list); + gnttab_set_unmap_op(gop, idx_to_kaddr(pending_idx), + GNTMAP_host_map, + grant_tx_handle[pending_idx]); + gop++; + } + + if (netbk_copy_skb_mode != NETBK_DELAYED_COPY_SKB || + list_empty(&pending_inuse_head)) + break; + + /* Copy any entries that have been pending for too long. 
*/ + list_for_each_entry_safe(inuse, n, &pending_inuse_head, list) { + if (time_after(inuse->alloc_time + HZ / 2, jiffies)) + break; + + switch (copy_pending_req(inuse - pending_inuse)) { + case 0: + list_move_tail(&inuse->list, &list); + /* fall through */ + case -ENOENT: + continue; + } + + break; + } + } while (dp != dealloc_prod); + + dealloc_cons = dc; + ret = HYPERVISOR_grant_table_op( GNTTABOP_unmap_grant_ref, tx_unmap_ops, gop - tx_unmap_ops); BUG_ON(ret); - while (dealloc_cons != dp) { - pending_idx = dealloc_ring[MASK_PEND_IDX(dealloc_cons++)]; + list_for_each_entry_safe(inuse, n, &list, list) { + pending_idx = inuse - pending_inuse; netif = pending_tx_info[pending_idx].netif; make_tx_response(netif, &pending_tx_info[pending_idx].req, NETIF_RSP_OKAY); + /* Ready for next use. */ + init_page_count(mmap_pages[pending_idx]); + pending_ring[MASK_PEND_IDX(pending_prod++)] = pending_idx; netif_put(netif); + + list_del_init(&inuse->list); } } @@ -1023,6 +1139,11 @@ static void netbk_fill_frags(struct sk_b unsigned long pending_idx; pending_idx = (unsigned long)frag->page; + + pending_inuse[pending_idx].alloc_time = jiffies; + list_add_tail(&pending_inuse[pending_idx].list, + &pending_inuse_head); + txp = &pending_tx_info[pending_idx].req; frag->page = virt_to_page(idx_to_kaddr(pending_idx)); frag->size = txp->size; @@ -1311,8 +1432,24 @@ static void net_tx_action(unsigned long netif->stats.rx_bytes += skb->len; netif->stats.rx_packets++; + if (unlikely(netbk_copy_skb_mode == NETBK_ALWAYS_COPY_SKB) && + unlikely(skb_linearize(skb))) { + DPRINTK("Can''t linearize skb in net_tx_action.\n"); + kfree_skb(skb); + continue; + } + netif_rx(skb); netif->dev->last_rx = jiffies; + } + + if (netbk_copy_skb_mode == NETBK_DELAYED_COPY_SKB && + !list_empty(&pending_inuse_head)) { + struct netbk_tx_pending_inuse *oldest; + + oldest = list_entry(pending_inuse_head.next, + struct netbk_tx_pending_inuse, list); + mod_timer(&netbk_tx_pending_timer, oldest->alloc_time + HZ); } } @@ -1333,9 +1470,6 @@ static void netif_idx_release(u16 pendin static void netif_page_release(struct page *page) { - /* Ready for next use. 
*/ - init_page_count(page); - netif_idx_release(netif_page_index(page)); } @@ -1457,6 +1591,10 @@ static int __init netback_init(void) net_timer.data = 0; net_timer.function = net_alarm; + init_timer(&netbk_tx_pending_timer); + netbk_tx_pending_timer.data = 0; + netbk_tx_pending_timer.function = netbk_tx_pending_timeout; + mmap_pages = alloc_empty_pages_and_pagevec(MAX_PENDING_REQS); if (mmap_pages == NULL) { printk("%s: out of memory\n", __FUNCTION__); @@ -1467,6 +1605,7 @@ static int __init netback_init(void) page = mmap_pages[i]; SetPageForeign(page, netif_page_release); netif_page_index(page) = i; + INIT_LIST_HEAD(&pending_inuse[i].list); } pending_cons = 0; @@ -1476,6 +1615,15 @@ static int __init netback_init(void) spin_lock_init(&net_schedule_list_lock); INIT_LIST_HEAD(&net_schedule_list); + + netbk_copy_skb_mode = NETBK_DONT_COPY_SKB; + if (MODPARM_copy_skb) { + if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_and_replace, + NULL, 0)) + netbk_copy_skb_mode = NETBK_ALWAYS_COPY_SKB; + else + netbk_copy_skb_mode = NETBK_DELAYED_COPY_SKB; + } netif_xenbus_init(); diff -r 3ef0510e44d0 linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c --- a/linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c Wed May 09 21:19:06 2007 +1000 @@ -62,6 +62,7 @@ static int netback_probe(struct xenbus_d const char *message; struct xenbus_transaction xbt; int err; + int sg; struct backend_info *be = kzalloc(sizeof(struct backend_info), GFP_KERNEL); if (!be) { @@ -73,6 +74,10 @@ static int netback_probe(struct xenbus_d be->dev = dev; dev->dev.driver_data = be; + sg = 1; + if (netbk_copy_skb_mode == NETBK_ALWAYS_COPY_SKB) + sg = 0; + do { err = xenbus_transaction_start(&xbt); if (err) { @@ -80,14 +85,14 @@ static int netback_probe(struct xenbus_d goto fail; } - err = xenbus_printf(xbt, dev->nodename, "feature-sg", "%d", 1); + err = xenbus_printf(xbt, dev->nodename, "feature-sg", "%d", sg); if (err) { message = "writing feature-sg"; goto abort_transaction; } err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv4", - "%d", 1); + "%d", sg); if (err) { message = "writing feature-gso-tcpv4"; goto abort_transaction; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
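Note how netback_init() above probes for the new operation: issuing GNTTABOP_unmap_and_replace with a zero count is a no-op on hypervisors that implement it, while older hypervisors reject the unknown op, so the driver can select its mode at load time. The same decision, pulled out into a sketch (netbk_choose_copy_mode() is a hypothetical name):

static int netbk_choose_copy_mode(int copy_skb)
{
	if (!copy_skb)
		return NETBK_DONT_COPY_SKB;

	/* count == 0: a pure capability probe, no entries touched. */
	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_and_replace, NULL, 0))
		return NETBK_ALWAYS_COPY_SKB;	/* old HV: linearize every skb */

	return NETBK_DELAYED_COPY_SKB;		/* lazy copy after 0.5s */
}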
Keir Fraser
2007-May-09 12:55 UTC
Re: [Xen-devel] [1/2] [XEN] gnttab: Add new op unmap_and_replace
On 9/5/07 12:31, "Herbert Xu" <herbert@gondor.apana.org.au> wrote:

> [XEN] gnttab: Add new op unmap_and_replace

There's considerable code duplication in common/grant_table.c. Could we please somehow merge __gnttab_unmap() and __gnttab_unmap_and_replace(), because they only differ right now in the struct type they take, and in the function they call to do the actual unmap or unmap-and-replace work. Perhaps their wrappers could stuff a 'union structure' of some sort, with enough discrimination to ensure the correct underlying arch-specific function is called? We could even only supply unmap-and-replace functionality at the arch-specific interface and have new_addr==NULL/zero-pte mean we want the old-style unmap semantics. Then the wrapper for unmap_op can stuff that field in the 'union structure' with zero to do the right thing.

-- Keir
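Under this proposal a plain unmap becomes a degenerate unmap-and-replace. A sketch of the resulting arch-level shape (not the committed code; the reposted patch below instead checks new_addr inside replace_grant_host_mapping() itself):

static inline int destroy_grant_host_mapping(
	uint64_t addr, unsigned long frame, unsigned int flags)
{
	/* new_addr == 0 requests the old-style unmap semantics. */
	return replace_grant_host_mapping(addr, frame, 0, flags);
}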
Herbert Xu
2007-May-16 23:17 UTC
[Xen-devel] [1/2] [XEN] gnttab: Add new op unmap_and_replace
On Wed, May 09, 2007 at 01:55:08PM +0100, Keir Fraser wrote:

> There's considerable code duplication in common/grant_table.c. Could we
> please somehow merge __gnttab_unmap() and __gnttab_unmap_and_replace(),
> because they only differ right now in the struct type they take, and in
> the function they call to do the actual unmap or unmap-and-replace work.
> Perhaps their wrappers could stuff a 'union structure' of some sort, with
> enough discrimination to ensure the correct underlying arch-specific
> function is called? We could even only supply unmap-and-replace
> functionality at the arch-specific interface and have new_addr==NULL/
> zero-pte mean we want the old-style unmap semantics. Then the wrapper
> for unmap_op can stuff that field in the 'union structure' with zero to
> do the right thing.

OK, that was easy enough to do. However, in doing so I found a more serious problem with my second patch: it didn't keep track of outstanding DMA requests, so we could potentially put bogus data on the wire.

Unfortunately the Linux DMA API doesn't help us in solving this, because on a DMA unmap we don't get the struct page that was originally used, so we'd have to keep a mapping of some sort from the machine address to the guest-physical address in dom0 of a grant page. IMHO it's too much of a hassle to do right now, and we can live with a delayed free in that case anyway. Right now we may already delay certain grant entries in dom0 even with netloop. In particular, any packets that don't go through netloop can be delayed indefinitely. If you disagree then I can put in the tracking needed to handle this properly. In the long term perhaps we could modify the Linux DMA API to add the struct page to make this easier.

The first patch:

[XEN] gnttab: Add new op unmap_and_replace

The operation unmap_and_replace is an extension of unmap_grant_ref. A new argument in the form of a virtual address (for PV) is given. Instead of modifying the PTE for the mapped grant table entry to null, we change it to the PTE for the new address. In turn we point the new address to null.

As it stands, grant table entries once mapped cannot be remapped by the guest OS (it can however perform a new mapping on the same entry, but that is within our control). Therefore it's safe to manipulate the mapped PTE entry to redirect it to a normal page where we've copied the contents.

It's intended to be used as follows: 1) map_grant_ref to v1 2) ...
3) alloc page at v2 4) copy the page at v1 to v2 5) unmap_and_replace v1 with v2 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- diff -r 3ef0510e44d0 linux-2.6-xen-sparse/include/xen/gnttab.h --- a/linux-2.6-xen-sparse/include/xen/gnttab.h Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/include/xen/gnttab.h Thu May 10 11:56:29 2007 +1000 @@ -135,4 +135,19 @@ gnttab_set_unmap_op(struct gnttab_unmap_ unmap->dev_bus_addr = 0; } +static inline void +gnttab_set_replace_op(struct gnttab_unmap_and_replace *unmap, maddr_t addr, + maddr_t new_addr, grant_handle_t handle) +{ + if (xen_feature(XENFEAT_auto_translated_physmap)) { + unmap->host_addr = __pa(addr); + unmap->new_addr = __pa(new_addr); + } else { + unmap->host_addr = addr; + unmap->new_addr = new_addr; + } + + unmap->handle = handle; +} + #endif /* __ASM_GNTTAB_H__ */ diff -r 3ef0510e44d0 xen/arch/ia64/xen/mm.c --- a/xen/arch/ia64/xen/mm.c Tue May 08 10:21:23 2007 +0100 +++ b/xen/arch/ia64/xen/mm.c Thu May 10 11:56:29 2007 +1000 @@ -63,7 +63,7 @@ * assign_domain_page_replace() * - cmpxchg p2m entry * assign_domain_page_cmpxchg_rel() - * destroy_grant_host_mapping() + * replace_grant_host_mapping() * steal_page() * zap_domain_page_one() * - read p2m entry @@ -133,7 +133,7 @@ * - races between p2m entry update and tlb insert * This is a race between reading/writing the p2m entry. * reader: vcpu_itc_i(), vcpu_itc_d(), ia64_do_page_fault(), vcpu_fc() - * writer: assign_domain_page_cmpxchg_rel(), destroy_grant_host_mapping(), + * writer: assign_domain_page_cmpxchg_rel(), replace_grant_host_mapping(), * steal_page(), zap_domain_page_one() * * For example, vcpu_itc_i() is about to insert tlb by calling @@ -151,7 +151,7 @@ * This is a race between reading/writing the p2m entry. * reader: vcpu_get_domain_bundle(), vmx_get_domain_bundle(), * efi_emulate_get_time() - * writer: assign_domain_page_cmpxchg_rel(), destroy_grant_host_mapping(), + * writer: assign_domain_page_cmpxchg_rel(), replace_grant_host_mapping(), * steal_page(), zap_domain_page_one() * * A page which assigned to a domain can be de-assigned by another vcpu. 
@@ -1509,8 +1509,8 @@ create_grant_host_mapping(unsigned long // grant table host unmapping int -destroy_grant_host_mapping(unsigned long gpaddr, - unsigned long mfn, unsigned int flags) +replace_grant_host_mapping(unsigned long gpaddr, + unsigned long mfn, unsigned long new_gpaddr, unsigned int flags) { struct domain* d = current->domain; unsigned long gpfn = gpaddr >> PAGE_SHIFT; @@ -1521,6 +1521,11 @@ destroy_grant_host_mapping(unsigned long pte_t old_pte; struct page_info* page = mfn_to_page(mfn); + if (new_gpaddr) { + gdprintk(XENLOG_INFO, "%s: new_gpaddr 0x%lx\n", __func__, new_gpaddr); + return GNTST_general_error; + } + if (flags & (GNTMAP_application_map | GNTMAP_contains_pte)) { gdprintk(XENLOG_INFO, "%s: flags 0x%x\n", __func__, flags); return GNTST_general_error; @@ -1568,7 +1573,7 @@ destroy_grant_host_mapping(unsigned long BUG_ON(pte_pgc_allocated(old_pte)); domain_page_flush_and_put(d, gpaddr, pte, old_pte, page); - perfc_incr(destroy_grant_host_mapping); + perfc_incr(replace_grant_host_mapping); return GNTST_okay; } diff -r 3ef0510e44d0 xen/arch/powerpc/mm.c --- a/xen/arch/powerpc/mm.c Tue May 08 10:21:23 2007 +0100 +++ b/xen/arch/powerpc/mm.c Thu May 10 11:56:29 2007 +1000 @@ -183,9 +183,16 @@ int create_grant_host_mapping( return create_grant_va_mapping(addr, frame, current); } -int destroy_grant_host_mapping( - unsigned long addr, unsigned long frame, unsigned int flags) -{ +int replace_grant_host_mapping( + unsigned long addr, unsigned long frame, unsigned long new_addr, + unsigned int flags) +{ + if (new_addr) + printk("%s: new_addr not supported\n", __func__); + BUG(); + return GNTST_general_error; + } + if (flags & GNTMAP_contains_pte) { printk("%s: GNTMAP_contains_pte not supported\n", __func__); BUG(); diff -r 3ef0510e44d0 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c Tue May 08 10:21:23 2007 +0100 +++ b/xen/arch/x86/mm.c Thu May 10 11:56:29 2007 +1000 @@ -2619,8 +2619,8 @@ static int create_grant_va_mapping( return GNTST_okay; } -static int destroy_grant_va_mapping( - unsigned long addr, unsigned long frame, struct vcpu *v) +static int replace_grant_va_mapping( + unsigned long addr, unsigned long frame, l1_pgentry_t nl1e, struct vcpu *v) { l1_pgentry_t *pl1e, ol1e; unsigned long gl1mfn; @@ -2644,7 +2644,7 @@ static int destroy_grant_va_mapping( } /* Delete pagetable entry. 
*/ - if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, l1e_empty(), gl1mfn, v)) ) + if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v)) ) { MEM_LOG("Cannot delete PTE entry at %p", (unsigned long *)pl1e); rc = GNTST_general_error; @@ -2654,6 +2654,12 @@ static int destroy_grant_va_mapping( out: guest_unmap_l1e(v, pl1e); return rc; +} + +static int destroy_grant_va_mapping( + unsigned long addr, unsigned long frame, struct vcpu *v) +{ + return replace_grant_va_mapping(addr, frame, l1e_empty(), v); } int create_grant_host_mapping( @@ -2671,12 +2677,48 @@ int create_grant_host_mapping( return create_grant_va_mapping(addr, pte, current); } -int destroy_grant_host_mapping( - uint64_t addr, unsigned long frame, unsigned int flags) -{ +int replace_grant_host_mapping( + uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags) +{ + l1_pgentry_t *pl1e, ol1e; + unsigned long gl1mfn; + int rc; + if ( flags & GNTMAP_contains_pte ) - return destroy_grant_pte_mapping(addr, frame, current->domain); - return destroy_grant_va_mapping(addr, frame, current); + { + if (!new_addr) + return destroy_grant_pte_mapping(addr, frame, current->domain); + + MEM_LOG("Unsupported grant table operation"); + return GNTST_general_error; + } + + if (!new_addr) + return destroy_grant_va_mapping(addr, frame, current); + + pl1e = guest_map_l1e(current, new_addr, &gl1mfn); + if ( !pl1e ) + { + MEM_LOG("Could not find L1 PTE for address %lx", + (unsigned long)new_addr); + return GNTST_general_error; + } + ol1e = *pl1e; + + if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, l1e_empty(), gl1mfn, current)) ) + { + MEM_LOG("Cannot delete PTE entry at %p", (unsigned long *)pl1e); + guest_unmap_l1e(current, pl1e); + return GNTST_general_error; + } + + guest_unmap_l1e(current, pl1e); + + rc = replace_grant_va_mapping(addr, frame, ol1e, current); + if ( rc && !paging_mode_refcounts(current->domain) ) + put_page_from_l1e(ol1e, current->domain); + + return rc; } int steal_page( diff -r 3ef0510e44d0 xen/common/grant_table.c --- a/xen/common/grant_table.c Tue May 08 10:21:23 2007 +0100 +++ b/xen/common/grant_table.c Thu May 10 11:56:29 2007 +1000 @@ -58,6 +58,16 @@ union grant_combo { } shorts; }; +/* Used to share code between unmap_grant_ref and unmap_and_replace. */ +struct gnttab_unmap_common { + uint64_t host_addr; + uint64_t dev_bus_addr; + uint64_t new_addr; + grant_handle_t handle; + + int16_t status; +}; + #define PIN_FAIL(_lbl, _rc, _f, _a...) 
\ do { \ gdprintk(XENLOG_WARNING, _f, ## _a ); \ @@ -397,8 +407,8 @@ gnttab_map_grant_ref( } static void -__gnttab_unmap_grant_ref( - struct gnttab_unmap_grant_ref *op) +__gnttab_unmap_common( + struct gnttab_unmap_common *op) { domid_t dom; grant_ref_t ref; @@ -477,8 +487,8 @@ __gnttab_unmap_grant_ref( if ( (op->host_addr != 0) && (flags & GNTMAP_host_map) ) { - if ( (rc = destroy_grant_host_mapping(op->host_addr, - frame, flags)) < 0 ) + if ( (rc = replace_grant_host_mapping(op->host_addr, + frame, op->new_addr, flags)) < 0 ) goto unmap_out; ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask)); @@ -518,6 +528,20 @@ __gnttab_unmap_grant_ref( rcu_unlock_domain(rd); } +static void +__gnttab_unmap_grant_ref( + struct gnttab_unmap_grant_ref *op) +{ + struct gnttab_unmap_common common = { + .host_addr = op->host_addr, + .dev_bus_addr = op->dev_bus_addr, + .handle = op->handle, + }; + + __gnttab_unmap_common(&common); + op->status = common.status; +} + static long gnttab_unmap_grant_ref( XEN_GUEST_HANDLE(gnttab_unmap_grant_ref_t) uop, unsigned int count) @@ -530,6 +554,44 @@ gnttab_unmap_grant_ref( if ( unlikely(__copy_from_guest_offset(&op, uop, i, 1)) ) goto fault; __gnttab_unmap_grant_ref(&op); + if ( unlikely(__copy_to_guest_offset(uop, i, &op, 1)) ) + goto fault; + } + + flush_tlb_mask(current->domain->domain_dirty_cpumask); + return 0; + +fault: + flush_tlb_mask(current->domain->domain_dirty_cpumask); + return -EFAULT; +} + +static void +__gnttab_unmap_and_replace( + struct gnttab_unmap_and_replace *op) +{ + struct gnttab_unmap_common common = { + .host_addr = op->host_addr, + .new_addr = op->new_addr, + .handle = op->handle, + }; + + __gnttab_unmap_common(&common); + op->status = common.status; +} + +static long +gnttab_unmap_and_replace( + XEN_GUEST_HANDLE(gnttab_unmap_and_replace_t) uop, unsigned int count) +{ + int i; + struct gnttab_unmap_and_replace op; + + for ( i = 0; i < count; i++ ) + { + if ( unlikely(__copy_from_guest_offset(&op, uop, i, 1)) ) + goto fault; + __gnttab_unmap_and_replace(&op); if ( unlikely(__copy_to_guest_offset(uop, i, &op, 1)) ) goto fault; } @@ -1203,6 +1265,21 @@ do_grant_table_op( if ( unlikely(!grant_operation_permitted(d)) ) goto out; rc = gnttab_unmap_grant_ref(unmap, count); + break; + } + case GNTTABOP_unmap_and_replace: + { + XEN_GUEST_HANDLE(gnttab_unmap_and_replace_t) unmap + guest_handle_cast(uop, gnttab_unmap_and_replace_t); + if ( unlikely(!guest_handle_okay(unmap, count)) ) + goto out; + rc = -EPERM; + if ( unlikely(!grant_operation_permitted(d)) ) + goto out; + rc = -ENOSYS; + if ( unlikely(!replace_grant_supported()) ) + goto out; + rc = gnttab_unmap_and_replace(unmap, count); break; } case GNTTABOP_setup_table: diff -r 3ef0510e44d0 xen/include/asm-ia64/grant_table.h --- a/xen/include/asm-ia64/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/asm-ia64/grant_table.h Thu May 10 11:56:29 2007 +1000 @@ -9,7 +9,7 @@ // for grant map/unmap int create_grant_host_mapping(unsigned long gpaddr, unsigned long mfn, unsigned int flags); -int destroy_grant_host_mapping(unsigned long gpaddr, unsigned long mfn, unsigned int flags); +int replace_grant_host_mapping(unsigned long gpaddr, unsigned long mfn, unsigned long new_gpaddr, unsigned int flags); // for grant transfer void guest_physmap_add_page(struct domain *d, unsigned long gpfn, unsigned long mfn); @@ -67,4 +67,9 @@ static inline void gnttab_clear_flag(uns #define gnttab_release_put_page(page) put_page((page)) #define gnttab_release_put_page_and_type(page) put_page_and_type((page)) 
+static inline int replace_grant_supported(void) +{ + return 0; +} + #endif /* __ASM_GRANT_TABLE_H__ */ diff -r 3ef0510e44d0 xen/include/asm-powerpc/grant_table.h --- a/xen/include/asm-powerpc/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/asm-powerpc/grant_table.h Thu May 10 11:56:29 2007 +1000 @@ -35,8 +35,9 @@ extern long pte_remove(ulong flags, ulon int create_grant_host_mapping( unsigned long addr, unsigned long frame, unsigned int flags); -int destroy_grant_host_mapping( - unsigned long addr, unsigned long frame, unsigned int flags); +int replace_grant_host_mapping( + unsigned long addr, unsigned long frame, unsigned long new_addr, + unsigned int flags); #define gnttab_create_shared_page(d, t, i) \ do { \ @@ -82,4 +83,8 @@ static inline uint cpu_foreign_map_order #define gnttab_release_put_page_and_type(page) do { } while (0) #endif +static inline int replace_grant_supported(void) +{ + return 0; +} #endif /* __ASM_PPC_GRANT_TABLE_H__ */ diff -r 3ef0510e44d0 xen/include/asm-x86/grant_table.h --- a/xen/include/asm-x86/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/asm-x86/grant_table.h Thu May 10 11:56:29 2007 +1000 @@ -15,8 +15,8 @@ */ int create_grant_host_mapping( uint64_t addr, unsigned long frame, unsigned int flags); -int destroy_grant_host_mapping( - uint64_t addr, unsigned long frame, unsigned int flags); +int replace_grant_host_mapping( + uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags); #define gnttab_create_shared_page(d, t, i) \ do { \ @@ -48,4 +48,9 @@ static inline void gnttab_clear_flag(uns /* Done implicitly when page tables are destroyed. */ \ } while (0) +static inline int replace_grant_supported(void) +{ + return 1; +} + #endif /* __ASM_GRANT_TABLE_H__ */ diff -r 3ef0510e44d0 xen/include/public/grant_table.h --- a/xen/include/public/grant_table.h Tue May 08 10:21:23 2007 +0100 +++ b/xen/include/public/grant_table.h Thu May 10 11:56:29 2007 +1000 @@ -327,6 +327,29 @@ struct gnttab_query_size { }; typedef struct gnttab_query_size gnttab_query_size_t; DEFINE_XEN_GUEST_HANDLE(gnttab_query_size_t); + +/* + * GNTTABOP_unmap_and_replace: Destroy one or more grant-reference mappings + * tracked by <handle> but atomically replace the page table entry with one + * pointing to the machine address under <new_addr>. <new_addr> will be + * redirected to the null entry. + * NOTES: + * 1. The call may fail in an undefined manner if either mapping is not + * tracked by <handle>. + * 2. After executing a batch of unmaps, it is guaranteed that no stale + * mappings will remain in the device or host TLBs. + */ +#define GNTTABOP_unmap_and_replace 7 +struct gnttab_unmap_and_replace { + /* IN parameters. */ + uint64_t host_addr; + uint64_t new_addr; + grant_handle_t handle; + /* OUT parameters. */ + int16_t status; /* GNTST_* */ +}; +typedef struct gnttab_unmap_and_replace gnttab_unmap_and_replace_t; +DEFINE_XEN_GUEST_HANDLE(gnttab_unmap_and_replace_t); /* _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
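For completeness, here is how a guest might batch several of the new operations in a single hypercall. A hedged sketch (unmap_and_replace_batch() is a hypothetical helper, and the -EIO error policy is illustrative); note that per note 2 in the public header above, the hypervisor flushes stale TLB entries once for the whole batch:

static int unmap_and_replace_batch(struct gnttab_unmap_and_replace *ops,
				   unsigned int count)
{
	unsigned int i;
	int rc;

	rc = HYPERVISOR_grant_table_op(GNTTABOP_unmap_and_replace,
				       ops, count);
	if (rc)
		return rc;

	/* Each element carries its own GNTST_* result. */
	for (i = 0; i < count; i++)
		if (ops[i].status != GNTST_okay)
			return -EIO;

	return 0;
}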
Hi: [LINUX] gnttab: Add basic DMA tracking This patch adds basic tracking of outstanding DMA requests on grant table entries marked as PageForeign. When a PageForeign struct page is about to be mapped for DMA, we set its map count to 1 (or zero in actual value). This is then checked for when we need to free a grant table entry early to ensure that we don''t free an entry that''s currently used for DMA. So any entry that has been marked for DMA will not be freed early. If the unmapping API had a struct page (which exists for the sg case) then we could do this properly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- diff -r 3ef0510e44d0 linux-2.6-xen-sparse/arch/i386/kernel/pci-dma-xen.c --- a/linux-2.6-xen-sparse/arch/i386/kernel/pci-dma-xen.c Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/arch/i386/kernel/pci-dma-xen.c Wed May 16 22:31:20 2007 +1000 @@ -15,6 +15,7 @@ #include <linux/version.h> #include <asm/io.h> #include <xen/balloon.h> +#include <xen/gnttab.h> #include <asm/swiotlb.h> #include <asm/tlbflush.h> #include <asm-i386/mach-xen/asm/swiotlb.h> @@ -90,7 +91,7 @@ dma_map_sg(struct device *hwdev, struct } else { for (i = 0; i < nents; i++ ) { sg[i].dma_address - page_to_bus(sg[i].page) + sg[i].offset; + gnttab_dma_map_page(sg[i].page) + sg[i].offset; sg[i].dma_length = sg[i].length; BUG_ON(!sg[i].page); IOMMU_BUG_ON(address_needs_mapping( @@ -108,9 +109,15 @@ dma_unmap_sg(struct device *hwdev, struc dma_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nents, enum dma_data_direction direction) { + int i; + BUG_ON(direction == DMA_NONE); if (swiotlb) swiotlb_unmap_sg(hwdev, sg, nents, direction); + else { + for (i = 0; i < nents; i++ ) + gnttab_dma_unmap_page(sg[i].dma_address); + } } EXPORT_SYMBOL(dma_unmap_sg); @@ -127,7 +134,7 @@ dma_map_page(struct device *dev, struct dma_addr = swiotlb_map_page( dev, page, offset, size, direction); } else { - dma_addr = page_to_bus(page) + offset; + dma_addr = gnttab_dma_map_page(page) + offset; IOMMU_BUG_ON(address_needs_mapping(dev, dma_addr)); } @@ -142,6 +149,8 @@ dma_unmap_page(struct device *dev, dma_a BUG_ON(direction == DMA_NONE); if (swiotlb) swiotlb_unmap_page(dev, dma_address, size, direction); + else + gnttab_dma_unmap_page(dma_address); } EXPORT_SYMBOL(dma_unmap_page); #endif /* CONFIG_HIGHMEM */ @@ -326,7 +335,8 @@ dma_map_single(struct device *dev, void if (swiotlb) { dma = swiotlb_map_single(dev, ptr, size, direction); } else { - dma = virt_to_bus(ptr); + dma = gnttab_dma_map_page(virt_to_page(ptr)) + + offset_in_page(ptr); IOMMU_BUG_ON(range_straddles_page_boundary(ptr, size)); IOMMU_BUG_ON(address_needs_mapping(dev, dma)); } @@ -344,6 +354,8 @@ dma_unmap_single(struct device *dev, dma BUG(); if (swiotlb) swiotlb_unmap_single(dev, dma_addr, size, direction); + else + gnttab_dma_unmap_page(dma_addr); } EXPORT_SYMBOL(dma_unmap_single); diff -r 3ef0510e44d0 linux-2.6-xen-sparse/arch/i386/kernel/swiotlb.c --- a/linux-2.6-xen-sparse/arch/i386/kernel/swiotlb.c Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/arch/i386/kernel/swiotlb.c Wed May 16 22:31:20 2007 +1000 @@ -25,14 +25,13 @@ #include <asm/pci.h> #include <asm/dma.h> #include <asm/uaccess.h> +#include <xen/gnttab.h> #include <xen/interface/memory.h> int swiotlb; EXPORT_SYMBOL(swiotlb); #define OFFSET(val,align) ((unsigned 
long)((val) & ( (align) - 1))) - -#define SG_ENT_PHYS_ADDRESS(sg) (page_to_bus((sg)->page) + (sg)->offset) /* * Maximum allowable number of contiguous slabs to map, @@ -468,7 +467,8 @@ dma_addr_t dma_addr_t swiotlb_map_single(struct device *hwdev, void *ptr, size_t size, int dir) { - dma_addr_t dev_addr = virt_to_bus(ptr); + dma_addr_t dev_addr = gnttab_dma_map_page(virt_to_page(ptr)) + + offset_in_page(ptr); void *map; struct phys_addr buffer; @@ -486,6 +486,7 @@ swiotlb_map_single(struct device *hwdev, /* * Oh well, have to allocate and map a bounce buffer. */ + gnttab_dma_unmap_page(dev_addr); buffer.page = virt_to_page(ptr); buffer.offset = (unsigned long)ptr & ~PAGE_MASK; map = map_single(hwdev, buffer, size, dir); @@ -513,6 +514,8 @@ swiotlb_unmap_single(struct device *hwde BUG_ON(dir == DMA_NONE); if (in_swiotlb_aperture(dev_addr)) unmap_single(hwdev, bus_to_virt(dev_addr), size, dir); + else + gnttab_dma_unmap_page(dev_addr); } /* @@ -571,8 +574,10 @@ swiotlb_map_sg(struct device *hwdev, str BUG_ON(dir == DMA_NONE); for (i = 0; i < nelems; i++, sg++) { - dev_addr = SG_ENT_PHYS_ADDRESS(sg); + dev_addr = gnttab_dma_map_page(sg->page) + sg->offset; + if (address_needs_mapping(hwdev, dev_addr)) { + gnttab_dma_unmap_page(dev_addr); buffer.page = sg->page; buffer.offset = sg->offset; map = map_single(hwdev, buffer, sg->length, dir); @@ -605,10 +610,12 @@ swiotlb_unmap_sg(struct device *hwdev, s BUG_ON(dir == DMA_NONE); for (i = 0; i < nelems; i++, sg++) - if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg)) + if (in_swiotlb_aperture(sg->dma_address)) unmap_single(hwdev, (void *)bus_to_virt(sg->dma_address), sg->dma_length, dir); + else + gnttab_dma_unmap_page(sg->dma_address); } /* @@ -627,7 +634,7 @@ swiotlb_sync_sg_for_cpu(struct device *h BUG_ON(dir == DMA_NONE); for (i = 0; i < nelems; i++, sg++) - if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg)) + if (in_swiotlb_aperture(sg->dma_address)) sync_single(hwdev, (void *)bus_to_virt(sg->dma_address), sg->dma_length, dir); @@ -642,7 +649,7 @@ swiotlb_sync_sg_for_device(struct device BUG_ON(dir == DMA_NONE); for (i = 0; i < nelems; i++, sg++) - if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg)) + if (in_swiotlb_aperture(sg->dma_address)) sync_single(hwdev, (void *)bus_to_virt(sg->dma_address), sg->dma_length, dir); @@ -659,8 +666,9 @@ swiotlb_map_page(struct device *hwdev, s dma_addr_t dev_addr; char *map; - dev_addr = page_to_bus(page) + offset; + dev_addr = gnttab_dma_map_page(page) + offset; if (address_needs_mapping(hwdev, dev_addr)) { + gnttab_dma_unmap_page(dev_addr); buffer.page = page; buffer.offset = offset; map = map_single(hwdev, buffer, size, direction); @@ -681,6 +689,8 @@ swiotlb_unmap_page(struct device *hwdev, BUG_ON(direction == DMA_NONE); if (in_swiotlb_aperture(dma_address)) unmap_single(hwdev, bus_to_virt(dma_address), size, direction); + else + gnttab_dma_unmap_page(dma_address); } #endif diff -r 3ef0510e44d0 linux-2.6-xen-sparse/drivers/xen/core/gnttab.c --- a/linux-2.6-xen-sparse/drivers/xen/core/gnttab.c Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/drivers/xen/core/gnttab.c Wed May 16 22:31:20 2007 +1000 @@ -490,6 +490,128 @@ static int gnttab_map(unsigned int start return 0; } +static void gnttab_page_free(struct page *page) +{ + if (page->mapping) { + put_page((struct page *)page->mapping); + page->mapping = NULL; + } + + ClearPageForeign(page); + gnttab_reset_grant_page(page); + put_page(page); +} + +/* + * Must not be called with IRQs off. This should only be used on the + * slow path. 
+ * + * Copy a foreign granted page to local memory. + */ +int gnttab_copy_grant_page(grant_ref_t ref, struct page **pagep) +{ + struct gnttab_unmap_and_replace unmap; + mmu_update_t mmu; + struct page *page; + struct page *new_page; + void *new_addr; + void *addr; + paddr_t pfn; + maddr_t mfn; + maddr_t new_mfn; + int err; + + page = *pagep; + if (!get_page_unless_zero(page)) + return -ENOENT; + + err = -ENOMEM; + new_page = alloc_page(GFP_ATOMIC | __GFP_NOWARN); + if (!new_page) + goto out; + + new_addr = page_address(new_page); + addr = page_address(page); + memcpy(new_addr, addr, PAGE_SIZE); + + pfn = page_to_pfn(page); + mfn = pfn_to_mfn(pfn); + new_mfn = virt_to_mfn(new_addr); + + if (!xen_feature(XENFEAT_auto_translated_physmap)) { + set_phys_to_machine(pfn, new_mfn); + set_phys_to_machine(page_to_pfn(new_page), INVALID_P2M_ENTRY); + + mmu.ptr = (new_mfn << PAGE_SHIFT) | MMU_MACHPHYS_UPDATE; + mmu.val = pfn; + err = HYPERVISOR_mmu_update(&mmu, 1, NULL, DOMID_SELF); + BUG_ON(err); + } + + gnttab_set_replace_op(&unmap, (unsigned long)addr, + (unsigned long)new_addr, ref); + + err = HYPERVISOR_grant_table_op(GNTTABOP_unmap_and_replace, + &unmap, 1); + BUG_ON(err); + BUG_ON(unmap.status); + + new_page->mapping = page->mapping; + new_page->index = page->index; + set_bit(PG_foreign, &new_page->flags); + *pagep = new_page; + + SetPageForeign(page, gnttab_page_free); + page->mapping = NULL; + + /* + * Ensure that there is a barrier between setting the p2m entry + * and checking the map count. See gnttab_dma_map_page. + */ + smp_mb(); + + /* Has the page been DMA-mapped? */ + if (unlikely(page_mapped(page))) { + err = -EBUSY; + page->mapping = (void *)new_page; + } + +out: + put_page(page); + return err; + +} +EXPORT_SYMBOL(gnttab_copy_grant_page); + +/* + * Keep track of foreign pages marked as PageForeign so that we don''t + * return them to the remote domain prematurely. + * + * PageForeign pages are pinned down by increasing their mapcount. + * + * All other pages are simply returned as is. + */ +maddr_t gnttab_dma_map_page(struct page *page) +{ + maddr_t mfn = pfn_to_mfn(page_to_pfn(page)), mfn2; + + if (!PageForeign(page)) + return mfn << PAGE_SHIFT; + + if (mfn_to_local_pfn(mfn) < max_mapnr) + return mfn << PAGE_SHIFT; + + atomic_set(&page->_mapcount, 0); + + /* This barrier corresponds to the one in gnttab_copy_grant_page. */ + smp_mb(); + + /* Has this page been copied in the mean time? 
*/ + mfn2 = pfn_to_mfn(page_to_pfn(page)); + + return mfn2 << PAGE_SHIFT; +} + int gnttab_resume(void) { if (max_nr_grant_frames() < nr_grant_frames) diff -r 3ef0510e44d0 linux-2.6-xen-sparse/include/xen/gnttab.h --- a/linux-2.6-xen-sparse/include/xen/gnttab.h Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/include/xen/gnttab.h Wed May 16 22:31:20 2007 +1000 @@ -39,6 +39,7 @@ #include <asm/hypervisor.h> #include <asm/maddr.h> /* maddr_t */ +#include <linux/mm.h> #include <xen/interface/grant_table.h> #include <xen/features.h> @@ -101,6 +102,19 @@ void gnttab_grant_foreign_transfer_ref(g void gnttab_grant_foreign_transfer_ref(grant_ref_t, domid_t domid, unsigned long pfn); +int gnttab_copy_grant_page(grant_ref_t ref, struct page **pagep); +maddr_t gnttab_dma_map_page(struct page *page); + +static inline void gnttab_dma_unmap_page(maddr_t mfn) +{ +} + +static inline void gnttab_reset_grant_page(struct page *page) +{ + init_page_count(page); + reset_page_mapcount(page); +} + int gnttab_suspend(void); int gnttab_resume(void); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
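The subtle part of this patch is the lock-free handshake between gnttab_dma_map_page() and gnttab_copy_grant_page(). Condensed to its essentials (the helper names here are hypothetical; the full versions are in the diff above):

/* DMA-map side, from gnttab_dma_map_page(): */
static maddr_t dma_pin_foreign(struct page *page)
{
	atomic_set(&page->_mapcount, 0);  /* pin: mark page as DMA-mapped */
	smp_mb();                         /* pairs with the copy side */
	/* Re-read the p2m entry: if a copy raced with us, we pick up the
	 * replacement frame published by set_phys_to_machine(). */
	return (maddr_t)pfn_to_mfn(page_to_pfn(page)) << PAGE_SHIFT;
}

/* Copy side, from gnttab_copy_grant_page(): */
static int copy_check_dma(struct page *page, unsigned long pfn,
			  unsigned long new_mfn)
{
	set_phys_to_machine(pfn, new_mfn);  /* publish replacement frame */
	smp_mb();                           /* pairs with the DMA side */
	return page_mapped(page) ? -EBUSY : 0;
}

Whichever side writes first is guaranteed to be observed by the other side's subsequent read: either the DMA mapping picks up the new frame, or the copy path sees the page is still DMA-mapped, returns -EBUSY, and the old frame stays alive until the DMA unmap releases it.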
Hi:

[NET] back: Add lazy copying

This patch adds lazy copying using the new unmap_and_replace grant table operation. We keep a list of pending entries sorted by arrival order. We'll process this list every time net_tx_action is invoked. We ensure that net_tx_action is invoked within one second of the arrival of the first packet in the list. When we process the list, any entry that has been around for more than half a second is copied. This allows us to free the grant table entry and return it to domU.

If the new grant table operation is not available (e.g., old HV or architectures that don't support it yet) we simply copy each packet as we receive it, using skb_linearize. We also disable SG/TSO if this is the case.

By default the new code is disabled. In order to enable it, the module needs to be loaded with the argument copy_skb=1.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff -r 3ef0510e44d0 linux-2.6-xen-sparse/drivers/xen/netback/common.h --- a/linux-2.6-xen-sparse/drivers/xen/netback/common.h Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/common.h Wed May 16 22:31:20 2007 +1000 @@ -114,6 +114,14 @@ typedef struct netif_st { #define netback_carrier_off(netif) ((netif)->carrier = 0) #define netback_carrier_ok(netif) ((netif)->carrier) +enum { + NETBK_DONT_COPY_SKB, + NETBK_DELAYED_COPY_SKB, + NETBK_ALWAYS_COPY_SKB, +}; + +extern int netbk_copy_skb_mode; + #define NET_TX_RING_SIZE __RING_SIZE((netif_tx_sring_t *)0, PAGE_SIZE) #define NET_RX_RING_SIZE __RING_SIZE((netif_rx_sring_t *)0, PAGE_SIZE) diff -r 3ef0510e44d0 linux-2.6-xen-sparse/drivers/xen/netback/netback.c --- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Wed May 16 22:31:20 2007 +1000 @@ -49,6 +49,11 @@ struct netbk_rx_meta { int copy:1; }; +struct netbk_tx_pending_inuse { + struct list_head list; + unsigned long alloc_time; +}; + static void netif_idx_release(u16 pending_idx); static void netif_page_release(struct page *page); static void make_tx_response(netif_t *netif, @@ -68,15 +73,21 @@ static DECLARE_TASKLET(net_rx_tasklet, n static DECLARE_TASKLET(net_rx_tasklet, net_rx_action, 0); static struct timer_list net_timer; +static struct timer_list netbk_tx_pending_timer; #define MAX_PENDING_REQS 256 static struct sk_buff_head rx_queue; static struct page **mmap_pages; +static inline unsigned long idx_to_pfn(unsigned int idx) +{ + return page_to_pfn(mmap_pages[idx]); +} + static inline unsigned long idx_to_kaddr(unsigned int idx) { - return (unsigned long)pfn_to_kaddr(page_to_pfn(mmap_pages[idx])); + return (unsigned long)pfn_to_kaddr(idx_to_pfn(idx)); } #define PKT_PROT_LEN 64 @@ -95,6 +106,10 @@ static u16 dealloc_ring[MAX_PENDING_REQS static u16 dealloc_ring[MAX_PENDING_REQS]; static PEND_RING_IDX dealloc_prod, dealloc_cons; +/* Doubly-linked list of in-use pending entries. 
*/ +static struct netbk_tx_pending_inuse pending_inuse[MAX_PENDING_REQS]; +static LIST_HEAD(pending_inuse_head); + static struct sk_buff_head tx_queue; static grant_handle_t grant_tx_handle[MAX_PENDING_REQS]; @@ -107,6 +122,13 @@ static spinlock_t net_schedule_list_lock #define MAX_MFN_ALLOC 64 static unsigned long mfn_list[MAX_MFN_ALLOC]; static unsigned int alloc_index = 0; + +/* Setting this allows the safe use of this driver without netloop. */ +static int MODPARM_copy_skb; +module_param_named(copy_skb, MODPARM_copy_skb, bool, 0); +MODULE_PARM_DESC(copy_skb, "Copy data received from netfront without netloop"); + +int netbk_copy_skb_mode; static inline unsigned long alloc_mfn(void) { @@ -719,6 +741,11 @@ static void net_alarm(unsigned long unus tasklet_schedule(&net_rx_tasklet); } +static void netbk_tx_pending_timeout(unsigned long unused) +{ + tasklet_schedule(&net_tx_tasklet); +} + struct net_device_stats *netif_be_get_stats(struct net_device *dev) { netif_t *netif = netdev_priv(dev); @@ -812,46 +839,97 @@ static void tx_credit_callback(unsigned netif_schedule_work(netif); } +static inline int copy_pending_req(PEND_RING_IDX pending_idx) +{ + return gnttab_copy_grant_page(grant_tx_handle[pending_idx], + &mmap_pages[pending_idx]); +} + inline static void net_tx_action_dealloc(void) { + struct netbk_tx_pending_inuse *inuse, *n; gnttab_unmap_grant_ref_t *gop; u16 pending_idx; PEND_RING_IDX dc, dp; netif_t *netif; int ret; + LIST_HEAD(list); dc = dealloc_cons; - dp = dealloc_prod; - - /* Ensure we see all indexes enqueued by netif_idx_release(). */ - smp_rmb(); + gop = tx_unmap_ops; /* * Free up any grants we have finished using */ - gop = tx_unmap_ops; - while (dc != dp) { - pending_idx = dealloc_ring[MASK_PEND_IDX(dc++)]; - gnttab_set_unmap_op(gop, idx_to_kaddr(pending_idx), - GNTMAP_host_map, - grant_tx_handle[pending_idx]); - gop++; - } + do { + dp = dealloc_prod; + + /* Ensure we see all indices enqueued by netif_idx_release(). */ + smp_rmb(); + + while (dc != dp) { + unsigned long pfn; + + pending_idx = dealloc_ring[MASK_PEND_IDX(dc++)]; + list_move_tail(&pending_inuse[pending_idx].list, &list); + + pfn = idx_to_pfn(pending_idx); + /* Already unmapped? */ + if (!phys_to_machine_mapping_valid(pfn)) + continue; + + gnttab_set_unmap_op(gop, idx_to_kaddr(pending_idx), + GNTMAP_host_map, + grant_tx_handle[pending_idx]); + gop++; + } + + if (netbk_copy_skb_mode != NETBK_DELAYED_COPY_SKB || + list_empty(&pending_inuse_head)) + break; + + /* Copy any entries that have been pending for too long. */ + list_for_each_entry_safe(inuse, n, &pending_inuse_head, list) { + if (time_after(inuse->alloc_time + HZ / 2, jiffies)) + break; + + switch (copy_pending_req(inuse - pending_inuse)) { + case 0: + list_move_tail(&inuse->list, &list); + continue; + case -EBUSY: + list_del_init(&inuse->list); + continue; + case -ENOENT: + continue; + } + + break; + } + } while (dp != dealloc_prod); + + dealloc_cons = dc; + ret = HYPERVISOR_grant_table_op( GNTTABOP_unmap_grant_ref, tx_unmap_ops, gop - tx_unmap_ops); BUG_ON(ret); - while (dealloc_cons != dp) { - pending_idx = dealloc_ring[MASK_PEND_IDX(dealloc_cons++)]; + list_for_each_entry_safe(inuse, n, &list, list) { + pending_idx = inuse - pending_inuse; netif = pending_tx_info[pending_idx].netif; make_tx_response(netif, &pending_tx_info[pending_idx].req, NETIF_RSP_OKAY); + /* Ready for next use. 
*/ + gnttab_reset_grant_page(mmap_pages[pending_idx]); + pending_ring[MASK_PEND_IDX(pending_prod++)] = pending_idx; netif_put(netif); + + list_del_init(&inuse->list); } } @@ -1023,6 +1101,11 @@ static void netbk_fill_frags(struct sk_b unsigned long pending_idx; pending_idx = (unsigned long)frag->page; + + pending_inuse[pending_idx].alloc_time = jiffies; + list_add_tail(&pending_inuse[pending_idx].list, + &pending_inuse_head); + txp = &pending_tx_info[pending_idx].req; frag->page = virt_to_page(idx_to_kaddr(pending_idx)); frag->size = txp->size; @@ -1311,8 +1394,24 @@ static void net_tx_action(unsigned long netif->stats.rx_bytes += skb->len; netif->stats.rx_packets++; + if (unlikely(netbk_copy_skb_mode == NETBK_ALWAYS_COPY_SKB) && + unlikely(skb_linearize(skb))) { + DPRINTK("Can''t linearize skb in net_tx_action.\n"); + kfree_skb(skb); + continue; + } + netif_rx(skb); netif->dev->last_rx = jiffies; + } + + if (netbk_copy_skb_mode == NETBK_DELAYED_COPY_SKB && + !list_empty(&pending_inuse_head)) { + struct netbk_tx_pending_inuse *oldest; + + oldest = list_entry(pending_inuse_head.next, + struct netbk_tx_pending_inuse, list); + mod_timer(&netbk_tx_pending_timer, oldest->alloc_time + HZ); } } @@ -1333,9 +1432,6 @@ static void netif_idx_release(u16 pendin static void netif_page_release(struct page *page) { - /* Ready for next use. */ - init_page_count(page); - netif_idx_release(netif_page_index(page)); } @@ -1457,6 +1553,10 @@ static int __init netback_init(void) net_timer.data = 0; net_timer.function = net_alarm; + init_timer(&netbk_tx_pending_timer); + netbk_tx_pending_timer.data = 0; + netbk_tx_pending_timer.function = netbk_tx_pending_timeout; + mmap_pages = alloc_empty_pages_and_pagevec(MAX_PENDING_REQS); if (mmap_pages == NULL) { printk("%s: out of memory\n", __FUNCTION__); @@ -1467,6 +1567,7 @@ static int __init netback_init(void) page = mmap_pages[i]; SetPageForeign(page, netif_page_release); netif_page_index(page) = i; + INIT_LIST_HEAD(&pending_inuse[i].list); } pending_cons = 0; @@ -1476,6 +1577,15 @@ static int __init netback_init(void) spin_lock_init(&net_schedule_list_lock); INIT_LIST_HEAD(&net_schedule_list); + + netbk_copy_skb_mode = NETBK_DONT_COPY_SKB; + if (MODPARM_copy_skb) { + if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_and_replace, + NULL, 0)) + netbk_copy_skb_mode = NETBK_ALWAYS_COPY_SKB; + else + netbk_copy_skb_mode = NETBK_DELAYED_COPY_SKB; + } netif_xenbus_init(); diff -r 3ef0510e44d0 linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c --- a/linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c Tue May 08 10:21:23 2007 +0100 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c Wed May 16 22:31:20 2007 +1000 @@ -62,6 +62,7 @@ static int netback_probe(struct xenbus_d const char *message; struct xenbus_transaction xbt; int err; + int sg; struct backend_info *be = kzalloc(sizeof(struct backend_info), GFP_KERNEL); if (!be) { @@ -73,6 +74,10 @@ static int netback_probe(struct xenbus_d be->dev = dev; dev->dev.driver_data = be; + sg = 1; + if (netbk_copy_skb_mode == NETBK_ALWAYS_COPY_SKB) + sg = 0; + do { err = xenbus_transaction_start(&xbt); if (err) { @@ -80,14 +85,14 @@ static int netback_probe(struct xenbus_d goto fail; } - err = xenbus_printf(xbt, dev->nodename, "feature-sg", "%d", 1); + err = xenbus_printf(xbt, dev->nodename, "feature-sg", "%d", sg); if (err) { message = "writing feature-sg"; goto abort_transaction; } err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv4", - "%d", 1); + "%d", sg); if (err) { message = "writing feature-gso-tcpv4"; goto abort_transaction; 