David Vrabel
2013-Sep-11 13:52 UTC
[PATCHv2 net] xen-netback: count number required slots for an skb more carefully
From: David Vrabel <david.vrabel@citrix.com>

When a VM is providing an iSCSI target and the LUN is used by the
backend domain, the generated skbs for direct I/O writes to the disk
have large, multi-page skb->data but no frags.

With some lengths and starting offsets, xen_netbk_count_skb_slots()
would be one short because the simple calculation of
DIV_ROUND_UP(skb_headlen(), PAGE_SIZE) was not accounting for the
decisions made by start_new_rx_buffer() which does not guarantee
responses are fully packed.

For example, an skb with length < 2 pages but which spans 3 pages
would be counted as requiring 2 slots but would actually use 3 slots.

skb->data:    |        1111|222222222222|3333        |

Fully packed, this would need 2 slots:

              |111122222222|22223333    |

But because the 2nd page wholly fits into a slot it is not split
across slots and goes into a slot of its own:

              |1111        |222222222222|3333        |

Miscounting the number of slots means netback may push more responses
than the number of available requests.  This will cause the frontend
to get very confused and report "Too many frags/slots".  The frontend
never recovers and will eventually BUG.

Fix this by counting the number of required slots more carefully.  In
xen_netbk_count_skb_slots(), more closely follow the algorithm used by
xen_netbk_gop_skb() by introducing xen_netbk_count_frag_slots() which
is the dry-run equivalent of netbk_gop_frag_copy().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
--
v2: xen_netbk_* -> xenvif_* to match new style
---
 drivers/net/xen-netback/netback.c |   94 +++++++++++++++++++++++++------------
 1 files changed, 64 insertions(+), 30 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 956130c..f3e591c 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -212,6 +212,49 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
 	return false;
 }
 
+struct xenvif_count_slot_state {
+	unsigned long copy_off;
+	bool head;
+};
+
+unsigned int xenvif_count_frag_slots(struct xenvif *vif,
+				     unsigned long offset, unsigned long size,
+				     struct xenvif_count_slot_state *state)
+{
+	unsigned count = 0;
+
+	offset &= ~PAGE_MASK;
+
+	while (size > 0) {
+		unsigned long bytes;
+
+		bytes = PAGE_SIZE - offset;
+
+		if (bytes > size)
+			bytes = size;
+
+		if (start_new_rx_buffer(state->copy_off, bytes, state->head)) {
+			count++;
+			state->copy_off = 0;
+		}
+
+		if (state->copy_off + bytes > MAX_BUFFER_OFFSET)
+			bytes = MAX_BUFFER_OFFSET - state->copy_off;
+
+		state->copy_off += bytes;
+
+		offset += bytes;
+		size -= bytes;
+
+		if (offset == PAGE_SIZE)
+			offset = 0;
+
+		state->head = false;
+	}
+
+	return count;
+}
+
 /*
  * Figure out how many ring slots we're going to need to send @skb to
  * the guest. This function is essentially a dry run of
@@ -219,48 +262,39 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
  */
 unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
 {
+	struct xenvif_count_slot_state state;
 	unsigned int count;
-	int i, copy_off;
+	unsigned char *data;
+	unsigned i;
 
-	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
+	state.head = true;
+	state.copy_off = 0;
 
-	copy_off = skb_headlen(skb) % PAGE_SIZE;
+	/* Slot for the first (partial) page of data. */
+	count = 1;
 
+	/* Need a slot for the GSO prefix for GSO extra data? */
 	if (skb_shinfo(skb)->gso_size)
 		count++;
 
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
-		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
-		unsigned long bytes;
-
-		offset &= ~PAGE_MASK;
-
-		while (size > 0) {
-			BUG_ON(offset >= PAGE_SIZE);
-			BUG_ON(copy_off > MAX_BUFFER_OFFSET);
-
-			bytes = PAGE_SIZE - offset;
-
-			if (bytes > size)
-				bytes = size;
+	data = skb->data;
+	while (data < skb_tail_pointer(skb)) {
+		unsigned long offset = offset_in_page(data);
+		unsigned long size = PAGE_SIZE - offset;
 
-			if (start_new_rx_buffer(copy_off, bytes, 0)) {
-				count++;
-				copy_off = 0;
-			}
+		if (data + size > skb_tail_pointer(skb))
+			size = skb_tail_pointer(skb) - data;
 
-			if (copy_off + bytes > MAX_BUFFER_OFFSET)
-				bytes = MAX_BUFFER_OFFSET - copy_off;
+		count += xenvif_count_frag_slots(vif, offset, size, &state);
 
-			copy_off += bytes;
+		data += size;
+	}
 
-			offset += bytes;
-			size -= bytes;
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
+		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
 
-			if (offset == PAGE_SIZE)
-				offset = 0;
-		}
+		count += xenvif_count_frag_slots(vif, offset, size, &state);
 	}
 
 	return count;
 }
--
1.7.2.5
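[To see the miscount concretely, here is a minimal standalone C sketch, not the driver code: it assumes 4 KiB pages, MAX_BUFFER_OFFSET == PAGE_SIZE, hypothetical per-page chunk sizes 1096/4096/1000 chosen to match the three-page diagram above, and a paraphrase of start_new_rx_buffer()'s rule (start a fresh slot when the current one is full, or when a chunk would overflow it but fits whole in an empty slot, except for head data, which may be split). It replays the dry run and prints 3 slots where the old DIV_ROUND_UP estimate gives 2.]

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE          4096UL
#define MAX_BUFFER_OFFSET  PAGE_SIZE
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Paraphrased packing rule; see the hedging note above. */
static bool start_new_rx_buffer(unsigned long offset, unsigned long size,
                                bool head)
{
        if (offset == MAX_BUFFER_OFFSET)
                return true;
        if (offset + size > MAX_BUFFER_OFFSET &&
            size <= MAX_BUFFER_OFFSET && !head)
                return true;
        return false;
}

int main(void)
{
        /* skb->data starts 3000 bytes into its first page and spans
         * three pages: 1096 + 4096 + 1000 = 6192 bytes < 2 pages. */
        const unsigned long chunks[] = { 1096, 4096, 1000 };
        unsigned long len = 0, copy_off = 0;
        bool head = true;
        unsigned int count = 1; /* slot for the first (partial) page */

        for (int i = 0; i < 3; i++) {
                unsigned long size = chunks[i];

                len += size;
                while (size > 0) {
                        unsigned long bytes = size; /* already per-page */

                        if (start_new_rx_buffer(copy_off, bytes, head)) {
                                count++;
                                copy_off = 0;
                        }
                        /* Split anything that overruns the slot. */
                        if (copy_off + bytes > MAX_BUFFER_OFFSET)
                                bytes = MAX_BUFFER_OFFSET - copy_off;
                        copy_off += bytes;
                        size -= bytes;
                        head = false;
                }
        }

        /* Prints: old estimate 2 slots, dry run 3 slots. */
        printf("old estimate %lu slots, dry run %u slots\n",
               DIV_ROUND_UP(len, PAGE_SIZE), count);
        return 0;
}

[The second chunk triggers the "fits whole in an empty slot" rule, so it is not split across the slot holding the head bytes; that unpacked slot is exactly what DIV_ROUND_UP cannot see.]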
David Miller
2013-Sep-13 03:23 UTC
Re: [PATCHv2 net] xen-netback: count number required slots for an skb more carefully
From: David Vrabel <david.vrabel@citrix.com>
Date: Wed, 11 Sep 2013 14:52:48 +0100

> From: David Vrabel <david.vrabel@citrix.com>
>
> When a VM is providing an iSCSI target and the LUN is used by the
> backend domain, the generated skbs for direct I/O writes to the disk
> have large, multi-page skb->data but no frags.
>
> With some lengths and starting offsets, xen_netbk_count_skb_slots()
> would be one short because the simple calculation of
> DIV_ROUND_UP(skb_headlen(), PAGE_SIZE) was not accounting for the
> decisions made by start_new_rx_buffer() which does not guarantee
> responses are fully packed.
...
> Fix this by counting the number of required slots more carefully.  In
> xen_netbk_count_skb_slots(), more closely follow the algorithm used by
> xen_netbk_gop_skb() by introducing xen_netbk_count_frag_slots() which
> is the dry-run equivalent of netbk_gop_frag_copy().
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> --
> v2: xen_netbk_* -> xenvif_* to match new style

Thanks for resolving this into a final patch after so much back and
forth :-)

I assume you want this queued up for -stable, and can you check if
there is any non-trivial backporting for earlier kernels?
Ian Campbell
2013-Sep-17 14:05 UTC
Re: [PATCHv2 net] xen-netback: count number required slots for an skb more carefully
On Thu, 2013-09-12 at 23:23 -0400, David Miller wrote:
> I assume you want this queued up for -stable,

I think so, David V -- do you know how far back this goes?

> and can you check if
> there is any non-trivial backporting for earlier kernels?

Another one I hope David can answer...

Ian.
David Vrabel
2013-Sep-17 16:41 UTC
Re: [PATCHv2 net] xen-netback: count number required slots for an skb more carefully
On 17/09/13 15:05, Ian Campbell wrote:
> On Thu, 2013-09-12 at 23:23 -0400, David Miller wrote:
>> I assume you want this queued up for -stable,
>
> I think so, David V -- do you know how far back this goes?

Stable please, but only back to 3.10.

I suspect (but never had time to confirm) that this fixes a regression
introduced by "xen/netback: Calculate the number of SKB slots required
correctly" (e26b203e), which was added in 3.6.

>> and can you check if
>> there is any non-trivial backporting for earlier kernels?
>
> Another one I hope David can answer...

Backport to 3.10 "just works".

David