Hi Rusty,
Sorry 'bout the lag ...
On Fri, 2008-05-02 at 20:55 +1000, Rusty Russell wrote:
> On Thursday 01 May 2008 00:31:46 Mark McLoughlin wrote:
> > virtio_net currently only frees old transmit skbs just
> > before queueing new ones. If the queue is full, it then
> > enables interrupts and waits for notification that more
> > work has been performed.
>
> Hi Mark,
>
> This patch is fine, but it's better to do it from skb_xmit_done().
Unless I'm missing something, we only get this callback when we've
stopped the queue and we're waiting for buffers to be freed up.
In the normal case, where the callback is disabled, we don't get any
notification that the host has finished with the buffer ... hence the
need for a timer.
2.6.25-rc2 rebase below.
> Of course, this is usually called from an interrupt handler, so it's not
> entirely trivial: we can't free the skbs there.
>
> A softirq is probably the answer here, but AFAICT that's old-fashioned.
> Not sure what the right way of doing this is now...
Thanks,
Mark.
Subject: [PATCH] virtio_net: free transmit skbs in a timer
virtio_net currently only frees old transmit skbs just
before queueing new ones. If the queue is full, it then
enables interrupts and waits for notification that more
work has been performed.
However, a side-effect of this scheme is that there are
always xmit skbs left dangling when no new packets are
sent, contrary to the Documentation/networking/driver.txt
guideline:
"... it is not allowed for your TX mitigation scheme
to let TX packets "hang out" in the TX ring unreclaimed
forever if no new TX packets are sent."
Add a timer to ensure that any time we queue new TX
skbs, we will shortly free them again.
This fixes an easily reproduced hang at shutdown where
iptables attempts to unload nf_conntrack and nf_conntrack
waits for an skb it is tracking to be freed, but virtio_net
never frees it.
Signed-off-by: Mark McLoughlin <markmc at redhat.com>
---
drivers/net/virtio_net.c | 30 ++++++++++++++++++++++++++++--
1 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f926b5a..69b308a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -44,6 +44,8 @@ struct virtnet_info
/* The skb we couldn't send because buffers were full. */
struct sk_buff *last_xmit_skb;
+ struct timer_list xmit_free_timer;
+
/* Number of input buffers, and max we've ever had. */
unsigned int num, max;
@@ -230,9 +232,23 @@ static void free_old_xmit_skbs(struct virtnet_info *vi)
}
}
+static void xmit_free(unsigned long data)
+{
+ struct virtnet_info *vi = (void *)data;
+
+ netif_tx_lock(vi->dev);
+
+ free_old_xmit_skbs(vi);
+
+ if (!skb_queue_empty(&vi->send))
+ mod_timer(&vi->xmit_free_timer, jiffies + (HZ/10));
+
+ netif_tx_unlock(vi->dev);
+}
+
static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
{
- int num;
+ int num, err;
struct scatterlist sg[2+MAX_SKB_FRAGS];
struct virtio_net_hdr *hdr;
const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
@@ -275,7 +291,11 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
vnet_hdr_to_sg(sg, skb);
num = skb_to_sgvec(skb, sg+1, 0, skb->len) + 1;
- return vi->svq->vq_ops->add_buf(vi->svq, sg, num, 0, skb);
+ err = vi->svq->vq_ops->add_buf(vi->svq, sg, num, 0, skb);
+ if (!err)
+ mod_timer(&vi->xmit_free_timer, jiffies + (HZ/10));
+
+ return err;
}
static int start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -428,6 +448,10 @@ static int virtnet_probe(struct virtio_device *vdev)
skb_queue_head_init(&vi->recv);
skb_queue_head_init(&vi->send);
+ init_timer(&vi->xmit_free_timer);
+ vi->xmit_free_timer.data = (unsigned long)vi;
+ vi->xmit_free_timer.function = xmit_free;
+
err = register_netdev(dev);
if (err) {
pr_debug("virtio_net: registering device failed\n");
@@ -465,6 +489,8 @@ static void virtnet_remove(struct virtio_device *vdev)
/* Stop all the virtqueues. */
vdev->config->reset(vdev);
+ del_timer_sync(&vi->xmit_free_timer);
+
/* Free our skbs in send and recv queues, if any. */
while ((skb = __skb_dequeue(&vi->recv)) != NULL) {
kfree_skb(skb);
--