Willem de Bruijn
2017-Apr-02 20:10 UTC
[PATCH net-next 3/3] virtio-net: clean tx descriptors from rx napi
From: Willem de Bruijn <willemb at google.com>

Amortize the cost of virtual interrupts by doing both rx and tx work
on reception of a receive interrupt if tx napi is enabled. With
VIRTIO_F_EVENT_IDX, this suppresses most explicit tx completion
interrupts for bidirectional workloads.

Signed-off-by: Willem de Bruijn <willemb at google.com>
---
 drivers/net/virtio_net.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 95d938e82080..af830eb212bf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1030,12 +1030,34 @@ static int virtnet_receive(struct receive_queue *rq, int budget)
 	return received;
 }
 
+static void free_old_xmit_skbs(struct send_queue *sq);
+
+static void virtnet_poll_cleantx(struct receive_queue *rq)
+{
+	struct virtnet_info *vi = rq->vq->vdev->priv;
+	unsigned int index = vq2rxq(rq->vq);
+	struct send_queue *sq = &vi->sq[index];
+	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, index);
+
+	if (!sq->napi.weight)
+		return;
+
+	__netif_tx_lock(txq, smp_processor_id());
+	free_old_xmit_skbs(sq);
+	__netif_tx_unlock(txq);
+
+	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
+		netif_wake_subqueue(vi->dev, vq2txq(sq->vq));
+}
+
 static int virtnet_poll(struct napi_struct *napi, int budget)
 {
 	struct receive_queue *rq =
 		container_of(napi, struct receive_queue, napi);
 	unsigned int received;
 
+	virtnet_poll_cleantx(rq);
+
 	received = virtnet_receive(rq, budget);
 
 	/* Out of packets? */
-- 
2.12.2.564.g063fe858b8-goog
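For reference, the tx cleaning work that the new rx-side hook reuses lives in
free_old_xmit_skbs(). A simplified sketch of that helper as it exists in the
driver around this time, with the per-cpu statistics updates trimmed, so not
the exact upstream body:

	static void free_old_xmit_skbs(struct send_queue *sq)
	{
		struct sk_buff *skb;
		unsigned int len;

		/* Reclaim every tx buffer the device has marked as used. */
		while ((skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
			pr_debug("Sent skb %p\n", skb);
			/* Safe from napi, process, and hard-irq context. */
			dev_kfree_skb_any(skb);
		}

		/* The real helper also updates tx byte/packet counters here. */
	}

The point of the patch is that this reclaim loop now also runs from the rx
napi handler, so for bidirectional traffic most tx completion interrupts can
be suppressed via VIRTIO_F_EVENT_IDX.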
Willem de Bruijn
2017-Apr-03 05:02 UTC
[PATCH net-next 3/3] virtio-net: clean tx descriptors from rx napi
On Sun, Apr 2, 2017 at 10:47 PM, Michael S. Tsirkin <mst at redhat.com> wrote:
> On Sun, Apr 02, 2017 at 04:10:12PM -0400, Willem de Bruijn wrote:
>> From: Willem de Bruijn <willemb at google.com>
>>
>> Amortize the cost of virtual interrupts by doing both rx and tx work
>> on reception of a receive interrupt if tx napi is enabled. With
>> VIRTIO_F_EVENT_IDX, this suppresses most explicit tx completion
>> interrupts for bidirectional workloads.
>>
>> Signed-off-by: Willem de Bruijn <willemb at google.com>
>> ---
>>  drivers/net/virtio_net.c | 22 ++++++++++++++++++++++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 95d938e82080..af830eb212bf 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1030,12 +1030,34 @@ static int virtnet_receive(struct receive_queue *rq, int budget)
>>  	return received;
>>  }
>>
>> +static void free_old_xmit_skbs(struct send_queue *sq);
>> +
>
> Could you pls re-arrange code to avoid forward declarations?

Okay. I'll do the move in a separate patch to simplify review.

>> +static void virtnet_poll_cleantx(struct receive_queue *rq)
>> +{
>> +	struct virtnet_info *vi = rq->vq->vdev->priv;
>> +	unsigned int index = vq2rxq(rq->vq);
>> +	struct send_queue *sq = &vi->sq[index];
>> +	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, index);
>> +
>> +	if (!sq->napi.weight)
>> +		return;
>> +
>> +	__netif_tx_lock(txq, smp_processor_id());
>> +	free_old_xmit_skbs(sq);
>> +	__netif_tx_unlock(txq);
>> +
>> +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
>> +		netif_wake_subqueue(vi->dev, vq2txq(sq->vq));
>> +}
>> +
>
> Looks very similar to virtnet_poll_tx.
>
> I think this might be waking the tx queue too early, so
> it will tend to stay almost full for long periods of time.
> Why not defer wakeup until queue is at least half empty?

I'll test that. Delaying wake-up longer than necessary can cause queue
build-up at the qdisc and higher tail latency, I imagine. But it may
reduce the number of __netif_schedule calls.

> I wonder whether it's worth it to handle very short queues
> correctly - they previously made very slow progress,
> now they are never woken up.
>
> I'm a bit concerned about the cost of these wakeups
> and locking. I note that this wake is called basically
> every time queue is not full.
>
> Would it make sense to limit the amount of tx polling?
> Maybe use trylock to reduce the conflict with xmit?

Yes, that sounds good. I did test that previously and saw no difference
then. But when multiple cpus contend for a single txq it should help.
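For illustration only, a sketch of how the trylock suggestion could be folded
into virtnet_poll_cleantx(). This is not part of the posted patch, and
skipping the cleanup when the lock is contended is exactly the trade-off
being discussed:

	static void virtnet_poll_cleantx(struct receive_queue *rq)
	{
		struct virtnet_info *vi = rq->vq->vdev->priv;
		unsigned int index = vq2rxq(rq->vq);
		struct send_queue *sq = &vi->sq[index];
		struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, index);

		if (!sq->napi.weight)
			return;

		/* Skip cleaning if the xmit path already holds the lock;
		 * it will reclaim descriptors itself on its next transmit.
		 */
		if (__netif_tx_trylock(txq)) {
			free_old_xmit_skbs(sq);
			__netif_tx_unlock(txq);
		}

		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
			netif_wake_subqueue(vi->dev, vq2txq(sq->vq));
	}

The idea is that the rx napi handler never spins on a txq lock owned by a
concurrent transmitter, which should reduce the cross-cpu contention Michael
raises above.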