Michael S. Tsirkin
2022-Dec-13 06:38 UTC
[PATCH net] virtio-net: correctly enable callback during start_xmit
On Tue, Dec 13, 2022 at 11:43:36AM +0800, Jason Wang wrote:
> On Tue, Dec 13, 2022 at 11:38 AM Xuan Zhuo <xuanzhuo at linux.alibaba.com> wrote:
> >
> > On Mon, 12 Dec 2022 04:25:22 -0500, "Michael S. Tsirkin" <mst at redhat.com> wrote:
> > > On Mon, Dec 12, 2022 at 05:10:29PM +0800, Jason Wang wrote:
> > > > Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> > > > virtqueue callback via the following statement:
> > > >
> > > >         do {
> > > >                 ......
> > > >         } while (use_napi && kick &&
> > > >                unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > >
> > > > This will cause a missing call to virtqueue_enable_cb_delayed() when
> > > > kick is false. Fixing this by removing the checking of the kick from
> > > > the condition to make sure callback is enabled correctly.
> > > >
> > > > Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> > > > Signed-off-by: Jason Wang <jasowang at redhat.com>
> > > > ---
> > > > The patch is needed for -stable.
> > >
> > > stable rules don't allow for theoretical fixes. Was a problem observed?
>
> Yes, running a pktgen sample script can lead to a tx timeout.

Since April 2021 and we only noticed now? Are you sure it's the
right Fixes tag?

> > > > ---
> > > >  drivers/net/virtio_net.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index 86e52454b5b5..44d7daf0267b 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -1834,8 +1834,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > >
> > > >                 free_old_xmit_skbs(sq, false);
> > > >
> > > > -       } while (use_napi && kick &&
> > > > -              unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > > +       } while (use_napi &&
> > > > +               unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > >
> > >
> > > A bit more explanation pls. kick simply means !netdev_xmit_more -
> > > if it's false we know there will be another packet, then transmitting
> > > that packet will invoke virtqueue_enable_cb_delayed. No?
> >
> > It's just that there may be a next packet, but in fact there may not be.
> > For example, the vq is full, and the driver stops the queue.
>
> Exactly, when the queue is about to be full we disable tx and wait for
> the next tx interrupt to re-enable tx.
>
> Thanks

OK, it's a good idea to document that.
And we should enable callbacks at that point, not here on data path.

> > Thanks.
> >
> > > >
> > > >         /* timestamp packet in software */
> > > >         skb_tx_timestamp(skb);
> > > > --
> > > > 2.25.1
> > >
> > > _______________________________________________
> > > Virtualization mailing list
> > > Virtualization at lists.linux-foundation.org
> > > https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Jason Wang
2022-Dec-13 06:57 UTC
[PATCH net] virtio-net: correctly enable callback during start_xmit
On Tue, Dec 13, 2022 at 2:38 PM Michael S. Tsirkin <mst at redhat.com> wrote:
>
> On Tue, Dec 13, 2022 at 11:43:36AM +0800, Jason Wang wrote:
> > On Tue, Dec 13, 2022 at 11:38 AM Xuan Zhuo <xuanzhuo at linux.alibaba.com> wrote:
> > >
> > > On Mon, 12 Dec 2022 04:25:22 -0500, "Michael S. Tsirkin" <mst at redhat.com> wrote:
> > > > On Mon, Dec 12, 2022 at 05:10:29PM +0800, Jason Wang wrote:
> > > > > Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> > > > > virtqueue callback via the following statement:
> > > > >
> > > > >         do {
> > > > >                 ......
> > > > >         } while (use_napi && kick &&
> > > > >                unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > > >
> > > > > This will cause a missing call to virtqueue_enable_cb_delayed() when
> > > > > kick is false. Fixing this by removing the checking of the kick from
> > > > > the condition to make sure callback is enabled correctly.
> > > > >
> > > > > Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> > > > > Signed-off-by: Jason Wang <jasowang at redhat.com>
> > > > > ---
> > > > > The patch is needed for -stable.
> > > >
> > > > stable rules don't allow for theoretical fixes. Was a problem observed?
> >
> > Yes, running a pktgen sample script can lead to a tx timeout.
>
> Since April 2021 and we only noticed now? Are you sure it's the
> right Fixes tag?

Well, reverting a7766ef18b33 makes pktgen work again.

The reason we didn't notice is probably because:

1) We don't support BQL, so no bulk dequeuing (skb list) in normal traffic
2) When burst is enabled for pktgen, it can do bulk xmit via skb list
   on its own

> > > > > ---
> > > > >  drivers/net/virtio_net.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index 86e52454b5b5..44d7daf0267b 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -1834,8 +1834,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > > >
> > > > >                 free_old_xmit_skbs(sq, false);
> > > > >
> > > > > -       } while (use_napi && kick &&
> > > > > -              unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > > > +       } while (use_napi &&
> > > > > +               unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > > >
> > > >
> > > > A bit more explanation pls. kick simply means !netdev_xmit_more -
> > > > if it's false we know there will be another packet, then transmitting
> > > > that packet will invoke virtqueue_enable_cb_delayed. No?
> > >
> > > It's just that there may be a next packet, but in fact there may not be.
> > > For example, the vq is full, and the driver stops the queue.
> >
> > Exactly, when the queue is about to be full we disable tx and wait for
> > the next tx interrupt to re-enable tx.
> >
> > Thanks
>
> OK, it's a good idea to document that.

Will do.

> And we should enable callbacks at that point, not here on data path.

I'm not sure I understand here. Are you suggesting removing the
!use_napi check here?

        if (!use_napi &&
            unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
                /* More just got used, free them then recheck. */
                free_old_xmit_skbs(sq, false);
                if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
                        netif_start_subqueue(dev, qnum);
                        virtqueue_disable_cb(sq->vq);
                }
        }

Btw, it doesn't differ too much, as kick is always true without pktgen,
and that may need even more comments or make the code even harder
to read.

We need a patch for -stable at least, so I prefer to let this patch go
first and do optimization on top.

Thanks

> > > Thanks.
> > >
> > > > >
> > > > >         /* timestamp packet in software */
> > > > >         skb_tx_timestamp(skb);
> > > > > --
> > > > > 2.25.1
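
The stall discussed in this thread can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code: `struct toy_sq`, `toy_start_xmit()`, `toy_device_complete()`, `toy_queue_gets_stuck()`, the ring size of 4 and the stop threshold of 2 (standing in for `2+MAX_SKB_FRAGS`) are all hypothetical names and values chosen for illustration. The model assumes the NAPI path (`use_napi` true), assumes `virtqueue_enable_cb_delayed()` always succeeds on the first try, and treats "callback enabled" as the only way a stopped queue can be woken.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not a kernel API. */
#define TOY_RING_SIZE      4
#define TOY_STOP_THRESHOLD 2   /* stands in for 2+MAX_SKB_FRAGS */

struct toy_sq {
    int  num_free;    /* free descriptors in the toy ring */
    bool cb_enabled;  /* is the tx completion callback armed? */
    bool stopped;     /* has the tx queue been stopped? */
};

/*
 * Models one start_xmit() call on the NAPI path.
 * check_kick == true  models the loop condition before the patch
 *                     (use_napi && kick && !enable_cb_delayed).
 * check_kick == false models the condition after the patch.
 */
static void toy_start_xmit(struct toy_sq *sq, bool xmit_more, bool check_kick)
{
    bool kick = !xmit_more;

    sq->cb_enabled = false;        /* models virtqueue_disable_cb() */
    if (!check_kick || kick)
        sq->cb_enabled = true;     /* models virtqueue_enable_cb_delayed() */

    assert(sq->num_free > 0);
    sq->num_free--;                /* models queueing one packet */

    /* Ring nearly full: stop the queue and rely on the tx callback. */
    if (sq->num_free < TOY_STOP_THRESHOLD)
        sq->stopped = true;
}

/* The device finishes all buffers; it only wakes the queue via the callback. */
static void toy_device_complete(struct toy_sq *sq)
{
    if (!sq->cb_enabled)
        return;                    /* no callback, so nobody restarts tx */
    sq->num_free = TOY_RING_SIZE;
    sq->stopped = false;
}

/* Returns true if the queue is still stopped after the device completes. */
bool toy_queue_gets_stuck(bool xmit_more, bool check_kick)
{
    struct toy_sq sq = { TOY_RING_SIZE, false, false };

    /* Keep sending until the queue stops, as a bulk sender would. */
    while (!sq.stopped)
        toy_start_xmit(&sq, xmit_more, check_kick);

    toy_device_complete(&sq);
    return sq.stopped;
}
```

In this model, `toy_queue_gets_stuck(true, true)` reports a stuck queue: every packet in the burst claims that more will follow, so the callback is never re-armed before the queue stops. With the kick check removed (`toy_queue_gets_stuck(true, false)`) or with ordinary traffic where `xmit_more` is false (`toy_queue_gets_stuck(false, true)`), the queue is woken again.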