commit 0b725a2ca61bedc33a2a63d0451d528b268cf975 net: Remove ndo_xmit_flush netdev operation, use signalling instead. added code that looks at skb->xmit_more after the skb has been put in TX VQ. Since some paths process the ring and free the skb immediately, this can cause use after free. Fix by storing xmit_more in a local variable. Cc: David S. Miller <davem at davemloft.net> Signed-off-by: Michael S. Tsirkin <mst at redhat.com> --- David, am I using the API correctly? Seems to work for me. You used __netif_subqueue_stopped but that seems to use a slightly more expensive test_bit internally. The reason I added a variable for the txq here is because it's handy for BQL patch later on. drivers/net/virtio_net.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 3d0ce44..13d0a8b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -920,6 +920,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) int qnum = skb_get_queue_mapping(skb); struct send_queue *sq = &vi->sq[qnum]; int err; + struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); + bool kick = !skb->xmit_more; /* Free up any pending old buffers before queueing new ones. */ free_old_xmit_skbs(sq); @@ -956,7 +958,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) } } - if (__netif_subqueue_stopped(dev, qnum) || !skb->xmit_more) + if (kick || netif_xmit_stopped(txq)) virtqueue_kick(sq->vq); return NETDEV_TX_OK; -- MST
From: Michael S. Tsirkin> commit 0b725a2ca61bedc33a2a63d0451d528b268cf975 > net: Remove ndo_xmit_flush netdev operation, use signalling instead. > > added code that looks at skb->xmit_more after the skb has > been put in TX VQ. Since some paths process the ring and free the skb > immediately, this can cause use after free. > > Fix by storing xmit_more in a local variable. > > Cc: David S. Miller <davem at davemloft.net> > Signed-off-by: Michael S. Tsirkin <mst at redhat.com> > --- > > David, am I using the API correctly? > Seems to work for me. > You used __netif_subqueue_stopped but that seems to use > a slightly more expensive test_bit internally. > The reason I added a variable for the txq here is because it's handy for > BQL patch later on. > > > drivers/net/virtio_net.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 3d0ce44..13d0a8b 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -920,6 +920,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > int qnum = skb_get_queue_mapping(skb); > struct send_queue *sq = &vi->sq[qnum]; > int err; > + struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);Do you need to cache 'txq' on stack for the entire call? Looks like it is only needed when 'kick' is true. I've not looked to see if saves both 'dev' and 'qnum' being kept. In any case it isn't mentioned in the commit message. David> + bool kick = !skb->xmit_more; > > /* Free up any pending old buffers before queueing new ones. */ > free_old_xmit_skbs(sq); > @@ -956,7 +958,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > } > } > > - if (__netif_subqueue_stopped(dev, qnum) || !skb->xmit_more) > + if (kick || netif_xmit_stopped(txq)) > virtqueue_kick(sq->vq); > > return NETDEV_TX_OK; > -- > MST > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 15, 2014 at 01:24:57PM +0000, David Laight wrote:> From: Michael S. Tsirkin > > commit 0b725a2ca61bedc33a2a63d0451d528b268cf975 > > net: Remove ndo_xmit_flush netdev operation, use signalling instead. > > > > added code that looks at skb->xmit_more after the skb has > > been put in TX VQ. Since some paths process the ring and free the skb > > immediately, this can cause use after free. > > > > Fix by storing xmit_more in a local variable. > > > > Cc: David S. Miller <davem at davemloft.net> > > Signed-off-by: Michael S. Tsirkin <mst at redhat.com> > > --- > > > > David, am I using the API correctly? > > Seems to work for me. > > You used __netif_subqueue_stopped but that seems to use > > a slightly more expensive test_bit internally. > > The reason I added a variable for the txq here is because it's handy for > > BQL patch later on. > > > > > > drivers/net/virtio_net.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > index 3d0ce44..13d0a8b 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -920,6 +920,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > > int qnum = skb_get_queue_mapping(skb); > > struct send_queue *sq = &vi->sq[qnum]; > > int err; > > + struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); > > Do you need to cache 'txq' on stack for the entire call? > Looks like it is only needed when 'kick' is true. > I've not looked to see if saves both 'dev' and 'qnum' being kept. > > In any case it isn't mentioned in the commit message. > > DavidCode seems slightly neater this way, I haven't bothered to micro-optimize it to this level yet. Want to benchmark and send a patch on top?> > + bool kick = !skb->xmit_more; > > > > /* Free up any pending old buffers before queueing new ones. */ > > free_old_xmit_skbs(sq); > > @@ -956,7 +958,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > > } > > } > > > > - if (__netif_subqueue_stopped(dev, qnum) || !skb->xmit_more) > > + if (kick || netif_xmit_stopped(txq)) > > virtqueue_kick(sq->vq); > > > > return NETDEV_TX_OK; > > -- > > MST > > -- > > To unsubscribe from this list: send the line "unsubscribe netdev" in > > the body of a message to majordomo at vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html >
From: "Michael S. Tsirkin" <mst at redhat.com> Date: Wed, 15 Oct 2014 16:23:28 +0300> You used __netif_subqueue_stopped but that seems to use > a slightly more expensive test_bit internally.More expensive in what sense? It should be roughly the same as "x & y" sans the volatile. Anyways I'm ambivalent and I want to see this bug fixes, so I'll apply your patch. Thanks!
On Wed, Oct 15, 2014 at 04:47:27PM -0400, David Miller wrote:> From: "Michael S. Tsirkin" <mst at redhat.com> > Date: Wed, 15 Oct 2014 16:23:28 +0300 > > > You used __netif_subqueue_stopped but that seems to use > > a slightly more expensive test_bit internally. > > More expensive in what sense? It should be roughly the same > as "x & y" sans the volatile.I really just meant volatile - this might prevent some compiler optimizations. I have't actually checked the produced binary so I don't know for sure.> Anyways I'm ambivalent and I want to see this bug fixes, so I'll > apply your patch. > > Thanks!
On Wed, 2014-10-15 at 16:23 +0300, Michael S. Tsirkin wrote:> commit 0b725a2ca61bedc33a2a63d0451d528b268cf975 > net: Remove ndo_xmit_flush netdev operation, use signalling instead. > > added code that looks at skb->xmit_more after the skb has > been put in TX VQ. Since some paths process the ring and free the skb > immediately, this can cause use after free. > > Fix by storing xmit_more in a local variable. > > Cc: David S. Miller <davem at davemloft.net> > Signed-off-by: Michael S. Tsirkin <mst at redhat.com> > --- > > David, am I using the API correctly? > Seems to work for me. > You used __netif_subqueue_stopped but that seems to use > a slightly more expensive test_bit internally. > The reason I added a variable for the txq here is because it's handy for > BQL patch later on. > > > drivers/net/virtio_net.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 3d0ce44..13d0a8b 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -920,6 +920,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > int qnum = skb_get_queue_mapping(skb); > struct send_queue *sq = &vi->sq[qnum]; > int err; > + struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); > + bool kick = !skb->xmit_more; > > /* Free up any pending old buffers before queueing new ones. */ > free_old_xmit_skbs(sq); > @@ -956,7 +958,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > } > } > > - if (__netif_subqueue_stopped(dev, qnum) || !skb->xmit_more) > + if (kick || netif_xmit_stopped(txq)) > virtqueue_kick(sq->vq); > > return NETDEV_TX_OK;I must say I am kind of confused by this patch. Why the skb_orphan(skb) & nf_reset(skb) do not have the same issue ? It looks like following patch is needed ? diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index ec2a8b41ed41..17cc42c6a559 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -927,6 +927,10 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) /* Free up any pending old buffers before queueing new ones. */ free_old_xmit_skbs(sq); + /* Don't wait up for transmitted skbs to be freed. */ + skb_orphan(skb); + nf_reset(skb); + /* Try to transmit */ err = xmit_skb(sq, skb); @@ -941,10 +945,6 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; } - /* Don't wait up for transmitted skbs to be freed. */ - skb_orphan(skb); - nf_reset(skb); - /* Apparently nice girls don't return TX_BUSY; stop the queue * before it gets out of hand. Naturally, this wastes entries. */ if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
On 10/31/2014 11:36 AM, Eric Dumazet wrote:> On Wed, 2014-10-15 at 16:23 +0300, Michael S. Tsirkin wrote: >> commit 0b725a2ca61bedc33a2a63d0451d528b268cf975 >> net: Remove ndo_xmit_flush netdev operation, use signalling instead. >> >> added code that looks at skb->xmit_more after the skb has >> been put in TX VQ. Since some paths process the ring and free the skb >> immediately, this can cause use after free. >> >> Fix by storing xmit_more in a local variable. >> >> Cc: David S. Miller <davem at davemloft.net> >> Signed-off-by: Michael S. Tsirkin <mst at redhat.com> >> --- >> >> David, am I using the API correctly? >> Seems to work for me. >> You used __netif_subqueue_stopped but that seems to use >> a slightly more expensive test_bit internally. >> The reason I added a variable for the txq here is because it's handy for >> BQL patch later on. >> >> >> drivers/net/virtio_net.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >> index 3d0ce44..13d0a8b 100644 >> --- a/drivers/net/virtio_net.c >> +++ b/drivers/net/virtio_net.c >> @@ -920,6 +920,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) >> int qnum = skb_get_queue_mapping(skb); >> struct send_queue *sq = &vi->sq[qnum]; >> int err; >> + struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); >> + bool kick = !skb->xmit_more; >> >> /* Free up any pending old buffers before queueing new ones. */ >> free_old_xmit_skbs(sq); >> @@ -956,7 +958,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) >> } >> } >> >> - if (__netif_subqueue_stopped(dev, qnum) || !skb->xmit_more) >> + if (kick || netif_xmit_stopped(txq)) >> virtqueue_kick(sq->vq); >> >> return NETDEV_TX_OK; > I must say I am kind of confused by this patch. > > Why the skb_orphan(skb) & nf_reset(skb) do not have the same issue ? >Since they are called before the possible free_old_xmit_skbs(), skb won't get freed at this time.