Michael S. Tsirkin
2013-Jul-08 09:04 UTC
[PATCH 2/2] virtio_net: fix race in RX VQ processing
virtio net called virtqueue_enable_cb on RX path after napi_complete, so
with NAPI_STATE_SCHED clear - outside the implicit napi lock.
This violates the requirement to synchronize virtqueue_enable_cb wrt
virtqueue_add_buf. In particular, used event can move backwards,
causing us to lose interrupts.
In a debug build, this can trigger panic within START_USE.

Jason Wang reports that he can trigger the races artificially,
by adding udelay() in virtqueue_enable_cb() after virtio_mb().

However, we must call napi_complete to clear NAPI_STATE_SCHED before
polling the virtqueue for used buffers, otherwise napi_schedule_prep in
a callback will fail, causing us to lose RX events.

To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
set (under napi lock), later call virtqueue_poll with
NAPI_STATE_SCHED clear (outside the lock).

Reported-by: Jason Wang <jasowang at redhat.com>
Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
---
 drivers/net/virtio_net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5305bd1..fbdd79a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -622,8 +622,9 @@ again:
 
 	/* Out of packets? */
 	if (received < budget) {
+		unsigned r = virtqueue_enable_cb_prepare(rq->vq);
 		napi_complete(napi);
-		if (unlikely(!virtqueue_enable_cb(rq->vq)) &&
+		if (unlikely(virtqueue_poll(rq->vq, r)) &&
 		    napi_schedule_prep(napi)) {
 			virtqueue_disable_cb(rq->vq);
 			__napi_schedule(napi);
-- 
MST
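The ordering argued for in the patch description can be modelled in plain userspace C. What follows is a toy, single-threaded sketch, not the kernel code: every `toy_*` name is hypothetical, memory barriers and real interrupts are left out, and the comparison in `toy_poll` is an assumption about what "polling the virtqueue for used buffers" amounts to, since the real helpers are not shown in this thread. It only illustrates the split: the step that writes ring state runs while the scheduled flag is still set, and the step that runs after the flag is cleared only reads.

```c
/* Toy userspace model of the prepare/poll split. NOT the kernel code;
 * all toy_* names are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_vq {
    uint16_t used_idx;      /* advanced by the "device" when it returns a buffer */
    uint16_t last_used_idx; /* how far the driver has consumed */
    uint16_t used_event;    /* index at which the driver asks to be notified */
};

struct toy_napi {
    bool sched;             /* stands in for NAPI_STATE_SCHED */
};

/* Step 1: runs while sched is still set. Writes ring state (arms the
 * callback) and returns a snapshot of the consumed index. */
static unsigned toy_enable_cb_prepare(struct toy_vq *vq)
{
    vq->used_event = vq->last_used_idx;
    return vq->last_used_idx;
}

/* Step 2: may run after sched is cleared. Read-only, so it cannot move
 * used_event backwards behind another writer's back. */
static bool toy_poll(const struct toy_vq *vq, unsigned snapshot)
{
    return (uint16_t)snapshot != vq->used_idx;
}

static void toy_napi_complete(struct toy_napi *n)
{
    n->sched = false;
}

/* Succeeds only if polling is not already scheduled. */
static bool toy_napi_schedule_prep(struct toy_napi *n)
{
    if (n->sched)
        return false;
    n->sched = true;
    return true;
}

/* Tail of the RX poll routine in the patched order. device_adds simulates
 * buffers the device returns between prepare and poll. Returns true if
 * polling was rescheduled. */
static bool toy_rx_tail(struct toy_vq *vq, struct toy_napi *n,
                        unsigned device_adds)
{
    unsigned r = toy_enable_cb_prepare(vq);

    toy_napi_complete(n);
    vq->used_idx = (uint16_t)(vq->used_idx + device_adds);
    if (toy_poll(vq, r) && toy_napi_schedule_prep(n))
        return true;
    return false;
}
```

Calling `toy_rx_tail` with `device_adds` set to 0 leaves polling idle; with 1, the late buffer is noticed by `toy_poll` and polling is rescheduled, because the flag was already cleared by the time `toy_napi_schedule_prep` ran.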
Sergei Shtylyov
2013-Jul-08 12:52 UTC
[PATCH 2/2] virtio_net: fix race in RX VQ processing
Hello.

On 08-07-2013 13:04, Michael S. Tsirkin wrote:

> virtio net called virtqueue_enable_cb on RX path after napi_complete, so
> with NAPI_STATE_SCHED clear - outside the implicit napi lock.
> This violates the requirement to synchronize virtqueue_enable_cb wrt
> virtqueue_add_buf. In particular, used event can move backwards,
> causing us to lose interrupts.
> In a debug build, this can trigger panic within START_USE.

> Jason Wang reports that he can trigger the races artificially,
> by adding udelay() in virtqueue_enable_cb() after virtio_mb().

> However, we must call napi_complete to clear NAPI_STATE_SCHED before
> polling the virtqueue for used buffers, otherwise napi_schedule_prep in
> a callback will fail, causing us to lose RX events.

> To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
> set (under napi lock), later call virtqueue_poll with
> NAPI_STATE_SCHED clear (outside the lock).

> Reported-by: Jason Wang <jasowang at redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> ---
>  drivers/net/virtio_net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 5305bd1..fbdd79a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -622,8 +622,9 @@ again:
>
>  	/* Out of packets? */
>  	if (received < budget) {
> +		unsigned r = virtqueue_enable_cb_prepare(rq->vq);

An empty line wouldn't hurt here, after the declaration.

WBR, Sergei
Michael S. Tsirkin
2013-Jul-08 13:08 UTC
[PATCH 2/2] virtio_net: fix race in RX VQ processing
On Mon, Jul 08, 2013 at 04:52:26PM +0400, Sergei Shtylyov wrote:
> Hello.
>
> On 08-07-2013 13:04, Michael S. Tsirkin wrote:
>
> >virtio net called virtqueue_enable_cb on RX path after napi_complete, so
> >with NAPI_STATE_SCHED clear - outside the implicit napi lock.
> >This violates the requirement to synchronize virtqueue_enable_cb wrt
> >virtqueue_add_buf. In particular, used event can move backwards,
> >causing us to lose interrupts.
> >In a debug build, this can trigger panic within START_USE.
>
> >Jason Wang reports that he can trigger the races artificially,
> >by adding udelay() in virtqueue_enable_cb() after virtio_mb().
>
> >However, we must call napi_complete to clear NAPI_STATE_SCHED before
> >polling the virtqueue for used buffers, otherwise napi_schedule_prep in
> >a callback will fail, causing us to lose RX events.
>
> >To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
> >set (under napi lock), later call virtqueue_poll with
> >NAPI_STATE_SCHED clear (outside the lock).
>
> >Reported-by: Jason Wang <jasowang at redhat.com>
> >Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> >---
> > drivers/net/virtio_net.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
>
> >diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >index 5305bd1..fbdd79a 100644
> >--- a/drivers/net/virtio_net.c
> >+++ b/drivers/net/virtio_net.c
> >@@ -622,8 +622,9 @@ again:
> >
> > 	/* Out of packets? */
> > 	if (received < budget) {
> >+		unsigned r = virtqueue_enable_cb_prepare(rq->vq);
>
> Empty line wouldn't hurt here, after declaration.
>
> WBR, Sergei

I don't like an empty line here - it breaks _prepare away from _poll,
which is in the same logical code block.
Is there some rule that says we must have empty lines after
declarations?
If yes, I'd rather split initialization away from declaration,
though that's more verbose than it needs to be.

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index fbdd79a..edcffc6 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -622,7 +622,9 @@ again:
 
 	/* Out of packets? */
 	if (received < budget) {
-		unsigned r = virtqueue_enable_cb_prepare(rq->vq);
+		unsigned r;
+
+		r = virtqueue_enable_cb_prepare(rq->vq);
 		napi_complete(napi);
 		if (unlikely(virtqueue_poll(rq->vq, r)) &&
 		    napi_schedule_prep(napi)) {
On 07/08/2013 05:04 PM, Michael S. Tsirkin wrote:
> virtio net called virtqueue_enable_cb on RX path after napi_complete, so
> with NAPI_STATE_SCHED clear - outside the implicit napi lock.
> This violates the requirement to synchronize virtqueue_enable_cb wrt
> virtqueue_add_buf. In particular, used event can move backwards,
> causing us to lose interrupts.
> In a debug build, this can trigger panic within START_USE.
>
> Jason Wang reports that he can trigger the races artificially,
> by adding udelay() in virtqueue_enable_cb() after virtio_mb().
>
> However, we must call napi_complete to clear NAPI_STATE_SCHED before
> polling the virtqueue for used buffers, otherwise napi_schedule_prep in
> a callback will fail, causing us to lose RX events.
>
> To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
> set (under napi lock), later call virtqueue_poll with
> NAPI_STATE_SCHED clear (outside the lock).
>
> Reported-by: Jason Wang <jasowang at redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> ---

Tested-by: Jason Wang <jasowang at redhat.com>
Acked-by: Jason Wang <jasowang at redhat.com>
>  drivers/net/virtio_net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 5305bd1..fbdd79a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -622,8 +622,9 @@ again:
>
>  	/* Out of packets? */
>  	if (received < budget) {
> +		unsigned r = virtqueue_enable_cb_prepare(rq->vq);
>  		napi_complete(napi);
> -		if (unlikely(!virtqueue_enable_cb(rq->vq)) &&
> +		if (unlikely(virtqueue_poll(rq->vq, r)) &&
>  		    napi_schedule_prep(napi)) {
>  			virtqueue_disable_cb(rq->vq);
>  			__napi_schedule(napi);
On Tue, Jul 09, 2013 at 11:28:34AM +0800, Jason Wang wrote:
> On 07/08/2013 05:04 PM, Michael S. Tsirkin wrote:
> > virtio net called virtqueue_enable_cb on RX path after napi_complete, so
> > with NAPI_STATE_SCHED clear - outside the implicit napi lock.
> > This violates the requirement to synchronize virtqueue_enable_cb wrt
> > virtqueue_add_buf. In particular, used event can move backwards,
> > causing us to lose interrupts.
> > In a debug build, this can trigger panic within START_USE.
> >
> > Jason Wang reports that he can trigger the races artificially,
> > by adding udelay() in virtqueue_enable_cb() after virtio_mb().
> >
> > However, we must call napi_complete to clear NAPI_STATE_SCHED before
> > polling the virtqueue for used buffers, otherwise napi_schedule_prep in
> > a callback will fail, causing us to lose RX events.
> >
> > To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
> > set (under napi lock), later call virtqueue_poll with
> > NAPI_STATE_SCHED clear (outside the lock).
> >
> > Reported-by: Jason Wang <jasowang at redhat.com>
> > Signed-off-by: Michael S. Tsirkin <mst at redhat.com>

Acked-by: Asias He <asias at redhat.com>

> > ---
>
> Tested-by: Jason Wang <jasowang at redhat.com>
> Acked-by: Jason Wang <jasowang at redhat.com>
> >  drivers/net/virtio_net.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 5305bd1..fbdd79a 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -622,8 +622,9 @@ again:
> >
> >  	/* Out of packets? */
> >  	if (received < budget) {
> > +		unsigned r = virtqueue_enable_cb_prepare(rq->vq);
> >  		napi_complete(napi);
> > -		if (unlikely(!virtqueue_enable_cb(rq->vq)) &&
> > +		if (unlikely(virtqueue_poll(rq->vq, r)) &&
> >  		    napi_schedule_prep(napi)) {
> >  			virtqueue_disable_cb(rq->vq);
> >  			__napi_schedule(napi);
>
> _______________________________________________
> Virtualization mailing list
> Virtualization at lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization

-- 
Asias