Jason Wang
2022-Jun-30 06:07 UTC
[PATCH V2] virtio-net: fix the race between refill work and close
On Thu, Jun 30, 2022 at 10:51 AM Jakub Kicinski <kuba at kernel.org> wrote:> > On Thu, 30 Jun 2022 10:08:04 +0800 Jason Wang wrote: > > +static void enable_refill_work(struct virtnet_info *vi) > > +{ > > + spin_lock(&vi->refill_lock); > > + vi->refill_work_enabled = true; > > + spin_unlock(&vi->refill_lock); > > +} > > + > > +static void disable_refill_work(struct virtnet_info *vi) > > +{ > > + spin_lock(&vi->refill_lock); > > + vi->refill_work_enabled = false; > > + spin_unlock(&vi->refill_lock); > > +} > > + > > static void virtqueue_napi_schedule(struct napi_struct *napi, > > struct virtqueue *vq) > > { > > @@ -1527,8 +1547,12 @@ static int virtnet_receive(struct receive_queue *rq, int budget, > > } > > > > if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) { > > - if (!try_fill_recv(vi, rq, GFP_ATOMIC)) > > - schedule_delayed_work(&vi->refill, 0); > > + if (!try_fill_recv(vi, rq, GFP_ATOMIC)) { > > + spin_lock(&vi->refill_lock); > > + if (vi->refill_work_enabled) > > + schedule_delayed_work(&vi->refill, 0); > > + spin_unlock(&vi->refill_lock); > > Are you sure you can use the basic spin_lock() flavor in all cases? > Isn't the disable/enable called from a different context than this > thing here?This function will only be called in bh so it's safe.> > The entire delayed work construct seems a little risky because the work > may go to sleep after disabling napi, causing large latency spikes.Yes, but it only happens on OOM.> I guess you must have a good reason no to simply reschedule the NAPI > and keep retrying with GFP_ATOMIC...Less pressure on the memory allocator on OOM probably, but it looks like an independent issue that might be optimized in the future.> > Please add the target tree name to the subject.Ok Thanks>
Jason Wang
2022-Jun-30 06:15 UTC
[PATCH V2] virtio-net: fix the race between refill work and close
On Thu, Jun 30, 2022 at 2:07 PM Jason Wang <jasowang at redhat.com> wrote:> > On Thu, Jun 30, 2022 at 10:51 AM Jakub Kicinski <kuba at kernel.org> wrote: > > > > On Thu, 30 Jun 2022 10:08:04 +0800 Jason Wang wrote: > > > +static void enable_refill_work(struct virtnet_info *vi) > > > +{ > > > + spin_lock(&vi->refill_lock); > > > + vi->refill_work_enabled = true; > > > + spin_unlock(&vi->refill_lock); > > > +} > > > + > > > +static void disable_refill_work(struct virtnet_info *vi) > > > +{ > > > + spin_lock(&vi->refill_lock); > > > + vi->refill_work_enabled = false; > > > + spin_unlock(&vi->refill_lock); > > > +} > > > + > > > static void virtqueue_napi_schedule(struct napi_struct *napi, > > > struct virtqueue *vq) > > > { > > > @@ -1527,8 +1547,12 @@ static int virtnet_receive(struct receive_queue *rq, int budget, > > > } > > > > > > if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) { > > > - if (!try_fill_recv(vi, rq, GFP_ATOMIC)) > > > - schedule_delayed_work(&vi->refill, 0); > > > + if (!try_fill_recv(vi, rq, GFP_ATOMIC)) { > > > + spin_lock(&vi->refill_lock); > > > + if (vi->refill_work_enabled) > > > + schedule_delayed_work(&vi->refill, 0); > > > + spin_unlock(&vi->refill_lock); > > > > Are you sure you can use the basic spin_lock() flavor in all cases? > > Isn't the disable/enable called from a different context than this > > thing here? > > This function will only be called in bh so it's safe.Ok, so it looks like we should use the bh variant in close. Otherwise we may have a deadlock. Will fix it. Thanks> > > > > The entire delayed work construct seems a little risky because the work > > may go to sleep after disabling napi, causing large latency spikes. > > Yes, but it only happens on OOM. > > > I guess you must have a good reason no to simply reschedule the NAPI > > and keep retrying with GFP_ATOMIC... > > Less pressure on the memory allocator on OOM probably, but it looks > like an independent issue that might be optimized in the future. > > > > > Please add the target tree name to the subject. > > Ok > > Thanks > > >