Jason Wang
2022-Nov-23 04:00 UTC
[PATCH 6/7] vdpa/mlx5: Avoid using reslock in event_handler
On Mon, Nov 14, 2022 at 4:58 PM Eli Cohen <elic at nvidia.com> wrote:> > > From: Jason Wang <jasowang at redhat.com> > > Sent: Monday, 14 November 2022 9:53 > > To: Eli Cohen <elic at nvidia.com> > > Cc: mst at redhat.com; linux-kernel at vger.kernel.org; virtualization at lists.linux- > > foundation.org; si-wei.liu at oracle.com; eperezma at redhat.com; > > lulu at redhat.com > > Subject: Re: [PATCH 6/7] vdpa/mlx5: Avoid using reslock in event_handler > > > > On Sun, Nov 13, 2022 at 9:45 PM Eli Cohen <elic at nvidia.com> wrote: > > > > > > event_handler runs under atomic context and may not acquire reslock. We > > > can still guarantee that the handler won't be called after suspend by > > > clearing nb_registered, unregistering the handler and flushing the > > > workqueue. > > > > > > Signed-off-by: Eli Cohen <elic at nvidia.com> > > > --- > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 14 +++----------- > > > 1 file changed, 3 insertions(+), 11 deletions(-) > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > index 6e6490c85be2..bebfba530247 100644 > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > @@ -2872,8 +2872,8 @@ static int mlx5_vdpa_suspend(struct vdpa_device > > *vdev) > > > int i; > > > > > > down_write(&ndev->reslock); > > > - mlx5_notifier_unregister(mvdev->mdev, &ndev->nb); > > > ndev->nb_registered = false; > > > + mlx5_notifier_unregister(mvdev->mdev, &ndev->nb); > > > > I wonder why this can help anything. > I think you were concerned that async events will come when the device was suspended. Since we can't take reslock, I think this guarantees that we won't get any events after suspension. >Ok, I see.> > And if it does, we have simliar > > logic in mlx5_vdpa_dev_del() do we need to fix that as well? > > > We have the same construct there only that I set nb_registered = false after unregistering the notifier. So I probably need to move it before mlx5_notifier_unregister().Right. Thanks> > > Thanks > > > > > flush_workqueue(ndev->mvdev.wq); > > > for (i = 0; i < ndev->cur_num_vqs; i++) { > > > mvq = &ndev->vqs[i]; > > > @@ -3051,7 +3051,7 @@ static void update_carrier(struct work_struct > > *work) > > > else > > > ndev->config.status &= cpu_to_mlx5vdpa16(mvdev, > > ~VIRTIO_NET_S_LINK_UP); > > > > > > - if (ndev->config_cb.callback) > > > + if (ndev->nb_registered && ndev->config_cb.callback) > > > ndev->config_cb.callback(ndev->config_cb.private); > > > > > > kfree(wqent); > > > @@ -3068,21 +3068,13 @@ static int event_handler(struct notifier_block > > *nb, unsigned long event, void *p > > > switch (eqe->sub_type) { > > > case MLX5_PORT_CHANGE_SUBTYPE_DOWN: > > > case MLX5_PORT_CHANGE_SUBTYPE_ACTIVE: > > > - down_read(&ndev->reslock); > > > - if (!ndev->nb_registered) { > > > - up_read(&ndev->reslock); > > > - return NOTIFY_DONE; > > > - } > > > wqent = kzalloc(sizeof(*wqent), GFP_ATOMIC); > > > - if (!wqent) { > > > - up_read(&ndev->reslock); > > > + if (!wqent) > > > return NOTIFY_DONE; > > > - } > > > > > > wqent->mvdev = &ndev->mvdev; > > > INIT_WORK(&wqent->work, update_carrier); > > > queue_work(ndev->mvdev.wq, &wqent->work); > > > - up_read(&ndev->reslock); > > > ret = NOTIFY_OK; > > > break; > > > default: > > > -- > > > 2.38.1 > > > >
Jason Wang
2022-Nov-24 04:04 UTC
[PATCH 6/7] vdpa/mlx5: Avoid using reslock in event_handler
On Wed, Nov 23, 2022 at 12:00 PM Jason Wang <jasowang at redhat.com> wrote:> > On Mon, Nov 14, 2022 at 4:58 PM Eli Cohen <elic at nvidia.com> wrote: > > > > > From: Jason Wang <jasowang at redhat.com> > > > Sent: Monday, 14 November 2022 9:53 > > > To: Eli Cohen <elic at nvidia.com> > > > Cc: mst at redhat.com; linux-kernel at vger.kernel.org; virtualization at lists.linux- > > > foundation.org; si-wei.liu at oracle.com; eperezma at redhat.com; > > > lulu at redhat.com > > > Subject: Re: [PATCH 6/7] vdpa/mlx5: Avoid using reslock in event_handler > > > > > > On Sun, Nov 13, 2022 at 9:45 PM Eli Cohen <elic at nvidia.com> wrote: > > > > > > > > event_handler runs under atomic context and may not acquire reslock. We > > > > can still guarantee that the handler won't be called after suspend by > > > > clearing nb_registered, unregistering the handler and flushing the > > > > workqueue. > > > > > > > > Signed-off-by: Eli Cohen <elic at nvidia.com> > > > > --- > > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 14 +++----------- > > > > 1 file changed, 3 insertions(+), 11 deletions(-) > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > index 6e6490c85be2..bebfba530247 100644 > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > @@ -2872,8 +2872,8 @@ static int mlx5_vdpa_suspend(struct vdpa_device > > > *vdev) > > > > int i; > > > > > > > > down_write(&ndev->reslock); > > > > - mlx5_notifier_unregister(mvdev->mdev, &ndev->nb); > > > > ndev->nb_registered = false; > > > > + mlx5_notifier_unregister(mvdev->mdev, &ndev->nb); > > > > > > I wonder why this can help anything. > > I think you were concerned that async events will come when the device was suspended. Since we can't take reslock, I think this guarantees that we won't get any events after suspension. > > > > Ok, I see. > > > > And if it does, we have simliar > > > logic in mlx5_vdpa_dev_del() do we need to fix that as well? > > > > > We have the same construct there only that I set nb_registered = false after unregistering the notifier. So I probably need to move it before mlx5_notifier_unregister(). > > Right. > > ThanksSo I'm fine with this patch. Acked-by: Jason Wang <jasowang at redhat.com> Thanks> > > > > > Thanks > > > > > > > flush_workqueue(ndev->mvdev.wq); > > > > for (i = 0; i < ndev->cur_num_vqs; i++) { > > > > mvq = &ndev->vqs[i]; > > > > @@ -3051,7 +3051,7 @@ static void update_carrier(struct work_struct > > > *work) > > > > else > > > > ndev->config.status &= cpu_to_mlx5vdpa16(mvdev, > > > ~VIRTIO_NET_S_LINK_UP); > > > > > > > > - if (ndev->config_cb.callback) > > > > + if (ndev->nb_registered && ndev->config_cb.callback) > > > > ndev->config_cb.callback(ndev->config_cb.private); > > > > > > > > kfree(wqent); > > > > @@ -3068,21 +3068,13 @@ static int event_handler(struct notifier_block > > > *nb, unsigned long event, void *p > > > > switch (eqe->sub_type) { > > > > case MLX5_PORT_CHANGE_SUBTYPE_DOWN: > > > > case MLX5_PORT_CHANGE_SUBTYPE_ACTIVE: > > > > - down_read(&ndev->reslock); > > > > - if (!ndev->nb_registered) { > > > > - up_read(&ndev->reslock); > > > > - return NOTIFY_DONE; > > > > - } > > > > wqent = kzalloc(sizeof(*wqent), GFP_ATOMIC); > > > > - if (!wqent) { > > > > - up_read(&ndev->reslock); > > > > + if (!wqent) > > > > return NOTIFY_DONE; > > > > - } > > > > > > > > wqent->mvdev = &ndev->mvdev; > > > > INIT_WORK(&wqent->work, update_carrier); > > > > queue_work(ndev->mvdev.wq, &wqent->work); > > > > - up_read(&ndev->reslock); > > > > ret = NOTIFY_OK; > > > > break; > > > > default: > > > > -- > > > > 2.38.1 > > > > > >