Leon Romanovsky
2022-Mar-03 19:14 UTC
[PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress
On Thu, Mar 03, 2022 at 03:19:29PM +0000, Lee Jones wrote:
> All workers/users should be halted before any clean-up should take place.
>
> Suggested-by: Michael S. Tsirkin <mst at redhat.com>
> Signed-off-by: Lee Jones <lee.jones at linaro.org>
> ---
>  drivers/vhost/vhost.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index bbaff6a5e21b8..d935d2506963f 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
>  	int i;
>
>  	for (i = 0; i < dev->nvqs; ++i) {
> +		/* Ideally all workers should be stopped prior to clean-up */
> +		WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> +
>  		mutex_lock(&dev->vqs[i]->mutex);

I know nothing about vhost, but this construction and patch look
strange to me.

If all workers were stopped, you won't need mutex_lock(). The mutex_lock
here suggests to me that workers can still run here.

Thanks

>  		if (dev->vqs[i]->error_ctx)
>  			eventfd_ctx_put(dev->vqs[i]->error_ctx);
> --
> 2.35.1.574.g5d30c73bfb-goog
>
Lee Jones
2022-Mar-03 19:38 UTC
[PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress
On Thu, 03 Mar 2022, Leon Romanovsky wrote:

> On Thu, Mar 03, 2022 at 03:19:29PM +0000, Lee Jones wrote:
> > All workers/users should be halted before any clean-up should take place.
> >
> > Suggested-by: Michael S. Tsirkin <mst at redhat.com>
> > Signed-off-by: Lee Jones <lee.jones at linaro.org>
> > ---
> >  drivers/vhost/vhost.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index bbaff6a5e21b8..d935d2506963f 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> >  	int i;
> >
> >  	for (i = 0; i < dev->nvqs; ++i) {
> > +		/* Ideally all workers should be stopped prior to clean-up */
> > +		WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> > +
> >  		mutex_lock(&dev->vqs[i]->mutex);
>
> I know nothing about vhost, but this construction and patch look
> strange to me.
>
> If all workers were stopped, you won't need mutex_lock(). The mutex_lock
> here suggests to me that workers can still run here.

The suggestion for this patch came from the maintainer.

Please see the conversation here:

https://lore.kernel.org/all/20220302082021-mutt-send-email-mst at kernel.org/

-- 
Lee Jones
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog
Michael S. Tsirkin
2022-Mar-03 21:01 UTC
[PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress
On Thu, Mar 03, 2022 at 09:14:36PM +0200, Leon Romanovsky wrote:
> On Thu, Mar 03, 2022 at 03:19:29PM +0000, Lee Jones wrote:
> > All workers/users should be halted before any clean-up should take place.
> >
> > Suggested-by: Michael S. Tsirkin <mst at redhat.com>
> > Signed-off-by: Lee Jones <lee.jones at linaro.org>
> > ---
> >  drivers/vhost/vhost.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index bbaff6a5e21b8..d935d2506963f 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> >  	int i;
> >
> >  	for (i = 0; i < dev->nvqs; ++i) {
> > +		/* Ideally all workers should be stopped prior to clean-up */
> > +		WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> > +
> >  		mutex_lock(&dev->vqs[i]->mutex);
>
> I know nothing about vhost, but this construction and patch look
> strange to me.
>
> If all workers were stopped, you won't need mutex_lock(). The mutex_lock
> here suggests to me that workers can still run here.
>
> Thanks

"Ideally" here is misleading, we need a bigger detailed comment along
the lines of:

/*
 * By design, no workers can run here. But if there's a bug and the
 * driver did not flush all work properly then they might, and we
 * encountered such bugs in the past. With no proper flush guest won't
 * work correctly but avoiding host memory corruption in this case
 * sounds like a good idea.
 */

> >  		if (dev->vqs[i]->error_ctx)
> >  			eventfd_ctx_put(dev->vqs[i]->error_ctx);
> > --
> > 2.35.1.574.g5d30c73bfb-goog
> >
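For illustration only, here is a sketch of how the cleanup loop in
vhost_dev_cleanup() might read with the expanded comment folded into the
hunk above. This is not the code as posted or merged, just the proposed
WARN_ON combined with the comment wording suggested in this reply; the
error_ctx lines and the trailing "..." stand for the rest of the existing
loop body.

	for (i = 0; i < dev->nvqs; ++i) {
		/*
		 * By design, no workers can run here. But if there's a bug
		 * and the driver did not flush all work properly then they
		 * might, and we encountered such bugs in the past. With no
		 * proper flush guest won't work correctly but avoiding host
		 * memory corruption in this case sounds like a good idea.
		 */
		WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));

		mutex_lock(&dev->vqs[i]->mutex);
		if (dev->vqs[i]->error_ctx)
			eventfd_ctx_put(dev->vqs[i]->error_ctx);
		...
	}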