Jason Wang
2023-Mar-22 02:18 UTC
[PATCH] vdpa/mlx5: Remove debugfs file after device unregister
On Tue, Mar 21, 2023 at 4:29?PM Eli Cohen <elic at nvidia.com> wrote:> > > -----Original Message----- > > From: Jason Wang <jasowang at redhat.com> > > Sent: Tuesday, 21 March 2023 5:23 > > To: Eli Cohen <elic at nvidia.com> > > Cc: mst at redhat.com; si-wei.liu at oracle.com; eperezma at redhat.com; > > virtualization at lists.linux-foundation.org; Parav Pandit > > <parav at mellanox.com> > > Subject: Re: [PATCH] vdpa/mlx5: Remove debugfs file after device unregister > > > > On Mon, Mar 20, 2023 at 5:35?PM Eli Cohen <elic at nvidia.com> wrote: > > > > > > > > > On 20/03/2023 11:28, Jason Wang wrote: > > > > On Mon, Mar 20, 2023 at 5:07?PM Eli Cohen <elic at nvidia.com> wrote: > > > >> On 17/03/2023 5:32, Jason Wang wrote: > > > >>> On Sun, Mar 12, 2023 at 4:41?PM Eli Cohen <elic at nvidia.com> wrote: > > > >>>> When deleting the vdpa device, the debugfs files need to be removed > > so > > > >>>> need to remove debugfs after the device has been unregistered. > > > >>>> > > > >>>> This fixes null pointer dereference when someone deletes the device > > > >>>> after debugfs has been populated. > > > >>>> > > > >>>> Fixes: 294221004322 ("vdpa/mlx5: Add debugfs subtree") > > > >>>> Signed-off-by: Eli Cohen <elic at nvidia.com> > > > >>>> --- > > > >>>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 4 ++-- > > > >>>> 1 file changed, 2 insertions(+), 2 deletions(-) > > > >>>> > > > >>>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > >>>> index 3858ba1e8975..3f6149f2ffd4 100644 > > > >>>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > >>>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > >>>> @@ -3322,8 +3322,6 @@ static void mlx5_vdpa_dev_del(struct > > vdpa_mgmt_dev *v_mdev, struct vdpa_device * > > > >>>> struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); > > > >>>> struct workqueue_struct *wq; > > > >>>> > > > >>>> - mlx5_vdpa_remove_debugfs(ndev->debugfs); > > > >>>> - ndev->debugfs = NULL; > > > >>>> if (ndev->nb_registered) { > > > >>>> ndev->nb_registered = false; > > > >>>> mlx5_notifier_unregister(mvdev->mdev, &ndev->nb); > > > >>>> @@ -3332,6 +3330,8 @@ static void mlx5_vdpa_dev_del(struct > > vdpa_mgmt_dev *v_mdev, struct vdpa_device * > > > >>>> mvdev->wq = NULL; > > > >>>> destroy_workqueue(wq); > > > >>>> _vdpa_unregister_device(dev); > > > >>> What if the user tries to access debugfs after > > _vdpa_unregister_device()? > > > >> I don't see an issue with that. During the process of deleting a device, > > > >> the resources are destroyed and their corresponding debugfs files are > > > >> also destroyed just prior to destroying the resource. > > > > For example, rx_flow_table_show() requires the structure mlx5_vdpa_net > > > > alive, but it seems the structure has been freed after > > > > _vdpa_unregister_device(). > > > But in this case, rx_flow_table_show() gets called first, so the file > > > will be removed from the debugfs before > > > > > > mlx5_vdpa_net gets released. > > > > Just to make sure we are at the same page. I meant: > > > > [CPU 0] _vdpa_unregister_device(dev); > > If the vdpa is still in use, e.g a VM is using it, this call will hang.Ture since it will wait for the VM to release the fd. But this is not the case when the device is bound to the virtio-vdpa driver. In this case, do we need to synchronize here?> > For this call to conclude, the device needs to be reset. During the reset process, all the resources are destroyed, including (for example) the TIR. Just before destroying the TIR, the debugfs entry is removed by calling mlx5_vdpa_remove_rx_flow_table(). So at the time _vdpa_unregister_device() is completed, the debugfs file is not there anymore.Ok, so in this case we probably don't need a second call to mlx5_vdpa_remove_debugfs()? Thanks> > > [CPU 1] rx_flow_table_show() > > [CPU 0] mlx5_vdpa_remove_debugfs() > > > > So when CPU 1 tries to access the flow table, the ndev has been > > released by CPU0 in _vdpa_unregister_device()? > > > > Thanks > > > > > > > > > > > > > If a user tries to access that file after _vdpa_unregister_device() > > > > but before mlx5_vdpa_remove_debugfs(), will we end up with > > > > use-after-free? > > > > > > > > Thanks > > > > > > > > > > > >>> Thanks > > > >>> > > > >>>> + mlx5_vdpa_remove_debugfs(ndev->debugfs); > > > >>>> + ndev->debugfs = NULL; > > > >>>> mgtdev->ndev = NULL; > > > >>>> } > > > >>>> > > > >>>> -- > > > >>>> 2.38.1 > > > >>>> > > > >