Si-Wei Liu
2023-Oct-23 23:03 UTC
[PATCH v4 5/7] vhost-vdpa: clean iotlb map during reset for older userspace
(+ linux-next)

Hi Michael,

Dragos reported the oops below, for which I have a fix at hand (fully
tested), ready to be posted to linux-next. Please let me know if you want
me to respin the original patch series, or whether you think it would be
fine to fix it on top.

On 10/23/2023 11:59 AM, Dragos Tatulea wrote:
> On Sat, 2023-10-21 at 02:25 -0700, Si-Wei Liu wrote:
>> Using .compat_reset op from the previous patch, the buggy .reset
>> behaviour can be kept as-is on older userspace apps, which don't ack the
>> IOTLB_PERSIST backend feature. As this compatibility quirk is limited to
>> those drivers that used to be buggy in the past, it won't change
>> the behaviour or affect ABI on the setups with API compliant driver.
>>
>> The separation of .compat_reset from the regular .reset allows
>> vhost-vdpa to know which driver had broken behaviour before, so it
>> can apply the corresponding compatibility quirk to the individual driver
>> whenever needed. Compared to overloading the existing .reset with
>> flags, .compat_reset won't cause any extra burden to the implementation
>> of every compliant driver.
>>
>> Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com>
>> ---
>>  drivers/vhost/vdpa.c         | 17 +++++++++++++----
>>  drivers/virtio/virtio_vdpa.c |  2 +-
>>  include/linux/vdpa.h         |  7 +++++--
>>  3 files changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index acc7c74ba7d6..9ce40003793b 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -227,13 +227,22 @@ static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa *v, u16 qid)
>>         irq_bypass_unregister_producer(&vq->call_ctx.producer);
>>  }
>>
>> -static int vhost_vdpa_reset(struct vhost_vdpa *v)
>> +static int _compat_vdpa_reset(struct vhost_vdpa *v)
>>  {
>>         struct vdpa_device *vdpa = v->vdpa;
>> +       u32 flags = 0;
>>
>> -       v->in_batch = 0;
>> +       flags |= !vhost_backend_has_feature(v->vdev.vqs[0],
>> +                                           VHOST_BACKEND_F_IOTLB_PERSIST) ?
>> +                VDPA_RESET_F_CLEAN_MAP : 0;
> Hi Si-Wei,
>
> I am getting an Oops due to the vqs not being initialized here. Here's what
> it looks like:
>
> [ 37.817075] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 37.817674] #PF: supervisor read access in kernel mode
> [ 37.818150] #PF: error_code(0x0000) - not-present page
> [ 37.818615] PGD 0 P4D 0
> [ 37.818893] Oops: 0000 [#1] SMP
> [ 37.819223] CPU: 3 PID: 1727 Comm: qemu-system-x86 Not tainted 6.6.0-rc6+ #2
> [ 37.819829] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [ 37.820791] RIP: 0010:_compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
> [ 37.821316] Code: c7 c7 fb 12 56 a0 4c 8d a5 b8 02 00 00 48 89 ea e8 7e b8 c4 e0 48 8b 43 28 48 89 ee 48 c7 c7 19 13 56 a0 4c 8b ad b0 02 00 00 <48> 8b 00 49 8b 95 d8 00 00 00 48 8b 80 88 45 00 00 48 c1 e8 08 48
> [ 37.822811] RSP: 0018:ffff8881063c3c38 EFLAGS: 00010246
> [ 37.823285] RAX: 0000000000000000 RBX: ffff8881074eb800 RCX: 0000000000000000
> [ 37.823893] RDX: 0000000000000000 RSI: ffff888103ab4000 RDI: ffffffffa0561319
> [ 37.824506] RBP: ffff888103ab4000 R08: 00000000ffffdfff R09: 0000000000000001
> [ 37.825116] R10: 0000000000000003 R11: ffff88887fecbac0 R12: ffff888103ab42b8
> [ 37.825721] R13: ffff888106dbe850 R14: 0000000000000003 R15: ffff8881074ebc18
> [ 37.826326] FS: 00007f02fba6ef00(0000) GS:ffff88885f8c0000(0000) knlGS:0000000000000000
> [ 37.827035] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 37.827552] CR2: 0000000000000000 CR3: 00000001325e5003 CR4: 0000000000372ea0
> [ 37.828162] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 37.828772] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 37.829381] Call Trace:
> [ 37.829660] <TASK>
> [ 37.829911] ? __die+0x1f/0x60
> [ 37.830234] ? page_fault_oops+0x14c/0x3b0
> [ 37.830623] ? exc_page_fault+0x74/0x140
> [ 37.830999] ? asm_exc_page_fault+0x22/0x30
> [ 37.831402] ? _compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
> [ 37.831888] ? _compat_vdpa_reset+0x32/0xc0 [vhost_vdpa]
> [ 37.832366] vhost_vdpa_open+0x55/0x270 [vhost_vdpa]
> [ 37.832821] ? sb_init_dio_done_wq+0x50/0x50
> [ 37.833225] chrdev_open+0xc0/0x210
> [ 37.833582] ? __unregister_chrdev+0x50/0x50
> [ 37.833990] do_dentry_open+0x1fc/0x4f0
> [ 37.834363] path_openat+0xc2d/0xf20
> [ 37.834721] do_filp_open+0xb4/0x160
> [ 37.835082] ? kmem_cache_alloc+0x3c/0x490
> [ 37.835474] do_sys_openat2+0x8d/0xc0
> [ 37.835834] __x64_sys_openat+0x6a/0xa0
> [ 37.836208] do_syscall_64+0x3c/0x80
> [ 37.836564] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [ 37.837021] RIP: 0033:0x7f02fcc2c085
> [ 37.837378] Code: 8b 55 d0 48 89 45 b0 75 a0 44 89 55 9c e8 63 7d f8 ff 44 8b 55 9c 89 da 4c 89 e6 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 89 45 9c e8 b8 7d f8 ff 8b 45 9c
> [ 37.838891] RSP: 002b:00007ffdea3c8cc0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
> [ 37.839571] RAX: ffffffffffffffda RBX: 0000000000080002 RCX: 00007f02fcc2c085
> [ 37.840179] RDX: 0000000000080002 RSI: 000055e439b5fa40 RDI: 00000000ffffff9c
> [ 37.840785] RBP: 00007ffdea3c8d30 R08: 0000000000000000 R09: 00007ffdea3c8df8
> [ 37.841396] R10: 0000000000000000 R11: 0000000000000293 R12: 000055e439b5fa40
> [ 37.842014] R13: 0000000000000000 R14: 000055e43792fd00 R15: 0000000000000000
> [ 37.842626] </TASK>
> [ 37.842884] Modules linked in: vhost_vdpa vhost mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay zram zsmalloc fuse [last unloaded: mlx5_core]
> [ 37.845437] CR2: 0000000000000000
> [ 37.845778] ---[ end trace 0000000000000000 ]---
> [ 37.846205] RIP: 0010:_compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
> [ 37.846730] Code: c7 c7 fb 12 56 a0 4c 8d a5 b8 02 00 00 48 89 ea e8 7e b8 c4 e0 48 8b 43 28 48 89 ee 48 c7 c7 19 13 56 a0 4c 8b ad b0 02 00 00 <48> 8b 00 49 8b 95 d8 00 00 00 48 8b 80 88 45 00 00 48 c1 e8 08 48
> [ 37.848240] RSP: 0018:ffff8881063c3c38 EFLAGS: 00010246
> [ 37.848711] RAX: 0000000000000000 RBX: ffff8881074eb800 RCX: 0000000000000000
> [ 37.849319] RDX: 0000000000000000 RSI: ffff888103ab4000 RDI: ffffffffa0561319
> [ 37.849924] RBP: ffff888103ab4000 R08: 00000000ffffdfff R09: 0000000000000001
> [ 37.850531] R10: 0000000000000003 R11: ffff88887fecbac0 R12: ffff888103ab42b8
> [ 37.851136] R13: ffff888106dbe850 R14: 0000000000000003 R15: ffff8881074ebc18
> [ 37.851741] FS: 00007f02fba6ef00(0000) GS:ffff88885f8c0000(0000) knlGS:0000000000000000
> [ 37.852464] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 37.852975] CR2: 0000000000000000 CR3: 00000001325e5003 CR4: 0000000000372ea0
> [ 37.853585] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 37.854192] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 37.854797] note: qemu-system-x86[1727] exited with irqs disabled
>
> Looks like the patches are already in linux-next so I guess we'll need a fix for
> this.
>

To Dragos: thanks for the report, I will add your Reported-by to the fix.

Thanks,
-Siwei

> Thanks,
> Dragos
>
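
For readers following the trace: the faulting read has RAX at zero inside _compat_vdpa_reset, called from vhost_vdpa_open, which matches Dragos's note that the vqs array is not yet set up when vqs[0] is read. The userspace sketch below only models that failure mode and one possible guard. It is not the fix referred to above, and every name in it (sketch_dev, sketch_vq, sketch_reset_flags, the SKETCH_* constants) is a hypothetical stand-in rather than a kernel definition. Returning no flags when the array is missing is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the backend feature bit and the reset flag. */
#define SKETCH_F_IOTLB_PERSIST   (1ULL << 0)
#define SKETCH_RESET_F_CLEAN_MAP (1U << 0)

struct sketch_vq {
	uint64_t acked_backend_features; /* features acked by userspace */
};

struct sketch_dev {
	struct sketch_vq **vqs; /* NULL until the device has been initialized */
};

/*
 * Compute the flags to pass to a reset. The unguarded version would read
 * dev->vqs[0] unconditionally and fault when vqs is still NULL, which is
 * the pattern the oops shows. Here the array is checked first.
 */
uint32_t sketch_reset_flags(const struct sketch_dev *dev)
{
	if (dev->vqs == NULL)
		return 0; /* assumption: nothing acked yet, request no cleanup */

	/* Older userspace does not ack IOTLB_PERSIST, so ask for a map cleanup. */
	if (!(dev->vqs[0]->acked_backend_features & SKETCH_F_IOTLB_PERSIST))
		return SKETCH_RESET_F_CLEAN_MAP;

	return 0;
}

/* Exercise the three cases: no vqs, older userspace, newer userspace. */
void sketch_self_check(void)
{
	struct sketch_dev unopened = { .vqs = NULL };
	struct sketch_vq vq = { .acked_backend_features = 0 };
	struct sketch_vq *vqs[] = { &vq };
	struct sketch_dev dev = { .vqs = vqs };

	assert(sketch_reset_flags(&unopened) == 0);
	assert(sketch_reset_flags(&dev) == SKETCH_RESET_F_CLEAN_MAP);

	vq.acked_backend_features = SKETCH_F_IOTLB_PERSIST;
	assert(sketch_reset_flags(&dev) == 0);
}
```

Whether a reset at open time should clean the map or leave it alone is a policy decision for the real patch; the sketch only shows that the feature check needs the array to exist before it is consulted.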