Si-Wei Liu
2023-Oct-25 23:31 UTC
[PATCH v4 0/7] vdpa: decouple reset of iotlb mapping from device reset
Hi Yang Lei, Thanks for testing my patches and reporting! As for the issue, could you please try what I posted in: https://lore.kernel.org/virtualization/1698275594-19204-1-git-send-email-si-wei.liu at oracle.com/ and let me know how it goes? Thank you very much! Thanks, -Siwei On 10/25/2023 2:41 AM, Lei Yang wrote:> On Wed, Oct 25, 2023 at 1:27?AM Si-Wei Liu <si-wei.liu at oracle.com> wrote: > Hello Si-Wei >> Thanks a lot for testing! Please be aware that there's a follow-up fix >> for a potential oops in this v4 series: >> > The first, when I did not apply this patch [1], I will also hit this > patch mentioned problem. After I applied this patch, this problem will > no longer to hit again. But I hit another issues, about the error > messages please review the attached file. > [1] https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu at oracle.com/ > > My test steps: > git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > cd linux/ > b4 am 1697880319-4937-1-git-send-email-si-wei.liu at oracle.com > b4 am 20231018171456.1624030-2-dtatulea at nvidia.com > b4 am 1698102863-21122-1-git-send-email-si-wei.liu at oracle.com > git am ./v4_20231018_dtatulea_vdpa_add_support_for_vq_descriptor_mappings.mbx > git am ./v4_20231021_si_wei_liu_vdpa_decouple_reset_of_iotlb_mapping_from_device_reset.mbx > git am ./20231023_si_wei_liu_vhost_vdpa_fix_null_pointer_deref_in__compat_vdpa_reset.mbx > cp /boot/config-5.14.0-377.el9.x86_64 .config > make -j 32 > make modules_install > make install > > Thanks > > Lei >> https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu at oracle.com/ >> >> Would be nice to have it applied for any tests. >> >> Thanks, >> -Siwei >> >> On 10/23/2023 11:51 PM, Lei Yang wrote: >>> QE tested this series v4 with regression testing on real nic, there is >>> no new regression bug. >>> >>> Tested-by: Lei Yang <leiyang at redhat.com> >>> >>> On Tue, Oct 24, 2023 at 6:02?AM Si-Wei Liu <si-wei.liu at oracle.com> wrote: >>>> >>>> On 10/22/2023 8:51 PM, Jason Wang wrote: >>>>> Hi Si-Wei: >>>>> >>>>> On Sat, Oct 21, 2023 at 5:28?PM Si-Wei Liu <si-wei.liu at oracle.com> wrote: >>>>>> In order to reduce needlessly high setup and teardown cost >>>>>> of iotlb mapping during live migration, it's crucial to >>>>>> decouple the vhost-vdpa iotlb abstraction from the virtio >>>>>> device life cycle, i.e. iotlb mappings should be left >>>>>> intact across virtio device reset [1]. For it to work, the >>>>>> on-chip IOMMU parent device could implement a separate >>>>>> .reset_map() operation callback to restore 1:1 DMA mapping >>>>>> without having to resort to the .reset() callback, the >>>>>> latter of which is mainly used to reset virtio device state. >>>>>> This new .reset_map() callback will be invoked only before >>>>>> the vhost-vdpa driver is to be removed and detached from >>>>>> the vdpa bus, such that other vdpa bus drivers, e.g. >>>>>> virtio-vdpa, can start with 1:1 DMA mapping when they >>>>>> are attached. For the context, those on-chip IOMMU parent >>>>>> devices, create the 1:1 DMA mapping at vdpa device creation, >>>>>> and they would implicitly destroy the 1:1 mapping when >>>>>> the first .set_map or .dma_map callback is invoked. >>>>>> >>>>>> This patchset is rebased on top of the latest vhost tree. >>>>>> >>>>>> [1] Reducing vdpa migration downtime because of memory pin / maps >>>>>> https://www.mail-archive.com/qemu-devel at nongnu.org/msg953755.html >>>>>> >>>>>> --- >>>>>> v4: >>>>>> - Rework compatibility using new .compat_reset driver op >>>>> I still think having a set_backend_feature() >>>> This will overload backend features with the role of carrying over >>>> compatibility quirks, which I tried to avoid from. While I think the >>>> .compat_reset from the v4 code just works with the backend features >>>> acknowledgement (and maybe others as well) to determine, but not >>>> directly tie it to backend features itself. These two have different >>>> implications in terms of requirement, scope and maintaining/deprecation, >>>> better to cope with compat quirks in explicit and driver visible way. >>>> >>>>> or reset_map(clean=true) might be better. >>>> An explicit op might be marginally better in driver writer's point of >>>> view. Compliant driver doesn't have to bother asserting clean_map never >>>> be true so their code would never bother dealing with this case, as >>>> explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map >>>> during reset for older userspace": >>>> >>>> " >>>> The separation of .compat_reset from the regular .reset allows >>>> vhost-vdpa able to know which driver had broken behavior before, so it >>>> can apply the corresponding compatibility quirk to the individual >>>> driver >>>> whenever needed. Compared to overloading the existing .reset with >>>> flags, .compat_reset won't cause any extra burden to the implementation >>>> of every compliant driver. >>>> " >>>> >>>>> As it tries hard to not introduce new stuff on the bus. >>>> Honestly I don't see substantial difference between these other than the >>>> color. There's no single best solution that stands out among the 3. And >>>> I assume you already noticed it from all the above 3 approaches will >>>> have to go with backend features negotiation, that the 1st vdpa reset >>>> before backend feature negotiation will use the compliant version of >>>> .reset that doesn't clean up the map. While I don't think this nuance >>>> matters much to existing older userspace apps, as the maps should >>>> already get cleaned by previous process in vhost_vdpa_cleanup(), but if >>>> bug-for-bug behavioral compatibility is what you want, module parameter >>>> will be the single best answer. >>>> >>>> Regards, >>>> -Siwei >>>> >>>>> But we can listen to others for sure. >>>>> >>>>> Thanks >>>>>