Am 18.08.22 um 01:13 schrieb Dmitry Osipenko:> On 8/18/22 01:57, Dmitry Osipenko wrote:
>> On 8/15/22 18:54, Dmitry Osipenko wrote:
>>> On 8/15/22 17:57, Dmitry Osipenko wrote:
>>>> On 8/15/22 16:53, Christian K?nig wrote:
>>>>> Am 15.08.22 um 15:45 schrieb Dmitry Osipenko:
>>>>>> [SNIP]
>>>>>>> Well that comment sounds like KVM is doing the
right thing, so I'm
>>>>>>> wondering what exactly is going on here.
>>>>>> KVM actually doesn't hold the page reference, it
takes the temporal
>>>>>> reference during page fault and then drops the
reference once page is
>>>>>> mapped, IIUC. Is it still illegal for TTM? Or there is
a possibility for
>>>>>> a race condition here?
>>>>>>
>>>>> Well the question is why does KVM grab the page reference
in the first
>>>>> place?
>>>>>
>>>>> If that is to prevent the mapping from changing then yes
that's illegal
>>>>> and won't work. It can always happen that you grab the
address, solve
>>>>> the fault and then immediately fault again because the
address you just
>>>>> grabbed is invalidated.
>>>>>
>>>>> If it's for some other reason than we should probably
investigate if we
>>>>> shouldn't stop doing this.
>>>> CC: +Paolo Bonzini who introduced this code
>>>>
>>>> commit add6a0cd1c5ba51b201e1361b05a5df817083618
>>>> Author: Paolo Bonzini <pbonzini at redhat.com>
>>>> Date: Tue Jun 7 17:51:18 2016 +0200
>>>>
>>>> KVM: MMU: try to fix up page faults before giving up
>>>>
>>>> The vGPU folks would like to trap the first access to a
BAR by setting
>>>> vm_ops on the VMAs produced by mmap-ing a VFIO device.
The fault
>>>> handler
>>>> then can use remap_pfn_range to place some non-reserved
pages in the
>>>> VMA.
>>>>
>>>> This kind of VM_PFNMAP mapping is not handled by KVM, but
follow_pfn
>>>> and fixup_user_fault together help supporting it. The
patch also
>>>> supports
>>>> VM_MIXEDMAP vmas where the pfns are not reserved and thus
subject to
>>>> reference counting.
>>>>
>>>> @Paolo,
>>>>
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F73e5ed8d-0d25-7d44-8fa2-e1d61b1f5a04%40amd.com%2FT%2F%23m7647ce5f8c4749599d2c6bc15a2b45f8d8cf8154&data=05%7C01%7Cchristian.koenig%40amd.com%7Cecb0f0eb6c2d43afa15b08da80a625ff%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637963748360952327%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=52Dvisa7p%2BckmBxMvDJFScGSNy9JRPDdnPK05C%2F6n7A%3D&reserved=0
>>>>
>>> If we need to bump the refcount only for VM_MIXEDMAP and not for
>>> VM_PFNMAP, then perhaps we could add a flag for that to the
kvm_main
>>> code that will denote to kvm_release_page_clean whether it needs to
put
>>> the page?
>> The other variant that kind of works is to mark TTM pages reserved
using
>> SetPageReserved/ClearPageReserved, telling KVM not to mess with the
page
>> struct. But the potential consequences of doing this are unclear to me.
>>
>> Christian, do you think we can do it?
> Although, no. It also doesn't work with KVM without additional changes
> to KVM.
Well my fundamental problem is that I can't fit together why KVM is
grabing a page reference in the first place.
See the idea of the page reference is that you have one reference is
that you count the reference so that the memory is not reused while you
access it, e.g. for I/O or mapping it into different address spaces etc...
But none of those use cases seem to apply to KVM. If I'm not totally
mistaken in KVM you want to make sure that the address space mapping,
e.g. the translation between virtual and physical address, don't change
while you handle it, but grabbing a page reference is the completely
wrong approach for that.
Regards,
Christian.