thr3ads.net - Virtualization - DANGER WILL ROBINSON, DANGER [Oct 2019]

Paolo Bonzini

2019-Oct-02 13:46 UTC

DANGER WILL ROBINSON, DANGER

On 02/10/19 21:27, Jerome Glisse wrote:
> On Tue, Sep 10, 2019 at 07:49:51AM +0000, Mircea CIRJALIU - MELIU wrote:
>>> On 05/09/19 20:09, Jerome Glisse wrote:
>>>> Not sure i understand, you are saying that the solution i outline
>>>> above does not work ? If so then i think you are wrong, in the above
>>>> solution the importing process mmap a device file and the resulting
>>>> vma is then populated using insert_pfn() and constantly keep
>>>> synchronize with the target process through mirroring which means that
>>>> you never have to look at the struct page ... you can mirror any kind
>>>> of memory from the remote process.
>>>
>>> If insert_pfn in turn calls MMU notifiers for the target VMA (which would be
>>> the KVM MMU notifier), then that would work.  Though I guess it would be
>>> possible to call MMU notifier update callbacks around the call to insert_pfn.
>>
>> Can't do that.
>> First, insert_pfn() uses set_pte_at() which won't trigger the MMU notifier on
>> the target VMA. It's also static, so I'll have to access it thru vmf_insert_pfn()
>> or vmf_insert_mixed().
> 
> Why would you need to target mmu notifier on target vma ?

If the mapping of the source VMA changes, mirroring can update the
target VMA via insert_pfn.  But what ensures that KVM's MMU notifier
dismantles its own existing page tables (so that they can be recreated
with the new mapping from the source VMA)?

Thanks,

Paolo

> You do not need
> that. The workflow is:
> 
>     userspace:
>         ptr = mmap(/dev/kvm-mirroring-device, virtual_address_of_target)
> 
> Then when the mirroring process accesses ptr it triggers a page fault that
> ends up in vm_operations_struct->fault(), which is just doing:
> 
>     kernel-kvm-mirroring-function:
>         kvm_mirror_page_fault(struct vm_fault *vmf) {
>             struct kvm_mirror_struct *kvmms;
> 
>             kvmms = kvm_mirror_struct_from_file(vmf->vma->vm_file);
>             ...
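>         again:
>             hmm_range_register(&range);
>             hmm_range_snapshot(&range);
>             take_lock(kvmms->update);
>             if (hmm_range_valid(&range)) {
>                 /* no notifier raced with the snapshot: safe to map */
>                 vm_insert_pfn();
>                 drop_lock(kvmms->update);
>                 hmm_range_unregister(&range);
>                 return VM_FAULT_NOPAGE;
>             }
>             /* snapshot went stale: drop it and take a new one */
>             drop_lock(kvmms->update);
>             hmm_range_unregister(&range);
>             goto again;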
>         }
> 
> The notifier callback:
>         kvmms_notifier_start() {
>             take_lock(kvmms->update);
>             clear_pte(start, end);
>             drop_lock(kvmms->update);
>         }
> 
>>
>> Our model (the importing process is encapsulated in another VM) forces us
>> to mirror certain pages from the anon VMA backing one VM's system RAM to
>> the other VM's anon VMA.
> 
> The mirror does not have to be an anon vma it can very well be a
> device vma ie mmap of a device file. I do not see any reasons why
> the mirror need to be an anon vma. Please explain why.
> 
>>
>> Using the functions above means setting VM_PFNMAP|VM_MIXEDMAP on
>> the target anon VMA, but I guess this breaks the VMA. Is this recommended?
> 
> The mirror vma should not be an anon vma.
> 
>>
>> Then, mapping anon pages from one VMA to another without fixing the
>> refcount and the mapcount breaks the daemons that think they're working
>> on a pure anon VMA (kcompactd, khugepaged).
> 
> Note here the target vma ie the mirroring one is a mmap of a device file
> and thus is skipped by all of the above (kcompactd, khugepaged, ...); it is
> fully ignored by core mm.
> 
> Thus you do not need to fix the refcount in any way. If any of the core
> mm try to reclaim memory from the original vma then you will get mmu
> notifier callbacks and all you have to do is clear the page table of your
> device vma.
> 
> I did exactly that as a tool in the past and it works just fine with
> no change to core mm whatsoever.
> 
> Cheers,
> Jérôme
>

Jerome Glisse

2019-Oct-02 14:15 UTC

On Wed, Oct 02, 2019 at 03:46:30PM +0200, Paolo Bonzini wrote:
> On 02/10/19 21:27, Jerome Glisse wrote:
> > On Tue, Sep 10, 2019 at 07:49:51AM +0000, Mircea CIRJALIU - MELIU wrote:
> >>> On 05/09/19 20:09, Jerome Glisse wrote:
> >>>> Not sure i understand, you are saying that the solution i outline
> >>>> above does not work ? If so then i think you are wrong, in the above
> >>>> solution the importing process mmap a device file and the resulting
> >>>> vma is then populated using insert_pfn() and constantly keep
> >>>> synchronize with the target process through mirroring which means that
> >>>> you never have to look at the struct page ... you can mirror any kind
> >>>> of memory from the remote process.
> >>>
> >>> If insert_pfn in turn calls MMU notifiers for the target VMA (which would be
> >>> the KVM MMU notifier), then that would work.  Though I guess it would be
> >>> possible to call MMU notifier update callbacks around the call to insert_pfn.
> >>
> >> Can't do that.
> >> First, insert_pfn() uses set_pte_at() which won't trigger the MMU notifier on
> >> the target VMA. It's also static, so I'll have to access it thru vmf_insert_pfn()
> >> or vmf_insert_mixed().
> > 
> > Why would you need to target mmu notifier on target vma ?
> 
> If the mapping of the source VMA changes, mirroring can update the
> target VMA via insert_pfn.  But what ensures that KVM's MMU notifier
> dismantles its own existing page tables (so that they can be recreated
> with the new mapping from the source VMA)?
> 
So just to make sure i follow we have:
      - qemu process on host with anonymous vma
            -> host cpu page table
      - kvm which maps host anonymous vma to guest
            -> kvm guest page table
      - kvm inspector process which mirror vma from qemu process
            -> inspector process page table

AFAIK the KVM notifier will clear the kvm guest page table whenever
necessary (through kvm_mmu_notifier_invalidate_range_start). This is
what ensures that KVM dismantles its own mapping: it abides by
mmu-notifier callbacks. If you did not you would have bugs (at least i
expect so). Am i wrong here ?

The mirroring kernel driver would also register the notifier against
the qemu process and would also abide by notifier callbacks.

What you want to maintain at all times is that none of the actors
above ever looks at a different page for the same virtual address (ie
one looking at the older page while another looks at the new page).

This is where you have helpers like HMM that make sure that you can
not populate the mirroring vma while a notifier is ongoing. Which
means that everything is serialized on the notifier.
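
The serialization on the notifier can be pictured with a small
single-threaded userspace model. It is only a sketch under assumptions:
struct mirror_model, notifier_start(), mirror_fault(), the seq counter and
the races_left test hook are hypothetical names, not HMM or KVM APIs, and a
bool stands in for the real lock. The fault side takes a snapshot, then maps
the page only if no notifier ran in between; otherwise it retries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; none of these names are kernel APIs. */
struct mirror_model {
    bool          locked;      /* stands in for kvmms->update */
    unsigned long seq;         /* bumped by every invalidation */
    unsigned long source_pfn;  /* what the source page table maps now */
    unsigned long mirror_pfn;  /* what the mirror maps; 0 means "not mapped" */
    int           races_left;  /* test hook: invalidations to inject mid-fault */
};

void take_lock(struct mirror_model *m) { assert(!m->locked); m->locked = true; }
void drop_lock(struct mirror_model *m) { assert(m->locked); m->locked = false; }

/* Notifier side: the source mapping changes, so drop the mirror's entry. */
void notifier_start(struct mirror_model *m, unsigned long new_pfn)
{
    take_lock(m);
    m->seq++;                 /* any snapshot taken earlier is now stale */
    m->mirror_pfn = 0;        /* plays the role of clear_pte() */
    m->source_pfn = new_pfn;
    drop_lock(m);
}

/* Fault side: returns how many times the snapshot had to be retaken. */
int mirror_fault(struct mirror_model *m)
{
    int retries = 0;

    for (;;) {
        /* Snapshot taken before the lock is held. */
        unsigned long seq = m->seq;
        unsigned long pfn = m->source_pfn;

        /* Test hook: pretend a notifier ran between snapshot and lock. */
        if (m->races_left > 0) {
            m->races_left--;
            notifier_start(m, pfn + 1);
        }

        take_lock(m);
        if (seq == m->seq) {  /* snapshot still valid: safe to map */
            m->mirror_pfn = pfn;
            drop_lock(m);
            return retries;
        }
        drop_lock(m);         /* raced with a notifier: try again */
        retries++;
    }
}
```

With races_left set to 2, mirror_fault() retakes the snapshot twice and the
mirror ends up pointing at the same pfn as the source, never at a stale one.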

Cheers,
Jérôme

Paolo Bonzini

2019-Oct-02 16:18 UTC

On 02/10/19 16:15, Jerome Glisse wrote:
>>> Why would you need to target mmu notifier on target vma ?
>> If the mapping of the source VMA changes, mirroring can update the
>> target VMA via insert_pfn.  But what ensures that KVM's MMU notifier
>> dismantles its own existing page tables (so that they can be recreated
>> with the new mapping from the source VMA)?
>>
> So just to make sure i follow we have:
>       - qemu process on host with anonymous vma
>             -> host cpu page table
>       - kvm which maps host anonymous vma to guest
>             -> kvm guest page table
>       - kvm inspector process which mirror vma from qemu process
>             -> inspector process page table
> 
> AFAIK the KVM notifier will clear the kvm guest page table whenever
> necessary (through kvm_mmu_notifier_invalidate_range_start). This is
> what ensures that KVM dismantles its own mapping: it abides by
> mmu-notifier callbacks. If you did not you would have bugs (at least i
> expect so). Am i wrong here ?

The KVM inspector process is also (or can be) a QEMU that will have to
create its own KVM guest page table.

So if a page in the source VMA is unmapped we want:

- the source KVM to invalidate its guest page table (done by the KVM MMU
notifier)

- the target VMA to be invalidated (easy using mirroring)

- the target KVM to invalidate its guest page table, as a result of
invalidation of the target VMA
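
The three steps can be pictured as a chain of listeners in a toy userspace
model. This is a sketch of the desired outcome only, not of what the kernel
does today: struct listener, struct mm_model and mm_invalidate() are
hypothetical names, and the propagation from the target VMA to the target
KVM is assumed here, which is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; these types are not kernel structures. */
struct mm_model;

struct listener {
    const char      *name;
    bool             mapped; /* true while this actor still maps the page */
    struct mm_model *backs;  /* address space holding this actor's mapping, or NULL */
    struct listener *next;   /* next listener on the same address space */
};

struct mm_model {
    struct listener *listeners;
};

/* Tear down every mapping derived from a page; returns how many were cleared. */
int mm_invalidate(struct mm_model *mm)
{
    int cleared = 0;

    for (struct listener *l = mm->listeners; l != NULL; l = l->next) {
        /* Notify whoever depends on this actor's mapping before clearing it. */
        if (l->backs != NULL)
            cleared += mm_invalidate(l->backs);
        if (l->mapped) {
            l->mapped = false;
            cleared++;
        }
    }
    return cleared;
}
```

In this model the source address space has two listeners (the source KVM and
the mirror), and the mirror's backs pointer leads to the target address
space, whose only listener is the target KVM. One call on the source then
clears all three mappings.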

Paolo
