Michael S. Tsirkin
2018-Dec-24 17:41 UTC
[PATCH net V2 4/4] vhost: log dirty page correctly
On Mon, Dec 24, 2018 at 11:43:31AM +0800, Jason Wang wrote:
>
> On 2018/12/14 9:20, Michael S. Tsirkin wrote:
> > On Fri, Dec 14, 2018 at 10:43:03AM +0800, Jason Wang wrote:
> > > On 2018/12/13 10:31, Michael S. Tsirkin wrote:
> > > > > Just to make sure I understand this. It looks to me we should:
> > > > >
> > > > > - allow passing GIOVA->GPA through UAPI
> > > > >
> > > > > - cache GIOVA->GPA somewhere but still use GIOVA->HVA in device IOTLB
> > > > >   for performance
> > > > >
> > > > > Is this what you suggest?
> > > > >
> > > > > Thanks
> > > > Not really. We already have GPA->HVA, so I suggested a flag to pass
> > > > GIOVA->GPA in the IOTLB.
> > > >
> > > > This has advantages for security since a single table needs
> > > > then to be validated to ensure guest does not corrupt
> > > > QEMU memory.
> > > >
> > > I wonder how much we can gain through this. Currently, qemu IOMMU gives
> > > GIOVA->GPA mapping, and qemu vhost code will translate GPA to HVA then
> > > pass GIOVA->HVA to vhost. It looks no difference to me.
> > >
> > > Thanks
> > The difference is in security not in performance. Getting a bad HVA
> > corrupts QEMU memory and it might be guest controlled. Very risky.
>
> How can this be controlled by guest? HVA was generated from qemu ram blocks
> which is totally under the control of qemu memory core instead of guest.
>
> Thanks

It is ultimately under guest influence as guest supplies IOVA->GPA
translations. qemu translates GPA->HVA and gives the translated result
to the kernel. If it's not buggy and kernel isn't buggy it's all
fine.

But that's the approach that was proven not to work in the 20th century.
In the 21st century we are trying a defence in depth approach.

My point is that a single code path that is responsible for
the HVA translations is better than two.

> > If
> > translations to HVA are done in a single place through a single table
> > it's safer as there's a single risky place.
> >
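For readers following along, here is a minimal, hypothetical sketch of the flow Michael is objecting to: the guest programs a GIOVA->GPA mapping, QEMU resolves the GPA against its RAM layout, and vhost receives a GIOVA->HVA entry it simply has to trust. None of the names below (giova_map, ram_region, push_iotlb_entry, update_iotlb_current) are real QEMU or vhost identifiers; this only illustrates where the second, userspace-side HVA-translation path sits.

/* Illustrative sketch only -- not actual QEMU or vhost code. */
#include <stdint.h>
#include <stddef.h>

struct giova_map  { uint64_t giova, gpa, size; };   /* guest-programmed vIOMMU mapping */
struct ram_region { uint64_t gpa, hva, size; };     /* QEMU RAM block layout           */

/* Stand-in for the message that sends an IOTLB entry to vhost. */
static int push_iotlb_entry(uint64_t giova, uint64_t hva, uint64_t size)
{
    (void)giova; (void)hva; (void)size;
    return 0;
}

/* Current scheme: userspace does the GPA->HVA step, so vhost must trust
 * whatever HVA it is handed for this GIOVA. */
static int update_iotlb_current(const struct giova_map *m,
                                const struct ram_region *ram, size_t nregions)
{
    for (size_t i = 0; i < nregions; i++) {
        if (m->gpa >= ram[i].gpa &&
            m->gpa + m->size <= ram[i].gpa + ram[i].size) {
            uint64_t hva = ram[i].hva + (m->gpa - ram[i].gpa);
            return push_iotlb_entry(m->giova, hva, m->size);
        }
    }
    return -1; /* GPA not backed by any RAM region */
}

The alternative being argued for is to send the GIOVA->GPA pair instead and let the kernel do the final GPA->HVA step against the one memory table it already validates.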
Jason Wang
2018-Dec-25 09:43 UTC
[PATCH net V2 4/4] vhost: log dirty page correctly
On 2018/12/25 1:41, Michael S. Tsirkin wrote:
> On Mon, Dec 24, 2018 at 11:43:31AM +0800, Jason Wang wrote:
>> On 2018/12/14 9:20, Michael S. Tsirkin wrote:
>>> On Fri, Dec 14, 2018 at 10:43:03AM +0800, Jason Wang wrote:
>>>> On 2018/12/13 10:31, Michael S. Tsirkin wrote:
>>>>>> Just to make sure I understand this. It looks to me we should:
>>>>>>
>>>>>> - allow passing GIOVA->GPA through UAPI
>>>>>>
>>>>>> - cache GIOVA->GPA somewhere but still use GIOVA->HVA in device IOTLB
>>>>>>   for performance
>>>>>>
>>>>>> Is this what you suggest?
>>>>>>
>>>>>> Thanks
>>>>> Not really. We already have GPA->HVA, so I suggested a flag to pass
>>>>> GIOVA->GPA in the IOTLB.
>>>>>
>>>>> This has advantages for security since a single table needs
>>>>> then to be validated to ensure guest does not corrupt
>>>>> QEMU memory.
>>>>>
>>>> I wonder how much we can gain through this. Currently, qemu IOMMU gives
>>>> GIOVA->GPA mapping, and qemu vhost code will translate GPA to HVA then
>>>> pass GIOVA->HVA to vhost. It looks no difference to me.
>>>>
>>>> Thanks
>>> The difference is in security not in performance. Getting a bad HVA
>>> corrupts QEMU memory and it might be guest controlled. Very risky.
>> How can this be controlled by guest? HVA was generated from qemu ram blocks
>> which is totally under the control of qemu memory core instead of guest.
>>
>> Thanks
> It is ultimately under guest influence as guest supplies IOVA->GPA
> translations. qemu translates GPA->HVA and gives the translated result
> to the kernel. If it's not buggy and kernel isn't buggy it's all
> fine.

If qemu provides a buggy GPA->HVA mapping, we can't work around it anyway. And
I don't get the point of why we even want to try: buggy qemu code can crash
itself in many ways.

> But that's the approach that was proven not to work in the 20th century.
> In the 21st century we are trying a defence in depth approach.
>
> My point is that a single code path that is responsible for
> the HVA translations is better than two.
>

So here is the difference depending on whether or not we use the memory table
information.

Current:

1) SET_MEM_TABLE: GPA->HVA

2) Qemu GIOVA->GPA

3) Qemu GPA->HVA

4) IOTLB_UPDATE: GIOVA->HVA

If I understand correctly, you want to drop step 3 because it might be buggy,
even though it is just 19 lines of code in qemu (vhost_memory_region_lookup()).
That ends up as:

1) Do the GPA->HVA translation in the IOTLB_UPDATE path (I believe we won't
   want to do it during device IOTLB lookup).

2) Extra bits to enable this capability.

So this looks like it needs more code in the kernel than what qemu does in
userspace. Is this really worthwhile?

Thanks
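For concreteness, here is a rough sketch of what the first item above (doing the GPA->HVA translation in the IOTLB_UPDATE path) could look like on the kernel side, assuming a simplified memory table. The names (mem_region, mem_table, gpa_to_hva) are hypothetical stand-ins, not the actual vhost data structures.

/* Hypothetical kernel-side sketch, for illustration only. */
#include <stdint.h>
#include <stddef.h>

struct mem_region {
    uint64_t guest_phys_addr;   /* GPA base, registered via SET_MEM_TABLE */
    uint64_t memory_size;
    uint64_t userspace_addr;    /* HVA base */
};

struct mem_table {
    size_t nregions;
    struct mem_region regions[64];
};

/* Translate the GPA carried by an incoming IOTLB update into an HVA using
 * the memory table userspace already registered (and which the kernel
 * already has to trust for non-IOMMU operation). */
static int gpa_to_hva(const struct mem_table *mem, uint64_t gpa,
                      uint64_t len, uint64_t *hva)
{
    for (size_t i = 0; i < mem->nregions; i++) {
        const struct mem_region *r = &mem->regions[i];
        if (gpa >= r->guest_phys_addr &&
            gpa + len <= r->guest_phys_addr + r->memory_size) {
            *hva = r->userspace_addr + (gpa - r->guest_phys_addr);
            return 0;
        }
    }
    return -1;  /* GPA not covered by any registered region */
}

The lookup itself mirrors what the qemu helper Jason mentions (vhost_memory_region_lookup()) does today; the disagreement is only about which side of the UAPI boundary it should live on.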
Michael S. Tsirkin
2018-Dec-25 16:25 UTC
[PATCH net V2 4/4] vhost: log dirty page correctly
On Tue, Dec 25, 2018 at 05:43:25PM +0800, Jason Wang wrote:
>
> On 2018/12/25 1:41, Michael S. Tsirkin wrote:
> > On Mon, Dec 24, 2018 at 11:43:31AM +0800, Jason Wang wrote:
> > > On 2018/12/14 9:20, Michael S. Tsirkin wrote:
> > > > On Fri, Dec 14, 2018 at 10:43:03AM +0800, Jason Wang wrote:
> > > > > On 2018/12/13 10:31, Michael S. Tsirkin wrote:
> > > > > > > Just to make sure I understand this. It looks to me we should:
> > > > > > >
> > > > > > > - allow passing GIOVA->GPA through UAPI
> > > > > > >
> > > > > > > - cache GIOVA->GPA somewhere but still use GIOVA->HVA in device
> > > > > > >   IOTLB for performance
> > > > > > >
> > > > > > > Is this what you suggest?
> > > > > > >
> > > > > > > Thanks
> > > > > > Not really. We already have GPA->HVA, so I suggested a flag to pass
> > > > > > GIOVA->GPA in the IOTLB.
> > > > > >
> > > > > > This has advantages for security since a single table needs
> > > > > > then to be validated to ensure guest does not corrupt
> > > > > > QEMU memory.
> > > > > >
> > > > > I wonder how much we can gain through this. Currently, qemu IOMMU gives
> > > > > GIOVA->GPA mapping, and qemu vhost code will translate GPA to HVA then
> > > > > pass GIOVA->HVA to vhost. It looks no difference to me.
> > > > >
> > > > > Thanks
> > > > The difference is in security not in performance. Getting a bad HVA
> > > > corrupts QEMU memory and it might be guest controlled. Very risky.
> > > How can this be controlled by guest? HVA was generated from qemu ram blocks
> > > which is totally under the control of qemu memory core instead of guest.
> > >
> > > Thanks
> > It is ultimately under guest influence as guest supplies IOVA->GPA
> > translations. qemu translates GPA->HVA and gives the translated result
> > to the kernel. If it's not buggy and kernel isn't buggy it's all
> > fine.
>
> If qemu provides a buggy GPA->HVA mapping, we can't work around it anyway. And
> I don't get the point of why we even want to try: buggy qemu code can crash
> itself in many ways.
>
> > But that's the approach that was proven not to work in the 20th century.
> > In the 21st century we are trying a defence in depth approach.
> >
> > My point is that a single code path that is responsible for
> > the HVA translations is better than two.
> >
>
> So here is the difference depending on whether or not we use the memory table
> information.
>
> Current:
>
> 1) SET_MEM_TABLE: GPA->HVA
>
> 2) Qemu GIOVA->GPA
>
> 3) Qemu GPA->HVA
>
> 4) IOTLB_UPDATE: GIOVA->HVA
>
> If I understand correctly, you want to drop step 3 because it might be buggy,
> even though it is just 19 lines of code in qemu (vhost_memory_region_lookup()).
> That ends up as:
>
> 1) Do the GPA->HVA translation in the IOTLB_UPDATE path (I believe we won't
>    want to do it during device IOTLB lookup).
>
> 2) Extra bits to enable this capability.
>
> So this looks like it needs more code in the kernel than what qemu does in
> userspace. Is this really worthwhile?
>
> Thanks

So there are several points I would like to make:

1. At the moment, without an iommu, it is possible to change GPA->HVA mappings
   and everything keeps working because a change in memory tables flushes the
   rings. However I don't see the iotlb cache being invalidated on that path -
   did I miss it? If it is not there, it's a related minor bug.

2. qemu already has a GPA. Discarding it and re-calculating it when logging is
   on just seems wrong. However, if you would like to *also* keep the HVA in
   the iotlb to avoid doing extra translations, that sounds like a reasonable
   optimization.
3. It also means that the hva->gpa translation only runs when logging is
   enabled. That is a rarely exercised path, so any bugs there will not be
   caught.

So, long term, I really would like us to move away from hva->gpa translations
and keep them for legacy userspace only, but I don't really mind how we do it.

How about:

- a new flag to pass an iotlb with *both* a gpa and hva

- for legacy userspace, calculate the gpa on iotlb update so the device then
  uses a shared code path

What do you think?

-- 
MST
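A hedged sketch of what the proposal above might look like: an IOTLB entry that carries *both* an HVA (for the actual access) and a GPA (for dirty logging), plus a fallback that derives the GPA once at update time for legacy userspace so the device always logs through the same code path. All of the names here (iotlb_msg_v2, IOTLB_F_GPA, hva_to_gpa, iotlb_update) are hypothetical, not the real vhost UAPI.

/* Hypothetical sketch of the proposed message layout and update path. */
#include <stdint.h>
#include <stddef.h>

#define IOTLB_F_GPA (1u << 0)   /* hypothetical "this entry carries a GPA" flag */

struct iotlb_msg_v2 {
    uint64_t iova;              /* GIOVA */
    uint64_t size;
    uint64_t uaddr;             /* HVA, used for the actual memory access */
    uint64_t gpa;               /* GPA, used for dirty page logging */
    uint32_t flags;
};

struct mem_region { uint64_t gpa, hva, size; };

/* Reverse-map an HVA range to a GPA via the registered memory table. */
static int hva_to_gpa(const struct mem_region *mem, size_t n,
                      uint64_t hva, uint64_t len, uint64_t *gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (hva >= mem[i].hva && hva + len <= mem[i].hva + mem[i].size) {
            *gpa = mem[i].gpa + (hva - mem[i].hva);
            return 0;
        }
    }
    return -1;  /* HVA not covered by any registered region */
}

static int iotlb_update(struct iotlb_msg_v2 *msg,
                        const struct mem_region *mem, size_t n)
{
    if (!(msg->flags & IOTLB_F_GPA)) {
        /* Legacy userspace sent only an HVA: compute the GPA once here,
         * instead of on every dirty-log write. */
        if (hva_to_gpa(mem, n, msg->uaddr, msg->size, &msg->gpa))
            return -1;
        msg->flags |= IOTLB_F_GPA;
    }
    /* ... insert (iova -> {uaddr, gpa, size}) into the device IOTLB ... */
    return 0;
}

The point of this shape is that the hva->gpa reverse lookup runs once per IOTLB update rather than only on the rarely exercised logging path.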