Benjamin Herrenschmidt
2018-Aug-06 22:13 UTC
[RFC 0/4] Virtio uses DMA API for all devices
On Tue, 2018-08-07 at 00:46 +0300, Michael S. Tsirkin wrote:
> On Tue, Aug 07, 2018 at 07:26:35AM +1000, Benjamin Herrenschmidt wrote:
> > On Mon, 2018-08-06 at 23:35 +0300, Michael S. Tsirkin wrote:
> > > > As I said replying to Christoph, we are "leaking" into the interface
> > > > something here that is really what the VM is doing to itself, which
> > > > is to stash its memory away in an inaccessible place.
> > > >
> > > > Cheers,
> > > > Ben.
> > >
> > > I think Christoph merely objects to the specific implementation. If
> > > instead you do something like tweak dev->bus_dma_mask for the virtio
> > > device I think he won't object.
> >
> > Well, we don't have "bus_dma_mask" yet... or do you mean dma_mask?
> >
> > So, something like that would be a possibility, but the problem is that
> > the current virtio (guest side) implementation doesn't honor this when
> > not using dma ops and will not use dma ops if not using iommu, so back
> > to square one.
>
> Well we have the RFC for that - the switch to using DMA ops
> unconditionally isn't problematic in itself IMHO; for now that RFC is
> blocked by its performance overhead, but Christoph says he's trying to
> remove that for direct mappings, so we should hopefully be able to get
> there in X weeks.

That would be good yes.

../..

> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -155,7 +155,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	 * the DMA API if we're a Xen guest, which at least allows
> >  	 * all of the sensible Xen configurations to work correctly.
> >  	 */
> > -	if (xen_domain())
> > +	if (xen_domain() || arch_virtio_direct_dma_ops(&vdev->dev))
> >  		return true;
> >
> >  	return false;
>
> Right but can't we fix the retpoline overhead such that
> vring_use_dma_api will not be called on data path any longer, making
> this a setup time check?

Yes, it needs to be a setup time check regardless, actually!

The above is broken, sorry, I was a bit quick here (too early in the
morning... ugh). We don't want the arch to override the dma ops every
time that is called.

But yes, if we can fix the overhead, it becomes just a matter of
setting up the "right" ops automatically.

> > (Passing the dev allows the arch to know this is a virtio device in
> > "direct" mode or whatever we want to call the !iommu case, and
> > construct appropriate DMA ops for it, which aren't the same as the DMA
> > ops of any other PCI device which *does* use the iommu.)
>
> I think that's where Christoph might have specific ideas about it.

OK well, assuming Christoph can solve the direct case in a way that
also works for the virtio !iommu case, we still want some bit of logic
somewhere that will "switch" to swiotlb based ops if the DMA mask is
limited.

You mentioned an RFC for that? Do you happen to have a link?

It would be indeed ideal if all we had to do was set up some kind of
bus_dma_mask on all PCI devices and have virtio automagically insert
swiotlb when necessary.

Cheers,
Ben.
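For reference, a minimal sketch of the "setup time check" being discussed above, assuming the existing vring_use_dma_api() helper: the decision is evaluated once when the virtqueue is created and cached, so the data path only tests a flag instead of calling back into (potentially retpolined) code. This is purely illustrative, not the actual patch; the use_dma_api field is an assumed addition.

	/*
	 * Sketch only: decide once at virtqueue creation whether this device
	 * goes through the DMA API, and cache the answer.
	 */
	struct vring_virtqueue {
		struct virtqueue vq;

		/* Assumed new field, set once in __vring_new_virtqueue() */
		bool use_dma_api;

		/* ... existing fields ... */
	};

	/* at setup time, e.g. in __vring_new_virtqueue(): */
		vq->use_dma_api = vring_use_dma_api(vdev);

	/* on the data path, e.g. in vring_map_one_sg(), the hot path only
	 * reads the cached flag: */
		if (!vq->use_dma_api)
			return (dma_addr_t)sg_phys(sg);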
Benjamin Herrenschmidt
2018-Aug-06 23:16 UTC
[RFC 0/4] Virtio uses DMA API for all devices
On Tue, 2018-08-07 at 08:13 +1000, Benjamin Herrenschmidt wrote:
> OK well, assuming Christoph can solve the direct case in a way that
> also works for the virtio !iommu case, we still want some bit of logic
> somewhere that will "switch" to swiotlb based ops if the DMA mask is
> limited.
>
> You mentioned an RFC for that? Do you happen to have a link?
>
> It would be indeed ideal if all we had to do was set up some kind of
> bus_dma_mask on all PCI devices and have virtio automagically insert
> swiotlb when necessary.

Actually... I can think of a simpler option (Anshuman, didn't you
prototype this earlier?):

Since that limitation of requiring bounce buffering via swiotlb is true
of any device in a secure VM, whether it goes through the iommu or not,
the iommu remapping is essentially pointless. Thus, we could ensure
that the iommu maps the swiotlb bounce buffer 1:1 (either that or we
configure it as "disabled", which is equivalent in this case).

That way, we can now use the basic swiotlb ops everywhere; the same
dma_ops (swiotlb) will work whether a device uses the iommu or not.

Which boils down to only making virtio use dma ops; there is no need to
override the dma_ops.

Which means all we have to do is either make xen_domain() return true
(yuck) or replace that one test with arch_virtio_force_dma_api(), which
resolves to xen_domain() on x86 and can do something else for us.

As for using a virtio feature flag for that, which is what Christoph
proposes, I'm not a fan of it because it means effectively exposing
this to the peer, ie the interface. I don't think it belongs there. The
interface, from the hypervisor perspective, whether it's qemu, vmware,
hyperv etc., has no business knowing how the guest manages its dma
operations, and may not even be aware of the access limitations (in our
case they are somewhat guest self-imposed).

Now, if this flag really is what we have to do, then we'd probably need
a qemu hack which will go set that flag on all virtio devices when it
detects that the VM is going secure. But I don't think that's where the
"need to use the dma API even for direct mode" information belongs.

Cheers,
Ben.
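To make the proposal above concrete, here is a rough sketch of what that replacement test could look like. arch_virtio_force_dma_api() is only a name suggested in this thread, not an existing kernel interface; the file placements and the is_secure_guest() helper are likewise assumptions.

	/*
	 * Sketch only: arch_virtio_force_dma_api() does not exist upstream.
	 * Each snippet below would live in a different file.
	 */

	/* include/linux/virtio_config.h: generic fallback */
	#ifndef arch_virtio_force_dma_api
	static inline bool arch_virtio_force_dma_api(struct virtio_device *vdev)
	{
		return false;
	}
	#endif

	/* arch/x86/include/asm/virtio.h (hypothetical): keep today's behaviour */
	#define arch_virtio_force_dma_api(vdev)	xen_domain()

	/* arch/powerpc/include/asm/virtio.h (hypothetical): force the DMA API
	 * when the guest has gone secure; is_secure_guest() is a placeholder */
	#define arch_virtio_force_dma_api(vdev)	is_secure_guest()

	/* drivers/virtio/virtio_ring.c: the one remaining test */
	static bool vring_use_dma_api(struct virtio_device *vdev)
	{
		if (!virtio_has_iommu_quirk(vdev))
			return true;

		return arch_virtio_force_dma_api(vdev);
	}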
On Tue, Aug 07, 2018 at 08:13:56AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2018-08-07 at 00:46 +0300, Michael S. Tsirkin wrote:
> > On Tue, Aug 07, 2018 at 07:26:35AM +1000, Benjamin Herrenschmidt wrote:
> > > On Mon, 2018-08-06 at 23:35 +0300, Michael S. Tsirkin wrote:
> > > > > As I said replying to Christoph, we are "leaking" into the interface
> > > > > something here that is really what the VM is doing to itself, which
> > > > > is to stash its memory away in an inaccessible place.
> > > > >
> > > > > Cheers,
> > > > > Ben.
> > > >
> > > > I think Christoph merely objects to the specific implementation. If
> > > > instead you do something like tweak dev->bus_dma_mask for the virtio
> > > > device I think he won't object.
> > >
> > > Well, we don't have "bus_dma_mask" yet... or do you mean dma_mask?
> > >
> > > So, something like that would be a possibility, but the problem is that
> > > the current virtio (guest side) implementation doesn't honor this when
> > > not using dma ops and will not use dma ops if not using iommu, so back
> > > to square one.
> >
> > Well we have the RFC for that - the switch to using DMA ops
> > unconditionally isn't problematic in itself IMHO; for now that RFC is
> > blocked by its performance overhead, but Christoph says he's trying to
> > remove that for direct mappings, so we should hopefully be able to get
> > there in X weeks.
>
> That would be good yes.
>
> ../..
>
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -155,7 +155,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  	 * the DMA API if we're a Xen guest, which at least allows
> > >  	 * all of the sensible Xen configurations to work correctly.
> > >  	 */
> > > -	if (xen_domain())
> > > +	if (xen_domain() || arch_virtio_direct_dma_ops(&vdev->dev))
> > >  		return true;
> > >
> > >  	return false;
> >
> > Right but can't we fix the retpoline overhead such that
> > vring_use_dma_api will not be called on data path any longer, making
> > this a setup time check?
>
> Yes, it needs to be a setup time check regardless, actually!
>
> The above is broken, sorry, I was a bit quick here (too early in the
> morning... ugh). We don't want the arch to override the dma ops every
> time that is called.
>
> But yes, if we can fix the overhead, it becomes just a matter of
> setting up the "right" ops automatically.
>
> > > (Passing the dev allows the arch to know this is a virtio device in
> > > "direct" mode or whatever we want to call the !iommu case, and
> > > construct appropriate DMA ops for it, which aren't the same as the DMA
> > > ops of any other PCI device which *does* use the iommu.)
> >
> > I think that's where Christoph might have specific ideas about it.
>
> OK well, assuming Christoph can solve the direct case in a way that
> also works for the virtio !iommu case, we still want some bit of logic
> somewhere that will "switch" to swiotlb based ops if the DMA mask is
> limited.
>
> You mentioned an RFC for that? Do you happen to have a link?

No, but Christoph did, I think.

> It would be indeed ideal if all we had to do was set up some kind of
> bus_dma_mask on all PCI devices and have virtio automagically insert
> swiotlb when necessary.
>
> Cheers,
> Ben.
Benjamin Herrenschmidt
2018-Aug-07 00:18 UTC
[RFC 0/4] Virtio uses DMA API for all devices
On Tue, 2018-08-07 at 02:45 +0300, Michael S. Tsirkin wrote:
> > OK well, assuming Christoph can solve the direct case in a way that
> > also works for the virtio !iommu case, we still want some bit of logic
> > somewhere that will "switch" to swiotlb based ops if the DMA mask is
> > limited.
> >
> > You mentioned an RFC for that? Do you happen to have a link?
>
> No, but Christoph did, I think.

Ok, I missed that, sorry, I'll dig it out. Thanks.

Cheers,
Ben.
On Tue, Aug 07, 2018 at 08:13:56AM +1000, Benjamin Herrenschmidt wrote:
> It would be indeed ideal if all we had to do was set up some kind of
> bus_dma_mask on all PCI devices and have virtio automagically insert
> swiotlb when necessary.

For 4.20 I plan to remove the swiotlb ops and instead do the bounce
buffering in the common code, including a direct call to the direct ops
to avoid retpoline overhead. For that you still need a flag in virtio
so that, instead of blindly working on physical addresses, the device
is treated like a real device in terms of DMA.

And for powerpc to make use of that I need to get the dma series I
posted last week reviewed and included, otherwise powerpc will have to
be excepted (like arm, where rmk didn't like the way the code was
factored; everything else has already been taken care of).

https://lists.linuxfoundation.org/pipermail/iommu/2018-July/028989.html
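As a rough illustration of the "direct call to the direct ops" part (a sketch of the idea, not the actual series): the common dma-mapping wrappers can special-case the dma-direct ops and call them statically, so the common fast path never goes through an indirect, retpolined function pointer. dma_direct_ops and dma_direct_map_page() are in the tree at this point; the dma_is_direct() helper here is an assumption used for brevity.

	/* Sketch only, loosely based on the idea described above. */
	static inline bool dma_is_direct(const struct dma_map_ops *ops)
	{
		return ops == &dma_direct_ops;
	}

	static inline dma_addr_t dma_map_page_attrs(struct device *dev,
			struct page *page, unsigned long offset, size_t size,
			enum dma_data_direction dir, unsigned long attrs)
	{
		const struct dma_map_ops *ops = get_dma_ops(dev);

		if (dma_is_direct(ops))
			/* static call: no retpoline on the common path */
			return dma_direct_map_page(dev, page, offset, size,
						   dir, attrs);

		return ops->map_page(dev, page, offset, size, dir, attrs);
	}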
On Tue, Aug 07, 2018 at 02:45:25AM +0300, Michael S. Tsirkin wrote:
> > > I think that's where Christoph might have specific ideas about it.
> >
> > OK well, assuming Christoph can solve the direct case in a way that
> > also works for the virtio !iommu case, we still want some bit of logic
> > somewhere that will "switch" to swiotlb based ops if the DMA mask is
> > limited.
> >
> > You mentioned an RFC for that? Do you happen to have a link?
>
> No, but Christoph did, I think.

Do you mean the direct map retpoline mitigation? It is here:

https://www.spinics.net/lists/netdev/msg495413.html
https://www.spinics.net/lists/netdev/msg495785.html
Benjamin Herrenschmidt
2018-Aug-07 06:44 UTC
[RFC 0/4] Virtio uses DMA API for all devices
On Mon, 2018-08-06 at 23:27 -0700, Christoph Hellwig wrote:
> On Tue, Aug 07, 2018 at 08:13:56AM +1000, Benjamin Herrenschmidt wrote:
> > It would be indeed ideal if all we had to do was set up some kind of
> > bus_dma_mask on all PCI devices and have virtio automagically insert
> > swiotlb when necessary.
>
> For 4.20 I plan to remove the swiotlb ops and instead do the bounce
> buffering in the common code, including a direct call to the direct ops
> to avoid retpoline overhead. For that you still need a flag in virtio
> so that, instead of blindly working on physical addresses, the device
> is treated like a real device in terms of DMA.

But you will still call the swiotlb infrastructure, right? I.e., I still
need to control where/how the swiotlb "pool" is allocated.

> And for powerpc to make use of that I need to get the dma series I
> posted last week reviewed and included, otherwise powerpc will have to
> be excepted (like arm, where rmk didn't like the way the code was
> factored; everything else has already been taken care of).
>
> https://lists.linuxfoundation.org/pipermail/iommu/2018-July/028989.html

Yes, I saw your series. I'm just back from a week of travel; I plan to
start reviewing it this week if Michael doesn't beat me to it.

Cheers,
Ben.
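On the "where does the pool come from" question, here is a hedged sketch of one way an arch could place the swiotlb bounce buffer in a region the hypervisor is allowed to access, using the existing swiotlb_init_with_tbl() entry point. The svm_* names and the shared-pool allocator are placeholders, not existing code, and the pool size is arbitrary.

	#include <linux/kernel.h>
	#include <linux/init.h>
	#include <linux/swiotlb.h>

	#define SVM_IO_TLB_BYTES	(64UL << 20)	/* 64MB pool, arbitrary */

	/* placeholder: returns a kernel virtual address inside the region
	 * that has been shared with (made accessible to) the hypervisor */
	extern char *svm_alloc_shared_pool(unsigned long bytes);

	void __init svm_swiotlb_init(void)
	{
		unsigned long nslabs = SVM_IO_TLB_BYTES >> IO_TLB_SHIFT;
		char *tlb = svm_alloc_shared_pool(SVM_IO_TLB_BYTES);

		/* hand the explicitly placed buffer to the swiotlb core */
		if (!tlb || swiotlb_init_with_tbl(tlb, nslabs, 1))
			panic("Cannot set up bounce buffers for the secure VM");
	}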