thr3ads.net - Virtualization - [RFC V2] virtio: Add platform specific DMA API translation for virito devices [Jun 2018]

If this information is useful, please help other people find it:
Share via:

Benjamin Herrenschmidt

2018-May-23 22:27 UTC

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> I re-read that discussion and I'm still unclear on the
> original question, since I got several apparently
> conflicting answers.
> 
> I asked:
> 
> 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> 	hypervisor side sufficient?
I thought I had replied to this...

There are a couple of reasons:

- First qemu doesn't know that the guest will switch to "secure
mode"
in advance. There is no difference between a normal and a secure
partition until the partition does the magic UV call to "enter secure
mode" and qemu doesn't see any of it. So who can set the flag here ?

- Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
vhost) go through the emulated MMIO for every access to the guest,
which adds additional overhead.

Cheers,
Ben.
> 
> 
> >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> >  3 files changed, 27 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/dma-mapping.h
b/arch/powerpc/include/asm/dma-mapping.h
> > index 8fa3945..056e578 100644
> > --- a/arch/powerpc/include/asm/dma-mapping.h
> > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device
*dev);
> >  #define ARCH_HAS_DMA_MMAP_COHERENT
> >  
> >  #endif /* __KERNEL__ */
> > +
> > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > +
> > +struct virtio_device;
> > +
> > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> >  #endif	/* _ASM_DMA_MAPPING_H */
> > diff --git a/arch/powerpc/platforms/pseries/iommu.c
b/arch/powerpc/platforms/pseries/iommu.c
> > index 06f0296..a2ec15a 100644
> > --- a/arch/powerpc/platforms/pseries/iommu.c
> > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > @@ -38,6 +38,7 @@
> >  #include <linux/of.h>
> >  #include <linux/iommu.h>
> >  #include <linux/rculist.h>
> > +#include <linux/virtio.h>
> >  #include <asm/io.h>
> >  #include <asm/prom.h>
> >  #include <asm/rtas.h>
> > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> >  __setup("multitce=", disable_multitce);
> >  
> >  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
> > +
> > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > +{
> > +	/*
> > +	 * On protected guest platforms, force virtio core to use DMA
> > +	 * MAP API for all virtio devices. But there can also be some
> > +	 * exceptions for individual devices like virtio balloon.
> > +	 */
> > +	return (of_find_compatible_node(NULL, NULL,
"ibm,ultravisor") != NULL);
> > +}
> 
> Isn't this kind of slow?  vring_use_dma_api is on
> data path and supposed to be very fast.
> 
> > diff --git a/drivers/virtio/virtio_ring.c
b/drivers/virtio/virtio_ring.c
> > index 21d464a..47ea6c3 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> >   * unconditionally on data path.
> >   */
> >  
> > +#ifndef platform_forces_virtio_dma
> > +static inline bool platform_forces_virtio_dma(struct virtio_device
*vdev)
> > +{
> > +	return false;
> > +}
> > +#endif
> > +
> >  static bool vring_use_dma_api(struct virtio_device *vdev)
> >  {
> > +	if (platform_forces_virtio_dma(vdev))
> > +		return true;
> > +
> >  	if (!virtio_has_iommu_quirk(vdev))
> >  		return true;
> >  
> > -- 
> > 2.9.3

Christoph Hellwig

2018-May-24 07:17 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt
wrote:> - First qemu doesn't know that the guest will switch to "secure
mode"
> in advance. There is no difference between a normal and a secure
> partition until the partition does the magic UV call to "enter secure
> mode" and qemu doesn't see any of it. So who can set the flag here
?
> 
> - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> vhost) go through the emulated MMIO for every access to the guest,
> which adds additional overhead.
Also this whole scheme is simply the wrong way around.  No driver should
opt out of the DMA API in general.  For legacy reasons we might have to
opt out of the dma API for some virtio cases due to qemu bugs, but
this should never have been the default, but a quirk for the affected
versions.

We need to fix this now and make the dma ops bypass the quirk instead of
the default, which will also solve the power issue.

Michael S. Tsirkin

2018-May-25 17:45 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt
wrote:> On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> 
> > I re-read that discussion and I'm still unclear on the
> > original question, since I got several apparently
> > conflicting answers.
> > 
> > I asked:
> > 
> > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > 	hypervisor side sufficient?
> 
> I thought I had replied to this...
> 
> There are a couple of reasons:
> 
> - First qemu doesn't know that the guest will switch to "secure
mode"
> in advance. There is no difference between a normal and a secure
> partition until the partition does the magic UV call to "enter secure
> mode" and qemu doesn't see any of it. So who can set the flag here
?
Not sure I understand. Just set the flag e.g. on qemu command line.
I might be wrong, but these secure mode things usually
a. require hypervisor side tricks anyway
> - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> vhost) go through the emulated MMIO for every access to the guest,
> which adds additional overhead.
> 
> Cheers,
> Ben.
Well it's not supposed to be much slower for the static case.

vhost has a cache so should be fine.

A while ago Paolo implemented a translation cache which should be
perfect for this case - most of the code got merged but
never enabled because of stability issues.

If all else fails, we could teach QEMU to handle the no-iommu case
as if VIRTIO_F_IOMMU_PLATFORM was off.


> > 
> > 
> > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > >  3 files changed, 27 insertions(+)
> > > 
> > > diff --git a/arch/powerpc/include/asm/dma-mapping.h
b/arch/powerpc/include/asm/dma-mapping.h
> > > index 8fa3945..056e578 100644
> > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct
device *dev);
> > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > >  
> > >  #endif /* __KERNEL__ */
> > > +
> > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > +
> > > +struct virtio_device;
> > > +
> > > +extern bool platform_forces_virtio_dma(struct virtio_device
*vdev);
> > >  #endif	/* _ASM_DMA_MAPPING_H */
> > > diff --git a/arch/powerpc/platforms/pseries/iommu.c
b/arch/powerpc/platforms/pseries/iommu.c
> > > index 06f0296..a2ec15a 100644
> > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > @@ -38,6 +38,7 @@
> > >  #include <linux/of.h>
> > >  #include <linux/iommu.h>
> > >  #include <linux/rculist.h>
> > > +#include <linux/virtio.h>
> > >  #include <asm/io.h>
> > >  #include <asm/prom.h>
> > >  #include <asm/rtas.h>
> > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char
*str)
> > >  __setup("multitce=", disable_multitce);
> > >  
> > >  machine_subsys_initcall_sync(pseries,
tce_iommu_bus_notifier_init);
> > > +
> > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > +{
> > > +	/*
> > > +	 * On protected guest platforms, force virtio core to use DMA
> > > +	 * MAP API for all virtio devices. But there can also be some
> > > +	 * exceptions for individual devices like virtio balloon.
> > > +	 */
> > > +	return (of_find_compatible_node(NULL, NULL,
"ibm,ultravisor") != NULL);
> > > +}
> > 
> > Isn't this kind of slow?  vring_use_dma_api is on
> > data path and supposed to be very fast.
> > 
> > > diff --git a/drivers/virtio/virtio_ring.c
b/drivers/virtio/virtio_ring.c
> > > index 21d464a..47ea6c3 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > >   * unconditionally on data path.
> > >   */
> > >  
> > > +#ifndef platform_forces_virtio_dma
> > > +static inline bool platform_forces_virtio_dma(struct
virtio_device *vdev)
> > > +{
> > > +	return false;
> > > +}
> > > +#endif
> > > +
> > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  {
> > > +	if (platform_forces_virtio_dma(vdev))
> > > +		return true;
> > > +
> > >  	if (!virtio_has_iommu_quirk(vdev))
> > >  		return true;
> > >  
> > > -- 
> > > 2.9.3

Benjamin Herrenschmidt

2018-May-28 23:48 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Fri, 2018-05-25 at 20:45 +0300, Michael S. Tsirkin
wrote:> On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> > 
> > > I re-read that discussion and I'm still unclear on the
> > > original question, since I got several apparently
> > > conflicting answers.
> > > 
> > > I asked:
> > > 
> > > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > > 	hypervisor side sufficient?
> > 
> > I thought I had replied to this...
> > 
> > There are a couple of reasons:
> > 
> > - First qemu doesn't know that the guest will switch to
"secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter
secure
> > mode" and qemu doesn't see any of it. So who can set the flag
here ?
> 
> Not sure I understand. Just set the flag e.g. on qemu command line.
> I might be wrong, but these secure mode things usually
> a. require hypervisor side tricks anyway
The way our secure mode architecture is designed, there doesn't need at
this point to be any knowledge at qemu level whatsoever. Well at least
until we do migration but that's a different kettle of fish. In any
case, the guest starts normally (which means as a non-secure guest, and
thus expects normal virtio, our FW today doesn't handle
VIRTIO_F_IOMMU_PLATFORM, though granted, we can fix this), and later
that guest issues some special Ultravisor call that turns it into a
secure guest.

There is some involvement of the hypervisor, but not qemu at this
stage. We would very much like to avoid that, as it would be a hassle
for users to have to use different libvirt options etc... bcs the guest
might turn itself into a secure VM.
> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> > 
> > Cheers,
> > Ben.
> 
> Well it's not supposed to be much slower for the static case.
> 
> vhost has a cache so should be fine.
> 
> A while ago Paolo implemented a translation cache which should be
> perfect for this case - most of the code got merged but
> never enabled because of stability issues.
> 
> If all else fails, we could teach QEMU to handle the no-iommu case
> as if VIRTIO_F_IOMMU_PLATFORM was off.
Any serious reason why not just getting that 2 line patch allowing our
arch code to force virtio to use the DMA API ?

It's not particularly invasive and solves our problem rather nicely
without adding overhead or additional knowledge to qemu/libvirt/mgmnt
tools etc... that it doesn't need etc....

The guest knows it's going secure so the guest arch code can do the
right thing rather trivially.

Long term we should probably make virtio always use the DMA API anyway,
and interpose "1:1" dma_ops for the traditional virtio case, that
would
reduce code clutter significantly. In that case, it would become just a
matter of having a platform hook to override the dma_ops used.

Cheers,
Ben.
> 
> 
> > > 
> > > 
> > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > > >  3 files changed, 27 insertions(+)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h
b/arch/powerpc/include/asm/dma-mapping.h
> > > > index 8fa3945..056e578 100644
> > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > @@ -115,4 +115,10 @@ extern u64
__dma_get_required_mask(struct device *dev);
> > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > >  
> > > >  #endif /* __KERNEL__ */
> > > > +
> > > > +#define platform_forces_virtio_dma
platform_forces_virtio_dma
> > > > +
> > > > +struct virtio_device;
> > > > +
> > > > +extern bool platform_forces_virtio_dma(struct virtio_device
*vdev);
> > > >  #endif	/* _ASM_DMA_MAPPING_H */
> > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c
b/arch/powerpc/platforms/pseries/iommu.c
> > > > index 06f0296..a2ec15a 100644
> > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > @@ -38,6 +38,7 @@
> > > >  #include <linux/of.h>
> > > >  #include <linux/iommu.h>
> > > >  #include <linux/rculist.h>
> > > > +#include <linux/virtio.h>
> > > >  #include <asm/io.h>
> > > >  #include <asm/prom.h>
> > > >  #include <asm/rtas.h>
> > > > @@ -1396,3 +1397,13 @@ static int __init
disable_multitce(char *str)
> > > >  __setup("multitce=", disable_multitce);
> > > >  
> > > >  machine_subsys_initcall_sync(pseries,
tce_iommu_bus_notifier_init);
> > > > +
> > > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	/*
> > > > +	 * On protected guest platforms, force virtio core to use
DMA
> > > > +	 * MAP API for all virtio devices. But there can also be
some
> > > > +	 * exceptions for individual devices like virtio balloon.
> > > > +	 */
> > > > +	return (of_find_compatible_node(NULL, NULL,
"ibm,ultravisor") != NULL);
> > > > +}
> > > 
> > > Isn't this kind of slow?  vring_use_dma_api is on
> > > data path and supposed to be very fast.
> > > 
> > > > diff --git a/drivers/virtio/virtio_ring.c
b/drivers/virtio/virtio_ring.c
> > > > index 21d464a..47ea6c3 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > > >   * unconditionally on data path.
> > > >   */
> > > >  
> > > > +#ifndef platform_forces_virtio_dma
> > > > +static inline bool platform_forces_virtio_dma(struct
virtio_device *vdev)
> > > > +{
> > > > +	return false;
> > > > +}
> > > > +#endif
> > > > +
> > > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > > >  {
> > > > +	if (platform_forces_virtio_dma(vdev))
> > > > +		return true;
> > > > +
> > > >  	if (!virtio_has_iommu_quirk(vdev))
> > > >  		return true;
> > > >  
> > > > -- 
> > > > 2.9.3

David Gibson

2018-Jun-04 08:57 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt
wrote:> On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> 
> > I re-read that discussion and I'm still unclear on the
> > original question, since I got several apparently
> > conflicting answers.
> > 
> > I asked:
> > 
> > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > 	hypervisor side sufficient?
> 
> I thought I had replied to this...
> 
> There are a couple of reasons:
> 
> - First qemu doesn't know that the guest will switch to "secure
mode"
> in advance. There is no difference between a normal and a secure
> partition until the partition does the magic UV call to "enter secure
> mode" and qemu doesn't see any of it. So who can set the flag here
?
This seems weird to me.  As a rule HV calls should go through qemu -
or be allowed to go directly to KVM *by* qemu.  We generally reserve
the latter for hot path things.  Since this isn't a hot path, having
the call handled directly by the kernel seems wrong.

Unless a "UV call" is something different I don't know about.
> - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> vhost) go through the emulated MMIO for every access to the guest,
> which adds additional overhead.
> 
-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL:
<http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20180604/6c856489/attachment.sig>

Benjamin Herrenschmidt

2018-Jun-04 09:48 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Mon, 2018-06-04 at 18:57 +1000, David Gibson wrote:> 
> > - First qemu doesn't know that the guest will switch to
"secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter
secure
> > mode" and qemu doesn't see any of it. So who can set the flag
here ?
> 
> This seems weird to me.  As a rule HV calls should go through qemu -
> or be allowed to go directly to KVM *by* qemu.
It's not an HV call, it's a UV call, qemu won't see it, qemu
isn't
trusted. Now the UV *will* reflect that to the HV via some synthetized
HV calls, and we *could* have those do a pass by qemu, however, so far,
our entire design doesn't rely on *any* qemu knowledge whatsoever and
it would be sad to add it just for that purpose.

Additionally, this is rather orthogonal, see my other email, the
problem we are trying to solve is *not* a qemu problem and it doesn't
make sense to leak that into qemu.
>   We generally reserve
> the latter for hot path things.  Since this isn't a hot path, having
> the call handled directly by the kernel seems wrong.
>
> Unless a "UV call" is something different I don't know about.
Yes, a UV call goes to the Ultravisor, not the Hypervisor. The
Hypervisor isn't trusted.
> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> Ben.

Michael S. Tsirkin

2018-Jun-04 12:43 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt
wrote:> On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> 
> > I re-read that discussion and I'm still unclear on the
> > original question, since I got several apparently
> > conflicting answers.
> > 
> > I asked:
> > 
> > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > 	hypervisor side sufficient?
> 
> I thought I had replied to this...
> 
> There are a couple of reasons:
> 
> - First qemu doesn't know that the guest will switch to "secure
mode"
> in advance. There is no difference between a normal and a secure
> partition until the partition does the magic UV call to "enter secure
> mode" and qemu doesn't see any of it. So who can set the flag here
?
The user should set it. You just tell user "to be able to use with
feature X, enable IOMMU".
> - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> vhost) go through the emulated MMIO for every access to the guest,
> which adds additional overhead.
> 
> Cheers,
> Ben.
There are several answers to this.  One is that we are working hard to
make overhead small when the mappings are static (which they would be if
there's no actual IOMMU). So maybe especially given you are using
a bounce buffer on top it's not so bad - did you try to
benchmark?

Another is that given the basic functionality is in there, optimizations
can possibly wait until per-device quirks in DMA API are supported.

> > 
> > 
> > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > >  3 files changed, 27 insertions(+)
> > > 
> > > diff --git a/arch/powerpc/include/asm/dma-mapping.h
b/arch/powerpc/include/asm/dma-mapping.h
> > > index 8fa3945..056e578 100644
> > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct
device *dev);
> > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > >  
> > >  #endif /* __KERNEL__ */
> > > +
> > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > +
> > > +struct virtio_device;
> > > +
> > > +extern bool platform_forces_virtio_dma(struct virtio_device
*vdev);
> > >  #endif	/* _ASM_DMA_MAPPING_H */
> > > diff --git a/arch/powerpc/platforms/pseries/iommu.c
b/arch/powerpc/platforms/pseries/iommu.c
> > > index 06f0296..a2ec15a 100644
> > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > @@ -38,6 +38,7 @@
> > >  #include <linux/of.h>
> > >  #include <linux/iommu.h>
> > >  #include <linux/rculist.h>
> > > +#include <linux/virtio.h>
> > >  #include <asm/io.h>
> > >  #include <asm/prom.h>
> > >  #include <asm/rtas.h>
> > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char
*str)
> > >  __setup("multitce=", disable_multitce);
> > >  
> > >  machine_subsys_initcall_sync(pseries,
tce_iommu_bus_notifier_init);
> > > +
> > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > +{
> > > +	/*
> > > +	 * On protected guest platforms, force virtio core to use DMA
> > > +	 * MAP API for all virtio devices. But there can also be some
> > > +	 * exceptions for individual devices like virtio balloon.
> > > +	 */
> > > +	return (of_find_compatible_node(NULL, NULL,
"ibm,ultravisor") != NULL);
> > > +}
> > 
> > Isn't this kind of slow?  vring_use_dma_api is on
> > data path and supposed to be very fast.
> > 
> > > diff --git a/drivers/virtio/virtio_ring.c
b/drivers/virtio/virtio_ring.c
> > > index 21d464a..47ea6c3 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > >   * unconditionally on data path.
> > >   */
> > >  
> > > +#ifndef platform_forces_virtio_dma
> > > +static inline bool platform_forces_virtio_dma(struct
virtio_device *vdev)
> > > +{
> > > +	return false;
> > > +}
> > > +#endif
> > > +
> > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  {
> > > +	if (platform_forces_virtio_dma(vdev))
> > > +		return true;
> > > +
> > >  	if (!virtio_has_iommu_quirk(vdev))
> > >  		return true;
> > >  
> > > -- 
> > > 2.9.3

Christoph Hellwig

2018-Jun-04 12:55 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Mon, Jun 04, 2018 at 03:43:09PM +0300, Michael S. Tsirkin
wrote:> Another is that given the basic functionality is in there, optimizations
> can possibly wait until per-device quirks in DMA API are supported.
We have had per-device dma_ops for quite a while.

Benjamin Herrenschmidt

2018-Jun-04 13:11 UTC

head link

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

On Mon, 2018-06-04 at 15:43 +0300, Michael S. Tsirkin
wrote:> On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> > 
> > > I re-read that discussion and I'm still unclear on the
> > > original question, since I got several apparently
> > > conflicting answers.
> > > 
> > > I asked:
> > > 
> > > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > > 	hypervisor side sufficient?
> > 
> > I thought I had replied to this...
> > 
> > There are a couple of reasons:
> > 
> > - First qemu doesn't know that the guest will switch to
"secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter
secure
> > mode" and qemu doesn't see any of it. So who can set the flag
here ?
> 
> The user should set it. You just tell user "to be able to use with
> feature X, enable IOMMU".
That's completely backwards. The user has no idea what that stuff is.
And it would have to percolate all the way up the management stack,
libvirt, kimchi, whatever else ... that's just nonsense.

Especially since, as I explained in my other email, this is *not* a
qemu problem and thus the solution shouldn't be messing around with
qemu.
> 
> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> > 
> > Cheers,
> > Ben.
> 
> There are several answers to this.  One is that we are working hard to
> make overhead small when the mappings are static (which they would be if
> there's no actual IOMMU). So maybe especially given you are using
> a bounce buffer on top it's not so bad - did you try to
> benchmark?
> 
> Another is that given the basic functionality is in there, optimizations
> can possibly wait until per-device quirks in DMA API are supported.
The point is that requiring specific qemu command line arguments isn't
going to fly. We have additional problems due to the fact that our
firmware (SLOF) inside qemu doesn't currently deal with iommu's etc...
though those can be fixed.

Overall, however, this seems to be the most convoluted way of achieving
things, require user interventions where none should be needed etc...

Again, what's wrong with a 2 lines hook instead that solves it all and
completely avoids involving qemu ?

Ben.
> 
> > > 
> > > 
> > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > > >  3 files changed, 27 insertions(+)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h
b/arch/powerpc/include/asm/dma-mapping.h
> > > > index 8fa3945..056e578 100644
> > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > @@ -115,4 +115,10 @@ extern u64
__dma_get_required_mask(struct device *dev);
> > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > >  
> > > >  #endif /* __KERNEL__ */
> > > > +
> > > > +#define platform_forces_virtio_dma
platform_forces_virtio_dma
> > > > +
> > > > +struct virtio_device;
> > > > +
> > > > +extern bool platform_forces_virtio_dma(struct virtio_device
*vdev);
> > > >  #endif	/* _ASM_DMA_MAPPING_H */
> > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c
b/arch/powerpc/platforms/pseries/iommu.c
> > > > index 06f0296..a2ec15a 100644
> > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > @@ -38,6 +38,7 @@
> > > >  #include <linux/of.h>
> > > >  #include <linux/iommu.h>
> > > >  #include <linux/rculist.h>
> > > > +#include <linux/virtio.h>
> > > >  #include <asm/io.h>
> > > >  #include <asm/prom.h>
> > > >  #include <asm/rtas.h>
> > > > @@ -1396,3 +1397,13 @@ static int __init
disable_multitce(char *str)
> > > >  __setup("multitce=", disable_multitce);
> > > >  
> > > >  machine_subsys_initcall_sync(pseries,
tce_iommu_bus_notifier_init);
> > > > +
> > > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	/*
> > > > +	 * On protected guest platforms, force virtio core to use
DMA
> > > > +	 * MAP API for all virtio devices. But there can also be
some
> > > > +	 * exceptions for individual devices like virtio balloon.
> > > > +	 */
> > > > +	return (of_find_compatible_node(NULL, NULL,
"ibm,ultravisor") != NULL);
> > > > +}
> > > 
> > > Isn't this kind of slow?  vring_use_dma_api is on
> > > data path and supposed to be very fast.
> > > 
> > > > diff --git a/drivers/virtio/virtio_ring.c
b/drivers/virtio/virtio_ring.c
> > > > index 21d464a..47ea6c3 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > > >   * unconditionally on data path.
> > > >   */
> > > >  
> > > > +#ifndef platform_forces_virtio_dma
> > > > +static inline bool platform_forces_virtio_dma(struct
virtio_device *vdev)
> > > > +{
> > > > +	return false;
> > > > +}
> > > > +#endif
> > > > +
> > > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > > >  {
> > > > +	if (platform_forces_virtio_dma(vdev))
> > > > +		return true;
> > > > +
> > > >  	if (!virtio_has_iommu_quirk(vdev))
> > > >  		return true;
> > > >  
> > > > -- 
> > > > 2.9.3

Virtualization - Jun 2018 - [RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices

[RFC V2] virtio: Add platform specific DMA API translation for virito devices