thr3ads.net - Linux Virtualization - [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API [Sep 2014]

If this information is useful, please help other people find it:
Share via:

Benjamin Herrenschmidt

2014-Sep-03 00:25 UTC

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On Tue, 2014-09-02 at 16:42 -0700, Andy Lutomirski wrote:
> But there aren't any ACPI systems with both virtio-pci and IOMMUs,
> right?  So we could say that, henceforth, ACPI systems must declare
> whether virtio-pci devices live behind IOMMUs without breaking
> backward compatibility.
I don't know for sure whether that's the case and whether we can rely on
that not happening, we'll need x86 folks opinion here.
> >> On ARM, I hope the QEMU will never implement a PCI IOMMU.  As far
as I
> >> could tell when I looked last week, none of the newer
QEMU-emulated
> >> ARM machines even support PCI.  Even if QEMU were to implement a
PCI
> >> IOMMU on some future ARM machine, it could continue using
virtio-mmio
> >> for virtio devices.
> >
> > Possibly...
> >
> >> So ppc might actually be the only system that has or will have
> >> physically-addressed virtio PCI devices that are behind an IOMMU. 
Can
> >> this be handled in a ppc64-specific way?
> >
> > I wouldn't be so certain, as I said, the way virtio is implemented
in
> > qemu bypass the DMA layer which is where IOMMUs sit. The fact that
> > currently x86 doesn't put an IOMMU there is not even garanteed, is
it ?
> > What happens if you try to mix and match virtio and other emulated
> > devices that require the iommu on the same bus ?
> 
> AFAIK QEMU doesn't support IOMMUs at all on x86, so current versions
> of QEMU really do guarantee that virtio-pci on x86 has no IOMMU, even
> if that guarantee is purely accidental.
Right.
> > If we could discriminate virtio devices to a specific host bridge and
> > guarantee no mix & match, we could probably add a concept of
> > "IOMMU-less" bus but that would require guest changes which
limits the
> > usefulness.
> >
> >>   Is there any way that the
> >> kernel can distinguish a QEMU-provided virtio PCI device from a
> >> physical PCIe thing?
> >
> > Not with existing guests which cannot be changed. Existing distros are
> > out with those drivers. If we add a backward compatibility mechanism,
> > then we could add something yes, provided we can segregate virtio onto
a
> > dedicated host bridge (which can be a problem with the libvirt
> > trainwreck...)
> 
> Ugh.
> 
> So here's an ugly proposal:
> 
> Step 1: Make virtio-pci use the DMA API only on x86.  This will at
> least fix Xen and people experimenting with virtio hardware on x86,
> and it won't break anything, since there are no emulated IOMMUs on
> x86.
I think we should make all virtio drivers use the DMA API and just have
different set of dma_ops. We can make a simple ifdef powerpc if needed
in virtio-pci that force the dma-ops of the device to some direct
"bypass" ops at init time.

That way no need to select whether to use the DMA API or not, just
always use it, and add a tweak to replace the DMA ops with the direct
ones on the archs/platforms that need that. That was my original
proposal and I still think it's the best approach.
> Step 2: Update the virtio spec.  Virtio 1.0 PCI devices should set a
> new bit if they are physically addressed.  If that bit is clear, then
> the device is assumed to be addressed in accordance with the
> platform's standard addressing model for PCI.  Presumably this would
> be something like VIRTIO_F_BUS_ADDRESSING = 33, and the spec would say
> something like "Physical devices compatible with this specification
> MUST offer VIRTIO_F_BUS_ADDRESSING.  Drivers MUST implement this
> feature."  Alternatively, this could live in a PCI configuration
> capability.
I'll let you sort that out with Rusty but it makes sense.
> Step 3: Update virtio-pci to use the DMA API for all devices on x86
> and for devices that advertise bus addressing on other architectures.
> 
> I think this proposal will work, but I also think it sucks and I'd
> really like to see a better counter-proposal.
As I said, make it always use the DMA API, but add a quirk to replace
the dma_ops with some NULL ops on platforms that need it.

The only issue with that is the location of the dma ops is arch
specific, so that one function will contain some ifdefs, but the rest of
the code can just use the DMA API.
 
Cheers,
Ben.

Andy Lutomirski

2014-Sep-03 00:32 UTC

head link

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On Tue, Sep 2, 2014 at 5:25 PM, Benjamin Herrenschmidt <benh at
au1.ibm.com> wrote:> On Tue, 2014-09-02 at 16:42 -0700, Andy Lutomirski wrote:
>> So here's an ugly proposal:
>>
>> Step 1: Make virtio-pci use the DMA API only on x86.  This will at
>> least fix Xen and people experimenting with virtio hardware on x86,
>> and it won't break anything, since there are no emulated IOMMUs on
>> x86.
>
> I think we should make all virtio drivers use the DMA API and just have
> different set of dma_ops. We can make a simple ifdef powerpc if needed
> in virtio-pci that force the dma-ops of the device to some direct
> "bypass" ops at init time.
>
> That way no need to select whether to use the DMA API or not, just
> always use it, and add a tweak to replace the DMA ops with the direct
> ones on the archs/platforms that need that. That was my original
> proposal and I still think it's the best approach.
I agree *except* that implementing it will be a real PITA and (I
think) can't be done without changing code in arch/.  My patches plus
an ifdef powerpc will be functionally equivalent, just uglier.
>
> As I said, make it always use the DMA API, but add a quirk to replace
> the dma_ops with some NULL ops on platforms that need it.
>
> The only issue with that is the location of the dma ops is arch
> specific, so that one function will contain some ifdefs, but the rest of
> the code can just use the DMA API.
Bigger quirk: on a standard s390 virtio guest configuration,
dma_map_single etc will fail to link.  I tried this in v1 of these
patches.  So we can poke at the archdata all day, but we can't build a
kernel like that :(

So until the dma_ops pointer move into struct device and
CONFIG_HAS_DMA becomes mandatory (or mandatory enough that virtio can
depend on it), I don't think we can do it this way.

I'll send a v5 that is the same as v4 except with physical addressing
hardcoded in for powerpc.

--Andy

Benjamin Herrenschmidt

2014-Sep-03 00:43 UTC

head link

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On Tue, 2014-09-02 at 17:32 -0700, Andy Lutomirski
wrote:> 
> I agree *except* that implementing it will be a real PITA and (I
> think) can't be done without changing code in arch/.  My patches plus
> an ifdef powerpc will be functionally equivalent, just uglier.
So for powerpc, it's a 2 liner inside virtio-pci, but yes, it might be
more of a problem for s390, I'm not too sure what they do in that area.
> Bigger quirk: on a standard s390 virtio guest configuration,
> dma_map_single etc will fail to link. 
Yuck
>  I tried this in v1 of these
> patches.  So we can poke at the archdata all day, but we can't build a
> kernel like that :(
I would like the s390 people to chime in here, it still looks like the
best way to go if they can fix things on their side :-)
> So until the dma_ops pointer move into struct device and
> CONFIG_HAS_DMA becomes mandatory (or mandatory enough that virtio can
> depend on it), I don't think we can do it this way.
I see, it's a bummer because it would be a lot cleaner.
> I'll send a v5 that is the same as v4 except with physical addressing
> hardcoded in for powerpc.
Thanks. That will do for now, but ideally we want to make it a function
of some flag from the implementation, so let's see what Rusty has to
say.

Cheers,
Ben.

Paolo Bonzini

2014-Sep-03 07:47 UTC

head link

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

Il 03/09/2014 02:25, Benjamin Herrenschmidt ha scritto:> > But there aren't any ACPI systems with both virtio-pci and IOMMUs,
> > right?  So we could say that, henceforth, ACPI systems must declare
> > whether virtio-pci devices live behind IOMMUs without breaking
> > backward compatibility.
> 
> I don't know for sure whether that's the case and whether we can
rely on
> that not happening, we'll need x86 folks opinion here.
IOMMU support for x86 is going to go in this week.

However, it is and likely will remain niche enough that I don't really
care about performance loss from IOMMU support.  If you enable it, you
want it.

So from the QEMU point of view we can simply add the direct-ram-access
property, and have the pseries machine turn it on by default (while
other machines can leave it off by default---they have no IOMMU and thus
no performance cost).

Paolo

Andy Lutomirski

2014-Sep-03 07:52 UTC

head link

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On Wed, Sep 3, 2014 at 12:47 AM, Paolo Bonzini <pbonzini at redhat.com>
wrote:> Il 03/09/2014 02:25, Benjamin Herrenschmidt ha scritto:
>> > But there aren't any ACPI systems with both virtio-pci and
IOMMUs,
>> > right?  So we could say that, henceforth, ACPI systems must
declare
>> > whether virtio-pci devices live behind IOMMUs without breaking
>> > backward compatibility.
>>
>> I don't know for sure whether that's the case and whether we
can rely on
>> that not happening, we'll need x86 folks opinion here.
>
> IOMMU support for x86 is going to go in this week.
>
Can you try to make sure that qemu-system-x86_64 -device iommu -device
virtio-balloon-pci (or whatever the syntax is) doesn't put the
virtio-pci device behind the IOMMU?  Because, if it does, then the
kernel will have to support that, and it'll be messy.

--Andy

Benjamin Herrenschmidt

2014-Sep-03 08:05 UTC

head link

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On Wed, 2014-09-03 at 09:47 +0200, Paolo Bonzini wrote:> 
> IOMMU support for x86 is going to go in this week.
But won't that break virtio on x86 ? Or will virtio continue bypassing
it ? IE, the guest side virtio doesn't expect an IOMMU and doesn't call
the dma mappings ops.
> However, it is and likely will remain niche enough that I don't really
> care about performance loss from IOMMU support.  If you enable it, you
> want it.
> 
> So from the QEMU point of view we can simply add the direct-ram-access
> property, and have the pseries machine turn it on by default (while
> other machines can leave it off by default---they have no IOMMU and
> thus no performance cost).
Well, it's only for virtio and should be on by default on x86 as well if
an iommu is installed no ?

Cheers,
Ben.

Maybe Matching Threads

Search for more maybe matching threads

Linux Virtualization - Sep 2014 - [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

Maybe Matching Threads