thr3ads.net - Virtualization - [PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible [Sep 2014]

Andy Lutomirski

2014-Sep-29 20:55 UTC

[PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible

On Mon, Sep 29, 2014 at 1:49 PM, Benjamin Herrenschmidt
<benh at kernel.crashing.org> wrote:
> On Mon, 2014-09-29 at 11:55 -0700, Andy Lutomirski wrote:
>
>> Rusty and Michael, what's the status of this?
>
> The status is that I still think we need *a* way to actually inform the
> guest whether the virtio implementation will or will not bypass the
> IOMMU. I don't know Xen enough to figure out how to do that and we could
> maybe just make it something qemu puts in the device-tree on powerpc
> only.
>
> However I dislike making it global or per-bus, we could have a
> combination of qemu and HW virtio on the same guest, so I really think
> this needs to be a capability of the virtio device.

Or a capability of the PCI slot, somehow.  I don't understand PCI
topology very well.

>
> I don't completely understand what games Xen is playing here, but from
> what I can tell, it's pretty clear that today's qemu implementation
> always bypasses any iommu and so should always be exported as such on
> all platforms, at least all kvm and pure qemu ones.

Except that I think that PPC is the only platform on which QEMU's code
actually bypasses any IOMMU.  Unless we've all missed something, there
is no QEMU release that will put a virtio device behind an IOMMU on
any platform other than PPC.

If the eventual solution is to say that virtio 1.0 PCI devices always
respect an IOMMU unless they set a magic flag saying "I'm not real
hardware and I bypass the IOMMU", then I don't really object to that,
except that it'll be a mess if the guest is running Xen.  But even Xen
would (I think) be okay if it actually worked by having a new DMA API
operation that says "this device is magically identity mapped" and
then just teaching Xen to implement that.
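
A rough way to picture the rule described above is the toy Python model below. It is not kernel code. The names `VirtioDevice`, `bypasses_iommu` and `address_for_device` are made up for illustration, and the flag is hypothetical: nothing like it is defined in any spec as of this thread.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VirtioDevice:
    name: str
    behind_iommu: bool            # the platform places the device behind an IOMMU
    bypasses_iommu: bool = False  # hypothetical "I'm not real hardware" flag


def address_for_device(dev: VirtioDevice, guest_phys: int,
                       dma_map: Callable[[int], int]) -> int:
    """Return the address the driver should put in the virtqueue.

    A device that respects the IOMMU gets a mapped (bus) address.
    A device that declares the bypass flag, or has no IOMMU in front
    of it, gets the guest-physical address unchanged.
    """
    if dev.behind_iommu and not dev.bypasses_iommu:
        return dma_map(guest_phys)
    return guest_phys
```

With this shape the decision is per device, so a guest could mix emulated virtio devices that set the flag with hardware ones that do not.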

But I'm not an OASIS member, so I can't really do this.  I agree that
this issue needs to be addressed somehow, but I don't think it needs
to block these patches.

--Andy

>
>> I think that (aside from the trivial DMI/DMA typo) the only real issue
>> here is that the situation on PPC is ugly.  We're failing to enable
>> physical virtio hardware on PPC with these patches, but that never
>> worked anyway.  I don't think that there are any regressions other
>> than ugliness.
>>
>> My preference would be to apply the patches as is (or with "DMA"
>> spelled correctly), and then to:
>>
>>  - Make sure that all virtio-mmio systems have working DMA ops so that
>> virtio-mmio can use the DMA API
>>
>>  - Fix the DMA API on s390 (probably easy) and on PPC (not necessarily
>> so easy)
>>
>>  - Remove the non-DMA-API code, which would be a very small change on
>> top of these patches.
>>
>> --Andy
>
>


-- 
Andy Lutomirski
AMA Capital Management, LLC

Benjamin Herrenschmidt

2014-Sep-29 21:06 UTC

On Mon, 2014-09-29 at 13:55 -0700, Andy Lutomirski wrote:
> If the eventual solution is to say that virtio 1.0 PCI devices always
> respect an IOMMU unless they set a magic flag saying "I'm not real
> hardware and I bypass the IOMMU", then I don't really object to that,
> except that it'll be a mess if the guest is running Xen.  But even Xen
> would (I think) be okay if it actually worked by having a new DMA API
> operation that says "this device is magically identity mapped" and
> then just teaching Xen to implement that.
>
> But I'm not an OASIS member, so I can't really do this.  I agree that
> this issue needs to be addressed somehow, but I don't think it needs
> to block these patches.

I'll let Rusty be the final judge of that (well, when he's back from his
vacation that is).

Cheers,
Ben.

Michael S. Tsirkin

2014-Sep-30 15:38 UTC

On Mon, Sep 29, 2014 at 01:55:11PM -0700, Andy Lutomirski wrote:
> On Mon, Sep 29, 2014 at 1:49 PM, Benjamin Herrenschmidt
> <benh at kernel.crashing.org> wrote:
> > On Mon, 2014-09-29 at 11:55 -0700, Andy Lutomirski wrote:
> >
> >> Rusty and Michael, what's the status of this?
> >
> > The status is that I still think we need *a* way to actually inform the
> > guest whether the virtio implementation will or will not bypass the
> > IOMMU. I don't know Xen enough to figure out how to do that and we could
> > maybe just make it something qemu puts in the device-tree on powerpc
> > only.
> >
> > However I dislike making it global or per-bus, we could have a
> > combination of qemu and HW virtio on the same guest, so I really think
> > this needs to be a capability of the virtio device.
> 
> Or a capability of the PCI slot, somehow.  I don't understand PCI
> topology very well.
> 
> >
> > I don't completely understand what games Xen is playing here, but from
> > what I can tell, it's pretty clear that today's qemu implementation
> > always bypasses any iommu and so should always be exported as such on
> > all platforms, at least all kvm and pure qemu ones.
> 
> Except that I think that PPC is the only platform on which QEMU's code
> actually bypasses any IOMMU.  Unless we've all missed something, there
> is no QEMU release that will put a virtio device behind an IOMMU on
> any platform other than PPC.

I think that is true but it seems that this will be true for x86 for
QEMU 2.2 unless we make some changes there.
Which we might not have the time for since 2.2 is feature frozen
from tomorrow.
Maybe we should disable the IOMMU in 2.2, this is worth considering.

> If the eventual solution is to say that virtio 1.0 PCI devices always
> respect an IOMMU unless they set a magic flag saying "I'm not real
> hardware and I bypass the IOMMU", then I don't really object to that,
> except that it'll be a mess if the guest is running Xen.  But even Xen
> would (I think) be okay if it actually worked by having a new DMA API
> operation that says "this device is magically identity mapped" and
> then just teaching Xen to implement that.
> 
> But I'm not an OASIS member, so I can't really do this.  I agree that
> this issue needs to be addressed somehow, but I don't think it needs
> to block these patches.
> 
> --Andy

I thought hard about this, I think we are better off waiting till the
next release: there's a chance QEMU will have IOMMU support for KVM x86
then, and this will make it easier to judge which way does the wind
blow.

It seems that we lose nothing substantial keeping the status quo a bit longer,
but if we make an incompatible change in guests now we might
create nasty compatibility headaches going forward.

> >
> >> I think that (aside from the trivial DMI/DMA typo) the only real issue
> >> here is that the situation on PPC is ugly.  We're failing to enable
> >> physical virtio hardware on PPC with these patches, but that never
> >> worked anyway.  I don't think that there are any regressions other
> >> than ugliness.
> >>
> >> My preference would be to apply the patches as is (or with "DMA"
> >> spelled correctly), and then to:
> >>
> >>  - Make sure that all virtio-mmio systems have working DMA ops so that
> >> virtio-mmio can use the DMA API
> >>
> >>  - Fix the DMA API on s390 (probably easy) and on PPC (not necessarily
> >> so easy)
> >>
> >>  - Remove the non-DMA-API code, which would be a very small change on
> >> top of these patches.
> >>
> >> --Andy
> >
> >
> 
> 
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC

Andy Lutomirski

2014-Sep-30 15:48 UTC

On Tue, Sep 30, 2014 at 8:38 AM, Michael S. Tsirkin <mst at redhat.com> wrote:
> I thought hard about this, I think we are better off waiting till the
> next release: there's a chance QEMU will have IOMMU support for KVM x86
> then, and this will make it easier to judge which way does the wind
> blow.
>
> It seems that we lose nothing substantial keeping the status quo a bit longer,
> but if we make an incompatible change in guests now we might
> create nasty compatibility headaches going forward.
>

I would argue for the opposite approach.  Having a QEMU release that
supports an IOMMU on x86 and exposes a commonly used PCI device that
bypasses that IOMMU without any explicit notification to the guest
(and specification!) that this is happening is IMO insane.  Once that
happens, we'll have to address the nasty case on both x86 and PPC.
This will suck.

If we accept the guest change and make sure that there is never a QEMU
release that has a visible IOMMU cheat on any arch other than PPC,
then at least the damage will be contained.

x86 will be worse than PPC, too: the special case needed to support
QEMU 2.2 with IOMMU and virtio enabled with a Xen guest will be fairly
large and disgusting and will only exist to support something that IMO
should never have existed in the first place.

PPC at least avoids *that* problem by virtue of not having Xen
paravirt.  (And please don't add Xen paravirt to PPC -- x86 is trying
to kill it off, but this is a 5-10 year project.)

[..., reordered]
>>
>> Except that I think that PPC is the only platform on which QEMU's code
>> actually bypasses any IOMMU.  Unless we've all missed something, there
>> is no QEMU release that will put a virtio device behind an IOMMU on
>> any platform other than PPC.
>
> I think that is true but it seems that this will be true for x86 for
> QEMU 2.2 unless we make some changes there.
> Which we might not have the time for since 2.2 is feature frozen
> from tomorrow.
> Maybe we should disable the IOMMU in 2.2, this is worth considering.
>

Please do.

Also, try booting this 2.2 QEMU candidate with nested virtualization
on.  Then bind vfio to a virtio-pci device and watch the guest get
corrupted.  QEMU will blame Linux for incorrectly programming the
hardware, and Linux will blame QEMU for its blatant violation of the
ACPI spec.  Given that this is presumably most of the point of adding
IOMMU support, it seems like a terrible idea to let code like that
into the wild.

If this happens, Linux may also end up needing a quirk to prevent vfio
from binding to QEMU 2.2's virtio-pci devices.
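
To make the failure mode concrete, here is a toy Python model of it (an illustration only, not QEMU or kernel code; `Iommu`, `device_write` and the addresses are all invented). The IOMMU maps an I/O virtual address to a different physical page. A device that honours the mapping writes to the intended page, while a device that bypasses it treats the same number as a physical address and writes somewhere else.

```python
PAGE = 4096


class Iommu:
    """Minimal page-granular IOVA -> physical translation table."""

    def __init__(self):
        self.table = {}  # IOVA page number -> physical page number

    def map(self, iova: int, phys: int) -> None:
        self.table[iova // PAGE] = phys // PAGE

    def translate(self, iova: int) -> int:
        return self.table[iova // PAGE] * PAGE + iova % PAGE


def device_write(memory: dict, iommu: Iommu, iova: int, value: str,
                 honours_iommu: bool) -> int:
    """Store value at the address the device actually hits; return that address."""
    addr = iommu.translate(iova) if honours_iommu else iova
    memory[addr] = value
    return addr


iommu = Iommu()
iommu.map(iova=0x1000, phys=0x5000)          # what the vfio user asked for
memory = {0x1000: "unrelated guest data", 0x5000: "dma buffer"}

device_write(memory, iommu, 0x1000, "payload", honours_iommu=True)
device_write(memory, iommu, 0x1000, "stray write", honours_iommu=False)
```

After the second call the page at 0x1000, which was never meant to be a DMA target, has been overwritten.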

--Andy

Paolo Bonzini

2014-Sep-30 15:53 UTC

On 30/09/2014 17:38, Michael S. Tsirkin wrote:
> I think that is true but it seems that this will be true for x86 for
> QEMU 2.2 unless we make some changes there.
> Which we might not have the time for since 2.2 is feature frozen
> from tomorrow.
> Maybe we should disable the IOMMU in 2.2, this is worth considering.

It is disabled by default, no?  And only supported by Q35 which does not
yet have migration compatibility (though we are versioning it, not sure
why that is the case).

Paolo

Andy Lutomirski

2014-Sep-30 20:05 UTC

On Tue, Sep 30, 2014 at 8:38 AM, Michael S. Tsirkin <mst at redhat.com> wrote:
> On Mon, Sep 29, 2014 at 01:55:11PM -0700, Andy Lutomirski wrote:
>> On Mon, Sep 29, 2014 at 1:49 PM, Benjamin Herrenschmidt
>> <benh at kernel.crashing.org> wrote:
>> > On Mon, 2014-09-29 at 11:55 -0700, Andy Lutomirski wrote:
>> >
>> >> Rusty and Michael, what's the status of this?
>> >
>> > The status is that I still think we need *a* way to actually inform the
>> > guest whether the virtio implementation will or will not bypass the
>> > IOMMU. I don't know Xen enough to figure out how to do that and we could
>> > maybe just make it something qemu puts in the device-tree on powerpc
>> > only.
>> >
>> > However I dislike making it global or per-bus, we could have a
>> > combination of qemu and HW virtio on the same guest, so I really think
>> > this needs to be a capability of the virtio device.
>>
>> Or a capability of the PCI slot, somehow.  I don't understand PCI
>> topology very well.
>>
>> >
>> > I don't completely understand what games Xen is playing here, but from
>> > what I can tell, it's pretty clear that today's qemu implementation
>> > always bypasses any iommu and so should always be exported as such on
>> > all platforms, at least all kvm and pure qemu ones.
>>
>> Except that I think that PPC is the only platform on which QEMU's code
>> actually bypasses any IOMMU.  Unless we've all missed something, there
>> is no QEMU release that will put a virtio device behind an IOMMU on
>> any platform other than PPC.
>
> I think that is true but it seems that this will be true for x86 for
> QEMU 2.2 unless we make some changes there.
> Which we might not have the time for since 2.2 is feature frozen
> from tomorrow.
> Maybe we should disable the IOMMU in 2.2, this is worth considering.
>
>
>> If the eventual solution is to say that virtio 1.0 PCI devices always
>> respect an IOMMU unless they set a magic flag saying "I'm not real
>> hardware and I bypass the IOMMU", then I don't really object to that,
>> except that it'll be a mess if the guest is running Xen.  But even Xen
>> would (I think) be okay if it actually worked by having a new DMA API
>> operation that says "this device is magically identity mapped" and
>> then just teaching Xen to implement that.
>>
>> But I'm not an OASIS member, so I can't really do this.  I agree that
>> this issue needs to be addressed somehow, but I don't think it needs
>> to block these patches.
>>
>> --Andy
>
> I thought hard about this, I think we are better off waiting till the
> next release: there's a chance QEMU will have IOMMU support for KVM x86
> then, and this will make it easier to judge which way does the wind
> blow.

If QEMU wants to fix this, it looks like it wouldn't be so bad.  The
virtio code could keep track of the address space in use.  It looks
like a lot of the changes would be simplifications, but vring_map
would have to go away.  That would cause some performance hit due to
the loss of the permanent ring mapping, but it would be no worse than
the temporary mappings that already exist for indirect descriptors and
actual data.
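
The trade-off can be sketched with a toy Python model (an illustration only, not QEMU code; `AddressSpace` and the two reader functions are invented names, and the only borrowed fact is that a vring descriptor is 16 bytes). A permanent mapping translates the ring base once, while temporary mappings translate on every descriptor access.

```python
DESC_SIZE = 16  # size of one vring descriptor in bytes


class AddressSpace:
    """Stand-in for a per-device address space that counts translations."""

    def __init__(self, offset: int):
        self.offset = offset
        self.translations = 0

    def translate(self, addr: int) -> int:
        self.translations += 1
        return addr + self.offset


def descriptors_permanent(space: AddressSpace, ring: int, count: int) -> list:
    """Translate the ring base once and index into the cached mapping."""
    base = space.translate(ring)
    return [base + DESC_SIZE * i for i in range(count)]


def descriptors_temporary(space: AddressSpace, ring: int, count: int) -> list:
    """Translate each descriptor address at the time it is accessed."""
    return [space.translate(ring + DESC_SIZE * i) for i in range(count)]
```

Both functions reach the same descriptors. The second pays one translation per access, which is the performance hit mentioned above, but it never relies on a mapping that may have gone stale.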

--Andy
