thr3ads.net - Linux Virtualization - [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API [Jul 2015]

Andy Lutomirski

2015-Jul-28 19:24 UTC

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On Tue, Jul 28, 2015 at 12:06 PM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
> On 2015-07-28 20:22, Andy Lutomirski wrote:
>> On Tue, Jul 28, 2015 at 10:17 AM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
>>> On 2015-07-28 19:10, Andy Lutomirski wrote:
>>>> The trouble is that this is really a property of the bus and not of
>>>> the device.  If you build a virtio device that physically plugs into a
>>>> PCIe slot, the device has no concept of an IOMMU in the first place.
>>>
>>> If one would build a real virtio device today, it would be broken
>>> because every IOMMU would start to translate its requests. Already from
>>> that POV, we really need to introduce a feature flag "I will be
>>> IOMMU-translated" so that a potential physical implementation can carry
>>> it unconditionally.
>>>
>>
>> Except that, with my patches, it would work correctly.  ISTM the thing
>
> I haven't looked at your patches yet - they make the virtio PCI driver
> in Linux IOMMU-compatible? Perfect - except for a compatibility check,
> right?

Yes.  (virtio_pci_legacy, anyway.  Presumably virtio_pci_modern is
easy to adapt, too.)

>
>> that's broken right now is QEMU and the virtio_pci driver.  My patches
>> fix the driver.  Last year that would have been the end of the story
>> except for PPC.  Now we have to deal with QEMU.
>>
>>>> Similarly, if you take an L0-provided IOMMU-supporting device and pass
>>>> it through to L2 using current QEMU on L1 (with Q35 emulation and
>>>> iommu enabled), then, from L2's perspective, the device is 1:1 no
>>>> matter what the device thinks.
>>>>
>>>> IOW, I think the original design was wrong and now we have to deal
>>>> with it.  I think the best solution would be to teach QEMU to fix its
>>>> ACPI tables so that 1:1 virtio devices are actually exposed as 1:1.
>>>
>>> Only the current drivers are broken. And we can easily tell them apart
>>> from newer ones via feature flags. Sorry, don't get the problem.
>>
>> I still don't see how feature flags solve the problem.  Suppose we
>> added a feature flag meaning "respects IOMMU".
>>
>> Bad case 1:  Build a malicious device that advertises
>> non-IOMMU-respecting virtio.  Plug it in behind an IOMMU.  Host starts
>> leaking physical addresses to the device (and the device doesn't work,
>> of course).  Maybe that's only barely a security problem, but still...
>
> I don't see right now how critical such a hypothetical case could be.
> But the OS / its drivers could still decide to refuse talking to such a
> device.
>

How does the OS know it's such a device as opposed to a QEMU-supplied thing?

>>
>> Bad case 2: Some hypothetical well-behaved new QEMU provides a virtio
>> device that *does* respect the IOMMU and sets the feature flag.  They
>> emulate Q35 with an IOMMU.  They boot Linux 4.1.  Data corruption in
>> the guest.
>
> No. In that case, the feature negotiation of "virtio-with-iommu-support"
> would have failed for older drivers, and the device would have never
> been used by the guest.

So are you suggesting that newer virtio devices always provide this
feature flag and, if supplied by QEMU with iommu=on, simply refuse to
operate if the driver doesn't support that flag?

That could work as long as QEMU with the current (broken?) iommu=on
never exposes such a device.

>
>>
>> We could make the rule that *all* virtio-pci devices (except on PPC)
>> respect the bus rules.  We'd have to fix QEMU so that virtio devices
>> on Q35 iommu=on systems set up a PCI topology where the devices
>> *aren't* behind the IOMMU or are protected by RMRRs or whatever.  Then
>> old kernels would work correctly on new hosts, new kernels would work
>> correctly except on old iommu-providing hosts, and Xen would work.
>
> I don't see a point in doing anything about old QEMU with IOMMU enabled
> and virtio devices plugged except declaring such setups broken. No one
> should have configured this for production purposes, only for test
> setups (like we, with the knowledge about the limitations).
>

I'm fine with that.  In fact, I proposed these patches before QEMU had
this feature in the first place.

>>
>> In fact, on Xen, it's impossible without colossal hacks to support
>> non-IOMMU-respecting virtio devices because Xen acts as an
>> intermediate IOMMU between the Linux dom0 guest and the actual host.
>> The QEMU host doesn't even know that Xen is involved.  This is why Xen
>> and virtio don't currently work together (without my patches): the
>> device thinks it doesn't respect the IOMMU, the driver thinks the
>> device doesn't respect the IOMMU, and they're both wrong.
>>
>> TL;DR: I think there are only two cases.  Either a device respects the
>> IOMMU or a device doesn't know whether it respects the IOMMU.  The
>> latter case is problematic.
>
> See above, the latter is only problematic on setups that actually use an
> IOMMU. If that includes Xen, then no one should use it until virtio can
> declare itself IOMMU compatible, and drivers exist that process this.

Xen works right now with my patches on standard QEMU (as long as
iommu=off).  Certainly no one except me uses it now with virtio
because it doesn't work with mainline kernels.

If we apply something similar enough to my patches, then even old
hypervisors (e.g. Amazon's hardware virt systems) will support Xen
with virtio devices passed in just fine.
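
[Editor's illustration] The failure mode in the Xen and nested-IOMMU cases quoted above can be shown with a toy model. This is a minimal sketch, not kernel or Xen code: the table layout, the 4 KiB page size, and the names sketch_bus_translate() and sketch_ring_addr() are assumptions made up for the example. The model has a driver that either respects the bus (writes a bus address into the ring) or does not (writes a raw physical address), and a bus that either translates or is 1:1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  ((1ULL << SKETCH_PAGE_SHIFT) - 1)
#define SKETCH_NO_MAPPING UINT64_MAX

/* One toy IOMMU entry: a bus page number and the physical page behind it. */
struct sketch_iommu_entry {
    uint64_t bus_pfn;
    uint64_t phys_pfn;
};

/* A toy IOMMU. Passing NULL instead of a table models a 1:1 bus. */
struct sketch_iommu {
    const struct sketch_iommu_entry *entries;
    size_t count;
};

/* What the bus does with an address emitted by the device. */
uint64_t sketch_bus_translate(const struct sketch_iommu *iommu,
                              uint64_t bus_addr)
{
    if (iommu == NULL)
        return bus_addr;
    assert(iommu->entries != NULL || iommu->count == 0);

    for (size_t i = 0; i < iommu->count; i++) {
        if (iommu->entries[i].bus_pfn == (bus_addr >> SKETCH_PAGE_SHIFT))
            return (iommu->entries[i].phys_pfn << SKETCH_PAGE_SHIFT) |
                   (bus_addr & SKETCH_PAGE_MASK);
    }
    return SKETCH_NO_MAPPING;   /* the access faults or lands nowhere useful */
}

/* What the driver writes into the ring for a buffer at phys_addr. */
uint64_t sketch_ring_addr(const struct sketch_iommu *iommu,
                          uint64_t phys_addr, int respect_bus)
{
    /* A driver that ignores the bus hands over the raw physical address. */
    if (!respect_bus || iommu == NULL)
        return phys_addr;

    /* A driver that respects the bus hands over the mapped bus address. */
    for (size_t i = 0; i < iommu->count; i++) {
        if (iommu->entries[i].phys_pfn == (phys_addr >> SKETCH_PAGE_SHIFT))
            return (iommu->entries[i].bus_pfn << SKETCH_PAGE_SHIFT) |
                   (phys_addr & SKETCH_PAGE_MASK);
    }
    return SKETCH_NO_MAPPING;
}
```

In this model, a driver that respects the bus always reaches its own buffer, whether or not a translating layer is present. A driver that does not respect the bus only works when the bus happens to be 1:1, which is the situation neither the device nor the driver can detect on its own.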

--Andy

Jan Kiszka

2015-Jul-28 19:33 UTC

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On 2015-07-28 21:24, Andy Lutomirski wrote:
> On Tue, Jul 28, 2015 at 12:06 PM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
>> On 2015-07-28 20:22, Andy Lutomirski wrote:
>>> On Tue, Jul 28, 2015 at 10:17 AM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
>>>> On 2015-07-28 19:10, Andy Lutomirski wrote:
>>>>> The trouble is that this is really a property of the bus and not of
>>>>> the device.  If you build a virtio device that physically plugs into a
>>>>> PCIe slot, the device has no concept of an IOMMU in the first place.
>>>>
>>>> If one would build a real virtio device today, it would be broken
>>>> because every IOMMU would start to translate its requests. Already from
>>>> that POV, we really need to introduce a feature flag "I will be
>>>> IOMMU-translated" so that a potential physical implementation can carry
>>>> it unconditionally.
>>>>
>>>
>>> Except that, with my patches, it would work correctly.  ISTM the thing
>>
>> I haven't looked at your patches yet - they make the virtio PCI driver
>> in Linux IOMMU-compatible? Perfect - except for a compatibility check,
>> right?
>
> Yes.  (virtio_pci_legacy, anyway.  Presumably virtio_pci_modern is
> easy to adapt, too.)
>
>>
>>> that's broken right now is QEMU and the virtio_pci driver.  My patches
>>> fix the driver.  Last year that would have been the end of the story
>>> except for PPC.  Now we have to deal with QEMU.
>>>
>>>>> Similarly, if you take an L0-provided IOMMU-supporting device and pass
>>>>> it through to L2 using current QEMU on L1 (with Q35 emulation and
>>>>> iommu enabled), then, from L2's perspective, the device is 1:1 no
>>>>> matter what the device thinks.
>>>>>
>>>>> IOW, I think the original design was wrong and now we have to deal
>>>>> with it.  I think the best solution would be to teach QEMU to fix its
>>>>> ACPI tables so that 1:1 virtio devices are actually exposed as 1:1.
>>>>
>>>> Only the current drivers are broken. And we can easily tell them apart
>>>> from newer ones via feature flags. Sorry, don't get the problem.
>>>
>>> I still don't see how feature flags solve the problem.  Suppose we
>>> added a feature flag meaning "respects IOMMU".
>>>
>>> Bad case 1:  Build a malicious device that advertises
>>> non-IOMMU-respecting virtio.  Plug it in behind an IOMMU.  Host starts
>>> leaking physical addresses to the device (and the device doesn't work,
>>> of course).  Maybe that's only barely a security problem, but still...
>>
>> I don't see right now how critical such a hypothetical case could be.
>> But the OS / its drivers could still decide to refuse talking to such a
>> device.
>>
>
> How does the OS know it's such a device as opposed to a QEMU-supplied thing?

It can restrict itself to virtio devices exposing the feature if it
feels uncomfortable that it might be talking to some evil piece of
silicon (instead of the hypervisor, which has to be trusted anyway).

>
>>>
>>> Bad case 2: Some hypothetical well-behaved new QEMU provides a virtio
>>> device that *does* respect the IOMMU and sets the feature flag.  They
>>> emulate Q35 with an IOMMU.  They boot Linux 4.1.  Data corruption in
>>> the guest.
>>
>> No. In that case, the feature negotiation of "virtio-with-iommu-support"
>> would have failed for older drivers, and the device would have never
>> been used by the guest.
>
> So are you suggesting that newer virtio devices always provide this
> feature flag and, if supplied by QEMU with iommu=on, simply refuse to
> operate if the driver doesn't support that flag?

Exactly.

>
> That could work as long as QEMU with the current (broken?) iommu=on
> never exposes such a device.

QEMU would have to be adjusted first so that all its virtio-pci device
models take IOMMUs into account, whether they exist or not. Only then
could it expose the feature and expect the guest to acknowledge it.

For compat reasons, QEMU should still be able to expose virtio devices
without the flag set - but then without any IOMMU emulation enabled as
well. That would prevent the current setup we are using today, but it's
trivial to update the guest kernel to a newer virtio driver which would
restore our scenario again.
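
[Editor's illustration] The host-side policy described here can be written out as a small decision function. This is a sketch of one reading of the proposal, not QEMU code: the enum, the function name sketch_policy(), and its three boolean inputs are hypothetical. In particular, treating "flag offered, driver does not acknowledge it, no IOMMU emulated" as usable is an assumption; the thread only states the refusal rule for the iommu=on case.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible results for one host/device/driver combination (hypothetical). */
enum sketch_outcome {
    SKETCH_DEVICE_USABLE,    /* guest can drive the device safely */
    SKETCH_DEVICE_REFUSES,   /* negotiation fails, guest never uses the device */
    SKETCH_CONFIG_REJECTED   /* host should not offer this combination at all */
};

/*
 * iommu_emulated:     the machine model puts the device behind an emulated IOMMU
 * device_offers_flag: the device model is IOMMU-aware and advertises the flag
 * driver_acks_flag:   the guest driver acknowledges the flag (only meaningful
 *                     when the device offers it)
 */
enum sketch_outcome sketch_policy(bool iommu_emulated,
                                  bool device_offers_flag,
                                  bool driver_acks_flag)
{
    if (!device_offers_flag) {
        /* Compat device: allowed only without any IOMMU emulation. */
        return iommu_emulated ? SKETCH_CONFIG_REJECTED : SKETCH_DEVICE_USABLE;
    }

    if (driver_acks_flag)
        return SKETCH_DEVICE_USABLE;

    /* Old driver, new device: refuse only if translation would matter. */
    return iommu_emulated ? SKETCH_DEVICE_REFUSES : SKETCH_DEVICE_USABLE;
}
```

Under this reading, the test setup mentioned above (an old guest driver, IOMMU emulation on) yields SKETCH_DEVICE_REFUSES with a flag-aware device model, and becomes usable again once the guest kernel carries a driver that acknowledges the flag.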

>
>>
>>>
>>> We could make the rule that *all* virtio-pci devices (except on PPC)
>>> respect the bus rules.  We'd have to fix QEMU so that virtio devices
>>> on Q35 iommu=on systems set up a PCI topology where the devices
>>> *aren't* behind the IOMMU or are protected by RMRRs or whatever.  Then
>>> old kernels would work correctly on new hosts, new kernels would work
>>> correctly except on old iommu-providing hosts, and Xen would work.
>>
>> I don't see a point in doing anything about old QEMU with IOMMU enabled
>> and virtio devices plugged except declaring such setups broken. No one
>> should have configured this for production purposes, only for test
>> setups (like we, with the knowledge about the limitations).
>>
>
> I'm fine with that.  In fact, I proposed these patches before QEMU had
> this feature in the first place.
>
>>>
>>> In fact, on Xen, it's impossible without colossal hacks to support
>>> non-IOMMU-respecting virtio devices because Xen acts as an
>>> intermediate IOMMU between the Linux dom0 guest and the actual host.
>>> The QEMU host doesn't even know that Xen is involved.  This is why Xen
>>> and virtio don't currently work together (without my patches): the
>>> device thinks it doesn't respect the IOMMU, the driver thinks the
>>> device doesn't respect the IOMMU, and they're both wrong.
>>>
>>> TL;DR: I think there are only two cases.  Either a device respects the
>>> IOMMU or a device doesn't know whether it respects the IOMMU.  The
>>> latter case is problematic.
>>
>> See above, the latter is only problematic on setups that actually use an
>> IOMMU. If that includes Xen, then no one should use it until virtio can
>> declare itself IOMMU compatible, and drivers exist that process this.
>
> Xen works right now with my patches on standard QEMU (as long as
> iommu=off).  Certainly no one except me uses it now with virtio
> because it doesn't work with mainline kernels.
>
> If we apply something similar enough to my patches, then even old
> hypervisors (e.g. Amazon's hardware virt systems) will support Xen
> with virtio devices passed in just fine.

Then it seems we can make everyone happy - perfect. :)

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

Andy Lutomirski

2015-Jul-28 21:16 UTC

[PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API

On Tue, Jul 28, 2015 at 12:33 PM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
> On 2015-07-28 21:24, Andy Lutomirski wrote:
>> On Tue, Jul 28, 2015 at 12:06 PM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
>>> On 2015-07-28 20:22, Andy Lutomirski wrote:
>>>> On Tue, Jul 28, 2015 at 10:17 AM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
>>>>> On 2015-07-28 19:10, Andy Lutomirski wrote:
>>>>>> The trouble is that this is really a property of the bus and not of
>>>>>> the device.  If you build a virtio device that physically plugs into a
>>>>>> PCIe slot, the device has no concept of an IOMMU in the first place.
>>>>>
>>>>> If one would build a real virtio device today, it would be broken
>>>>> because every IOMMU would start to translate its requests. Already from
>>>>> that POV, we really need to introduce a feature flag "I will be
>>>>> IOMMU-translated" so that a potential physical implementation can carry
>>>>> it unconditionally.
>>>>>
>>>>
>>>> Except that, with my patches, it would work correctly.  ISTM the thing
>>>
>>> I haven't looked at your patches yet - they make the virtio PCI driver
>>> in Linux IOMMU-compatible? Perfect - except for a compatibility check,
>>> right?
>>
>> Yes.  (virtio_pci_legacy, anyway.  Presumably virtio_pci_modern is
>> easy to adapt, too.)
>>
>>>
>>>> that's broken right now is QEMU and the virtio_pci driver.  My patches
>>>> fix the driver.  Last year that would have been the end of the story
>>>> except for PPC.  Now we have to deal with QEMU.
>>>>
>>>>>> Similarly, if you take an L0-provided IOMMU-supporting device and pass
>>>>>> it through to L2 using current QEMU on L1 (with Q35 emulation and
>>>>>> iommu enabled), then, from L2's perspective, the device is 1:1 no
>>>>>> matter what the device thinks.
>>>>>>
>>>>>> IOW, I think the original design was wrong and now we have to deal
>>>>>> with it.  I think the best solution would be to teach QEMU to fix its
>>>>>> ACPI tables so that 1:1 virtio devices are actually exposed as 1:1.
>>>>>
>>>>> Only the current drivers are broken. And we can easily tell them apart
>>>>> from newer ones via feature flags. Sorry, don't get the problem.
>>>>
>>>> I still don't see how feature flags solve the problem.  Suppose we
>>>> added a feature flag meaning "respects IOMMU".
>>>>
>>>> Bad case 1:  Build a malicious device that advertises
>>>> non-IOMMU-respecting virtio.  Plug it in behind an IOMMU.  Host starts
>>>> leaking physical addresses to the device (and the device doesn't work,
>>>> of course).  Maybe that's only barely a security problem, but still...
>>>
>>> I don't see right now how critical such a hypothetical case could be.
>>> But the OS / its drivers could still decide to refuse talking to such a
>>> device.
>>>
>>
>> How does the OS know it's such a device as opposed to a QEMU-supplied thing?
>
> It can restrict itself to virtio devices exposing the feature if it
> feels uncomfortable that it might be talking to some evil piece of
> silicon (instead of the hypervisor, which has to be trusted anyway).
>
>>
>>>>
>>>> Bad case 2: Some hypothetical well-behaved new QEMU provides a virtio
>>>> device that *does* respect the IOMMU and sets the feature flag.  They
>>>> emulate Q35 with an IOMMU.  They boot Linux 4.1.  Data corruption in
>>>> the guest.
>>>
>>> No. In that case, the feature negotiation of "virtio-with-iommu-support"
>>> would have failed for older drivers, and the device would have never
>>> been used by the guest.
>>
>> So are you suggesting that newer virtio devices always provide this
>> feature flag and, if supplied by QEMU with iommu=on, simply refuse to
>> operate if the driver doesn't support that flag?
>
> Exactly.
>
>>
>> That could work as long as QEMU with the current (broken?) iommu=on
>> never exposes such a device.
>
> QEMU would have to be adjusted first so that all its virtio-pci device
> models take IOMMUs into account, whether they exist or not. Only then
> could it expose the feature and expect the guest to acknowledge it.
>
> For compat reasons, QEMU should still be able to expose virtio devices
> without the flag set - but then without any IOMMU emulation enabled as
> well. That would prevent the current setup we are using today, but it's
> trivial to update the guest kernel to a newer virtio driver which would
> restore our scenario again.

Seems reasonable.

>>
>> If we apply something similar enough to my patches, then even old
>> hypervisors (e.g. Amazon's hardware virt systems) will support Xen
>> with virtio devices passed in just fine.
>
> Then it seems we can make everyone happy - perfect. :)

Yay.

FWIW, I have no intention to touch the QEMU code for this.  I'm
willing to do the vring bit and the virtio-pci bit as long as it's
well specified.

--Andy
