Christoph Hellwig
2019-Aug-11 05:56 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
sev_active() is gone now in linux-next, at least as a global API.

And once again this is entirely going in the wrong direction. The only way using the DMA API is going to work at all is if the device is ready for it. So we need a flag on the virtio device, exposed by the hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, don't take a shortcut.

And that means on power and s390 qemu will always have to set this if you want to be ready for the ultravisor and co games. It's not like we haven't been through this a few times before, have we?
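A minimal sketch of the per-device check asked for above, written as standalone C rather than kernel code. The struct, the helper names and the `TOY_F_REAL_DEVICE` bit are hypothetical and exist only for illustration; the thread does not settle which feature bit would carry this meaning.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical feature bit meaning "I'm real, don't take a shortcut".
 * The bit position is arbitrary and is NOT a real virtio feature number.
 */
#define TOY_F_REAL_DEVICE 63

/* Hypothetical, stripped-down stand-in for a virtio device. */
struct toy_virtio_device {
    uint64_t features; /* feature bits exposed by the hypervisor or hardware */
};

/* Test one feature bit on the device. */
static bool toy_has_feature(const struct toy_virtio_device *vdev,
                            unsigned int bit)
{
    return ((vdev->features >> bit) & 1) != 0;
}

/*
 * The decision is made per device, from a flag the device itself exposes,
 * and not from a system-wide predicate such as "guest memory is encrypted".
 */
static bool toy_use_dma_api(const struct toy_virtio_device *vdev)
{
    return toy_has_feature(vdev, TOY_F_REAL_DEVICE);
}
```

With this shape, a device that does not set the flag keeps the legacy shortcut of being handed guest physical addresses directly, and a device that sets it always goes through the DMA API.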
Ram Pai
2019-Aug-11 06:46 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote:
> sev_active() is gone now in linux-next, at least as a global API.
>
> And once again this is entirely going in the wrong direction. The only
> way using the DMA API is going to work at all is if the device is ready
> for it. So we need a flag on the virtio device, exposed by the
> hypervisor (or hardware for hw virtio devices) that says: hey, I'm real,
> don't take a shortcut.
>
> And that means on power and s390 qemu will always have to set this if
> you want to be ready for the ultravisor and co games. It's not like we
> haven't been through this a few times before, have we?

We have been through this so many times, but I don't think we ever understood each other. I have a fundamental question, the answer to which was never clear. Here it is...

If the hypervisor (hardware for hw virtio devices) does not mandate a DMA API, why is it illegal for the driver to request special handling of its i/o buffers? Why are we associating this special handling to always mean some DMA address translation? Can't there be any other kind of special handling needs that have nothing to do with DMA address translation?

--
Ram Pai
Michael S. Tsirkin
2019-Aug-11 08:42 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote:
> And once again this is entirely going in the wrong direction. The only
> way using the DMA API is going to work at all is if the device is ready
> for it.

So the point made is that if DMA addresses are also physical addresses (not necessarily the same physical addresses that the driver supplied), then the DMA API actually works even though the device itself uses CPU page tables.

To put it in other terms: it would be possible to make all or part of memory unencrypted and then have virtio access all of it. SEV guests at the moment make a decision to instead use a bounce buffer, forcing an extra copy but gaining security.

--
MST
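The bounce-buffer trade-off described above can be sketched in a few lines of standalone C. This is a toy model under stated assumptions, not the kernel's implementation: `toy_bounce` stands in for a region the guest has left unencrypted, and the `toy_*` function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BOUNCE_SIZE 4096

/* Hypothetical stand-in for memory that the device side is able to read. */
static unsigned char toy_bounce[TOY_BOUNCE_SIZE];

/*
 * "Map" a private (encrypted) buffer for the device: copy it into the
 * shared region and return the address the device is given. The address
 * is still a plain physical address in this model, just not the one the
 * driver originally supplied.
 */
static void *toy_map_to_device(const void *private_buf, size_t len)
{
    assert(len <= TOY_BOUNCE_SIZE);
    memcpy(toy_bounce, private_buf, len); /* the extra copy */
    return toy_bounce;
}

/* "Unmap": copy whatever the device wrote back into the private buffer. */
static void toy_unmap_from_device(void *private_buf, size_t len)
{
    assert(len <= TOY_BOUNCE_SIZE);
    memcpy(private_buf, toy_bounce, len);
}
```

The device only ever sees `toy_bounce`, so the rest of guest memory can stay encrypted, at the cost of one copy in each direction.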
Michael S. Tsirkin
2019-Aug-11 08:44 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Sat, Aug 10, 2019 at 11:46:21PM -0700, Ram Pai wrote:
> On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote:
> > sev_active() is gone now in linux-next, at least as a global API.
> >
> > And once again this is entirely going in the wrong direction. The only
> > way using the DMA API is going to work at all is if the device is ready
> > for it. So we need a flag on the virtio device, exposed by the
> > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real,
> > don't take a shortcut.
> >
> > And that means on power and s390 qemu will always have to set this if
> > you want to be ready for the ultravisor and co games. It's not like we
> > haven't been through this a few times before, have we?
>
>
> We have been through this so many times, but I don't think we ever
> understood each other. I have a fundamental question, the answer to
> which was never clear. Here it is...
>
> If the hypervisor (hardware for hw virtio devices) does not mandate a
> DMA API, why is it illegal for the driver to request special handling
> of its i/o buffers? Why are we associating this special handling to
> always mean some DMA address translation? Can't there be
> any other kind of special handling needs that have nothing to do with
> DMA address translation?

I think the answer to that is, extend the DMA API to cover that special need then. And that's exactly what dma_addr_is_phys_addr is trying to do.

>
> --
> Ram Pai
Michael S. Tsirkin
2019-Aug-11 08:55 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote:
> So we need a flag on the virtio device, exposed by the
> hypervisor (or hardware for hw virtio devices) that says: hey, I'm real,
> don't take a shortcut.

The point here is that it's actually still not real. So we would still use a physical address. However Linux decides that it wants extra security by moving all data through the bounce buffer. The distinction made is that one can actually give the device a physical address of the bounce buffer.

--
MST
David Gibson
2019-Aug-12 09:51 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote:
> sev_active() is gone now in linux-next, at least as a global API.
>
> And once again this is entirely going in the wrong direction. The only
> way using the DMA API is going to work at all is if the device is ready
> for it. So we need a flag on the virtio device, exposed by the
> hypervisor (or hardware for hw virtio devices) that says: hey, I'm real,
> don't take a shortcut.

There still seems to be a failure to understand each other here. The limitation here simply *is not* a property of the device. In fact, it's effectively a property of the memory the virtio device would be trying to access (because it's in secure mode it can't be directly accessed via the hypervisor). There absolutely are cases where this is a device property (a physical virtio device being the obvious one), but this isn't one of them.

Unfortunately, we're kind of stymied by the feature negotiation model of virtio. AIUI the hypervisor / device presents a bunch of feature bits of which the guest / driver selects a subset.

AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, because to handle cases where it *is* a device limitation, we assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then the guest *must* select it.

What we actually need here is for the hypervisor to present VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need a way for the platform core code to communicate to the virtio driver that *it* requires the IOMMU to be used, so that the driver can select or not the feature bit on that basis.

> And that means on power and s390 qemu will always have to set this if
> you want to be ready for the ultravisor and co games. It's not like we
> haven't been through this a few times before, have we?

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
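The "available, but not required" negotiation proposed in this message can be sketched as standalone C. Only the bit number of VIRTIO_F_IOMMU_PLATFORM (33) comes from the virtio headers; `toy_negotiate`, its parameters and the `platform_needs_dma_api` input are hypothetical names for the missing piece the message asks for, namely a way for platform code to tell the driver that it needs the bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number as defined for virtio; the mask is a local convenience. */
#define VIRTIO_F_IOMMU_PLATFORM 33
#define TOY_IOMMU_BIT (1ULL << VIRTIO_F_IOMMU_PLATFORM)

/*
 * The device offers a set of feature bits and the driver accepts a subset.
 * In this sketch the IOMMU bit is accepted only when the platform says it
 * needs the DMA API (for example because guest memory is not directly
 * accessible to the hypervisor), not merely because the bit was offered.
 *
 * Returns false when the platform needs the bit but the device does not
 * offer it, in which case negotiation cannot succeed.
 */
static bool toy_negotiate(uint64_t offered, uint64_t driver_supported,
                          bool platform_needs_dma_api, uint64_t *accepted)
{
    /* Start from the ordinary intersection, leaving the IOMMU bit out. */
    uint64_t selected = offered & driver_supported & ~TOY_IOMMU_BIT;

    if (platform_needs_dma_api) {
        if (!(offered & TOY_IOMMU_BIT))
            return false;
        selected |= TOY_IOMMU_BIT;
    }

    *accepted = selected;
    return true;
}
```

Note that this deliberately differs from the behaviour the message calls an abuse, where an offered VIRTIO_F_IOMMU_PLATFORM is treated as mandatory for the guest.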
Christoph Hellwig
2019-Aug-12 12:13 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Sat, Aug 10, 2019 at 11:46:21PM -0700, Ram Pai wrote:
> If the hypervisor (hardware for hw virtio devices) does not mandate a
> DMA API, why is it illegal for the driver to request special handling
> of its i/o buffers? Why are we associating this special handling to
> always mean some DMA address translation? Can't there be
> any other kind of special handling needs that have nothing to do with
> DMA address translation?

I don't think it is illegal per se. It is however completely broken if we make that decision on a system-wide scale rather than properly requesting it through a per-device flag in the normal virtio framework.
Christoph Hellwig
2019-Aug-12 12:15 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Sun, Aug 11, 2019 at 04:55:27AM -0400, Michael S. Tsirkin wrote:
> On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote:
> > So we need a flag on the virtio device, exposed by the
> > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real,
> > don't take a shortcut.
>
> The point here is that it's actually still not real. So we would still
> use a physical address. However Linux decides that it wants extra
> security by moving all data through the bounce buffer. The distinction
> made is that one can actually give the device a physical address of the
> bounce buffer.

Sure. The problem is just that you keep piling hacks on top of hacks. We need the per-device flag anyway to properly support hardware virtio devices in all circumstances. Instead of coming up with another ad-hoc hack to force DMA use, implement that one proper bit and reuse it here.
Christoph Hellwig
2019-Aug-13 13:26 UTC
[RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Mon, Aug 12, 2019 at 07:51:56PM +1000, David Gibson wrote:
> AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM,
> because to handle cases where it *is* a device limitation, we
> assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then
> the guest *must* select it.
>
> What we actually need here is for the hypervisor to present
> VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need
> a way for the platform core code to communicate to the virtio driver
> that *it* requires the IOMMU to be used, so that the driver can select
> or not the feature bit on that basis.

I agree with the above, but that just brings us back to the original issue - the whole bypass of the DMA OPS should be an option that the device can offer, not the other way around. And we really need to fix that root cause instead of doctoring around it.