Konrad Rzeszutek Wilk
2021-Feb-02 16:37 UTC
[PATCH] swiotlb: Validate bounce size in the sync/unmap path
On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
> On Mon, Jan 18, 2021 at 10:14:28AM -0500, Konrad Rzeszutek Wilk wrote:
> > On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
> > > On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
> > > > On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
> > > > > The size of the buffer being bounced is not checked if it happens
> > > > > to be larger than the size of the mapped buffer. Because the size
> > > > > can be controlled by a device, as is the case with virtio devices,
> > > > > this can lead to memory corruption.
> > > > >
> > > > I'm really worried about all these hodge podge hacks for untrusted
> > > > hypervisors in the I/O stack. Instead of trying to harden protocols
> > > > that are fundamentally not designed for this, how about instead
> > > > coming up with a new paravirtualized I/O interface that is
> > > > specifically designed for use with an untrusted hypervisor from the
> > > > start?
> > >
> > > Your comment makes sense, but that would require the cooperation of
> > > these vendors and the cloud providers to agree on something
> > > meaningful. I am also not sure whether the end result would be better
> > > than hardening this interface to catch corruption. There is already
> > > some validation in the unmap path anyway.
> > >
> > > Another possibility is to move this hardening to the common virtio
> > > code, but I think the code may become more complicated there since it
> > > would require tracking both the dma_addr and length for each
> > > descriptor.
> >
> > Christoph,
> >
> > I've been wrestling with the same thing - this is specific to busted
> > drivers. And in reality you could do the same thing with a hardware
> > virtio device (see the example at http://thunderclap.io/) - where the
> > mitigation is 'enable the IOMMU to do its job'.
> >
> > The AMD SEV documents speak about utilizing the IOMMU to do this (AMD
> > SEV-SNP), and while that is great for the future, SEV without an IOMMU
> > is here now.
> >
> > Coming full circle here: this issue can be exploited with virtio, but
> > you could do the same with real hardware if you hacked the firmware.
> > So if, say, a compromised Intel SR-IOV NIC were plumbed into the guest
> > on an AMD SEV machine, the IOMMU inside the guest would be the SWIOTLB
> > code - the last line of defense against bad firmware, so to speak.
> >
> > As such I am leaning towards taking this code, but I am worried about
> > the performance hit .. though perhaps I shouldn't be, since if you are
> > using SWIOTLB=force you are already taking a performance hit?
> >
>
> I have not measured the performance degradation. This will hit all AMD
> SEV, Intel TDX, and IBM Protected Virtualization VMs. I don't expect the
> hit to be large since there are only a few added operations per hundreds
> of copied bytes. I could try to measure the performance hit by running
> some benchmark with virtio-net/virtio-blk/virtio-rng.
>
> Earlier I said:
> > > Another possibility is to move this hardening to the common virtio
> > > code, but I think the code may become more complicated there since it
> > > would require tracking both the dma_addr and length for each
> > > descriptor.
>
> Unfortunately, this doesn't make sense. Even if there is validation for
> the size in the common virtio layer, there will be some other device
> which controls a dma_addr and length passed to dma_unmap* in the
> corresponding driver. The device can target a specific dma-mapped
> private buffer by changing the dma_addr and setting a suitable length to
> overwrite the buffers following it.
>
> So, instead of doing the check in every driver and hitting a performance
> cost even when swiotlb is not used, it's probably better to fix it in
> swiotlb.
>
> @Tom Lendacky, do you think that it makes sense to harden swiotlb, or
> might some other approach be better for the SEV features?

I am not Tom, but this change seems the right way forward regardless of
whether it is TDX, AMD SEV, or any other architecture that encrypts memory
and uses SWIOTLB.

Let me queue it up in the development branch and do some regression
testing.
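In short, the hardening being discussed records at map time how large the
original mapping behind a bounce slot is, and refuses to copy more than that
back on the sync/unmap path. A minimal userspace sketch of that check
follows; the names (struct bounce_slot, bounce_map, bounce_sync_for_cpu) and
the single-slot layout are illustrative assumptions, not the kernel's
swiotlb code or the actual patch.

/*
 * Minimal sketch, assuming a single-slot bounce pool: remember the size
 * recorded at map time and reject any sync/unmap request that asks to
 * copy more than that back into the original (private) buffer.
 * Illustrative only; not the kernel's swiotlb code or the patch itself.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

struct bounce_slot {
	void   *orig;          /* original, private buffer */
	void   *bounce;        /* shared bounce buffer the device writes to */
	size_t  mapped_size;   /* size recorded when the buffer was mapped */
	bool    in_use;
};

static struct bounce_slot slot;

/* "Map": set up the bounce buffer and remember how big the mapping is. */
static void *bounce_map(void *orig, size_t size)
{
	slot.orig = orig;
	slot.bounce = malloc(size);
	slot.mapped_size = size;
	slot.in_use = true;
	memcpy(slot.bounce, orig, size);   /* to-device direction */
	return slot.bounce;
}

/*
 * "Sync for CPU": copy device-written data back. The size comes from a
 * descriptor the device controls, so it is validated against the size
 * recorded at map time instead of being trusted.
 */
static int bounce_sync_for_cpu(void *bounce, size_t size)
{
	if (!slot.in_use || bounce != slot.bounce)
		return -1;
	if (size > slot.mapped_size) {
		fprintf(stderr, "bounce: rejected oversized copy (%zu > %zu)\n",
			size, slot.mapped_size);
		return -1;   /* would otherwise overwrite memory past orig */
	}
	memcpy(slot.orig, slot.bounce, size);
	return 0;
}

int main(void)
{
	char private_buf[64] = { 0 };
	char neighbour[64] = "do not clobber me";
	void *dma = bounce_map(private_buf, sizeof(private_buf));

	/* A well-behaved device syncs at most the mapped size. */
	bounce_sync_for_cpu(dma, sizeof(private_buf));
	/* A malicious device asks for more and must be refused. */
	if (bounce_sync_for_cpu(dma, sizeof(private_buf) + sizeof(neighbour)))
		printf("oversized sync blocked; '%s' intact\n", neighbour);
	free(slot.bounce);
	return 0;
}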
Tom Lendacky
2021-Feb-02 22:34 UTC
[PATCH] swiotlb: Validate bounce size in the sync/unmap path
On 2/2/21 10:37 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
>> On Mon, Jan 18, 2021 at 10:14:28AM -0500, Konrad Rzeszutek Wilk wrote:
>>> On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
>>>> On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
>>>>> On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
>>>>>> The size of the buffer being bounced is not checked if it happens
>>>>>> to be larger than the size of the mapped buffer. Because the size
>>>>>> can be controlled by a device, as is the case with virtio devices,
>>>>>> this can lead to memory corruption.
>>>>>>
>>>>> I'm really worried about all these hodge podge hacks for untrusted
>>>>> hypervisors in the I/O stack. Instead of trying to harden protocols
>>>>> that are fundamentally not designed for this, how about instead
>>>>> coming up with a new paravirtualized I/O interface that is
>>>>> specifically designed for use with an untrusted hypervisor from the
>>>>> start?
>>>>
>>>> Your comment makes sense, but that would require the cooperation of
>>>> these vendors and the cloud providers to agree on something
>>>> meaningful. I am also not sure whether the end result would be better
>>>> than hardening this interface to catch corruption. There is already
>>>> some validation in the unmap path anyway.
>>>>
>>>> Another possibility is to move this hardening to the common virtio
>>>> code, but I think the code may become more complicated there since it
>>>> would require tracking both the dma_addr and length for each
>>>> descriptor.
>>>
>>> Christoph,
>>>
>>> I've been wrestling with the same thing - this is specific to busted
>>> drivers. And in reality you could do the same thing with a hardware
>>> virtio device (see the example at http://thunderclap.io/) - where the
>>> mitigation is 'enable the IOMMU to do its job'.
>>>
>>> The AMD SEV documents speak about utilizing the IOMMU to do this (AMD
>>> SEV-SNP), and while that is great for the future, SEV without an IOMMU
>>> is here now.
>>>
>>> Coming full circle here: this issue can be exploited with virtio, but
>>> you could do the same with real hardware if you hacked the firmware.
>>> So if, say, a compromised Intel SR-IOV NIC were plumbed into the guest
>>> on an AMD SEV machine, the IOMMU inside the guest would be the SWIOTLB
>>> code - the last line of defense against bad firmware, so to speak.
>>>
>>> As such I am leaning towards taking this code, but I am worried about
>>> the performance hit .. though perhaps I shouldn't be, since if you are
>>> using SWIOTLB=force you are already taking a performance hit?
>>>
>>
>> I have not measured the performance degradation. This will hit all AMD
>> SEV, Intel TDX, and IBM Protected Virtualization VMs. I don't expect the
>> hit to be large since there are only a few added operations per hundreds
>> of copied bytes. I could try to measure the performance hit by running
>> some benchmark with virtio-net/virtio-blk/virtio-rng.
>>
>> Earlier I said:
>>>> Another possibility is to move this hardening to the common virtio
>>>> code, but I think the code may become more complicated there since it
>>>> would require tracking both the dma_addr and length for each
>>>> descriptor.
>>
>> Unfortunately, this doesn't make sense. Even if there is validation for
>> the size in the common virtio layer, there will be some other device
>> which controls a dma_addr and length passed to dma_unmap* in the
>> corresponding driver. The device can target a specific dma-mapped
>> private buffer by changing the dma_addr and setting a suitable length to
>> overwrite the buffers following it.
>>
>> So, instead of doing the check in every driver and hitting a performance
>> cost even when swiotlb is not used, it's probably better to fix it in
>> swiotlb.
>>
>> @Tom Lendacky, do you think that it makes sense to harden swiotlb, or
>> might some other approach be better for the SEV features?
>
> I am not Tom, but this change seems the right way forward regardless of
> whether it is TDX, AMD SEV, or any other architecture that encrypts
> memory and uses SWIOTLB.

Sorry, I missed the @Tom before. I'm with Konrad and believe it makes sense
to add these checks. I'm not sure whether there would be a better approach
for all confidential computing technologies.

SWIOTLB works nicely, but it is limited because the bounce buffers have to
live in 32-bit compatible memory. Being able to have buffers above the
32-bit limit would alleviate that, but that support would have to be
developed.

Thanks,
Tom

> Let me queue it up in the development branch and do some regression
> testing.
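For contrast, the alternative Martin mentions, tracking both the dma_addr
and the length for each descriptor in the common virtio code, would amount
to bookkeeping roughly like the sketch below. The names (struct desc_record,
record_desc, complete_desc) are hypothetical, and as the thread points out
this would only cover virtio and would not protect drivers that pass
device-controlled values straight to dma_unmap*, which is why the
swiotlb-side check is the preferred fix.

/*
 * Hypothetical sketch of per-descriptor tracking in a virtio-like layer:
 * the driver side records the dma_addr and length it actually mapped, and
 * a device-reported completion is validated against that record. The
 * recorded values, never device-visible ones, are what would be handed to
 * dma_unmap*. Illustrative only; not the virtio ring implementation.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

struct desc_record {
	uint64_t     dma_addr;   /* address the driver mapped */
	unsigned int len;        /* length the driver mapped */
	bool         in_use;
};

#define RING_SIZE 8
static struct desc_record ring[RING_SIZE];

/* Remember what was programmed into descriptor 'idx' at map time. */
static void record_desc(unsigned int idx, uint64_t dma_addr, unsigned int len)
{
	ring[idx].dma_addr = dma_addr;
	ring[idx].len = len;
	ring[idx].in_use = true;
}

/*
 * Handle a completion. 'used_len' is reported by the device and must not
 * exceed what was recorded; the unmap address/length returned to the
 * caller come from the record, not from device-writable memory.
 */
static int complete_desc(unsigned int idx, unsigned int used_len,
			 uint64_t *unmap_addr, unsigned int *unmap_len)
{
	if (idx >= RING_SIZE || !ring[idx].in_use)
		return -1;
	if (used_len > ring[idx].len) {
		fprintf(stderr, "desc %u: device claims %u bytes, only %u mapped\n",
			idx, used_len, ring[idx].len);
		return -1;
	}
	*unmap_addr = ring[idx].dma_addr;
	*unmap_len = ring[idx].len;
	ring[idx].in_use = false;
	return 0;
}

int main(void)
{
	uint64_t addr;
	unsigned int len;

	record_desc(0, 0x1000, 512);
	if (complete_desc(0, 4096, &addr, &len))
		printf("oversized completion rejected\n");
	return 0;
}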