On 2021/4/23 4:09 PM, Jason Wang wrote:
> Hi:
>
> Sometimes, the driver doesn't trust the device. This usually
> happens for an encrypted VM or VDUSE[1]. In both cases, technology
> like swiotlb is used to prevent the poking/mangling of memory from the
> device. But this is not sufficient since the current virtio driver may
> trust what is stored in the descriptor table (coherent mapping) for
> performing DMA operations like unmap and bounce, so the device may
> choose to utilize the behaviour of swiotlb to perform attacks[2].
>
> To protect from a malicious device, this series stores and uses the
> descriptor metadata in an auxiliary structure which cannot be accessed
> via swiotlb instead of the ones in the descriptor table. This means
> the descriptor table is write-only from the view of the driver.
>
> Actually, we've almost achieved that through packed virtqueue and we
> just need to fix a corner case of handling mapping errors. For split
> virtqueue we just follow what's done in the packed.
>
> Note that we don't duplicate descriptor metadata for indirect
> descriptors since they use a streaming mapping which is read only, so
> it's safe if the metadata of non-indirect descriptors is correct.
>
> For split virtqueue, the change increases the footprint due to the
> auxiliary metadata but it's almost negligible in simple tests
> like pktgen or netperf.
>
> Lightly tested with packed on/off, iommu on/off, swiotlb force/off in
> the guest.
>
> Please review.
>
> Changes from V1:
> - Always use auxiliary metadata for split virtqueue
> - Don't read from descriptor when detaching indirect descriptor

Hi Michael:

Our QE sees no regression on the perf test for 10G but some regressions
(5%-10%) on a 40G card.

I think this is expected since we increase the footprint. Are you OK with
this, so that we can try to optimize on top, or do you have other ideas?

Thanks

>
> [1]
> https://lore.kernel.org/netdev/fab615ce-5e13-a3b3-3715-a4203b4ab010@redhat.com/T/
> [2]
> https://yhbt.net/lore/all/c3629a27-3590-1d9f-211b-c0b7be152b32@redhat.com/T/#mc6b6e2343cbeffca68ca7a97e0f473aaa871c95b
>
> Jason Wang (7):
>   virtio-ring: maintain next in extra state for packed virtqueue
>   virtio_ring: rename vring_desc_extra_packed
>   virtio-ring: factor out desc_extra allocation
>   virtio_ring: secure handling of mapping errors
>   virtio_ring: introduce virtqueue_desc_add_split()
>   virtio: use err label in __vring_new_virtqueue()
>   virtio-ring: store DMA metadata in desc_extra for split virtqueue
>
>  drivers/virtio/virtio_ring.c | 201 +++++++++++++++++++++++++----------
>  1 file changed, 144 insertions(+), 57 deletions(-)
>
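To make the idea in the cover letter concrete, here is a minimal user-space sketch. It is not code from the series. The struct layouts, `RING_SIZE`, and the helpers `ring_add()`, `unmap_from_table()` and `unmap_from_extra()` are all assumptions made for illustration; only the name `desc_extra` is borrowed from the patch titles. The point it tries to show is that the unmap parameters come from a driver-private copy, so whatever the device writes into the shared table is never read back.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 4

/* Device-visible descriptor, modelled on struct vring_desc
 * (plain integers replace the kernel's __virtio64/__virtio32 types). */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/* Driver-private shadow of the same fields (assumed layout). */
struct desc_extra {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

struct ring {
    struct desc table[RING_SIZE];        /* shared with the untrusted device */
    struct desc_extra extra[RING_SIZE];  /* never exposed to the device */
};

/* The arguments the driver would pass to its DMA unmap call. */
struct unmap_req {
    uint64_t addr;
    uint32_t len;
};

/* Publish a buffer: write the descriptor and keep a private copy. */
void ring_add(struct ring *r, uint16_t i, uint64_t addr, uint32_t len)
{
    assert(i < RING_SIZE);
    r->table[i] = (struct desc){ .addr = addr, .len = len };
    r->extra[i] = (struct desc_extra){ .addr = addr, .len = len };
}

/* Old behaviour: unmap parameters are read back from the shared table. */
struct unmap_req unmap_from_table(const struct ring *r, uint16_t i)
{
    assert(i < RING_SIZE);
    return (struct unmap_req){ r->table[i].addr, r->table[i].len };
}

/* Behaviour described above: only the private copy is consulted. */
struct unmap_req unmap_from_extra(const struct ring *r, uint16_t i)
{
    assert(i < RING_SIZE);
    return (struct unmap_req){ r->extra[i].addr, r->extra[i].len };
}
```

If the device overwrites `table[0]` after `ring_add()`, `unmap_from_table()` returns the tampered address and length, while `unmap_from_extra()` still returns what the driver originally mapped. In this model the shadow array is as large as the descriptor table itself, which is the kind of footprint increase the cover letter mentions for split virtqueue.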
Michael S. Tsirkin
2021-May-06 08:12 UTC
[RFC PATCH V2 0/7] Do not read from descripto ring
On Thu, May 06, 2021 at 11:20:30AM +0800, Jason Wang wrote:
> Hi Michael:
>
> Our QE sees no regression on the perf test for 10G but some regressions
> (5%-10%) on a 40G card.
>
> I think this is expected since we increase the footprint. Are you OK with
> this, so that we can try to optimize on top, or do you have other ideas?
>
> Thanks

Let's try to optimize for just a bit; this won't make this merge window
anyway:

I have an old idea. Add a way to find out that unmap is a nop
(or, more exactly, that it does not use the address/length).
Then in that case, even with the DMA API, we do not need
the extra data. Hmm?
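One way to picture the suggestion above is the user-space sketch below. Everything in it is hypothetical: `struct dma_backend`, `unmap_is_nop()`, `struct vq_model` and `vq_add()` are names invented for illustration and are not existing kernel interfaces. It only models the control flow: when the backend's unmap ignores address and length, the driver can skip writing the shadow metadata altogether.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VQ_SIZE 4

/* Hypothetical description of a DMA backend. */
struct dma_backend {
    /* true when unmap needs addr/len, e.g. a bounce-buffering backend */
    bool unmap_uses_addr_len;
};

/* Hypothetical query: is unmap a nop as far as addr/len are concerned? */
bool unmap_is_nop(const struct dma_backend *b)
{
    return !b->unmap_uses_addr_len;
}

/* Shadow metadata that would otherwise be kept per descriptor. */
struct extra {
    uint64_t addr;
    uint32_t len;
};

struct vq_model {
    const struct dma_backend *dma;
    struct extra extra[VQ_SIZE];
    unsigned int extra_writes;   /* counts shadow updates, for illustration */
};

/* Add a buffer; record shadow metadata only if unmap will need it. */
void vq_add(struct vq_model *vq, unsigned int i, uint64_t addr, uint32_t len)
{
    assert(i < VQ_SIZE);
    if (unmap_is_nop(vq->dma))
        return;                  /* nothing would ever read the shadow copy */
    vq->extra[i] = (struct extra){ .addr = addr, .len = len };
    vq->extra_writes++;
}
```

With a backend whose unmap ignores its arguments, `vq_add()` touches no extra memory, which is where the saving would come from. With a bounce-buffering backend the shadow copy is still written, so the protection described in the cover letter is kept in the case that needs it.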