Once DMA API usage is enabled, it becomes apparent that virtio-mmio is
inadvertently relying on the default 32-bit DMA mask, which leads to
problems like rapidly exhausting SWIOTLB bounce buffers.

Ensure that we set the appropriate 64-bit DMA mask whenever possible,
with the coherent mask suitably limited for the legacy vring as per
a0be1db4304f ("virtio_pci: Limit DMA mask to 44 bits for legacy virtio
devices").

Reported-by: Jean-Philippe Brucker <jean-philippe.brucker at arm.com>
Fixes: b42111382f0e ("virtio_mmio: Use the DMA API if enabled")
Signed-off-by: Robin Murphy <robin.murphy at arm.com>
---
 drivers/virtio/virtio_mmio.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 48bfea91dbca..b5c5d49ca598 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -59,6 +59,7 @@
 #define pr_fmt(fmt) "virtio-mmio: " fmt
 
 #include <linux/acpi.h>
+#include <linux/dma-mapping.h>
 #include <linux/highmem.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
@@ -497,6 +498,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
 	struct virtio_mmio_device *vm_dev;
 	struct resource *mem;
 	unsigned long magic;
+	int rc;
 
 	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	if (!mem)
@@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device *pdev)
 	if (vm_dev->version == 1)
 		writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
 
+	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+	if (rc)
+		rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+	else if (vm_dev->version == 1)
+		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + PAGE_SHIFT));
+	if (rc)
+		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+
 	platform_set_drvdata(pdev, vm_dev);
 
 	return register_virtio_device(&vm_dev->vdev);
-- 
2.10.2.dirty

On Tuesday, January 10, 2017 12:26:01 PM CET Robin Murphy wrote:
> @@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>  	if (vm_dev->version == 1)
>  		writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
>
> +	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
> +	if (rc)
> +		rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));

You don't seem to do anything different when 64-bit DMA is unsupported.
How do you prevent the use of kernel buffers that are above the first 4G
here?

> +	else if (vm_dev->version == 1)
> +		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + PAGE_SHIFT));

Why is this limitation only for the coherent mask?

	Arnd

Russell King - ARM Linux
2017-Jan-10 13:25 UTC
[PATCH] virtio_mmio: Set DMA masks appropriately
On Tue, Jan 10, 2017 at 02:15:57PM +0100, Arnd Bergmann wrote:
> On Tuesday, January 10, 2017 12:26:01 PM CET Robin Murphy wrote:
> > @@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device *pdev)
> >  	if (vm_dev->version == 1)
> >  		writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
> >
> > +	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
> > +	if (rc)
> > +		rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
>
> You don't seem to do anything different when 64-bit DMA is unsupported.
> How do you prevent the use of kernel buffers that are above the first 4G
> here?
>
> > +	else if (vm_dev->version == 1)
> > +		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + PAGE_SHIFT));
>
> Why is this limitation only for the coherent mask?

It looks wrong for two reasons:

1. It is calling dma_set_coherent_mask(), so only the coherent mask is
   being updated.  What about streaming DMA?  Maybe include the comment
   from the commit you refer to (a0be1db4304f) which explains this, which
   would help reviewers understand why you're only changing the coherent
   mask.

2. It fails to check whether the coherent mask was accepted... which I
   guess is okay, as the coherent allocation mask won't be updated so you
   should get coherent memory below 4GB.  Nevertheless, drivers are
   expected to try setting a 32-bit coherent mask if setting a larger
   mask fails.  See examples in Documentation/DMA-API-HOWTO.txt.

Of course, if setting a 32-bit coherent mask fails, then the driver
should probably fail to initialise.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

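The fallback idiom described above (try a wide mask, retry with 32 bits, and fail initialisation if even that is refused) can be sketched in plain userspace C. This is a toy model under stated assumptions, not kernel code: `struct example_dev`, `example_set_mask_and_coherent()` and `example_probe_dma()` are hypothetical names, and the "platform accepts any mask up to max_bits wide" rule is a simplification of what the real DMA API checks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct example_dev {
	unsigned int max_bits;   /* widest mask this toy platform accepts */
	uint64_t dma_mask;       /* streaming DMA mask */
	uint64_t coherent_mask;  /* coherent allocation mask */
};

/* Mimics the shape of the kernel's DMA_BIT_MASK(n) for 1 <= n <= 64. */
static uint64_t example_bit_mask(unsigned int n)
{
	return (n >= 64) ? ~0ULL : ((1ULL << n) - 1);
}

/* Toy model: accept the mask only if the platform can address that width. */
static int example_set_mask_and_coherent(struct example_dev *dev,
					 unsigned int bits)
{
	assert(dev != NULL);
	if (bits > dev->max_bits)
		return -EIO;
	dev->dma_mask = example_bit_mask(bits);
	dev->coherent_mask = example_bit_mask(bits);
	return 0;
}

/*
 * Try 64-bit first, fall back to 32-bit, and report an error if both are
 * refused so that the caller can fail the probe instead of carrying on.
 */
static int example_probe_dma(struct example_dev *dev)
{
	int rc = example_set_mask_and_coherent(dev, 64);

	if (rc)
		rc = example_set_mask_and_coherent(dev, 32);
	return rc;
}
```

A caller would treat a non-zero return from `example_probe_dma()` as a reason to abort initialisation, which is the behaviour suggested in the review rather than what the posted patch does.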
On 10/01/17 13:15, Arnd Bergmann wrote:
> On Tuesday, January 10, 2017 12:26:01 PM CET Robin Murphy wrote:
>> @@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>>  	if (vm_dev->version == 1)
>>  		writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
>>
>> +	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
>> +	if (rc)
>> +		rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
>
> You don't seem to do anything different when 64-bit DMA is unsupported.
> How do you prevent the use of kernel buffers that are above the first 4G
> here?

That's the token "give up and rely on SWIOTLB/IOMMU" point, which as we
already know won't necessarily work very well (because it's already the
situation without this patch), but is still arguably better than
nothing. As I've just replied elsewhere, I personally hate this idiom,
but it's the done thing given the current DMA mask API.

>> +	else if (vm_dev->version == 1)
>> +		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + PAGE_SHIFT));
>
> Why is this limitation only for the coherent mask?

AIUI, the "32-bit pointers to pages" limitation of legacy virtio only
applies to the location of the vring itself, which is allocated via
dma_alloc_coherent - the descriptors themselves hold full 64-bit
addresses pointing at the actual data, which is mapped using streaming
DMA. It relies on the API guarantee that if we've managed to set a
64-bit streaming mask, then setting a smaller coherent mask cannot fail
(DMA-API-HOWTO.txt:257).

This is merely an amalgamation of the logic already in place for
virtio-pci, I just skimped on duplicating all the rationale (I know
there's a mail thread somewhere I could probably dig up).

Robin.

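As a worked illustration of where the 44-bit figure comes from: if the vring location is expressed as a 32-bit page number, the highest reachable byte address needs 32 + PAGE_SHIFT bits, which is 44 bits with 4 KiB pages. The sketch below is plain userspace C rather than kernel code; the 4 KiB page size is an assumption (PAGE_SHIFT depends on the kernel configuration), `example_dma_bit_mask()` only mimics the kernel's `DMA_BIT_MASK()` macro, and `example_legacy_vring_limit()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Other page sizes give a different limit. */
#define EXAMPLE_PAGE_SHIFT 12

/* Userspace stand-in for DMA_BIT_MASK(n); valid for 1 <= n <= 64. */
static uint64_t example_dma_bit_mask(unsigned int n)
{
	assert(n >= 1 && n <= 64);
	return (n == 64) ? ~0ULL : ((1ULL << n) - 1);
}

/*
 * Highest byte address reachable when the page frame number is limited
 * to 32 bits: the largest PFN shifted up, plus the last byte of that page.
 */
static uint64_t example_legacy_vring_limit(unsigned int page_shift)
{
	uint64_t max_pfn = UINT32_MAX;

	return (max_pfn << page_shift) | ((1ULL << page_shift) - 1);
}
```

With these definitions, `example_legacy_vring_limit(EXAMPLE_PAGE_SHIFT)` and `example_dma_bit_mask(32 + EXAMPLE_PAGE_SHIFT)` describe the same 44-bit range, which is why the coherent mask alone is narrowed while the streaming mask stays at 64 bits.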
On Tue, Jan 10, 2017 at 12:26:01PM +0000, Robin Murphy wrote:
> Once DMA API usage is enabled, it becomes apparent that virtio-mmio is
> inadvertently relying on the default 32-bit DMA mask, which leads to
> problems like rapidly exhausting SWIOTLB bounce buffers.
>
> Ensure that we set the appropriate 64-bit DMA mask whenever possible,
> with the coherent mask suitably limited for the legacy vring as per
> a0be1db4304f ("virtio_pci: Limit DMA mask to 44 bits for legacy virtio
> devices").
>
> Reported-by: Jean-Philippe Brucker <jean-philippe.brucker at arm.com>
> Fixes: b42111382f0e ("virtio_mmio: Use the DMA API if enabled")
> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
> ---
>  drivers/virtio/virtio_mmio.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index 48bfea91dbca..b5c5d49ca598 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -59,6 +59,7 @@
>  #define pr_fmt(fmt) "virtio-mmio: " fmt
>
>  #include <linux/acpi.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/highmem.h>
>  #include <linux/interrupt.h>
>  #include <linux/io.h>
> @@ -497,6 +498,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>  	struct virtio_mmio_device *vm_dev;
>  	struct resource *mem;
>  	unsigned long magic;
> +	int rc;
>
>  	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>  	if (!mem)
> @@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>  	if (vm_dev->version == 1)
>  		writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
>
> +	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
> +	if (rc)
> +		rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
> +	else if (vm_dev->version == 1)
> +		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + PAGE_SHIFT));

That's a very convoluted way to do this, for version 1 you
set coherent mask to 64 then override it.
why not

	if (vm_dev->version == 1) {
		dma_set_mask
		dma_set_coherent_mask
	} else {
		dma_set_mask_and_coherent
	}

	if (rc)
		dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));

> +	if (rc)
> +		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
> +

is there a chance it actually still might work?

>  	platform_set_drvdata(pdev, vm_dev);
>
>  	return register_virtio_device(&vm_dev->vdev);
> --
> 2.10.2.dirty

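The shorthand above leaves the arguments and return-value handling implicit. One way to spell out separate legacy and modern flows is sketched below as a userspace toy model; it is an interpretation, not the code that was eventually merged. All `example_*` names are hypothetical, 4 KiB pages are assumed, the "accept masks up to max_bits wide" rule simplifies the real DMA API, and checking the result of the coherent-mask call is an addition that the shorthand does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a device; not a kernel structure. */
struct example_dev {
	unsigned int max_bits;   /* widest mask this toy platform accepts */
	uint64_t dma_mask;       /* streaming DMA mask */
	uint64_t coherent_mask;  /* coherent allocation mask */
};

static uint64_t example_bit_mask(unsigned int n)
{
	return (n >= 64) ? ~0ULL : ((1ULL << n) - 1);
}

static int example_set_mask(struct example_dev *dev, unsigned int bits)
{
	if (bits > dev->max_bits)
		return -EIO;
	dev->dma_mask = example_bit_mask(bits);
	return 0;
}

static int example_set_coherent_mask(struct example_dev *dev,
				     unsigned int bits)
{
	if (bits > dev->max_bits)
		return -EIO;
	dev->coherent_mask = example_bit_mask(bits);
	return 0;
}

static int example_set_mask_and_coherent(struct example_dev *dev,
					 unsigned int bits)
{
	int rc = example_set_mask(dev, bits);

	return rc ? rc : example_set_coherent_mask(dev, bits);
}

/* Legacy (version 1) and modern devices take visibly separate paths. */
static int example_setup_masks(struct example_dev *dev, unsigned int version)
{
	int rc;

	assert(dev != NULL);
	if (version == 1) {
		/* Streaming DMA may use 64 bits; the vring is PFN-limited. */
		rc = example_set_mask(dev, 64);
		if (!rc)
			rc = example_set_coherent_mask(dev,
						       32 + EXAMPLE_PAGE_SHIFT);
	} else {
		rc = example_set_mask_and_coherent(dev, 64);
	}

	/* Common fallback when the wide masks are refused. */
	if (rc)
		rc = example_set_mask_and_coherent(dev, 32);
	return rc;
}
```

The trade-off is a few more lines in exchange for never setting a 64-bit coherent mask on a legacy device only to override it afterwards.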
On 10/01/17 14:54, Michael S. Tsirkin wrote:
> On Tue, Jan 10, 2017 at 12:26:01PM +0000, Robin Murphy wrote:
>> Once DMA API usage is enabled, it becomes apparent that virtio-mmio is
>> inadvertently relying on the default 32-bit DMA mask, which leads to
>> problems like rapidly exhausting SWIOTLB bounce buffers.
>>
>> Ensure that we set the appropriate 64-bit DMA mask whenever possible,
>> with the coherent mask suitably limited for the legacy vring as per
>> a0be1db4304f ("virtio_pci: Limit DMA mask to 44 bits for legacy virtio
>> devices").
>>
>> Reported-by: Jean-Philippe Brucker <jean-philippe.brucker at arm.com>
>> Fixes: b42111382f0e ("virtio_mmio: Use the DMA API if enabled")
>> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
>> ---
>>  drivers/virtio/virtio_mmio.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
>> index 48bfea91dbca..b5c5d49ca598 100644
>> --- a/drivers/virtio/virtio_mmio.c
>> +++ b/drivers/virtio/virtio_mmio.c
>> @@ -59,6 +59,7 @@
>>  #define pr_fmt(fmt) "virtio-mmio: " fmt
>>
>>  #include <linux/acpi.h>
>> +#include <linux/dma-mapping.h>
>>  #include <linux/highmem.h>
>>  #include <linux/interrupt.h>
>>  #include <linux/io.h>
>> @@ -497,6 +498,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>>  	struct virtio_mmio_device *vm_dev;
>>  	struct resource *mem;
>>  	unsigned long magic;
>> +	int rc;
>>
>>  	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>>  	if (!mem)
>> @@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>>  	if (vm_dev->version == 1)
>>  		writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
>>
>> +	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
>> +	if (rc)
>> +		rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
>> +	else if (vm_dev->version == 1)
>> +		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + PAGE_SHIFT));
>
> That's a very convoluted way to do this, for version 1 you
> set coherent mask to 64 then override it.
> why not
>
> 	if (vm_dev->version == 1) {
> 		dma_set_mask
> 		dma_set_coherent_mask
> 	} else {
> 		dma_set_mask_and_coherent
> 	}
>
> 	if (rc)
> 		dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));

Purely because it's fewer lines of code - if you'd prefer separate
legacy vs. modern flows for clarity that's fine by me.

>> +	if (rc)
>> +		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
>> +
>
> is there a chance it actually still might work?

If we're not actually using the DMA API, we may still get away with it,
otherwise it's a fairly sure bet that the subsequent dma_map/dma_alloc
calls will fail and we'll get nowhere. If I change this to be a probe
failure condition (and correspondingly in the PCI drivers too), would
you rather that be predicated on vring_use_dma_api(), or always (given
the TODO in virtio_ring.c)?

Robin.

>>  	platform_set_drvdata(pdev, vm_dev);
>>
>>  	return register_virtio_device(&vm_dev->vdev);
>> --
>> 2.10.2.dirty