Tomasz Figa
2020-Sep-25 18:46 UTC
[Nouveau] [PATCH 17/18] dma-iommu: implement ->alloc_noncoherent
Hi Christoph,

On Tue, Sep 15, 2020 at 05:51:21PM +0200, Christoph Hellwig wrote:
> Implement the alloc_noncoherent method to provide memory that is neither
> coherent nor contiguous.
>
> Signed-off-by: Christoph Hellwig <hch at lst.de>
> ---
>  drivers/iommu/dma-iommu.c | 41 +++++++++++++++++++++++++++++++++++----
>  1 file changed, 37 insertions(+), 4 deletions(-)
>

Sorry for being late to the party and thanks a lot for the patch. Please see my
comments inline.

[snip]
> @@ -1052,6 +1055,34 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>  	return cpu_addr;
>  }
>
> +#ifdef CONFIG_DMA_REMAP
> +static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
> +		dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp)
> +{
> +	if (!gfpflags_allow_blocking(gfp)) {
> +		struct page *page;
> +
> +		page = dma_common_alloc_pages(dev, size, handle, dir, gfp);
> +		if (!page)
> +			return NULL;
> +		return page_address(page);
> +	}
> +
> +	return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO,
> +				     PAGE_KERNEL, 0);

iommu_dma_alloc_remap() makes use of the DMA_ATTR_ALLOC_SINGLE_PAGES attribute
to optimize the allocations for devices which don't care about how contiguous
the backing memory is. Do you think we could add an attrs argument to this
function and pass it there?

As ARM is being moved to the common iommu-dma layer as well, we'll probably
make use of the argument to support the DMA_ATTR_NO_KERNEL_MAPPING attribute to
conserve the vmalloc area.

Best regards,
Tomasz
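The suggestion above, forwarding a caller-supplied attrs value on the blocking path instead of the hard-coded 0, can be sketched in userspace C. This is only a model of the control flow, not kernel code: every mock_* name, the MOCK_* flag values, and the result struct are hypothetical stand-ins invented for illustration, and the real function signature was still under discussion in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel flags; the bit values are arbitrary. */
#define MOCK_GFP_DIRECT_RECLAIM      (1u << 0)   /* caller is allowed to block */
#define MOCK_GFP_ZERO                (1u << 1)   /* zero the returned memory */
#define MOCK_ATTR_ALLOC_SINGLE_PAGES (1ul << 0)
#define MOCK_ATTR_NO_KERNEL_MAPPING  (1ul << 1)

enum mock_path {
	MOCK_PATH_CONTIGUOUS,   /* models the dma_common_alloc_pages() branch */
	MOCK_PATH_REMAP,        /* models the iommu_dma_alloc_remap() branch */
};

/* Records which branch was taken and what it would have been called with. */
struct mock_result {
	enum mock_path path;
	unsigned int gfp;
	unsigned long attrs;
};

static bool mock_allow_blocking(unsigned int gfp)
{
	return (gfp & MOCK_GFP_DIRECT_RECLAIM) != 0;
}

/*
 * Model of the proposed change: attrs is a new parameter that is passed
 * through to the remap branch rather than replaced by a literal 0.
 */
static struct mock_result mock_alloc_noncoherent(size_t size, unsigned int gfp,
						 unsigned long attrs)
{
	struct mock_result res = { MOCK_PATH_CONTIGUOUS, gfp, 0 };

	(void)size;   /* the size does not influence the branch in this model */

	/* Non-blocking callers take the contiguous branch; attrs is unused. */
	if (!mock_allow_blocking(gfp))
		return res;

	/* Blocking callers take the remap branch with zeroing requested. */
	res.path = MOCK_PATH_REMAP;
	res.gfp = gfp | MOCK_GFP_ZERO;
	res.attrs = attrs;
	return res;
}
```

In this model a driver that does not care about contiguity would pass MOCK_ATTR_ALLOC_SINGLE_PAGES together with a blocking gfp value and would see the attribute arrive unchanged on the remap branch.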
Christoph Hellwig
2020-Sep-26 14:14 UTC
[Nouveau] [PATCH 17/18] dma-iommu: implement ->alloc_noncoherent
On Fri, Sep 25, 2020 at 06:46:22PM +0000, Tomasz Figa wrote:
> > +static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
> > +		dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp)
> > +{
> > +	if (!gfpflags_allow_blocking(gfp)) {
> > +		struct page *page;
> > +
> > +		page = dma_common_alloc_pages(dev, size, handle, dir, gfp);
> > +		if (!page)
> > +			return NULL;
> > +		return page_address(page);
> > +	}
> > +
> > +	return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO,
> > +				     PAGE_KERNEL, 0);
>
> iommu_dma_alloc_remap() makes use of the DMA_ATTR_ALLOC_SINGLE_PAGES attribute
> to optimize the allocations for devices which don't care about how contiguous
> the backing memory is. Do you think we could add an attrs argument to this
> function and pass it there?
>
> As ARM is being moved to the common iommu-dma layer as well, we'll probably
> make use of the argument to support the DMA_ATTR_NO_KERNEL_MAPPING attribute to
> conserve the vmalloc area.

We could probably add it. However, I wonder why this is something the
drivers should care about. Isn't this really something that should
be a kernel-wide policy for a given system?
Tomasz Figa
2020-Sep-26 15:25 UTC
[Nouveau] [PATCH 17/18] dma-iommu: implement ->alloc_noncoherent
On Sat, Sep 26, 2020 at 4:14 PM Christoph Hellwig <hch at lst.de> wrote:
>
> On Fri, Sep 25, 2020 at 06:46:22PM +0000, Tomasz Figa wrote:
> > > +static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
> > > +		dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp)
> > > +{
> > > +	if (!gfpflags_allow_blocking(gfp)) {
> > > +		struct page *page;
> > > +
> > > +		page = dma_common_alloc_pages(dev, size, handle, dir, gfp);
> > > +		if (!page)
> > > +			return NULL;
> > > +		return page_address(page);
> > > +	}
> > > +
> > > +	return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO,
> > > +				     PAGE_KERNEL, 0);
> >
> > iommu_dma_alloc_remap() makes use of the DMA_ATTR_ALLOC_SINGLE_PAGES attribute
> > to optimize the allocations for devices which don't care about how contiguous
> > the backing memory is. Do you think we could add an attrs argument to this
> > function and pass it there?
> >
> > As ARM is being moved to the common iommu-dma layer as well, we'll probably
> > make use of the argument to support the DMA_ATTR_NO_KERNEL_MAPPING attribute to
> > conserve the vmalloc area.
>
> We could probably add it. However, I wonder why this is something the
> drivers should care about. Isn't this really something that should
> be a kernel-wide policy for a given system?

There are IOMMUs out there which support huge pages, and those can benefit
*some* hardware depending on what kind of accesses it performs, possibly on a
per-buffer basis. At the same time, order > 0 allocations can be expensive,
significantly affecting allocation latency, so for devices which don't care
about huge pages one would prefer simple single-page allocations.

Currently the drivers know best whether the hardware they drive would care.
There are some decision factors listed in the documentation [1].

I can imagine cases where drivers might not be in the best position to decide
about this. For example, the workload could vary depending on the userspace,
or there could be a product decision regarding performance vs. allocation
latency. We haven't seen such cases in practice yet, though.

[1] https://www.kernel.org/doc/html/latest/core-api/dma-attributes.html?highlight=dma_attr_alloc_single_pages#dma-attr-alloc-single-pages

Best regards,
Tomasz
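The trade-off described above, fewer but larger chunks versus many cheap single-page allocations, can be made concrete with a small userspace model. The helper below is hypothetical and is not the kernel's allocator: it only counts how many power-of-two chunks a greedy split of a buffer produces for a given maximum order, and it ignores allocation failures and fallbacks, which a real allocator has to handle. A max_order of 0 models single-page allocations.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the chunks produced when a buffer of nr_pages
 * pages is split greedily into power-of-two chunks of at most
 * (1 << max_order) pages each. max_order must be smaller than the bit
 * width of size_t.
 */
static size_t count_chunks(size_t nr_pages, unsigned int max_order)
{
	size_t chunks = 0;

	while (nr_pages) {
		unsigned int order = max_order;

		/*
		 * Shrink the order until the chunk fits in what is left.
		 * Order 0 (one page) always fits because nr_pages >= 1 here,
		 * so order never drops below zero.
		 */
		while (((size_t)1 << order) > nr_pages)
			order--;

		nr_pages -= (size_t)1 << order;
		chunks++;
	}

	return chunks;
}
```

With 4 KiB pages, a 1025-page buffer split with max_order 9 (2 MiB chunks) becomes three chunks of 512, 512 and 1 pages, while max_order 0 yields 1025 separate pages. Fewer chunks mean fewer, larger IOMMU mappings, but each order > 0 chunk is the kind of allocation whose latency cost the message above warns about.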