Gerd Hoffmann
2019-Sep-02 05:28 UTC
[PATCH] drm/virtio: Use vmalloc for command buffer allocations.
On Fri, Aug 30, 2019 at 10:49:25AM -0700, David Riley wrote:
> Hi Gerd,
>
> On Fri, Aug 30, 2019 at 4:16 AM Gerd Hoffmann <kraxel at redhat.com> wrote:
> >
> >   Hi,
> >
> > > > > - kfree(vbuf->data_buf);
> > > > > + kvfree(vbuf->data_buf);
> > > >
> > > > if (is_vmalloc_addr(vbuf->data_buf)) ...
> > > >
> > > > needed here I guess?
> > > >
> > >
> > > kvfree() handles vmalloc/kmalloc/kvmalloc internally by doing that check.
> >
> > Ok.
> >
> > > - videobuf_vmalloc_to_sg in drivers/media/v4l2-core/videobuf-dma-sg.c,
> > > assumes contiguous array of scatterlist and that the buffer being converted
> > > is page aligned
> >
> > Well, vmalloc memory _is_ page aligned.
>
> True, but this function gets called for all potential enqueuings (eg
> resource_create_3d, resource_attach_backing) and I was concerned that
> some other usage in the future might not have that guarantee.

The vmalloc_to_sg call is wrapped into "if (is_vmalloc())", so this
should not be a problem.

> > sg_alloc_table_from_pages() does a lot of what you need, you just need a
> > small loop around vmalloc_to_page() to create a struct page array
> > beforehand.
>
> That feels like an extra allocation when under memory pressure and
> more work, to not gain much -- there still needs to be a function that
> iterates through all the pages. But I don't feel super strongly about
> it and can change it if you think that it will be less maintenance
> overhead.

Let's see what vmalloc_to_sg looks like when it assumes page-aligned
memory. It's probably noticeably shorter then.

cheers,
  Gerd
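For readers following the kvfree() exchange above, here is a minimal
userspace sketch of the dispatch being described: check whether the
address lies in the vmalloc range, then pick the matching free routine.
It is a model, not kernel code. The model_* names and the range bounds
are made up for illustration; in the kernel the check is
is_vmalloc_addr() against VMALLOC_START/VMALLOC_END.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: made-up bounds standing in for VMALLOC_START/VMALLOC_END. */
#define MODEL_VMALLOC_START ((uintptr_t)0x10000000u)
#define MODEL_VMALLOC_END   ((uintptr_t)0x20000000u)

/* Which free routine the model would pick for an address. */
enum model_free_path { MODEL_KFREE, MODEL_VFREE };

/* Hypothetical model of is_vmalloc_addr(): a half-open range check. */
static int model_is_vmalloc_addr(uintptr_t addr)
{
	return addr >= MODEL_VMALLOC_START && addr < MODEL_VMALLOC_END;
}

/* Hypothetical model of the decision kvfree() makes for the caller. */
static enum model_free_path model_kvfree_path(uintptr_t addr)
{
	return model_is_vmalloc_addr(addr) ? MODEL_VFREE : MODEL_KFREE;
}
```

Because the check lives inside the free routine, the caller in the patch
can switch from kfree() to kvfree() without adding its own
is_vmalloc_addr() test.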
David Riley
2019-Sep-03 20:27 UTC
[PATCH] drm/virtio: Use vmalloc for command buffer allocations.
On Sun, Sep 1, 2019 at 10:28 PM Gerd Hoffmann <kraxel at redhat.com> wrote:
>
> On Fri, Aug 30, 2019 at 10:49:25AM -0700, David Riley wrote:
> > Hi Gerd,
> >
> > On Fri, Aug 30, 2019 at 4:16 AM Gerd Hoffmann <kraxel at redhat.com> wrote:
> > >
> > >   Hi,
> > >
> > > > > > - kfree(vbuf->data_buf);
> > > > > > + kvfree(vbuf->data_buf);
> > > > >
> > > > > if (is_vmalloc_addr(vbuf->data_buf)) ...
> > > > >
> > > > > needed here I guess?
> > > > >
> > > >
> > > > kvfree() handles vmalloc/kmalloc/kvmalloc internally by doing that check.
> > >
> > > Ok.
> > >
> > > > - videobuf_vmalloc_to_sg in drivers/media/v4l2-core/videobuf-dma-sg.c,
> > > > assumes contiguous array of scatterlist and that the buffer being converted
> > > > is page aligned
> > >
> > > Well, vmalloc memory _is_ page aligned.
> >
> > True, but this function gets called for all potential enqueuings (eg
> > resource_create_3d, resource_attach_backing) and I was concerned that
> > some other usage in the future might not have that guarantee.
>
> The vmalloc_to_sg call is wrapped into "if (is_vmalloc())", so this
> should not be a problem.
>
> > > sg_alloc_table_from_pages() does a lot of what you need, you just need a
> > > small loop around vmalloc_to_page() to create a struct page array
> > > beforehand.
> >
> > That feels like an extra allocation when under memory pressure and
> > more work, to not gain much -- there still needs to be a function that
> > iterates through all the pages. But I don't feel super strongly about
> > it and can change it if you think that it will be less maintenance
> > overhead.
>
> Let's see what vmalloc_to_sg looks like when it assumes page-aligned
> memory. It's probably noticeably shorter then.

Not really. The table allocation is one entry smaller, and the code no
longer has to account for the data starting at an offset within a page.
It still needs error handling, partial final page handling, and marking
of the end of the scatterlist.

Things could be simplified slightly by assuming that the table can
always be allocated contiguously, instead of using
sg_alloc_table/for_each_sg. But given that we only go down this path as
a fallback when memory is fragmented, that trade-off doesn't seem
worthwhile.

I've written a different version of vmalloc_to_sgt which uses
sg_alloc_table_from_pages under the covers. It comes in slightly shorter
(39 lines vs 55 lines) but incurs another allocation, as noted
previously, so I'm personally in favour of things as written.
fpga_mgr_buf_load is another function which does roughly the same sort
of operation, and it's a bit longer.

I'll post a v2 shortly, but if you think the extra allocation of the
pages array is worth it, I can post that version instead.

> cheers,
>   Gerd
>
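To make the trade-off above concrete, here is a small userspace sketch
of the per-page splitting that such a conversion has to do. It is not
the vmalloc_to_sgt from the patch. The model_* names, the struct layout
and the 4 KiB page size are assumptions for illustration only. It shows
one plausible reading of the "one entry" difference (a buffer starting
mid-page can span one extra page) and the partial final page and
end-marker handling that remain even when the buffer is page aligned.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical stand-in for one scatterlist entry. */
struct model_sg {
	size_t page_index; /* page of the buffer this entry covers */
	size_t offset;     /* start offset within that page */
	size_t length;     /* bytes used in that page */
	int is_last;       /* models the end-of-list marker */
};

/* Entries needed for `size` bytes starting `offset` bytes into a page. */
static size_t model_sg_count(size_t offset, size_t size)
{
	size_t off = offset % MODEL_PAGE_SIZE;

	return (off + size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/*
 * Fill `sgs` (capacity `cap`) with one entry per page touched.
 * Returns the number of entries written, or -1 if size is zero or the
 * table is too small (the error handling that cannot be dropped).
 */
static long model_fill_sg(struct model_sg *sgs, size_t cap,
			  size_t offset, size_t size)
{
	size_t off = offset % MODEL_PAGE_SIZE;
	size_t n = model_sg_count(offset, size);
	size_t i;

	assert(sgs != NULL);
	if (size == 0 || n > cap)
		return -1;

	for (i = 0; i < n; i++) {
		size_t room = MODEL_PAGE_SIZE - off;
		size_t chunk = size < room ? size : room;

		sgs[i].page_index = i;
		sgs[i].offset = off;
		sgs[i].length = chunk;          /* final page may be partial */
		sgs[i].is_last = (i == n - 1);  /* mark end of the list */
		size -= chunk;
		off = 0; /* only the first page can start mid-page */
	}
	return (long)n;
}
```

With offset fixed at 0 the `off` bookkeeping disappears, but the loop,
the short final chunk and the end marker all stay, which matches the
observation that the page-aligned version is not much shorter.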