Alistair Popple
2022-Sep-27 02:06 UTC
[Nouveau] [PATCH 2/7] mm: Free device private pages have zero refcount
Jason Gunthorpe <jgg at nvidia.com> writes:

> On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
>> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
>> refcount") device private pages have no longer had an extra reference
>> count when the page is in use. However before handing them back to the
>> owning device driver we add an extra reference count such that free
>> pages have a reference count of one.
>>
>> This makes it difficult to tell if a page is free or not because both
>> free and in use pages will have a non-zero refcount. Instead we should
>> return pages to the drivers page allocator with a zero reference count.
>> Kernel code can then safely use kernel functions such as
>> get_page_unless_zero().
>>
>> Signed-off-by: Alistair Popple <apopple at nvidia.com>
>> ---
>>  arch/powerpc/kvm/book3s_hv_uvmem.c       | 1 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
>>  drivers/gpu/drm/nouveau/nouveau_dmem.c   | 1 +
>>  lib/test_hmm.c                           | 1 +
>>  mm/memremap.c                            | 5 -----
>>  mm/page_alloc.c                          | 6 ++++++
>>  6 files changed, 10 insertions(+), 5 deletions(-)
>
> I think this is a great idea, but I'm surprised no dax stuff is
> touched here?

free_zone_device_page() shouldn't be called for pgmap->type ==
MEMORY_DEVICE_FS_DAX so I don't think we should have to worry about DAX
there. Except that the folio code looks like it might have introduced a
bug. AFAICT put_page() always calls
put_devmap_managed_page(&folio->page) but folio_put() does not (although
folios_put() does!). So it seems folio_put() won't end up calling
__put_devmap_managed_page_refs() as I think it should.

I think you're right about the change to __init_zone_device_page() - I
should limit it to DEVICE_PRIVATE/COHERENT pages only. But I need to
look at Dan's patch series more closely as I suspect it might be better
to rebase this patch on top of that.

> Jason
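For readers following the refcount argument in the patch description quoted above, here is a minimal userspace sketch of why a zero refcount for free pages matters. It is not kernel code: `struct toy_page`, `toy_get_page_unless_zero()`, `toy_put_page()` and `toy_driver_alloc()` are hypothetical stand-ins built on C11 atomics, and the assumption that the driver raises the count from zero to one when it hands a page out is an illustration, not something taken from the patch itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct toy_page {
	atomic_int refcount;
};

/*
 * Take a reference only if the page is currently in use (refcount != 0).
 * This mimics the "unless zero" semantics: a free page is never revived.
 */
static bool toy_get_page_unless_zero(struct toy_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure the CAS reloads 'old', so the loop re-checks zero. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one (page now free). */
static bool toy_put_page(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
	return old == 1;
}

/* Assumed driver-side allocation: a free page goes from 0 to 1 reference. */
static void toy_driver_alloc(struct toy_page *p)
{
	assert(atomic_load(&p->refcount) == 0);
	atomic_store(&p->refcount, 1);
}
```

With free pages held at zero, `toy_get_page_unless_zero()` fails on a free page and succeeds on an allocated one, so the refcount alone distinguishes the two states. If free pages instead sat at a count of one, the same check would succeed in both cases, which is the ambiguity the patch description points out.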
Dan Williams
2022-Sep-29 20:18 UTC
[Nouveau] [PATCH 2/7] mm: Free device private pages have zero refcount
Alistair Popple wrote:
>
> Jason Gunthorpe <jgg at nvidia.com> writes:
>
> > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
> >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
> >> refcount") device private pages have no longer had an extra reference
> >> count when the page is in use. However before handing them back to the
> >> owning device driver we add an extra reference count such that free
> >> pages have a reference count of one.
> >>
> >> This makes it difficult to tell if a page is free or not because both
> >> free and in use pages will have a non-zero refcount. Instead we should
> >> return pages to the drivers page allocator with a zero reference count.
> >> Kernel code can then safely use kernel functions such as
> >> get_page_unless_zero().
> >>
> >> Signed-off-by: Alistair Popple <apopple at nvidia.com>
> >> ---
> >>  arch/powerpc/kvm/book3s_hv_uvmem.c       | 1 +
> >>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
> >>  drivers/gpu/drm/nouveau/nouveau_dmem.c   | 1 +
> >>  lib/test_hmm.c                           | 1 +
> >>  mm/memremap.c                            | 5 -----
> >>  mm/page_alloc.c                          | 6 ++++++
> >>  6 files changed, 10 insertions(+), 5 deletions(-)
> >
> > I think this is a great idea, but I'm surprised no dax stuff is
> > touched here?
>
> free_zone_device_page() shouldn't be called for pgmap->type ==
> MEMORY_DEVICE_FS_DAX so I don't think we should have to worry about DAX
> there. Except that the folio code looks like it might have introduced a
> bug. AFAICT put_page() always calls
> put_devmap_managed_page(&folio->page) but folio_put() does not (although
> folios_put() does!). So it seems folio_put() won't end up calling
> __put_devmap_managed_page_refs() as I think it should.
>
> I think you're right about the change to __init_zone_device_page() - I
> should limit it to DEVICE_PRIVATE/COHERENT pages only. But I need to
> look at Dan's patch series more closely as I suspect it might be better
> to rebase this patch on top of that.

Apologies for the delay, I was travelling the past few days.

Yes, I think this patch slots in nicely to avoid the introduction of an
init_mode [1]:

https://lore.kernel.org/nvdimm/166329940343.2786261.6047770378829215962.stgit@dwillia2-xfh.jf.intel.com/

Mind if I steal it into my series?