thr3ads.net - Nouveau - [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages [Feb 2025]


Jason Gunthorpe

2025-Feb-04 13:26 UTC

[RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote:
> > I would not be happy to see this. Please improve pagemap directly if
> > you think you need more things.
> 
> These are mainly helpers to migrate and populate a range of cpu memory
> space (struct mm_struct) with GPU device_private memory, migrate to
> system on gpu memory shortage and implement the migrate_to_vram pagemap
> op, tied to gpu device memory allocations, so I don't think there is
> anything we should be exposing at the dev_pagemap level at this point?
Maybe that belongs in mm/hmm then?
> > Neither really match the expected design here. The owner should be
> > entirely based on reachability. Devices that cannot reach each other
> > directly should have different owners.
> 
> Actually what I'm putting together is a small helper to allocate and
> assign an "owner" based on devices that are previously registered to a
> "registry". The caller has to indicate using a callback function for
> each struct device pair whether there is a fast interconnect available,
> and this is expected to be done at pagemap creation time, so I think
> this aligns with the above. Initially a "registry" (which is a list of
> device-owner pairs) will be driver-local, but could easily have a wider
> scope.
Yeah, that seems like a workable idea
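
A minimal userspace sketch of the registry idea quoted above may help make it concrete. It is not kernel code: `sketch_device`, `owner_registry`, `registry_add()` and the `same_fabric()` predicate are all hypothetical names invented for illustration, and the integer owner stands in for whatever opaque owner value a driver would actually use. The sketch assumes a newly registered device simply takes the owner of the first already-registered device it has a fast interconnect with, and does not try to merge owners when a device bridges two existing groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REGISTRY_MAX 16

/* Hypothetical stand-in for struct device; only an id is needed here. */
struct sketch_device {
	int id;
};

/* Caller-supplied predicate: is there a fast interconnect between a and b? */
typedef bool (*has_interconnect_fn)(const struct sketch_device *a,
				    const struct sketch_device *b);

/* One device-owner pair, as in the "registry" described in the thread. */
struct registry_entry {
	const struct sketch_device *dev;
	int owner;
};

struct owner_registry {
	struct registry_entry entries[REGISTRY_MAX];
	size_t count;
	int next_owner;		/* owners are positive integers in this sketch */
};

void registry_init(struct owner_registry *reg)
{
	reg->count = 0;
	reg->next_owner = 1;
}

/*
 * Register dev and return its owner, or -1 if the registry is full.
 * The device shares the owner of the first registered device it can
 * reach over a fast interconnect; otherwise it gets a fresh owner.
 */
int registry_add(struct owner_registry *reg, const struct sketch_device *dev,
		 has_interconnect_fn connected)
{
	int owner = 0;
	size_t i;

	assert(reg && dev && connected);
	if (reg->count == REGISTRY_MAX)
		return -1;

	for (i = 0; i < reg->count; i++) {
		if (connected(dev, reg->entries[i].dev)) {
			owner = reg->entries[i].owner;
			break;
		}
	}
	if (!owner)
		owner = reg->next_owner++;

	reg->entries[reg->count].dev = dev;
	reg->entries[reg->count].owner = owner;
	reg->count++;
	return owner;
}

/* Example predicate (an assumption): ids in the same decade share a fabric. */
bool same_fabric(const struct sketch_device *a, const struct sketch_device *b)
{
	return a->id / 10 == b->id / 10;
}
```

With this predicate, devices with ids 10 and 11 end up with the same owner while a device with id 20 gets a different one, which matches the "owner based on reachability" rule stated earlier in the thread.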
> This means we handle access control, unplug checks and similar at
> migration time, typically before hmm_range_fault(), and the role of
> hmm_range_fault() will be to hand over pfns whose backing memory is directly
> accessible to the device, else migrate to system.
Yes, that sounds right
> 1) Existing users would never use the callback. They can still rely on
> the owner check, only if that fails we check for callback existence.
> 2) By simply caching the result from the last checked dev_pagemap, most
> callback calls could typically be eliminated.
But then you are not in the locked region so your cache is racy and
invalid.
> 3) As mentioned before, a callback call would typically always be
> followed by either migration to ram or a page-table update. Compared to
> these, the callback overhead would IMO be unnoticeable.
Why? Surely the normal case should be a callback saying the memory can
be accessed?
> 4) pcie_p2p is already planning a dev_pagemap callback?
Yes, but it is not a racy validation callback, and it already is
creating a complicated lifecycle problem inside the exporting
driver.

Jason

Thomas Hellström

2025-Feb-04 14:29 UTC

[RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

On Tue, 2025-02-04 at 09:26 -0400, Jason Gunthorpe wrote:
> On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote:
> > 
> 
> > 1) Existing users would never use the callback. They can still rely on
> > the owner check, only if that fails we check for callback existence.
> > 2) By simply caching the result from the last checked dev_pagemap, most
> > callback calls could typically be eliminated.
> 
> But then you are not in the locked region so your cache is racy and
> invalid.
I'm not sure I follow? If a device private pfn handed back to the
caller is dependent on dev_pagemap A having a fast interconnect to the
client, then subsequent pfns in the same hmm_range_fault() call must be
able to make the same assumption (pagemap A having a fast
interconnect), else the whole result is invalid?
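
For readers following points 1) and 2), the lookup order being proposed can be sketched roughly as below. This is a userspace illustration only, with hypothetical names throughout (`sketch_pagemap`, `walk_cache`, `pagemap_accessible()`, `counting_reachable()`); none of it is existing dev_pagemap or HMM API. The cache is assumed to live for a single walk, and the sketch deliberately does not model the locking question raised in the reply above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pagemap. */
struct sketch_pagemap {
	const void *owner;
	/* Optional reachability callback; NULL for existing users. */
	bool (*can_access)(const struct sketch_pagemap *pgmap,
			   const void *client);
};

/* Result for the last pagemap checked, kept for the duration of one walk. */
struct walk_cache {
	const struct sketch_pagemap *last;
	bool last_result;
};

/*
 * Point 1): the owner check comes first and the callback is only
 * consulted if it fails. Point 2): the callback result for the most
 * recently checked pagemap is reused for subsequent pfns.
 */
bool pagemap_accessible(struct walk_cache *cache,
			const struct sketch_pagemap *pgmap,
			const void *client_owner, const void *client)
{
	assert(cache && pgmap);

	if (pgmap->owner == client_owner)
		return true;
	if (!pgmap->can_access)
		return false;
	if (cache->last == pgmap)
		return cache->last_result;

	cache->last = pgmap;
	cache->last_result = pgmap->can_access(pgmap, client);
	return cache->last_result;
}

/* Example callback that counts its invocations and always says yes. */
int sketch_callback_calls;

bool counting_reachable(const struct sketch_pagemap *pgmap, const void *client)
{
	(void)pgmap;
	(void)client;
	sketch_callback_calls++;
	return true;
}
```

In this sketch, checking the same non-owned pagemap for two consecutive pfns invokes the callback once, and a pagemap whose owner matches never reaches the callback at all.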
> 
> > 3) As mentioned before, a callback call would typically always be
> > followed by either migration to ram or a page-table update. Compared to
> > these, the callback overhead would IMO be unnoticeable.
> 
> Why? Surely the normal case should be a callback saying the memory can
> be accessed?
Sure, but at least on the xe driver, that means page-table repopulation
since the hmm_range_fault() typically originated from a page-fault.

> 
> > 4) pcie_p2p is already planning a dev_pagemap callback?
> 
> Yes, but it is not a racy validation callback, and it already is
> creating a complicated lifecycle problem inside the exporting
> driver.
Yeah, I bet there are various reasons against a callback. I just don't
see the performance argument being a main concern. 
> 
> Jason
/Thomas
