thr3ads.net - Nouveau - [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm

If this information is useful, please help other people find it:
Share via:

Christian König

2023-Nov-10 08:50 UTC

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

Am 09.11.23 um 19:34 schrieb Danilo Krummrich:> On 11/9/23 17:03, Christian K?nig wrote:
>> Am 09.11.23 um 16:50 schrieb Thomas Hellstr?m:
>>> [SNIP]
>>>>>
>>> Did we get any resolution on this?
>>>
>>> FWIW, my take on this is that it would be possible to get GPUVM to 
>>> work both with and without internal refcounting; If with, the
driver
>>> needs a vm close to resolve cyclic references, if without
that's not
>>> necessary. If GPUVM is allowed to refcount in mappings and vm_bos, 
>>> that comes with a slight performance drop but as Danilo pointed
out,
>>> the VM lifetime problem iterating over a vm_bo's mapping
becomes
>>> much easier and the code thus becomes easier to maintain moving 
>>> forward. That convinced me it's a good thing.
>>
>> I strongly believe you guys stumbled over one of the core problems 
>> with the VM here and I think that reference counting is the right 
>> answer to solving this.
>>
>> The big question is that what is reference counted and in which 
>> direction does the dependency points, e.g. we have here VM, BO, BO_VM 
>> and Mapping objects.
>>
>> Those patches here suggest a counted Mapping -> VM reference and
I'm
>> pretty sure that this isn't a good idea. What we should rather
really
>> have is a BO -> VM or BO_VM ->VM reference. In other words that
each
>> BO which is part of the VM keeps a reference to the VM.
>
> We have both. Please see the subsequent patch introducing VM_BO 
> structures for that.
>
> As I explained, mappings (struct drm_gpuva) keep a pointer to their VM 
> they're mapped
> in and besides that it doesn't make sense to free a VM that still 
> contains mappings,
> the reference count ensures that. This simply ensures memory safety.
>
>>
>> BTW: At least in amdgpu we can have BOs which (temporary) doesn't 
>> have any mappings, but are still considered part of the VM.
>
> That should be possible.
>
>>
>>>
>>> Another issue Christian brought up is that something intended to be
>>> embeddable (a base class) shouldn't really have its own
refcount. I
>>> think that's a valid point. If you at some point need to derive
from
>>> multiple such structs each having its own refcount, things will 
>>> start to get weird. One way to resolve that would be to have the 
>>> driver's subclass provide get() and put() ops, and export a 
>>> destructor for the base-class, rather than to have the base-class 
>>> provide the refcount and a destructor? ops.
>
> GPUVM simply follows the same pattern we have with drm_gem_objects. 
> And I think it makes
> sense. Why would we want to embed two struct drm_gpuvm in a single 
> driver structure?
Because you need one drm_gpuvm structure for each application using the 
driver? Or am I missing something?

As far as I can see a driver would want to embed that into your fpriv 
structure which is allocated during drm_driver.open callback.
>
>>
>> Well, I have never seen stuff like that in the kernel. Might be that 
>> this works, but I would rather not try if avoidable.
>>
>>>
>>> That would also make it possible for the driver to decide the 
>>> context for the put() call: If the driver needs to be able to call 
>>> put() from irq / atomic context but the base-class'es
destructor
>>> doesn't allow atomic context, the driver can push freeing out
to a
>>> work item if needed.
>>>
>>> Finally, the refcount overflow Christian pointed out. Limiting the 
>>> number of mapping sounds like a reasonable remedy to me.
>>
>> Well that depends, I would rather avoid having a dependency for 
>> mappings.
>>
>> Taking the CPU VM handling as example as far as I know 
>> vm_area_structs doesn't grab a reference to their mm_struct either.
>> Instead they get automatically destroyed when the mm_struct is 
>> destroyed.
>
> Certainly, that would be possible. However, thinking about it, this 
> might call for
> huge trouble.
>
> First of all, we'd still need to reference count a GPUVM and take a 
> reference for each
> VM_BO, as we do already. Now instead of simply increasing the 
> reference count for each
> mapping as well, we'd need a *mandatory* driver callback that is 
> called when the GPUVM
> reference count drops to zero. Maybe something like vm_destroy().
>
> The reason is that GPUVM can't just remove all mappings from the tree 
> nor can it free them
> by itself, since drivers might use them for tracking their allocated 
> page tables and/or
> other stuff.
>
> Now, let's think about the scope this callback might be called from. 
> When a VM_BO is destroyed
> the driver might hold a couple of locks (for Xe it would be the VM's 
> shared dma-resv lock and
> potentially the corresponding object's dma-resv lock if they're not
> the same already). If
> destroying this VM_BO leads to the VM being destroyed, the drivers 
> vm_destroy() callback would
> be called with those locks being held as well.
>
> I feel like doing this finally opens the doors of the locking hell 
> entirely. I think we should
> really avoid that.
That's a really good point, but I fear exactly that's the use case.

I would expect that VM_BO structures are added in the 
drm_gem_object_funcs.open callback and freed in drm_gem_object_funcs.close.

Since it is perfectly legal for userspace to close a BO while there are 
still mappings (can trivial be that the app is killed) I would expect 
that the drm_gem_object_funcs.close handling is something like asking 
drm_gpuvm destroying the VM_BO and getting the mappings which should be 
cleared in the page table in return.

In amdgpu we even go a step further and the VM structure keeps track of 
all the mappings of deleted VM_BOs so that higher level can query those 
and clear them later on.

Background is that the drm_gem_object_funcs.close can't fail, but it can 
perfectly be that the app is killed because of an OOM situation and we 
can't do page tables updates in that moment because of this.
>
>>
>> Which makes sense in that case because when the mm_struct is gone the 
>> vm_area_struct doesn't make sense any more either.
>>
>> What we clearly need is a reference to prevent the VM or at least the 
>> shared resv to go away to early.
>
> Yeah, that was a good hint and we've covered that.
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> But I think all of this is fixable as follow-ups if needed, unless 
>>> I'm missing something crucial.
>
> Fully agree, I think at this point we should go ahead and land this 
> series.
Yeah, agree this is not UAPI so not nailed in stone. Feel free to add my 
acked-by as well if you want.

Only keep in mind that when you give drivers some functionality in a 
common component they usually expect to keep that functionality.

For example changing the dma_resv object to make sure that drivers can't 
cause use after free errors any more was an extremely annoying 
experience since every user of those interface had to change at once.

Regards,
Christian.
>
>>>
>>> Just my 2 cents.
>>>
>>> /Thomas
>>>
>>>
>>
>

Thomas Hellström

2023-Nov-10 09:39 UTC

head link

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

On 11/10/23 09:50, Christian K?nig wrote:> Am 09.11.23 um 19:34 schrieb Danilo Krummrich:
>> On 11/9/23 17:03, Christian K?nig wrote:
>>> Am 09.11.23 um 16:50 schrieb Thomas Hellstr?m:
>>>> [SNIP]
>>>>>>
>>>> Did we get any resolution on this?
>>>>
>>>> FWIW, my take on this is that it would be possible to get GPUVM
to
>>>> work both with and without internal refcounting; If with, the 
>>>> driver needs a vm close to resolve cyclic references, if
without
>>>> that's not necessary. If GPUVM is allowed to refcount in
mappings
>>>> and vm_bos, that comes with a slight performance drop but as
Danilo
>>>> pointed out, the VM lifetime problem iterating over a
vm_bo's
>>>> mapping becomes much easier and the code thus becomes easier to
>>>> maintain moving forward. That convinced me it's a good
thing.
>>>
>>> I strongly believe you guys stumbled over one of the core problems 
>>> with the VM here and I think that reference counting is the right 
>>> answer to solving this.
>>>
>>> The big question is that what is reference counted and in which 
>>> direction does the dependency points, e.g. we have here VM, BO, 
>>> BO_VM and Mapping objects.
>>>
>>> Those patches here suggest a counted Mapping -> VM reference and
I'm
>>> pretty sure that this isn't a good idea. What we should rather 
>>> really have is a BO -> VM or BO_VM ->VM reference. In other
words
>>> that each BO which is part of the VM keeps a reference to the VM.
>>
>> We have both. Please see the subsequent patch introducing VM_BO 
>> structures for that.
>>
>> As I explained, mappings (struct drm_gpuva) keep a pointer to their 
>> VM they're mapped
>> in and besides that it doesn't make sense to free a VM that still 
>> contains mappings,
>> the reference count ensures that. This simply ensures memory safety.
>>
>>>
>>> BTW: At least in amdgpu we can have BOs which (temporary)
doesn't
>>> have any mappings, but are still considered part of the VM.
>>
>> That should be possible.
>>
>>>
>>>>
>>>> Another issue Christian brought up is that something intended
to be
>>>> embeddable (a base class) shouldn't really have its own
refcount. I
>>>> think that's a valid point. If you at some point need to
derive
>>>> from multiple such structs each having its own refcount, things
>>>> will start to get weird. One way to resolve that would be to
have
>>>> the driver's subclass provide get() and put() ops, and
export a
>>>> destructor for the base-class, rather than to have the
base-class
>>>> provide the refcount and a destructor? ops.
>>
>> GPUVM simply follows the same pattern we have with drm_gem_objects. 
>> And I think it makes
>> sense. Why would we want to embed two struct drm_gpuvm in a single 
>> driver structure?
>
> Because you need one drm_gpuvm structure for each application using 
> the driver? Or am I missing something?
>
> As far as I can see a driver would want to embed that into your fpriv 
> structure which is allocated during drm_driver.open callback.
I was thinking more of the general design of a base-class that needs to 
be refcounted. Say a driver vm that inherits from gpu-vm, gem_object and 
yet another base-class that supplies its own refcount. What's the 
best-practice way to do refcounting? All base-classes supplying a 
refcount of its own, or the subclass supplying a refcount and the 
base-classes supply destroy helpers.

But to be clear this is nothing I see needing urgent attention.
>
>>
>>>
>>> Well, I have never seen stuff like that in the kernel. Might be
that
>>> this works, but I would rather not try if avoidable.
>>>
>>>>
>>>> That would also make it possible for the driver to decide the 
>>>> context for the put() call: If the driver needs to be able to
call
>>>> put() from irq / atomic context but the base-class'es
destructor
>>>> doesn't allow atomic context, the driver can push freeing
out to a
>>>> work item if needed.
>>>>
>>>> Finally, the refcount overflow Christian pointed out. Limiting
the
>>>> number of mapping sounds like a reasonable remedy to me.
>>>
>>> Well that depends, I would rather avoid having a dependency for 
>>> mappings.
>>>
>>> Taking the CPU VM handling as example as far as I know 
>>> vm_area_structs doesn't grab a reference to their mm_struct
either.
>>> Instead they get automatically destroyed when the mm_struct is 
>>> destroyed.
>>
>> Certainly, that would be possible. However, thinking about it, this 
>> might call for
>> huge trouble.
>>
>> First of all, we'd still need to reference count a GPUVM and take a
>> reference for each
>> VM_BO, as we do already. Now instead of simply increasing the 
>> reference count for each
>> mapping as well, we'd need a *mandatory* driver callback that is 
>> called when the GPUVM
>> reference count drops to zero. Maybe something like vm_destroy().
>>
>> The reason is that GPUVM can't just remove all mappings from the
tree
>> nor can it free them
>> by itself, since drivers might use them for tracking their allocated 
>> page tables and/or
>> other stuff.
>>
>> Now, let's think about the scope this callback might be called
from.
>> When a VM_BO is destroyed
>> the driver might hold a couple of locks (for Xe it would be the
VM's
>> shared dma-resv lock and
>> potentially the corresponding object's dma-resv lock if they're
not
>> the same already). If
>> destroying this VM_BO leads to the VM being destroyed, the drivers 
>> vm_destroy() callback would
>> be called with those locks being held as well.
>>
>> I feel like doing this finally opens the doors of the locking hell 
>> entirely. I think we should
>> really avoid that.
I don't think we need to worry much about this particular locking hell 
because if we hold, for example a vm and bo resv when putting the vm_bo, 
we need to keep additional strong references for the bo / vm pointer we 
use for unlocking. Hence putting the vm_bo under those locks can never 
lead to the vm getting destroyed.

Also, don't we already sort of have a mandatory vm_destroy callback?

+	if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
+		return;


>
> That's a really good point, but I fear exactly that's the use case.
>
> I would expect that VM_BO structures are added in the 
> drm_gem_object_funcs.open callback and freed in 
> drm_gem_object_funcs.close.
>
> Since it is perfectly legal for userspace to close a BO while there 
> are still mappings (can trivial be that the app is killed) I would 
> expect that the drm_gem_object_funcs.close handling is something like 
> asking drm_gpuvm destroying the VM_BO and getting the mappings which 
> should be cleared in the page table in return.
>
> In amdgpu we even go a step further and the VM structure keeps track 
> of all the mappings of deleted VM_BOs so that higher level can query 
> those and clear them later on.
>
> Background is that the drm_gem_object_funcs.close can't fail, but it 
> can perfectly be that the app is killed because of an OOM situation 
> and we can't do page tables updates in that moment because of this.
>
>>
>>>
>>> Which makes sense in that case because when the mm_struct is gone 
>>> the vm_area_struct doesn't make sense any more either.
>>>
>>> What we clearly need is a reference to prevent the VM or at least 
>>> the shared resv to go away to early.
>>
>> Yeah, that was a good hint and we've covered that.
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> But I think all of this is fixable as follow-ups if needed,
unless
>>>> I'm missing something crucial.
>>
>> Fully agree, I think at this point we should go ahead and land this 
>> series.
+1.

/Thomas

>>
>
> Yeah, agree this is not UAPI so not nailed in stone. Feel free to add 
> my acked-by as well if you want.
>
> Only keep in mind that when you give drivers some functionality in a 
> common component they usually expect to keep that functionality.
>
> For example changing the dma_resv object to make sure that drivers 
> can't cause use after free errors any more was an extremely annoying 
> experience since every user of those interface had to change at once.
>
> Regards,
> Christian.
>
>>
>>>>
>>>> Just my 2 cents.
>>>>
>>>> /Thomas
>>>>
>>>>
>>>
>>
>

Danilo Krummrich

2023-Nov-10 16:57 UTC

head link

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

On 11/10/23 09:50, Christian K?nig wrote:> Am 09.11.23 um 19:34 schrieb Danilo Krummrich:
>> On 11/9/23 17:03, Christian K?nig wrote:
>>> Am 09.11.23 um 16:50 schrieb Thomas Hellstr?m:
>>>> [SNIP]
>>>>>>
>>>> Did we get any resolution on this?
>>>>
>>>> FWIW, my take on this is that it would be possible to get GPUVM
to work both with and without internal refcounting; If with, the driver needs a
vm close to resolve cyclic references, if without that's not necessary. If
GPUVM is allowed to refcount in mappings and vm_bos, that comes with a slight
performance drop but as Danilo pointed out, the VM lifetime problem iterating
over a vm_bo's mapping becomes much easier and the code thus becomes easier
to maintain moving forward. That convinced me it's a good thing.
>>>
>>> I strongly believe you guys stumbled over one of the core problems
with the VM here and I think that reference counting is the right answer to
solving this.
>>>
>>> The big question is that what is reference counted and in which
direction does the dependency points, e.g. we have here VM, BO, BO_VM and
Mapping objects.
>>>
>>> Those patches here suggest a counted Mapping -> VM reference and
I'm pretty sure that this isn't a good idea. What we should rather
really have is a BO -> VM or BO_VM ->VM reference. In other words that
each BO which is part of the VM keeps a reference to the VM.
>>
>> We have both. Please see the subsequent patch introducing VM_BO
structures for that.
>>
>> As I explained, mappings (struct drm_gpuva) keep a pointer to their VM
they're mapped
>> in and besides that it doesn't make sense to free a VM that still
contains mappings,
>> the reference count ensures that. This simply ensures memory safety.
>>
>>>
>>> BTW: At least in amdgpu we can have BOs which (temporary)
doesn't have any mappings, but are still considered part of the VM.
>>
>> That should be possible.
>>
>>>
>>>>
>>>> Another issue Christian brought up is that something intended
to be embeddable (a base class) shouldn't really have its own refcount. I
think that's a valid point. If you at some point need to derive from
multiple such structs each having its own refcount, things will start to get
weird. One way to resolve that would be to have the driver's subclass
provide get() and put() ops, and export a destructor for the base-class, rather
than to have the base-class provide the refcount and a destructor? ops.
>>
>> GPUVM simply follows the same pattern we have with drm_gem_objects. And
I think it makes
>> sense. Why would we want to embed two struct drm_gpuvm in a single
driver structure?
> 
> Because you need one drm_gpuvm structure for each application using the
driver? Or am I missing something?
Right, *one*, but not more than one. Wasn't that the concern? Maybe I
misunderstood something. :)
> 
> As far as I can see a driver would want to embed that into your fpriv
structure which is allocated during drm_driver.open callback.
> 
>>
>>>
>>> Well, I have never seen stuff like that in the kernel. Might be
that this works, but I would rather not try if avoidable.
>>>
>>>>
>>>> That would also make it possible for the driver to decide the
context for the put() call: If the driver needs to be able to call put() from
irq / atomic context but the base-class'es destructor doesn't allow
atomic context, the driver can push freeing out to a work item if needed.
>>>>
>>>> Finally, the refcount overflow Christian pointed out. Limiting
the number of mapping sounds like a reasonable remedy to me.
>>>
>>> Well that depends, I would rather avoid having a dependency for
mappings.
>>>
>>> Taking the CPU VM handling as example as far as I know
vm_area_structs doesn't grab a reference to their mm_struct either. Instead
they get automatically destroyed when the mm_struct is destroyed.
>>
>> Certainly, that would be possible. However, thinking about it, this
might call for
>> huge trouble.
>>
>> First of all, we'd still need to reference count a GPUVM and take a
reference for each
>> VM_BO, as we do already. Now instead of simply increasing the reference
count for each
>> mapping as well, we'd need a *mandatory* driver callback that is
called when the GPUVM
>> reference count drops to zero. Maybe something like vm_destroy().
>>
>> The reason is that GPUVM can't just remove all mappings from the
tree nor can it free them
>> by itself, since drivers might use them for tracking their allocated
page tables and/or
>> other stuff.
>>
>> Now, let's think about the scope this callback might be called
from. When a VM_BO is destroyed
>> the driver might hold a couple of locks (for Xe it would be the
VM's shared dma-resv lock and
>> potentially the corresponding object's dma-resv lock if they're
not the same already). If
>> destroying this VM_BO leads to the VM being destroyed, the drivers
vm_destroy() callback would
>> be called with those locks being held as well.
>>
>> I feel like doing this finally opens the doors of the locking hell
entirely. I think we should
>> really avoid that.
> 
> That's a really good point, but I fear exactly that's the use case.
> 
> I would expect that VM_BO structures are added in the
drm_gem_object_funcs.open callback and freed in drm_gem_object_funcs.close.
> 
> Since it is perfectly legal for userspace to close a BO while there are
still mappings (can trivial be that the app is killed) I would expect that the
drm_gem_object_funcs.close handling is something like asking drm_gpuvm
destroying the VM_BO and getting the mappings which should be cleared in the
page table in return.
Yes, you can do that. You can ask GPUVM to create a set of operations for you to
shut down all
mappings of a given VM_BO. But until this is done the VM_BO structure is kept
alive. Semantically,
it's exactly the same though.
> 
> In amdgpu we even go a step further and the VM structure keeps track of all
the mappings of deleted VM_BOs so that higher level can query those and clear
them later on.
> 
> Background is that the drm_gem_object_funcs.close can't fail, but it
can perfectly be that the app is killed because of an OOM situation and we
can't do page tables updates in that moment because of this.
> 
>>
>>>
>>> Which makes sense in that case because when the mm_struct is gone
the vm_area_struct doesn't make sense any more either.
>>>
>>> What we clearly need is a reference to prevent the VM or at least
the shared resv to go away to early.
>>
>> Yeah, that was a good hint and we've covered that.
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> But I think all of this is fixable as follow-ups if needed,
unless I'm missing something crucial.
>>
>> Fully agree, I think at this point we should go ahead and land this
series.
> 
> Yeah, agree this is not UAPI so not nailed in stone. Feel free to add my
acked-by as well if you want.
> 
> Only keep in mind that when you give drivers some functionality in a common
component they usually expect to keep that functionality.
> 
> For example changing the dma_resv object to make sure that drivers
can't cause use after free errors any more was an extremely annoying
experience since every user of those interface had to change at once.
> 
> Regards,
> Christian.
> 
>>
>>>>
>>>> Just my 2 cents.
>>>>
>>>> /Thomas
>>>>
>>>>
>>>
>>
>

Nouveau - Nov 2023 - [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures