thr3ads.net - Nouveau - [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm

If this information is useful, please help other people find it:
Share via:

Christian König

2023-Nov-10 10:42 UTC

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

Am 10.11.23 um 10:39 schrieb Thomas Hellstr?m:>
> [SNIP]
> I was thinking more of the general design of a base-class that needs 
> to be refcounted. Say a driver vm that inherits from gpu-vm, 
> gem_object and yet another base-class that supplies its own refcount. 
> What's the best-practice way to do refcounting? All base-classes 
> supplying a refcount of its own, or the subclass supplying a refcount 
> and the base-classes supply destroy helpers.
 From my experience the most common design pattern in the Linux kernel 
is that you either have reference counted objects which contain a 
private pointer (like struct file, struct inode etc..) or the lifetime 
is defined by the user of the object instead of reference counting and 
in this case you can embed it into your own object.
>
> But to be clear this is nothing I see needing urgent attention.
>
>>
>>>
>>>>
>>>> Well, I have never seen stuff like that in the kernel. Might be
>>>> that this works, but I would rather not try if avoidable.
>>>>
>>>>>
>>>>> That would also make it possible for the driver to decide
the
>>>>> context for the put() call: If the driver needs to be able
to call
>>>>> put() from irq / atomic context but the base-class'es
destructor
>>>>> doesn't allow atomic context, the driver can push
freeing out to a
>>>>> work item if needed.
>>>>>
>>>>> Finally, the refcount overflow Christian pointed out.
Limiting the
>>>>> number of mapping sounds like a reasonable remedy to me.
>>>>
>>>> Well that depends, I would rather avoid having a dependency for
>>>> mappings.
>>>>
>>>> Taking the CPU VM handling as example as far as I know 
>>>> vm_area_structs doesn't grab a reference to their mm_struct
either.
>>>> Instead they get automatically destroyed when the mm_struct is 
>>>> destroyed.
>>>
>>> Certainly, that would be possible. However, thinking about it, this
>>> might call for
>>> huge trouble.
>>>
>>> First of all, we'd still need to reference count a GPUVM and
take a
>>> reference for each
>>> VM_BO, as we do already. Now instead of simply increasing the 
>>> reference count for each
>>> mapping as well, we'd need a *mandatory* driver callback that
is
>>> called when the GPUVM
>>> reference count drops to zero. Maybe something like vm_destroy().
>>>
>>> The reason is that GPUVM can't just remove all mappings from
the
>>> tree nor can it free them
>>> by itself, since drivers might use them for tracking their
allocated
>>> page tables and/or
>>> other stuff.
>>>
>>> Now, let's think about the scope this callback might be called
from.
>>> When a VM_BO is destroyed
>>> the driver might hold a couple of locks (for Xe it would be the
VM's
>>> shared dma-resv lock and
>>> potentially the corresponding object's dma-resv lock if
they're not
>>> the same already). If
>>> destroying this VM_BO leads to the VM being destroyed, the drivers 
>>> vm_destroy() callback would
>>> be called with those locks being held as well.
>>>
>>> I feel like doing this finally opens the doors of the locking hell 
>>> entirely. I think we should
>>> really avoid that.
>
> I don't think we need to worry much about this particular locking hell 
> because if we hold
I have to agree with Danilo here. Especially you have cases where you 
usually lock BO->VM (for example eviction) as well as cases where you 
need to lock VM->BO (command submission).

Because of this in amdgpu we used (or abused?) the dma_resv of the root 
BO as lock for the VM. Since this is a ww_mutex locking it in both VM, 
BO as well as BO, VM order works.

Regards,
Christian.
> , for example a vm and bo resv when putting the vm_bo, we need to keep 
> additional strong references for the bo / vm pointer we use for 
> unlocking. Hence putting the vm_bo under those locks can never lead to 
> the vm getting destroyed.
>
> Also, don't we already sort of have a mandatory vm_destroy callback?
>
> +??? if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
> +??????? return;
>
>
>
>>
>> That's a really good point, but I fear exactly that's the use
case.
>>
>> I would expect that VM_BO structures are added in the 
>> drm_gem_object_funcs.open callback and freed in 
>> drm_gem_object_funcs.close.
>>
>> Since it is perfectly legal for userspace to close a BO while there 
>> are still mappings (can trivial be that the app is killed) I would 
>> expect that the drm_gem_object_funcs.close handling is something like 
>> asking drm_gpuvm destroying the VM_BO and getting the mappings which 
>> should be cleared in the page table in return.
>>
>> In amdgpu we even go a step further and the VM structure keeps track 
>> of all the mappings of deleted VM_BOs so that higher level can query 
>> those and clear them later on.
>>
>> Background is that the drm_gem_object_funcs.close can't fail, but
it
>> can perfectly be that the app is killed because of an OOM situation 
>> and we can't do page tables updates in that moment because of this.
>>
>>>
>>>>
>>>> Which makes sense in that case because when the mm_struct is
gone
>>>> the vm_area_struct doesn't make sense any more either.
>>>>
>>>> What we clearly need is a reference to prevent the VM or at
least
>>>> the shared resv to go away to early.
>>>
>>> Yeah, that was a good hint and we've covered that.
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> But I think all of this is fixable as follow-ups if needed,
unless
>>>>> I'm missing something crucial.
>>>
>>> Fully agree, I think at this point we should go ahead and land this
>>> series.
>
> +1.
>
> /Thomas
>
>
>>>
>>
>> Yeah, agree this is not UAPI so not nailed in stone. Feel free to add 
>> my acked-by as well if you want.
>>
>> Only keep in mind that when you give drivers some functionality in a 
>> common component they usually expect to keep that functionality.
>>
>> For example changing the dma_resv object to make sure that drivers 
>> can't cause use after free errors any more was an extremely
annoying
>> experience since every user of those interface had to change at once.
>>
>> Regards,
>> Christian.
>>
>>>
>>>>>
>>>>> Just my 2 cents.
>>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>
>>>
>>

Thomas Hellström

2023-Nov-10 10:52 UTC

head link

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

On 11/10/23 11:42, Christian K?nig wrote:> Am 10.11.23 um 10:39 schrieb Thomas Hellstr?m:
>>
>> [SNIP]
>
>> I was thinking more of the general design of a base-class that needs 
>> to be refcounted. Say a driver vm that inherits from gpu-vm, 
>> gem_object and yet another base-class that supplies its own refcount. 
>> What's the best-practice way to do refcounting? All base-classes 
>> supplying a refcount of its own, or the subclass supplying a refcount 
>> and the base-classes supply destroy helpers.
>
> From my experience the most common design pattern in the Linux kernel 
> is that you either have reference counted objects which contain a 
> private pointer (like struct file, struct inode etc..) or the lifetime 
> is defined by the user of the object instead of reference counting and 
> in this case you can embed it into your own object.
>
>>
>> But to be clear this is nothing I see needing urgent attention.
>>
>>>
>>>>
>>>>>
>>>>> Well, I have never seen stuff like that in the kernel.
Might be
>>>>> that this works, but I would rather not try if avoidable.
>>>>>
>>>>>>
>>>>>> That would also make it possible for the driver to
decide the
>>>>>> context for the put() call: If the driver needs to be
able to
>>>>>> call put() from irq / atomic context but the
base-class'es
>>>>>> destructor doesn't allow atomic context, the driver
can push
>>>>>> freeing out to a work item if needed.
>>>>>>
>>>>>> Finally, the refcount overflow Christian pointed out.
Limiting
>>>>>> the number of mapping sounds like a reasonable remedy
to me.
>>>>>
>>>>> Well that depends, I would rather avoid having a dependency
for
>>>>> mappings.
>>>>>
>>>>> Taking the CPU VM handling as example as far as I know 
>>>>> vm_area_structs doesn't grab a reference to their
mm_struct
>>>>> either. Instead they get automatically destroyed when the 
>>>>> mm_struct is destroyed.
>>>>
>>>> Certainly, that would be possible. However, thinking about it,
this
>>>> might call for
>>>> huge trouble.
>>>>
>>>> First of all, we'd still need to reference count a GPUVM
and take a
>>>> reference for each
>>>> VM_BO, as we do already. Now instead of simply increasing the 
>>>> reference count for each
>>>> mapping as well, we'd need a *mandatory* driver callback
that is
>>>> called when the GPUVM
>>>> reference count drops to zero. Maybe something like
vm_destroy().
>>>>
>>>> The reason is that GPUVM can't just remove all mappings
from the
>>>> tree nor can it free them
>>>> by itself, since drivers might use them for tracking their 
>>>> allocated page tables and/or
>>>> other stuff.
>>>>
>>>> Now, let's think about the scope this callback might be
called
>>>> from. When a VM_BO is destroyed
>>>> the driver might hold a couple of locks (for Xe it would be the
>>>> VM's shared dma-resv lock and
>>>> potentially the corresponding object's dma-resv lock if
they're not
>>>> the same already). If
>>>> destroying this VM_BO leads to the VM being destroyed, the
drivers
>>>> vm_destroy() callback would
>>>> be called with those locks being held as well.
>>>>
>>>> I feel like doing this finally opens the doors of the locking
hell
>>>> entirely. I think we should
>>>> really avoid that.
>>
>> I don't think we need to worry much about this particular locking 
>> hell because if we hold
>
> I have to agree with Danilo here. Especially you have cases where you 
> usually lock BO->VM (for example eviction) as well as cases where you 
> need to lock VM->BO (command submission).
>
> Because of this in amdgpu we used (or abused?) the dma_resv of the 
> root BO as lock for the VM. Since this is a ww_mutex locking it in 
> both VM, BO as well as BO, VM order works.
Yes, gpuvm is doing the same. (although not necessarily using the 
page-table root bo, but any bo of the driver's choice). But I read it as 
Danilo feared the case where the VM destructor was called with a VM resv 
(or possibly bo resv) held. I meant the driver can easily ensure that's 
not happening, and in some cases it can't happen.

Thanks,

Thomas


>
> Regards,
> Christian.
>
>> , for example a vm and bo resv when putting the vm_bo, we need to 
>> keep additional strong references for the bo / vm pointer we use for 
>> unlocking. Hence putting the vm_bo under those locks can never lead 
>> to the vm getting destroyed.
>>
>> Also, don't we already sort of have a mandatory vm_destroy
callback?
>>
>> +??? if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
>> +??????? return;
>>
>>
>>
>>>
>>> That's a really good point, but I fear exactly that's the
use case.
>>>
>>> I would expect that VM_BO structures are added in the 
>>> drm_gem_object_funcs.open callback and freed in 
>>> drm_gem_object_funcs.close.
>>>
>>> Since it is perfectly legal for userspace to close a BO while there
>>> are still mappings (can trivial be that the app is killed) I would 
>>> expect that the drm_gem_object_funcs.close handling is something 
>>> like asking drm_gpuvm destroying the VM_BO and getting the mappings
>>> which should be cleared in the page table in return.
>>>
>>> In amdgpu we even go a step further and the VM structure keeps
track
>>> of all the mappings of deleted VM_BOs so that higher level can
query
>>> those and clear them later on.
>>>
>>> Background is that the drm_gem_object_funcs.close can't fail,
but it
>>> can perfectly be that the app is killed because of an OOM situation
>>> and we can't do page tables updates in that moment because of
this.
>>>
>>>>
>>>>>
>>>>> Which makes sense in that case because when the mm_struct
is gone
>>>>> the vm_area_struct doesn't make sense any more either.
>>>>>
>>>>> What we clearly need is a reference to prevent the VM or at
least
>>>>> the shared resv to go away to early.
>>>>
>>>> Yeah, that was a good hint and we've covered that.
>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>> But I think all of this is fixable as follow-ups if
needed,
>>>>>> unless I'm missing something crucial.
>>>>
>>>> Fully agree, I think at this point we should go ahead and land
this
>>>> series.
>>
>> +1.
>>
>> /Thomas
>>
>>
>>>>
>>>
>>> Yeah, agree this is not UAPI so not nailed in stone. Feel free to 
>>> add my acked-by as well if you want.
>>>
>>> Only keep in mind that when you give drivers some functionality in
a
>>> common component they usually expect to keep that functionality.
>>>
>>> For example changing the dma_resv object to make sure that drivers 
>>> can't cause use after free errors any more was an extremely
annoying
>>> experience since every user of those interface had to change at
once.
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>>>>
>>>>>> Just my 2 cents.
>>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

Apparently Analagous Threads

Search for more apparently analagous threads

Nouveau - Nov 2023 - [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

[Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

Apparently Analagous Threads