Thomas Hellström
2023-Oct-04 15:29 UTC
[Nouveau] [PATCH drm-misc-next v5 4/6] drm/gpuvm: track/lock/validate external/evicted objects
On Wed, 2023-10-04 at 14:57 +0200, Danilo Krummrich wrote:> On 10/3/23 11:11, Thomas Hellstr?m wrote: > > <snip> > > > > > + > > > > +/** > > > > + * drm_gpuvm_bo_evict() - add / remove a &drm_gpuvm_bo to / > > > > from the &drm_gpuvms > > > > + * evicted list > > > > + * @vm_bo: the &drm_gpuvm_bo to add or remove > > > > + * @evict: indicates whether the object is evicted > > > > + * > > > > + * Adds a &drm_gpuvm_bo to or removes it from the &drm_gpuvms > > > > evicted list. > > > > + */ > > > > +void > > > > +drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict) > > > > +{ > > > > +??? struct drm_gem_object *obj = vm_bo->obj; > > > > + > > > > +??? dma_resv_assert_held(obj->resv); > > > > + > > > > +??? /* Always lock list transactions, even if > > > > DRM_GPUVM_RESV_PROTECTED is > > > > +???? * set. This is required to protect multiple concurrent > > > > calls to > > > > +???? * drm_gpuvm_bo_evict() with BOs with different dma_resv. > > > > +???? */ > > > > > > This doesn't work. The RESV_PROTECTED case requires the evicted > > > flag we discussed before. The list is either protected by the > > > spinlock or the resv. Otherwise a list add could race with a list > > > removal elsewhere. > > I think it does unless I miss something, but it might be a bit subtle > though. > > Concurrent drm_gpuvm_bo_evict() are protected by the spinlock. > Additionally, when > drm_gpuvm_bo_evict() is called we hold the dma-resv of the > corresponding GEM object. > > In drm_gpuvm_validate() I assert that we hold *all* dma-resv, which > implies that no > one can call drm_gpuvm_bo_evict() on any of the VM's objects and no > one can add a new > one and directly call drm_gpuvm_bo_evict() on it either.But translated into how the data (the list in this case) is protected it becomes "Either the spinlock and the bo resv of a single list item OR the bo resvs of all bos that can potentially be on the list", while this is certainly possible to assert, any new / future code that manipulates the evict list will probably get this wrong and as a result the code becomes pretty fragile. I think drm_gpuvm_bo_destroy() already gets it wrong in that it, while holding a single resv, doesn't take the spinlock. So I think that needs fixing, and if keeping that protection I think it needs to be documented with the list member and ideally an assert. But also note that lockdep_assert_held will typically give false true for dma_resv locks; as long as the first dma_resv lock locked in a drm_exec sequence remains locked, lockdep thinks *all* dma_resv locks are held. (or something along those lines), so the resv lockdep asserts are currently pretty useless. /Thomas> > > > > > > Thanks, > > > > > > Thomas > > > > > > > > >
Danilo Krummrich
2023-Oct-04 17:17 UTC
[Nouveau] [PATCH drm-misc-next v5 4/6] drm/gpuvm: track/lock/validate external/evicted objects
On 10/4/23 17:29, Thomas Hellstr?m wrote:> > On Wed, 2023-10-04 at 14:57 +0200, Danilo Krummrich wrote: >> On 10/3/23 11:11, Thomas Hellstr?m wrote: >> >> <snip> >> >>>>> + >>>>> +/** >>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gpuvm_bo to / >>>>> from the &drm_gpuvms >>>>> + * evicted list >>>>> + * @vm_bo: the &drm_gpuvm_bo to add or remove >>>>> + * @evict: indicates whether the object is evicted >>>>> + * >>>>> + * Adds a &drm_gpuvm_bo to or removes it from the &drm_gpuvms >>>>> evicted list. >>>>> + */ >>>>> +void >>>>> +drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict) >>>>> +{ >>>>> +??? struct drm_gem_object *obj = vm_bo->obj; >>>>> + >>>>> +??? dma_resv_assert_held(obj->resv); >>>>> + >>>>> +??? /* Always lock list transactions, even if >>>>> DRM_GPUVM_RESV_PROTECTED is >>>>> +???? * set. This is required to protect multiple concurrent >>>>> calls to >>>>> +???? * drm_gpuvm_bo_evict() with BOs with different dma_resv. >>>>> +???? */ >>>> >>>> This doesn't work. The RESV_PROTECTED case requires the evicted >>>> flag we discussed before. The list is either protected by the >>>> spinlock or the resv. Otherwise a list add could race with a list >>>> removal elsewhere. >> >> I think it does unless I miss something, but it might be a bit subtle >> though. >> >> Concurrent drm_gpuvm_bo_evict() are protected by the spinlock. >> Additionally, when >> drm_gpuvm_bo_evict() is called we hold the dma-resv of the >> corresponding GEM object. >> >> In drm_gpuvm_validate() I assert that we hold *all* dma-resv, which >> implies that no >> one can call drm_gpuvm_bo_evict() on any of the VM's objects and no >> one can add a new >> one and directly call drm_gpuvm_bo_evict() on it either. > > But translated into how the data (the list in this case) is protected > it becomes > > "Either the spinlock and the bo resv of a single list item OR the bo > resvs of all bos that can potentially be on the list", > > while this is certainly possible to assert, any new / future code that > manipulates the evict list will probably get this wrong and as a result > the code becomes pretty fragile. I think drm_gpuvm_bo_destroy() already > gets it wrong in that it, while holding a single resv, doesn't take the > spinlock.That's true and I don't like it either. Unfortunately, with the dma-resv locking scheme we can't really protect the evict list without the drm_gpuvm_bo::evicted trick properly. But as pointed out in my other reply, I'm a bit worried about the drm_gpuvm_bo::evicted trick being too restrictive, but maybe it's fine doing it in the RESV_PROTECTED case.> > So I think that needs fixing, and if keeping that protection I think it > needs to be documented with the list member and ideally an assert. But > also note that lockdep_assert_held will typically give false true for > dma_resv locks; as long as the first dma_resv lock locked in a drm_exec > sequence remains locked, lockdep thinks *all* dma_resv locks are held. > (or something along those lines), so the resv lockdep asserts are > currently pretty useless. > > /Thomas > > > >> >>>> >>>> Thanks, >>>> >>>> Thomas >>>> >>>> >>> >> >