Danilo Krummrich
2025-Apr-17 12:11 UTC
[PATCH 3/5] drm/sched: Warn if pending list is not empty
On Thu, Apr 17, 2025 at 12:27:29PM +0100, Tvrtko Ursulin wrote:
>
> On 17/04/2025 08:45, Philipp Stanner wrote:
> > On Mon, 2025-04-07 at 17:22 +0200, Philipp Stanner wrote:
>
> Problem exactly is that jobs can outlive the entities and the scheduler,
> while some userspace may have a dma fence reference to the job via sync
> file. This new callback would not solve it for xe, but if everything
> required was reference counted it would.

I think you're mixing up the job and the dma_fence here; if a job outlives
the scheduler, it clearly is a bug; it always has been.

AFAIK, Xe reference counts its driver-specific job structures *and* the
driver-specific scheduler structure, such that drm_sched_fini() won't be
called before all jobs have finished.

The reference counting solution, however, potentially creates subsequent
lifetime problems, which I think have been discussed in a different thread
already. Philipp can probably link in the corresponding discussion.

> On the design level it feels like it adds too much state machine and makes
> things hard to follow/maintain. It would be nice to find a solution where
> we end up with less state machine and not more.

Multiple solutions have been discussed already, e.g. just wait for the
pending list to be empty, or reference count the scheduler for every pending
job. Those all had significant downsides, which I don't see with this
proposal.

I'm all for better ideas though -- what do you propose?

> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 6b72278c4b72..ae3152beca14 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -1465,6 +1465,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> > > 	sched->ready = false;
> > > 	kfree(sched->sched_rq);
> > > 	sched->sched_rq = NULL;
> > > +
> > > +	if (!list_empty(&sched->pending_list))
> > > +		dev_err(sched->dev, "%s: Tearing down scheduler while jobs are pending!\n",
> > > +			__func__);
>
> It isn't fair to add this error since it would out of the blue start firing
> for everyone except nouveau, no? Regardless if there is a leak or not.

I think it is pretty fair to warn when detecting a guaranteed bug, no?

If drm_sched_fini() is called while jobs are still on the pending_list, they
won't ever be freed, because all workqueues are stopped.
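The Xe-style approach referred to above -- each pending job holding a
reference on a driver-private scheduler wrapper, so that drm_sched_fini()
only runs once the pending list has drained -- could look roughly like the
sketch below. All names (my_sched, my_sched_create, my_job_*) are
hypothetical; this is not code from Xe or from the patch series.

#include <linux/kref.h>
#include <linux/slab.h>
#include <drm/gpu_scheduler.h>

/* Hypothetical driver-private wrapper around the scheduler. */
struct my_sched {
	struct drm_gpu_scheduler base;
	struct kref refcount;
};

static void my_sched_release(struct kref *kref)
{
	struct my_sched *s = container_of(kref, struct my_sched, refcount);

	/*
	 * Last reference dropped: every job has been freed and dropped its
	 * reference, so the pending_list is empty and drm_sched_fini()
	 * cannot leak jobs.
	 */
	drm_sched_fini(&s->base);
	kfree(s);
}

static struct my_sched *my_sched_create(void)
{
	struct my_sched *s = kzalloc(sizeof(*s), GFP_KERNEL);

	if (!s)
		return NULL;

	kref_init(&s->refcount);	/* initial reference held by the driver */
	/* drm_sched_init(&s->base, ...) would go here */
	return s;
}

/* Take a reference when a driver job is created ... */
static void my_job_get_sched(struct my_sched *s)
{
	kref_get(&s->refcount);
}

/* ... and drop it only when the job is freed (e.g. from free_job()). */
static void my_job_put_sched(struct my_sched *s)
{
	kref_put(&s->refcount, my_sched_release);
}

The downside noted in the thread applies to exactly this pattern: it ties the
scheduler's lifetime to the last job reference, which creates the follow-on
lifetime questions mentioned above.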
Tvrtko Ursulin
2025-Apr-17 14:20 UTC
[PATCH 3/5] drm/sched: Warn if pending list is not empty
On 17/04/2025 13:11, Danilo Krummrich wrote:
> On Thu, Apr 17, 2025 at 12:27:29PM +0100, Tvrtko Ursulin wrote:
>>
>> On 17/04/2025 08:45, Philipp Stanner wrote:
>>> On Mon, 2025-04-07 at 17:22 +0200, Philipp Stanner wrote:
>>
>> Problem exactly is that jobs can outlive the entities and the scheduler,
>> while some userspace may have a dma fence reference to the job via sync
>> file. This new callback would not solve it for xe, but if everything
>> required was reference counted it would.
>
> I think you're mixing up the job and the dma_fence here; if a job outlives
> the scheduler, it clearly is a bug; it always has been.
>
> AFAIK, Xe reference counts its driver-specific job structures *and* the
> driver-specific scheduler structure, such that drm_sched_fini() won't be
> called before all jobs have finished.

Yes, sorry, dma fence. But it is not enough to postpone drm_sched_fini()
until the job has finished. The problem is that an exported dma fence holds a
pointer to the drm_sched_fence (and so oopses in
drm_sched_fence_get_timeline_name() on fence->sched->name) *after* the job
has finished and the driver was free to tear everything down.

That is what the "fini callback" wouldn't solve but reference counting would.

I don't see why any other driver which can export a fence couldn't suffer
from the same problem, and even if I agree this maybe isn't strictly a
scheduler problem, I cannot see it being solvable without involving the
scheduler, and in any other way than reference counting. And if reference
counting could solve both problems then it feels attractive.

> The reference counting solution, however, potentially creates subsequent
> lifetime problems, which I think have been discussed in a different thread
> already. Philipp can probably link in the corresponding discussion.

I don't doubt it causes problems, but maybe there will be no other way than
to tackle them.

>> On the design level it feels like it adds too much state machine and makes
>> things hard to follow/maintain. It would be nice to find a solution where
>> we end up with less state machine and not more.
>
> Multiple solutions have been discussed already, e.g. just wait for the
> pending list to be empty, or reference count the scheduler for every pending
> job. Those all had significant downsides, which I don't see with this
> proposal.
>
> I'm all for better ideas though -- what do you propose?

I think we need to brainstorm both issues and see if there is a solution
which solves them both, with bonus points for being elegant. Use after free
is IMO an even worse problem, so if reference counting is the only way then
let's try that. But I will wait for those links to catch up on the past
discussions.

>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6b72278c4b72..ae3152beca14 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -1465,6 +1465,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>>>> 	sched->ready = false;
>>>> 	kfree(sched->sched_rq);
>>>> 	sched->sched_rq = NULL;
>>>> +
>>>> +	if (!list_empty(&sched->pending_list))
>>>> +		dev_err(sched->dev, "%s: Tearing down scheduler while jobs are pending!\n",
>>>> +			__func__);
>>
>> It isn't fair to add this error since it would out of the blue start firing
>> for everyone except nouveau, no? Regardless if there is a leak or not.
>
> I think it is pretty fair to warn when detecting a guaranteed bug, no?
>
> If drm_sched_fini() is called while jobs are still on the pending_list, they
> won't ever be freed, because all workqueues are stopped.

Is it a guaranteed bug for drivers which are aware of the drm_sched_fini()
limitation and are cleaning up after themselves? In other words, if you apply
the series up to here, would it trigger for nouveau? Reportedly it triggers
for the mock scheduler, which also has no leak.

Also, I asked in my initial reply if we have a list of which of the current
drivers suffer from memory leaks. Is it all, or some, etc.?

Regards,

Tvrtko
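The drm_sched_fence_get_timeline_name() path Tvrtko describes is roughly the
sketch below; the callback body mirrors what the scheduler's fence ops do,
while the teardown ordering in the comment is illustrative only, not taken
from any specific driver.

static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
{
	struct drm_sched_fence *fence = to_drm_sched_fence(f);

	/*
	 * If an exported fence outlives the scheduler, fence->sched points
	 * at freed memory by the time this runs:
	 *
	 *   1. userspace obtains a sync_file for job->s_fence->finished
	 *   2. the job completes; the driver calls drm_sched_fini() and
	 *      frees its scheduler structure
	 *   3. userspace queries the fence; the dma-fence core invokes
	 *      fence->ops->get_timeline_name()
	 *   4. the dereference below hits freed memory and oopses
	 */
	return (const char *)fence->sched->name;
}

This is why postponing drm_sched_fini() until the last job has finished is
not sufficient on its own: the exported fence, and with it the pointer to the
scheduler, can live on well past that point.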