Tvrtko Ursulin
2025-Apr-17 11:27 UTC
[PATCH 3/5] drm/sched: Warn if pending list is not empty
On 17/04/2025 08:45, Philipp Stanner wrote:
> On Mon, 2025-04-07 at 17:22 +0200, Philipp Stanner wrote:
>> drm_sched_fini() can leak jobs under certain circumstances.
>>
>> Warn if that happens.
>>
>> Signed-off-by: Philipp Stanner <phasta at kernel.org>
>> ---
>>  drivers/gpu/drm/scheduler/sched_main.c | 4 ++++
>
> I hear a lot of amazed silence for this series ^_^
>
> If there's no major pushback, my intention is to pull this in once the
> blocking Nouveau-bug has been fixed (by pulling my patch).

It was on my todo list to read it but I failed to get to it due to various
reasons. I only managed to skim over it and am not quite convinced. But
because I did not have time to think about it very much, my feedback at
this point is not very deep.

On the high level, I understand that for nouveau the series is "net zero",
right? No leaks before, no leaks after.

What about other drivers? Which ones have known leaks which could be
addressed by them implementing this new callback?

Presumably you would document that the callback should only be implemented
by drivers which are 100% sure the cleanup can always be done reliably?
Otherwise an unkillable process on stuck hardware, or on drivers without
good enough reset support, can still happen. (Which would be worse than a
leak on shutdown.)

Have you thought about any observable effects from the userspace point of
view? For example, something which works now, such as submitting a job
which writes to a shared buffer and then exiting, could stop working after
the callback is implemented? I don't see it, because it would be unreliable
even today (it is timing dependent whether the job is still in the queue or
already scheduled at exit time), so just thinking out loud.

Also, since the cover letter mentions job reference counting was one idea
that was discarded, another related problem is about the lifetimes. I think
it would be good to discuss potentially reference counting "everything",
because for instance today I can crash the kernel trivially with the xe
driver*. Probably other drivers too. The problem exactly is that jobs can
outlive the entities and the scheduler, while some userspace may have a dma
fence reference to the job via sync file. This new callback would not solve
it for xe, but if everything required was reference counted it would.

Back to the series. On the design level it feels like it adds too much
state machine and makes things hard to follow/maintain. It would be nice to
find a solution where we end up with less state machine and not more.

On the low level there are some patch ordering and naming, spelling and
other small improvements to be made. But as said at the start, I would need
to set aside more time to provide better comments and/or ideas.

*) https://patchwork.freedesktop.org/patch/642709/?series=146211&rev=2

> In the mean time, let me review my own stuff:
>
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 6b72278c4b72..ae3152beca14 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -1465,6 +1465,10 @@ void drm_sched_fini(struct drm_gpu_scheduler
>> *sched)
>>  	sched->ready = false;
>>  	kfree(sched->sched_rq);
>>  	sched->sched_rq = NULL;
>> +
>> +	if (!list_empty(&sched->pending_list))
>> +		dev_err(sched->dev, "%s: Tearing down scheduler
>> while jobs are pending!\n",
>> +			__func__);

It isn't fair to add this error since it would out of the blue start firing
for everyone except nouveau, no? Regardless of whether there is a leak or
not.

> The "%s" here will be removed since dev_err() already prints the
> function name.

It does? But anyway, the function name is redundant and irrelevant and
should not be logged IMO. I would rather prefer we replaced sched->dev with
sched->drm so we could use the drm loggers for clarity throughout.

> I find this dev_err() print more useful than the stack a WARN_ON prints
> (telling you about invalid_asm_exec_op or stuff like that).
>
> Plus, I guess the places where drivers call drm_sched_fini() are very
> well defined and known, so a callstack wouldn't really be useful in the
> first place.

Agreed.

Regards,

Tvrtko

>
> P.
>
>>  }
>>  EXPORT_SYMBOL(drm_sched_fini);
>>
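For illustration, a minimal sketch of the drm-logger variant suggested above. It assumes a hypothetical `struct drm_device *drm` member added to `struct drm_gpu_scheduler` (upstream currently only carries `struct device *dev`), and is not part of the posted patch:

```c
/*
 * Hypothetical sketch, not the actual patch: assumes drm_gpu_scheduler
 * gained a "struct drm_device *drm" member so the drm_* loggers can be
 * used instead of dev_err(). drm_err() already identifies the device,
 * so __func__ is no longer printed.
 */
#include <drm/drm_print.h>
#include <drm/gpu_scheduler.h>

static void drm_sched_warn_pending(struct drm_gpu_scheduler *sched)
{
	if (!list_empty(&sched->pending_list))
		drm_err(sched->drm,
			"Tearing down scheduler while jobs are pending!\n");
}
```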
Danilo Krummrich
2025-Apr-17 12:11 UTC
[PATCH 3/5] drm/sched: Warn if pending list is not empty
On Thu, Apr 17, 2025 at 12:27:29PM +0100, Tvrtko Ursulin wrote:
>
> On 17/04/2025 08:45, Philipp Stanner wrote:
> > On Mon, 2025-04-07 at 17:22 +0200, Philipp Stanner wrote:
>
> The problem exactly is that jobs can outlive the entities and the
> scheduler, while some userspace may have a dma fence reference to the job
> via sync file. This new callback would not solve it for xe, but if
> everything required was reference counted it would.

I think you're mixing up the job and the dma_fence here; if a job outlives
the scheduler, it clearly is a bug, always has been.

AFAIK, Xe reference counts its driver-specific job structures *and* the
driver-specific scheduler structure, such that drm_sched_fini() won't be
called before all jobs have finished.

The reference counting solution, however, potentially creates subsequent
lifetime problems, which I think have already been discussed in a different
thread. Philipp can probably link in the corresponding discussion.

> On the design level it feels like it adds too much state machine and
> makes things hard to follow/maintain. It would be nice to find a solution
> where we end up with less state machine and not more.

Multiple solutions have been discussed already, e.g. just wait for the
pending list to be empty, or reference count the scheduler for every
pending job. Those all had significant downsides, which I don't see with
this proposal.

I'm all for better ideas though -- what do you propose?

> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 6b72278c4b72..ae3152beca14 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -1465,6 +1465,10 @@ void drm_sched_fini(struct drm_gpu_scheduler
> > > *sched)
> > >  	sched->ready = false;
> > >  	kfree(sched->sched_rq);
> > >  	sched->sched_rq = NULL;
> > > +
> > > +	if (!list_empty(&sched->pending_list))
> > > +		dev_err(sched->dev, "%s: Tearing down scheduler
> > > while jobs are pending!\n",
> > > +			__func__);
>
> It isn't fair to add this error since it would out of the blue start
> firing for everyone except nouveau, no? Regardless of whether there is a
> leak or not.

I think it is pretty fair to warn when detecting a guaranteed bug, no?

If drm_sched_fini() is called while jobs are still on the pending_list,
they won't ever be freed, because all workqueues are stopped.
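As a rough illustration of the reference-counting scheme described above (invented names, not Xe's actual code): the driver wraps the scheduler in a refcounted object, and every job holds a reference on it, so drm_sched_fini() can only run once the last pending job has been freed.

```c
/*
 * Illustrative only: the driver-specific scheduler wrapper is kref'd,
 * each driver job pins it, and the last kref_put() is what finally
 * calls drm_sched_fini() with an empty pending_list.
 */
#include <linux/kref.h>
#include <linux/slab.h>
#include <drm/gpu_scheduler.h>

struct my_sched {
	struct drm_gpu_scheduler base;
	struct kref refcount;
};

struct my_job {
	struct drm_sched_job base;
	struct my_sched *sched;		/* holds a my_sched reference */
};

static void my_sched_release(struct kref *kref)
{
	struct my_sched *s = container_of(kref, struct my_sched, refcount);

	drm_sched_fini(&s->base);	/* pending_list is empty by now */
	kfree(s);
}

static void my_job_free(struct drm_sched_job *sched_job)
{
	struct my_job *job = container_of(sched_job, struct my_job, base);
	struct my_sched *s = job->sched;

	drm_sched_job_cleanup(sched_job);
	kfree(job);
	kref_put(&s->refcount, my_sched_release);	/* drop the job's ref */
}
```

With such a scheme, teardown happens from whichever context drops the last reference, which is presumably where the subsequent lifetime problems mentioned above come from.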