Philipp Stanner
2025-Jun-03 09:31 UTC
[RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
Since its inception, the GPU scheduler can leak memory if the driver
calls drm_sched_fini() while there are still jobs in flight.

The simplest way to solve this in a backwards compatible manner is by
adding a new callback, drm_sched_backend_ops.cancel_job(), which
instructs the driver to signal the hardware fence associated with the
job. Afterwards, the scheduler can safely use the established free_job()
callback for freeing the job.

Implement the new backend_ops callback cancel_job().

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
Signed-off-by: Philipp Stanner <phasta at kernel.org>
---
drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++----------
include/drm/gpu_scheduler.h | 9 +++++++
2 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index d20726d7adf0..3f14f1e151fa 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
 }
 EXPORT_SYMBOL(drm_sched_init);
 
+static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)
+{
+	struct drm_sched_job *job, *tmp;
+
+	/* All other accessors are stopped. No locking necessary. */
+	list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
+		sched->ops->cancel_job(job);
+		list_del(&job->list);
+		sched->ops->free_job(job);
+	}
+}
+
 /**
  * drm_sched_fini - Destroy a gpu scheduler
  *
@@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
  *
  * Tears down and cleans up the scheduler.
  *
- * This stops submission of new jobs to the hardware through
- * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
- * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
- * There is no solution for this currently. Thus, it is up to the driver to make
- * sure that:
- *
- *  a) drm_sched_fini() is only called after for all submitted jobs
- *     drm_sched_backend_ops.free_job() has been called or that
- *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
- *     after drm_sched_fini() ran are freed manually.
- *
- * FIXME: Take care of the above problem and prevent this function from leaking
- * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
+ * This stops submission of new jobs to the hardware through &struct
+ * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
+ * is implemented, all jobs will be canceled through it and afterwards cleaned
+ * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
+ * implemented, memory could leak.
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
@@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
 	/* Confirm no work left behind accessing device structures */
 	cancel_delayed_work_sync(&sched->work_tdr);
 
+	/* Avoid memory leaks if supported by the driver. */
+	if (sched->ops->cancel_job)
+		drm_sched_kill_remaining_jobs(sched);
+
 	if (sched->own_submit_wq)
 		destroy_workqueue(sched->submit_wq);
 	sched->ready = false;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e62a7214e052..81dcbfc8c223 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
 	 * and it's time to clean it up.
 	 */
 	void (*free_job)(struct drm_sched_job *sched_job);
+
+	/**
+	 * @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
+	 * get signaled in drm_sched_fini().
+	 *
+	 * Drivers need to signal the passed job's hardware fence with
+	 * -ECANCELED in this callback. They must not free the job.
+	 */
+	void (*cancel_job)(struct drm_sched_job *sched_job);
 };
 
 /**
--
2.49.0
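
For illustration only, here is a minimal userspace sketch of the contract the patch describes: the driver's cancel_job() signals the job's hardware fence with -ECANCELED without freeing the job, and the scheduler then unlinks the job and hands it to free_job(). Every mock_* type and function below is a hypothetical stand-in invented for this sketch; none of it is kernel API, and skipping fences that have already signaled is an assumption of the sketch rather than something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for dma_fence and drm_sched_job, not kernel types. */
struct mock_fence {
	bool signaled;
	int error;
};

struct mock_job {
	struct mock_fence hw_fence;
	struct mock_job *prev, *next;	/* links in the pending list */
};

struct mock_ops {
	void (*free_job)(struct mock_job *job);
	void (*cancel_job)(struct mock_job *job);	/* optional, may be NULL */
};

struct mock_sched {
	const struct mock_ops *ops;
	struct mock_job *head, *tail;	/* pending list, oldest job first */
};

static int mock_jobs_freed;
static int mock_last_fence_error;

/* Driver side: signal the hardware fence with -ECANCELED, do not free. */
static void mock_cancel_job(struct mock_job *job)
{
	if (job->hw_fence.signaled)
		return;	/* sketch assumption: a completed fence keeps its status */
	job->hw_fence.error = -ECANCELED;
	job->hw_fence.signaled = true;
}

/* Driver side: only ever sees jobs whose fence has signaled. */
static void mock_free_job(struct mock_job *job)
{
	assert(job->hw_fence.signaled);
	mock_last_fence_error = job->hw_fence.error;
	mock_jobs_freed++;
	free(job);
}

/* Append a job at the tail, as the newest pending job. */
static void mock_sched_push(struct mock_sched *sched, struct mock_job *job)
{
	job->prev = sched->tail;
	job->next = NULL;
	if (sched->tail)
		sched->tail->next = job;
	else
		sched->head = job;
	sched->tail = job;
}

/* Scheduler side: walk newest to oldest; cancel, unlink, then free. */
static void mock_sched_kill_remaining_jobs(struct mock_sched *sched)
{
	struct mock_job *job, *prev;

	for (job = sched->tail; job; job = prev) {
		prev = job->prev;	/* saved first, job is freed below */
		sched->ops->cancel_job(job);
		sched->tail = prev;
		if (prev)
			prev->next = NULL;
		else
			sched->head = NULL;
		sched->ops->free_job(job);
	}
}

/* Teardown: without cancel_job the pending jobs are left alone (and leak). */
static void mock_sched_fini(struct mock_sched *sched)
{
	if (sched->ops->cancel_job)
		mock_sched_kill_remaining_jobs(sched);
}
```

The ordering is the point of the sketch: free_job() only ever sees a job whose fence has signaled, which is the precondition the commit message relies on when it reuses the established callback.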
Tvrtko Ursulin
2025-Jun-03 13:22 UTC
[RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
On 03/06/2025 10:31, Philipp Stanner wrote:
> [...]
> @@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
>   }
>   EXPORT_SYMBOL(drm_sched_init);
>
> +static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)

Only some bikeshedding comments, no need to act on them.

1) I would maybe s/kill/cancel/ to align with the ->cancel_job naming.

2) Would the callback presence check look better inside this helper so it
is all consolidated and streamlined?

Regards,

Tvrtko

> +{
> +	struct drm_sched_job *job, *tmp;
> +
> +	/* All other accessors are stopped. No locking necessary. */
> +	list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
> +		sched->ops->cancel_job(job);
> +		list_del(&job->list);
> +		sched->ops->free_job(job);
> +	}
> +}
> [...]
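
As a rough illustration of the two bikeshedding suggestions, the helper could take "cancel" in its name and carry the callback presence check itself, so that the caller invokes it unconditionally. The sketch below is userspace C with hypothetical sketch_* stand-ins, not the kernel types and not code from the series; the demo callbacks only set flags so the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins; not the kernel's types. */
struct sketch_job {
	struct sketch_job *prev;	/* next older pending job */
	int canceled;
	int freed;
};

struct sketch_ops {
	void (*free_job)(struct sketch_job *job);
	void (*cancel_job)(struct sketch_job *job);	/* optional, may be NULL */
};

struct sketch_sched {
	const struct sketch_ops *ops;
	struct sketch_job *tail;	/* newest pending job */
};

/* Suggestion 1: "cancel" in the name. Suggestion 2: the check lives here. */
static void sketch_sched_cancel_remaining_jobs(struct sketch_sched *sched)
{
	struct sketch_job *job, *prev;

	assert(sched && sched->ops);

	if (!sched->ops->cancel_job)
		return;	/* driver did not opt in; pending jobs stay untouched */

	for (job = sched->tail; job; job = prev) {
		prev = job->prev;
		sched->ops->cancel_job(job);
		sched->tail = prev;	/* unlink before handing over to free_job */
		sched->ops->free_job(job);
	}
}

/* Demo callbacks that only record what happened to the job. */
static void sketch_mark_canceled(struct sketch_job *job)
{
	job->canceled = 1;
}

static void sketch_mark_freed(struct sketch_job *job)
{
	job->freed = 1;
}
```

With this shape the teardown path would shrink to a single unconditional call, at the cost of the opt-in check no longer being visible at the call site.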
Tvrtko Ursulin
2025-Jun-12 14:17 UTC
[RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
On 03/06/2025 10:31, Philipp Stanner wrote:
> [...]
> Implement the new backend_ops callback cancel_job().
>
> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>

Please just add the link to the patch here (it is only in the cover letter):

Link: https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/

And you probably want to take the unit test modifications from the same
patch too. You could put them in the same patch or separate.

Regards,

Tvrtko

> Signed-off-by: Philipp Stanner <phasta at kernel.org>
> ---
> [...]