Tvrtko Ursulin
2025-Jun-03 12:27 UTC
[RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
On 03/06/2025 10:31, Philipp Stanner wrote:> An alternative version to [1], based on Tvrtko's suggestion from [2]. > > I tested this for Nouveau. Works. > > I'm having, however, bigger problems properly porting the unit tests and > have seen various explosions. In the process I noticed that some things > in the unit tests aren't right and a bit of a larger rework will be > necessary (for example, the timedout job callback must signal the > timedout fence, remove it from the list and so on).General approach I follow when implementing any mock component is to implement only as much is needed for a test to pass. Only add more and rework when a test/functionality is added which requires it. Specifically for timedout callback signaling I see that I had exactly that added in the patch you linked as [2]. > Anyways. Please comment on the general idea. I am obviously okay with it. :) Especially now that you verified it works well for nouveau. What I am not that ecstatic about is only getting the Suggested-by credit in 1/6. Given it is basically my patch with some cosmetic changes like the kernel doc and the cancel loop extracted to a helper.> @Tvrtko: As briefly brainstormed about on IRC, if you'd be willing to > take care of the unit tests patch, I could remove that one (and, > maaaaybe, the warning print patch) from the series and we could merge > this RFC's successor version %N once it's ready. What do you think?Okay in principle but the first thing I would suggest you could try is to take my unit tests adaptations from [2] verbatim. Benefit of keeping everything in one series is more confidence we are merging a solid thing. But I can take it on myself as a follow up too if you want. Regards, Tvrtko> > P. > > [1] https://lore.kernel.org/dri-devel/20250522082742.148191-2-phasta at kernel.org/ > [2] https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin at igalia.com/ > > Philipp Stanner (6): > drm/sched: Avoid memory leaks with cancel_job() callback > drm/sched/tests: Implement cancel_job() > drm/sched: Warn if pending list is not empty > drm/nouveau: Make fence container helper usable driver-wide > drm/nouveau: Add new callback for scheduler teardown > drm/nouveau: Remove waitque for sched teardown > > drivers/gpu/drm/nouveau/nouveau_fence.c | 35 +++++---- > drivers/gpu/drm/nouveau/nouveau_fence.h | 7 ++ > drivers/gpu/drm/nouveau/nouveau_sched.c | 35 +++++---- > drivers/gpu/drm/nouveau/nouveau_sched.h | 9 +-- > drivers/gpu/drm/nouveau/nouveau_uvmm.c | 8 +-- > drivers/gpu/drm/scheduler/sched_main.c | 37 ++++++---- > .../gpu/drm/scheduler/tests/mock_scheduler.c | 71 +++++++------------ > drivers/gpu/drm/scheduler/tests/sched_tests.h | 4 +- > include/drm/gpu_scheduler.h | 9 +++ > 9 files changed, 115 insertions(+), 100 deletions(-) >
Philipp Stanner
2025-Jun-03 13:23 UTC
[RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
On Tue, 2025-06-03 at 13:27 +0100, Tvrtko Ursulin wrote:> > On 03/06/2025 10:31, Philipp Stanner wrote: > > An alternative version to [1], based on Tvrtko's suggestion from > > [2]. > > > > I tested this for Nouveau. Works. > > > > I'm having, however, bigger problems properly porting the unit > > tests and > > have seen various explosions. In the process I noticed that some > > things > > in the unit tests aren't right and a bit of a larger rework will be > > necessary (for example, the timedout job callback must signal the > > timedout fence, remove it from the list and so on). > > General approach I follow when implementing any mock component is to > implement only as much is needed for a test to pass. Only add more > and > rework when a test/functionality is added which requires it. > > Specifically for timedout callback signaling I see that I had exactly > that added in the patch you linked as [2]. > ? > Anyways. Please comment on the general idea. > > I am obviously okay with it. :) Especially now that you verified it > works well for nouveau. > > What I am not that ecstatic about is only getting the Suggested-by > credit in 1/6. Given it is basically my patch with some cosmetic > changes > like the kernel doc and the cancel loop extracted to a helper.Sign the patch off and I give you the authorship if you want.> > > @Tvrtko: As briefly brainstormed about on IRC, if you'd be willing > > to > > take care of the unit tests patch, I could remove that one (and, > > maaaaybe, the warning print patch) from the series and we could > > merge > > this RFC's successor version %N once it's ready. What do you think? > > Okay in principle but the first thing I would suggest you could try > is > to take my unit tests adaptations from [2] verbatim. Benefit of > keeping > everything in one series is more confidence we are merging a solid > thing. But I can take it on myself as a follow up too if you want. > > Regards, > > Tvrtko > > > > > P. > > > > [1] > > https://lore.kernel.org/dri-devel/20250522082742.148191-2-phasta at kernel.org/ > > [2] > > https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin at igalia.com/ > > > > Philipp Stanner (6): > > ?? drm/sched: Avoid memory leaks with cancel_job() callback > > ?? drm/sched/tests: Implement cancel_job() > > ?? drm/sched: Warn if pending list is not empty > > ?? drm/nouveau: Make fence container helper usable driver-wide > > ?? drm/nouveau: Add new callback for scheduler teardown > > ?? drm/nouveau: Remove waitque for sched teardown > > > > ? drivers/gpu/drm/nouveau/nouveau_fence.c?????? | 35 +++++---- > > ? drivers/gpu/drm/nouveau/nouveau_fence.h?????? |? 7 ++ > > ? drivers/gpu/drm/nouveau/nouveau_sched.c?????? | 35 +++++---- > > ? drivers/gpu/drm/nouveau/nouveau_sched.h?????? |? 9 +-- > > ? drivers/gpu/drm/nouveau/nouveau_uvmm.c??????? |? 8 +-- > > ? drivers/gpu/drm/scheduler/sched_main.c??????? | 37 ++++++---- > > ? .../gpu/drm/scheduler/tests/mock_scheduler.c? | 71 +++++++------- > > ----- > > ? drivers/gpu/drm/scheduler/tests/sched_tests.h |? 4 +- > > ? include/drm/gpu_scheduler.h?????????????????? |? 9 +++ > > ? 9 files changed, 115 insertions(+), 100 deletions(-) > > >
Danilo Krummrich
2025-Jun-11 21:21 UTC
[RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
> On Tue, 2025-06-03 at 13:27 +0100, Tvrtko Ursulin wrote: > > On 03/06/2025 10:31, Philipp Stanner wrote: > > What I am not that ecstatic about is only getting the Suggested-by > > credit in 1/6. Given it is basically my patch with some cosmetic > > changes > > like the kernel doc and the cancel loop extracted to a helper. > > Sign the patch off and I give you the authorship if you want.AFAICS, the proposal of having cancel_job() has been a review comment which has been clarified with a reference patch. IMO, the fact that after some discussion Philipp decided to go with this suggestion and implement the suggestion in his patch series does not result in an obligation for him to hand over authorship of the patch he wrote to the person who suggested the change in the context of the code review. Anyways, it seems that Philipp did offer it however, so this seems to be resolved?