Sven Joachim
2021-Nov-03 20:47 UTC
[Nouveau] [PATCH 5.10 32/77] drm/ttm: fix memleak in ttm_transfered_destroy
On 2021-11-03 21:32 +0100, Karol Herbst wrote:

> On Wed, Nov 3, 2021 at 9:29 PM Karol Herbst <kherbst at redhat.com> wrote:
>>
>> On Wed, Nov 3, 2021 at 8:52 PM Sven Joachim <svenjoac at gmx.de> wrote:
>> >
>> > On 2021-11-01 10:17 +0100, Greg Kroah-Hartman wrote:
>> >
>> > > From: Christian König <christian.koenig at amd.com>
>> > >
>> > > commit 0db55f9a1bafbe3dac750ea669de9134922389b5 upstream.
>> > >
>> > > We need to cleanup the fences for ghost objects as well.
>> > >
>> > > Signed-off-by: Christian König <christian.koenig at amd.com>
>> > > Reported-by: Erhard F. <erhard_f at mailbox.org>
>> > > Tested-by: Erhard F. <erhard_f at mailbox.org>
>> > > Reviewed-by: Huang Rui <ray.huang at amd.com>
>> > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214029
>> > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214447
>> > > CC: <stable at vger.kernel.org>
>> > > Link: https://patchwork.freedesktop.org/patch/msgid/20211020173211.2247-1-christian.koenig at amd.com
>> > > Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
>> > > ---
>> > >  drivers/gpu/drm/ttm/ttm_bo_util.c | 1 +
>> > >  1 file changed, 1 insertion(+)
>> > >
>> > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> > > @@ -322,6 +322,7 @@ static void ttm_transfered_destroy(struc
>> > >      struct ttm_transfer_obj *fbo;
>> > >
>> > >      fbo = container_of(bo, struct ttm_transfer_obj, base);
>> > > +    dma_resv_fini(&fbo->base.base._resv);
>> > >      ttm_bo_put(fbo->bo);
>> > >      kfree(fbo);
>> > >  }
>> >
>> > Alas, this innocuous looking commit causes one of my systems to lock up
>> > as soon as I run startx. This happens with the nouveau driver, two other
>> > systems with radeon and intel graphics are not affected. Also I only
>> > noticed it in 5.10.77. Kernels 5.15 and 5.14.16 are not affected, and I
>> > do not use 5.4 anymore.
>> >
>> > I am not familiar with nouveau's ttm management and what has changed
>> > there between 5.10 and 5.14, but maybe one of their developers can shed
>> > a light on this.
>> >
>> > Cheers,
>> >        Sven
>> >
>>
>> could be related to 265ec0dd1a0d18f4114f62c0d4a794bb4e729bc1
>
> maybe not.. but I did remember there being a few ttm related patches
> which only hurt nouveau :/ I guess one could do a git bisect to
> figure out what change "fixes" it.

Maybe, but since the memory leaks reported by Erhard only started to
show up in 5.14 (if I read the bugzilla reports correctly), perhaps the
patch should simply be reverted on earlier kernels?

> On which GPU do you see this problem?

On an old GeForce 8500 GT, the whole PC is rather ancient.

Cheers,
       Sven
Karol Herbst
2021-Nov-03 21:25 UTC
[Nouveau] [PATCH 5.10 32/77] drm/ttm: fix memleak in ttm_transfered_destroy
On Wed, Nov 3, 2021 at 9:47 PM Sven Joachim <svenjoac at gmx.de> wrote:
>
> On 2021-11-03 21:32 +0100, Karol Herbst wrote:
>
> > On Wed, Nov 3, 2021 at 9:29 PM Karol Herbst <kherbst at redhat.com> wrote:
> >>
> >> On Wed, Nov 3, 2021 at 8:52 PM Sven Joachim <svenjoac at gmx.de> wrote:
> >> >
> >> > On 2021-11-01 10:17 +0100, Greg Kroah-Hartman wrote:
> >> >
> >> > > From: Christian König <christian.koenig at amd.com>
> >> > >
> >> > > commit 0db55f9a1bafbe3dac750ea669de9134922389b5 upstream.
> >> > >
> >> > > We need to cleanup the fences for ghost objects as well.
> >> > >
> >> > > Signed-off-by: Christian König <christian.koenig at amd.com>
> >> > > Reported-by: Erhard F. <erhard_f at mailbox.org>
> >> > > Tested-by: Erhard F. <erhard_f at mailbox.org>
> >> > > Reviewed-by: Huang Rui <ray.huang at amd.com>
> >> > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214029
> >> > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214447
> >> > > CC: <stable at vger.kernel.org>
> >> > > Link: https://patchwork.freedesktop.org/patch/msgid/20211020173211.2247-1-christian.koenig at amd.com
> >> > > Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
> >> > > ---
> >> > >  drivers/gpu/drm/ttm/ttm_bo_util.c | 1 +
> >> > >  1 file changed, 1 insertion(+)
> >> > >
> >> > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> >> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> >> > > @@ -322,6 +322,7 @@ static void ttm_transfered_destroy(struc
> >> > >      struct ttm_transfer_obj *fbo;
> >> > >
> >> > >      fbo = container_of(bo, struct ttm_transfer_obj, base);
> >> > > +    dma_resv_fini(&fbo->base.base._resv);
> >> > >      ttm_bo_put(fbo->bo);
> >> > >      kfree(fbo);
> >> > >  }
> >> >
> >> > Alas, this innocuous looking commit causes one of my systems to lock up
> >> > as soon as I run startx. This happens with the nouveau driver, two other
> >> > systems with radeon and intel graphics are not affected. Also I only
> >> > noticed it in 5.10.77. Kernels 5.15 and 5.14.16 are not affected, and I
> >> > do not use 5.4 anymore.
> >> >
> >> > I am not familiar with nouveau's ttm management and what has changed
> >> > there between 5.10 and 5.14, but maybe one of their developers can shed
> >> > a light on this.
> >> >
> >> > Cheers,
> >> >        Sven
> >> >
> >>
> >> could be related to 265ec0dd1a0d18f4114f62c0d4a794bb4e729bc1
> >
> > maybe not.. but I did remember there being a few ttm related patches
> > which only hurt nouveau :/ I guess one could do a git bisect to
> > figure out what change "fixes" it.
>
> Maybe, but since the memory leaks reported by Erhard only started to
> show up in 5.14 (if I read the bugzilla reports correctly), perhaps the
> patch should simply be reverted on earlier kernels?
>

Yeah, I think this is probably the right approach.

> > On which GPU do you see this problem?
>
> On an old GeForce 8500 GT, the whole PC is rather ancient.
>
> Cheers,
>        Sven
>