Salvatore Bonaccorso
2022-Sep-20 10:42 UTC
[Nouveau] [PATCH] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
Hi,

On Fri, Aug 19, 2022 at 10:09:28PM +0200, Karol Herbst wrote:
> It is a bit unclear to us why that's helping, but it does and unbreaks
> suspend/resume on a lot of GPUs without any known drawbacks.
>
> Cc: stable at vger.kernel.org # v5.15+
> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 35bb0bb3fe61..126b3c6e12f9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -822,6 +822,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
>                 if (ret == 0) {
>                         ret = nouveau_fence_new(chan, false, &fence);
>                         if (ret == 0) {
> +                               /* TODO: figure out a better solution here
> +                                *
> +                                * wait on the fence here explicitly as going through
> +                                * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
> +                                *
> +                                * Without this the operation can timeout and we'll fallback to a
> +                                * software copy, which might take several minutes to finish.
> +                                */
> +                               nouveau_fence_wait(fence, false, false);
>                                 ret = ttm_bo_move_accel_cleanup(bo,
>                                                                 &fence->base,
>                                                                 evict, false,
> --
> 2.37.1
>

While this is marked for 5.15+ only, a user in Debian was seeing the
suspend issue as well on 5.10.y and did confirm the commit fixes the
issue as well in the 5.10.y series:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989705#69

Karol, Lyude, should that as well be picked for 5.10.y?

Regards,
Salvatore
Karol Herbst
2022-Sep-20 11:36 UTC
[Nouveau] [PATCH] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
On Tue, Sep 20, 2022 at 12:42 PM Salvatore Bonaccorso <carnil at debian.org> wrote:
>
> Hi,
>
> On Fri, Aug 19, 2022 at 10:09:28PM +0200, Karol Herbst wrote:
> > It is a bit unclear to us why that's helping, but it does and unbreaks
> > suspend/resume on a lot of GPUs without any known drawbacks.
> >
> > Cc: stable at vger.kernel.org # v5.15+
> > Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
> > Signed-off-by: Karol Herbst <kherbst at redhat.com>
> > ---
> >  drivers/gpu/drm/nouveau/nouveau_bo.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> > index 35bb0bb3fe61..126b3c6e12f9 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> > @@ -822,6 +822,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
> >                 if (ret == 0) {
> >                         ret = nouveau_fence_new(chan, false, &fence);
> >                         if (ret == 0) {
> > +                               /* TODO: figure out a better solution here
> > +                                *
> > +                                * wait on the fence here explicitly as going through
> > +                                * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
> > +                                *
> > +                                * Without this the operation can timeout and we'll fallback to a
> > +                                * software copy, which might take several minutes to finish.
> > +                                */
> > +                               nouveau_fence_wait(fence, false, false);
> >                                 ret = ttm_bo_move_accel_cleanup(bo,
> >                                                                 &fence->base,
> >                                                                 evict, false,
> > --
> > 2.37.1
> >
>
> While this is marked for 5.15+ only, a user in Debian was seeing the
> suspend issue as well on 5.10.y and did confirm the commit fixes the
> issue as well in the 5.10.y series:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989705#69
>
> Karol, Lyude, should that as well be picked for 5.10.y?
>

Mhh, from the original report 5.10 was fine, but maybe something got
backported and broke it?

I'll try to do some testing on my machine and see what I can figure
out, but it could also be a Debian-only issue at this point.

> Regards,
> Salvatore
>
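
For readers following the thread, the ordering the patch enforces (wait on the fence first, then run the move cleanup) can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: every name here (toy_fence, toy_fence_poll, toy_fence_wait, toy_move_cleanup, toy_move) is hypothetical, nothing in it is nouveau or TTM code, and the "cleanup falls back when the fence is unsignaled" rule is a simplification of the timeout described in the patch comment, not a description of how ttm_bo_move_accel_cleanup actually behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GPU fence: signaled once the copy is done. */
struct toy_fence {
	int pending_steps;	/* simulated copy work still outstanding */
	bool signaled;
};

/* Simulate one unit of hardware progress on the copy. */
static void toy_fence_poll(struct toy_fence *f)
{
	if (f->pending_steps > 0)
		f->pending_steps--;
	if (f->pending_steps == 0)
		f->signaled = true;
}

/* Explicit wait: poll until signaled or max_polls is exhausted. */
static int toy_fence_wait(struct toy_fence *f, int max_polls)
{
	assert(max_polls >= 0);
	for (int i = 0; i < max_polls; i++) {
		if (f->signaled)
			return 0;
		toy_fence_poll(f);
	}
	return f->signaled ? 0 : -1;
}

/*
 * Toy cleanup step. ASSUMPTION: an unsignaled fence here stands in for
 * the timeout the patch comment mentions, so the caller falls back to
 * the slow software copy.
 */
static int toy_move_cleanup(const struct toy_fence *f)
{
	return f->signaled ? 0 : -1;
}

/* Returns 0 if the accelerated path completed, 1 if it fell back. */
int toy_move(int copy_steps, bool wait_first)
{
	struct toy_fence f = {
		.pending_steps = copy_steps,
		.signaled = (copy_steps == 0),
	};

	if (wait_first)
		(void)toy_fence_wait(&f, 10);	/* mirrors the added explicit wait */

	return toy_move_cleanup(&f) == 0 ? 0 : 1;
}
```

In this model, toy_move(3, true) completes on the accelerated path, while toy_move(3, false) reaches the cleanup with the fence still unsignaled and falls back. It says nothing about why the real cleanup path does not wait, which the patch itself leaves as an open TODO.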