Karol Herbst
2022-Aug-19 20:09 UTC
[Nouveau] [PATCH] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
It is a bit unclear to us why that's helping, but it does and unbreaks
suspend/resume on a lot of GPUs without any known drawbacks.

Cc: stable at vger.kernel.org # v5.15+
Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
Signed-off-by: Karol Herbst <kherbst at redhat.com>
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 35bb0bb3fe61..126b3c6e12f9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -822,6 +822,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
 		if (ret == 0) {
 			ret = nouveau_fence_new(chan, false, &fence);
 			if (ret == 0) {
+				/* TODO: figure out a better solution here
+				 *
+				 * wait on the fence here explicitly as going through
+				 * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
+				 *
+				 * Without this the operation can timeout and we'll fallback to a
+				 * software copy, which might take several minutes to finish.
+				 */
+				nouveau_fence_wait(fence, false, false);
 				ret = ttm_bo_move_accel_cleanup(bo,
 								&fence->base,
 								evict, false,
-- 
2.37.1
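As a rough illustration of the ordering this patch enforces (wait for the copy's fence first, then run the cleanup step), here is a small user-space model in C. It is only a sketch under loud assumptions: every name in it (mock_fence, mock_fence_wait, mock_move_cleanup, mock_bo_move) is hypothetical and is not the kernel's nouveau or TTM API, and it does not reproduce or explain the suspend/resume failure, whose cause is still unclear. It only shows that with an explicit wait the copy has finished by the time cleanup runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU fence; NOT struct nouveau_fence. */
struct mock_fence {
	int ticks_until_signaled;	/* simulated copy work still pending */
	bool signaled;			/* true once the simulated copy is done */
};

/* Hypothetical record of what the cleanup step observed. */
struct mock_move_result {
	bool copy_done_at_cleanup;
};

/* Advance the simulated GPU by one unit of work. */
static void mock_gpu_tick(struct mock_fence *f)
{
	if (f->ticks_until_signaled > 0)
		f->ticks_until_signaled--;
	if (f->ticks_until_signaled == 0)
		f->signaled = true;
}

/*
 * Wait until the fence signals or the tick budget runs out.
 * Returns 0 if the fence signaled, -1 on timeout.
 */
int mock_fence_wait(struct mock_fence *f, int max_ticks)
{
	assert(f != NULL);
	for (int i = 0; i < max_ticks && !f->signaled; i++)
		mock_gpu_tick(f);
	return f->signaled ? 0 : -1;
}

/* Hypothetical cleanup step: records whether the copy had finished. */
int mock_move_cleanup(const struct mock_fence *f, struct mock_move_result *res)
{
	assert(f != NULL && res != NULL);
	res->copy_done_at_cleanup = f->signaled;
	return 0;
}

/*
 * Model of the move path. With wait_first set, the fence is waited on
 * before cleanup and the wait's return value is ignored, mirroring the
 * shape of the patch; without it, cleanup runs immediately.
 */
int mock_bo_move(struct mock_fence *f, bool wait_first,
		 struct mock_move_result *res)
{
	if (wait_first)
		(void)mock_fence_wait(f, 1000);
	return mock_move_cleanup(f, res);
}
```

Calling mock_bo_move() on a fence with pending ticks and wait_first set to true leaves copy_done_at_cleanup true; with wait_first set to false it stays false. The model says nothing about why ttm_bo_move_accel_cleanup does not appear to wait in the real driver.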
Lyude Paul
2022-Aug-22 21:15 UTC
[Nouveau] [PATCH] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
Reviewed-by: Lyude Paul <lyude at redhat.com>

On Fri, 2022-08-19 at 22:09 +0200, Karol Herbst wrote:
> It is a bit unclear to us why that's helping, but it does and unbreaks
> suspend/resume on a lot of GPUs without any known drawbacks.
>
> Cc: stable at vger.kernel.org # v5.15+
> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 35bb0bb3fe61..126b3c6e12f9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -822,6 +822,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
>  		if (ret == 0) {
>  			ret = nouveau_fence_new(chan, false, &fence);
>  			if (ret == 0) {
> +				/* TODO: figure out a better solution here
> +				 *
> +				 * wait on the fence here explicitly as going through
> +				 * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
> +				 *
> +				 * Without this the operation can timeout and we'll fallback to a
> +				 * software copy, which might take several minutes to finish.
> +				 */
> +				nouveau_fence_wait(fence, false, false);
>  				ret = ttm_bo_move_accel_cleanup(bo,
>  								&fence->base,
>  								evict, false,

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat
Salvatore Bonaccorso
2022-Sep-20 10:42 UTC
[Nouveau] [PATCH] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
Hi,

On Fri, Aug 19, 2022 at 10:09:28PM +0200, Karol Herbst wrote:
> It is a bit unclear to us why that's helping, but it does and unbreaks
> suspend/resume on a lot of GPUs without any known drawbacks.
>
> Cc: stable at vger.kernel.org # v5.15+
> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 35bb0bb3fe61..126b3c6e12f9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -822,6 +822,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
>  		if (ret == 0) {
>  			ret = nouveau_fence_new(chan, false, &fence);
>  			if (ret == 0) {
> +				/* TODO: figure out a better solution here
> +				 *
> +				 * wait on the fence here explicitly as going through
> +				 * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
> +				 *
> +				 * Without this the operation can timeout and we'll fallback to a
> +				 * software copy, which might take several minutes to finish.
> +				 */
> +				nouveau_fence_wait(fence, false, false);
>  				ret = ttm_bo_move_accel_cleanup(bo,
>  								&fence->base,
>  								evict, false,
> --
> 2.37.1
>

Although this is marked for 5.15+ only, a Debian user also saw the
suspend issue on 5.10.y and confirmed that the commit fixes it in the
5.10.y series as well:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989705#69

Karol, Lyude, should it also be picked for 5.10.y?

Regards,
Salvatore
Computer Enthusiastic
2022-Nov-01 06:44 UTC
[Nouveau] [PATCH] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
Hello,

On Fri, 19 Aug 2022 at 22:09, Karol Herbst <kherbst at redhat.com> wrote:
>
> It is a bit unclear to us why that's helping, but it does and unbreaks
> suspend/resume on a lot of GPUs without any known drawbacks.
>
> Cc: stable at vger.kernel.org # v5.15+
> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 35bb0bb3fe61..126b3c6e12f9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -822,6 +822,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
>  		if (ret == 0) {
>  			ret = nouveau_fence_new(chan, false, &fence);
>  			if (ret == 0) {
> +				/* TODO: figure out a better solution here
> +				 *
> +				 * wait on the fence here explicitly as going through
> +				 * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
> +				 *
> +				 * Without this the operation can timeout and we'll fallback to a
> +				 * software copy, which might take several minutes to finish.
> +				 */
> +				nouveau_fence_wait(fence, false, false);
>  				ret = ttm_bo_move_accel_cleanup(bo,
>  								&fence->base,
>  								evict, false,
> --
> 2.37.1
>

Do you think the patch could land in kernel 5.10.x in the near future?
Is there anything I can do to help make that happen?

Thanks.
Computer Enthusiastic
2022-Nov-19 05:20 UTC
[Nouveau] [PATCH] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
Hello,

On Fri, 19 Aug 2022 at 22:09, Karol Herbst <kherbst at redhat.com> wrote:
>
> It is a bit unclear to us why that's helping, but it does and unbreaks
> suspend/resume on a lot of GPUs without any known drawbacks.
>
> Cc: stable at vger.kernel.org # v5.15+
> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 35bb0bb3fe61..126b3c6e12f9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -822,6 +822,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
>  		if (ret == 0) {
>  			ret = nouveau_fence_new(chan, false, &fence);
>  			if (ret == 0) {
> +				/* TODO: figure out a better solution here
> +				 *
> +				 * wait on the fence here explicitly as going through
> +				 * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
> +				 *
> +				 * Without this the operation can timeout and we'll fallback to a
> +				 * software copy, which might take several minutes to finish.
> +				 */
> +				nouveau_fence_wait(fence, false, false);
>  				ret = ttm_bo_move_accel_cleanup(bo,
>  								&fence->base,
>  								evict, false,
> --
> 2.37.1
>

Would it be possible to land this patch in the 5.10.x kernel series?
It is currently marked for kernel versions >= 5.15.x only.

Thanks.