Christian König
2021-Dec-09 10:23 UTC
[Nouveau] [PATCH] drm/nouveau: wait for the exclusive fence after the shared ones v2
Always waiting for the exclusive fence resulted in some performance
regressions. So try to wait for the shared fences first, then the
exclusive fence should always be signaled already.

v2: fix incorrectly placed "(", add some comment why we do this.

Signed-off-by: Christian König <christian.koenig at amd.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 28 +++++++++++++------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 05d0b3eb3690..0ae416aa76dc 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -353,15 +353,22 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
 
 		if (ret)
 			return ret;
-	}
 
-	fobj = dma_resv_shared_list(resv);
-	fence = dma_resv_excl_fence(resv);
+		fobj = NULL;
+	} else {
+		fobj = dma_resv_shared_list(resv);
+	}
 
-	if (fence) {
+	/* Waiting for the exclusive fence first causes performance regressions
+	 * under some circumstances. So manually wait for the shared ones first.
+	 */
+	for (i = 0; i < (fobj ? fobj->shared_count : 0) && !ret; ++i) {
 		struct nouveau_channel *prev = NULL;
 		bool must_wait = true;
 
+		fence = rcu_dereference_protected(fobj->shared[i],
+						dma_resv_held(resv));
+
 		f = nouveau_local_fence(fence, chan->drm);
 		if (f) {
 			rcu_read_lock();
@@ -373,20 +380,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
 
 		if (must_wait)
 			ret = dma_fence_wait(fence, intr);
-
-		return ret;
 	}
 
-	if (!exclusive || !fobj)
-		return ret;
-
-	for (i = 0; i < fobj->shared_count && !ret; ++i) {
+	fence = dma_resv_excl_fence(resv);
+	if (fence) {
 		struct nouveau_channel *prev = NULL;
 		bool must_wait = true;
 
-		fence = rcu_dereference_protected(fobj->shared[i],
-						dma_resv_held(resv));
-
 		f = nouveau_local_fence(fence, chan->drm);
 		if (f) {
 			rcu_read_lock();
@@ -398,6 +398,8 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
 
 		if (must_wait)
 			ret = dma_fence_wait(fence, intr);
+
+		return ret;
 	}
 
 	return ret;
-- 
2.25.1
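To make the ordering argument in the commit message easier to follow, here is a minimal user-space sketch of the idea. It is not the nouveau code and does not use the kernel's dma_resv or dma_fence APIs. All names (toy_fence, toy_fence_list, toy_fence_wait, toy_fence_sync) are hypothetical, and the dep pointer encodes an assumption of this model only: each shared fence completes no earlier than the exclusive fence it depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence: it only tracks completion. */
struct toy_fence {
	bool signaled;
	int waits;             /* how often a caller had to block on it */
	struct toy_fence *dep; /* model assumption: completes before we do */
};

/* Hypothetical stand-in for the shared fence list of a reservation object. */
struct toy_fence_list {
	size_t shared_count;
	struct toy_fence **shared;
};

/* Completing a fence implies everything it depends on has completed. */
static void toy_fence_complete(struct toy_fence *f)
{
	for (; f; f = f->dep)
		f->signaled = true;
}

/* "Blocks" (counted in waits) only if the fence is not signaled yet. */
static int toy_fence_wait(struct toy_fence *f)
{
	if (!f->signaled) {
		f->waits++;
		toy_fence_complete(f);
	}
	return 0;
}

/*
 * Same order as in the patch: shared fences first, exclusive fence last.
 * A NULL list means "skip the shared fences", as with fobj in the patch.
 * Stopping at the first error is a simplification made by this sketch.
 */
static int toy_fence_sync(struct toy_fence_list *fobj, struct toy_fence *excl)
{
	int ret = 0;
	size_t i;

	assert(!fobj || fobj->shared_count == 0 || fobj->shared);

	for (i = 0; i < (fobj ? fobj->shared_count : 0) && !ret; ++i)
		ret = toy_fence_wait(fobj->shared[i]);

	if (!ret && excl)
		ret = toy_fence_wait(excl);

	return ret;
}
```

Under the stated assumption, calling toy_fence_sync() with two shared fences whose dep points at the exclusive fence blocks once on each shared fence and never on the exclusive one, which is the behaviour the commit message hopes for. With a NULL list, the only wait is on the exclusive fence.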
Thorsten Leemhuis
2021-Dec-10 09:06 UTC
[Nouveau] [PATCH] drm/nouveau: wait for the exclusive fence after the shared ones v2
Hi, this is your Linux kernel regression tracker speaking.

On 09.12.21 11:23, Christian König wrote:
> Always waiting for the exclusive fence resulted in some performance
> regressions. So try to wait for the shared fences first, then the
> exclusive fence should always be signaled already.
>
> v2: fix incorrectly placed "(", add some comment why we do this.
>
> Signed-off-by: Christian König <christian.koenig at amd.com>

FWIW: In case you need to send an improved patch, could you please add
this (see (¹) below for the reasoning):

Link: https://lore.kernel.org/dri-devel/da142fb9-07d7-24fe-4533-0247b8d16cdd@sfritsch.de/

And if the patch is already good to go: could the subsystem maintainer
please add it when applying? See (¹) for the reasoning.

BTW, these two lines afaics are missing as well:

Fixes: 3e1ad79bf661 ("drm/nouveau: always wait for the exclusive fence")
Reported-by: Stefan Fritsch <sf at sfritsch.de>

Ciao, Thorsten

(¹) Long story: The commit message would benefit from a link to the
regression report, for reasons explained in
Documentation/process/submitting-patches.rst. To quote:

```
If related discussions or any other background information behind the
change can be found on the web, add 'Link:' tags pointing to it. In case
your patch fixes a bug, for example, add a tag with a URL referencing
the report in the mailing list archives or a bug tracker;
```

This concept is old, but the text was reworked recently to make this use
case for the Link: tag clearer. For details see:
https://git.kernel.org/linus/1f57bd42b77c

Yes, that "Link:" is not really crucial; but it's good to have if
someone needs to look into the backstory of this change sometime in the
future. But I care for a different reason. I'm tracking this regression
(and others) with regzbot, my Linux kernel regression tracking bot.
This bot will notice if a patch with a Link: tag to a tracked regression
gets posted and record that, which allows anyone looking into the
regression to quickly grasp the current status from regzbot's webui
(https://linux-regtracking.leemhuis.info/regzbot ) or its reports. The
bot will also notice if a commit with a Link: tag to a regression report
is applied by Linus and then automatically mark the regression as
resolved. IOW: this tag makes my life as a regression tracker a lot
easier, as I otherwise have to tell regzbot manually when the fix
lands. :-/

#regzbot ^backmonitor: https://lore.kernel.org/dri-devel/da142fb9-07d7-24fe-4533-0247b8d16cdd@sfritsch.de/
Stefan Fritsch
2021-Dec-11 09:59 UTC
[Nouveau] [PATCH] drm/nouveau: wait for the exclusive fence after the shared ones v2
On 09.12.21 11:23, Christian König wrote:
> Always waiting for the exclusive fence resulted in some performance
> regressions. So try to wait for the shared fences first, then the
> exclusive fence should always be signaled already.
>
> v2: fix incorrectly placed "(", add some comment why we do this.
>
> Signed-off-by: Christian König <christian.koenig at amd.com>

Tested-by: Stefan Fritsch <sf at sfritsch.de>

Please also add a cc for linux-stable, so that this is fixed in 5.15.x

Cheers,
Stefan

> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 28 +++++++++++++------------
>  1 file changed, 15 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index 05d0b3eb3690..0ae416aa76dc 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -353,15 +353,22 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
>
>  		if (ret)
>  			return ret;
> -	}
>
> -	fobj = dma_resv_shared_list(resv);
> -	fence = dma_resv_excl_fence(resv);
> +		fobj = NULL;
> +	} else {
> +		fobj = dma_resv_shared_list(resv);
> +	}
>
> -	if (fence) {
> +	/* Waiting for the exclusive fence first causes performance regressions
> +	 * under some circumstances. So manually wait for the shared ones first.
> +	 */
> +	for (i = 0; i < (fobj ? fobj->shared_count : 0) && !ret; ++i) {
>  		struct nouveau_channel *prev = NULL;
>  		bool must_wait = true;
>
> +		fence = rcu_dereference_protected(fobj->shared[i],
> +						dma_resv_held(resv));
> +
>  		f = nouveau_local_fence(fence, chan->drm);
>  		if (f) {
>  			rcu_read_lock();
> @@ -373,20 +380,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
>
>  		if (must_wait)
>  			ret = dma_fence_wait(fence, intr);
> -
> -		return ret;
>  	}
>
> -	if (!exclusive || !fobj)
> -		return ret;
> -
> -	for (i = 0; i < fobj->shared_count && !ret; ++i) {
> +	fence = dma_resv_excl_fence(resv);
> +	if (fence) {
>  		struct nouveau_channel *prev = NULL;
>  		bool must_wait = true;
>
> -		fence = rcu_dereference_protected(fobj->shared[i],
> -						dma_resv_held(resv));
> -
>  		f = nouveau_local_fence(fence, chan->drm);
>  		if (f) {
>  			rcu_read_lock();
> @@ -398,6 +398,8 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
>
>  		if (must_wait)
>  			ret = dma_fence_wait(fence, intr);
> +
> +		return ret;
>  	}
>
>  	return ret;