thr3ads.net - Nouveau - [PATCH] nouveau/fence: handle cross device fences properly. (v3) [Jan 2025]

If this information is useful, please help other people find it:
Share via:

Dave Airlie

2025-Jan-09 00:55 UTC

[PATCH] nouveau/fence: handle cross device fences properly. (v3)

From: Dave Airlie <airlied at redhat.com>

This is the 3rd iteration of this after talking to Ben and
Danilo, I think this makes sense now.

The fence sync logic doesn't handle a fence sync across devices
as it tries to write to a channel offset from one device into
the fence bo from a different device, which won't work so well.

This patch fixes that to avoid using the sync path in the case
where the fences come from different nouveau drm devices.

This works fine on a single device as the fence bo is shared
across the devices, and mapped into each channels vma space,
the channel offsets are therefore okay to pass between sides,
so one channel can sync on the seqnos from the other by using
the offset into it's vma.

Signed-off-by: Dave Airlie <airlied at redhat.com>
Cc: stable at vger.kernel.org
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index ee5e9d40c166..a3eb1f447a29 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -367,11 +367,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct
nouveau_channel *chan,
 			if (f) {
 				struct nouveau_channel *prev;
 				bool must_wait = true;
+				bool local;
 
 				rcu_read_lock();
 				prev = rcu_dereference(f->channel);
-				if (prev && (prev == chan ||
-					     fctx->sync(f, prev, chan) == 0))
+				local = prev && prev->drm == chan->drm;
+				if (local && (prev == chan ||
+					      fctx->sync(f, prev, chan) == 0))
 					must_wait = false;
 				rcu_read_unlock();
 				if (!must_wait)
-- 
2.43.0

Ben Skeggs

2025-Jan-09 03:42 UTC

head link

[PATCH] nouveau/fence: handle cross device fences properly. (v3)

On 9/1/25 10:55, Dave Airlie wrote:
> From: Dave Airlie <airlied at redhat.com>
>
> This is the 3rd iteration of this after talking to Ben and
> Danilo, I think this makes sense now.
>
> The fence sync logic doesn't handle a fence sync across devices
> as it tries to write to a channel offset from one device into
> the fence bo from a different device, which won't work so well.
>
> This patch fixes that to avoid using the sync path in the case
> where the fences come from different nouveau drm devices.
>
> This works fine on a single device as the fence bo is shared
> across the devices, and mapped into each channels vma space,
> the channel offsets are therefore okay to pass between sides,
> so one channel can sync on the seqnos from the other by using
> the offset into it's vma.
>
> Signed-off-by: Dave Airlie <airlied at redhat.com>
Signed-off-by: Ben Skeggs <bskeggs at nvidia.com>> Cc: stable at vger.kernel.org
> ---
>   drivers/gpu/drm/nouveau/nouveau_fence.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index ee5e9d40c166..a3eb1f447a29 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -367,11 +367,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct
nouveau_channel *chan,
>   			if (f) {
>   				struct nouveau_channel *prev;
>   				bool must_wait = true;
> +				bool local;
>   
>   				rcu_read_lock();
>   				prev = rcu_dereference(f->channel);
> -				if (prev && (prev == chan ||
> -					     fctx->sync(f, prev, chan) == 0))
> +				local = prev && prev->drm == chan->drm;
> +				if (local && (prev == chan ||
> +					      fctx->sync(f, prev, chan) == 0))
>   					must_wait = false;
>   				rcu_read_unlock();
>   				if (!must_wait)

Danilo Krummrich

2025-Jan-09 16:58 UTC

head link

[PATCH] nouveau/fence: handle cross device fences properly. (v3)

On Thu, Jan 09, 2025 at 10:55:53AM +1000, Dave Airlie
wrote:> From: Dave Airlie <airlied at redhat.com>
> 
> This is the 3rd iteration of this after talking to Ben and
> Danilo, I think this makes sense now.
> 
> The fence sync logic doesn't handle a fence sync across devices
> as it tries to write to a channel offset from one device into
> the fence bo from a different device, which won't work so well.
> 
> This patch fixes that to avoid using the sync path in the case
> where the fences come from different nouveau drm devices.
> 
> This works fine on a single device as the fence bo is shared
> across the devices, and mapped into each channels vma space,
> the channel offsets are therefore okay to pass between sides,
> so one channel can sync on the seqnos from the other by using
> the offset into it's vma.
Huh, they indeed all share and map drm->fence->bo, good catch.
> 
> Signed-off-by: Dave Airlie <airlied at redhat.com>
> Cc: stable at vger.kernel.org
> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index ee5e9d40c166..a3eb1f447a29 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -367,11 +367,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct
nouveau_channel *chan,
>  			if (f) {
>  				struct nouveau_channel *prev;
>  				bool must_wait = true;
> +				bool local;
>  
>  				rcu_read_lock();
>  				prev = rcu_dereference(f->channel);
> -				if (prev && (prev == chan ||
> -					     fctx->sync(f, prev, chan) == 0))
> +				local = prev && prev->drm == chan->drm;
struct nouveau_channel has no pointer to a struct nouveau_drm, this should be
prev->cli->drm and chan->cli->drm instead.

No need to resend, I can fix it when applying the patch if you want.
> +				if (local && (prev == chan ||
> +					      fctx->sync(f, prev, chan) == 0))
>  					must_wait = false;
>  				rcu_read_unlock();
>  				if (!must_wait)
> -- 
> 2.43.0
>

kernel test robot

2025-Jan-10 02:28 UTC

head link

[PATCH] nouveau/fence: handle cross device fences properly. (v3)

Hi Dave,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on drm-misc/drm-misc-next drm-tip/drm-tip v6.13-rc6
next-20250109]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:   
https://github.com/intel-lab-lkp/linux/commits/Dave-Airlie/nouveau-fence-handle-cross-device-fences-properly-v3/20250109-085805
base:   linus/master
patch link:   
https://lore.kernel.org/r/20250109005553.623947-1-airlied%40gmail.com
patch subject: [PATCH] nouveau/fence: handle cross device fences properly. (v3)
config: loongarch-randconfig-002-20250110
(https://download.01.org/0day-ci/archive/20250110/202501101033.wlEjeZwK-lkp at
intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build):
(https://download.01.org/0day-ci/archive/20250110/202501101033.wlEjeZwK-lkp at
intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp at intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501101033.wlEjeZwK-lkp at
intel.com/

All errors (new ones prefixed by >>):

   drivers/gpu/drm/nouveau/nouveau_fence.c: In function
'nouveau_fence_sync':>> drivers/gpu/drm/nouveau/nouveau_fence.c:394:53: error: 'struct
nouveau_channel' has no member named 'drm'     394 |                                 local = prev && prev->drm
== chan->drm;
         |                                                     ^~
   drivers/gpu/drm/nouveau/nouveau_fence.c:394:66: error: 'struct
nouveau_channel' has no member named 'drm'
     394 |                                 local = prev && prev->drm
== chan->drm;
         |                                                                  ^~


vim +394 drivers/gpu/drm/nouveau/nouveau_fence.c

   356	
   357	int
   358	nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
   359			   bool exclusive, bool intr)
   360	{
   361		struct nouveau_fence_chan *fctx = chan->fence;
   362		struct dma_resv *resv = nvbo->bo.base.resv;
   363		int i, ret;
   364	
   365		ret = dma_resv_reserve_fences(resv, 1);
   366		if (ret)
   367			return ret;
   368	
   369		/* Waiting for the writes first causes performance regressions
   370		 * under some circumstances. So manually wait for the reads first.
   371		 */
   372		for (i = 0; i < 2; ++i) {
   373			struct dma_resv_iter cursor;
   374			struct dma_fence *fence;
   375	
   376			dma_resv_for_each_fence(&cursor, resv,
   377						dma_resv_usage_rw(exclusive),
   378						fence) {
   379				enum dma_resv_usage usage;
   380				struct nouveau_fence *f;
   381	
   382				usage = dma_resv_iter_usage(&cursor);
   383				if (i == 0 && usage == DMA_RESV_USAGE_WRITE)
   384					continue;
   385	
   386				f = nouveau_local_fence(fence, chan->cli->drm);
   387				if (f) {
   388					struct nouveau_channel *prev;
   389					bool must_wait = true;
   390					bool local;
   391	
   392					rcu_read_lock();
   393					prev = rcu_dereference(f->channel);
 > 394					local = prev && prev->drm == chan->drm;
   395					if (local && (prev == chan ||
   396						      fctx->sync(f, prev, chan) == 0))
   397						must_wait = false;
   398					rcu_read_unlock();
   399					if (!must_wait)
   400						continue;
   401				}
   402	
   403				ret = dma_fence_wait(fence, intr);
   404				if (ret)
   405					return ret;
   406			}
   407		}
   408	
   409		return 0;
   410	}
   411	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Nouveau - Jan 2025 - [PATCH] nouveau/fence: handle cross device fences properly. (v3)

[PATCH] nouveau/fence: handle cross device fences properly. (v3)

[PATCH] nouveau/fence: handle cross device fences properly. (v3)

[PATCH] nouveau/fence: handle cross device fences properly. (v3)

[PATCH] nouveau/fence: handle cross device fences properly. (v3)