Thierry Reding
2020-Aug-28  10:40 UTC
[Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
From: Thierry Reding <treding at nvidia.com> Hi, This series implements a new IOCTL to submit push buffers that can optionally return a sync FD or sync object to userspace. This is useful in cases where userspace wants to synchronize operations between the GPU and another driver (such as KMS for display). Among other things this allows extensions such as eglDupNativeFenceFDANDROID to be implemented. Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM sync objects to be passed rather than only sync FDs. It also allows any number of sync FDs/objects to be passed in or emitted. I think those are useful features, but I left them in a separate patch in case everybody else thinks that this won't be needed. If we decide to merge the new ABI then patch 4 should be squashed into patch 3. The corresponding userspace changes can be found here: libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/ mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/ I've verified that this works with kmscube's --atomic mode and Weston. Thierry Thierry Reding (6): drm/nouveau: Split nouveau_fence_sync() drm/nouveau: Add nouveau_fence_ref() drm/nouveau: Support fence FDs at kickoff drm/nouveau: Support sync FDs and syncobjs drm/nouveau: Support DMA fence arrays drm/nouveau: Allow zero pushbuffer submits drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 +- drivers/gpu/drm/nouveau/nouveau_bo.c | 38 ++- drivers/gpu/drm/nouveau/nouveau_bo.h | 2 + drivers/gpu/drm/nouveau/nouveau_drm.c | 1 + drivers/gpu/drm/nouveau/nouveau_fence.c | 90 +++--- drivers/gpu/drm/nouveau/nouveau_fence.h | 3 +- drivers/gpu/drm/nouveau/nouveau_gem.c | 402 ++++++++++++++++++------ drivers/gpu/drm/nouveau/nouveau_gem.h | 2 + include/uapi/drm/nouveau_drm.h | 23 ++ 9 files changed, 410 insertions(+), 155 deletions(-) -- 2.28.0
Thierry Reding
2020-Aug-28  10:40 UTC
[Nouveau] [PATCH 1/6] drm/nouveau: Split nouveau_fence_sync()
From: Thierry Reding <treding at nvidia.com>
Turn nouveau_fence_sync() into a low-level helper that adds fence waits
to the channel command stream. The new nouveau_bo_sync() helper replaces
the previous nouveau_fence_sync() implementation. It passes each of the
buffer object's fences to nouveau_fence_sync() in turn.
This provides more fine-grained control over fences which is needed by
subsequent patches for sync fd support.
Heavily based on work by Lauri Peltonen <lpeltonen at nvidia.com>.
Signed-off-by: Thierry Reding <treding at nvidia.com>
---
 drivers/gpu/drm/nouveau/dispnv04/crtc.c |  4 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c    | 38 +++++++++++++-
 drivers/gpu/drm/nouveau/nouveau_bo.h    |  2 +
 drivers/gpu/drm/nouveau/nouveau_fence.c | 68 +++++--------------------
 drivers/gpu/drm/nouveau/nouveau_fence.h |  2 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c   |  2 +-
 6 files changed, 57 insertions(+), 59 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 6416b6907aeb..4af702d0d6bf 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -1117,7 +1117,7 @@ nv04_page_flip_emit(struct nouveau_channel *chan,
 	spin_unlock_irqrestore(&dev->event_lock, flags);
 
 	/* Synchronize with the old framebuffer */
-	ret = nouveau_fence_sync(old_bo, chan, false, false);
+	ret = nouveau_bo_sync(old_bo, chan, false, false);
 	if (ret)
 		goto fail;
 
@@ -1183,7 +1183,7 @@ nv04_crtc_page_flip(struct drm_crtc *crtc, struct
drm_framebuffer *fb,
 		goto fail_unpin;
 
 	/* synchronise rendering channel with the kernel's channel */
-	ret = nouveau_fence_sync(new_bo, chan, false, true);
+	ret = nouveau_bo_sync(new_bo, chan, false, true);
 	if (ret) {
 		ttm_bo_unreserve(&new_bo->bo);
 		goto fail_unpin;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 9140387f30dc..25ceabfa741c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -574,6 +574,42 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
 					PAGE_SIZE, DMA_FROM_DEVICE);
 }
 
+int
+nouveau_bo_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
+		bool exclusive, bool intr)
+{
+	struct dma_resv *resv = nvbo->bo.base.resv;
+	struct dma_resv_list *fobj;
+	struct dma_fence *fence;
+	int ret = 0, i;
+
+	if (!exclusive) {
+		ret = dma_resv_reserve_shared(resv, 1);
+		if (ret < 0)
+			return ret;
+	}
+
+	fobj = dma_resv_get_list(resv);
+	fence = dma_resv_get_excl(resv);
+
+	if (fence && (!exclusive || !fobj || !fobj->shared_count))
+		return nouveau_fence_sync(fence, chan, intr);
+
+	if (!exclusive || !fobj)
+		return ret;
+
+	for (i = 0; i < fobj->shared_count && !ret; ++i) {
+		fence = rcu_dereference_protected(fobj->shared[i],
+						  dma_resv_held(resv));
+
+		ret = nouveau_fence_sync(fence, chan, intr);
+		if (ret < 0)
+			break;
+	}
+
+	return ret;
+}
+
 int
 nouveau_bo_validate(struct nouveau_bo *nvbo, bool interruptible,
 		    bool no_wait_gpu)
@@ -717,7 +753,7 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int
evict, bool intr,
 	}
 
 	mutex_lock_nested(&cli->mutex, SINGLE_DEPTH_NESTING);
-	ret = nouveau_fence_sync(nouveau_bo(bo), chan, true, intr);
+	ret = nouveau_bo_sync(nouveau_bo(bo), chan, true, intr);
 	if (ret == 0) {
 		ret = drm->ttm.move(chan, bo, &bo->mem, new_reg);
 		if (ret == 0) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h
b/drivers/gpu/drm/nouveau/nouveau_bo.h
index aecb7481df0d..93d1706619a1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.h
@@ -96,6 +96,8 @@ int  nouveau_bo_validate(struct nouveau_bo *, bool
interruptible,
 			 bool no_wait_gpu);
 void nouveau_bo_sync_for_device(struct nouveau_bo *nvbo);
 void nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo);
+int nouveau_bo_sync(struct nouveau_bo *nvbo, struct nouveau_channel *channel,
+		    bool exclusive, bool intr);
 
 /* TODO: submit equivalent to TTM generic API upstream? */
 static inline void __iomem *
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index e5dcbf67de7e..8e7550553584 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -339,66 +339,26 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy,
bool intr)
 }
 
 int
-nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool
exclusive, bool intr)
+nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
+		   bool intr)
 {
 	struct nouveau_fence_chan *fctx = chan->fence;
-	struct dma_fence *fence;
-	struct dma_resv *resv = nvbo->bo.base.resv;
-	struct dma_resv_list *fobj;
+	struct nouveau_channel *prev = NULL;
 	struct nouveau_fence *f;
-	int ret = 0, i;
-
-	if (!exclusive) {
-		ret = dma_resv_reserve_shared(resv, 1);
+	bool must_wait = true;
+	int ret = 0;
 
-		if (ret)
-			return ret;
+	f = nouveau_local_fence(fence, chan->drm);
+	if (f) {
+		rcu_read_lock();
+		prev = rcu_dereference(f->channel);
+		if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
+			must_wait = false;
+		rcu_read_unlock();
 	}
 
-	fobj = dma_resv_get_list(resv);
-	fence = dma_resv_get_excl(resv);
-
-	if (fence && (!exclusive || !fobj || !fobj->shared_count)) {
-		struct nouveau_channel *prev = NULL;
-		bool must_wait = true;
-
-		f = nouveau_local_fence(fence, chan->drm);
-		if (f) {
-			rcu_read_lock();
-			prev = rcu_dereference(f->channel);
-			if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
-				must_wait = false;
-			rcu_read_unlock();
-		}
-
-		if (must_wait)
-			ret = dma_fence_wait(fence, intr);
-
-		return ret;
-	}
-
-	if (!exclusive || !fobj)
-		return ret;
-
-	for (i = 0; i < fobj->shared_count && !ret; ++i) {
-		struct nouveau_channel *prev = NULL;
-		bool must_wait = true;
-
-		fence = rcu_dereference_protected(fobj->shared[i],
-						dma_resv_held(resv));
-
-		f = nouveau_local_fence(fence, chan->drm);
-		if (f) {
-			rcu_read_lock();
-			prev = rcu_dereference(f->channel);
-			if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
-				must_wait = false;
-			rcu_read_unlock();
-		}
-
-		if (must_wait)
-			ret = dma_fence_wait(fence, intr);
-	}
+	if (must_wait)
+		ret = dma_fence_wait(fence, intr);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h
b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 4887caa69c65..76cbf0c27a30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -24,7 +24,7 @@ void nouveau_fence_unref(struct nouveau_fence **);
 int  nouveau_fence_emit(struct nouveau_fence *, struct nouveau_channel *);
 bool nouveau_fence_done(struct nouveau_fence *);
 int  nouveau_fence_wait(struct nouveau_fence *, bool lazy, bool intr);
-int  nouveau_fence_sync(struct nouveau_bo *, struct nouveau_channel *, bool
exclusive, bool intr);
+int  nouveau_fence_sync(struct dma_fence *, struct nouveau_channel *, bool
intr);
 
 struct nouveau_fence_chan {
 	spinlock_t lock;
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 81f111ad3f4f..590e4c1d2e8a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -513,7 +513,7 @@ validate_list(struct nouveau_channel *chan, struct
nouveau_cli *cli,
 			return ret;
 		}
 
-		ret = nouveau_fence_sync(nvbo, chan, !!b->write_domains, true);
+		ret = nouveau_bo_sync(nvbo, chan, !!b->write_domains, true);
 		if (unlikely(ret)) {
 			if (ret != -ERESTARTSYS)
 				NV_PRINTK(err, cli, "fail post-validate sync\n");
-- 
2.28.0
Thierry Reding
2020-Aug-28  10:40 UTC
[Nouveau] [PATCH 2/6] drm/nouveau: Add nouveau_fence_ref()
From: Thierry Reding <treding at nvidia.com>
This is a simple wrapper that increments the reference count of the
backing DMA fence.
Signed-off-by: Thierry Reding <treding at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 9 +++++++++
 drivers/gpu/drm/nouveau/nouveau_fence.h | 1 +
 2 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 8e7550553584..8530c2684832 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -363,6 +363,15 @@ nouveau_fence_sync(struct dma_fence *fence, struct
nouveau_channel *chan,
 	return ret;
 }
 
+struct nouveau_fence *
+nouveau_fence_ref(struct nouveau_fence *fence)
+{
+	if (fence)
+		dma_fence_get(&fence->base);
+
+	return fence;
+}
+
 void
 nouveau_fence_unref(struct nouveau_fence **pfence)
 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h
b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 76cbf0c27a30..b8afd4b06445 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -19,6 +19,7 @@ struct nouveau_fence {
 
 int  nouveau_fence_new(struct nouveau_channel *, bool sysmem,
 		       struct nouveau_fence **);
+struct nouveau_fence *nouveau_fence_ref(struct nouveau_fence *);
 void nouveau_fence_unref(struct nouveau_fence **);
 
 int  nouveau_fence_emit(struct nouveau_fence *, struct nouveau_channel *);
-- 
2.28.0
Thierry Reding
2020-Aug-28  10:40 UTC
[Nouveau] [PATCH 3/6] drm/nouveau: Support fence FDs at kickoff
From: Thierry Reding <treding at nvidia.com>
Add a new NOUVEAU_GEM_PUSHBUF2 IOCTL that accepts and emits a sync fence
FD from/to userspace if requested by the corresponding flags.
Based heavily on work by Lauri Peltonen <lpeltonen at nvidia.com>
Signed-off-by: Thierry Reding <treding at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c |  1 +
 drivers/gpu/drm/nouveau/nouveau_gem.c | 79 +++++++++++++++++++++++++--
 drivers/gpu/drm/nouveau/nouveau_gem.h |  2 +
 include/uapi/drm/nouveau_drm.h        | 14 +++++
 4 files changed, 92 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 22d246acc5e5..c9cb2648a28b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -1140,6 +1140,7 @@ nouveau_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_PREP, nouveau_gem_ioctl_cpu_prep,
DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_FINI, nouveau_gem_ioctl_cpu_fini,
DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_INFO, nouveau_gem_ioctl_info, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_PUSHBUF2, nouveau_gem_ioctl_pushbuf2,
DRM_RENDER_ALLOW),
 };
 
 long
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 590e4c1d2e8a..039f244c0a00 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -24,6 +24,9 @@
  *
  */
 
+#include <linux/file.h>
+#include <linux/sync_file.h>
+
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
 #include "nouveau_fence.h"
@@ -677,24 +680,30 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli,
 	return ret;
 }
 
-int
-nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
-			  struct drm_file *file_priv)
+static int
+__nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
+			    struct drm_nouveau_gem_pushbuf2 *request,
+			    struct drm_file *file_priv)
 {
 	struct nouveau_abi16 *abi16 = nouveau_abi16_get(file_priv);
 	struct nouveau_cli *cli = nouveau_cli(file_priv);
 	struct nouveau_abi16_chan *temp;
 	struct nouveau_drm *drm = nouveau_drm(dev);
-	struct drm_nouveau_gem_pushbuf *req = data;
+	struct drm_nouveau_gem_pushbuf *req = &request->base;
 	struct drm_nouveau_gem_pushbuf_push *push;
 	struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL;
 	struct drm_nouveau_gem_pushbuf_bo *bo;
 	struct nouveau_channel *chan = NULL;
 	struct validate_op op;
 	struct nouveau_fence *fence = NULL;
+	struct dma_fence *prefence = NULL;
 	int i, j, ret = 0;
 	bool do_reloc = false, sync = false;
 
+	/* check for unrecognized flags */
+	if (request->flags & ~NOUVEAU_GEM_PUSHBUF_FLAGS)
+		return -EINVAL;
+
 	if (unlikely(!abi16))
 		return -ENOMEM;
 
@@ -764,6 +773,15 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void
*data,
 		goto out_prevalid;
 	}
 
+	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_WAIT) {
+		prefence = sync_file_get_fence(request->fence);
+		if (prefence) {
+			ret = nouveau_fence_sync(prefence, chan, true);
+			if (ret < 0)
+				goto out;
+		}
+	}
+
 	/* Apply any relocations that are required */
 	if (do_reloc) {
 		if (!reloc) {
@@ -865,7 +883,30 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void
*data,
 		}
 	}
 
+	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_EMIT) {
+		struct sync_file *file;
+		int fd;
+
+		fd = get_unused_fd_flags(O_CLOEXEC);
+		if (fd < 0) {
+			ret = fd;
+			goto out;
+		}
+
+		file = sync_file_create(&fence->base);
+		if (!file) {
+			put_unused_fd(fd);
+			goto out;
+		}
+
+		fd_install(fd, file->file);
+		request->fence = fd;
+	}
+
 out:
+	if (prefence)
+		dma_fence_put(prefence);
+
 	validate_fini(&op, chan, fence, bo);
 	nouveau_fence_unref(&fence);
 
@@ -906,6 +947,27 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void
*data,
 	return nouveau_abi16_put(abi16, ret);
 }
 
+int
+nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
+			  struct drm_file *file_priv)
+{
+	struct drm_nouveau_gem_pushbuf *request = data;
+	struct drm_nouveau_gem_pushbuf2 req;
+	int ret;
+
+	memset(&req, 0, sizeof(req));
+	memcpy(&req.base, request, sizeof(*request));
+
+	ret = __nouveau_gem_ioctl_pushbuf(dev, &req, file_priv);
+
+	request->gart_available = req.base.gart_available;
+	request->vram_available = req.base.vram_available;
+	request->suffix1 = req.base.suffix1;
+	request->suffix0 = req.base.suffix0;
+
+	return ret;
+}
+
 int
 nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv)
@@ -973,3 +1035,12 @@ nouveau_gem_ioctl_info(struct drm_device *dev, void *data,
 	return ret;
 }
 
+int
+nouveau_gem_ioctl_pushbuf2(struct drm_device *dev, void *data,
+			   struct drm_file *file_priv)
+{
+	struct drm_nouveau_gem_pushbuf2 *req = data;
+
+	return __nouveau_gem_ioctl_pushbuf(dev, req, file_priv);
+}
+
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.h
b/drivers/gpu/drm/nouveau/nouveau_gem.h
index 978e07591990..652a63242303 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.h
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.h
@@ -29,6 +29,8 @@ extern int nouveau_gem_ioctl_cpu_fini(struct drm_device *,
void *,
 				      struct drm_file *);
 extern int nouveau_gem_ioctl_info(struct drm_device *, void *,
 				  struct drm_file *);
+extern int nouveau_gem_ioctl_pushbuf2(struct drm_device *, void *,
+				      struct drm_file *);
 
 extern int nouveau_gem_prime_pin(struct drm_gem_object *);
 extern void nouveau_gem_prime_unpin(struct drm_gem_object *);
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index 853a327433d3..85425dc90301 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ b/include/uapi/drm/nouveau_drm.h
@@ -115,6 +115,18 @@ struct drm_nouveau_gem_pushbuf {
 	__u64 gart_available;
 };
 
+#define NOUVEAU_GEM_PUSHBUF_FENCE_WAIT (1 << 0)
+#define NOUVEAU_GEM_PUSHBUF_FENCE_EMIT (1 << 1)
+#define NOUVEAU_GEM_PUSHBUF_FLAGS (NOUVEAU_GEM_PUSHBUF_FENCE_WAIT | \
+				   NOUVEAU_GEM_PUSHBUF_FENCE_EMIT)
+
+struct drm_nouveau_gem_pushbuf2 {
+	struct drm_nouveau_gem_pushbuf base;
+	__u32 flags;
+	__s32 fence;
+	__u64 reserved;
+};
+
 #define NOUVEAU_GEM_CPU_PREP_NOWAIT                                  0x00000001
 #define NOUVEAU_GEM_CPU_PREP_WRITE                                   0x00000004
 struct drm_nouveau_gem_cpu_prep {
@@ -141,6 +153,7 @@ struct drm_nouveau_gem_cpu_fini {
 #define DRM_NOUVEAU_GEM_CPU_PREP       0x42
 #define DRM_NOUVEAU_GEM_CPU_FINI       0x43
 #define DRM_NOUVEAU_GEM_INFO           0x44
+#define DRM_NOUVEAU_GEM_PUSHBUF2       0x45
 
 struct drm_nouveau_svm_init {
 	__u64 unmanaged_addr;
@@ -196,6 +209,7 @@ struct drm_nouveau_svm_bind {
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_PREP       DRM_IOW (DRM_COMMAND_BASE +
DRM_NOUVEAU_GEM_CPU_PREP, struct drm_nouveau_gem_cpu_prep)
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_FINI       DRM_IOW (DRM_COMMAND_BASE +
DRM_NOUVEAU_GEM_CPU_FINI, struct drm_nouveau_gem_cpu_fini)
 #define DRM_IOCTL_NOUVEAU_GEM_INFO           DRM_IOWR(DRM_COMMAND_BASE +
DRM_NOUVEAU_GEM_INFO, struct drm_nouveau_gem_info)
+#define DRM_IOCTL_NOUVEAU_GEM_PUSHBUF2       DRM_IOWR(DRM_COMMAND_BASE +
DRM_NOUVEAU_GEM_PUSHBUF2, struct drm_nouveau_gem_pushbuf2)
 
 #if defined(__cplusplus)
 }
-- 
2.28.0
Thierry Reding
2020-Aug-28  10:40 UTC
[Nouveau] [PATCH 4/6] drm/nouveau: Support sync FDs and syncobjs
From: Thierry Reding <treding at nvidia.com>
Extends the new NOUVEAU_GEM_PUSHBUF2 IOCTL to accept and emit one or
more sync FDs and/or DRM native sync objects.
Signed-off-by: Thierry Reding <treding at nvidia.com>
---
Note: If acceptable, this should be merged into the previous patch that
adds the new IOCTL.
 drivers/gpu/drm/nouveau/nouveau_gem.c | 180 ++++++++++++++++++++++----
 include/uapi/drm/nouveau_drm.h        |  21 ++-
 2 files changed, 167 insertions(+), 34 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 039f244c0a00..b3ece731e4e1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -26,6 +26,7 @@
 
 #include <linux/file.h>
 #include <linux/sync_file.h>
+#include <drm/drm_syncobj.h>
 
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
@@ -680,12 +681,137 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli,
 	return ret;
 }
 
+static int nouveau_channel_wait_fence(struct nouveau_channel *channel,
+				      struct drm_file *file_priv,
+				      struct drm_nouveau_gem_fence *f)
+{
+	struct dma_fence *fence;
+
+	if (f->flags & NOUVEAU_GEM_FENCE_FD) {
+		fence = sync_file_get_fence(f->handle);
+		if (!fence)
+			return -ENOENT;
+	} else {
+		struct drm_syncobj *syncobj;
+
+		syncobj = drm_syncobj_find(file_priv, f->handle);
+		if (!syncobj)
+			return -ENOENT;
+
+		fence = drm_syncobj_fence_get(syncobj);
+		drm_syncobj_put(syncobj);
+	}
+
+	return nouveau_fence_sync(fence, channel, true);
+}
+
+static int nouveau_channel_wait_fences(struct nouveau_channel *channel,
+				       struct drm_file *file_priv,
+				       struct drm_nouveau_gem_fence *fences,
+				       unsigned int num_fences)
+{
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < num_fences; i++) {
+		if (fences[i].flags & NOUVEAU_GEM_FENCE_WAIT) {
+			ret = nouveau_channel_wait_fence(channel, file_priv,
+							 &fences[i]);
+			if (ret < 0)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
+static struct nouveau_fence *
+nouveau_channel_emit_fence(struct nouveau_channel *channel,
+			   struct drm_file *file_priv,
+			   struct drm_nouveau_gem_fence *f)
+{
+	struct nouveau_fence *fence;
+	int ret;
+
+	ret = nouveau_fence_new(channel, false, &fence);
+	if (ret < 0)
+		return ERR_PTR(ret);
+
+	if (f->flags & NOUVEAU_GEM_FENCE_FD) {
+		struct sync_file *file;
+		int fd;
+
+		fd = get_unused_fd_flags(O_CLOEXEC);
+		if (fd < 0) {
+			ret = fd;
+			goto put;
+		}
+
+		file = sync_file_create(&fence->base);
+		if (!file) {
+			put_unused_fd(fd);
+			ret = -ENOMEM;
+			goto put;
+		}
+
+		fd_install(fd, file->file);
+		f->handle = fd;
+	} else {
+		struct drm_syncobj *syncobj;
+
+		ret = drm_syncobj_create(&syncobj, 0, &fence->base);
+		if (ret < 0)
+			goto put;
+
+		ret = drm_syncobj_get_handle(file_priv, syncobj, &f->handle);
+		drm_syncobj_put(syncobj);
+	}
+
+put:
+	nouveau_fence_unref(&fence);
+	return ERR_PTR(ret);
+}
+
+static struct nouveau_fence *
+nouveau_channel_emit_fences(struct nouveau_channel *channel,
+			    struct drm_file *file_priv,
+			    struct drm_nouveau_gem_fence *fences,
+			    unsigned int num_fences)
+{
+	struct nouveau_fence *fence = NULL, *f;
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < num_fences; i++) {
+		if (fences[i].flags & NOUVEAU_GEM_FENCE_EMIT) {
+			f = nouveau_channel_emit_fence(channel, file_priv,
+						        &fences[i]);
+			if (IS_ERR(f))
+				return f;
+
+			if (!fence)
+				fence = f;
+		}
+	}
+
+	if (!fence) {
+		ret = nouveau_fence_new(channel, false, &fence);
+		if (ret)
+			fence = ERR_PTR(ret);
+	} else {
+		nouveau_fence_ref(fence);
+	}
+
+	return fence;
+}
+
 static int
 __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 			    struct drm_nouveau_gem_pushbuf2 *request,
 			    struct drm_file *file_priv)
 {
 	struct nouveau_abi16 *abi16 = nouveau_abi16_get(file_priv);
+	struct drm_nouveau_gem_fence __user *user_fences;
 	struct nouveau_cli *cli = nouveau_cli(file_priv);
 	struct nouveau_abi16_chan *temp;
 	struct nouveau_drm *drm = nouveau_drm(dev);
@@ -693,12 +819,13 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	struct drm_nouveau_gem_pushbuf_push *push;
 	struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL;
 	struct drm_nouveau_gem_pushbuf_bo *bo;
+	struct drm_nouveau_gem_fence *fences = NULL;
 	struct nouveau_channel *chan = NULL;
 	struct validate_op op;
 	struct nouveau_fence *fence = NULL;
-	struct dma_fence *prefence = NULL;
 	int i, j, ret = 0;
 	bool do_reloc = false, sync = false;
+	size_t size;
 
 	/* check for unrecognized flags */
 	if (request->flags & ~NOUVEAU_GEM_PUSHBUF_FLAGS)
@@ -773,13 +900,18 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		goto out_prevalid;
 	}
 
-	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_WAIT) {
-		prefence = sync_file_get_fence(request->fence);
-		if (prefence) {
-			ret = nouveau_fence_sync(prefence, chan, true);
-			if (ret < 0)
-				goto out;
+	if (request->num_fences > 0) {
+		fences = u_memcpya(request->fences, request->num_fences,
+				   sizeof(*fences));
+		if (IS_ERR(fences)) {
+			ret = PTR_ERR(fences);
+			goto out;
 		}
+
+		ret = nouveau_channel_wait_fences(chan, file_priv, fences,
+						  request->num_fences);
+		if (ret < 0)
+			goto out;
 	}
 
 	/* Apply any relocations that are required */
@@ -869,10 +1001,13 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		}
 	}
 
-	ret = nouveau_fence_new(chan, false, &fence);
-	if (ret) {
+	fence = nouveau_channel_emit_fences(chan, file_priv, fences,
+					    request->num_fences);
+	if (IS_ERR(fence)) {
+		ret = PTR_ERR(fence);
 		NV_PRINTK(err, cli, "error fencing pushbuf: %d\n", ret);
 		WIND_RING(chan);
+		fence = NULL;
 		goto out;
 	}
 
@@ -883,29 +1018,18 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		}
 	}
 
-	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_EMIT) {
-		struct sync_file *file;
-		int fd;
-
-		fd = get_unused_fd_flags(O_CLOEXEC);
-		if (fd < 0) {
-			ret = fd;
-			goto out;
-		}
-
-		file = sync_file_create(&fence->base);
-		if (!file) {
-			put_unused_fd(fd);
-			goto out;
-		}
+	user_fences = u64_to_user_ptr(request->fences);
+	size = sizeof(*fences) * request->num_fences;
 
-		fd_install(fd, file->file);
-		request->fence = fd;
+	if (copy_to_user(user_fences, fences, size)) {
+		WIND_RING(chan);
+		ret = -EFAULT;
+		fence = NULL;
+		goto out;
 	}
 
 out:
-	if (prefence)
-		dma_fence_put(prefence);
+	u_free(fences);
 
 	validate_fini(&op, chan, fence, bo);
 	nouveau_fence_unref(&fence);
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index 85425dc90301..5b8d40228a1b 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ b/include/uapi/drm/nouveau_drm.h
@@ -115,16 +115,25 @@ struct drm_nouveau_gem_pushbuf {
 	__u64 gart_available;
 };
 
-#define NOUVEAU_GEM_PUSHBUF_FENCE_WAIT (1 << 0)
-#define NOUVEAU_GEM_PUSHBUF_FENCE_EMIT (1 << 1)
-#define NOUVEAU_GEM_PUSHBUF_FLAGS (NOUVEAU_GEM_PUSHBUF_FENCE_WAIT | \
-				   NOUVEAU_GEM_PUSHBUF_FENCE_EMIT)
+#define NOUVEAU_GEM_FENCE_WAIT	(1 << 0)
+#define NOUVEAU_GEM_FENCE_EMIT	(1 << 1)
+#define NOUVEAU_GEM_FENCE_FD	(1 << 2)
+#define NOUVEAU_GEM_FENCE_FLAGS	(NOUVEAU_GEM_FENCE_WAIT | \
+				 NOUVEAU_GEM_FENCE_EMIT | \
+				 NOUVEAU_GEM_FENCE_FD)
+
+struct drm_nouveau_gem_fence {
+	__u32 handle;
+	__u32 flags;
+};
+
+#define NOUVEAU_GEM_PUSHBUF_FLAGS	0
 
 struct drm_nouveau_gem_pushbuf2 {
 	struct drm_nouveau_gem_pushbuf base;
 	__u32 flags;
-	__s32 fence;
-	__u64 reserved;
+	__u32 num_fences;
+	__u64 fences;
 };
 
 #define NOUVEAU_GEM_CPU_PREP_NOWAIT                                  0x00000001
-- 
2.28.0
Thierry Reding
2020-Aug-28  10:40 UTC
[Nouveau] [PATCH 5/6] drm/nouveau: Support DMA fence arrays
From: Thierry Reding <treding at nvidia.com>
A DMA fence can be composed of multiple fences in an array. Support this
in the Nouveau driver by iteratively synchronizing to each DMA fence in
the array.
Signed-off-by: Thierry Reding <treding at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 31 ++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 8530c2684832..c0849e09279c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -24,6 +24,7 @@
  *
  */
 
+#include <linux/dma-fence-array.h>
 #include <linux/ktime.h>
 #include <linux/hrtimer.h>
 #include <linux/sched/signal.h>
@@ -338,9 +339,9 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy,
bool intr)
 		return 0;
 }
 
-int
-nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
-		   bool intr)
+static int
+__nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
+		     bool intr)
 {
 	struct nouveau_fence_chan *fctx = chan->fence;
 	struct nouveau_channel *prev = NULL;
@@ -363,6 +364,30 @@ nouveau_fence_sync(struct dma_fence *fence, struct
nouveau_channel *chan,
 	return ret;
 }
 
+int
+nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
+		   bool intr)
+{
+	int ret = 0;
+
+	if (dma_fence_is_array(fence)) {
+		struct dma_fence_array *array = to_dma_fence_array(fence);
+		unsigned int i;
+
+		for (i = 0; i < array->num_fences; i++) {
+			struct dma_fence *f = array->fences[i];
+
+			ret = __nouveau_fence_sync(f, chan, intr);
+			if (ret < 0)
+				break;
+		}
+	} else {
+		ret = __nouveau_fence_sync(fence, chan, intr);
+	}
+
+	return ret;
+}
+
 struct nouveau_fence *
 nouveau_fence_ref(struct nouveau_fence *fence)
 {
-- 
2.28.0
Thierry Reding
2020-Aug-28  10:40 UTC
[Nouveau] [PATCH 6/6] drm/nouveau: Allow zero pushbuffer submits
From: Thierry Reding <treding at nvidia.com>
These are useful in cases where only a fence is to be created to wait
for existing jobs in the command stream.
Signed-off-by: Thierry Reding <treding at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_gem.c | 197 +++++++++++++-------------
 1 file changed, 99 insertions(+), 98 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index b3ece731e4e1..c70a045d7141 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -816,9 +816,9 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	struct nouveau_abi16_chan *temp;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct drm_nouveau_gem_pushbuf *req = &request->base;
-	struct drm_nouveau_gem_pushbuf_push *push;
 	struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL;
-	struct drm_nouveau_gem_pushbuf_bo *bo;
+	struct drm_nouveau_gem_pushbuf_push *push = NULL;
+	struct drm_nouveau_gem_pushbuf_bo *bo = NULL;
 	struct drm_nouveau_gem_fence *fences = NULL;
 	struct nouveau_channel *chan = NULL;
 	struct validate_op op;
@@ -850,8 +850,6 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 
 	req->vram_available = drm->gem.vram_available;
 	req->gart_available = drm->gem.gart_available;
-	if (unlikely(req->nr_push == 0))
-		goto out_next;
 
 	if (unlikely(req->nr_push > NOUVEAU_GEM_MAX_PUSH)) {
 		NV_PRINTK(err, cli, "pushbuf push count exceeds limit: %d max
%d\n",
@@ -871,33 +869,35 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		return nouveau_abi16_put(abi16, -EINVAL);
 	}
 
-	push = u_memcpya(req->push, req->nr_push, sizeof(*push));
-	if (IS_ERR(push))
-		return nouveau_abi16_put(abi16, PTR_ERR(push));
+	if (req->nr_push > 0) {
+		push = u_memcpya(req->push, req->nr_push, sizeof(*push));
+		if (IS_ERR(push))
+			return nouveau_abi16_put(abi16, PTR_ERR(push));
 
-	bo = u_memcpya(req->buffers, req->nr_buffers, sizeof(*bo));
-	if (IS_ERR(bo)) {
-		u_free(push);
-		return nouveau_abi16_put(abi16, PTR_ERR(bo));
-	}
+		bo = u_memcpya(req->buffers, req->nr_buffers, sizeof(*bo));
+		if (IS_ERR(bo)) {
+			u_free(push);
+			return nouveau_abi16_put(abi16, PTR_ERR(bo));
+		}
 
-	/* Ensure all push buffers are on validate list */
-	for (i = 0; i < req->nr_push; i++) {
-		if (push[i].bo_index >= req->nr_buffers) {
-			NV_PRINTK(err, cli, "push %d buffer not in list\n", i);
-			ret = -EINVAL;
-			goto out_prevalid;
+		/* Ensure all push buffers are on validate list */
+		for (i = 0; i < req->nr_push; i++) {
+			if (push[i].bo_index >= req->nr_buffers) {
+				NV_PRINTK(err, cli, "push %d buffer not in list\n", i);
+				ret = -EINVAL;
+				goto out_prevalid;
+			}
 		}
-	}
 
-	/* Validate buffer list */
+		/* Validate buffer list */
 revalidate:
-	ret = nouveau_gem_pushbuf_validate(chan, file_priv, bo,
-					   req->nr_buffers, &op, &do_reloc);
-	if (ret) {
-		if (ret != -ERESTARTSYS)
-			NV_PRINTK(err, cli, "validate: %d\n", ret);
-		goto out_prevalid;
+		ret = nouveau_gem_pushbuf_validate(chan, file_priv, bo,
+						   req->nr_buffers, &op, &do_reloc);
+		if (ret) {
+			if (ret != -ERESTARTSYS)
+				NV_PRINTK(err, cli, "validate: %d\n", ret);
+			goto out_prevalid;
+		}
 	}
 
 	if (request->num_fences > 0) {
@@ -915,89 +915,89 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	}
 
 	/* Apply any relocations that are required */
-	if (do_reloc) {
-		if (!reloc) {
-			validate_fini(&op, chan, NULL, bo);
-			reloc = u_memcpya(req->relocs, req->nr_relocs, sizeof(*reloc));
-			if (IS_ERR(reloc)) {
-				ret = PTR_ERR(reloc);
-				goto out_prevalid;
-			}
+	if (req->nr_push > 0) {
+		if (do_reloc) {
+			if (!reloc) {
+				validate_fini(&op, chan, NULL, bo);
+				reloc = u_memcpya(req->relocs, req->nr_relocs, sizeof(*reloc));
+				if (IS_ERR(reloc)) {
+					ret = PTR_ERR(reloc);
+					goto out_prevalid;
+				}
 
-			goto revalidate;
-		}
+				goto revalidate;
+			}
 
-		ret = nouveau_gem_pushbuf_reloc_apply(cli, req, reloc, bo);
-		if (ret) {
-			NV_PRINTK(err, cli, "reloc apply: %d\n", ret);
-			goto out;
+			ret = nouveau_gem_pushbuf_reloc_apply(cli, req, reloc, bo);
+			if (ret) {
+				NV_PRINTK(err, cli, "reloc apply: %d\n", ret);
+				goto out;
+			}
 		}
-	}
 
-	if (chan->dma.ib_max) {
-		ret = nouveau_dma_wait(chan, req->nr_push + 1, 16);
-		if (ret) {
-			NV_PRINTK(err, cli, "nv50cal_space: %d\n", ret);
-			goto out;
-		}
+		if (chan->dma.ib_max) {
+			ret = nouveau_dma_wait(chan, req->nr_push + 1, 16);
+			if (ret) {
+				NV_PRINTK(err, cli, "nv50cal_space: %d\n", ret);
+				goto out;
+			}
 
-		for (i = 0; i < req->nr_push; i++) {
-			struct nouveau_vma *vma = (void *)(unsigned long)
-				bo[push[i].bo_index].user_priv;
+			for (i = 0; i < req->nr_push; i++) {
+				struct nouveau_vma *vma = (void *)(unsigned long)
+					bo[push[i].bo_index].user_priv;
 
-			nv50_dma_push(chan, vma->addr + push[i].offset,
-				      push[i].length);
-		}
-	} else
-	if (drm->client.device.info.chipset >= 0x25) {
-		ret = PUSH_WAIT(chan->chan.push, req->nr_push * 2);
-		if (ret) {
-			NV_PRINTK(err, cli, "cal_space: %d\n", ret);
-			goto out;
-		}
+				nv50_dma_push(chan, vma->addr + push[i].offset,
+					      push[i].length);
+			}
+		} else if (drm->client.device.info.chipset >= 0x25) {
+			ret = PUSH_WAIT(chan->chan.push, req->nr_push * 2);
+			if (ret) {
+				NV_PRINTK(err, cli, "cal_space: %d\n", ret);
+				goto out;
+			}
 
-		for (i = 0; i < req->nr_push; i++) {
-			struct nouveau_bo *nvbo = (void *)(unsigned long)
-				bo[push[i].bo_index].user_priv;
+			for (i = 0; i < req->nr_push; i++) {
+				struct nouveau_bo *nvbo = (void *)(unsigned long)
+					bo[push[i].bo_index].user_priv;
 
-			PUSH_CALL(chan->chan.push, nvbo->offset + push[i].offset);
-			PUSH_DATA(chan->chan.push, 0);
-		}
-	} else {
-		ret = PUSH_WAIT(chan->chan.push, req->nr_push * (2 +
NOUVEAU_DMA_SKIPS));
-		if (ret) {
-			NV_PRINTK(err, cli, "jmp_space: %d\n", ret);
-			goto out;
-		}
+				PUSH_CALL(chan->chan.push, nvbo->offset + push[i].offset);
+				PUSH_DATA(chan->chan.push, 0);
+			}
+		} else {
+			ret = PUSH_WAIT(chan->chan.push, req->nr_push * (2 +
NOUVEAU_DMA_SKIPS));
+			if (ret) {
+				NV_PRINTK(err, cli, "jmp_space: %d\n", ret);
+				goto out;
+			}
 
-		for (i = 0; i < req->nr_push; i++) {
-			struct nouveau_bo *nvbo = (void *)(unsigned long)
-				bo[push[i].bo_index].user_priv;
-			uint32_t cmd;
-
-			cmd = chan->push.addr + ((chan->dma.cur + 2) << 2);
-			cmd |= 0x20000000;
-			if (unlikely(cmd != req->suffix0)) {
-				if (!nvbo->kmap.virtual) {
-					ret = ttm_bo_kmap(&nvbo->bo, 0,
-							  nvbo->bo.mem.
-							  num_pages,
-							  &nvbo->kmap);
-					if (ret) {
-						WIND_RING(chan);
-						goto out;
+			for (i = 0; i < req->nr_push; i++) {
+				struct nouveau_bo *nvbo = (void *)(unsigned long)
+					bo[push[i].bo_index].user_priv;
+				uint32_t cmd;
+
+				cmd = chan->push.addr + ((chan->dma.cur + 2) << 2);
+				cmd |= 0x20000000;
+				if (unlikely(cmd != req->suffix0)) {
+					if (!nvbo->kmap.virtual) {
+						ret = ttm_bo_kmap(&nvbo->bo, 0,
+								  nvbo->bo.mem.num_pages,
+								  &nvbo->kmap);
+						if (ret) {
+							WIND_RING(chan);
+							goto out;
+						}
+						nvbo->validate_mapped = true;
 					}
-					nvbo->validate_mapped = true;
-				}
 
-				nouveau_bo_wr32(nvbo, (push[i].offset +
-						push[i].length - 8) / 4, cmd);
-			}
+					nouveau_bo_wr32(nvbo, (push[i].offset +
+							push[i].length - 8) / 4, cmd);
+				}
 
-			PUSH_JUMP(chan->chan.push, nvbo->offset + push[i].offset);
-			PUSH_DATA(chan->chan.push, 0);
-			for (j = 0; j < NOUVEAU_DMA_SKIPS; j++)
+				PUSH_JUMP(chan->chan.push, nvbo->offset + push[i].offset);
 				PUSH_DATA(chan->chan.push, 0);
+				for (j = 0; j < NOUVEAU_DMA_SKIPS; j++)
+					PUSH_DATA(chan->chan.push, 0);
+			}
 		}
 	}
 
@@ -1031,7 +1031,9 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 out:
 	u_free(fences);
 
-	validate_fini(&op, chan, fence, bo);
+	if (req->nr_push > 0)
+		validate_fini(&op, chan, fence, bo);
+
 	nouveau_fence_unref(&fence);
 
 	if (do_reloc) {
@@ -1054,7 +1056,6 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	u_free(bo);
 	u_free(push);
 
-out_next:
 	if (chan->dma.ib_max) {
 		req->suffix0 = 0x00000000;
 		req->suffix1 = 0x00000000;
-- 
2.28.0
Thierry Reding
2020-Sep-23  09:18 UTC
[Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote:> From: Thierry Reding <treding at nvidia.com> > > Hi, > > This series implements a new IOCTL to submit push buffers that can > optionally return a sync FD or sync object to userspace. This is useful > in cases where userspace wants to synchronize operations between the GPU > and another driver (such as KMS for display). Among other things this > allows extensions such as eglDupNativeFenceFDANDROID to be implemented. > > Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM > sync objects to be passed rather than only sync FDs. It also allows any > number of sync FDs/objects to be passed in or emitted. I think those are > useful features, but I left them in a separate patch in case everybody > else thinks that this won't be needed. If we decide to merge the new ABI > then patch 4 should be squashed into patch 3. > > The corresponding userspace changes can be found here: > > libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/ > mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/ > > I've verified that this works with kmscube's --atomic mode and Weston.Hi Ben, any thoughts on this series? I realize that this is somewhat suboptimal because we're effectively adding a duplicate of the existing IOCTL with only the "minor" extension of adding sync FDs/objects, but at the same time I don't have any good ideas on what else to add to make this more appealing or if you have any plans of your own to address this in the future. Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20200923/b7a6b23e/attachment.sig>
Daniel Vetter
2020-Sep-23  15:21 UTC
[Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
On Wed, Sep 23, 2020 at 11:18:53AM +0200, Thierry Reding wrote:> On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote: > > From: Thierry Reding <treding at nvidia.com> > > > > Hi, > > > > This series implements a new IOCTL to submit push buffers that can > > optionally return a sync FD or sync object to userspace. This is useful > > in cases where userspace wants to synchronize operations between the GPU > > and another driver (such as KMS for display). Among other things this > > allows extensions such as eglDupNativeFenceFDANDROID to be implemented. > > > > Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM > > sync objects to be passed rather than only sync FDs. It also allows any > > number of sync FDs/objects to be passed in or emitted. I think those are > > useful features, but I left them in a separate patch in case everybody > > else thinks that this won't be needed. If we decide to merge the new ABI > > then patch 4 should be squashed into patch 3. > > > > The corresponding userspace changes can be found here: > > > > libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/ > > mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/ > > > > I've verified that this works with kmscube's --atomic mode and Weston. > > Hi Ben, > > any thoughts on this series? I realize that this is somewhat suboptimal > because we're effectively adding a duplicate of the existing IOCTL with > only the "minor" extension of adding sync FDs/objects, but at the same > time I don't have any good ideas on what else to add to make this more > appealing or if you have any plans of your own to address this in the > future.drm core automatically zero-extends ioctl structs both ways, so if all you do is add more stuff to the top level ioctl struct at the bottom, there's no need to duplicate any code. At least as long as you guarantee that 0 =old behaviour for both in and out parameters. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Reasonably Related Threads
- [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
- [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
- [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
- [PATCH 0/3] drm/nouveau: Add support for fence FDs
- [PATCH 0/3] drm/tegra: Add support for fence FDs