thr3ads.net - Nouveau - [Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel

If this information is useful, please help other people find it:
Share via:

Alexandre Courbot

2014-Mar-24 08:42 UTC

[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd

Hi everyone,

Here is the second batch of patches to add GK20A support to Nouveau. This time
we are adding the actual chip support, and this series brings the driver to a
point where a slightly-tweaked Mesa successfully runs shaders and renders
triangles on GBM! Many thanks to Thierry Reding and the people on the
#nouveau IRC channel for their help without which we would not have reached
this milestone.

A few lines of hacks (not included here) are still needed to deal with cached
mappings triggering external aborts and CPU/GPU memory coherency issues, but I
hope to understand and address these issues next.

Most of the changes below have already been seen (and sometimes reviewed) in an
earlier patchset. What has been added is proper PGRAPH support (still needing
an external firmware and mostly reusing NVE4's code) as well as a better RAM
implementation.

How to represent and manage VRAM has been the hardest part to deal with, since
GK20A shares the system memory with the CPU without any kind of partition. I
have tried various approaches (included some in which no RAM object ever gets
instanciated) and finally decided to go for one using DMA-contiguous memory
allocations and relying on BAR mappings for kernel access and exposure to
user-space, as it fits better with existing code and keeps us safe from most of
the CPU/GPU memory coherency issues (at the cost of some performance).

Looking forward to your review of these few patches! :)

Cheers,
Alex.

Alexandre Courbot (12):
  drm/nouveau: fix missing newline
  drm/nouveau/timer: skip calibration on GK20A
  drm/nouveau/bar: only ioremap BAR3 if it exists
  drm/nouveau/bar/nvc0: support chips without BAR3
  drm/nouveau/fifo: add GK20A support
  drm/nouveau/ibus: add GK20A support
  drm/nouveau/fb: add GK20A support
  drm/nouveau/graph: enable when using external firmware
  drm/nouveau/graph: pad firmware code at load time
  drm/nouveau/graph: add GK20A support
  drm/nouveau: support GK20A in nouveau_accel_init()
  drm/nouveau: support for probing GK20A

 drivers/gpu/drm/nouveau/Makefile                   |   5 +
 drivers/gpu/drm/nouveau/core/engine/device/nve0.c  |  20 +++
 drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h    |   1 +
 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c    |  35 +++++
 .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c    |   4 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c   |  12 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h   |   9 ++
 drivers/gpu/drm/nouveau/core/engine/graph/nve4.c   |   2 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c   |  75 +++++++++
 drivers/gpu/drm/nouveau/core/include/engine/fifo.h |   1 +
 .../gpu/drm/nouveau/core/include/engine/graph.h    |   1 +
 drivers/gpu/drm/nouveau/core/include/subdev/fb.h   |   1 +
 drivers/gpu/drm/nouveau/core/include/subdev/ibus.h |   1 +
 drivers/gpu/drm/nouveau/core/subdev/bar/base.c     |   7 +-
 drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c     | 101 +++++++------
 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c      |  56 +++++++
 drivers/gpu/drm/nouveau/core/subdev/fb/priv.h      |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c   | 168 +++++++++++++++++++++
 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c    | 110 ++++++++++++++
 drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c   |  19 ++-
 drivers/gpu/drm/nouveau/nouveau_drm.c              |  12 +-
 21 files changed, 578 insertions(+), 63 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c

-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 01/12] drm/nouveau: fix missing newline

Add a missing newline at the end of a DRM_INFO message.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index b2a674531fba..ef27949057c3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -322,7 +322,7 @@ nouveau_get_hdmi_dev(struct drm_device *dev)
 	struct pci_dev *pdev = dev->pdev;
 
 	if (!pdev) {
-		DRM_INFO("not a PCI device; no HDMI");
+		DRM_INFO("not a PCI device; no HDMI\n");
 		drm->hdmi_device = NULL;
 		return;
 	}
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A

GK20A's timer is directly attached to the system timer and cannot be
calibrated. Skip the calibration phase on that chip since the
corresponding registers do not exist.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
index c0bdd10358d7..822fe0d8a871 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
@@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
 	if (ret)
 		return ret;
 
+	/* gk20a does not have the calibration registers */
+	if (device->chipset == 0xea)
+		goto skip_clk_init;
+
 	/* aim for 31.25MHz, which gives us nanosecond timestamps */
 	d = 1000000 / 32;
 
@@ -235,20 +239,23 @@ nv04_timer_init(struct nouveau_object *object)
 		d >>= 1;
 	}
 
-	/* restore the time before suspend */
-	lo = priv->suspend_time;
-	hi = (priv->suspend_time >> 32);
-
 	nv_debug(priv, "input frequency : %dHz\n", f);
 	nv_debug(priv, "input multiplier: %d\n", m);
 	nv_debug(priv, "numerator       : 0x%08x\n", n);
 	nv_debug(priv, "denominator     : 0x%08x\n", d);
 	nv_debug(priv, "timer frequency : %dHz\n", (f * m) * d / n);
-	nv_debug(priv, "time low        : 0x%08x\n", lo);
-	nv_debug(priv, "time high       : 0x%08x\n", hi);
 
 	nv_wr32(priv, NV04_PTIMER_NUMERATOR, n);
 	nv_wr32(priv, NV04_PTIMER_DENOMINATOR, d);
+
+skip_clk_init:
+	/* restore the time before suspend */
+	lo = priv->suspend_time;
+	hi = (priv->suspend_time >> 32);
+
+	nv_debug(priv, "time low        : 0x%08x\n", lo);
+	nv_debug(priv, "time high       : 0x%08x\n", hi);
+
 	nv_wr32(priv, NV04_PTIMER_INTR_0, 0xffffffff);
 	nv_wr32(priv, NV04_PTIMER_INTR_EN_0, 0x00000000);
 	nv_wr32(priv, NV04_PTIMER_TIME_1, hi);
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists

Some chips that use system memory exclusively (e.g. GK20A) do not
expose 2 BAR regions. For them only BAR1 exists, and it should be used
for USERD mapping. Do not map BAR3 if its resource does not exist.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
index bdf594116f3f..d713eeb75b13 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
@@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent,
 {
 	struct nouveau_device *device = nv_device(parent);
 	struct nouveau_bar *bar;
+	bool has_bar3 = nv_device_resource_len(device, 3) != 0;
 	int ret;
 
 	ret = nouveau_subdev_create_(parent, engine, oclass, 0, "BARCTL",
@@ -118,8 +119,10 @@ nouveau_bar_create_(struct nouveau_object *parent,
 	if (ret)
 		return ret;
 
-	bar->iomem = ioremap(nv_device_resource_start(device, 3),
-			     nv_device_resource_len(device, 3));
+	if (has_bar3)
+		bar->iomem = ioremap(nv_device_resource_start(device, 3),
+				     nv_device_resource_len(device, 3));
+
 	return 0;
 }
 
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3

Adapt the NVC0 BAR driver to make it able to support chips that do not
expose a BAR3. When this happens, BAR1 is then used for USERD mapping
and the BAR alloc() functions is disabled, making GPU objects unable
to rely on BAR for data access and falling back to PRAMIN.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c | 101 +++++++++++++------------
 1 file changed, 52 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
index 3f30db62e656..5da1b9447af0 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
@@ -79,87 +79,88 @@ nvc0_bar_unmap(struct nouveau_bar *bar, struct nouveau_vma
*vma)
 }
 
 static int
-nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
-	      struct nouveau_oclass *oclass, void *data, u32 size,
-	      struct nouveau_object **pobject)
+nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar)
 {
-	struct nouveau_device *device = nv_device(parent);
-	struct nvc0_bar_priv *priv;
+	struct nouveau_device *device = nv_device(&priv->base);
 	struct nouveau_gpuobj *mem;
 	struct nouveau_vm *vm;
+	resource_size_t bar_len;
 	int ret;
 
-	ret = nouveau_bar_create(parent, engine, oclass, &priv);
-	*pobject = nv_object(priv);
-	if (ret)
-		return ret;
-
-	/* BAR3 */
 	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
-				&priv->bar[0].mem);
-	mem = priv->bar[0].mem;
+				&priv->bar[nr].mem);
+	mem = priv->bar[nr].mem;
 	if (ret)
 		return ret;
 
 	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
-				&priv->bar[0].pgd);
+				&priv->bar[nr].pgd);
 	if (ret)
 		return ret;
 
-	ret = nouveau_vm_new(device, 0, nv_device_resource_len(device, 3), 0,
&vm);
+	bar_len = nv_device_resource_len(device, bar);
+
+	ret = nouveau_vm_new(device, 0, bar_len, 0, &vm);
 	if (ret)
 		return ret;
 
 	atomic_inc(&vm->engref[NVDEV_SUBDEV_BAR]);
 
-	ret = nouveau_gpuobj_new(nv_object(priv), NULL,
-				 (nv_device_resource_len(device, 3) >> 12) * 8,
-				 0x1000, NVOBJ_FLAG_ZERO_ALLOC,
-				 &vm->pgt[0].obj[0]);
-	vm->pgt[0].refcount[0] = 1;
-	if (ret)
-		return ret;
+	/*
+	 * Bootstrap page table lookup.
+	 */
+	if (bar == 3) {
+		ret = nouveau_gpuobj_new(nv_object(priv), NULL,
+					 (bar_len >> 12) * 8, 0x1000,
+					 NVOBJ_FLAG_ZERO_ALLOC,
+					&vm->pgt[0].obj[0]);
+		vm->pgt[0].refcount[0] = 1;
+		if (ret)
+			return ret;
+	}
 
-	ret = nouveau_vm_ref(vm, &priv->bar[0].vm, priv->bar[0].pgd);
+	ret = nouveau_vm_ref(vm, &priv->bar[nr].vm, priv->bar[nr].pgd);
 	nouveau_vm_ref(NULL, &vm, NULL);
 	if (ret)
 		return ret;
 
-	nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[0].pgd->addr));
-	nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[0].pgd->addr));
-	nv_wo32(mem, 0x0208, lower_32_bits(nv_device_resource_len(device, 3) - 1));
-	nv_wo32(mem, 0x020c, upper_32_bits(nv_device_resource_len(device, 3) - 1));
+	nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[nr].pgd->addr));
+	nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[nr].pgd->addr));
+	nv_wo32(mem, 0x0208, lower_32_bits(bar_len - 1));
+	nv_wo32(mem, 0x020c, upper_32_bits(bar_len - 1));
 
-	/* BAR1 */
-	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
-				&priv->bar[1].mem);
-	mem = priv->bar[1].mem;
-	if (ret)
-		return ret;
+	return 0;
+}
 
-	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
-				&priv->bar[1].pgd);
-	if (ret)
-		return ret;
+static int
+nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	      struct nouveau_oclass *oclass, void *data, u32 size,
+	      struct nouveau_object **pobject)
+{
+	struct nouveau_device *device = nv_device(parent);
+	struct nvc0_bar_priv *priv;
+	bool has_bar3 = nv_device_resource_len(device, 3) != 0;
+	int ret;
 
-	ret = nouveau_vm_new(device, 0, nv_device_resource_len(device, 1), 0,
&vm);
+	ret = nouveau_bar_create(parent, engine, oclass, &priv);
+	*pobject = nv_object(priv);
 	if (ret)
 		return ret;
 
-	atomic_inc(&vm->engref[NVDEV_SUBDEV_BAR]);
+	/* BAR3 */
+	if (has_bar3) {
+		ret = nvc0_bar_init_vm(priv, 0, 3);
+		if (ret)
+			return ret;
+		priv->base.alloc = nouveau_bar_alloc;
+		priv->base.kmap = nvc0_bar_kmap;
+	}
 
-	ret = nouveau_vm_ref(vm, &priv->bar[1].vm, priv->bar[1].pgd);
-	nouveau_vm_ref(NULL, &vm, NULL);
+	/* BAR1 */
+	ret = nvc0_bar_init_vm(priv, 1, 1);
 	if (ret)
 		return ret;
 
-	nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[1].pgd->addr));
-	nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[1].pgd->addr));
-	nv_wo32(mem, 0x0208, lower_32_bits(nv_device_resource_len(device, 1) - 1));
-	nv_wo32(mem, 0x020c, upper_32_bits(nv_device_resource_len(device, 1) - 1));
-
-	priv->base.alloc = nouveau_bar_alloc;
-	priv->base.kmap = nvc0_bar_kmap;
 	priv->base.umap = nvc0_bar_umap;
 	priv->base.unmap = nvc0_bar_unmap;
 	priv->base.flush = nv84_bar_flush;
@@ -201,7 +202,9 @@ nvc0_bar_init(struct nouveau_object *object)
 	nv_mask(priv, 0x100c80, 0x00000001, 0x00000000);
 
 	nv_wr32(priv, 0x001704, 0x80000000 | priv->bar[1].mem->addr >>
12);
-	nv_wr32(priv, 0x001714, 0xc0000000 | priv->bar[0].mem->addr >>
12);
+	if (priv->bar[0].mem)
+		nv_wr32(priv, 0x001714,
+			0xc0000000 | priv->bar[0].mem->addr >> 12);
 	return 0;
 }
 
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 05/12] drm/nouveau/fifo: add GK20A support

GK20A's FIFO is compatible with NVE0, but only features 128 channels and
1 runlist.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                   |  1 +
 drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h    |  1 +
 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c    | 35 ++++++++++++++++++++++
 drivers/gpu/drm/nouveau/core/include/engine/fifo.h |  1 +
 4 files changed, 38 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index d310c195bdfe..a90087bbdf88 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -237,6 +237,7 @@ nouveau-y += core/engine/fifo/nv50.o
 nouveau-y += core/engine/fifo/nv84.o
 nouveau-y += core/engine/fifo/nvc0.o
 nouveau-y += core/engine/fifo/nve0.o
+nouveau-y += core/engine/fifo/nvea.o
 nouveau-y += core/engine/fifo/nv108.o
 nouveau-y += core/engine/graph/ctxnv40.o
 nouveau-y += core/engine/graph/ctxnv50.o
diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h
b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h
index 014344ebee66..e96b32bb1bbc 100644
--- a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h
+++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h
@@ -8,6 +8,7 @@ int  nve0_fifo_ctor(struct nouveau_object *, struct
nouveau_object *,
 		    struct nouveau_object **);
 void nve0_fifo_dtor(struct nouveau_object *);
 int  nve0_fifo_init(struct nouveau_object *);
+int  nve0_fifo_fini(struct nouveau_object *, bool);
 
 struct nve0_fifo_impl {
 	struct nouveau_oclass base;
diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
b/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
new file mode 100644
index 000000000000..5e3a9514df5d
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
@@ -0,0 +1,35 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
"Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nve0.h"
+
+struct nouveau_oclass *
+nvea_fifo_oclass = &(struct nve0_fifo_impl) {
+	.base.handle = NV_ENGINE(FIFO, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nve0_fifo_ctor,
+		.dtor = nve0_fifo_dtor,
+		.init = nve0_fifo_init,
+		.fini = nve0_fifo_fini,
+	},
+	.channels = 128,
+}.base;
diff --git a/drivers/gpu/drm/nouveau/core/include/engine/fifo.h
b/drivers/gpu/drm/nouveau/core/include/engine/fifo.h
index 26b6b2bb1112..823356f45137 100644
--- a/drivers/gpu/drm/nouveau/core/include/engine/fifo.h
+++ b/drivers/gpu/drm/nouveau/core/include/engine/fifo.h
@@ -109,6 +109,7 @@ extern struct nouveau_oclass *nv50_fifo_oclass;
 extern struct nouveau_oclass *nv84_fifo_oclass;
 extern struct nouveau_oclass *nvc0_fifo_oclass;
 extern struct nouveau_oclass *nve0_fifo_oclass;
+extern struct nouveau_oclass *nvea_fifo_oclass;
 extern struct nouveau_oclass *nv108_fifo_oclass;
 
 void nv04_fifo_intr(struct nouveau_subdev *);
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 06/12] drm/nouveau/ibus: add GK20A support

Add support for initializing the priv ring of GK20A. This is done by the
BIOS on desktop GPUs, but needs to be done by hand on Tegra.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                   |   1 +
 drivers/gpu/drm/nouveau/core/include/subdev/ibus.h |   1 +
 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c    | 110 +++++++++++++++++++++
 3 files changed, 112 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index a90087bbdf88..592141e62dda 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -132,6 +132,7 @@ nouveau-y += core/subdev/i2c/nv94.o
 nouveau-y += core/subdev/i2c/nvd0.o
 nouveau-y += core/subdev/ibus/nvc0.o
 nouveau-y += core/subdev/ibus/nve0.o
+nouveau-y += core/subdev/ibus/nvea.o
 nouveau-y += core/subdev/instmem/base.o
 nouveau-y += core/subdev/instmem/nv04.o
 nouveau-y += core/subdev/instmem/nv40.o
diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h
b/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h
index 88814f159d89..056a42f92705 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h
@@ -30,5 +30,6 @@ nouveau_ibus(void *obj)
 
 extern struct nouveau_oclass nvc0_ibus_oclass;
 extern struct nouveau_oclass nve0_ibus_oclass;
+extern struct nouveau_oclass nvea_ibus_oclass;
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
new file mode 100644
index 000000000000..151851286e99
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
@@ -0,0 +1,110 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
"Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <subdev/ibus.h>
+
+struct nvea_ibus_priv {
+	struct nouveau_ibus base;
+};
+
+static void
+nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
+{
+	nv_mask(priv, 0x137250, 0x3f, 0);
+
+	nv_mask(priv, 0x000200, 0x20, 0);
+	udelay(20);
+	nv_mask(priv, 0x000200, 0x20, 0x20);
+
+	nv_wr32(priv, 0x12004c, 0x4);
+	nv_wr32(priv, 0x122204, 0x2);
+	nv_rd32(priv, 0x122204);
+}
+
+static void
+nvea_ibus_intr(struct nouveau_subdev *subdev)
+{
+	struct nvea_ibus_priv *priv = (void *)subdev;
+	u32 status0 = nv_rd32(priv, 0x120058);
+	s32 retry = 100;
+	u32 command;
+
+	if (status0 & 0x7) {
+		nv_debug(priv, "resetting priv ring\n");
+		nvea_ibus_init_priv_ring(priv);
+	}
+
+	/* Acknowledge interrupt */
+	nv_mask(priv, 0x12004c, 0x2, 0x2);
+
+	while (--retry >= 0) {
+		command = nv_rd32(priv, 0x12004c) & 0x3f;
+		if (command == 0)
+			break;
+	}
+
+	if (retry < 0)
+		nv_warn(priv, "timeout waiting for ringmaster ack\n");
+}
+
+static int
+nvea_ibus_init(struct nouveau_object *object)
+{
+	struct nvea_ibus_priv *priv = (void *)object;
+	int ret;
+
+	ret = _nouveau_ibus_init(object);
+	if (ret)
+		return ret;
+
+	nvea_ibus_init_priv_ring(priv);
+
+	return 0;
+}
+
+static int
+nvea_ibus_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	       struct nouveau_oclass *oclass, void *data, u32 size,
+	       struct nouveau_object **pobject)
+{
+	struct nvea_ibus_priv *priv;
+	int ret;
+
+	ret = nouveau_ibus_create(parent, engine, oclass, &priv);
+	*pobject = nv_object(priv);
+	if (ret)
+		return ret;
+
+	nv_subdev(priv)->intr = nvea_ibus_intr;
+	return 0;
+}
+
+struct nouveau_oclass
+nvea_ibus_oclass = {
+	.handle = NV_SUBDEV(IBUS, 0xea),
+	.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_ibus_ctor,
+		.dtor = _nouveau_ibus_dtor,
+		.init = nvea_ibus_init,
+		.fini = _nouveau_ibus_fini,
+	},
+};
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 07/12] drm/nouveau/fb: add GK20A support

Add a simple FB device for GK20A, as well as a RAM implementation based
on contiguous DMA memory allocations suitable for chips that use system
memory as video RAM.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                 |   2 +
 drivers/gpu/drm/nouveau/core/include/subdev/fb.h |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c    |  56 ++++++++
 drivers/gpu/drm/nouveau/core/subdev/fb/priv.h    |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c | 168 +++++++++++++++++++++++
 5 files changed, 228 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index 592141e62dda..708d2e33835f 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -100,6 +100,7 @@ nouveau-y += core/subdev/fb/nvaa.o
 nouveau-y += core/subdev/fb/nvaf.o
 nouveau-y += core/subdev/fb/nvc0.o
 nouveau-y += core/subdev/fb/nve0.o
+nouveau-y += core/subdev/fb/nvea.o
 nouveau-y += core/subdev/fb/ramnv04.o
 nouveau-y += core/subdev/fb/ramnv10.o
 nouveau-y += core/subdev/fb/ramnv1a.o
@@ -114,6 +115,7 @@ nouveau-y += core/subdev/fb/ramnva3.o
 nouveau-y += core/subdev/fb/ramnvaa.o
 nouveau-y += core/subdev/fb/ramnvc0.o
 nouveau-y += core/subdev/fb/ramnve0.o
+nouveau-y += core/subdev/fb/ramnvea.o
 nouveau-y += core/subdev/fb/sddr3.o
 nouveau-y += core/subdev/fb/gddr5.o
 nouveau-y += core/subdev/gpio/base.o
diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
index d7ecafbae1ca..3905816755ba 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
@@ -105,6 +105,7 @@ extern struct nouveau_oclass *nvaa_fb_oclass;
 extern struct nouveau_oclass *nvaf_fb_oclass;
 extern struct nouveau_oclass *nvc0_fb_oclass;
 extern struct nouveau_oclass *nve0_fb_oclass;
+extern struct nouveau_oclass *nvea_fb_oclass;
 
 #include <subdev/bios/ramcfg.h>
 
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
new file mode 100644
index 000000000000..62dbec48481e
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
"Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nvc0.h"
+
+struct nvea_fb_priv {
+	struct nouveau_fb base;
+};
+
+static int
+nvea_fb_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	     struct nouveau_oclass *oclass, void *data, u32 size,
+	     struct nouveau_object **pobject)
+{
+	struct nvea_fb_priv *priv;
+	int ret;
+
+	ret = nouveau_fb_create(parent, engine, oclass, &priv);
+	*pobject = nv_object(priv);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+struct nouveau_oclass *
+nvea_fb_oclass = &(struct nouveau_fb_impl) {
+	.base.handle = NV_SUBDEV(FB, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_fb_ctor,
+		.dtor = _nouveau_fb_dtor,
+		.init = _nouveau_fb_init,
+		.fini = _nouveau_fb_fini,
+	},
+	.memtype = nvc0_fb_memtype_valid,
+	.ram = &nvea_ram_oclass,
+}.base;
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
index edaf95dee612..0b95a25504d3 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
@@ -32,6 +32,7 @@ extern struct nouveau_oclass nva3_ram_oclass;
 extern struct nouveau_oclass nvaa_ram_oclass;
 extern struct nouveau_oclass nvc0_ram_oclass;
 extern struct nouveau_oclass nve0_ram_oclass;
+extern struct nouveau_oclass nvea_ram_oclass;
 
 int nouveau_sddr3_calc(struct nouveau_ram *ram);
 int nouveau_gddr5_calc(struct nouveau_ram *ram, bool nuts);
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
new file mode 100644
index 000000000000..4e3d757614b6
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
@@ -0,0 +1,168 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
"Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "priv.h"
+
+#include <subdev/fb.h>
+
+#include <linux/mm.h>
+#include <linux/types.h>
+#include <linux/dma-contiguous.h>
+
+static void
+nvea_ram_put(struct nouveau_fb *pfb, struct nouveau_mem **pmem)
+{
+	struct device *dev = nv_device_base(nv_device(pfb));
+	struct nouveau_mem *mem = *pmem;
+	int i;
+
+	*pmem = NULL;
+
+	for (i = 0; i < mem->size; i++) {
+		struct page *page;
+
+		if (mem->pages[i] == 0)
+			break;
+
+		page = pfn_to_page(dma_to_pfn(dev, mem->pages[i]));
+		dma_release_from_contiguous(dev, page, 1);
+	}
+
+	kfree(mem->pages);
+	kfree(mem);
+}
+
+static int
+nvea_ram_get(struct nouveau_fb *pfb, u64 size, u32 align, u32 ncmin,
+	     u32 memtype, struct nouveau_mem **pmem)
+{
+	struct device *dev = nv_device_base(nv_device(pfb));
+	struct nouveau_mem *mem;
+	int type = memtype & 0xff;
+	dma_addr_t dma_addr;
+	int npages;
+	int order;
+	int i;
+
+	nv_debug(pfb, "%s: size: %llx align: %x, ncmin: %x\n", __func__,
size,
+		 align, ncmin);
+
+	npages = size >> PAGE_SHIFT;
+	if (npages == 0)
+		npages = 1;
+
+	if (align == 0)
+		align = PAGE_SIZE;
+	align >>= PAGE_SHIFT;
+
+	/* round alignment to the next power of 2, if needed */
+	order = fls(align);
+	if ((align & (align - 1)) == 0)
+		order--;
+
+	ncmin >>= PAGE_SHIFT;
+	/*
+	 * allocate pages by chunks of "align" size, otherwise we may leave
+	 * holes in the contiguous memory area.
+	 */
+	if (ncmin == 0)
+		ncmin = npages;
+	else if (align > ncmin)
+		ncmin = align;
+
+	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+	if (!mem)
+		return -ENOMEM;
+
+	mem->size = npages;
+	mem->memtype = type;
+
+	mem->pages = kzalloc(sizeof(dma_addr_t) * npages, GFP_KERNEL);
+	if (!mem) {
+		kfree(mem);
+		return -ENOMEM;
+	}
+
+	while (npages) {
+		struct page *pages;
+		int pos = 0;
+
+		/* don't overflow in case size is not a multiple of ncmin */
+		if (ncmin > npages)
+			ncmin = npages;
+
+		pages = dma_alloc_from_contiguous(dev, ncmin, order);
+		if (!pages) {
+			nvea_ram_put(pfb, &mem);
+			return -ENOMEM;
+		}
+
+		dma_addr = pfn_to_dma(nv_device_base(nv_device(pfb)),
+				      page_to_pfn(pages));
+
+		nv_debug(pfb, "  alloc count: %x, order: %x, addr: %x\n", ncmin,
+			 order, dma_addr);
+
+		for (i = 0; i < ncmin; i++)
+			mem->pages[pos + i] = dma_addr + (PAGE_SIZE * i);
+
+		pos += ncmin;
+		npages -= ncmin;
+	}
+
+	mem->offset = (u64)mem->pages[0];
+
+	*pmem = mem;
+
+	return 0;
+}
+
+static int
+nvea_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	      struct nouveau_oclass *oclass, void *data, u32 datasize,
+	      struct nouveau_object **pobject)
+{
+	struct nouveau_ram *ram;
+	int ret;
+
+	ret = nouveau_ram_create(parent, engine, oclass, &ram);
+	*pobject = nv_object(ram);
+	if (ret)
+		return ret;
+	ram->type = NV_MEM_TYPE_STOLEN;
+	ram->size = get_num_physpages() << PAGE_SHIFT;
+
+	ram->get = nvea_ram_get;
+	ram->put = nvea_ram_put;
+
+	return 0;
+}
+
+struct nouveau_oclass
+nvea_ram_oclass = {
+	.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_ram_ctor,
+		.dtor = _nouveau_ram_dtor,
+		.init = _nouveau_ram_init,
+		.fini = _nouveau_ram_fini,
+	},
+};
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 08/12] drm/nouveau/graph: enable when using external firmware

nvc0_graph_ctor() would only let the graphics engine be enabled if its
oclass has a proper microcode linked to it. This prevents GR from being
enabled at all on chips that rely exclusively on external firmware, even
though such a use-case is valid.

Relax the conditions enabling the GR engine to also include the case
where an external firmware has also been loaded.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
index 6ef8bf181b2d..f997a18f5760 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
@@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct
nouveau_object *engine,
 	struct nvc0_graph_oclass *oclass = (void *)bclass;
 	struct nouveau_device *device = nv_device(parent);
 	struct nvc0_graph_priv *priv;
+	bool use_fw;
 	int ret, i;
 
+	use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false);
+
 	ret = nouveau_graph_create(parent, engine, bclass,
-				   (oclass->fecs.ucode != NULL), &priv);
+				   (oclass->fecs.ucode != NULL) || use_fw,
+				   &priv);
 	*pobject = nv_object(priv);
 	if (ret)
 		return ret;
@@ -1146,7 +1150,7 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct
nouveau_object *engine,
 
 	priv->base.units = nvc0_graph_units;
 
-	if (nouveau_boolopt(device->cfgopt, "NvGrUseFW", false)) {
+	if (use_fw) {
 		nv_info(priv, "using external firmware\n");
 		if (nvc0_graph_ctor_fw(priv, "fuc409c", &priv->fuc409c) ||
 		    nvc0_graph_ctor_fw(priv, "fuc409d", &priv->fuc409d) ||
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time

Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
run from non-prepadded firmware files.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
index f997a18f5760..367e72daf8b1 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
@@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32
fuc_base,
 			nv_wr32(priv, fuc_base + 0x0188, i >> 6);
 		nv_wr32(priv, fuc_base + 0x0184, code->data[i]);
 	}
+
+	/* code must be padded to 0x40 */
+	for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++)
+		nv_wr32(priv, fuc_base + 0x0184, 0);
 }
 
 static void
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 10/12] drm/nouveau/graph: add GK20A support

Add a GR device for GK20A based on NVE4, with the correct classes
definitions (GK20A's 3D class is 0xa297).

Most of the NVE4 code can be used on GK20A, so make relevant bits of
NVE4 available to other chips as well.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                   |  1 +
 .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c    |  4 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h   |  9 +++
 drivers/gpu/drm/nouveau/core/engine/graph/nve4.c   |  2 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c   | 75 ++++++++++++++++++++++
 .../gpu/drm/nouveau/core/include/engine/graph.h    |  1 +
 6 files changed, 89 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index 708d2e33835f..032a4744f843 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -270,6 +270,7 @@ nouveau-y += core/engine/graph/nvc8.o
 nouveau-y += core/engine/graph/nvd7.o
 nouveau-y += core/engine/graph/nvd9.o
 nouveau-y += core/engine/graph/nve4.o
+nouveau-y += core/engine/graph/nvea.o
 nouveau-y += core/engine/graph/nvf0.o
 nouveau-y += core/engine/graph/nv108.o
 nouveau-y += core/engine/mpeg/nv31.o
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
index e2de73ee5eee..3904073f860d 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
@@ -804,7 +804,7 @@ nve4_grctx_init_unk[] = {
 	{}
 };
 
-static void
+void
 nve4_grctx_generate_mods(struct nvc0_graph_priv *priv, struct nvc0_grctx *info)
 {
 	u32 magic[GPC_MAX][2];
@@ -962,7 +962,7 @@ nve4_grctx_generate_main(struct nvc0_graph_priv *priv,
struct nvc0_grctx *info)
 	nv_mask(priv, 0x41be10, 0x00800000, 0x00800000);
 }
 
-static struct nvc0_graph_init *
+struct nvc0_graph_init *
 nve4_grctx_init_hub[] = {
 	nvc0_grctx_init_base,
 	nve4_grctx_init_unk40xx,
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
index b0ab6de270b2..904f09b540c5 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
@@ -221,6 +221,8 @@ void nvc0_grctx_generate_r418bb8(struct nvc0_graph_priv *);
 void nve4_grctx_generate_r418bb8(struct nvc0_graph_priv *);
 void nvc0_grctx_generate_r406800(struct nvc0_graph_priv *);
 
+void nve4_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
+
 extern struct nouveau_oclass *nvc0_grctx_oclass;
 extern struct nvc0_graph_init *nvc0_grctx_init_hub[];
 extern struct nvc0_graph_init nvc0_grctx_init_base[];
@@ -237,12 +239,17 @@ extern struct nvc0_graph_init nvc0_grctx_init_gpc_1[];
 extern struct nvc0_graph_init nvc0_grctx_init_tpc[];
 extern struct nvc0_graph_init nvc0_grctx_init_icmd[];
 extern struct nvc0_graph_init nvd9_grctx_init_icmd[]; //
+extern struct nvc0_graph_init nve4_grctx_init_icmd[];
+
+extern struct nvc0_graph_init *nve4_grctx_init_hub[];
+extern struct nvc0_graph_init *nve4_grctx_init_gpc[];
 
 extern struct nvc0_graph_mthd nvc0_grctx_init_mthd[];
 extern struct nvc0_graph_init nvc0_grctx_init_902d[];
 extern struct nvc0_graph_init nvc0_grctx_init_9039[];
 extern struct nvc0_graph_init nvc0_grctx_init_90c0[];
 extern struct nvc0_graph_init nvc0_grctx_init_mthd_magic[];
+extern struct nvc0_graph_init nve4_grctx_init_a097[];
 
 void nvc1_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
 void nvc1_grctx_generate_unkn(struct nvc0_graph_priv *);
@@ -277,6 +284,8 @@ extern struct nvc0_graph_init nvf0_grctx_init_unk60xx[];
 
 extern struct nouveau_oclass *nv108_grctx_oclass;
 
+extern struct nvc0_graph_init *nve4_graph_init_mmio[];
+
 #define mmio_data(s,a,p) do {                                                 
\
 	info->buffer[info->buffer_nr] = round_up(info->addr, (a));           
\
 	info->addr = info->buffer[info->buffer_nr++] + (s);                  
\
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
index 05ec09c88517..442857c5c120 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
@@ -298,7 +298,7 @@ nve4_graph_init(struct nouveau_object *object)
 	return nvc0_graph_init_ctxctl(priv);
 }
 
-static struct nvc0_graph_init *
+struct nvc0_graph_init *
 nve4_graph_init_mmio[] = {
 	nve4_graph_init_regs,
 	nvc0_graph_init_unk40xx,
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
new file mode 100644
index 000000000000..d5e6a1adcacb
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
@@ -0,0 +1,75 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
"Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nvc0.h"
+
+static struct nouveau_oclass
+nvea_graph_sclass[] = {
+	{ 0x902d, &nouveau_object_ofuncs },
+	{ 0xa040, &nouveau_object_ofuncs },
+	{ 0xa297, &nouveau_object_ofuncs },
+	{ 0xa0c0, &nouveau_object_ofuncs },
+	{}
+};
+
+static struct nvc0_graph_mthd
+nvea_grctx_init_mthd[] = {
+	{ 0xa297, nve4_grctx_init_a097, },
+	{ 0x902d, nvc0_grctx_init_902d, },
+	{ 0x902d, nvc0_grctx_init_mthd_magic, },
+	{}
+};
+
+struct nouveau_oclass *
+nvea_grctx_oclass = &(struct nvc0_grctx_oclass) {
+	.base.handle = NV_ENGCTX(GR, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvc0_graph_context_ctor,
+		.dtor = nvc0_graph_context_dtor,
+		.init = _nouveau_graph_context_init,
+		.fini = _nouveau_graph_context_fini,
+		.rd32 = _nouveau_graph_context_rd32,
+		.wr32 = _nouveau_graph_context_wr32,
+	},
+	.main = nve4_grctx_generate_main,
+	.mods = nve4_grctx_generate_mods,
+	.unkn = nve4_grctx_generate_unkn,
+	.hub  = nve4_grctx_init_hub,
+	.gpc  = nve4_grctx_init_gpc,
+	.icmd = nve4_grctx_init_icmd,
+	.mthd = nvea_grctx_init_mthd,
+}.base;
+
+
+struct nouveau_oclass *
+nvea_graph_oclass = &(struct nvc0_graph_oclass) {
+	.base.handle = NV_ENGINE(GR, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvc0_graph_ctor,
+		.dtor = nvc0_graph_dtor,
+		.init = nve4_graph_init,
+		.fini = _nouveau_graph_fini,
+	},
+	.cclass = &nvea_grctx_oclass,
+	.sclass = nvea_graph_sclass,
+	.mmio = nve4_graph_init_mmio,
+}.base;
diff --git a/drivers/gpu/drm/nouveau/core/include/engine/graph.h
b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
index 97705618de97..632d7e172172 100644
--- a/drivers/gpu/drm/nouveau/core/include/engine/graph.h
+++ b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
@@ -68,6 +68,7 @@ extern struct nouveau_oclass *nvc8_graph_oclass;
 extern struct nouveau_oclass *nvd7_graph_oclass;
 extern struct nouveau_oclass *nvd9_graph_oclass;
 extern struct nouveau_oclass *nve4_graph_oclass;
+extern struct nouveau_oclass *nvea_graph_oclass;
 extern struct nouveau_oclass *nvf0_graph_oclass;
 extern struct nouveau_oclass *nv108_graph_oclass;
 
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()

GK20A does not embed a dedicated COPY engine and thus cannot allocate
the copy channel that nouveau_accel_init() attempts to create. It also
lacks any display hardware, so the creation of a software channel does
not apply neither.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index ef27949057c3..f2394e84eae6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -171,6 +171,11 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+	if (device->chipset == 0xea) {
+		/* gk20a does not have CE0/CE1 */
+		arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
+		arg1 = 1;
+	} else
 	if (device->card_type >= NV_E0) {
 		ret = nouveau_channel_new(drm, &drm->client, NVDRM_DEVICE,
 					  NVDRM_CHAN + 1,
@@ -207,6 +212,10 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+	/* Need to figure out how to handle sw for gk20a */
+	if (device->chipset == 0xea)
+		goto skip_sw_init;
+
 	ret = nouveau_object_new(nv_object(drm), NVDRM_CHAN, NVDRM_NVSW,
 				 nouveau_abi16_swclass(drm), NULL, 0, &object);
 	if (ret == 0) {
@@ -233,6 +242,7 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+skip_sw_init:
 	if (device->card_type < NV_C0) {
 		ret = nouveau_gpuobj_new(drm->device, NULL, 32, 0, 0,
 					&drm->notify);
-- 
1.9.1

Alexandre Courbot

2014-Mar-24 08:42 UTC

head link

[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A

Set the correct subdev/engine classes when GK20A (0xea) is probed.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
index 987edbc30a09..8509dd57de1f 100644
--- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
@@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device)
 		device->oclass[NVDEV_ENGINE_PPP    ] = &nvc0_ppp_oclass;
 		device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
 		break;
+	case 0xea:
+		device->cname = "GK20A";
+		device->oclass[NVDEV_SUBDEV_MC     ] =  nvc3_mc_oclass;
+		device->oclass[NVDEV_SUBDEV_BUS    ] =  nvc0_bus_oclass;
+		device->oclass[NVDEV_SUBDEV_TIMER  ] = &nv04_timer_oclass;
+		device->oclass[NVDEV_SUBDEV_FB     ] =  nvea_fb_oclass;
+		device->oclass[NVDEV_SUBDEV_IBUS   ] = &nvea_ibus_oclass;
+		device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass;
+		device->oclass[NVDEV_SUBDEV_VM     ] = &nvc0_vmmgr_oclass;
+		device->oclass[NVDEV_SUBDEV_BAR    ] = &nvc0_bar_oclass;
+		device->oclass[NVDEV_ENGINE_DMAOBJ ] = &nvd0_dmaeng_oclass;
+		device->oclass[NVDEV_ENGINE_FIFO   ] =  nvea_fifo_oclass;
+		/* TODO will need an implementation for this at some point... */
+#if 0
+		device->oclass[NVDEV_ENGINE_SW     ] =  nvc0_software_oclass;
+#endif
+		device->oclass[NVDEV_ENGINE_GR     ] =  nvea_graph_oclass;
+		device->oclass[NVDEV_ENGINE_COPY2  ] = &nve0_copy2_oclass;
+		device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
+		break;
 	case 0xf0:
 		device->cname = "GK110";
 		device->oclass[NVDEV_SUBDEV_VBIOS  ] = &nouveau_bios_oclass;
-- 
1.9.1

Lucas Stach

2014-Mar-24 13:19 UTC

head link

[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd

Hi Alexandre,

Am Montag, den 24.03.2014, 17:42 +0900 schrieb Alexandre
Courbot:> Hi everyone,
[...]> 
> A few lines of hacks (not included here) are still needed to deal with
cached
> mappings triggering external aborts and CPU/GPU memory coherency issues,
but I
> hope to understand and address these issues next.
For the coherency issue part you may want to look at my Nouveau on ARM
series. Most of it never made it upstream, as I lacked the time to work
further on this, but it solves the coherency issue from the kernel.

It does so by doing the necessary manual cache flushes/invalidates on
buffer access, so costs some performance. To avoid this you really want
to get writecombined mappings into the kernel<->userspace interface.
Simply mapping the pushbuf as WC/US has brought a 7% performance
increase in OpenArena when I last tested this. This test was done with
only one PCIe lane, so the perf increase may be even better with a more
adequate interconnect.

Regards,
Lucas
-- 
Pengutronix e.K.                           | Lucas Stach                 |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

Thierry Reding

2014-Mar-24 21:49 UTC

head link

[Nouveau] [PATCH 01/12] drm/nouveau: fix missing newline

On Mon, Mar 24, 2014 at 05:42:23PM +0900, Alexandre Courbot
wrote:> Add a missing newline at the end of a DRM_INFO message.
> 
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: Thierry Reding <treding at nvidia.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140324/50d176b9/attachment.sig>

Thierry Reding

2014-Mar-24 21:54 UTC

head link

[Nouveau] [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A

On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot
wrote:> GK20A's timer is directly attached to the system timer and cannot be
> calibrated. Skip the calibration phase on that chip since the
> corresponding registers do not exist.
> 
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> index c0bdd10358d7..822fe0d8a871 100644
> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>  	if (ret)
>  		return ret;
>  
> +	/* gk20a does not have the calibration registers */
> +	if (device->chipset == 0xea)
> +		goto skip_clk_init;
I'm concerned that this won't scale in the future. Perhaps a better
solution would be to add a "flags" or "features" field to
struct
nouveau_device along with feature bits such as HAS_CALIBRATION or
similar.

That way we don't have to touch this code for every new future Tegra
chip. Unless perhaps if there's a reason to expect things to change in
newer generations.

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140324/16ab7985/attachment.sig>

Thierry Reding

2014-Mar-24 22:10 UTC

head link

[Nouveau] [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3

On Mon, Mar 24, 2014 at 05:42:26PM +0900, Alexandre Courbot wrote:
[...]> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
[...]>  static int
> -nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object
*engine,
> -	      struct nouveau_oclass *oclass, void *data, u32 size,
> -	      struct nouveau_object **pobject)
> +nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar)
>  {
[...]> -	/* BAR3 */
>  	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
> -				&priv->bar[0].mem);
> -	mem = priv->bar[0].mem;
> +				&priv->bar[nr].mem);
> +	mem = priv->bar[nr].mem;
>  	if (ret)
>  		return ret;
>  
>  	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
> -				&priv->bar[0].pgd);
> +				&priv->bar[nr].pgd);
>  	if (ret)
>  		return ret;
[...]> +static int
> +nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object
*engine,
> +	      struct nouveau_oclass *oclass, void *data, u32 size,
> +	      struct nouveau_object **pobject)
> +{
[...]> +	/* BAR3 */
> +	if (has_bar3) {
> +		ret = nvc0_bar_init_vm(priv, 0, 3);
[...]> +	/* BAR1 */
> +	ret = nvc0_bar_init_vm(priv, 1, 1);
>  	if (ret)
>  		return ret;
The calls to nvc0_bar_init_vm() are somewhat confusing in my opinion. It
is hard to see from the invocation what these numbers mean and therefore
distinguish which parameter is which.

Perhaps a slightly more readable way would be to pass in a pointer to a
structure as second parameter instead of the index into an array. So
it'd look somewhat like this:

	if (has_bar3) {
		ret = nvc0_bar_init_vm(priv, &priv->bar[0], 3);
		...
	}
	...
	ret = nvc0_bar_init_vm(priv, &priv->bar[1], 1);
	...

Unfortunately that would require a new type to be created for the bar[]
structures, so it'd be slightly more intrusive.

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140324/e33e3552/attachment.sig>

Thierry Reding

2014-Mar-24 22:13 UTC

head link

[Nouveau] [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists

On Mon, Mar 24, 2014 at 05:42:25PM +0900, Alexandre Courbot
wrote:> Some chips that use system memory exclusively (e.g. GK20A) do not
> expose 2 BAR regions. For them only BAR1 exists, and it should be used
> for USERD mapping. Do not map BAR3 if its resource does not exist.
> 
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> index bdf594116f3f..d713eeb75b13 100644
> --- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> @@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent,
>  {
>  	struct nouveau_device *device = nv_device(parent);
>  	struct nouveau_bar *bar;
> +	bool has_bar3 = nv_device_resource_len(device, 3) != 0;
I don't think this variable is really necessary since you only use the
expression once anyway, but I don't feel very strongly about it, so
either way:

Reviewed-by: Thierry Reding <treding at nvidia.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140324/67a79728/attachment.sig>

Thierry Reding

2014-Mar-24 22:14 UTC

head link

[Nouveau] [PATCH 05/12] drm/nouveau/fifo: add GK20A support

On Mon, Mar 24, 2014 at 05:42:27PM +0900, Alexandre Courbot
wrote:> GK20A's FIFO is compatible with NVE0, but only features 128 channels
and
> 1 runlist.
> 
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/Makefile                   |  1 +
>  drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h    |  1 +
>  drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c    | 35
++++++++++++++++++++++
>  drivers/gpu/drm/nouveau/core/include/engine/fifo.h |  1 +
>  4 files changed, 38 insertions(+)
>  create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
Looks good to me:

Reviewed-by: Thierry Reding <treding at nvidia.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140324/9c7c7236/attachment.sig>

Thierry Reding

2014-Mar-24 22:34 UTC

head link

[Nouveau] [PATCH 06/12] drm/nouveau/ibus: add GK20A support

On Mon, Mar 24, 2014 at 05:42:28PM +0900, Alexandre Courbot wrote:
[...]> diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
[...]> +#include <subdev/ibus.h>
> +
> +struct nvea_ibus_priv {
> +	struct nouveau_ibus base;
> +};
> +
> +static void
> +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
> +{
> +	nv_mask(priv, 0x137250, 0x3f, 0);
> +
> +	nv_mask(priv, 0x000200, 0x20, 0);
> +	udelay(20);
usleep_range()?
> +static void
> +nvea_ibus_intr(struct nouveau_subdev *subdev)
> +{
[...]> +	/* Acknowledge interrupt */
> +	nv_mask(priv, 0x12004c, 0x2, 0x2);
> +
> +	while (--retry >= 0) {
> +		command = nv_rd32(priv, 0x12004c) & 0x3f;
> +		if (command == 0)
> +			break;
> +	}
> +
> +	if (retry < 0)
> +		nv_warn(priv, "timeout waiting for ringmaster ack\n");
> +}
Perhaps I'm being paranoid, but this loop now depends on the frequency
of the various clocks involved and therefore might break at some point
if the frequencies get sufficiently high.

So a slightly safer implementation would use a proper timeout using a
combination of msecs_to_jiffies(), time_before() and usleep_range(),
like so:

	timeout = jiffies + msecs_to_jiffies(...);

	while (time_before(jiffies, timeout)) {
		command = nv_rd32(...) & 0x3f;
		if (command == 0)
			break;

		usleep_range(...);
	}

	if (time_after(jiffies, timeout))
		nv_warn(...);

This assumes that there's some known timeout after which the ringmaster
is expected to have acked the interrupt. On that note, I wonder if the
warning is accurate here: it's my understanding that writing 0x2 to the
register does acknowledge the interrupt, so the ringmaster does in fact
"clear" it rather than "acknowledge" it, doesn't it?

Although now that I mention it I seem to remember that this write is
actually sending a command to the ring master and perhaps waiting for
the register to return to 0 is indeed waiting for an ACK of sorts. Maybe
adding a comment or so describing what this sequence does would be
appropriate here?

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140324/16ffb2ac/attachment-0001.sig>

Thierry Reding

2014-Mar-24 22:58 UTC

head link

[Nouveau] [PATCH 08/12] drm/nouveau/graph: enable when using external firmware

On Mon, Mar 24, 2014 at 05:42:30PM +0900, Alexandre Courbot wrote:
[...]> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> index 6ef8bf181b2d..f997a18f5760 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent,
struct nouveau_object *engine,
>  	struct nvc0_graph_oclass *oclass = (void *)bclass;
>  	struct nouveau_device *device = nv_device(parent);
>  	struct nvc0_graph_priv *priv;
> +	bool use_fw;
Perhaps "ext_fw" or "use_ext_fw" would be more accurate.
>  	int ret, i;
>  
> +	use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW",
false);
> +
>  	ret = nouveau_graph_create(parent, engine, bclass,
> -				   (oclass->fecs.ucode != NULL), &priv);
> +				   (oclass->fecs.ucode != NULL) || use_fw,
> +				   &priv);
Or perhaps a more intuitive way would be to name the variable "enable"
and have something like:

	if (!nouveau_boolopt(device->cfgopt, "NvGrUseFW", false))
		enable = oclass->fecs.ucode != NULL;
	else
		enable = true;

	ret = nouveau_graph_create(parent, engine, bclass, enable, &priv);
	...

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140324/8d29e34e/attachment.sig>

Thierry Reding

2014-Mar-24 23:02 UTC

head link

[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time

On Mon, Mar 24, 2014 at 05:42:31PM +0900, Alexandre Courbot
wrote:> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
> run from non-prepadded firmware files.
Perhaps this (and the comment in the code) should mention a unit. In
this case it's 0x40 words. Also, I think using a decimal number would
read easier here and in the comment. Furthermore, perhaps choosing a
byte as the unit would be even more intuitive.

But either way the code is correct, so:

Reviewed-by: Thierry Reding <treding at nvidia.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140325/589a12af/attachment.sig>

Thierry Reding

2014-Mar-24 23:10 UTC

head link

[Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()

On Mon, Mar 24, 2014 at 05:42:33PM +0900, Alexandre Courbot
wrote:> GK20A does not embed a dedicated COPY engine and thus cannot allocate
> the copy channel that nouveau_accel_init() attempts to create. It also
> lacks any display hardware, so the creation of a software channel does
> not apply neither.
Perhaps this should be two separate patches?
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c
b/drivers/gpu/drm/nouveau/nouveau_drm.c
[...]> +	if (device->chipset == 0xea) {
> +		/* gk20a does not have CE0/CE1 */
This would be another good candidate for a feature flag.
> +		arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
> +		arg1 = 1;
> +	} else
>  	if (device->card_type >= NV_E0) {
The formatting here is somewhat weird. From a brief look I couldn't find
any indication that nouveau deviates from the standard coding style, so
this should be:

	} else if (...) {
> +	/* Need to figure out how to handle sw for gk20a */
> +	if (device->chipset == 0xea)
> +		goto skip_sw_init;
The commit message makes it sound like SW isn't needed since gk20a
"lacks any display hardware". In that case the comment here
doesn't make
much sense.

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140325/1f743847/attachment.sig>

Thierry Reding

2014-Mar-24 23:11 UTC

head link

[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A

On Mon, Mar 24, 2014 at 05:42:34PM +0900, Alexandre Courbot wrote:
[...]> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
[...]> +		/* TODO will need an implementation for this at some point... */
Do we? If so the commit message in 11/12 is misleading.

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20140325/268fe079/attachment.sig>

Ben Skeggs

2014-Mar-26 04:22 UTC

head link

[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot at
nvidia.com> wrote:> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
> run from non-prepadded firmware files.
>
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> index f997a18f5760..367e72daf8b1 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> @@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32
fuc_base,
>                         nv_wr32(priv, fuc_base + 0x0188, i >> 6);
>                 nv_wr32(priv, fuc_base + 0x0184, code->data[i]);
>         }
> +
> +       /* code must be padded to 0x40 */
> +       for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++)"for (; i & 0x3f; i++)" would work just as well :)
> +               nv_wr32(priv, fuc_base + 0x0184, 0);
>  }
>
>  static void
> --
> 1.9.1
>
> _______________________________________________
> Nouveau mailing list
> Nouveau at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau

Ben Skeggs

2014-Mar-26 04:24 UTC

head link

[Nouveau] [PATCH 10/12] drm/nouveau/graph: add GK20A support

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot at
nvidia.com> wrote:> Add a GR device for GK20A based on NVE4, with the correct classes
> definitions (GK20A's 3D class is 0xa297).
>
> Most of the NVE4 code can be used on GK20A, so make relevant bits of
> NVE4 available to other chips as well.This will need a bit of a rebase on top of the tree I mentioned
earlier (also queued for drm-next now), where I've further split out
and named the various chunks of state.

Does GK104 match entirely correctly, or just happen to work?  I could
probably hunt down the GK20A netlist images and check that actually :)
>
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/Makefile                   |  1 +
>  .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c    |  4 +-
>  drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h   |  9 +++
>  drivers/gpu/drm/nouveau/core/engine/graph/nve4.c   |  2 +-
>  drivers/gpu/drm/nouveau/core/engine/graph/nvea.c   | 75
++++++++++++++++++++++
>  .../gpu/drm/nouveau/core/include/engine/graph.h    |  1 +
>  6 files changed, 89 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
>
> diff --git a/drivers/gpu/drm/nouveau/Makefile
b/drivers/gpu/drm/nouveau/Makefile
> index 708d2e33835f..032a4744f843 100644
> --- a/drivers/gpu/drm/nouveau/Makefile
> +++ b/drivers/gpu/drm/nouveau/Makefile
> @@ -270,6 +270,7 @@ nouveau-y += core/engine/graph/nvc8.o
>  nouveau-y += core/engine/graph/nvd7.o
>  nouveau-y += core/engine/graph/nvd9.o
>  nouveau-y += core/engine/graph/nve4.o
> +nouveau-y += core/engine/graph/nvea.o
>  nouveau-y += core/engine/graph/nvf0.o
>  nouveau-y += core/engine/graph/nv108.o
>  nouveau-y += core/engine/mpeg/nv31.o
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> index e2de73ee5eee..3904073f860d 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> @@ -804,7 +804,7 @@ nve4_grctx_init_unk[] = {
>         {}
>  };
>
> -static void
> +void
>  nve4_grctx_generate_mods(struct nvc0_graph_priv *priv, struct nvc0_grctx
*info)
>  {
>         u32 magic[GPC_MAX][2];
> @@ -962,7 +962,7 @@ nve4_grctx_generate_main(struct nvc0_graph_priv *priv,
struct nvc0_grctx *info)
>         nv_mask(priv, 0x41be10, 0x00800000, 0x00800000);
>  }
>
> -static struct nvc0_graph_init *
> +struct nvc0_graph_init *
>  nve4_grctx_init_hub[] = {
>         nvc0_grctx_init_base,
>         nve4_grctx_init_unk40xx,
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> index b0ab6de270b2..904f09b540c5 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> @@ -221,6 +221,8 @@ void nvc0_grctx_generate_r418bb8(struct nvc0_graph_priv
*);
>  void nve4_grctx_generate_r418bb8(struct nvc0_graph_priv *);
>  void nvc0_grctx_generate_r406800(struct nvc0_graph_priv *);
>
> +void nve4_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx
*);
> +
>  extern struct nouveau_oclass *nvc0_grctx_oclass;
>  extern struct nvc0_graph_init *nvc0_grctx_init_hub[];
>  extern struct nvc0_graph_init nvc0_grctx_init_base[];
> @@ -237,12 +239,17 @@ extern struct nvc0_graph_init
nvc0_grctx_init_gpc_1[];
>  extern struct nvc0_graph_init nvc0_grctx_init_tpc[];
>  extern struct nvc0_graph_init nvc0_grctx_init_icmd[];
>  extern struct nvc0_graph_init nvd9_grctx_init_icmd[]; //
> +extern struct nvc0_graph_init nve4_grctx_init_icmd[];
> +
> +extern struct nvc0_graph_init *nve4_grctx_init_hub[];
> +extern struct nvc0_graph_init *nve4_grctx_init_gpc[];
>
>  extern struct nvc0_graph_mthd nvc0_grctx_init_mthd[];
>  extern struct nvc0_graph_init nvc0_grctx_init_902d[];
>  extern struct nvc0_graph_init nvc0_grctx_init_9039[];
>  extern struct nvc0_graph_init nvc0_grctx_init_90c0[];
>  extern struct nvc0_graph_init nvc0_grctx_init_mthd_magic[];
> +extern struct nvc0_graph_init nve4_grctx_init_a097[];
>
>  void nvc1_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx
*);
>  void nvc1_grctx_generate_unkn(struct nvc0_graph_priv *);
> @@ -277,6 +284,8 @@ extern struct nvc0_graph_init
nvf0_grctx_init_unk60xx[];
>
>  extern struct nouveau_oclass *nv108_grctx_oclass;
>
> +extern struct nvc0_graph_init *nve4_graph_init_mmio[];
> +
>  #define mmio_data(s,a,p) do {                                             
\
>         info->buffer[info->buffer_nr] = round_up(info->addr, (a));
\
>         info->addr = info->buffer[info->buffer_nr++] + (s);       
\
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> index 05ec09c88517..442857c5c120 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> @@ -298,7 +298,7 @@ nve4_graph_init(struct nouveau_object *object)
>         return nvc0_graph_init_ctxctl(priv);
>  }
>
> -static struct nvc0_graph_init *
> +struct nvc0_graph_init *
>  nve4_graph_init_mmio[] = {
>         nve4_graph_init_regs,
>         nvc0_graph_init_unk40xx,
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
> new file mode 100644
> index 000000000000..d5e6a1adcacb
> --- /dev/null
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
> @@ -0,0 +1,75 @@
> +/*
> + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
"Software"),
> + * to deal in the Software without restriction, including without
limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included
in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "nvc0.h"
> +
> +static struct nouveau_oclass
> +nvea_graph_sclass[] = {
> +       { 0x902d, &nouveau_object_ofuncs },
> +       { 0xa040, &nouveau_object_ofuncs },
> +       { 0xa297, &nouveau_object_ofuncs },
> +       { 0xa0c0, &nouveau_object_ofuncs },
> +       {}
> +};
> +
> +static struct nvc0_graph_mthd
> +nvea_grctx_init_mthd[] = {
> +       { 0xa297, nve4_grctx_init_a097, },
> +       { 0x902d, nvc0_grctx_init_902d, },
> +       { 0x902d, nvc0_grctx_init_mthd_magic, },
> +       {}
> +};
> +
> +struct nouveau_oclass *
> +nvea_grctx_oclass = &(struct nvc0_grctx_oclass) {
> +       .base.handle = NV_ENGCTX(GR, 0xea),
> +       .base.ofuncs = &(struct nouveau_ofuncs) {
> +               .ctor = nvc0_graph_context_ctor,
> +               .dtor = nvc0_graph_context_dtor,
> +               .init = _nouveau_graph_context_init,
> +               .fini = _nouveau_graph_context_fini,
> +               .rd32 = _nouveau_graph_context_rd32,
> +               .wr32 = _nouveau_graph_context_wr32,
> +       },
> +       .main = nve4_grctx_generate_main,
> +       .mods = nve4_grctx_generate_mods,
> +       .unkn = nve4_grctx_generate_unkn,
> +       .hub  = nve4_grctx_init_hub,
> +       .gpc  = nve4_grctx_init_gpc,
> +       .icmd = nve4_grctx_init_icmd,
> +       .mthd = nvea_grctx_init_mthd,
> +}.base;
> +
> +
> +struct nouveau_oclass *
> +nvea_graph_oclass = &(struct nvc0_graph_oclass) {
> +       .base.handle = NV_ENGINE(GR, 0xea),
> +       .base.ofuncs = &(struct nouveau_ofuncs) {
> +               .ctor = nvc0_graph_ctor,
> +               .dtor = nvc0_graph_dtor,
> +               .init = nve4_graph_init,
> +               .fini = _nouveau_graph_fini,
> +       },
> +       .cclass = &nvea_grctx_oclass,
> +       .sclass = nvea_graph_sclass,
> +       .mmio = nve4_graph_init_mmio,
> +}.base;
> diff --git a/drivers/gpu/drm/nouveau/core/include/engine/graph.h
b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> index 97705618de97..632d7e172172 100644
> --- a/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> +++ b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> @@ -68,6 +68,7 @@ extern struct nouveau_oclass *nvc8_graph_oclass;
>  extern struct nouveau_oclass *nvd7_graph_oclass;
>  extern struct nouveau_oclass *nvd9_graph_oclass;
>  extern struct nouveau_oclass *nve4_graph_oclass;
> +extern struct nouveau_oclass *nvea_graph_oclass;
>  extern struct nouveau_oclass *nvf0_graph_oclass;
>  extern struct nouveau_oclass *nv108_graph_oclass;
>
> --
> 1.9.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

Ben Skeggs

2014-Mar-26 04:28 UTC

head link

[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot at
nvidia.com> wrote:> Set the correct subdev/engine classes when GK20A (0xea) is probed.
>
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20
++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
> index 987edbc30a09..8509dd57de1f 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
> @@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device)
>                 device->oclass[NVDEV_ENGINE_PPP    ] =
&nvc0_ppp_oclass;
>                 device->oclass[NVDEV_ENGINE_PERFMON] =
&nve0_perfmon_oclass;
>                 break;
> +       case 0xea:
> +               device->cname = "GK20A";
> +               device->oclass[NVDEV_SUBDEV_MC     ] =  nvc3_mc_oclass;
> +               device->oclass[NVDEV_SUBDEV_BUS    ] =  nvc0_bus_oclass;
> +               device->oclass[NVDEV_SUBDEV_TIMER  ] =
&nv04_timer_oclass;As per note on the PTIMER patch, can just switch this to
"gk20a_timer_oclass" on the latest code.
> +               device->oclass[NVDEV_SUBDEV_FB     ] =  nvea_fb_oclass;
> +               device->oclass[NVDEV_SUBDEV_IBUS   ] =
&nvea_ibus_oclass;
> +               device->oclass[NVDEV_SUBDEV_INSTMEM] =
nv50_instmem_oclass;
> +               device->oclass[NVDEV_SUBDEV_VM     ] =
&nvc0_vmmgr_oclass;
> +               device->oclass[NVDEV_SUBDEV_BAR    ] =
&nvc0_bar_oclass;
> +               device->oclass[NVDEV_ENGINE_DMAOBJ ] =
&nvd0_dmaeng_oclass;
> +               device->oclass[NVDEV_ENGINE_FIFO   ] = 
nvea_fifo_oclass;
> +               /* TODO will need an implementation for this at some
point... */
> +#if 0
> +               device->oclass[NVDEV_ENGINE_SW     ] = 
nvc0_software_oclass;
> +#endif
> +               device->oclass[NVDEV_ENGINE_GR     ] = 
nvea_graph_oclass;
> +               device->oclass[NVDEV_ENGINE_COPY2  ] =
&nve0_copy2_oclass;
> +               device->oclass[NVDEV_ENGINE_PERFMON] =
&nve0_perfmon_oclass;
> +               break;
>         case 0xf0:
>                 device->cname = "GK110";
>                 device->oclass[NVDEV_SUBDEV_VBIOS  ] =
&nouveau_bios_oclass;
> --
> 1.9.1
>
> _______________________________________________
> Nouveau mailing list
> Nouveau at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau

Alexandre Courbot

2014-Mar-26 06:33 UTC

head link

[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd

Hi Lucas,

On Mon, Mar 24, 2014 at 10:19 PM, Lucas Stach <l.stach at pengutronix.de>
wrote:> Hi Alexandre,
>
> Am Montag, den 24.03.2014, 17:42 +0900 schrieb Alexandre Courbot:
>> Hi everyone,
> [...]
>>
>> A few lines of hacks (not included here) are still needed to deal with
cached
>> mappings triggering external aborts and CPU/GPU memory coherency
issues, but I
>> hope to understand and address these issues next.
>
> For the coherency issue part you may want to look at my Nouveau on ARM
> series. Most of it never made it upstream, as I lacked the time to work
> further on this, but it solves the coherency issue from the kernel.
Oh, thanks for pointing this out, it will probably be most useful.
Shall I assume the patches at
https://www.mail-archive.com/nouveau at lists.freedesktop.org/msg13557.html
are up-to-date? Would you mind if I include the relevant patches of
yours in the next iteration of this series?
>
> It does so by doing the necessary manual cache flushes/invalidates on
> buffer access, so costs some performance. To avoid this you really want
> to get writecombined mappings into the kernel<->userspace interface.
> Simply mapping the pushbuf as WC/US has brought a 7% performance
> increase in OpenArena when I last tested this. This test was done with
> only one PCIe lane, so the perf increase may be even better with a more
> adequate interconnect.
Interestingly if I allow writecombined mappings in the kernel I get
faults when attempting the read the mapped area:

[   78.074854] Unhandled fault: external abort on non-linefetch
(0x1008) at 0xf003e010
...
[   78.337862] [<c03491a8>] (nouveau_bo_rd32) from [<c0346374>]
(nouveau_fence_update+0x5c/0x80)
[   78.352536] [<c0346374>] (nouveau_fence_update) from [<c03463b0>]
(nouveau_fence_done+0x18/0x28)
[   78.367531] [<c03463b0>] (nouveau_fence_done) from [<c02b852c>]
(ttm_bo_wait+0x104/0x184)
[   78.381915] [<c02b852c>] (ttm_bo_wait) from [<c034c718>]
(nouveau_gem_ioctl_cpu_prep+0x40/0xe8)
[   78.396849] [<c034c718>] (nouveau_gem_ioctl_cpu_prep) from
[<c029fd5c>] (drm_ioctl+0x404/0x4b8)
[   78.411790] [<c029fd5c>] (drm_ioctl) from [<c0343960>]
(nouveau_drm_ioctl+0x54/0x80)
[   78.425805] [<c0343960>] (nouveau_drm_ioctl) from [<c00ea5ec>]
(do_vfs_ioctl+0x3f0/0x5bc)
[   78.440277] [<c00ea5ec>] (do_vfs_ioctl) from [<c00ea7ec>]
(SyS_ioctl+0x34/0x5c)
[   78.453918] [<c00ea7ec>] (SyS_ioctl) from [<c000e5a0>]
(ret_fast_syscall+0x0/0x30)

To avoid these I need to set the VRAM default_caching to
TTM_PL_FLAG_UNCACHED. It is not clear to me why this is needed. The BO
being accessed through the BAR, they are correctly considered as IO
memory and mapped using ttm_bo_ioremap(), so it really seems to be
unhappy with the WC mapping itself.

Note that if I go ahead and force the use of pgprot_writecombine() in
ttm_io_prot() to get writecombined user-space mappings, pure DRM
programs that map a buffer and try to read it fail similarly, while
Mesa's glReadPixels() seems to be happy. I'm not sure what it does
differently here.

Cheers,
Alex.

Reasonably Related Threads

Search for more possibly parallel threads

Nouveau - Mar 2014 - [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()

[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd

[Nouveau] [PATCH 01/12] drm/nouveau: fix missing newline

[Nouveau] [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A

[Nouveau] [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists

[Nouveau] [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3

[Nouveau] [PATCH 05/12] drm/nouveau/fifo: add GK20A support

[Nouveau] [PATCH 06/12] drm/nouveau/ibus: add GK20A support

[Nouveau] [PATCH 07/12] drm/nouveau/fb: add GK20A support

[Nouveau] [PATCH 08/12] drm/nouveau/graph: enable when using external firmware

[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time

[Nouveau] [PATCH 10/12] drm/nouveau/graph: add GK20A support

[Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()

[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A

[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd

[Nouveau] [PATCH 01/12] drm/nouveau: fix missing newline

[Nouveau] [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A

[Nouveau] [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3

[Nouveau] [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists

[Nouveau] [PATCH 05/12] drm/nouveau/fifo: add GK20A support

[Nouveau] [PATCH 06/12] drm/nouveau/ibus: add GK20A support

[Nouveau] [PATCH 08/12] drm/nouveau/graph: enable when using external firmware

[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time

[Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()

[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A

[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time

[Nouveau] [PATCH 10/12] drm/nouveau/graph: add GK20A support

[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A

[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd

Reasonably Related Threads