thr3ads.net - Nouveau - [Nouveau] [PATCH v2 0/6] Improve GK20A support, introduce GM20B, firmware paths [Jun 2015]

Alexandre Courbot

2015-Jun-23 06:16 UTC

[Nouveau] [PATCH v2 0/6] Improve GK20A support, introduce GM20B, firmware paths

Second version of this patchset. Not many changes since the first version; I
hope this means the changes are not too controversial.

Changes since v1:
- Removed lookup for previous FW files in "nouveau/"
- Went back to using request_firmware() since we only try to load one file

Original cover letter follows:

GM20B is the GPU of the upcoming Tegra X1 SoC. This series adds initial support
for it, based on a rework of the already-supported GK20A. It also introduces
support for NVIDIA-provided firmware files, which is why I have added a few
NVIDIA people who are relevant to this discussion.

The first patch adds support for loading the FECS and GPCCS firmwares from
firmware files officially released by NVIDIA. As you know such firmwares will
soon become a necessity for newer GPUs because some falcons will require signed
firmware to operate. In addition there is no reverse-engineered version of the
GK20A firmwares yet, so since an external file is needed anyway, it may as well
be provided officially. NVIDIA plans to release firmwares as one file per binary
to keep things simple. The layout will be nvidia/<gpu>/<firmware>.bin, so for
GK20A FECS/GPCCS we have:

nvidia/gk20a/fecs_inst.bin (aka fuc409c)
nvidia/gk20a/fecs_data.bin (aka fuc409d)
nvidia/gk20a/gpccs_inst.bin (aka fuc41ac)
nvidia/gk20a/gpccs_data.bin (aka fuc41ad)
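
As an illustration of the layout above, here is a minimal userspace sketch of how such a path could be built from a chip name and a firmware name. It is not the driver code; fw_path() and its error convention are hypothetical names used only for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "nvidia/<gpu>/<firmware>.bin" into out, lowercasing the chip name.
 * Returns 0 on success, -1 if the chip name or the resulting path is too long.
 * fw_path() is a hypothetical helper, not a Nouveau function.
 */
static int fw_path(char *out, size_t outsz, const char *chip,
		   const char *fwname)
{
	char lower[16];
	size_t i, n = strlen(chip);
	int len;

	if (n >= sizeof(lower))
		return -1;

	/* Chip names such as "GK20A" become directory names such as "gk20a" */
	for (i = 0; i < n; i++)
		lower[i] = (char)tolower((unsigned char)chip[i]);
	lower[n] = '\0';

	len = snprintf(out, outsz, "nvidia/%s/%s.bin", lower, fwname);
	if (len < 0 || (size_t)len >= outsz)
		return -1;

	return 0;
}
```

Called with "GK20A" and "fecs_inst" and a 64-byte buffer, this should produce the first path in the list above. Truncation is reported as an error rather than silently requesting a wrong file name.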

All firmware files listed in this patchset are clean for release, and I am just
waiting for a community ack of the layout to send a patch to linux-firmware.

The second patch reworks existing GK20A support to make it closer to what our
nvgpu driver does. Support so far was heavily based on GK104, which somehow made
me feel uneasy - and quite scared after I looked more closely at what nvgpu
does. In particular the GK104 MMIO bundles differed significantly from what
nvgpu does. This change aligns things and (probably less significant, but still
safer) reorders the initialization sequence to match the one of nvgpu.

You will note that the MMIO bundles now come as firmware files of their own. I
am not sure the community will be pleased with an increase in firmware files;
however, the rationale for this is as follows:
- These initialization sequences are related to the firmwares, so it makes sense
  to distribute them under the same medium
- If NVIDIA needs to update the firmwares for some reason, it can atomically
  update the MMIO bundles and provide a coherent set, instead of having to
  introduce versioning into the firmware and driver
- For IP reasons, I as an NVIDIA employee cannot extract these register
  sequences and link them into Nouveau
- These are just a bunch of register address/value pairs anyway
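
To illustrate the last point, here is a minimal userspace sketch, assuming a bundle file is a flat array of 32-bit address/value pairs in host byte order (this mirrors struct gk20a_fw_av in patch 2). The names av_pair and av_parse are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One register write: the address to poke and the value to store there */
struct av_pair {
	uint32_t addr;
	uint32_t data;
};

/*
 * Copy at most max pairs from a raw blob into out.
 * Returns the number of complete pairs present in the blob; trailing bytes
 * that do not form a complete pair are ignored.
 */
static size_t av_parse(const void *blob, size_t size,
		       struct av_pair *out, size_t max)
{
	const unsigned char *p = blob;
	size_t nent = size / sizeof(struct av_pair);
	size_t i;

	/* memcpy avoids any alignment assumption about the blob */
	for (i = 0; i < nent && i < max; i++)
		memcpy(&out[i], p + i * sizeof(*out), sizeof(*out));

	return nent;
}
```

The driver side then only has to walk the resulting array and issue one register write per entry, which is why no versioning logic is needed when the files are updated together.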

The new firmware files introduced are:

nvidia/gk20a/sw_nonctx.bin (gr_pack_mmio)
nvidia/gk20a/sw_ctx.bin (grctx_pack_hub, grctx_pack_gpc, grctx_pack_zcull,
                         grctx_pack_tpc, grctx_pack_ppc)
nvidia/gk20a/sw_bundle_init.bin (grctx_pack_icmd)
nvidia/gk20a/sw_method_init.bin (grctx_pack_mthd)

The third patch is trivial and adds the GM20B FIFO device.

The fourth patch adds GM20B GR based on the reworked GK20A support. GM20B will rely
on the same firmware files as GK20A (also clean for release). Note that this is
not full support yet for released devices, which will require secure boot. This
will be my focus once this patchset is merged (Deepak got a working version,
but there is still a lot of work to do on it before it is upstreamable).

The last two patches recognize GM20B at the device and platform level. Nothing
really exciting.

I hope the addition of firmware files will not become too controversial. If it
does, I have good arguments to support it. ;) Besides the GK20A rework that
probably few people care about, the point is the addition of a basic layout for
the firmwares that NVIDIA will officially release to finally support secure
boot, and I would like to make sure we get this right.

Alexandre Courbot (6):
  gr: use NVIDIA-provided external firmwares
  gr/gk20a: use same initialization sequence as nvgpu
  fifo: add GM20B fifo
  gr: add GM20B support
  device: recognize GM20B
  platform: recognize GM20B

 drm/nouveau/include/nvkm/engine/fifo.h |   1 +
 drm/nouveau/include/nvkm/engine/gr.h   |   1 +
 drm/nouveau/nouveau_platform.c         |   1 +
 drm/nouveau/nvkm/engine/device/gm100.c |  20 ++
 drm/nouveau/nvkm/engine/fifo/Kbuild    |   1 +
 drm/nouveau/nvkm/engine/fifo/gk104.h   |   4 +
 drm/nouveau/nvkm/engine/fifo/gm204.c   |   2 +-
 drm/nouveau/nvkm/engine/fifo/gm20b.c   |  34 ++++
 drm/nouveau/nvkm/engine/gr/Kbuild      |   2 +
 drm/nouveau/nvkm/engine/gr/ctxgf100.h  |   7 +
 drm/nouveau/nvkm/engine/gr/ctxgk20a.c  |  65 +++++--
 drm/nouveau/nvkm/engine/gr/ctxgm107.c  |   2 +-
 drm/nouveau/nvkm/engine/gr/ctxgm204.c  |   4 +-
 drm/nouveau/nvkm/engine/gr/ctxgm20b.c  | 110 +++++++++++
 drm/nouveau/nvkm/engine/gr/gf100.c     |  35 ++--
 drm/nouveau/nvkm/engine/gr/gf100.h     |  18 ++
 drm/nouveau/nvkm/engine/gr/gk20a.c     | 336 +++++++++++++++++++++++++++++++--
 drm/nouveau/nvkm/engine/gr/gk20a.h     |  35 ++++
 drm/nouveau/nvkm/engine/gr/gm20b.c     |  84 +++++++++
 19 files changed, 716 insertions(+), 46 deletions(-)
 create mode 100644 drm/nouveau/nvkm/engine/fifo/gm20b.c
 create mode 100644 drm/nouveau/nvkm/engine/gr/ctxgm20b.c
 create mode 100644 drm/nouveau/nvkm/engine/gr/gk20a.h
 create mode 100644 drm/nouveau/nvkm/engine/gr/gm20b.c

-- 
2.4.4

Alexandre Courbot

2015-Jun-23 06:16 UTC

[Nouveau] [PATCH v2 1/6] gr: use NVIDIA-provided external firmwares

NVIDIA will officially start providing GR firmwares through
linux-firmware for GPUs that require it. Change the GR firmware lookup
function to use these files.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drm/nouveau/nvkm/engine/gr/gf100.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drm/nouveau/nvkm/engine/gr/gf100.c b/drm/nouveau/nvkm/engine/gr/gf100.c
index ca11ddb6ed46..454080339572 100644
--- a/drm/nouveau/nvkm/engine/gr/gf100.c
+++ b/drm/nouveau/nvkm/engine/gr/gf100.c
@@ -1550,18 +1550,25 @@ gf100_gr_ctor_fw(struct gf100_gr_priv *priv, const char *fwname,
 {
 	struct nvkm_device *device = nv_device(priv);
 	const struct firmware *fw;
-	char f[32];
+	char f[64];
+	char cname[16];
 	int ret;
+	int i;
+
+	/* Convert device name to lowercase */
+	strncpy(cname, device->cname, sizeof(cname));
+	cname[sizeof(cname) - 1] = '\0';
+	i = strlen(cname);
+	while (i) {
+		--i;
+		cname[i] = tolower(cname[i]);
+	}
 
-	snprintf(f, sizeof(f), "nouveau/nv%02x_%s", device->chipset, fwname);
+	snprintf(f, sizeof(f), "nvidia/%s/%s.bin", cname, fwname);
 	ret = request_firmware(&fw, f, nv_device_base(device));
 	if (ret) {
-		snprintf(f, sizeof(f), "nouveau/%s", fwname);
-		ret = request_firmware(&fw, f, nv_device_base(device));
-		if (ret) {
-			nv_error(priv, "failed to load %s\n", fwname);
-			return ret;
-		}
+		nv_error(priv, "failed to load %s\n", fwname);
+		return ret;
 	}
 
 	fuc->size = fw->size;
@@ -1615,10 +1622,10 @@ gf100_gr_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 
 	if (use_ext_fw) {
 		nv_info(priv, "using external firmware\n");
-		if (gf100_gr_ctor_fw(priv, "fuc409c", &priv->fuc409c) ||
-		    gf100_gr_ctor_fw(priv, "fuc409d", &priv->fuc409d) ||
-		    gf100_gr_ctor_fw(priv, "fuc41ac", &priv->fuc41ac) ||
-		    gf100_gr_ctor_fw(priv, "fuc41ad", &priv->fuc41ad))
+		if (gf100_gr_ctor_fw(priv, "fecs_inst", &priv->fuc409c) ||
+		    gf100_gr_ctor_fw(priv, "fecs_data", &priv->fuc409d) ||
+		    gf100_gr_ctor_fw(priv, "gpccs_inst", &priv->fuc41ac) ||
+		    gf100_gr_ctor_fw(priv, "gpccs_data", &priv->fuc41ad))
 			return -ENODEV;
 		priv->firmware = true;
 	}
-- 
2.4.4

Alexandre Courbot

2015-Jun-23 06:16 UTC

[Nouveau] [PATCH v2 2/6] gr/gk20a: use same initialization sequence as nvgpu

GK20A's initialization was based on GK104, but differences exist in the
way the initial context is built and the initialization process itself.

This patch follows the same initialization sequence that nvgpu performs
to avoid bad surprises. Since the register bundle initialization also
differs considerably from GK104, the register packs are now loaded from
firmware files, again similarly to what is done with nvgpu.
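
For the method init file, each entry's address word carries both the class and the method offset. The sketch below restates the decoding that gk20a_gr_av_to_method() in this patch performs (class in the low 16 bits, method address taken from the high 16 bits shifted right by 14) as standalone userspace C. The names method_ref, method_decode and class_runs are hypothetical and used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded form of one packed address word from the method init file */
struct method_ref {
	uint32_t class_id;	/* object class, e.g. 0xa297 */
	uint32_t addr;		/* method address within that class */
};

/* Split a packed word into class (low 16 bits) and method address */
static struct method_ref method_decode(uint32_t word)
{
	struct method_ref m;

	m.class_id = word & 0xffff;
	m.addr = (word & 0xffff0000u) >> 14;
	return m;
}

/*
 * Count how many runs of consecutive entries share the same class.
 * Each run corresponds to one pack in the patch, which starts a new pack
 * whenever the class differs from the previous entry's class.
 */
static unsigned int class_runs(const uint32_t *words, size_t n)
{
	unsigned int runs = 0;
	uint32_t prev = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t c = words[i] & 0xffff;

		if (c != prev) {
			runs++;
			prev = c;
		}
	}
	return runs;
}
```

Since the patch allocates room for a fixed number of packs, a count like the one returned by class_runs() is what has to stay below that limit for a given file.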

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drm/nouveau/nvkm/engine/gr/ctxgk20a.c |  65 +++++--
 drm/nouveau/nvkm/engine/gr/gf100.c    |   3 +-
 drm/nouveau/nvkm/engine/gr/gf100.h    |  12 ++
 drm/nouveau/nvkm/engine/gr/gk20a.c    | 336 ++++++++++++++++++++++++++++++++--
 drm/nouveau/nvkm/engine/gr/gk20a.h    |  35 ++++
 5 files changed, 421 insertions(+), 30 deletions(-)
 create mode 100644 drm/nouveau/nvkm/engine/gr/gk20a.h

diff --git a/drm/nouveau/nvkm/engine/gr/ctxgk20a.c b/drm/nouveau/nvkm/engine/gr/ctxgk20a.c
index 2f241f6f0f0a..3fe080e31a86 100644
--- a/drm/nouveau/nvkm/engine/gr/ctxgk20a.c
+++ b/drm/nouveau/nvkm/engine/gr/ctxgk20a.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2014-2015, NVIDIA CORPORATION. All rights reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -19,14 +19,56 @@
  * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
  * DEALINGS IN THE SOFTWARE.
  */
+
 #include "ctxgf100.h"
+#include "gk20a.h"
+
+#include <subdev/mc.h>
+
+static void
+gk20a_grctx_generate_main(struct gf100_gr_priv *priv, struct gf100_grctx *info)
+{
+	struct gf100_grctx_oclass *oclass = (void *)nv_engine(priv)->cclass;
+	int idle_timeout_save;
+	int i;
+
+	gf100_gr_mmio(priv, priv->fuc_sw_ctx);
+
+	gf100_gr_wait_idle(priv);
+
+	idle_timeout_save = nv_rd32(priv, 0x404154);
+	nv_wr32(priv, 0x404154, 0x00000000);
+
+	oclass->attrib(info);
+
+	oclass->unkn(priv);
+
+	gf100_grctx_generate_tpcid(priv);
+	gf100_grctx_generate_r406028(priv);
+	gk104_grctx_generate_r418bb8(priv);
+	gf100_grctx_generate_r406800(priv);
+
+	for (i = 0; i < 8; i++)
+		nv_wr32(priv, 0x4064d0 + (i * 0x04), 0x00000000);
+
+	nv_wr32(priv, 0x405b00, (priv->tpc_total << 8) | priv->gpc_nr);
+
+	gk104_grctx_generate_rop_active_fbps(priv);
+
+	nv_mask(priv, 0x5044b0, 0x8000000, 0x8000000);
+
+	gf100_gr_wait_idle(priv);
+
+	nv_wr32(priv, 0x404154, idle_timeout_save);
+	gf100_gr_wait_idle(priv);
+
+	gf100_gr_mthd(priv, priv->fuc_method);
+	gf100_gr_wait_idle(priv);
 
-static const struct gf100_gr_pack
-gk20a_grctx_pack_mthd[] = {
-	{ gk104_grctx_init_a097_0, 0xa297 },
-	{ gf100_grctx_init_902d_0, 0x902d },
-	{}
-};
+	gf100_gr_icmd(priv, priv->fuc_bundle);
+	oclass->pagepool(info);
+	oclass->bundle(info);
+}
 
 struct nvkm_oclass *
 gk20a_grctx_oclass = &(struct gf100_grctx_oclass) {
@@ -39,15 +81,8 @@ gk20a_grctx_oclass = &(struct gf100_grctx_oclass) {
 		.rd32 = _nvkm_gr_context_rd32,
 		.wr32 = _nvkm_gr_context_wr32,
 	},
-	.main  = gk104_grctx_generate_main,
+	.main  = gk20a_grctx_generate_main,
 	.unkn  = gk104_grctx_generate_unkn,
-	.hub   = gk104_grctx_pack_hub,
-	.gpc   = gk104_grctx_pack_gpc,
-	.zcull = gf100_grctx_pack_zcull,
-	.tpc   = gk104_grctx_pack_tpc,
-	.ppc   = gk104_grctx_pack_ppc,
-	.icmd  = gk104_grctx_pack_icmd,
-	.mthd  = gk20a_grctx_pack_mthd,
 	.bundle = gk104_grctx_generate_bundle,
 	.bundle_size = 0x1800,
 	.bundle_min_gpm_fifo_depth = 0x62,
diff --git a/drm/nouveau/nvkm/engine/gr/gf100.c b/drm/nouveau/nvkm/engine/gr/gf100.c
index 454080339572..288423b84667 100644
--- a/drm/nouveau/nvkm/engine/gr/gf100.c
+++ b/drm/nouveau/nvkm/engine/gr/gf100.c
@@ -1537,7 +1537,7 @@ gf100_gr_init(struct nvkm_object *object)
 	return gf100_gr_init_ctxctl(priv);
 }
 
-static void
+void
 gf100_gr_dtor_fw(struct gf100_gr_fuc *fuc)
 {
 	kfree(fuc->data);
@@ -1690,6 +1690,7 @@ gf100_gr_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 		break;
 	case 0xd7:
 	case 0xd9: /* 1/0/0/0, 1 */
+	case 0xea: /* gk20a */
 		priv->magic_not_rop_nr = 0x01;
 		break;
 	}
diff --git a/drm/nouveau/nvkm/engine/gr/gf100.h b/drm/nouveau/nvkm/engine/gr/gf100.h
index c9533fdac4fc..972efd7b7934 100644
--- a/drm/nouveau/nvkm/engine/gr/gf100.h
+++ b/drm/nouveau/nvkm/engine/gr/gf100.h
@@ -76,6 +76,15 @@ struct gf100_gr_priv {
 	struct gf100_gr_fuc fuc41ad;
 	bool firmware;
 
+	/*
+	 * Used if the register packs are loaded from NVIDIA fw instead of
+	 * using hardcoded arrays.
+	 */
+	struct gf100_gr_pack *fuc_sw_nonctx;
+	struct gf100_gr_pack *fuc_sw_ctx;
+	struct gf100_gr_pack *fuc_bundle;
+	struct gf100_gr_pack *fuc_method;
+
 	struct gf100_gr_zbc_color zbc_color[NVKM_LTC_MAX_ZBC_CNT];
 	struct gf100_gr_zbc_depth zbc_depth[NVKM_LTC_MAX_ZBC_CNT];
 
@@ -116,6 +125,9 @@ void gf100_gr_context_dtor(struct nvkm_object *);
 
 void gf100_gr_ctxctl_debug(struct gf100_gr_priv *);
 
+void gf100_gr_dtor_fw(struct gf100_gr_fuc *);
+int  gf100_gr_ctor_fw(struct gf100_gr_priv *, const char *,
+		      struct gf100_gr_fuc *);
 u64  gf100_gr_units(struct nvkm_gr *);
 int  gf100_gr_ctor(struct nvkm_object *, struct nvkm_object *,
 		     struct nvkm_oclass *, void *data, u32 size,
diff --git a/drm/nouveau/nvkm/engine/gr/gk20a.c b/drm/nouveau/nvkm/engine/gr/gk20a.c
index 40ff5eb9180c..d27ef3ea2226 100644
--- a/drm/nouveau/nvkm/engine/gr/gk20a.c
+++ b/drm/nouveau/nvkm/engine/gr/gk20a.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2014-2015, NVIDIA CORPORATION. All rights reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -19,10 +19,11 @@
  * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
  * DEALINGS IN THE SOFTWARE.
  */
-#include "gf100.h"
+#include "gk20a.h"
 #include "ctxgf100.h"
 
 #include <nvif/class.h>
+#include <subdev/timer.h>
 
 static struct nvkm_oclass
 gk20a_gr_sclass[] = {
@@ -33,17 +34,324 @@ gk20a_gr_sclass[] = {
 	{}
 };
 
+static void
+gk20a_gr_init_dtor(struct gf100_gr_pack *pack)
+{
+	vfree(pack);
+}
+
+struct gk20a_fw_av
+{
+	u32 addr;
+	u32 data;
+};
+
+static struct gf100_gr_pack *
+gk20a_gr_av_to_init(struct gf100_gr_fuc *fuc)
+{
+	struct gf100_gr_init *init;
+	struct gf100_gr_pack *pack;
+	const int nent = (fuc->size / sizeof(struct gk20a_fw_av));
+	int i;
+
+	pack = vzalloc((sizeof(*pack) * 2) + (sizeof(*init) * (nent + 1)));
+	if (!pack)
+		return ERR_PTR(-ENOMEM);
+
+	init = (void *)(pack + 2);
+
+	pack[0].init = init;
+
+	for (i = 0; i < nent; i++) {
+		struct gf100_gr_init *ent = &init[i];
+		struct gk20a_fw_av *av = &((struct gk20a_fw_av *)fuc->data)[i];
+
+		ent->addr = av->addr;
+		ent->data = av->data;
+		ent->count = 1;
+		ent->pitch = 1;
+	}
+
+	return pack;
+}
+
+struct gk20a_fw_aiv
+{
+	u32 addr;
+	u32 index;
+	u32 data;
+};
+
+static struct gf100_gr_pack *
+gk20a_gr_aiv_to_init(struct gf100_gr_fuc *fuc)
+{
+	struct gf100_gr_init *init;
+	struct gf100_gr_pack *pack;
+	const int nent = (fuc->size / sizeof(struct gk20a_fw_aiv));
+	int i;
+
+	pack = vzalloc((sizeof(*pack) * 2) + (sizeof(*init) * (nent + 1)));
+	if (!pack)
+		return ERR_PTR(-ENOMEM);
+
+	init = (void *)(pack + 2);
+
+	pack[0].init = init;
+
+	for (i = 0; i < nent; i++) {
+		struct gf100_gr_init *ent = &init[i];
+		struct gk20a_fw_aiv *av = &((struct gk20a_fw_aiv *)fuc->data)[i];
+
+		ent->addr = av->addr;
+		ent->data = av->data;
+		ent->count = 1;
+		ent->pitch = 1;
+	}
+
+	return pack;
+}
+
+static struct gf100_gr_pack *
+gk20a_gr_av_to_method(struct gf100_gr_fuc *fuc)
+{
+	struct gf100_gr_init *init;
+	struct gf100_gr_pack *pack;
+	/* We don't suppose we will initialize more than 16 classes here... */
+	static const unsigned int max_classes = 16;
+	const int nent = (fuc->size / sizeof(struct gk20a_fw_av));
+	int i, classidx = 0;
+	u32 prevclass = 0;
+
+	pack = vzalloc((sizeof(*pack) * max_classes) +
+		       (sizeof(*init) * (nent + 1)));
+	if (!pack)
+		return ERR_PTR(-ENOMEM);
+
+	init = (void *)(pack + max_classes);
+
+	for (i = 0; i < nent; i++) {
+		struct gf100_gr_init *ent = &init[i];
+		struct gk20a_fw_av *av = &((struct gk20a_fw_av *)fuc->data)[i];
+		u32 class = av->addr & 0xffff;
+		u32 addr = (av->addr & 0xffff0000) >> 14;
+
+		if (prevclass != class) {
+			pack[classidx].init = ent;
+			pack[classidx].type = class;
+			prevclass = class;
+			if (++classidx >= max_classes) {
+				vfree(pack);
+				return ERR_PTR(-ENOSPC);
+			}
+		}
+
+		ent->addr = addr;
+		ent->data = av->data;
+		ent->count = 1;
+		ent->pitch = 1;
+	}
+
+	return pack;
+}
+
+static int
+gk20a_gr_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
+	      struct nvkm_oclass *oclass, void *data, u32 size,
+	      struct nvkm_object **pobject)
+{
+	int err;
+	struct gf100_gr_priv *priv;
+	struct gf100_gr_fuc fuc;
+
+	err = gf100_gr_ctor(parent, engine, oclass, data, size, pobject);
+	if (err)
+		return err;
+
+	priv = (void *)*pobject;
+
+	err = gf100_gr_ctor_fw(priv, "sw_nonctx", &fuc);
+	if (err)
+		return err;
+	priv->fuc_sw_nonctx = gk20a_gr_av_to_init(&fuc);
+	gf100_gr_dtor_fw(&fuc);
+	if (IS_ERR(priv->fuc_sw_nonctx))
+		return PTR_ERR(priv->fuc_sw_nonctx);
+
+	err = gf100_gr_ctor_fw(priv, "sw_ctx", &fuc);
+	if (err)
+		return err;
+	priv->fuc_sw_ctx = gk20a_gr_aiv_to_init(&fuc);
+	gf100_gr_dtor_fw(&fuc);
+	if (IS_ERR(priv->fuc_sw_ctx))
+		return PTR_ERR(priv->fuc_sw_ctx);
+
+	err = gf100_gr_ctor_fw(priv, "sw_bundle_init", &fuc);
+	if (err)
+		return err;
+	priv->fuc_bundle = gk20a_gr_av_to_init(&fuc);
+	gf100_gr_dtor_fw(&fuc);
+	if (IS_ERR(priv->fuc_bundle))
+		return PTR_ERR(priv->fuc_bundle);
+
+	err = gf100_gr_ctor_fw(priv, "sw_method_init", &fuc);
+	if (err)
+		return err;
+	priv->fuc_method = gk20a_gr_av_to_method(&fuc);
+	gf100_gr_dtor_fw(&fuc);
+	if (IS_ERR(priv->fuc_method))
+		return PTR_ERR(priv->fuc_method);
+
+	return 0;
+}
+
+static void
+gk20a_gr_dtor(struct nvkm_object *object)
+{
+	struct gf100_gr_priv *priv = (void *)object;
+
+	gk20a_gr_init_dtor(priv->fuc_method);
+	gk20a_gr_init_dtor(priv->fuc_bundle);
+	gk20a_gr_init_dtor(priv->fuc_sw_ctx);
+	gk20a_gr_init_dtor(priv->fuc_sw_nonctx);
+
+	gf100_gr_dtor(object);
+}
+
+static int
+gk20a_gr_wait_mem_scrubbing(struct gf100_gr_priv *priv)
+{
+	if (!nv_wait(priv, 0x40910c, 0x6, 0x0)) {
+		nv_error(priv, "FECS mem scrubbing timeout\n");
+		return -ETIMEDOUT;
+	}
+
+	if (!nv_wait(priv, 0x41a10c, 0x6, 0x0)) {
+		nv_error(priv, "GPCCS mem scrubbing timeout\n");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static void
+gk20a_gr_set_hww_esr_report_mask(struct gf100_gr_priv *priv)
+{
+	nv_wr32(priv, 0x419e44, 0x1ffffe);
+	nv_wr32(priv, 0x419e4c, 0x7f);
+}
+
+static int
+gk20a_gr_init(struct nvkm_object *object)
+{
+	struct gk20a_gr_oclass *oclass = (void *)object->oclass;
+	struct gf100_gr_priv *priv = (void *)object;
+	const u32 magicgpc918 = DIV_ROUND_UP(0x00800000, priv->tpc_total);
+	u32 data[TPC_MAX / 8] = {};
+	u8  tpcnr[GPC_MAX];
+	int gpc, tpc;
+	int ret, i;
+
+	ret = nvkm_gr_init(&priv->base);
+	if (ret)
+		return ret;
+
+	/* Clear SCC RAM */
+	nv_wr32(priv, 0x40802c, 0x1);
+
+	gf100_gr_mmio(priv, priv->fuc_sw_nonctx);
+
+	ret = gk20a_gr_wait_mem_scrubbing(priv);
+	if (ret)
+		return ret;
+
+	ret = gf100_gr_wait_idle(priv);
+	if (ret)
+		return ret;
+
+	/* MMU debug buffer */
+	nv_wr32(priv, 0x100cc8, priv->unk4188b4->addr >> 8);
+	nv_wr32(priv, 0x100ccc, priv->unk4188b8->addr >> 8);
+
+	if (oclass->init_gpc_mmu)
+		oclass->init_gpc_mmu(priv);
+
+	/* Set the PE as stream master */
+	nv_mask(priv, 0x503018, 0x1, 0x1);
+
+	/* Zcull init */
+	memset(data, 0x00, sizeof(data));
+	memcpy(tpcnr, priv->tpc_nr, sizeof(priv->tpc_nr));
+	for (i = 0, gpc = -1; i < priv->tpc_total; i++) {
+		do {
+			gpc = (gpc + 1) % priv->gpc_nr;
+		} while (!tpcnr[gpc]);
+		tpc = priv->tpc_nr[gpc] - tpcnr[gpc]--;
+
+		data[i / 8] |= tpc << ((i % 8) * 4);
+	}
+
+	nv_wr32(priv, GPC_BCAST(0x0980), data[0]);
+	nv_wr32(priv, GPC_BCAST(0x0984), data[1]);
+	nv_wr32(priv, GPC_BCAST(0x0988), data[2]);
+	nv_wr32(priv, GPC_BCAST(0x098c), data[3]);
+
+	for (gpc = 0; gpc < priv->gpc_nr; gpc++) {
+		nv_wr32(priv, GPC_UNIT(gpc, 0x0914),
+			priv->magic_not_rop_nr << 8 | priv->tpc_nr[gpc]);
+		nv_wr32(priv, GPC_UNIT(gpc, 0x0910), 0x00040000 |
+			priv->tpc_total);
+		nv_wr32(priv, GPC_UNIT(gpc, 0x0918), magicgpc918);
+	}
+
+	nv_wr32(priv, GPC_BCAST(0x3fd4), magicgpc918);
+
+	/* Enable FIFO access */
+	nv_wr32(priv, 0x400500, 0x00010001);
+
+	/* Enable interrupts */
+	nv_wr32(priv, 0x400100, 0xffffffff);
+	nv_wr32(priv, 0x40013c, 0xffffffff);
+
+	/* Enable FECS error interrupts */
+	nv_wr32(priv, 0x409c24, 0x000f0000);
+
+	/* Enable hardware warning exceptions */
+	nv_wr32(priv, 0x404000, 0xc0000000);
+	nv_wr32(priv, 0x404600, 0xc0000000);
+
+	if (oclass->set_hww_esr_report_mask)
+		oclass->set_hww_esr_report_mask(priv);
+
+	/* Enable TPC exceptions per GPC */
+	nv_wr32(priv, 0x419d0c, 0x2);
+	nv_wr32(priv, 0x41ac94, (((1 << priv->tpc_total) - 1) & 0xff) << 16);
+
+	/* Reset and enable all exceptions */
+	nv_wr32(priv, 0x400108, 0xffffffff);
+	nv_wr32(priv, 0x400138, 0xffffffff);
+	nv_wr32(priv, 0x400118, 0xffffffff);
+	nv_wr32(priv, 0x400130, 0xffffffff);
+	nv_wr32(priv, 0x40011c, 0xffffffff);
+	nv_wr32(priv, 0x400134, 0xffffffff);
+
+	gf100_gr_zbc_init(priv);
+
+	return gf100_gr_init_ctxctl(priv);
+}
+
 struct nvkm_oclass *
-gk20a_gr_oclass = &(struct gf100_gr_oclass) {
-	.base.handle = NV_ENGINE(GR, 0xea),
-	.base.ofuncs = &(struct nvkm_ofuncs) {
-		.ctor = gf100_gr_ctor,
-		.dtor = gf100_gr_dtor,
-		.init = gk104_gr_init,
-		.fini = _nvkm_gr_fini,
+gk20a_gr_oclass = &(struct gk20a_gr_oclass) {
+	.gf100 = {
+		.base.handle = NV_ENGINE(GR, 0xea),
+		.base.ofuncs = &(struct nvkm_ofuncs) {
+			.ctor = gk20a_gr_ctor,
+			.dtor = gk20a_gr_dtor,
+			.init = gk20a_gr_init,
+			.fini = _nvkm_gr_fini,
+		},
+		.cclass = &gk20a_grctx_oclass,
+		.sclass = gk20a_gr_sclass,
+		.ppc_nr = 1,
 	},
-	.cclass = &gk20a_grctx_oclass,
-	.sclass = gk20a_gr_sclass,
-	.mmio = gk104_gr_pack_mmio,
-	.ppc_nr = 1,
-}.base;
+	.set_hww_esr_report_mask = gk20a_gr_set_hww_esr_report_mask,
+}.gf100.base;
diff --git a/drm/nouveau/nvkm/engine/gr/gk20a.h b/drm/nouveau/nvkm/engine/gr/gk20a.h
new file mode 100644
index 000000000000..b36958505a81
--- /dev/null
+++ b/drm/nouveau/nvkm/engine/gr/gk20a.h
@@ -0,0 +1,35 @@
+/*
+ * Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __GK20A_GR_H__
+#define __GK20A_GR_H__
+
+#include "gf100.h"
+
+struct gk20a_gr_oclass {
+	struct gf100_gr_oclass gf100;
+
+	void (*init_gpc_mmu)(struct gf100_gr_priv *);
+	void (*set_hww_esr_report_mask)(struct gf100_gr_priv *);
+};
+
+#endif
-- 
2.4.4

Alexandre Courbot

2015-Jun-23 06:16 UTC

[Nouveau] [PATCH v2 3/6] fifo: add GM20B fifo

GM20B has a 512-channel FIFO similar to GK104.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drm/nouveau/include/nvkm/engine/fifo.h |  1 +
 drm/nouveau/nvkm/engine/fifo/Kbuild    |  1 +
 drm/nouveau/nvkm/engine/fifo/gk104.h   |  4 ++++
 drm/nouveau/nvkm/engine/fifo/gm204.c   |  2 +-
 drm/nouveau/nvkm/engine/fifo/gm20b.c   | 34 ++++++++++++++++++++++++++++++++++
 5 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 drm/nouveau/nvkm/engine/fifo/gm20b.c

diff --git a/drm/nouveau/include/nvkm/engine/fifo.h b/drm/nouveau/include/nvkm/engine/fifo.h
index 97cdeab8e44c..9100b800562e 100644
--- a/drm/nouveau/include/nvkm/engine/fifo.h
+++ b/drm/nouveau/include/nvkm/engine/fifo.h
@@ -117,6 +117,7 @@ extern struct nvkm_oclass *gk104_fifo_oclass;
 extern struct nvkm_oclass *gk20a_fifo_oclass;
 extern struct nvkm_oclass *gk208_fifo_oclass;
 extern struct nvkm_oclass *gm204_fifo_oclass;
+extern struct nvkm_oclass *gm20b_fifo_oclass;
 
 int  nvkm_fifo_uevent_ctor(struct nvkm_object *, void *, u32,
 			   struct nvkm_notify *);
diff --git a/drm/nouveau/nvkm/engine/fifo/Kbuild b/drm/nouveau/nvkm/engine/fifo/Kbuild
index 42891cb71ea3..dc81a8b64f35 100644
--- a/drm/nouveau/nvkm/engine/fifo/Kbuild
+++ b/drm/nouveau/nvkm/engine/fifo/Kbuild
@@ -10,3 +10,4 @@ nvkm-y += nvkm/engine/fifo/gk104.o
 nvkm-y += nvkm/engine/fifo/gk20a.o
 nvkm-y += nvkm/engine/fifo/gk208.o
 nvkm-y += nvkm/engine/fifo/gm204.o
+nvkm-y += nvkm/engine/fifo/gm20b.o
diff --git a/drm/nouveau/nvkm/engine/fifo/gk104.h b/drm/nouveau/nvkm/engine/fifo/gk104.h
index 318d30d6ee1a..b77d75f86b73 100644
--- a/drm/nouveau/nvkm/engine/fifo/gk104.h
+++ b/drm/nouveau/nvkm/engine/fifo/gk104.h
@@ -15,4 +15,8 @@ struct gk104_fifo_impl {
 };
 
 extern struct nvkm_ofuncs gk104_fifo_chan_ofuncs;
+
+int  gm204_fifo_ctor(struct nvkm_object *, struct nvkm_object *,
+		    struct nvkm_oclass *, void *, u32,
+		    struct nvkm_object **);
 #endif
diff --git a/drm/nouveau/nvkm/engine/fifo/gm204.c b/drm/nouveau/nvkm/engine/fifo/gm204.c
index 749d525dd8e3..7596587b0e7c 100644
--- a/drm/nouveau/nvkm/engine/fifo/gm204.c
+++ b/drm/nouveau/nvkm/engine/fifo/gm204.c
@@ -31,7 +31,7 @@ gm204_fifo_sclass[] = {
 	{}
 };
 
-static int
+int
 gm204_fifo_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 		struct nvkm_oclass *oclass, void *data, u32 size,
 		struct nvkm_object **pobject)
diff --git a/drm/nouveau/nvkm/engine/fifo/gm20b.c b/drm/nouveau/nvkm/engine/fifo/gm20b.c
new file mode 100644
index 000000000000..4abf547c34e6
--- /dev/null
+++ b/drm/nouveau/nvkm/engine/fifo/gm20b.c
@@ -0,0 +1,34 @@
+/*
+ * Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+#include "gk104.h"
+
+struct nvkm_oclass *
+gm20b_fifo_oclass = &(struct gk104_fifo_impl) {
+	.base.handle = NV_ENGINE(FIFO, 0x2b),
+	.base.ofuncs = &(struct nvkm_ofuncs) {
+		.ctor = gm204_fifo_ctor,
+		.dtor = gk104_fifo_dtor,
+		.init = gk104_fifo_init,
+		.fini = gk104_fifo_fini,
+	},
+	.channels = 512,
+}.base;
-- 
2.4.4

Alexandre Courbot

2015-Jun-23 06:16 UTC

[Nouveau] [PATCH v2 4/6] gr: add GM20B support

Add support for GM20B's graphics engine, based on GK20A. Note that this
code alone will not allow the engine to initialize on released devices
which require PMU-assisted secure boot.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drm/nouveau/include/nvkm/engine/gr.h  |   1 +
 drm/nouveau/nvkm/engine/gr/Kbuild     |   2 +
 drm/nouveau/nvkm/engine/gr/ctxgf100.h |   7 +++
 drm/nouveau/nvkm/engine/gr/ctxgm107.c |   2 +-
 drm/nouveau/nvkm/engine/gr/ctxgm204.c |   4 +-
 drm/nouveau/nvkm/engine/gr/ctxgm20b.c | 110 ++++++++++++++++++++++++++++++++++
 drm/nouveau/nvkm/engine/gr/gf100.c    |   1 +
 drm/nouveau/nvkm/engine/gr/gf100.h    |   6 ++
 drm/nouveau/nvkm/engine/gr/gk20a.c    |   6 +-
 drm/nouveau/nvkm/engine/gr/gm20b.c    |  84 ++++++++++++++++++++++++++
 10 files changed, 217 insertions(+), 6 deletions(-)
 create mode 100644 drm/nouveau/nvkm/engine/gr/ctxgm20b.c
 create mode 100644 drm/nouveau/nvkm/engine/gr/gm20b.c

diff --git a/drm/nouveau/include/nvkm/engine/gr.h b/drm/nouveau/include/nvkm/engine/gr.h
index 7cbe20280760..c772497cac3e 100644
--- a/drm/nouveau/include/nvkm/engine/gr.h
+++ b/drm/nouveau/include/nvkm/engine/gr.h
@@ -74,6 +74,7 @@ extern struct nvkm_oclass *gk208_gr_oclass;
 extern struct nvkm_oclass *gm107_gr_oclass;
 extern struct nvkm_oclass *gm204_gr_oclass;
 extern struct nvkm_oclass *gm206_gr_oclass;
+extern struct nvkm_oclass *gm20b_gr_oclass;
 
 #include <core/enum.h>
 
diff --git a/drm/nouveau/nvkm/engine/gr/Kbuild b/drm/nouveau/nvkm/engine/gr/Kbuild
index 2e1b92f71d9e..e91b4dfc0bf3 100644
--- a/drm/nouveau/nvkm/engine/gr/Kbuild
+++ b/drm/nouveau/nvkm/engine/gr/Kbuild
@@ -14,6 +14,7 @@ nvkm-y += nvkm/engine/gr/ctxgk208.o
 nvkm-y += nvkm/engine/gr/ctxgm107.o
 nvkm-y += nvkm/engine/gr/ctxgm204.o
 nvkm-y += nvkm/engine/gr/ctxgm206.o
+nvkm-y += nvkm/engine/gr/ctxgm20b.o
 nvkm-y += nvkm/engine/gr/nv04.o
 nvkm-y += nvkm/engine/gr/nv10.o
 nvkm-y += nvkm/engine/gr/nv20.o
@@ -38,3 +39,4 @@ nvkm-y += nvkm/engine/gr/gk208.o
 nvkm-y += nvkm/engine/gr/gm107.o
 nvkm-y += nvkm/engine/gr/gm204.o
 nvkm-y += nvkm/engine/gr/gm206.o
+nvkm-y += nvkm/engine/gr/gm20b.o
diff --git a/drm/nouveau/nvkm/engine/gr/ctxgf100.h b/drm/nouveau/nvkm/engine/gr/ctxgf100.h
index 3676a3342bc5..f89ab3706cf3 100644
--- a/drm/nouveau/nvkm/engine/gr/ctxgf100.h
+++ b/drm/nouveau/nvkm/engine/gr/ctxgf100.h
@@ -91,6 +91,10 @@ void gk104_grctx_generate_r418bb8(struct gf100_gr_priv *);
 void gk104_grctx_generate_rop_active_fbps(struct gf100_gr_priv *);
 
 
+void gm107_grctx_generate_bundle(struct gf100_grctx *);
+void gm107_grctx_generate_pagepool(struct gf100_grctx *);
+void gm107_grctx_generate_attrib(struct gf100_grctx *);
+
 extern struct nvkm_oclass *gk110_grctx_oclass;
 extern struct nvkm_oclass *gk110b_grctx_oclass;
 extern struct nvkm_oclass *gk208_grctx_oclass;
@@ -102,8 +106,11 @@ void gm107_grctx_generate_attrib(struct gf100_grctx *);
 
 extern struct nvkm_oclass *gm204_grctx_oclass;
 void gm204_grctx_generate_main(struct gf100_gr_priv *, struct gf100_grctx *);
+void gm204_grctx_generate_tpcid(struct gf100_gr_priv *);
+void gm204_grctx_generate_405b60(struct gf100_gr_priv *);
 
 extern struct nvkm_oclass *gm206_grctx_oclass;
+extern struct nvkm_oclass *gm20b_grctx_oclass;
 
 /* context init value lists */
 
diff --git a/drm/nouveau/nvkm/engine/gr/ctxgm107.c b/drm/nouveau/nvkm/engine/gr/ctxgm107.c
index fbeaae3ae6ce..6bf2fd1a05ba 100644
--- a/drm/nouveau/nvkm/engine/gr/ctxgm107.c
+++ b/drm/nouveau/nvkm/engine/gr/ctxgm107.c
@@ -931,7 +931,7 @@ gm107_grctx_generate_attrib(struct gf100_grctx *info)
 	}
 }
 
-static void
+void
 gm107_grctx_generate_tpcid(struct gf100_gr_priv *priv)
 {
 	int gpc, tpc, id;
diff --git a/drm/nouveau/nvkm/engine/gr/ctxgm204.c b/drm/nouveau/nvkm/engine/gr/ctxgm204.c
index ea8e66151aa8..efc76bfae896 100644
--- a/drm/nouveau/nvkm/engine/gr/ctxgm204.c
+++ b/drm/nouveau/nvkm/engine/gr/ctxgm204.c
@@ -918,7 +918,7 @@ gm204_grctx_pack_ppc[] = {
  * PGRAPH context implementation
  ******************************************************************************/
 
-static void
+void
 gm204_grctx_generate_tpcid(struct gf100_gr_priv *priv)
 {
 	int gpc, tpc, id;
@@ -943,7 +943,7 @@ gm204_grctx_generate_rop_active_fbps(struct gf100_gr_priv *priv)
 	nv_mask(priv, 0x408958, 0x0000000f, fbp_count); /* crop */
 }
 
-static void
+void
 gm204_grctx_generate_405b60(struct gf100_gr_priv *priv)
 {
 	const u32 dist_nr = DIV_ROUND_UP(priv->tpc_total, 4);
diff --git a/drm/nouveau/nvkm/engine/gr/ctxgm20b.c b/drm/nouveau/nvkm/engine/gr/ctxgm20b.c
new file mode 100644
index 000000000000..c011bf327276
--- /dev/null
+++ b/drm/nouveau/nvkm/engine/gr/ctxgm20b.c
@@ -0,0 +1,110 @@
+/*
+ * Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+#include "ctxgf100.h"
+
+static void
+gm20b_grctx_generate_r406028(struct gf100_gr_priv *priv)
+{
+	u32 tpc_per_gpc = 0;
+	int i;
+
+	for (i = 0; i < priv->gpc_nr; i++)
+		tpc_per_gpc |= priv->tpc_nr[i] << (4 * i);
+
+	nv_wr32(priv, 0x406028, tpc_per_gpc);
+	nv_wr32(priv, 0x405870, tpc_per_gpc);
+}
+
+static void
+gm20b_grctx_generate_main(struct gf100_gr_priv *priv, struct gf100_grctx *info)
+{
+	struct gf100_grctx_oclass *oclass = (void *)nv_engine(priv)->cclass;
+	int idle_timeout_save;
+	int i, tmp;
+
+	gf100_gr_mmio(priv, priv->fuc_sw_ctx);
+
+	gf100_gr_wait_idle(priv);
+
+	idle_timeout_save = nv_rd32(priv, 0x404154);
+	nv_wr32(priv, 0x404154, 0x00000000);
+
+	oclass->attrib(info);
+
+	oclass->unkn(priv);
+
+	gm204_grctx_generate_tpcid(priv);
+	gm20b_grctx_generate_r406028(priv);
+	gk104_grctx_generate_r418bb8(priv);
+
+	for (i = 0; i < 8; i++)
+		nv_wr32(priv, 0x4064d0 + (i * 0x04), 0x00000000);
+
+	nv_wr32(priv, 0x405b00, (priv->tpc_total << 8) | priv->gpc_nr);
+
+	gk104_grctx_generate_rop_active_fbps(priv);
+	nv_wr32(priv, 0x408908, nv_rd32(priv, 0x410108) | 0x80000000);
+
+	for (tmp = 0, i = 0; i < priv->gpc_nr; i++)
+		tmp |= ((1 << priv->tpc_nr[i]) - 1) << (i * 4);
+	nv_wr32(priv, 0x4041c4, tmp);
+
+	gm204_grctx_generate_405b60(priv);
+
+	gf100_gr_wait_idle(priv);
+
+	nv_wr32(priv, 0x404154, idle_timeout_save);
+	gf100_gr_wait_idle(priv);
+
+	gf100_gr_mthd(priv, priv->fuc_method);
+	gf100_gr_wait_idle(priv);
+
+	gf100_gr_icmd(priv, priv->fuc_bundle);
+	oclass->pagepool(info);
+	oclass->bundle(info);
+}
+
+struct nvkm_oclass *
+gm20b_grctx_oclass = &(struct gf100_grctx_oclass) {
+	.base.handle = NV_ENGCTX(GR, 0x2b),
+	.base.ofuncs = &(struct nvkm_ofuncs) {
+		.ctor = gf100_gr_context_ctor,
+		.dtor = gf100_gr_context_dtor,
+		.init = _nvkm_gr_context_init,
+		.fini = _nvkm_gr_context_fini,
+		.rd32 = _nvkm_gr_context_rd32,
+		.wr32 = _nvkm_gr_context_wr32,
+	},
+	.main  = gm20b_grctx_generate_main,
+	.unkn  = gk104_grctx_generate_unkn,
+	.bundle = gm107_grctx_generate_bundle,
+	.bundle_size = 0x1800,
+	.bundle_min_gpm_fifo_depth = 0x182,
+	.bundle_token_limit = 0x1c0,
+	.pagepool = gm107_grctx_generate_pagepool,
+	.pagepool_size = 0x8000,
+	.attrib = gm107_grctx_generate_attrib,
+	.attrib_nr_max = 0x600,
+	.attrib_nr = 0x400,
+	.alpha_nr_max = 0xc00,
+	.alpha_nr = 0x800,
+}.base;
\ No newline at end of file
diff --git a/drm/nouveau/nvkm/engine/gr/gf100.c b/drm/nouveau/nvkm/engine/gr/gf100.c
index 288423b84667..e7c3e9e57385 100644
--- a/drm/nouveau/nvkm/engine/gr/gf100.c
+++ b/drm/nouveau/nvkm/engine/gr/gf100.c
@@ -1691,6 +1691,7 @@ gf100_gr_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 	case 0xd7:
 	case 0xd9: /* 1/0/0/0, 1 */
 	case 0xea: /* gk20a */
+	case 0x12b: /* gm20b */
 		priv->magic_not_rop_nr = 0x01;
 		break;
 	}
diff --git a/drm/nouveau/nvkm/engine/gr/gf100.h b/drm/nouveau/nvkm/engine/gr/gf100.h
index 972efd7b7934..f185f034d1ea 100644
--- a/drm/nouveau/nvkm/engine/gr/gf100.h
+++ b/drm/nouveau/nvkm/engine/gr/gf100.h
@@ -141,6 +141,12 @@ int  gk104_gr_ctor(struct nvkm_object *, struct nvkm_object *,
 		     struct nvkm_object **);
 int  gk104_gr_init(struct nvkm_object *);
 
+int  gk20a_gr_ctor(struct nvkm_object *, struct nvkm_object *,
+		     struct nvkm_oclass *, void *data, u32 size,
+		     struct nvkm_object **);
+void gk20a_gr_dtor(struct nvkm_object *);
+int  gk20a_gr_init(struct nvkm_object *);
+
 int  gm204_gr_init(struct nvkm_object *);
 
 extern struct nvkm_ofuncs gf100_fermi_ofuncs;
diff --git a/drm/nouveau/nvkm/engine/gr/gk20a.c b/drm/nouveau/nvkm/engine/gr/gk20a.c
index d27ef3ea2226..fc4a910b2498 100644
--- a/drm/nouveau/nvkm/engine/gr/gk20a.c
+++ b/drm/nouveau/nvkm/engine/gr/gk20a.c
@@ -154,7 +154,7 @@ gk20a_gr_av_to_method(struct gf100_gr_fuc *fuc)
 	return pack;
 }
 
-static int
+int
 gk20a_gr_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 	      struct nvkm_oclass *oclass, void *data, u32 size,
 	      struct nvkm_object **pobject)
@@ -204,7 +204,7 @@ gk20a_gr_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 	return 0;
 }
 
-static void
+void
 gk20a_gr_dtor(struct nvkm_object *object)
 {
 	struct gf100_gr_priv *priv = (void *)object;
@@ -240,7 +240,7 @@ gk20a_gr_set_hww_esr_report_mask(struct gf100_gr_priv *priv)
 	nv_wr32(priv, 0x419e4c, 0x7f);
 }
 
-static int
+int
 gk20a_gr_init(struct nvkm_object *object)
 {
 	struct gk20a_gr_oclass *oclass = (void *)object->oclass;
diff --git a/drm/nouveau/nvkm/engine/gr/gm20b.c b/drm/nouveau/nvkm/engine/gr/gm20b.c
new file mode 100644
index 000000000000..897628062d58
--- /dev/null
+++ b/drm/nouveau/nvkm/engine/gr/gm20b.c
@@ -0,0 +1,84 @@
+/*
+ * Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+#include "gk20a.h"
+#include "ctxgf100.h"
+
+#include <nvif/class.h>
+#include <subdev/timer.h>
+
+static struct nvkm_oclass
+gm20b_gr_sclass[] = {
+	{ FERMI_TWOD_A, &nvkm_object_ofuncs },
+	{ KEPLER_INLINE_TO_MEMORY_B, &nvkm_object_ofuncs },
+	{ MAXWELL_B, &gf100_fermi_ofuncs, gf100_gr_9097_omthds },
+	{ MAXWELL_COMPUTE_B, &nvkm_object_ofuncs, gf100_gr_90c0_omthds },
+	{}
+};
+
+static void
+gm20b_gr_init_gpc_mmu(struct gf100_gr_priv *priv)
+{
+	u32 val;
+
+	/* TODO this needs to be removed once secure boot works */
+	if (1) {
+		nv_wr32(priv, 0x100ce4, 0xffffffff);
+	}
+
+	/* TODO update once secure boot works */
+	val = nv_rd32(priv, 0x100c80);
+	val &= 0xf000087f;
+	nv_wr32(priv, 0x418880, val);
+	nv_wr32(priv, 0x418890, 0);
+	nv_wr32(priv, 0x418894, 0);
+
+	nv_wr32(priv, 0x4188b0, nv_rd32(priv, 0x100cc4));
+	nv_wr32(priv, 0x4188b4, nv_rd32(priv, 0x100cc8));
+	nv_wr32(priv, 0x4188b8, nv_rd32(priv, 0x100ccc));
+
+	nv_wr32(priv, 0x4188ac, nv_rd32(priv, 0x100800));
+}
+
+static void
+gm20b_gr_set_hww_esr_report_mask(struct gf100_gr_priv *priv)
+{
+	nv_wr32(priv, 0x419e44, 0xdffffe);
+	nv_wr32(priv, 0x419e4c, 0x5);
+}
+
+struct nvkm_oclass *
+gm20b_gr_oclass = &(struct gk20a_gr_oclass) {
+	.gf100 = {
+		.base.handle = NV_ENGINE(GR, 0x2b),
+		.base.ofuncs = &(struct nvkm_ofuncs) {
+			.ctor = gk20a_gr_ctor,
+			.dtor = gf100_gr_dtor,
+			.init = gk20a_gr_init,
+			.fini = _nvkm_gr_fini,
+		},
+		.cclass = &gm20b_grctx_oclass,
+		.sclass = gm20b_gr_sclass,
+		.ppc_nr = 1,
+	},
+	.init_gpc_mmu = gm20b_gr_init_gpc_mmu,
+	.set_hww_esr_report_mask = gm20b_gr_set_hww_esr_report_mask,
+}.gf100.base;
-- 
2.4.4
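
As a side illustration (not part of the patch): the two loops in gm20b_grctx_generate_main() and gm20b_grctx_generate_r406028() both pack one 4-bit field per GPC into a 32-bit register value. The sketch below reproduces that arithmetic in plain userspace C. The function names, the uint8_t element type, the MAX_GPC limit and the range checks are assumptions made for the example; only the shift-and-OR expressions come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 8 nibbles are all that fit in 32 bits. */
#define MAX_GPC 8

/*
 * Pack the TPC count of each GPC into its own nibble, mirroring the
 * value the patch writes to 0x406028 and 0x405870.
 */
static uint32_t pack_tpc_per_gpc(const uint8_t *tpc_nr, int gpc_nr)
{
	uint32_t packed = 0;
	int i;

	assert(gpc_nr >= 0 && gpc_nr <= MAX_GPC);
	for (i = 0; i < gpc_nr; i++) {
		/* A count above 0xf would spill into the next GPC's nibble. */
		assert(tpc_nr[i] <= 0xf);
		packed |= (uint32_t)tpc_nr[i] << (4 * i);
	}
	return packed;
}

/*
 * Build one bitmask of present TPCs per GPC nibble, mirroring the
 * value the patch writes to 0x4041c4: n TPCs become n low bits set.
 */
static uint32_t pack_tpc_mask(const uint8_t *tpc_nr, int gpc_nr)
{
	uint32_t mask = 0;
	int i;

	assert(gpc_nr >= 0 && gpc_nr <= MAX_GPC);
	for (i = 0; i < gpc_nr; i++) {
		/* More than 4 TPCs would not fit in a 4-bit mask. */
		assert(tpc_nr[i] <= 4);
		mask |= ((1u << tpc_nr[i]) - 1) << (4 * i);
	}
	return mask;
}
```

For a hypothetical configuration with two GPCs holding 2 and 3 TPCs, the first helper yields 0x32 and the second 0x73. The range checks make explicit a limit the kernel loops leave implicit: each GPC only has four bits to work with.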

Alexandre Courbot

2015-Jun-23 06:16 UTC

[Nouveau] [PATCH v2 5/6] device: recognize GM20B

Recognize GM20B and assign the right engines and subdevs.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drm/nouveau/nvkm/engine/device/gm100.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drm/nouveau/nvkm/engine/device/gm100.c b/drm/nouveau/nvkm/engine/device/gm100.c
index 70abf1ec7c98..a51b3ce50f36 100644
--- a/drm/nouveau/nvkm/engine/device/gm100.c
+++ b/drm/nouveau/nvkm/engine/device/gm100.c
@@ -181,6 +181,26 @@ gm100_identify(struct nvkm_device *device)
 		device->oclass[NVDEV_ENGINE_MSPPP  ] = &gf100_msppp_oclass;
 #endif
 		break;
+	case 0x12b:
+		device->cname = "GM20B";
+
+		device->oclass[NVDEV_SUBDEV_MC     ] =  gk20a_mc_oclass;
+		device->oclass[NVDEV_SUBDEV_MMU    ] = &gf100_mmu_oclass;
+		device->oclass[NVDEV_SUBDEV_BUS    ] =  gf100_bus_oclass;
+		device->oclass[NVDEV_SUBDEV_FUSE   ] = &gm107_fuse_oclass;
+		device->oclass[NVDEV_SUBDEV_TIMER  ] = &gk20a_timer_oclass;
+		device->oclass[NVDEV_SUBDEV_FB     ] =  gk20a_fb_oclass;
+		device->oclass[NVDEV_SUBDEV_LTC    ] =  gm107_ltc_oclass;
+		device->oclass[NVDEV_SUBDEV_IBUS   ] = &gk20a_ibus_oclass;
+		device->oclass[NVDEV_SUBDEV_INSTMEM] = gk20a_instmem_oclass;
+		device->oclass[NVDEV_SUBDEV_MMU    ] = &gf100_mmu_oclass;
+		device->oclass[NVDEV_SUBDEV_BAR    ] = &gk20a_bar_oclass;
+		device->oclass[NVDEV_ENGINE_DMAOBJ ] =  gf110_dmaeng_oclass;
+		device->oclass[NVDEV_ENGINE_FIFO   ] =  gm20b_fifo_oclass;
+		device->oclass[NVDEV_ENGINE_SW     ] =  gf100_sw_oclass;
+		device->oclass[NVDEV_ENGINE_GR     ] =  gm20b_gr_oclass;
+		device->oclass[NVDEV_ENGINE_CE2    ] = &gm204_ce2_oclass;
+		break;
 	default:
 		nv_fatal(device, "unknown Maxwell chipset\n");
 		return -EINVAL;
-- 
2.4.4

Alexandre Courbot

2015-Jun-23 06:16 UTC

[Nouveau] [PATCH v2 6/6] platform: recognize GM20B

Allow the platform driver to recognize GM20B.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
 drm/nouveau/nouveau_platform.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drm/nouveau/nouveau_platform.c b/drm/nouveau/nouveau_platform.c
index dcfbbfaf1739..7a39d449fefa 100644
--- a/drm/nouveau/nouveau_platform.c
+++ b/drm/nouveau/nouveau_platform.c
@@ -252,6 +252,7 @@ static int nouveau_platform_remove(struct platform_device *pdev)
 #if IS_ENABLED(CONFIG_OF)
 static const struct of_device_id nouveau_platform_match[] = {
 	{ .compatible = "nvidia,gk20a" },
+	{ .compatible = "nvidia,gm20b" },
 	{ }
 };
 
-- 
2.4.4
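
As a side illustration (not part of the patch): the one-line change above extends a sentinel-terminated match table. In the kernel the lookup is done by the OF core, not by the driver; the userspace sketch below only shows the idea of walking such a table. The table and function names are hypothetical, and only the two compatible strings come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compatible strings from the patch, terminated by a NULL sentinel. */
static const char *const compat_table[] = {
	"nvidia,gk20a",
	"nvidia,gm20b",
	NULL,
};

/* Return 1 if the given compatible string is in the table, 0 otherwise. */
static int platform_matches(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);
	for (i = 0; compat_table[i] != NULL; i++) {
		if (strcmp(compat_table[i], compatible) == 0)
			return 1;
	}
	return 0;
}
```

With this table, platform_matches("nvidia,gm20b") returns 1, while a string that is not listed, such as "nvidia,gm204", returns 0.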

Ben Skeggs

2015-Jun-24 23:59 UTC

[Nouveau] [PATCH v2 0/6] Improve GK20A support, introduce GM20B, firmware paths

On 23 June 2015 at 16:16, Alexandre Courbot <acourbot at nvidia.com> wrote:
> Second version of this patchset. Not many changes since first version - I hope
> this means the changes are not too controversial.
>
> Changes since v1:
> - Removed lookup for previous FW files in "nouveau/"
> - Went back to using request_firmware() since we only try to load one file

Hey Alex,

I've merged this in my tree as-is for the moment, hopefully to go to
Linus for the next merge window.  If there's any changes from you guys
on how you want the firmware to be handled, it'd be nice to have those
made concrete before then :)

Thanks,
Ben.
>
> Original cover letter follows:
>
> GM20B is the GPU of the upcoming Tegra X1 SoC. This series adds initial support
> for it, based on a rework of the already-supported GK20A. It also introduces
> support for NVIDIA-provided firmware files, which is why I have added a few
> NVIDIA people who are relevant to this discussion.
>
> The first patch adds support for loading the FECS and GPCCS firmwares from
> firmware files officially released by NVIDIA. As you know such firmwares will
> soon become a necessity for newer GPUs because some falcons will require signed
> firmware to operate. In addition there is no reverse-engineered version of the
> GK20A firmwares yet, so since an external file is needed anyway, it may as well
> be provided officially. NVIDIA plans to release firmwares as one file per binary
> to keep things simple. The layout will be nvidia/<gpu>/<firmware>.bin, so for
> GK20A FECS/GPCCS we have:
>
> nvidia/gk20a/fecs_inst.bin (aka fuc409c)
> nvidia/gk20a/fecs_data.bin (aka fuc409d)
> nvidia/gk20a/gpccs_inst.bin (aka fuc41ac)
> nvidia/gk20a/gpccs_data.bin (aka fuc41ad)
>
> All firmware files listed in this patchset are clean for release, and I am just
> waiting for a community ack of the layout to send a patch to linux-firmware.
>
> The second patch reworks existing GK20A support to make it closer to what our
> nvgpu driver does. Support so far was heavily based on GK104, which somehow made
> me feel uneasy - and quite scared after I looked more closely at what nvgpu
> does. In particular the GK104 MMIO bundles differed significantly from what
> nvgpu does. This change aligns things and (probably less significant, but still
> safer) reorders the initialization sequence to match the one of nvgpu.
>
> You will note that the MMIO bundles now come as firmware files of their own. I
> am not sure the community will be pleased with an increase of firmware files,
> however the rationale for this is as follows:
> - These initialization sequences are related to the firmwares, so it makes sense
>   to distribute them under the same medium
> - If NVIDIA needs to update the firmwares for some reason, it can atomically
>   update the MMIO bundles and provide a coherent set, instead of having to
>   introduce versioning into the firmware and driver
> - For IP reasons, I as an NVIDIA employee cannot extract these register
>   sequences and link them into Nouveau
> - These are just a bunch of register address/value pairs anyway
>
> The new firmware files introduced are:
>
> nvidia/gk20a/sw_nonctx.bin (gr_pack_mmio)
> nvidia/gk20a/sw_ctx.bin (grctx_pack_hub, grctx_pack_gpc, grctx_pack_zcull,
>                          grctx_pack_tpc, grctx_pack_ppc)
> nvidia/gk20a/sw_bundle_init.bin (grctx_pack_icmd)
> nvidia/gk20a/sw_method_init.bin (grctx_pack_mthd)
>
> Third patch is trivial and adds the GM20B FIFO device.
>
> Fourth patch adds GM20B GR based on the reworked GK20A support. GM20B will rely
> on the same firmware files as GK20A (also clean for release). Note that this is
> not full support yet for released devices, which will require secure boot. This
> will be my focus once this patchset is merged (Deepak got a working version,
> but there is still a lot of work to do on it before it is upstreamable).
>
> The last two patches recognize GM20B at the device and platform level. Nothing
> really exciting.
>
> I hope the addition of firmware files will not become too controversial. If it
> does, I have good arguments to support it. ;) Besides the GK20A rework that
> probably few people care about, the point is the addition of a basic layout for
> the firmwares that NVIDIA will officially release to finally support secure
> boot, and I would like to make sure we get this right.
>
> Alexandre Courbot (6):
>   gr: use NVIDIA-provided external firmwares
>   gr/gk20a: use same initialization sequence as nvgpu
>   fifo: add GM20B fifo
>   gr: add GM20B support
>   device: recognize GM20B
>   platform: recognize GM20B
>
>  drm/nouveau/include/nvkm/engine/fifo.h |   1 +
>  drm/nouveau/include/nvkm/engine/gr.h   |   1 +
>  drm/nouveau/nouveau_platform.c         |   1 +
>  drm/nouveau/nvkm/engine/device/gm100.c |  20 ++
>  drm/nouveau/nvkm/engine/fifo/Kbuild    |   1 +
>  drm/nouveau/nvkm/engine/fifo/gk104.h   |   4 +
>  drm/nouveau/nvkm/engine/fifo/gm204.c   |   2 +-
>  drm/nouveau/nvkm/engine/fifo/gm20b.c   |  34 ++++
>  drm/nouveau/nvkm/engine/gr/Kbuild      |   2 +
>  drm/nouveau/nvkm/engine/gr/ctxgf100.h  |   7 +
>  drm/nouveau/nvkm/engine/gr/ctxgk20a.c  |  65 +++++--
>  drm/nouveau/nvkm/engine/gr/ctxgm107.c  |   2 +-
>  drm/nouveau/nvkm/engine/gr/ctxgm204.c  |   4 +-
>  drm/nouveau/nvkm/engine/gr/ctxgm20b.c  | 110 +++++++++++
>  drm/nouveau/nvkm/engine/gr/gf100.c     |  35 ++--
>  drm/nouveau/nvkm/engine/gr/gf100.h     |  18 ++
>  drm/nouveau/nvkm/engine/gr/gk20a.c     | 336 +++++++++++++++++++++++++++++++--
>  drm/nouveau/nvkm/engine/gr/gk20a.h     |  35 ++++
>  drm/nouveau/nvkm/engine/gr/gm20b.c     |  84 +++++++++
>  19 files changed, 716 insertions(+), 46 deletions(-)
>  create mode 100644 drm/nouveau/nvkm/engine/fifo/gm20b.c
>  create mode 100644 drm/nouveau/nvkm/engine/gr/ctxgm20b.c
>  create mode 100644 drm/nouveau/nvkm/engine/gr/gk20a.h
>  create mode 100644 drm/nouveau/nvkm/engine/gr/gm20b.c
>
> --
> 2.4.4
>
> _______________________________________________
> Nouveau mailing list
> Nouveau at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau
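
As a side illustration of the nvidia/<gpu>/<firmware>.bin layout described in the quoted cover letter: a path can be assembled from the GPU name and the firmware name before it is handed to a loader such as request_firmware(). The helper below is a userspace sketch, not code from the series. Its name, its return convention and the rejection of names containing '/' are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "nvidia/<gpu>/<firmware>.bin" into buf.
 * Returns 0 on success, -1 if a name contains a path separator or the
 * result does not fit in len bytes (both checks are this sketch's own).
 */
static int nvidia_fw_path(char *buf, size_t len,
			  const char *gpu, const char *fw)
{
	int n;

	assert(buf != NULL && gpu != NULL && fw != NULL);

	/* Keep each component a plain file name, never a sub-path. */
	if (strchr(gpu, '/') != NULL || strchr(fw, '/') != NULL)
		return -1;

	n = snprintf(buf, len, "nvidia/%s/%s.bin", gpu, fw);
	if (n < 0 || (size_t)n >= len)
		return -1; /* output error or truncated result */

	return 0;
}
```

Called with "gk20a" and "fecs_inst" and a large enough buffer, the helper produces nvidia/gk20a/fecs_inst.bin, the first of the four files listed in the cover letter.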

Alexandre Courbot

2015-Jun-25 00:14 UTC

[Nouveau] [PATCH v2 0/6] Improve GK20A support, introduce GM20B, firmware paths

On 06/25/2015 08:59 AM, Ben Skeggs wrote:
> On 23 June 2015 at 16:16, Alexandre Courbot <acourbot at nvidia.com> wrote:
>> Second version of this patchset. Not many changes since first version - I hope
>> this means the changes are not too controversial.
>>
>> Changes since v1:
>> - Removed lookup for previous FW files in "nouveau/"
>> - Went back to using request_firmware() since we only try to load one file
>
> Hey Alex,
>
> I've merged this in my tree as-is for the moment, hopefully to go to
> Linus for the next merge window.  If there's any changes from you guys
> on how you want the firmware to be handled, it'd be nice to have those
> made concrete before then :)
Excellent, that will give us a short grace period. Thanks Ben! :)
