Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
Hi everyone, Here is the second batch of patches to add GK20A support to Nouveau. This time we are adding the actual chip support, and this series brings the driver to a point where a slightly-tweaked Mesa successfully runs shaders and renders triangles on GBM! Many thanks to Thierry Reding and the people on the #nouveau IRC channel for their help without which we would not have reached this milestone. A few lines of hacks (not included here) are still needed to deal with cached mappings triggering external aborts and CPU/GPU memory coherency issues, but I hope to understand and address these issues next. Most of the changes below have already been seen (and sometimes reviewed) in an earlier patchset. What has been added is proper PGRAPH support (still needing an external firmware and mostly reusing NVE4's code) as well as a better RAM implementation. How to represent and manage VRAM has been the hardest part to deal with, since GK20A shares the system memory with the CPU without any kind of partition. I have tried various approaches (included some in which no RAM object ever gets instanciated) and finally decided to go for one using DMA-contiguous memory allocations and relying on BAR mappings for kernel access and exposure to user-space, as it fits better with existing code and keeps us safe from most of the CPU/GPU memory coherency issues (at the cost of some performance). Looking forward to your review of these few patches! :) Cheers, Alex. Alexandre Courbot (12): drm/nouveau: fix missing newline drm/nouveau/timer: skip calibration on GK20A drm/nouveau/bar: only ioremap BAR3 if it exists drm/nouveau/bar/nvc0: support chips without BAR3 drm/nouveau/fifo: add GK20A support drm/nouveau/ibus: add GK20A support drm/nouveau/fb: add GK20A support drm/nouveau/graph: enable when using external firmware drm/nouveau/graph: pad firmware code at load time drm/nouveau/graph: add GK20A support drm/nouveau: support GK20A in nouveau_accel_init() drm/nouveau: support for probing GK20A drivers/gpu/drm/nouveau/Makefile | 5 + drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 +++ drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h | 1 + drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c | 35 +++++ .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c | 4 +- drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 12 +- drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h | 9 ++ drivers/gpu/drm/nouveau/core/engine/graph/nve4.c | 2 +- drivers/gpu/drm/nouveau/core/engine/graph/nvea.c | 75 +++++++++ drivers/gpu/drm/nouveau/core/include/engine/fifo.h | 1 + .../gpu/drm/nouveau/core/include/engine/graph.h | 1 + drivers/gpu/drm/nouveau/core/include/subdev/fb.h | 1 + drivers/gpu/drm/nouveau/core/include/subdev/ibus.h | 1 + drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +- drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c | 101 +++++++------ drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c | 56 +++++++ drivers/gpu/drm/nouveau/core/subdev/fb/priv.h | 1 + drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c | 168 +++++++++++++++++++++ drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c | 110 ++++++++++++++ drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 ++- drivers/gpu/drm/nouveau/nouveau_drm.c | 12 +- 21 files changed, 578 insertions(+), 63 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c create mode 100644 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 01/12] drm/nouveau: fix missing newline
Add a missing newline at the end of a DRM_INFO message. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index b2a674531fba..ef27949057c3 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -322,7 +322,7 @@ nouveau_get_hdmi_dev(struct drm_device *dev) struct pci_dev *pdev = dev->pdev; if (!pdev) { - DRM_INFO("not a PCI device; no HDMI"); + DRM_INFO("not a PCI device; no HDMI\n"); drm->hdmi_device = NULL; return; } -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
GK20A's timer is directly attached to the system timer and cannot be calibrated. Skip the calibration phase on that chip since the corresponding registers do not exist. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c index c0bdd10358d7..822fe0d8a871 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object) if (ret) return ret; + /* gk20a does not have the calibration registers */ + if (device->chipset == 0xea) + goto skip_clk_init; + /* aim for 31.25MHz, which gives us nanosecond timestamps */ d = 1000000 / 32; @@ -235,20 +239,23 @@ nv04_timer_init(struct nouveau_object *object) d >>= 1; } - /* restore the time before suspend */ - lo = priv->suspend_time; - hi = (priv->suspend_time >> 32); - nv_debug(priv, "input frequency : %dHz\n", f); nv_debug(priv, "input multiplier: %d\n", m); nv_debug(priv, "numerator : 0x%08x\n", n); nv_debug(priv, "denominator : 0x%08x\n", d); nv_debug(priv, "timer frequency : %dHz\n", (f * m) * d / n); - nv_debug(priv, "time low : 0x%08x\n", lo); - nv_debug(priv, "time high : 0x%08x\n", hi); nv_wr32(priv, NV04_PTIMER_NUMERATOR, n); nv_wr32(priv, NV04_PTIMER_DENOMINATOR, d); + +skip_clk_init: + /* restore the time before suspend */ + lo = priv->suspend_time; + hi = (priv->suspend_time >> 32); + + nv_debug(priv, "time low : 0x%08x\n", lo); + nv_debug(priv, "time high : 0x%08x\n", hi); + nv_wr32(priv, NV04_PTIMER_INTR_0, 0xffffffff); nv_wr32(priv, NV04_PTIMER_INTR_EN_0, 0x00000000); nv_wr32(priv, NV04_PTIMER_TIME_1, hi); -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists
Some chips that use system memory exclusively (e.g. GK20A) do not expose 2 BAR regions. For them only BAR1 exists, and it should be used for USERD mapping. Do not map BAR3 if its resource does not exist. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c index bdf594116f3f..d713eeb75b13 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c @@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent, { struct nouveau_device *device = nv_device(parent); struct nouveau_bar *bar; + bool has_bar3 = nv_device_resource_len(device, 3) != 0; int ret; ret = nouveau_subdev_create_(parent, engine, oclass, 0, "BARCTL", @@ -118,8 +119,10 @@ nouveau_bar_create_(struct nouveau_object *parent, if (ret) return ret; - bar->iomem = ioremap(nv_device_resource_start(device, 3), - nv_device_resource_len(device, 3)); + if (has_bar3) + bar->iomem = ioremap(nv_device_resource_start(device, 3), + nv_device_resource_len(device, 3)); + return 0; } -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3
Adapt the NVC0 BAR driver to make it able to support chips that do not expose a BAR3. When this happens, BAR1 is then used for USERD mapping and the BAR alloc() functions is disabled, making GPU objects unable to rely on BAR for data access and falling back to PRAMIN. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c | 101 +++++++++++++------------ 1 file changed, 52 insertions(+), 49 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c index 3f30db62e656..5da1b9447af0 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c @@ -79,87 +79,88 @@ nvc0_bar_unmap(struct nouveau_bar *bar, struct nouveau_vma *vma) } static int -nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine, - struct nouveau_oclass *oclass, void *data, u32 size, - struct nouveau_object **pobject) +nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar) { - struct nouveau_device *device = nv_device(parent); - struct nvc0_bar_priv *priv; + struct nouveau_device *device = nv_device(&priv->base); struct nouveau_gpuobj *mem; struct nouveau_vm *vm; + resource_size_t bar_len; int ret; - ret = nouveau_bar_create(parent, engine, oclass, &priv); - *pobject = nv_object(priv); - if (ret) - return ret; - - /* BAR3 */ ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0, - &priv->bar[0].mem); - mem = priv->bar[0].mem; + &priv->bar[nr].mem); + mem = priv->bar[nr].mem; if (ret) return ret; ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0, - &priv->bar[0].pgd); + &priv->bar[nr].pgd); if (ret) return ret; - ret = nouveau_vm_new(device, 0, nv_device_resource_len(device, 3), 0, &vm); + bar_len = nv_device_resource_len(device, bar); + + ret = nouveau_vm_new(device, 0, bar_len, 0, &vm); if (ret) return ret; atomic_inc(&vm->engref[NVDEV_SUBDEV_BAR]); - ret = nouveau_gpuobj_new(nv_object(priv), NULL, - (nv_device_resource_len(device, 3) >> 12) * 8, - 0x1000, NVOBJ_FLAG_ZERO_ALLOC, - &vm->pgt[0].obj[0]); - vm->pgt[0].refcount[0] = 1; - if (ret) - return ret; + /* + * Bootstrap page table lookup. + */ + if (bar == 3) { + ret = nouveau_gpuobj_new(nv_object(priv), NULL, + (bar_len >> 12) * 8, 0x1000, + NVOBJ_FLAG_ZERO_ALLOC, + &vm->pgt[0].obj[0]); + vm->pgt[0].refcount[0] = 1; + if (ret) + return ret; + } - ret = nouveau_vm_ref(vm, &priv->bar[0].vm, priv->bar[0].pgd); + ret = nouveau_vm_ref(vm, &priv->bar[nr].vm, priv->bar[nr].pgd); nouveau_vm_ref(NULL, &vm, NULL); if (ret) return ret; - nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[0].pgd->addr)); - nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[0].pgd->addr)); - nv_wo32(mem, 0x0208, lower_32_bits(nv_device_resource_len(device, 3) - 1)); - nv_wo32(mem, 0x020c, upper_32_bits(nv_device_resource_len(device, 3) - 1)); + nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[nr].pgd->addr)); + nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[nr].pgd->addr)); + nv_wo32(mem, 0x0208, lower_32_bits(bar_len - 1)); + nv_wo32(mem, 0x020c, upper_32_bits(bar_len - 1)); - /* BAR1 */ - ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0, - &priv->bar[1].mem); - mem = priv->bar[1].mem; - if (ret) - return ret; + return 0; +} - ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0, - &priv->bar[1].pgd); - if (ret) - return ret; +static int +nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine, + struct nouveau_oclass *oclass, void *data, u32 size, + struct nouveau_object **pobject) +{ + struct nouveau_device *device = nv_device(parent); + struct nvc0_bar_priv *priv; + bool has_bar3 = nv_device_resource_len(device, 3) != 0; + int ret; - ret = nouveau_vm_new(device, 0, nv_device_resource_len(device, 1), 0, &vm); + ret = nouveau_bar_create(parent, engine, oclass, &priv); + *pobject = nv_object(priv); if (ret) return ret; - atomic_inc(&vm->engref[NVDEV_SUBDEV_BAR]); + /* BAR3 */ + if (has_bar3) { + ret = nvc0_bar_init_vm(priv, 0, 3); + if (ret) + return ret; + priv->base.alloc = nouveau_bar_alloc; + priv->base.kmap = nvc0_bar_kmap; + } - ret = nouveau_vm_ref(vm, &priv->bar[1].vm, priv->bar[1].pgd); - nouveau_vm_ref(NULL, &vm, NULL); + /* BAR1 */ + ret = nvc0_bar_init_vm(priv, 1, 1); if (ret) return ret; - nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[1].pgd->addr)); - nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[1].pgd->addr)); - nv_wo32(mem, 0x0208, lower_32_bits(nv_device_resource_len(device, 1) - 1)); - nv_wo32(mem, 0x020c, upper_32_bits(nv_device_resource_len(device, 1) - 1)); - - priv->base.alloc = nouveau_bar_alloc; - priv->base.kmap = nvc0_bar_kmap; priv->base.umap = nvc0_bar_umap; priv->base.unmap = nvc0_bar_unmap; priv->base.flush = nv84_bar_flush; @@ -201,7 +202,9 @@ nvc0_bar_init(struct nouveau_object *object) nv_mask(priv, 0x100c80, 0x00000001, 0x00000000); nv_wr32(priv, 0x001704, 0x80000000 | priv->bar[1].mem->addr >> 12); - nv_wr32(priv, 0x001714, 0xc0000000 | priv->bar[0].mem->addr >> 12); + if (priv->bar[0].mem) + nv_wr32(priv, 0x001714, + 0xc0000000 | priv->bar[0].mem->addr >> 12); return 0; } -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 05/12] drm/nouveau/fifo: add GK20A support
GK20A's FIFO is compatible with NVE0, but only features 128 channels and 1 runlist. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/Makefile | 1 + drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h | 1 + drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c | 35 ++++++++++++++++++++++ drivers/gpu/drm/nouveau/core/include/engine/fifo.h | 1 + 4 files changed, 38 insertions(+) create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile index d310c195bdfe..a90087bbdf88 100644 --- a/drivers/gpu/drm/nouveau/Makefile +++ b/drivers/gpu/drm/nouveau/Makefile @@ -237,6 +237,7 @@ nouveau-y += core/engine/fifo/nv50.o nouveau-y += core/engine/fifo/nv84.o nouveau-y += core/engine/fifo/nvc0.o nouveau-y += core/engine/fifo/nve0.o +nouveau-y += core/engine/fifo/nvea.o nouveau-y += core/engine/fifo/nv108.o nouveau-y += core/engine/graph/ctxnv40.o nouveau-y += core/engine/graph/ctxnv50.o diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h index 014344ebee66..e96b32bb1bbc 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h @@ -8,6 +8,7 @@ int nve0_fifo_ctor(struct nouveau_object *, struct nouveau_object *, struct nouveau_object **); void nve0_fifo_dtor(struct nouveau_object *); int nve0_fifo_init(struct nouveau_object *); +int nve0_fifo_fini(struct nouveau_object *, bool); struct nve0_fifo_impl { struct nouveau_oclass base; diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c new file mode 100644 index 000000000000..5e3a9514df5d --- /dev/null +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c @@ -0,0 +1,35 @@ +/* + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include "nve0.h" + +struct nouveau_oclass * +nvea_fifo_oclass = &(struct nve0_fifo_impl) { + .base.handle = NV_ENGINE(FIFO, 0xea), + .base.ofuncs = &(struct nouveau_ofuncs) { + .ctor = nve0_fifo_ctor, + .dtor = nve0_fifo_dtor, + .init = nve0_fifo_init, + .fini = nve0_fifo_fini, + }, + .channels = 128, +}.base; diff --git a/drivers/gpu/drm/nouveau/core/include/engine/fifo.h b/drivers/gpu/drm/nouveau/core/include/engine/fifo.h index 26b6b2bb1112..823356f45137 100644 --- a/drivers/gpu/drm/nouveau/core/include/engine/fifo.h +++ b/drivers/gpu/drm/nouveau/core/include/engine/fifo.h @@ -109,6 +109,7 @@ extern struct nouveau_oclass *nv50_fifo_oclass; extern struct nouveau_oclass *nv84_fifo_oclass; extern struct nouveau_oclass *nvc0_fifo_oclass; extern struct nouveau_oclass *nve0_fifo_oclass; +extern struct nouveau_oclass *nvea_fifo_oclass; extern struct nouveau_oclass *nv108_fifo_oclass; void nv04_fifo_intr(struct nouveau_subdev *); -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 06/12] drm/nouveau/ibus: add GK20A support
Add support for initializing the priv ring of GK20A. This is done by the BIOS on desktop GPUs, but needs to be done by hand on Tegra. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/Makefile | 1 + drivers/gpu/drm/nouveau/core/include/subdev/ibus.h | 1 + drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c | 110 +++++++++++++++++++++ 3 files changed, 112 insertions(+) create mode 100644 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile index a90087bbdf88..592141e62dda 100644 --- a/drivers/gpu/drm/nouveau/Makefile +++ b/drivers/gpu/drm/nouveau/Makefile @@ -132,6 +132,7 @@ nouveau-y += core/subdev/i2c/nv94.o nouveau-y += core/subdev/i2c/nvd0.o nouveau-y += core/subdev/ibus/nvc0.o nouveau-y += core/subdev/ibus/nve0.o +nouveau-y += core/subdev/ibus/nvea.o nouveau-y += core/subdev/instmem/base.o nouveau-y += core/subdev/instmem/nv04.o nouveau-y += core/subdev/instmem/nv40.o diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h b/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h index 88814f159d89..056a42f92705 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h @@ -30,5 +30,6 @@ nouveau_ibus(void *obj) extern struct nouveau_oclass nvc0_ibus_oclass; extern struct nouveau_oclass nve0_ibus_oclass; +extern struct nouveau_oclass nvea_ibus_oclass; #endif diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c new file mode 100644 index 000000000000..151851286e99 --- /dev/null +++ b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c @@ -0,0 +1,110 @@ +/* + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include <subdev/ibus.h> + +struct nvea_ibus_priv { + struct nouveau_ibus base; +}; + +static void +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv) +{ + nv_mask(priv, 0x137250, 0x3f, 0); + + nv_mask(priv, 0x000200, 0x20, 0); + udelay(20); + nv_mask(priv, 0x000200, 0x20, 0x20); + + nv_wr32(priv, 0x12004c, 0x4); + nv_wr32(priv, 0x122204, 0x2); + nv_rd32(priv, 0x122204); +} + +static void +nvea_ibus_intr(struct nouveau_subdev *subdev) +{ + struct nvea_ibus_priv *priv = (void *)subdev; + u32 status0 = nv_rd32(priv, 0x120058); + s32 retry = 100; + u32 command; + + if (status0 & 0x7) { + nv_debug(priv, "resetting priv ring\n"); + nvea_ibus_init_priv_ring(priv); + } + + /* Acknowledge interrupt */ + nv_mask(priv, 0x12004c, 0x2, 0x2); + + while (--retry >= 0) { + command = nv_rd32(priv, 0x12004c) & 0x3f; + if (command == 0) + break; + } + + if (retry < 0) + nv_warn(priv, "timeout waiting for ringmaster ack\n"); +} + +static int +nvea_ibus_init(struct nouveau_object *object) +{ + struct nvea_ibus_priv *priv = (void *)object; + int ret; + + ret = _nouveau_ibus_init(object); + if (ret) + return ret; + + nvea_ibus_init_priv_ring(priv); + + return 0; +} + +static int +nvea_ibus_ctor(struct nouveau_object *parent, struct nouveau_object *engine, + struct nouveau_oclass *oclass, void *data, u32 size, + struct nouveau_object **pobject) +{ + struct nvea_ibus_priv *priv; + int ret; + + ret = nouveau_ibus_create(parent, engine, oclass, &priv); + *pobject = nv_object(priv); + if (ret) + return ret; + + nv_subdev(priv)->intr = nvea_ibus_intr; + return 0; +} + +struct nouveau_oclass +nvea_ibus_oclass = { + .handle = NV_SUBDEV(IBUS, 0xea), + .ofuncs = &(struct nouveau_ofuncs) { + .ctor = nvea_ibus_ctor, + .dtor = _nouveau_ibus_dtor, + .init = nvea_ibus_init, + .fini = _nouveau_ibus_fini, + }, +}; -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 07/12] drm/nouveau/fb: add GK20A support
Add a simple FB device for GK20A, as well as a RAM implementation based on contiguous DMA memory allocations suitable for chips that use system memory as video RAM. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/Makefile | 2 + drivers/gpu/drm/nouveau/core/include/subdev/fb.h | 1 + drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c | 56 ++++++++ drivers/gpu/drm/nouveau/core/subdev/fb/priv.h | 1 + drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c | 168 +++++++++++++++++++++++ 5 files changed, 228 insertions(+) create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile index 592141e62dda..708d2e33835f 100644 --- a/drivers/gpu/drm/nouveau/Makefile +++ b/drivers/gpu/drm/nouveau/Makefile @@ -100,6 +100,7 @@ nouveau-y += core/subdev/fb/nvaa.o nouveau-y += core/subdev/fb/nvaf.o nouveau-y += core/subdev/fb/nvc0.o nouveau-y += core/subdev/fb/nve0.o +nouveau-y += core/subdev/fb/nvea.o nouveau-y += core/subdev/fb/ramnv04.o nouveau-y += core/subdev/fb/ramnv10.o nouveau-y += core/subdev/fb/ramnv1a.o @@ -114,6 +115,7 @@ nouveau-y += core/subdev/fb/ramnva3.o nouveau-y += core/subdev/fb/ramnvaa.o nouveau-y += core/subdev/fb/ramnvc0.o nouveau-y += core/subdev/fb/ramnve0.o +nouveau-y += core/subdev/fb/ramnvea.o nouveau-y += core/subdev/fb/sddr3.o nouveau-y += core/subdev/fb/gddr5.o nouveau-y += core/subdev/gpio/base.o diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h index d7ecafbae1ca..3905816755ba 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h @@ -105,6 +105,7 @@ extern struct nouveau_oclass *nvaa_fb_oclass; extern struct nouveau_oclass *nvaf_fb_oclass; extern struct nouveau_oclass *nvc0_fb_oclass; extern struct nouveau_oclass *nve0_fb_oclass; +extern struct nouveau_oclass *nvea_fb_oclass; #include <subdev/bios/ramcfg.h> diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c new file mode 100644 index 000000000000..62dbec48481e --- /dev/null +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c @@ -0,0 +1,56 @@ +/* + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include "nvc0.h" + +struct nvea_fb_priv { + struct nouveau_fb base; +}; + +static int +nvea_fb_ctor(struct nouveau_object *parent, struct nouveau_object *engine, + struct nouveau_oclass *oclass, void *data, u32 size, + struct nouveau_object **pobject) +{ + struct nvea_fb_priv *priv; + int ret; + + ret = nouveau_fb_create(parent, engine, oclass, &priv); + *pobject = nv_object(priv); + if (ret) + return ret; + + return 0; +} + +struct nouveau_oclass * +nvea_fb_oclass = &(struct nouveau_fb_impl) { + .base.handle = NV_SUBDEV(FB, 0xea), + .base.ofuncs = &(struct nouveau_ofuncs) { + .ctor = nvea_fb_ctor, + .dtor = _nouveau_fb_dtor, + .init = _nouveau_fb_init, + .fini = _nouveau_fb_fini, + }, + .memtype = nvc0_fb_memtype_valid, + .ram = &nvea_ram_oclass, +}.base; diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h index edaf95dee612..0b95a25504d3 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h @@ -32,6 +32,7 @@ extern struct nouveau_oclass nva3_ram_oclass; extern struct nouveau_oclass nvaa_ram_oclass; extern struct nouveau_oclass nvc0_ram_oclass; extern struct nouveau_oclass nve0_ram_oclass; +extern struct nouveau_oclass nvea_ram_oclass; int nouveau_sddr3_calc(struct nouveau_ram *ram); int nouveau_gddr5_calc(struct nouveau_ram *ram, bool nuts); diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c new file mode 100644 index 000000000000..4e3d757614b6 --- /dev/null +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c @@ -0,0 +1,168 @@ +/* + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include "priv.h" + +#include <subdev/fb.h> + +#include <linux/mm.h> +#include <linux/types.h> +#include <linux/dma-contiguous.h> + +static void +nvea_ram_put(struct nouveau_fb *pfb, struct nouveau_mem **pmem) +{ + struct device *dev = nv_device_base(nv_device(pfb)); + struct nouveau_mem *mem = *pmem; + int i; + + *pmem = NULL; + + for (i = 0; i < mem->size; i++) { + struct page *page; + + if (mem->pages[i] == 0) + break; + + page = pfn_to_page(dma_to_pfn(dev, mem->pages[i])); + dma_release_from_contiguous(dev, page, 1); + } + + kfree(mem->pages); + kfree(mem); +} + +static int +nvea_ram_get(struct nouveau_fb *pfb, u64 size, u32 align, u32 ncmin, + u32 memtype, struct nouveau_mem **pmem) +{ + struct device *dev = nv_device_base(nv_device(pfb)); + struct nouveau_mem *mem; + int type = memtype & 0xff; + dma_addr_t dma_addr; + int npages; + int order; + int i; + + nv_debug(pfb, "%s: size: %llx align: %x, ncmin: %x\n", __func__, size, + align, ncmin); + + npages = size >> PAGE_SHIFT; + if (npages == 0) + npages = 1; + + if (align == 0) + align = PAGE_SIZE; + align >>= PAGE_SHIFT; + + /* round alignment to the next power of 2, if needed */ + order = fls(align); + if ((align & (align - 1)) == 0) + order--; + + ncmin >>= PAGE_SHIFT; + /* + * allocate pages by chunks of "align" size, otherwise we may leave + * holes in the contiguous memory area. + */ + if (ncmin == 0) + ncmin = npages; + else if (align > ncmin) + ncmin = align; + + mem = kzalloc(sizeof(*mem), GFP_KERNEL); + if (!mem) + return -ENOMEM; + + mem->size = npages; + mem->memtype = type; + + mem->pages = kzalloc(sizeof(dma_addr_t) * npages, GFP_KERNEL); + if (!mem) { + kfree(mem); + return -ENOMEM; + } + + while (npages) { + struct page *pages; + int pos = 0; + + /* don't overflow in case size is not a multiple of ncmin */ + if (ncmin > npages) + ncmin = npages; + + pages = dma_alloc_from_contiguous(dev, ncmin, order); + if (!pages) { + nvea_ram_put(pfb, &mem); + return -ENOMEM; + } + + dma_addr = pfn_to_dma(nv_device_base(nv_device(pfb)), + page_to_pfn(pages)); + + nv_debug(pfb, " alloc count: %x, order: %x, addr: %x\n", ncmin, + order, dma_addr); + + for (i = 0; i < ncmin; i++) + mem->pages[pos + i] = dma_addr + (PAGE_SIZE * i); + + pos += ncmin; + npages -= ncmin; + } + + mem->offset = (u64)mem->pages[0]; + + *pmem = mem; + + return 0; +} + +static int +nvea_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, + struct nouveau_oclass *oclass, void *data, u32 datasize, + struct nouveau_object **pobject) +{ + struct nouveau_ram *ram; + int ret; + + ret = nouveau_ram_create(parent, engine, oclass, &ram); + *pobject = nv_object(ram); + if (ret) + return ret; + ram->type = NV_MEM_TYPE_STOLEN; + ram->size = get_num_physpages() << PAGE_SHIFT; + + ram->get = nvea_ram_get; + ram->put = nvea_ram_put; + + return 0; +} + +struct nouveau_oclass +nvea_ram_oclass = { + .ofuncs = &(struct nouveau_ofuncs) { + .ctor = nvea_ram_ctor, + .dtor = _nouveau_ram_dtor, + .init = _nouveau_ram_init, + .fini = _nouveau_ram_fini, + }, +}; -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
nvc0_graph_ctor() would only let the graphics engine be enabled if its oclass has a proper microcode linked to it. This prevents GR from being enabled at all on chips that rely exclusively on external firmware, even though such a use-case is valid. Relax the conditions enabling the GR engine to also include the case where an external firmware has also been loaded. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c index 6ef8bf181b2d..f997a18f5760 100644 --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine, struct nvc0_graph_oclass *oclass = (void *)bclass; struct nouveau_device *device = nv_device(parent); struct nvc0_graph_priv *priv; + bool use_fw; int ret, i; + use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false); + ret = nouveau_graph_create(parent, engine, bclass, - (oclass->fecs.ucode != NULL), &priv); + (oclass->fecs.ucode != NULL) || use_fw, + &priv); *pobject = nv_object(priv); if (ret) return ret; @@ -1146,7 +1150,7 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine, priv->base.units = nvc0_graph_units; - if (nouveau_boolopt(device->cfgopt, "NvGrUseFW", false)) { + if (use_fw) { nv_info(priv, "using external firmware\n"); if (nvc0_graph_ctor_fw(priv, "fuc409c", &priv->fuc409c) || nvc0_graph_ctor_fw(priv, "fuc409d", &priv->fuc409d) || -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
Pad the microcode to a multiple of 0x40, otherwise firmware will fail to run from non-prepadded firmware files. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c index f997a18f5760..367e72daf8b1 100644 --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c @@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32 fuc_base, nv_wr32(priv, fuc_base + 0x0188, i >> 6); nv_wr32(priv, fuc_base + 0x0184, code->data[i]); } + + /* code must be padded to 0x40 */ + for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++) + nv_wr32(priv, fuc_base + 0x0184, 0); } static void -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 10/12] drm/nouveau/graph: add GK20A support
Add a GR device for GK20A based on NVE4, with the correct classes definitions (GK20A's 3D class is 0xa297). Most of the NVE4 code can be used on GK20A, so make relevant bits of NVE4 available to other chips as well. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/Makefile | 1 + .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c | 4 +- drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h | 9 +++ drivers/gpu/drm/nouveau/core/engine/graph/nve4.c | 2 +- drivers/gpu/drm/nouveau/core/engine/graph/nvea.c | 75 ++++++++++++++++++++++ .../gpu/drm/nouveau/core/include/engine/graph.h | 1 + 6 files changed, 89 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile index 708d2e33835f..032a4744f843 100644 --- a/drivers/gpu/drm/nouveau/Makefile +++ b/drivers/gpu/drm/nouveau/Makefile @@ -270,6 +270,7 @@ nouveau-y += core/engine/graph/nvc8.o nouveau-y += core/engine/graph/nvd7.o nouveau-y += core/engine/graph/nvd9.o nouveau-y += core/engine/graph/nve4.o +nouveau-y += core/engine/graph/nvea.o nouveau-y += core/engine/graph/nvf0.o nouveau-y += core/engine/graph/nv108.o nouveau-y += core/engine/mpeg/nv31.o diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c index e2de73ee5eee..3904073f860d 100644 --- a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c +++ b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c @@ -804,7 +804,7 @@ nve4_grctx_init_unk[] = { {} }; -static void +void nve4_grctx_generate_mods(struct nvc0_graph_priv *priv, struct nvc0_grctx *info) { u32 magic[GPC_MAX][2]; @@ -962,7 +962,7 @@ nve4_grctx_generate_main(struct nvc0_graph_priv *priv, struct nvc0_grctx *info) nv_mask(priv, 0x41be10, 0x00800000, 0x00800000); } -static struct nvc0_graph_init * +struct nvc0_graph_init * nve4_grctx_init_hub[] = { nvc0_grctx_init_base, nve4_grctx_init_unk40xx, diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h index b0ab6de270b2..904f09b540c5 100644 --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h @@ -221,6 +221,8 @@ void nvc0_grctx_generate_r418bb8(struct nvc0_graph_priv *); void nve4_grctx_generate_r418bb8(struct nvc0_graph_priv *); void nvc0_grctx_generate_r406800(struct nvc0_graph_priv *); +void nve4_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *); + extern struct nouveau_oclass *nvc0_grctx_oclass; extern struct nvc0_graph_init *nvc0_grctx_init_hub[]; extern struct nvc0_graph_init nvc0_grctx_init_base[]; @@ -237,12 +239,17 @@ extern struct nvc0_graph_init nvc0_grctx_init_gpc_1[]; extern struct nvc0_graph_init nvc0_grctx_init_tpc[]; extern struct nvc0_graph_init nvc0_grctx_init_icmd[]; extern struct nvc0_graph_init nvd9_grctx_init_icmd[]; // +extern struct nvc0_graph_init nve4_grctx_init_icmd[]; + +extern struct nvc0_graph_init *nve4_grctx_init_hub[]; +extern struct nvc0_graph_init *nve4_grctx_init_gpc[]; extern struct nvc0_graph_mthd nvc0_grctx_init_mthd[]; extern struct nvc0_graph_init nvc0_grctx_init_902d[]; extern struct nvc0_graph_init nvc0_grctx_init_9039[]; extern struct nvc0_graph_init nvc0_grctx_init_90c0[]; extern struct nvc0_graph_init nvc0_grctx_init_mthd_magic[]; +extern struct nvc0_graph_init nve4_grctx_init_a097[]; void nvc1_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *); void nvc1_grctx_generate_unkn(struct nvc0_graph_priv *); @@ -277,6 +284,8 @@ extern struct nvc0_graph_init nvf0_grctx_init_unk60xx[]; extern struct nouveau_oclass *nv108_grctx_oclass; +extern struct nvc0_graph_init *nve4_graph_init_mmio[]; + #define mmio_data(s,a,p) do { \ info->buffer[info->buffer_nr] = round_up(info->addr, (a)); \ info->addr = info->buffer[info->buffer_nr++] + (s); \ diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c index 05ec09c88517..442857c5c120 100644 --- a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c @@ -298,7 +298,7 @@ nve4_graph_init(struct nouveau_object *object) return nvc0_graph_init_ctxctl(priv); } -static struct nvc0_graph_init * +struct nvc0_graph_init * nve4_graph_init_mmio[] = { nve4_graph_init_regs, nvc0_graph_init_unk40xx, diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c new file mode 100644 index 000000000000..d5e6a1adcacb --- /dev/null +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c @@ -0,0 +1,75 @@ +/* + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include "nvc0.h" + +static struct nouveau_oclass +nvea_graph_sclass[] = { + { 0x902d, &nouveau_object_ofuncs }, + { 0xa040, &nouveau_object_ofuncs }, + { 0xa297, &nouveau_object_ofuncs }, + { 0xa0c0, &nouveau_object_ofuncs }, + {} +}; + +static struct nvc0_graph_mthd +nvea_grctx_init_mthd[] = { + { 0xa297, nve4_grctx_init_a097, }, + { 0x902d, nvc0_grctx_init_902d, }, + { 0x902d, nvc0_grctx_init_mthd_magic, }, + {} +}; + +struct nouveau_oclass * +nvea_grctx_oclass = &(struct nvc0_grctx_oclass) { + .base.handle = NV_ENGCTX(GR, 0xea), + .base.ofuncs = &(struct nouveau_ofuncs) { + .ctor = nvc0_graph_context_ctor, + .dtor = nvc0_graph_context_dtor, + .init = _nouveau_graph_context_init, + .fini = _nouveau_graph_context_fini, + .rd32 = _nouveau_graph_context_rd32, + .wr32 = _nouveau_graph_context_wr32, + }, + .main = nve4_grctx_generate_main, + .mods = nve4_grctx_generate_mods, + .unkn = nve4_grctx_generate_unkn, + .hub = nve4_grctx_init_hub, + .gpc = nve4_grctx_init_gpc, + .icmd = nve4_grctx_init_icmd, + .mthd = nvea_grctx_init_mthd, +}.base; + + +struct nouveau_oclass * +nvea_graph_oclass = &(struct nvc0_graph_oclass) { + .base.handle = NV_ENGINE(GR, 0xea), + .base.ofuncs = &(struct nouveau_ofuncs) { + .ctor = nvc0_graph_ctor, + .dtor = nvc0_graph_dtor, + .init = nve4_graph_init, + .fini = _nouveau_graph_fini, + }, + .cclass = &nvea_grctx_oclass, + .sclass = nvea_graph_sclass, + .mmio = nve4_graph_init_mmio, +}.base; diff --git a/drivers/gpu/drm/nouveau/core/include/engine/graph.h b/drivers/gpu/drm/nouveau/core/include/engine/graph.h index 97705618de97..632d7e172172 100644 --- a/drivers/gpu/drm/nouveau/core/include/engine/graph.h +++ b/drivers/gpu/drm/nouveau/core/include/engine/graph.h @@ -68,6 +68,7 @@ extern struct nouveau_oclass *nvc8_graph_oclass; extern struct nouveau_oclass *nvd7_graph_oclass; extern struct nouveau_oclass *nvd9_graph_oclass; extern struct nouveau_oclass *nve4_graph_oclass; +extern struct nouveau_oclass *nvea_graph_oclass; extern struct nouveau_oclass *nvf0_graph_oclass; extern struct nouveau_oclass *nv108_graph_oclass; -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
GK20A does not embed a dedicated COPY engine and thus cannot allocate the copy channel that nouveau_accel_init() attempts to create. It also lacks any display hardware, so the creation of a software channel does not apply neither. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index ef27949057c3..f2394e84eae6 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -171,6 +171,11 @@ nouveau_accel_init(struct nouveau_drm *drm) return; } + if (device->chipset == 0xea) { + /* gk20a does not have CE0/CE1 */ + arg0 = NVE0_CHANNEL_IND_ENGINE_GR; + arg1 = 1; + } else if (device->card_type >= NV_E0) { ret = nouveau_channel_new(drm, &drm->client, NVDRM_DEVICE, NVDRM_CHAN + 1, @@ -207,6 +212,10 @@ nouveau_accel_init(struct nouveau_drm *drm) return; } + /* Need to figure out how to handle sw for gk20a */ + if (device->chipset == 0xea) + goto skip_sw_init; + ret = nouveau_object_new(nv_object(drm), NVDRM_CHAN, NVDRM_NVSW, nouveau_abi16_swclass(drm), NULL, 0, &object); if (ret == 0) { @@ -233,6 +242,7 @@ nouveau_accel_init(struct nouveau_drm *drm) return; } +skip_sw_init: if (device->card_type < NV_C0) { ret = nouveau_gpuobj_new(drm->device, NULL, 32, 0, 0, &drm->notify); -- 1.9.1
Alexandre Courbot
2014-Mar-24 08:42 UTC
[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A
Set the correct subdev/engine classes when GK20A (0xea) is probed. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c index 987edbc30a09..8509dd57de1f 100644 --- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c +++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c @@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device) device->oclass[NVDEV_ENGINE_PPP ] = &nvc0_ppp_oclass; device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass; break; + case 0xea: + device->cname = "GK20A"; + device->oclass[NVDEV_SUBDEV_MC ] = nvc3_mc_oclass; + device->oclass[NVDEV_SUBDEV_BUS ] = nvc0_bus_oclass; + device->oclass[NVDEV_SUBDEV_TIMER ] = &nv04_timer_oclass; + device->oclass[NVDEV_SUBDEV_FB ] = nvea_fb_oclass; + device->oclass[NVDEV_SUBDEV_IBUS ] = &nvea_ibus_oclass; + device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass; + device->oclass[NVDEV_SUBDEV_VM ] = &nvc0_vmmgr_oclass; + device->oclass[NVDEV_SUBDEV_BAR ] = &nvc0_bar_oclass; + device->oclass[NVDEV_ENGINE_DMAOBJ ] = &nvd0_dmaeng_oclass; + device->oclass[NVDEV_ENGINE_FIFO ] = nvea_fifo_oclass; + /* TODO will need an implementation for this at some point... */ +#if 0 + device->oclass[NVDEV_ENGINE_SW ] = nvc0_software_oclass; +#endif + device->oclass[NVDEV_ENGINE_GR ] = nvea_graph_oclass; + device->oclass[NVDEV_ENGINE_COPY2 ] = &nve0_copy2_oclass; + device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass; + break; case 0xf0: device->cname = "GK110"; device->oclass[NVDEV_SUBDEV_VBIOS ] = &nouveau_bios_oclass; -- 1.9.1
Lucas Stach
2014-Mar-24 13:19 UTC
[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
Hi Alexandre, Am Montag, den 24.03.2014, 17:42 +0900 schrieb Alexandre Courbot:> Hi everyone,[...]> > A few lines of hacks (not included here) are still needed to deal with cached > mappings triggering external aborts and CPU/GPU memory coherency issues, but I > hope to understand and address these issues next.For the coherency issue part you may want to look at my Nouveau on ARM series. Most of it never made it upstream, as I lacked the time to work further on this, but it solves the coherency issue from the kernel. It does so by doing the necessary manual cache flushes/invalidates on buffer access, so costs some performance. To avoid this you really want to get writecombined mappings into the kernel<->userspace interface. Simply mapping the pushbuf as WC/US has brought a 7% performance increase in OpenArena when I last tested this. This test was done with only one PCIe lane, so the perf increase may be even better with a more adequate interconnect. Regards, Lucas -- Pengutronix e.K. | Lucas Stach | Industrial Linux Solutions | http://www.pengutronix.de/ | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
Thierry Reding
2014-Mar-24 21:49 UTC
[Nouveau] [PATCH 01/12] drm/nouveau: fix missing newline
On Mon, Mar 24, 2014 at 05:42:23PM +0900, Alexandre Courbot wrote:> Add a missing newline at the end of a DRM_INFO message. > > Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> > --- > drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-)Reviewed-by: Thierry Reding <treding at nvidia.com> -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140324/50d176b9/attachment.sig>
Thierry Reding
2014-Mar-24 21:54 UTC
[Nouveau] [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:> GK20A's timer is directly attached to the system timer and cannot be > calibrated. Skip the calibration phase on that chip since the > corresponding registers do not exist. > > Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> > --- > drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------ > 1 file changed, 13 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c > index c0bdd10358d7..822fe0d8a871 100644 > --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c > +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c > @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object) > if (ret) > return ret; > > + /* gk20a does not have the calibration registers */ > + if (device->chipset == 0xea) > + goto skip_clk_init;I'm concerned that this won't scale in the future. Perhaps a better solution would be to add a "flags" or "features" field to struct nouveau_device along with feature bits such as HAS_CALIBRATION or similar. That way we don't have to touch this code for every new future Tegra chip. Unless perhaps if there's a reason to expect things to change in newer generations. Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140324/16ab7985/attachment.sig>
Thierry Reding
2014-Mar-24 22:10 UTC
[Nouveau] [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3
On Mon, Mar 24, 2014 at 05:42:26PM +0900, Alexandre Courbot wrote: [...]> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c[...]> static int > -nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine, > - struct nouveau_oclass *oclass, void *data, u32 size, > - struct nouveau_object **pobject) > +nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar) > {[...]> - /* BAR3 */ > ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0, > - &priv->bar[0].mem); > - mem = priv->bar[0].mem; > + &priv->bar[nr].mem); > + mem = priv->bar[nr].mem; > if (ret) > return ret; > > ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0, > - &priv->bar[0].pgd); > + &priv->bar[nr].pgd); > if (ret) > return ret;[...]> +static int > +nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine, > + struct nouveau_oclass *oclass, void *data, u32 size, > + struct nouveau_object **pobject) > +{[...]> + /* BAR3 */ > + if (has_bar3) { > + ret = nvc0_bar_init_vm(priv, 0, 3);[...]> + /* BAR1 */ > + ret = nvc0_bar_init_vm(priv, 1, 1); > if (ret) > return ret;The calls to nvc0_bar_init_vm() are somewhat confusing in my opinion. It is hard to see from the invocation what these numbers mean and therefore distinguish which parameter is which. Perhaps a slightly more readable way would be to pass in a pointer to a structure as second parameter instead of the index into an array. So it'd look somewhat like this: if (has_bar3) { ret = nvc0_bar_init_vm(priv, &priv->bar[0], 3); ... } ... ret = nvc0_bar_init_vm(priv, &priv->bar[1], 1); ... Unfortunately that would require a new type to be created for the bar[] structures, so it'd be slightly more intrusive. Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140324/e33e3552/attachment.sig>
Thierry Reding
2014-Mar-24 22:13 UTC
[Nouveau] [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists
On Mon, Mar 24, 2014 at 05:42:25PM +0900, Alexandre Courbot wrote:> Some chips that use system memory exclusively (e.g. GK20A) do not > expose 2 BAR regions. For them only BAR1 exists, and it should be used > for USERD mapping. Do not map BAR3 if its resource does not exist. > > Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> > --- > drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c > index bdf594116f3f..d713eeb75b13 100644 > --- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c > +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c > @@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent, > { > struct nouveau_device *device = nv_device(parent); > struct nouveau_bar *bar; > + bool has_bar3 = nv_device_resource_len(device, 3) != 0;I don't think this variable is really necessary since you only use the expression once anyway, but I don't feel very strongly about it, so either way: Reviewed-by: Thierry Reding <treding at nvidia.com> -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140324/67a79728/attachment.sig>
Thierry Reding
2014-Mar-24 22:14 UTC
[Nouveau] [PATCH 05/12] drm/nouveau/fifo: add GK20A support
On Mon, Mar 24, 2014 at 05:42:27PM +0900, Alexandre Courbot wrote:> GK20A's FIFO is compatible with NVE0, but only features 128 channels and > 1 runlist. > > Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> > --- > drivers/gpu/drm/nouveau/Makefile | 1 + > drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h | 1 + > drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c | 35 ++++++++++++++++++++++ > drivers/gpu/drm/nouveau/core/include/engine/fifo.h | 1 + > 4 files changed, 38 insertions(+) > create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.cLooks good to me: Reviewed-by: Thierry Reding <treding at nvidia.com> -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140324/9c7c7236/attachment.sig>
Thierry Reding
2014-Mar-24 22:34 UTC
[Nouveau] [PATCH 06/12] drm/nouveau/ibus: add GK20A support
On Mon, Mar 24, 2014 at 05:42:28PM +0900, Alexandre Courbot wrote: [...]> diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c[...]> +#include <subdev/ibus.h> > + > +struct nvea_ibus_priv { > + struct nouveau_ibus base; > +}; > + > +static void > +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv) > +{ > + nv_mask(priv, 0x137250, 0x3f, 0); > + > + nv_mask(priv, 0x000200, 0x20, 0); > + udelay(20);usleep_range()?> +static void > +nvea_ibus_intr(struct nouveau_subdev *subdev) > +{[...]> + /* Acknowledge interrupt */ > + nv_mask(priv, 0x12004c, 0x2, 0x2); > + > + while (--retry >= 0) { > + command = nv_rd32(priv, 0x12004c) & 0x3f; > + if (command == 0) > + break; > + } > + > + if (retry < 0) > + nv_warn(priv, "timeout waiting for ringmaster ack\n"); > +}Perhaps I'm being paranoid, but this loop now depends on the frequency of the various clocks involved and therefore might break at some point if the frequencies get sufficiently high. So a slightly safer implementation would use a proper timeout using a combination of msecs_to_jiffies(), time_before() and usleep_range(), like so: timeout = jiffies + msecs_to_jiffies(...); while (time_before(jiffies, timeout)) { command = nv_rd32(...) & 0x3f; if (command == 0) break; usleep_range(...); } if (time_after(jiffies, timeout)) nv_warn(...); This assumes that there's some known timeout after which the ringmaster is expected to have acked the interrupt. On that note, I wonder if the warning is accurate here: it's my understanding that writing 0x2 to the register does acknowledge the interrupt, so the ringmaster does in fact "clear" it rather than "acknowledge" it, doesn't it? Although now that I mention it I seem to remember that this write is actually sending a command to the ring master and perhaps waiting for the register to return to 0 is indeed waiting for an ACK of sorts. Maybe adding a comment or so describing what this sequence does would be appropriate here? Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140324/16ffb2ac/attachment-0001.sig>
Thierry Reding
2014-Mar-24 22:58 UTC
[Nouveau] [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
On Mon, Mar 24, 2014 at 05:42:30PM +0900, Alexandre Courbot wrote: [...]> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c > index 6ef8bf181b2d..f997a18f5760 100644 > --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c > +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c > @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine, > struct nvc0_graph_oclass *oclass = (void *)bclass; > struct nouveau_device *device = nv_device(parent); > struct nvc0_graph_priv *priv; > + bool use_fw;Perhaps "ext_fw" or "use_ext_fw" would be more accurate.> int ret, i; > > + use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false); > + > ret = nouveau_graph_create(parent, engine, bclass, > - (oclass->fecs.ucode != NULL), &priv); > + (oclass->fecs.ucode != NULL) || use_fw, > + &priv);Or perhaps a more intuitive way would be to name the variable "enable" and have something like: if (!nouveau_boolopt(device->cfgopt, "NvGrUseFW", false)) enable = oclass->fecs.ucode != NULL; else enable = true; ret = nouveau_graph_create(parent, engine, bclass, enable, &priv); ... Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140324/8d29e34e/attachment.sig>
Thierry Reding
2014-Mar-24 23:02 UTC
[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
On Mon, Mar 24, 2014 at 05:42:31PM +0900, Alexandre Courbot wrote:> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to > run from non-prepadded firmware files.Perhaps this (and the comment in the code) should mention a unit. In this case it's 0x40 words. Also, I think using a decimal number would read easier here and in the comment. Furthermore, perhaps choosing a byte as the unit would be even more intuitive. But either way the code is correct, so: Reviewed-by: Thierry Reding <treding at nvidia.com> -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140325/589a12af/attachment.sig>
Thierry Reding
2014-Mar-24 23:10 UTC
[Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
On Mon, Mar 24, 2014 at 05:42:33PM +0900, Alexandre Courbot wrote:> GK20A does not embed a dedicated COPY engine and thus cannot allocate > the copy channel that nouveau_accel_init() attempts to create. It also > lacks any display hardware, so the creation of a software channel does > not apply neither.Perhaps this should be two separate patches?> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c[...]> + if (device->chipset == 0xea) { > + /* gk20a does not have CE0/CE1 */This would be another good candidate for a feature flag.> + arg0 = NVE0_CHANNEL_IND_ENGINE_GR; > + arg1 = 1; > + } else > if (device->card_type >= NV_E0) {The formatting here is somewhat weird. From a brief look I couldn't find any indication that nouveau deviates from the standard coding style, so this should be: } else if (...) {> + /* Need to figure out how to handle sw for gk20a */ > + if (device->chipset == 0xea) > + goto skip_sw_init;The commit message makes it sound like SW isn't needed since gk20a "lacks any display hardware". In that case the comment here doesn't make much sense. Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140325/1f743847/attachment.sig>
Thierry Reding
2014-Mar-24 23:11 UTC
[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A
On Mon, Mar 24, 2014 at 05:42:34PM +0900, Alexandre Courbot wrote: [...]> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c[...]> + /* TODO will need an implementation for this at some point... */Do we? If so the commit message in 11/12 is misleading. Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140325/268fe079/attachment.sig>
Ben Skeggs
2014-Mar-26 04:22 UTC
[Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot at nvidia.com> wrote:> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to > run from non-prepadded firmware files. > > Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> > --- > drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c > index f997a18f5760..367e72daf8b1 100644 > --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c > +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c > @@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32 fuc_base, > nv_wr32(priv, fuc_base + 0x0188, i >> 6); > nv_wr32(priv, fuc_base + 0x0184, code->data[i]); > } > + > + /* code must be padded to 0x40 */ > + for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++)"for (; i & 0x3f; i++)" would work just as well :)> + nv_wr32(priv, fuc_base + 0x0184, 0); > } > > static void > -- > 1.9.1 > > _______________________________________________ > Nouveau mailing list > Nouveau at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/nouveau
Ben Skeggs
2014-Mar-26 04:24 UTC
[Nouveau] [PATCH 10/12] drm/nouveau/graph: add GK20A support
On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot at nvidia.com> wrote:> Add a GR device for GK20A based on NVE4, with the correct classes > definitions (GK20A's 3D class is 0xa297). > > Most of the NVE4 code can be used on GK20A, so make relevant bits of > NVE4 available to other chips as well.This will need a bit of a rebase on top of the tree I mentioned earlier (also queued for drm-next now), where I've further split out and named the various chunks of state. Does GK104 match entirely correctly, or just happen to work? I could probably hunt down the GK20A netlist images and check that actually :)> > Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> > --- > drivers/gpu/drm/nouveau/Makefile | 1 + > .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c | 4 +- > drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h | 9 +++ > drivers/gpu/drm/nouveau/core/engine/graph/nve4.c | 2 +- > drivers/gpu/drm/nouveau/core/engine/graph/nvea.c | 75 ++++++++++++++++++++++ > .../gpu/drm/nouveau/core/include/engine/graph.h | 1 + > 6 files changed, 89 insertions(+), 3 deletions(-) > create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c > > diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile > index 708d2e33835f..032a4744f843 100644 > --- a/drivers/gpu/drm/nouveau/Makefile > +++ b/drivers/gpu/drm/nouveau/Makefile > @@ -270,6 +270,7 @@ nouveau-y += core/engine/graph/nvc8.o > nouveau-y += core/engine/graph/nvd7.o > nouveau-y += core/engine/graph/nvd9.o > nouveau-y += core/engine/graph/nve4.o > +nouveau-y += core/engine/graph/nvea.o > nouveau-y += core/engine/graph/nvf0.o > nouveau-y += core/engine/graph/nv108.o > nouveau-y += core/engine/mpeg/nv31.o > diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c > index e2de73ee5eee..3904073f860d 100644 > --- a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c > +++ b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c > @@ -804,7 +804,7 @@ nve4_grctx_init_unk[] = { > {} > }; > > -static void > +void > nve4_grctx_generate_mods(struct nvc0_graph_priv *priv, struct nvc0_grctx *info) > { > u32 magic[GPC_MAX][2]; > @@ -962,7 +962,7 @@ nve4_grctx_generate_main(struct nvc0_graph_priv *priv, struct nvc0_grctx *info) > nv_mask(priv, 0x41be10, 0x00800000, 0x00800000); > } > > -static struct nvc0_graph_init * > +struct nvc0_graph_init * > nve4_grctx_init_hub[] = { > nvc0_grctx_init_base, > nve4_grctx_init_unk40xx, > diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h > index b0ab6de270b2..904f09b540c5 100644 > --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h > +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h > @@ -221,6 +221,8 @@ void nvc0_grctx_generate_r418bb8(struct nvc0_graph_priv *); > void nve4_grctx_generate_r418bb8(struct nvc0_graph_priv *); > void nvc0_grctx_generate_r406800(struct nvc0_graph_priv *); > > +void nve4_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *); > + > extern struct nouveau_oclass *nvc0_grctx_oclass; > extern struct nvc0_graph_init *nvc0_grctx_init_hub[]; > extern struct nvc0_graph_init nvc0_grctx_init_base[]; > @@ -237,12 +239,17 @@ extern struct nvc0_graph_init nvc0_grctx_init_gpc_1[]; > extern struct nvc0_graph_init nvc0_grctx_init_tpc[]; > extern struct nvc0_graph_init nvc0_grctx_init_icmd[]; > extern struct nvc0_graph_init nvd9_grctx_init_icmd[]; // > +extern struct nvc0_graph_init nve4_grctx_init_icmd[]; > + > +extern struct nvc0_graph_init *nve4_grctx_init_hub[]; > +extern struct nvc0_graph_init *nve4_grctx_init_gpc[]; > > extern struct nvc0_graph_mthd nvc0_grctx_init_mthd[]; > extern struct nvc0_graph_init nvc0_grctx_init_902d[]; > extern struct nvc0_graph_init nvc0_grctx_init_9039[]; > extern struct nvc0_graph_init nvc0_grctx_init_90c0[]; > extern struct nvc0_graph_init nvc0_grctx_init_mthd_magic[]; > +extern struct nvc0_graph_init nve4_grctx_init_a097[]; > > void nvc1_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *); > void nvc1_grctx_generate_unkn(struct nvc0_graph_priv *); > @@ -277,6 +284,8 @@ extern struct nvc0_graph_init nvf0_grctx_init_unk60xx[]; > > extern struct nouveau_oclass *nv108_grctx_oclass; > > +extern struct nvc0_graph_init *nve4_graph_init_mmio[]; > + > #define mmio_data(s,a,p) do { \ > info->buffer[info->buffer_nr] = round_up(info->addr, (a)); \ > info->addr = info->buffer[info->buffer_nr++] + (s); \ > diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c > index 05ec09c88517..442857c5c120 100644 > --- a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c > +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c > @@ -298,7 +298,7 @@ nve4_graph_init(struct nouveau_object *object) > return nvc0_graph_init_ctxctl(priv); > } > > -static struct nvc0_graph_init * > +struct nvc0_graph_init * > nve4_graph_init_mmio[] = { > nve4_graph_init_regs, > nvc0_graph_init_unk40xx, > diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c > new file mode 100644 > index 000000000000..d5e6a1adcacb > --- /dev/null > +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c > @@ -0,0 +1,75 @@ > +/* > + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > + * DEALINGS IN THE SOFTWARE. > + */ > + > +#include "nvc0.h" > + > +static struct nouveau_oclass > +nvea_graph_sclass[] = { > + { 0x902d, &nouveau_object_ofuncs }, > + { 0xa040, &nouveau_object_ofuncs }, > + { 0xa297, &nouveau_object_ofuncs }, > + { 0xa0c0, &nouveau_object_ofuncs }, > + {} > +}; > + > +static struct nvc0_graph_mthd > +nvea_grctx_init_mthd[] = { > + { 0xa297, nve4_grctx_init_a097, }, > + { 0x902d, nvc0_grctx_init_902d, }, > + { 0x902d, nvc0_grctx_init_mthd_magic, }, > + {} > +}; > + > +struct nouveau_oclass * > +nvea_grctx_oclass = &(struct nvc0_grctx_oclass) { > + .base.handle = NV_ENGCTX(GR, 0xea), > + .base.ofuncs = &(struct nouveau_ofuncs) { > + .ctor = nvc0_graph_context_ctor, > + .dtor = nvc0_graph_context_dtor, > + .init = _nouveau_graph_context_init, > + .fini = _nouveau_graph_context_fini, > + .rd32 = _nouveau_graph_context_rd32, > + .wr32 = _nouveau_graph_context_wr32, > + }, > + .main = nve4_grctx_generate_main, > + .mods = nve4_grctx_generate_mods, > + .unkn = nve4_grctx_generate_unkn, > + .hub = nve4_grctx_init_hub, > + .gpc = nve4_grctx_init_gpc, > + .icmd = nve4_grctx_init_icmd, > + .mthd = nvea_grctx_init_mthd, > +}.base; > + > + > +struct nouveau_oclass * > +nvea_graph_oclass = &(struct nvc0_graph_oclass) { > + .base.handle = NV_ENGINE(GR, 0xea), > + .base.ofuncs = &(struct nouveau_ofuncs) { > + .ctor = nvc0_graph_ctor, > + .dtor = nvc0_graph_dtor, > + .init = nve4_graph_init, > + .fini = _nouveau_graph_fini, > + }, > + .cclass = &nvea_grctx_oclass, > + .sclass = nvea_graph_sclass, > + .mmio = nve4_graph_init_mmio, > +}.base; > diff --git a/drivers/gpu/drm/nouveau/core/include/engine/graph.h b/drivers/gpu/drm/nouveau/core/include/engine/graph.h > index 97705618de97..632d7e172172 100644 > --- a/drivers/gpu/drm/nouveau/core/include/engine/graph.h > +++ b/drivers/gpu/drm/nouveau/core/include/engine/graph.h > @@ -68,6 +68,7 @@ extern struct nouveau_oclass *nvc8_graph_oclass; > extern struct nouveau_oclass *nvd7_graph_oclass; > extern struct nouveau_oclass *nvd9_graph_oclass; > extern struct nouveau_oclass *nve4_graph_oclass; > +extern struct nouveau_oclass *nvea_graph_oclass; > extern struct nouveau_oclass *nvf0_graph_oclass; > extern struct nouveau_oclass *nv108_graph_oclass; > > -- > 1.9.1 > > _______________________________________________ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
Ben Skeggs
2014-Mar-26 04:28 UTC
[Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A
On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot at nvidia.com> wrote:> Set the correct subdev/engine classes when GK20A (0xea) is probed. > > Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> > --- > drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c > index 987edbc30a09..8509dd57de1f 100644 > --- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c > +++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c > @@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device) > device->oclass[NVDEV_ENGINE_PPP ] = &nvc0_ppp_oclass; > device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass; > break; > + case 0xea: > + device->cname = "GK20A"; > + device->oclass[NVDEV_SUBDEV_MC ] = nvc3_mc_oclass; > + device->oclass[NVDEV_SUBDEV_BUS ] = nvc0_bus_oclass; > + device->oclass[NVDEV_SUBDEV_TIMER ] = &nv04_timer_oclass;As per note on the PTIMER patch, can just switch this to "gk20a_timer_oclass" on the latest code.> + device->oclass[NVDEV_SUBDEV_FB ] = nvea_fb_oclass; > + device->oclass[NVDEV_SUBDEV_IBUS ] = &nvea_ibus_oclass; > + device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass; > + device->oclass[NVDEV_SUBDEV_VM ] = &nvc0_vmmgr_oclass; > + device->oclass[NVDEV_SUBDEV_BAR ] = &nvc0_bar_oclass; > + device->oclass[NVDEV_ENGINE_DMAOBJ ] = &nvd0_dmaeng_oclass; > + device->oclass[NVDEV_ENGINE_FIFO ] = nvea_fifo_oclass; > + /* TODO will need an implementation for this at some point... */ > +#if 0 > + device->oclass[NVDEV_ENGINE_SW ] = nvc0_software_oclass; > +#endif > + device->oclass[NVDEV_ENGINE_GR ] = nvea_graph_oclass; > + device->oclass[NVDEV_ENGINE_COPY2 ] = &nve0_copy2_oclass; > + device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass; > + break; > case 0xf0: > device->cname = "GK110"; > device->oclass[NVDEV_SUBDEV_VBIOS ] = &nouveau_bios_oclass; > -- > 1.9.1 > > _______________________________________________ > Nouveau mailing list > Nouveau at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/nouveau
Alexandre Courbot
2014-Mar-26 06:33 UTC
[Nouveau] [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
Hi Lucas, On Mon, Mar 24, 2014 at 10:19 PM, Lucas Stach <l.stach at pengutronix.de> wrote:> Hi Alexandre, > > Am Montag, den 24.03.2014, 17:42 +0900 schrieb Alexandre Courbot: >> Hi everyone, > [...] >> >> A few lines of hacks (not included here) are still needed to deal with cached >> mappings triggering external aborts and CPU/GPU memory coherency issues, but I >> hope to understand and address these issues next. > > For the coherency issue part you may want to look at my Nouveau on ARM > series. Most of it never made it upstream, as I lacked the time to work > further on this, but it solves the coherency issue from the kernel.Oh, thanks for pointing this out, it will probably be most useful. Shall I assume the patches at https://www.mail-archive.com/nouveau at lists.freedesktop.org/msg13557.html are up-to-date? Would you mind if I include the relevant patches of yours in the next iteration of this series?> > It does so by doing the necessary manual cache flushes/invalidates on > buffer access, so costs some performance. To avoid this you really want > to get writecombined mappings into the kernel<->userspace interface. > Simply mapping the pushbuf as WC/US has brought a 7% performance > increase in OpenArena when I last tested this. This test was done with > only one PCIe lane, so the perf increase may be even better with a more > adequate interconnect.Interestingly if I allow writecombined mappings in the kernel I get faults when attempting the read the mapped area: [ 78.074854] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf003e010 ... [ 78.337862] [<c03491a8>] (nouveau_bo_rd32) from [<c0346374>] (nouveau_fence_update+0x5c/0x80) [ 78.352536] [<c0346374>] (nouveau_fence_update) from [<c03463b0>] (nouveau_fence_done+0x18/0x28) [ 78.367531] [<c03463b0>] (nouveau_fence_done) from [<c02b852c>] (ttm_bo_wait+0x104/0x184) [ 78.381915] [<c02b852c>] (ttm_bo_wait) from [<c034c718>] (nouveau_gem_ioctl_cpu_prep+0x40/0xe8) [ 78.396849] [<c034c718>] (nouveau_gem_ioctl_cpu_prep) from [<c029fd5c>] (drm_ioctl+0x404/0x4b8) [ 78.411790] [<c029fd5c>] (drm_ioctl) from [<c0343960>] (nouveau_drm_ioctl+0x54/0x80) [ 78.425805] [<c0343960>] (nouveau_drm_ioctl) from [<c00ea5ec>] (do_vfs_ioctl+0x3f0/0x5bc) [ 78.440277] [<c00ea5ec>] (do_vfs_ioctl) from [<c00ea7ec>] (SyS_ioctl+0x34/0x5c) [ 78.453918] [<c00ea7ec>] (SyS_ioctl) from [<c000e5a0>] (ret_fast_syscall+0x0/0x30) To avoid these I need to set the VRAM default_caching to TTM_PL_FLAG_UNCACHED. It is not clear to me why this is needed. The BO being accessed through the BAR, they are correctly considered as IO memory and mapped using ttm_bo_ioremap(), so it really seems to be unhappy with the WC mapping itself. Note that if I go ahead and force the use of pgprot_writecombine() in ttm_io_prot() to get writecombined user-space mappings, pure DRM programs that map a buffer and try to read it fail similarly, while Mesa's glReadPixels() seems to be happy. I'm not sure what it does differently here. Cheers, Alex.
Possibly Parallel Threads
- [PATCH v2 00/10] drm/nouveau: support for GK20A, cont'd
- [PATCH v4 0/9] drm/nouveau: support for GK20A, cont'd
- [PATCH v3 0/9] drm/nouveau: support for GK20A, cont'd
- [RFC 00/16] drm/nouveau: initial support for GK20A (Tegra K1)
- [RFC 14/16] drm/nouveau/fb: add GK20A support