jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau
From: Jérôme Glisse <jglisse at redhat.com>

(mm is cc'ed just to allow exposure of the device driver work without cc'ing a long list of people. I do not think there is anything useful to discuss from the mm point of view, but I might be wrong, so just for the curious :))

git://people.freedesktop.org/~glisse/linux branch: nouveau-hmm-v00
https://cgit.freedesktop.org/~glisse/linux/log/?h=nouveau-hmm-v00

This patchset adds SVM (Shared Virtual Memory) using HMM (Heterogeneous Memory Management) to the nouveau driver. SVM means that GPU threads spawned by the GPU driver on behalf of a user process can access any valid CPU address in that process. A valid pointer is one that points into an area coming from an mmap of private, shared, or regular file-backed memory; pointers into mmaps of device files or special files are not supported.

This is an RFC for the technical reasons listed below, and also because we are still working on a proper open-source userspace (namely an OpenCL 2.0 implementation for nouveau inside mesa), open-source userspace being a requirement for the DRM subsystem. I pushed in [1] a simple standalone program that can be used to test SVM through HMM with nouveau. I expect we will have a somewhat working userspace in the coming weeks; work is well underway, and some patches have already been posted on the mesa mailing list.

There are two aspects that need to be sorted out before this can be considered ready.

First, we need to decide how to update the GPU page table from HMM. In this patchset I added new methods to the vmm that allow the GPU page table to be updated without an nvkm_memory or nvkm_vma object (see patches 7 and 8, "special mapping method for HMM"). The new method just takes an array of pages and flags, which allows system memory and device private memory to be interleaved.

The second aspect is how to create an HMM-enabled channel. "Channel" is the term used for an NVidia GPU command queue; each process using nouveau has at least one channel and can have several. Channels are not created by the process directly but rather by the device driver backend of a common library like OpenGL, OpenCL, or Vulkan. There is work underway to revamp nouveau channel creation with a new userspace API, so we might want to delay upstreaming until that lands.

One aspect specific to HMM that we can still discuss here is the issue around GEM objects used by some specific parts of the GPU. Some engines inside the GPU (an engine is a GPU block, like the display block, which is responsible for scanning out memory through a connector such as HDMI or DisplayPort) can only access memory at virtual addresses below (1 << 40). To accommodate them we need to create a "hole" inside the process address space. This patchset has a hack for that (patch 13, "HACK FOR HMM AREA"): it reserves a range of device file offsets so that a process can mmap this range with PROT_NONE to create the hole (the process must make sure the hole lands below 1 << 40). I feel uneasy about doing it this way, but maybe it is acceptable to other folks.

Note that this patchset does not show usage of device private memory, as that depends on other architectural changes to nouveau. However, it is very easy to add with some gross hacks, so if people would like to see it I can post an RFC for that too. As a preview, it only adds two new ioctls that allow userspace to ask for migration of a range of virtual addresses; the expectation is that the userspace library knows better where to place things, and the kernel tries to satisfy the request (with no guarantee; it is best effort).

As usual, comments and questions are welcome.
Cheers,
Jérôme Glisse

[1] https://cgit.freedesktop.org/~glisse/moche

Ben Skeggs (4):
  drm/nouveau/core: define engine for handling replayable faults
  drm/nouveau/mmu/gp100: allow gcc/tex to generate replayable faults
  drm/nouveau/mc/gp100-: handle replayable fault interrupt
  drm/nouveau/fault/gp100: initial implementation of MaxwellFaultBufferA

Jérôme Glisse (9):
  drm/nouveau/vmm: enable page table iterator over non populated range
  drm/nouveau/core/memory: add some useful accessor macros
  drm/nouveau: special mapping method for HMM
  drm/nouveau: special mapping method for HMM (user interface)
  drm/nouveau: add SVM through HMM support to nouveau client
  drm/nouveau: add HMM area creation
  drm/nouveau: add HMM area creation user interface
  drm/nouveau: HMM area creation helpers for nouveau client
  drm/nouveau: HACK FOR HMM AREA

 drivers/gpu/drm/nouveau/Kbuild                     |   3 +
 drivers/gpu/drm/nouveau/include/nvif/class.h       |   2 +
 drivers/gpu/drm/nouveau/include/nvif/clb069.h      |   8 +
 drivers/gpu/drm/nouveau/include/nvif/if000c.h      |  26 ++
 drivers/gpu/drm/nouveau/include/nvif/vmm.h         |   4 +
 drivers/gpu/drm/nouveau/include/nvkm/core/device.h |   3 +
 drivers/gpu/drm/nouveau/include/nvkm/core/memory.h |   8 +
 .../gpu/drm/nouveau/include/nvkm/engine/fault.h    |   5 +
 drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h  |  10 +
 drivers/gpu/drm/nouveau/nouveau_drm.c              |   5 +
 drivers/gpu/drm/nouveau/nouveau_drv.h              |   3 +
 drivers/gpu/drm/nouveau/nouveau_hmm.c              | 367 +++++++++++++++++++++
 drivers/gpu/drm/nouveau/nouveau_hmm.h              |  64 ++++
 drivers/gpu/drm/nouveau/nouveau_ttm.c              |   9 +-
 drivers/gpu/drm/nouveau/nouveau_vmm.c              |  83 +++++
 drivers/gpu/drm/nouveau/nouveau_vmm.h              |  12 +
 drivers/gpu/drm/nouveau/nvif/vmm.c                 |  80 +++++
 drivers/gpu/drm/nouveau/nvkm/core/subdev.c         |   1 +
 drivers/gpu/drm/nouveau/nvkm/engine/Kbuild         |   1 +
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c  |   8 +
 drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h  |   1 +
 drivers/gpu/drm/nouveau/nvkm/engine/device/user.c  |   1 +
 drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild   |   4 +
 drivers/gpu/drm/nouveau/nvkm/engine/fault/base.c   | 116 +++++++
 drivers/gpu/drm/nouveau/nvkm/engine/fault/gp100.c  |  61 ++++
 drivers/gpu/drm/nouveau/nvkm/engine/fault/priv.h   |  29 ++
 drivers/gpu/drm/nouveau/nvkm/engine/fault/user.c   | 136 ++++++++
 drivers/gpu/drm/nouveau/nvkm/engine/fault/user.h   |   7 +
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp100.c     |  20 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp10b.c     |   2 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/priv.h      |   2 +
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c     |  88 ++++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c      | 241 ++++++++++++--
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h      |   8 +
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c |  78 ++++-
 35 files changed, 1463 insertions(+), 33 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/include/nvif/clb069.h
 create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h
 create mode 100644 drivers/gpu/drm/nouveau/nouveau_hmm.c
 create mode 100644 drivers/gpu/drm/nouveau/nouveau_hmm.h
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/base.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/gp100.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/priv.h
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/user.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/user.h
--
2.14.3
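[Editor's note: to make the patch-13 hack above concrete, here is a minimal userspace sketch of carving out the sub-(1 << 40) hole. The device path, the reserved mmap offset, and the hole size are assumptions for illustration only; the real reserved offset range is whatever the kernel side of patch 13 picks.]

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical values; the real reserved device-file offset comes from
 * the kernel side of patch 13. */
#define HMM_AREA_OFFSET	(1ULL << 40)	/* assumed reserved mmap offset */
#define HOLE_SIZE	(1ULL << 30)	/* 1GiB hole for GEM objects */

int main(void)
{
	uintptr_t hint = 1UL << 39;	/* ask for an address below 1 << 40 */
	void *hole;
	int fd;

	fd = open("/dev/dri/renderD128", O_RDWR);
	if (fd < 0)
		return 1;

	/* A PROT_NONE mapping of the reserved offset range punches a hole
	 * in the process address space where GEM objects can later live. */
	hole = mmap((void *)hint, HOLE_SIZE, PROT_NONE, MAP_SHARED,
		    fd, HMM_AREA_OFFSET);
	if (hole == MAP_FAILED)
		return 1;

	/* The process must verify the hole really landed below 1 << 40. */
	if ((uintptr_t)hole + HOLE_SIZE > (1ULL << 40))
		fprintf(stderr, "hole not below 1 << 40\n");

	munmap(hole, HOLE_SIZE);
	close(fd);
	return 0;
}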
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 01/13] drm/nouveau/vmm: enable page table iterator over non populated range
From: Jérôme Glisse <jglisse at redhat.com>

This patch modifies the page table iterator to support empty ranges when unmapping a range (i.e. when it is not trying to populate the range).

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
Cc: Ben Skeggs <bskeggs at redhat.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 75 ++++++++++++++++++---------
 1 file changed, 51 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 93946dcee319..20d31526ba8f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -75,6 +75,7 @@ struct nvkm_vmm_iter { struct nvkm_vmm *vmm; u64 cnt; u16 max, lvl; + u64 start, addr; u32 pte[NVKM_VMM_LEVELS_MAX]; struct nvkm_vmm_pt *pt[NVKM_VMM_LEVELS_MAX]; int flush;
@@ -485,6 +486,23 @@ nvkm_vmm_ref_swpt(struct nvkm_vmm_iter *it, struct nvkm_vmm_pt *pgd, u32 pdei) return true; } +static inline u64 +nvkm_vmm_iter_addr(const struct nvkm_vmm_iter *it, + const struct nvkm_vmm_desc *desc) +{ + int max = it->max; + u64 addr; + + /* Reconstruct address */ + addr = it->pte[max--]; + do { + addr = addr << desc[max].bits; + addr |= it->pte[max]; + } while (max--); + + return addr; +} + static inline u64 nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page, u64 addr, u64 size, const char *name, bool ref,
@@ -494,21 +512,23 @@ nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page, { const struct nvkm_vmm_desc *desc = page->desc; struct nvkm_vmm_iter it; - u64 bits = addr >> page->shift; + u64 addr_bits = addr >> page->shift; it.page = page; it.desc = desc; it.vmm = vmm; it.cnt = size >> page->shift; it.flush = NVKM_VMM_LEVELS_MAX; + it.start = it.addr = addr; /* Deconstruct address into PTE indices for each mapping level. */ for (it.lvl = 0; desc[it.lvl].bits; it.lvl++) { - it.pte[it.lvl] = bits & ((1 << desc[it.lvl].bits) - 1); - bits >>= desc[it.lvl].bits; + it.pte[it.lvl] = addr_bits & ((1 << desc[it.lvl].bits) - 1); + addr_bits >>= desc[it.lvl].bits; } it.max = --it.lvl; it.pt[it.max] = vmm->pd; + addr_bits = addr >> page->shift; it.lvl = 0; TRA(&it, "%s: %016llx %016llx %d %lld PTEs", name,
@@ -521,7 +541,8 @@ nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page, const int type = desc->type == SPT; const u32 pten = 1 << desc->bits; const u32 ptei = it.pte[0]; - const u32 ptes = min_t(u64, it.cnt, pten - ptei); + u32 ptes = min_t(u64, it.cnt, pten - ptei); + u64 tmp; /* Walk down the tree, finding page tables for each level. */ for (; it.lvl; it.lvl--) {
@@ -529,9 +550,14 @@ nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page, struct nvkm_vmm_pt *pgd = pgt; /* Software PT. */ - if (ref && NVKM_VMM_PDE_INVALID(pgd->pde[pdei])) { - if (!nvkm_vmm_ref_swpt(&it, pgd, pdei)) - goto fail; + if (NVKM_VMM_PDE_INVALID(pgd->pde[pdei])) { + if (ref) { + if (!nvkm_vmm_ref_swpt(&it, pgd, pdei)) + goto fail; + } else { + it.pte[it.lvl] += 1; + goto next; + } } it.pt[it.lvl - 1] = pgt = pgd->pde[pdei];
@@ -545,9 +571,16 @@ nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page, if (!nvkm_vmm_ref_hwpt(&it, pgd, pdei)) goto fail; } + + /* With HMM we might walk down un-populated range */ + if (!pgt) { + it.pte[it.lvl] += 1; + goto next; + } } /* Handle PTE updates.
*/ + it.addr = nvkm_vmm_iter_addr(&it, desc) << PAGE_SHIFT; if (!REF_PTES || REF_PTES(&it, ptei, ptes)) { struct nvkm_mmu_pt *pt = pgt->pt[type]; if (MAP_PTES || CLR_PTES) { @@ -558,32 +591,26 @@ nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page, nvkm_vmm_flush_mark(&it); } } + it.pte[it.lvl] += ptes; +next: /* Walk back up the tree to the next position. */ - it.pte[it.lvl] += ptes; - it.cnt -= ptes; - if (it.cnt) { - while (it.pte[it.lvl] == (1 << desc[it.lvl].bits)) { - it.pte[it.lvl++] = 0; - it.pte[it.lvl]++; - } + while (it.pte[it.lvl] == (1 << desc[it.lvl].bits)) { + it.pte[it.lvl++] = 0; + if (it.lvl == it.max) + break; + it.pte[it.lvl]++; } + tmp = nvkm_vmm_iter_addr(&it, desc); + it.cnt -= min_t(u64, it.cnt, tmp - addr_bits); + addr_bits = tmp; }; nvkm_vmm_flush(&it); return ~0ULL; fail: - /* Reconstruct the failure address so the caller is able to - * reverse any partially completed operations. - */ - addr = it.pte[it.max--]; - do { - addr = addr << desc[it.max].bits; - addr |= it.pte[it.max]; - } while (it.max--); - - return addr << page->shift; + return addr_bits << page->shift; } static void -- 2.14.3
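[Editor's note: to make the new address bookkeeping easier to follow, nvkm_vmm_iter_addr() above rebuilds a virtual address from the per-level PTE indices, with each level contributing desc[lvl].bits index bits and level 0 least significant. A standalone sketch of the same computation; the level widths here are made up for illustration.]

#include <stdio.h>

struct level { unsigned bits; };

/* Mirrors nvkm_vmm_iter_addr(): fold the per-level indices back into
 * one address, most-significant level first. */
static unsigned long long
iter_addr(const struct level *desc, const unsigned *pte, int max)
{
	unsigned long long addr = pte[max--];
	do {
		addr = (addr << desc[max].bits) | pte[max];
	} while (max--);
	return addr;
}

int main(void)
{
	/* An invented 4-level layout: 2 + 9 + 8 + 8 index bits. */
	const struct level desc[] = { { 8 }, { 8 }, { 9 }, { 2 } };
	const unsigned pte[] = { 0x12, 0x34, 0x1ff, 0x1 };

	/* Shift by the page shift (12 for 4KiB pages) to get bytes. */
	printf("%#llx\n", iter_addr(desc, pte, 3) << 12);
	return 0;
}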
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 02/13] drm/nouveau/core/memory: add some useful accessor macros
From: Jérôme Glisse <jglisse at redhat.com>

Add support for 64-bit reads.

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
---
 drivers/gpu/drm/nouveau/include/nvkm/core/memory.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h b/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h
index 05f505de0075..d1a886c4d2d9 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h
@@ -82,6 +82,14 @@ void nvkm_memory_tags_put(struct nvkm_memory *, struct nvkm_device *,
 	nvkm_wo32((o), __a + 4, upper_32_bits(__d)); \
 } while(0)
 
+#define nvkm_ro64(o,a) ({ \
+	u64 _data; \
+	_data = nvkm_ro32((o), (a) + 4); \
+	_data = _data << 32; \
+	_data |= nvkm_ro32((o), (a) + 0); \
+	_data; \
+})
+
 #define nvkm_fill(t,s,o,a,d,c) do { \
 	u64 _a = (a), _c = (c), _d = (d), _o = _a >> s, _s = _c << s; \
 	u##t __iomem *_m = nvkm_kmap(o); \
--
2.14.3
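[Editor's note: the new macro simply composes a 64-bit value from two 32-bit reads, high word at offset a + 4, mirroring the existing nvkm_wo64() write side visible in the context lines. A hypothetical plain-function sketch of the same thing, with rd32() standing in for nvkm_ro32():]

/* Standalone equivalent of nvkm_ro64(); rd32() is a stand-in accessor. */
static inline unsigned long long
ro64(void *o, unsigned long long a)
{
	unsigned long long data;

	data = rd32(o, a + 4);		/* upper 32 bits */
	data <<= 32;
	data |= rd32(o, a + 0);		/* lower 32 bits */
	return data;
}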
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 03/13] drm/nouveau/core: define engine for handling replayable faults
From: Ben Skeggs <bskeggs at redhat.com> Signed-off-by: Ben Skeggs <bskeggs at redhat.com> --- drivers/gpu/drm/nouveau/include/nvkm/core/device.h | 3 +++ drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h | 4 ++++ drivers/gpu/drm/nouveau/nvkm/core/subdev.c | 1 + drivers/gpu/drm/nouveau/nvkm/engine/Kbuild | 1 + drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 2 ++ drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h | 1 + drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild | 0 7 files changed, 12 insertions(+) create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h index 560265b15ec2..de3d2566ee4d 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h @@ -42,6 +42,7 @@ enum nvkm_devidx { NVKM_ENGINE_CIPHER, NVKM_ENGINE_DISP, NVKM_ENGINE_DMAOBJ, + NVKM_ENGINE_FAULT, NVKM_ENGINE_FIFO, NVKM_ENGINE_GR, NVKM_ENGINE_IFB, @@ -147,6 +148,7 @@ struct nvkm_device { struct nvkm_engine *cipher; struct nvkm_disp *disp; struct nvkm_dma *dma; + struct nvkm_engine *fault; struct nvkm_fifo *fifo; struct nvkm_gr *gr; struct nvkm_engine *ifb; @@ -218,6 +220,7 @@ struct nvkm_device_chip { int (*cipher )(struct nvkm_device *, int idx, struct nvkm_engine **); int (*disp )(struct nvkm_device *, int idx, struct nvkm_disp **); int (*dma )(struct nvkm_device *, int idx, struct nvkm_dma **); + int (*fault )(struct nvkm_device *, int idx, struct nvkm_engine **); int (*fifo )(struct nvkm_device *, int idx, struct nvkm_fifo **); int (*gr )(struct nvkm_device *, int idx, struct nvkm_gr **); int (*ifb )(struct nvkm_device *, int idx, struct nvkm_engine **); diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h new file mode 100644 index 000000000000..398ca5a02eee --- /dev/null +++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h @@ -0,0 +1,4 @@ +#ifndef __NVKM_FAULT_H__ +#define __NVKM_FAULT_H__ +#include <core/engine.h> +#endif diff --git a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c index a134d225f958..0d50b2206da2 100644 --- a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c +++ b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c @@ -63,6 +63,7 @@ nvkm_subdev_name[NVKM_SUBDEV_NR] = { [NVKM_ENGINE_CIPHER ] = "cipher", [NVKM_ENGINE_DISP ] = "disp", [NVKM_ENGINE_DMAOBJ ] = "dma", + [NVKM_ENGINE_FAULT ] = "fault", [NVKM_ENGINE_FIFO ] = "fifo", [NVKM_ENGINE_GR ] = "gr", [NVKM_ENGINE_IFB ] = "ifb", diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/Kbuild b/drivers/gpu/drm/nouveau/nvkm/engine/Kbuild index 78571e8b01c5..3aa90a6d5392 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/Kbuild +++ b/drivers/gpu/drm/nouveau/nvkm/engine/Kbuild @@ -7,6 +7,7 @@ include $(src)/nvkm/engine/cipher/Kbuild include $(src)/nvkm/engine/device/Kbuild include $(src)/nvkm/engine/disp/Kbuild include $(src)/nvkm/engine/dma/Kbuild +include $(src)/nvkm/engine/fault/Kbuild include $(src)/nvkm/engine/fifo/Kbuild include $(src)/nvkm/engine/gr/Kbuild include $(src)/nvkm/engine/mpeg/Kbuild diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index 05cd674326a6..2fe862ac0d95 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2466,6 +2466,7 @@ nvkm_device_engine(struct 
nvkm_device *device, int index) _(CIPHER , device->cipher , device->cipher); _(DISP , device->disp , &device->disp->engine); _(DMAOBJ , device->dma , &device->dma->engine); + _(FAULT , device->fault , device->fault); _(FIFO , device->fifo , &device->fifo->engine); _(GR , device->gr , &device->gr->engine); _(IFB , device->ifb , device->ifb); @@ -2919,6 +2920,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func, _(NVKM_ENGINE_CIPHER , cipher); _(NVKM_ENGINE_DISP , disp); _(NVKM_ENGINE_DMAOBJ , dma); + _(NVKM_ENGINE_FAULT , fault); _(NVKM_ENGINE_FIFO , fifo); _(NVKM_ENGINE_GR , gr); _(NVKM_ENGINE_IFB , ifb); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h index 08d0bf605722..3be45ac6e58d 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h @@ -32,6 +32,7 @@ #include <engine/cipher.h> #include <engine/disp.h> #include <engine/dma.h> +#include <engine/fault.h> #include <engine/fifo.h> #include <engine/gr.h> #include <engine/mpeg.h> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild b/drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild new file mode 100644 index 000000000000..e69de29bb2d1 -- 2.14.3
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 04/13] drm/nouveau/mmu/gp100: allow gcc/tex to generate replayable faults
From: Ben Skeggs <bskeggs at redhat.com>

Signed-off-by: Ben Skeggs <bskeggs at redhat.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
index 059fafe0e771..8752d9ce4af0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
@@ -315,7 +315,10 @@ gp100_vmm_flush(struct nvkm_vmm *vmm, int depth)
 int
 gp100_vmm_join(struct nvkm_vmm *vmm, struct nvkm_memory *inst)
 {
-	const u64 base = BIT_ULL(10) /* VER2 */ | BIT_ULL(11); /* 64KiB */
+	const u64 base = BIT_ULL(4) /* FAULT_REPLAY_TEX */ |
+			 BIT_ULL(5) /* FAULT_REPLAY_GCC */ |
+			 BIT_ULL(10) /* VER2 */ +
+			 BIT_ULL(11) /* 64KiB */;
 	return gf100_vmm_join_(vmm, inst, base);
 }
--
2.14.3
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 05/13] drm/nouveau/mc/gp100-: handle replayable fault interrupt
From: Ben Skeggs <bskeggs at redhat.com> Signed-off-by: Ben Skeggs <bskeggs at redhat.com> --- drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp100.c | 20 +++++++++++++++++++- drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp10b.c | 2 +- drivers/gpu/drm/nouveau/nvkm/subdev/mc/priv.h | 2 ++ 3 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp100.c index 7321ad3758c3..9ab5bfe1e588 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp100.c @@ -75,10 +75,28 @@ gp100_mc_intr_mask(struct nvkm_mc *base, u32 mask, u32 intr) spin_unlock_irqrestore(&mc->lock, flags); } +const struct nvkm_mc_map +gp100_mc_intr[] = { + { 0x04000000, NVKM_ENGINE_DISP }, + { 0x00000100, NVKM_ENGINE_FIFO }, + { 0x00000200, NVKM_ENGINE_FAULT }, + { 0x40000000, NVKM_SUBDEV_IBUS }, + { 0x10000000, NVKM_SUBDEV_BUS }, + { 0x08000000, NVKM_SUBDEV_FB }, + { 0x02000000, NVKM_SUBDEV_LTC }, + { 0x01000000, NVKM_SUBDEV_PMU }, + { 0x00200000, NVKM_SUBDEV_GPIO }, + { 0x00200000, NVKM_SUBDEV_I2C }, + { 0x00100000, NVKM_SUBDEV_TIMER }, + { 0x00040000, NVKM_SUBDEV_THERM }, + { 0x00002000, NVKM_SUBDEV_FB }, + {}, +}; + static const struct nvkm_mc_func gp100_mc = { .init = nv50_mc_init, - .intr = gk104_mc_intr, + .intr = gp100_mc_intr, .intr_unarm = gp100_mc_intr_unarm, .intr_rearm = gp100_mc_intr_rearm, .intr_mask = gp100_mc_intr_mask, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp10b.c index 2283e3b74277..ff8629de97d6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp10b.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp10b.c @@ -34,7 +34,7 @@ gp10b_mc_init(struct nvkm_mc *mc) static const struct nvkm_mc_func gp10b_mc = { .init = gp10b_mc_init, - .intr = gk104_mc_intr, + .intr = gp100_mc_intr, .intr_unarm = gp100_mc_intr_unarm, .intr_rearm = gp100_mc_intr_rearm, .intr_mask = gp100_mc_intr_mask, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/priv.h index 8869d79c2b59..d9e3691d45b7 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/priv.h @@ -57,4 +57,6 @@ int gp100_mc_new_(const struct nvkm_mc_func *, struct nvkm_device *, int, extern const struct nvkm_mc_map gk104_mc_intr[]; extern const struct nvkm_mc_map gk104_mc_reset[]; + +extern const struct nvkm_mc_map gp100_mc_intr[]; #endif -- 2.14.3
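[Editor's note: for context, nvkm dispatches top-level interrupts by walking a map like the gp100_mc_intr[] table above, where each entry pairs a status bit mask with the unit whose handler should run; with this patch, bit 9 (0x00000200) now reaches NVKM_ENGINE_FAULT. A simplified sketch of that consumption, not the actual nvkm_mc code; the stat/unit field names and the handle_unit() helper are assumptions.]

/* Route each pending bit in 'stat' to its unit. */
static void
mc_dispatch(u32 stat, const struct nvkm_mc_map *map)
{
	for (; map->stat; map++) {
		if (stat & map->stat) {
			handle_unit(map->unit);	/* stand-in for the real handler */
			stat &= ~map->stat;
		}
	}
}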
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 06/13] drm/nouveau/fault/gp100: initial implementation of MaxwellFaultBufferA
From: Ben Skeggs <bskeggs at redhat.com> Signed-off-by: Ben Skeggs <bskeggs at redhat.com> --- drivers/gpu/drm/nouveau/include/nvif/class.h | 2 + drivers/gpu/drm/nouveau/include/nvif/clb069.h | 8 ++ .../gpu/drm/nouveau/include/nvkm/engine/fault.h | 1 + drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 6 + drivers/gpu/drm/nouveau/nvkm/engine/device/user.c | 1 + drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild | 4 + drivers/gpu/drm/nouveau/nvkm/engine/fault/base.c | 116 ++++++++++++++++++ drivers/gpu/drm/nouveau/nvkm/engine/fault/gp100.c | 61 +++++++++ drivers/gpu/drm/nouveau/nvkm/engine/fault/priv.h | 29 +++++ drivers/gpu/drm/nouveau/nvkm/engine/fault/user.c | 136 +++++++++++++++++++++ drivers/gpu/drm/nouveau/nvkm/engine/fault/user.h | 7 ++ 11 files changed, 371 insertions(+) create mode 100644 drivers/gpu/drm/nouveau/include/nvif/clb069.h create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/base.c create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/gp100.c create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/priv.h create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/user.c create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fault/user.h diff --git a/drivers/gpu/drm/nouveau/include/nvif/class.h b/drivers/gpu/drm/nouveau/include/nvif/class.h index a7c5bf572788..98ac250670b7 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/class.h +++ b/drivers/gpu/drm/nouveau/include/nvif/class.h @@ -52,6 +52,8 @@ #define NV04_DISP /* cl0046.h */ 0x00000046 +#define MAXWELL_FAULT_BUFFER_A /* clb069.h */ 0x0000b069 + #define NV03_CHANNEL_DMA /* cl506b.h */ 0x0000006b #define NV10_CHANNEL_DMA /* cl506b.h */ 0x0000006e #define NV17_CHANNEL_DMA /* cl506b.h */ 0x0000176e diff --git a/drivers/gpu/drm/nouveau/include/nvif/clb069.h b/drivers/gpu/drm/nouveau/include/nvif/clb069.h new file mode 100644 index 000000000000..b0d509fd8631 --- /dev/null +++ b/drivers/gpu/drm/nouveau/include/nvif/clb069.h @@ -0,0 +1,8 @@ +#ifndef __NVIF_CLB069_H__ +#define __NVIF_CLB069_H__ + +struct nvb069_vn { +}; + +#define NVB069_VN_NTFY_FAULT 0x00 +#endif diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h index 398ca5a02eee..08893f13e2f9 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fault.h @@ -1,4 +1,5 @@ #ifndef __NVKM_FAULT_H__ #define __NVKM_FAULT_H__ #include <core/engine.h> +int gp100_fault_new(struct nvkm_device *, int, struct nvkm_engine **); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index 2fe862ac0d95..ee67caf95a4e 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2184,6 +2184,7 @@ nv130_chipset = { .ce[5] = gp100_ce_new, .dma = gf119_dma_new, .disp = gp100_disp_new, + .fault = gp100_fault_new, .fifo = gp100_fifo_new, .gr = gp100_gr_new, .sw = gf100_sw_new, @@ -2217,6 +2218,7 @@ nv132_chipset = { .ce[3] = gp102_ce_new, .disp = gp102_disp_new, .dma = gf119_dma_new, + .fault = gp100_fault_new, .fifo = gp100_fifo_new, .gr = gp102_gr_new, .nvdec = gp102_nvdec_new, @@ -2252,6 +2254,7 @@ nv134_chipset = { .ce[3] = gp102_ce_new, .disp = gp102_disp_new, .dma = gf119_dma_new, + .fault = gp100_fault_new, .fifo = gp100_fifo_new, .gr = gp102_gr_new, .nvdec = gp102_nvdec_new, @@ -2287,6 +2290,7 @@ nv136_chipset = { .ce[3] = gp102_ce_new, .disp = gp102_disp_new, .dma = gf119_dma_new, + .fault = gp100_fault_new, .fifo 
= gp100_fifo_new, .gr = gp102_gr_new, .nvdec = gp102_nvdec_new, @@ -2322,6 +2326,7 @@ nv137_chipset = { .ce[3] = gp102_ce_new, .disp = gp102_disp_new, .dma = gf119_dma_new, + .fault = gp100_fault_new, .fifo = gp100_fifo_new, .gr = gp107_gr_new, .nvdec = gp102_nvdec_new, @@ -2382,6 +2387,7 @@ nv13b_chipset = { .top = gk104_top_new, .ce[2] = gp102_ce_new, .dma = gf119_dma_new, + .fault = gp100_fault_new, .fifo = gp10b_fifo_new, .gr = gp10b_gr_new, .sw = gf100_sw_new, diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c index 17adcb4e8854..5eee439f615c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c @@ -276,6 +276,7 @@ nvkm_udevice_child_get(struct nvkm_object *object, int index, struct nvkm_device *device = udev->device; struct nvkm_engine *engine; u64 mask = (1ULL << NVKM_ENGINE_DMAOBJ) | + (1ULL << NVKM_ENGINE_FAULT) | (1ULL << NVKM_ENGINE_FIFO) | (1ULL << NVKM_ENGINE_DISP) | (1ULL << NVKM_ENGINE_PM); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild b/drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild index e69de29bb2d1..627d74eaba1d 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fault/Kbuild @@ -0,0 +1,4 @@ +nvkm-y += nvkm/engine/fault/base.o +nvkm-y += nvkm/engine/fault/gp100.o + +nvkm-y += nvkm/engine/fault/user.o diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fault/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/fault/base.c new file mode 100644 index 000000000000..a970012e84c8 --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fault/base.c @@ -0,0 +1,116 @@ +/* + * Copyright 2017 Red Hat Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ +#include "priv.h" +#include "user.h" + +#include <core/client.h> +#include <core/notify.h> + +static int +nvkm_fault_ntfy_ctor(struct nvkm_object *object, void *data, u32 size, + struct nvkm_notify *notify) +{ + if (size == 0) { + notify->size = 0; + notify->types = 1; + notify->index = 0; + return 0; + } + return -ENOSYS; +} + +static const struct nvkm_event_func +nvkm_fault_ntfy = { + .ctor = nvkm_fault_ntfy_ctor, +}; + +static int +nvkm_fault_class_new(struct nvkm_device *device, + const struct nvkm_oclass *oclass, void *data, u32 size, + struct nvkm_object **pobject) +{ + struct nvkm_fault *fault = nvkm_fault(device->fault); + if (!oclass->client->super) + return -EACCES; + return nvkm_ufault_new(fault, oclass, data, size, pobject); +} + +static const struct nvkm_device_oclass +nvkm_fault_class = { + .ctor = nvkm_fault_class_new, +}; + +static int +nvkm_fault_class_get(struct nvkm_oclass *oclass, int index, + const struct nvkm_device_oclass **class) +{ + struct nvkm_fault *fault = nvkm_fault(oclass->engine); + if (index == 0) { + oclass->base.oclass = fault->func->oclass; + oclass->base.minver = -1; + oclass->base.maxver = -1; + *class = &nvkm_fault_class; + } + return 1; +} + +static void +nvkm_fault_intr(struct nvkm_engine *engine) +{ + struct nvkm_fault *fault = nvkm_fault(engine); + nvkm_event_send(&fault->event, 1, 0, NULL, 0); +} + +static void * +nvkm_fault_dtor(struct nvkm_engine *engine) +{ + struct nvkm_fault *fault = nvkm_fault(engine); + nvkm_event_fini(&fault->event); + return fault; +} + +static const struct nvkm_engine_func +nvkm_fault = { + .dtor = nvkm_fault_dtor, + .intr = nvkm_fault_intr, + .base.sclass = nvkm_fault_class_get, +}; + +int +nvkm_fault_new_(const struct nvkm_fault_func *func, struct nvkm_device *device, + int index, struct nvkm_engine **pengine) +{ + struct nvkm_fault *fault; + int ret; + + if (!(fault = kzalloc(sizeof(*fault), GFP_KERNEL))) + return -ENOMEM; + *pengine = &fault->engine; + fault->func = func; + + ret = nvkm_engine_ctor(&nvkm_fault, device, index, true, + &fault->engine); + if (ret) + return ret; + + return nvkm_event_init(&nvkm_fault_ntfy, 1, 1, &fault->event); +} diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fault/gp100.c b/drivers/gpu/drm/nouveau/nvkm/engine/fault/gp100.c new file mode 100644 index 000000000000..4120bc043a3d --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fault/gp100.c @@ -0,0 +1,61 @@ +/* + * Copyright 2017 Red Hat Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ +#include "priv.h" + +#include <nvif/class.h> + +static void +gp100_fault_fini(struct nvkm_fault *fault) +{ + struct nvkm_device *device = fault->engine.subdev.device; + nvkm_mask(device, 0x002a70, 0x00000001, 0x00000000); +} + +static void +gp100_fault_init(struct nvkm_fault *fault) +{ + struct nvkm_device *device = fault->engine.subdev.device; + nvkm_wr32(device, 0x002a74, upper_32_bits(fault->vma->addr)); + nvkm_wr32(device, 0x002a70, lower_32_bits(fault->vma->addr)); + nvkm_mask(device, 0x002a70, 0x00000001, 0x00000001); +} + +static u32 +gp100_fault_size(struct nvkm_fault *fault) +{ + return nvkm_rd32(fault->engine.subdev.device, 0x002a78) * 32; +} + +static const struct nvkm_fault_func +gp100_fault = { + .size = gp100_fault_size, + .init = gp100_fault_init, + .fini = gp100_fault_fini, + .oclass = MAXWELL_FAULT_BUFFER_A, +}; + +int +gp100_fault_new(struct nvkm_device *device, int index, + struct nvkm_engine **pengine) +{ + return nvkm_fault_new_(&gp100_fault, device, index, pengine); +} diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fault/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/fault/priv.h new file mode 100644 index 000000000000..5e3e6366b0fb --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fault/priv.h @@ -0,0 +1,29 @@ +#ifndef __NVKM_FAULT_PRIV_H__ +#define __NVKM_FAULT_PRIV_H__ +#define nvkm_fault(p) container_of((p), struct nvkm_fault, engine) +#include <engine/fault.h> + +#include <core/event.h> +#include <subdev/mmu.h> + +struct nvkm_fault { + const struct nvkm_fault_func *func; + struct nvkm_engine engine; + + struct nvkm_event event; + + struct nvkm_object *user; + struct nvkm_memory *mem; + struct nvkm_vma *vma; +}; + +struct nvkm_fault_func { + u32 (*size)(struct nvkm_fault *); + void (*init)(struct nvkm_fault *); + void (*fini)(struct nvkm_fault *); + s32 oclass; +}; + +int nvkm_fault_new_(const struct nvkm_fault_func *, struct nvkm_device *, + int, struct nvkm_engine **); +#endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fault/user.c b/drivers/gpu/drm/nouveau/nvkm/engine/fault/user.c new file mode 100644 index 000000000000..5cc1c4b989bb --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fault/user.c @@ -0,0 +1,136 @@ +/* + * Copyright 2017 Red Hat Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ +#include "user.h" + +#include <core/object.h> +#include <core/memory.h> +#include <subdev/bar.h> + +#include <nvif/clb069.h> +#include <nvif/unpack.h> + +static int +nvkm_ufault_map(struct nvkm_object *object, void *argv, u32 argc, + enum nvkm_object_map *type, u64 *addr, u64 *size) +{ + struct nvkm_fault *fault = nvkm_fault(object->engine); + struct nvkm_device *device = fault->engine.subdev.device; + *type = NVKM_OBJECT_MAP_IO; + *addr = device->func->resource_addr(device, 3) + fault->vma->addr; + *size = nvkm_memory_size(fault->mem); + return 0; +} + +static int +nvkm_ufault_ntfy(struct nvkm_object *object, u32 type, + struct nvkm_event **pevent) +{ + struct nvkm_fault *fault = nvkm_fault(object->engine); + if (type == NVB069_VN_NTFY_FAULT) { + *pevent = &fault->event; + return 0; + } + return -EINVAL; +} + +static int +nvkm_ufault_fini(struct nvkm_object *object, bool suspend) +{ + struct nvkm_fault *fault = nvkm_fault(object->engine); + fault->func->fini(fault); + return 0; +} + +static int +nvkm_ufault_init(struct nvkm_object *object) +{ + struct nvkm_fault *fault = nvkm_fault(object->engine); + fault->func->init(fault); + return 0; +} + +static void * +nvkm_ufault_dtor(struct nvkm_object *object) +{ + struct nvkm_fault *fault = nvkm_fault(object->engine); + struct nvkm_vmm *bar2 = nvkm_bar_bar2_vmm(fault->engine.subdev.device); + + mutex_lock(&fault->engine.subdev.mutex); + if (fault->user == object) + fault->user = NULL; + mutex_unlock(&fault->engine.subdev.mutex); + + nvkm_vmm_put(bar2, &fault->vma); + nvkm_memory_unref(&fault->mem); + return object; +} + +static const struct nvkm_object_func +nvkm_ufault = { + .dtor = nvkm_ufault_dtor, + .init = nvkm_ufault_init, + .fini = nvkm_ufault_fini, + .ntfy = nvkm_ufault_ntfy, + .map = nvkm_ufault_map, +}; + +int +nvkm_ufault_new(struct nvkm_fault *fault, const struct nvkm_oclass *oclass, + void *argv, u32 argc, struct nvkm_object **pobject) +{ + union { + struct nvb069_vn vn; + } *args = argv; + struct nvkm_subdev *subdev = &fault->engine.subdev; + struct nvkm_device *device = subdev->device; + struct nvkm_vmm *bar2 = nvkm_bar_bar2_vmm(device); + u32 size = fault->func->size(fault); + int ret = -ENOSYS; + + if ((ret = nvif_unvers(ret, &argv, &argc, args->vn))) + return ret; + + ret = nvkm_object_new_(&nvkm_ufault, oclass, NULL, 0, pobject); + if (ret) + return ret; + + ret = nvkm_memory_new(device, NVKM_MEM_TARGET_INST, size, + 0x1000, false, &fault->mem); + if (ret) + return ret; + + ret = nvkm_vmm_get(bar2, 12, nvkm_memory_size(fault->mem), &fault->vma); + if (ret) + return ret; + + ret = nvkm_memory_map(fault->mem, 0, bar2, fault->vma, NULL, 0); + if (ret) + return ret; + + mutex_lock(&subdev->mutex); + if (!fault->user) + fault->user = *pobject; + else + ret = -EBUSY; + mutex_unlock(&subdev->mutex); + return 0; +} diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fault/user.h b/drivers/gpu/drm/nouveau/nvkm/engine/fault/user.h new file mode 100644 index 000000000000..70c03bbbc0b2 --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fault/user.h @@ -0,0 +1,7 @@ +#ifndef __NVKM_FAULT_USER_H__ +#define __NVKM_FAULT_USER_H__ +#include "priv.h" + +int nvkm_ufault_new(struct nvkm_fault *, const struct nvkm_oclass *, + void *, u32, struct nvkm_object **); +#endif -- 2.14.3
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 07/13] drm/nouveau: special mapping method for HMM
From: Jérôme Glisse <jglisse at redhat.com>

HMM does not have any of the usual memory object properties. For HMM, inside any given range the following holds:
  - not all pages in the range are valid
  - not all pages have the same permission (read-only vs. read and write)
  - not all pages are in the same memory (system memory, GPU memory)

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
Cc: Ben Skeggs <bskeggs at redhat.com>
---
 drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h  |  21 +++++
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c      | 105 ++++++++++++++++++++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h      |   6 ++
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c |  73 ++++++++++++++
 4 files changed, 204 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
index baab93398e54..719d50e6296f 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
@@ -2,6 +2,21 @@ #ifndef __NVKM_MMU_H__ #define __NVKM_MMU_H__ #include <core/subdev.h> +#include <linux/hmm.h> + +/* Need to change HMM to be more driver friendly */ +#if IS_ENABLED(CONFIG_HMM) +#else +typedef unsigned long hmm_pfn_t; +#define HMM_PFN_VALID (1 << 0) +#define HMM_PFN_READ (1 << 1) +#define HMM_PFN_WRITE (1 << 2) +#define HMM_PFN_ERROR (1 << 3) +#define HMM_PFN_EMPTY (1 << 4) +#define HMM_PFN_SPECIAL (1 << 5) +#define HMM_PFN_DEVICE_UNADDRESSABLE (1 << 6) +#define HMM_PFN_SHIFT 7 +#endif struct nvkm_vma { struct list_head head;
@@ -56,6 +71,7 @@ void nvkm_vmm_part(struct nvkm_vmm *, struct nvkm_memory *inst); int nvkm_vmm_get(struct nvkm_vmm *, u8 page, u64 size, struct nvkm_vma **); void nvkm_vmm_put(struct nvkm_vmm *, struct nvkm_vma **); + struct nvkm_vmm_map { struct nvkm_memory *memory; u64 offset;
@@ -63,6 +79,11 @@ struct nvkm_vmm_map { struct nvkm_mm_node *mem; struct scatterlist *sgl; dma_addr_t *dma; +#define NV_HMM_PAGE_FLAG_V HMM_PFN_VALID +#define NV_HMM_PAGE_FLAG_W HMM_PFN_WRITE +#define NV_HMM_PAGE_FLAG_E HMM_PFN_ERROR +#define NV_HMM_PAGE_PFN_SHIFT HMM_PFN_SHIFT + u64 *pages; u64 off; const struct nvkm_vmm_page *page;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 20d31526ba8f..96671987ce53 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -75,7 +75,7 @@ struct nvkm_vmm_iter { struct nvkm_vmm *vmm; u64 cnt; u16 max, lvl; - u64 start, addr; + u64 start, addr, *pages; u32 pte[NVKM_VMM_LEVELS_MAX]; struct nvkm_vmm_pt *pt[NVKM_VMM_LEVELS_MAX]; int flush;
@@ -281,6 +281,59 @@ nvkm_vmm_unref_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes) return true; } +static bool +nvkm_vmm_unref_hmm_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes) +{ + const struct nvkm_vmm_desc *desc = it->desc; + const int type = desc->type == SPT; + struct nvkm_vmm_pt *pgt = it->pt[0]; + struct nvkm_mmu_pt *pt; + int mapped; + + pt = pgt->pt[type]; + mapped = desc->func->hmm_unmap(it->vmm, pt, ptei, ptes, NULL); + if (mapped <= 0) + return false; + ptes = mapped; + + /* Dual-PTs need special handling, unless PDE becoming invalid. */ + if (desc->type == SPT && (pgt->refs[0] || pgt->refs[1])) + nvkm_vmm_unref_sptes(it, pgt, desc, ptei, ptes); + + /* GPU may have cached the PTs, flush before freeing.
*/ + nvkm_vmm_flush_mark(it); + nvkm_vmm_flush(it); + + nvkm_kmap(pt->memory); + while (mapped--) { + u64 data = nvkm_ro64(pt->memory, pt->base + ptei * 8); + dma_addr_t dma = (data >> 8) << 12; + + if (!data) { + ptei++; + continue; + } + dma_unmap_page(it->vmm->mmu->subdev.device->dev, dma, + PAGE_SIZE, DMA_BIDIRECTIONAL); + VMM_WO064(pt, it->vmm, ptei++ * 8, 0UL); + } + nvkm_done(pt->memory); + + /* Drop PTE references. */ + pgt->refs[type] -= ptes; + + /* PT no longer neeed? Destroy it. */ + if (!pgt->refs[type]) { + it->lvl++; + TRA(it, "%s empty", nvkm_vmm_desc_type(desc)); + it->lvl--; + nvkm_vmm_unref_pdes(it); + return false; /* PTE writes for unmap() not necessary. */ + } + + return true; +} + static void nvkm_vmm_ref_sptes(struct nvkm_vmm_iter *it, struct nvkm_vmm_pt *pgt, const struct nvkm_vmm_desc *desc, u32 ptei, u32 ptes) @@ -349,6 +402,32 @@ nvkm_vmm_ref_sptes(struct nvkm_vmm_iter *it, struct nvkm_vmm_pt *pgt, } } +static bool +nvkm_vmm_ref_hmm_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes) +{ + const struct nvkm_vmm_desc *desc = it->desc; + const int type = desc->type == SPT; + struct nvkm_vmm_pt *pgt = it->pt[0]; + struct nvkm_mmu_pt *pt; + int mapped; + + pt = pgt->pt[type]; + mapped = desc->func->hmm_map(it->vmm, pt, ptei, ptes, + &it->pages[(it->addr - it->start) >> PAGE_SHIFT]); + if (mapped <= 0) + return false; + ptes = mapped; + + /* Take PTE references. */ + pgt->refs[type] += ptes; + + /* Dual-PTs need special handling. */ + if (desc->type == SPT) + nvkm_vmm_ref_sptes(it, pgt, desc, ptei, ptes); + + return true; +} + static bool nvkm_vmm_ref_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes) { @@ -520,6 +599,7 @@ nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page, it.cnt = size >> page->shift; it.flush = NVKM_VMM_LEVELS_MAX; it.start = it.addr = addr; + it.pages = map ? map->pages : NULL; /* Deconstruct address into PTE indices for each mapping level. 
*/ for (it.lvl = 0; desc[it.lvl].bits; it.lvl++) { @@ -1184,6 +1264,29 @@ nvkm_vmm_map(struct nvkm_vmm *vmm, struct nvkm_vma *vma, void *argv, u32 argc, return ret; } +void +nvkm_vmm_hmm_map(struct nvkm_vmm *vmm, u64 addr, u64 npages, u64 *pages) +{ + struct nvkm_vmm_map map = {0}; + + for (map.page = vmm->func->page; map.page->shift != 12; map.page++); + map.pages = pages; + + nvkm_vmm_iter(vmm, map.page, addr, npages << PAGE_SHIFT, "ref + map", + true, nvkm_vmm_ref_hmm_ptes, NULL, &map, NULL); +} + +void +nvkm_vmm_hmm_unmap(struct nvkm_vmm *vmm, u64 addr, u64 npages) +{ + struct nvkm_vmm_map map = {0}; + + for (map.page = vmm->func->page; map.page->shift != 12; map.page++); + + nvkm_vmm_iter(vmm, map.page, addr, npages << PAGE_SHIFT, "unmap + unref", + false, nvkm_vmm_unref_hmm_ptes, NULL, NULL, NULL); +} + static void nvkm_vmm_put_region(struct nvkm_vmm *vmm, struct nvkm_vma *vma) { diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h index da06e64d8a7d..a630aa2a77e4 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h @@ -56,6 +56,8 @@ typedef void (*nvkm_vmm_pde_func)(struct nvkm_vmm *, struct nvkm_vmm_pt *, u32 pdei); typedef void (*nvkm_vmm_pte_func)(struct nvkm_vmm *, struct nvkm_mmu_pt *, u32 ptei, u32 ptes, struct nvkm_vmm_map *); +typedef int (*nvkm_vmm_hmm_func)(struct nvkm_vmm *, struct nvkm_mmu_pt *, + u32 ptei, u32 ptes, u64 *pages); struct nvkm_vmm_desc_func { nvkm_vmm_pxe_func invalid; @@ -67,6 +69,8 @@ struct nvkm_vmm_desc_func { nvkm_vmm_pte_func mem; nvkm_vmm_pte_func dma; nvkm_vmm_pte_func sgl; + nvkm_vmm_hmm_func hmm_map; + nvkm_vmm_hmm_func hmm_unmap; }; extern const struct nvkm_vmm_desc_func gf100_vmm_pgd; @@ -163,6 +167,8 @@ int nvkm_vmm_get_locked(struct nvkm_vmm *, bool getref, bool mapref, void nvkm_vmm_put_locked(struct nvkm_vmm *, struct nvkm_vma *); void nvkm_vmm_unmap_locked(struct nvkm_vmm *, struct nvkm_vma *); void nvkm_vmm_unmap_region(struct nvkm_vmm *vmm, struct nvkm_vma *vma); +void nvkm_vmm_hmm_map(struct nvkm_vmm *vmm, u64 addr, u64 npages, u64 *pages); +void nvkm_vmm_hmm_unmap(struct nvkm_vmm *vmm, u64 addr, u64 npages); struct nvkm_vma *nvkm_vma_tail(struct nvkm_vma *, u64 tail); void nvkm_vmm_node_insert(struct nvkm_vmm *, struct nvkm_vma *); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c index 8752d9ce4af0..bae32fc28289 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c @@ -67,6 +67,77 @@ gp100_vmm_pgt_dma(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, VMM_MAP_ITER_DMA(vmm, pt, ptei, ptes, map, gp100_vmm_pgt_pte); } +static int +gp100_vmm_pgt_hmm_map(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, + u32 ptei, u32 ptes, u64 *pages) +{ + int mapped = 0; + + nvkm_kmap(pt->memory); + while (ptes--) { + u64 data = nvkm_ro64(pt->memory, pt->base + ptei * 8); + u64 page = *pages; + struct page *tmp; + dma_addr_t dma; + + if (!(page & NV_HMM_PAGE_FLAG_V)) { + pages++; ptei++; + continue; + } + + if ((data & 1)) { + *pages |= NV_HMM_PAGE_FLAG_V; + pages++; ptei++; + continue; + } + + tmp = pfn_to_page(page >> NV_HMM_PAGE_PFN_SHIFT); + dma = dma_map_page(vmm->mmu->subdev.device->dev, tmp, + 0, PAGE_SIZE, DMA_BIDIRECTIONAL); + if (dma_mapping_error(vmm->mmu->subdev.device->dev, dma)) { + *pages |= NV_HMM_PAGE_FLAG_E; + pages++; ptei++; + continue; + } + + data = (2 << 1); + data |= ((dma >> PAGE_SHIFT) << 8); + 
data |= page & NV_HMM_PAGE_FLAG_V ? (1 << 0) : 0; + data |= page & NV_HMM_PAGE_FLAG_W ? 0 : (1 << 6); + + VMM_WO064(pt, vmm, ptei++ * 8, data); + mapped++; + pages++; + } + nvkm_done(pt->memory); + + return mapped; +} + +static int +gp100_vmm_pgt_hmm_unmap(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, + u32 ptei, u32 ptes, u64 *pages) +{ + int unmapped = 0; + + nvkm_kmap(pt->memory); + while (ptes--) { + u64 data = nvkm_ro64(pt->memory, pt->base + ptei * 8); + + if (!(data & 1)) { + VMM_WO064(pt, vmm, ptei++ * 8, 0UL); + continue; + } + + /* Clear valid but keep pte value so we can dma_unmap() */ + VMM_WO064(pt, vmm, ptei++ * 8, data ^ 1); + unmapped++; + } + nvkm_done(pt->memory); + + return unmapped; +} + static void gp100_vmm_pgt_mem(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, u32 ptei, u32 ptes, struct nvkm_vmm_map *map) @@ -89,6 +160,8 @@ gp100_vmm_desc_spt = { .mem = gp100_vmm_pgt_mem, .dma = gp100_vmm_pgt_dma, .sgl = gp100_vmm_pgt_sgl, + .hmm_map = gp100_vmm_pgt_hmm_map, + .hmm_unmap = gp100_vmm_pgt_hmm_unmap, }; static void -- 2.14.3
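[Editor's note: to summarize the PTE format that gp100_vmm_pgt_hmm_map() above writes, here is the encoding pulled out into a helper. The field interpretation (aperture value 2 selecting non-coherent system memory, bit 6 as the read-only bit) is inferred from the flag names in the patch, so treat this as a sketch rather than documented hardware behavior.]

static u64
gp100_hmm_pte(dma_addr_t dma, bool writeable)
{
	u64 data;

	data  = 1ULL << 0;			/* VALID */
	data |= 2ULL << 1;			/* aperture: sysmem, non-coherent */
	data |= ((u64)(dma >> PAGE_SHIFT)) << 8; /* 4KiB page frame */
	if (!writeable)
		data |= 1ULL << 6;		/* read-only */
	return data;
}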
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 08/13] drm/nouveau: special mapping method for HMM (user interface)
From: Jérôme Glisse <jglisse at redhat.com> Signed-off-by: Jérôme Glisse <jglisse at redhat.com> Cc: Ben Skeggs <bskeggs at redhat.com> --- drivers/gpu/drm/nouveau/include/nvif/if000c.h | 17 ++++++++ drivers/gpu/drm/nouveau/include/nvif/vmm.h | 2 + drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 25 ++++-------- drivers/gpu/drm/nouveau/nvif/vmm.c | 29 ++++++++++++++ drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 49 ++++++++++++++++++++--- 5 files changed, 99 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h b/drivers/gpu/drm/nouveau/include/nvif/if000c.h index 2928ecd989ad..2c24817ca533 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h +++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h @@ -14,6 +14,8 @@ struct nvif_vmm_v0 { #define NVIF_VMM_V0_PUT 0x02 #define NVIF_VMM_V0_MAP 0x03 #define NVIF_VMM_V0_UNMAP 0x04 +#define NVIF_VMM_V0_HMM_MAP 0x05 +#define NVIF_VMM_V0_HMM_UNMAP 0x06 struct nvif_vmm_page_v0 { __u8 version; @@ -61,4 +63,19 @@ struct nvif_vmm_unmap_v0 { __u8 pad01[7]; __u64 addr; }; + +struct nvif_vmm_hmm_map_v0 { + __u8 version; + __u8 pad01[7]; + __u64 addr; + __u64 npages; + __u64 pages; +}; + +struct nvif_vmm_hmm_unmap_v0 { + __u8 version; + __u8 pad01[7]; + __u64 addr; + __u64 npages; +}; #endif diff --git a/drivers/gpu/drm/nouveau/include/nvif/vmm.h b/drivers/gpu/drm/nouveau/include/nvif/vmm.h index c5db8a2e82df..c5e4adaa0e3c 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/vmm.h +++ b/drivers/gpu/drm/nouveau/include/nvif/vmm.h @@ -39,4 +39,6 @@ void nvif_vmm_put(struct nvif_vmm *, struct nvif_vma *); int nvif_vmm_map(struct nvif_vmm *, u64 addr, u64 size, void *argv, u32 argc, struct nvif_mem *, u64 offset); int nvif_vmm_unmap(struct nvif_vmm *, u64); +int nvif_vmm_hmm_map(struct nvif_vmm *vmm, u64 addr, u64 npages, u64 *pages); +int nvif_vmm_hmm_unmap(struct nvif_vmm *vmm, u64 addr, u64 npages); #endif diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h index 719d50e6296f..8f08718e05aa 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h @@ -4,20 +4,6 @@ #include <core/subdev.h> #include <linux/hmm.h> -/* Need to change HMM to be more driver friendly */ -#if IS_ENABLED(CONFIG_HMM) -#else -typedef unsigned long hmm_pfn_t; -#define HMM_PFN_VALID (1 << 0) -#define HMM_PFN_READ (1 << 1) -#define HMM_PFN_WRITE (1 << 2) -#define HMM_PFN_ERROR (1 << 3) -#define HMM_PFN_EMPTY (1 << 4) -#define HMM_PFN_SPECIAL (1 << 5) -#define HMM_PFN_DEVICE_UNADDRESSABLE (1 << 6) -#define HMM_PFN_SHIFT 7 -#endif - struct nvkm_vma { struct list_head head; struct rb_node tree; @@ -79,10 +65,13 @@ struct nvkm_vmm_map { struct nvkm_mm_node *mem; struct scatterlist *sgl; dma_addr_t *dma; -#define NV_HMM_PAGE_FLAG_V HMM_PFN_VALID -#define NV_HMM_PAGE_FLAG_W HMM_PFN_WRITE -#define NV_HMM_PAGE_FLAG_E HMM_PFN_ERROR -#define NV_HMM_PAGE_PFN_SHIFT HMM_PFN_SHIFT +#define NV_HMM_PAGE_FLAG_V (1 << 0) +#define NV_HMM_PAGE_FLAG_R 0 +#define NV_HMM_PAGE_FLAG_W (1 << 1) +#define NV_HMM_PAGE_FLAG_E (-1ULL) +#define NV_HMM_PAGE_FLAG_N 0 +#define NV_HMM_PAGE_FLAG_S (1ULL << 63) +#define NV_HMM_PAGE_PFN_SHIFT 8 u64 *pages; u64 off; diff --git a/drivers/gpu/drm/nouveau/nvif/vmm.c b/drivers/gpu/drm/nouveau/nvif/vmm.c index 31cdb2d2e1ff..27a7b95b4e9c 100644 --- a/drivers/gpu/drm/nouveau/nvif/vmm.c +++ b/drivers/gpu/drm/nouveau/nvif/vmm.c @@ -32,6 +32,35 @@ nvif_vmm_unmap(struct nvif_vmm *vmm, u64 addr) sizeof(struct nvif_vmm_unmap_v0)); 
} +int +nvif_vmm_hmm_map(struct nvif_vmm *vmm, u64 addr, u64 npages, u64 *pages) +{ + struct nvif_vmm_hmm_map_v0 args; + int ret; + + args.version = 0; + args.addr = addr; + args.npages = npages; + args.pages = (uint64_t)pages; + ret = nvif_object_mthd(&vmm->object, NVIF_VMM_V0_HMM_MAP, + &args, sizeof(args)); + return ret; +} + +int +nvif_vmm_hmm_unmap(struct nvif_vmm *vmm, u64 addr, u64 npages) +{ + struct nvif_vmm_hmm_unmap_v0 args; + int ret; + + args.version = 0; + args.addr = addr; + args.npages = npages; + ret = nvif_object_mthd(&vmm->object, NVIF_VMM_V0_HMM_UNMAP, + &args, sizeof(args)); + return ret; +} + int nvif_vmm_map(struct nvif_vmm *vmm, u64 addr, u64 size, void *argv, u32 argc, struct nvif_mem *mem, u64 offset) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c index 37b201b95f15..739f2af02552 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c @@ -274,16 +274,55 @@ nvkm_uvmm_mthd_page(struct nvkm_uvmm *uvmm, void *argv, u32 argc) return 0; } +static int +nvkm_uvmm_mthd_hmm_map(struct nvkm_uvmm *uvmm, void *argv, u32 argc) +{ + union { + struct nvif_vmm_hmm_map_v0 v0; + } *args = argv; + struct nvkm_vmm *vmm = uvmm->vmm; + int ret = -ENOSYS; + + if ((ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false))) + return ret; + + mutex_lock(&vmm->mutex); + nvkm_vmm_hmm_map(vmm, args->v0.addr, args->v0.npages, + (u64 *)args->v0.pages); + mutex_unlock(&vmm->mutex); + return 0; +} + +static int +nvkm_uvmm_mthd_hmm_unmap(struct nvkm_uvmm *uvmm, void *argv, u32 argc) +{ + union { + struct nvif_vmm_hmm_unmap_v0 v0; + } *args = argv; + struct nvkm_vmm *vmm = uvmm->vmm; + int ret = -ENOSYS; + + if ((ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false))) + return ret; + + mutex_lock(&vmm->mutex); + nvkm_vmm_hmm_unmap(vmm, args->v0.addr, args->v0.npages); + mutex_unlock(&vmm->mutex); + return 0; +} + static int nvkm_uvmm_mthd(struct nvkm_object *object, u32 mthd, void *argv, u32 argc) { struct nvkm_uvmm *uvmm = nvkm_uvmm(object); switch (mthd) { - case NVIF_VMM_V0_PAGE : return nvkm_uvmm_mthd_page (uvmm, argv, argc); - case NVIF_VMM_V0_GET : return nvkm_uvmm_mthd_get (uvmm, argv, argc); - case NVIF_VMM_V0_PUT : return nvkm_uvmm_mthd_put (uvmm, argv, argc); - case NVIF_VMM_V0_MAP : return nvkm_uvmm_mthd_map (uvmm, argv, argc); - case NVIF_VMM_V0_UNMAP : return nvkm_uvmm_mthd_unmap (uvmm, argv, argc); + case NVIF_VMM_V0_PAGE : return nvkm_uvmm_mthd_page (uvmm, argv, argc); + case NVIF_VMM_V0_GET : return nvkm_uvmm_mthd_get (uvmm, argv, argc); + case NVIF_VMM_V0_PUT : return nvkm_uvmm_mthd_put (uvmm, argv, argc); + case NVIF_VMM_V0_MAP : return nvkm_uvmm_mthd_map (uvmm, argv, argc); + case NVIF_VMM_V0_UNMAP : return nvkm_uvmm_mthd_unmap (uvmm, argv, argc); + case NVIF_VMM_V0_HMM_MAP : return nvkm_uvmm_mthd_hmm_map (uvmm, argv, argc); + case NVIF_VMM_V0_HMM_UNMAP: return nvkm_uvmm_mthd_hmm_unmap(uvmm, argv, argc); default: break; } -- 2.14.3
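[Editor's note: usage-wise, a client is expected to fill a pfn array with the NV_HMM_PAGE_FLAG_* encoding from patch 7 and push it through the two new methods; this mirrors what patch 9's fault handler does. A condensed sketch with hypothetical wrapper names:]

/* Mirror npages of system memory into the GPU page table at 'addr'.
 * pfns[] entries carry NV_HMM_PAGE_FLAG_V (plus _W for writeable pages)
 * and the pfn shifted by NV_HMM_PAGE_PFN_SHIFT. */
static int
example_hmm_mirror(struct nouveau_cli *cli, u64 addr, u64 npages, u64 *pfns)
{
	return nvif_vmm_hmm_map(&cli->vmm.vmm, addr, npages, pfns);
}

/* Tear the range down again, e.g. from an invalidation callback. */
static int
example_hmm_unmirror(struct nouveau_cli *cli, u64 addr, u64 npages)
{
	return nvif_vmm_hmm_unmap(&cli->vmm.vmm, addr, npages);
}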
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 09/13] drm/nouveau: add SVM through HMM support to nouveau client
From: Jérôme Glisse <jglisse at redhat.com>

This adds SVM (Shared Virtual Memory) through HMM (Heterogeneous Memory Management) to the nouveau client. SVM means that any valid pointer (private anonymous memory, shared memory, or an mmap of a regular file) on the CPU is also valid on the GPU. To achieve SVM with nouveau we use the HMM kernel infrastructure.

There is one nouveau client object created each time the device file is opened by a process; this is the best we can achieve. Ideally we would like an object that exists per process address space, but there is no such thing in the kernel.

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
Cc: Ben Skeggs <bskeggs at redhat.com>
---
 drivers/gpu/drm/nouveau/Kbuild        |   3 +
 drivers/gpu/drm/nouveau/nouveau_drm.c |   5 +
 drivers/gpu/drm/nouveau/nouveau_drv.h |   3 +
 drivers/gpu/drm/nouveau/nouveau_hmm.c | 339 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/nouveau/nouveau_hmm.h |  63 +++++++
 5 files changed, 413 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/nouveau_hmm.c
 create mode 100644 drivers/gpu/drm/nouveau/nouveau_hmm.h

diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild
index 9c0c650655e9..8e61e118ccfe 100644
--- a/drivers/gpu/drm/nouveau/Kbuild
+++ b/drivers/gpu/drm/nouveau/Kbuild
@@ -35,6 +35,9 @@ nouveau-y += nouveau_prime.o nouveau-y += nouveau_sgdma.o nouveau-y += nouveau_ttm.o nouveau-y += nouveau_vmm.o +ifdef CONFIG_HMM_MIRROR +nouveau-$(CONFIG_DEVICE_PRIVATE) += nouveau_hmm.o +endif # DRM - modesetting nouveau-$(CONFIG_DRM_NOUVEAU_BACKLIGHT) += nouveau_backlight.o
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 3e293029e3a6..e67b08ba8b80 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -167,6 +167,7 @@ nouveau_cli_work(struct work_struct *w) static void nouveau_cli_fini(struct nouveau_cli *cli) { + nouveau_hmm_fini(cli); nouveau_cli_work_flush(cli, true); usif_client_fini(cli); nouveau_vmm_fini(&cli->vmm);
@@ -965,6 +966,10 @@ nouveau_drm_open(struct drm_device *dev, struct drm_file *fpriv) list_add(&cli->head, &drm->clients); mutex_unlock(&drm->client.mutex); + ret = nouveau_hmm_init(cli); + if (ret) + return ret; + done: if (ret && cli) { nouveau_cli_fini(cli);
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 96f6bd8aee5d..75c741d5125c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -65,6 +65,7 @@ struct platform_device; #include "nouveau_fence.h" #include "nouveau_bios.h" #include "nouveau_vmm.h" +#include "nouveau_hmm.h" struct nouveau_drm_tile { struct nouveau_fence *fence;
@@ -104,6 +105,8 @@ struct nouveau_cli { struct list_head notifys; char name[32]; + struct nouveau_hmm hmm; + struct work_struct work; struct list_head worker; struct mutex lock;
diff --git a/drivers/gpu/drm/nouveau/nouveau_hmm.c b/drivers/gpu/drm/nouveau/nouveau_hmm.c
new file mode 100644
index 000000000000..a4c6f687f6a8
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_hmm.c
@@ -0,0 +1,339 @@ +/* + * Copyright (C) 2018 Red Hat All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial
+ * portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Jérôme Glisse, Ben Skeggs
+ */
+#include <nvif/class.h>
+#include <nvif/clb069.h>
+#include "nouveau_hmm.h"
+#include "nouveau_drv.h"
+#include "nouveau_bo.h"
+#include <nvkm/subdev/mmu.h>
+#include <linux/sched/mm.h>
+#include <linux/mm.h>
+
+struct fault_entry {
+    u32 instlo;
+    u32 insthi;
+    u32 addrlo;
+    u32 addrhi;
+    u32 timelo;
+    u32 timehi;
+    u32 rsvd;
+    u32 info;
+};
+
+#define NV_PFAULT_ACCESS_R 0 /* read */
+#define NV_PFAULT_ACCESS_W 1 /* write */
+#define NV_PFAULT_ACCESS_A 2 /* atomic */
+#define NV_PFAULT_ACCESS_P 3 /* prefetch */
+
+static inline u64
+fault_entry_addr(const struct fault_entry *fe)
+{
+    return ((u64)fe->addrhi << 32) | (fe->addrlo & PAGE_MASK);
+}
+
+static inline unsigned
+fault_entry_access(const struct fault_entry *fe)
+{
+    return ((u64)fe->info >> 16) & 7;
+}
+
+struct nouveau_vmf {
+    struct vm_area_struct *vma;
+    struct nouveau_cli *cli;
+    uint64_t *pages;
+    u64 npages;
+    u64 start;
+};
+
+static void
+nouveau_hmm_fault_signal(struct nouveau_cli *cli,
+                         struct fault_entry *fe,
+                         bool success)
+{
+    u32 gpc, isgpc, client;
+
+    if (!(fe->info & 0x80000000))
+        return;
+
+    gpc = (fe->info & 0x1f000000) >> 24;
+    isgpc = (fe->info & 0x00100000) >> 20;
+    client = (fe->info & 0x00007f00) >> 8;
+    fe->info &= 0x7fffffff;
+
+    if (success) {
+        nvif_wr32(&cli->device.object, 0x100cbc, 0x80000000 |
+                  (1 << 3) | (client << 9) |
+                  (gpc << 15) | (isgpc << 20));
+    } else {
+        nvif_wr32(&cli->device.object, 0x100cbc, 0x80000000 |
+                  (4 << 3) | (client << 9) |
+                  (gpc << 15) | (isgpc << 20));
+    }
+}
+
+static const uint64_t hmm_pfn_flags[HMM_PFN_FLAG_MAX] = {
+    /* FIXME find a way to build time check order */
+    NV_HMM_PAGE_FLAG_V, /* HMM_PFN_FLAG_VALID */
+    NV_HMM_PAGE_FLAG_W, /* HMM_PFN_FLAG_WRITE */
+    NV_HMM_PAGE_FLAG_E, /* HMM_PFN_FLAG_ERROR */
+    NV_HMM_PAGE_FLAG_N, /* HMM_PFN_FLAG_NONE */
+    NV_HMM_PAGE_FLAG_S, /* HMM_PFN_FLAG_SPECIAL */
+    0, /* HMM_PFN_FLAG_DEVICE_UNADDRESSABLE */
+};
+
+static int
+nouveau_hmm_handle_fault(struct nouveau_vmf *vmf)
+{
+    struct nouveau_hmm *hmm = &vmf->cli->hmm;
+    struct hmm_range range;
+    int ret;
+
+    range.vma = vmf->vma;
+    range.start = vmf->start;
+    range.end = vmf->start + vmf->npages;
+    range.pfns = vmf->pages;
+    range.pfn_shift = NV_HMM_PAGE_PFN_SHIFT;
+    range.flags = hmm_pfn_flags;
+
+    ret = hmm_vma_fault(&range, true);
+    if (ret)
+        return ret;
+
+    mutex_lock(&hmm->mutex);
+    if (!hmm_vma_range_done(&range)) {
+        mutex_unlock(&hmm->mutex);
+        return -EAGAIN;
+    }
+
+    nvif_vmm_hmm_map(&vmf->cli->vmm.vmm, vmf->start,
+                     vmf->npages, (u64 *)vmf->pages);
+    mutex_unlock(&hmm->mutex);
+    return 0;
+}
+
+static int
+nouveau_hmm_rpfb_process(struct nvif_notify *ntfy)
+{
+    struct nouveau_hmm *hmm = container_of(ntfy, typeof(*hmm), pending);
+    struct nouveau_cli *cli = container_of(hmm, typeof(*cli), hmm);
+    u32 get = nvif_rd32(&cli->device.object, 0x002a7c);
+    u32 put = nvif_rd32(&cli->device.object, 0x002a80);
+    struct fault_entry *fe = (void *)hmm->rpfb.map.ptr;
+    u32 processed = 0, next = get;
+
+    for (; hmm->enabled && (get != put); get = next) {
+        /* FIXME something else than a 16 pages window ... */
+        const u64 max_pages = 16;
+        const u64 range_mask = (max_pages << PAGE_SHIFT) - 1;
+        u64 addr, start, end, i;
+        struct nouveau_vmf vmf;
+        u64 pages[16] = {0};
+        int ret;
+
+        if (!(fe[get].info & 0x80000000)) {
+            processed++; get++;
+            continue;
+        }
+
+        start = fault_entry_addr(&fe[get]) & (~range_mask);
+        end = start + range_mask + 1;
+
+        for (next = get; next < put; ++next) {
+            unsigned access;
+
+            if (!(fe[next].info & 0x80000000)) {
+                continue;
+            }
+
+            addr = fault_entry_addr(&fe[next]);
+            if (addr < start || addr >= end) {
+                break;
+            }
+
+            i = (addr - start) >> PAGE_SHIFT;
+            access = fault_entry_access(&fe[next]);
+            pages[i] = (access == NV_PFAULT_ACCESS_W) ?
+                       NV_HMM_PAGE_FLAG_V |
+                       NV_HMM_PAGE_FLAG_W :
+                       NV_HMM_PAGE_FLAG_V;
+        }
+
+again:
+        down_read(&hmm->mm->mmap_sem);
+        vmf.vma = find_vma_intersection(hmm->mm, start, end);
+        if (vmf.vma == NULL) {
+            up_read(&hmm->mm->mmap_sem);
+            for (i = 0; i < max_pages; ++i) {
+                pages[i] = NV_HMM_PAGE_FLAG_E;
+            }
+            goto signal;
+        }
+
+        /* Mark error */
+        for (addr = start, i = 0; addr < vmf.vma->vm_start;
+             addr += PAGE_SIZE, ++i) {
+            pages[i] = NV_HMM_PAGE_FLAG_E;
+        }
+        for (addr = end - PAGE_SIZE, i = max_pages - 1;
+             addr >= vmf.vma->vm_end; addr -= PAGE_SIZE, --i) {
+            pages[i] = NV_HMM_PAGE_FLAG_E;
+        }
+        vmf.start = max_t(u64, start, vmf.vma->vm_start);
+        end = min_t(u64, end, vmf.vma->vm_end);
+
+        vmf.cli = cli;
+        vmf.pages = &pages[(vmf.start - start) >> PAGE_SHIFT];
+        vmf.npages = (end - vmf.start) >> PAGE_SHIFT;
+        ret = nouveau_hmm_handle_fault(&vmf);
+        switch (ret) {
+        case -EAGAIN:
+            up_read(&hmm->mm->mmap_sem);
+            /* fallthrough */
+        case -EBUSY:
+            /* Try again */
+            goto again;
+        default:
+            up_read(&hmm->mm->mmap_sem);
+            break;
+        }
+
+signal:
+        for (; get < next; ++get) {
+            bool success;
+
+            if (!(fe[get].info & 0x80000000)) {
+                continue;
+            }
+
+            addr = fault_entry_addr(&fe[get]);
+            i = (addr - start) >> PAGE_SHIFT;
+            success = !(pages[i] & NV_HMM_PAGE_FLAG_E);
+            nouveau_hmm_fault_signal(cli, &fe[get], success);
+        }
+    }
+
+    nvif_wr32(&cli->device.object, 0x002a7c, get);
+    return hmm->enabled ? NVIF_NOTIFY_KEEP : NVIF_NOTIFY_DROP;
+}
+
+static void
+nouveau_vmm_sync_pagetables(struct hmm_mirror *mirror,
+                            enum hmm_update_type update,
+                            unsigned long start,
+                            unsigned long end)
+{
+}
+
+static const struct hmm_mirror_ops nouveau_hmm_mirror_ops = {
+    .sync_cpu_device_pagetables = &nouveau_vmm_sync_pagetables,
+};
+
+void
+nouveau_hmm_fini(struct nouveau_cli *cli)
+{
+    if (!cli->hmm.enabled)
+        return;
+
+    cli->hmm.enabled = false;
+    nvif_notify_fini(&cli->hmm.pending);
+    nvif_object_fini(&cli->hmm.rpfb);
+
+    hmm_mirror_unregister(&cli->hmm.mirror);
+    nouveau_vmm_sync_pagetables(&cli->hmm.mirror, HMM_UPDATE_INVALIDATE,
+                                PAGE_SIZE, TASK_SIZE);
+}
+
+int
+nouveau_hmm_init(struct nouveau_cli *cli)
+{
+    struct mm_struct *mm = get_task_mm(current);
+    static const struct nvif_mclass rpfbs[] = {
+        { MAXWELL_FAULT_BUFFER_A, -1 },
+        {}
+    };
+    bool super;
+    int ret;
+
+    if (cli->hmm.enabled)
+        return 0;
+
+    mutex_init(&cli->hmm.mutex);
+
+    down_write(&mm->mmap_sem);
+    mutex_lock(&cli->hmm.mutex);
+    cli->hmm.mirror.ops = &nouveau_hmm_mirror_ops;
+    ret = hmm_mirror_register(&cli->hmm.mirror, mm);
+    if (!ret)
+        cli->hmm.mm = mm;
+    mutex_unlock(&cli->hmm.mutex);
+    up_write(&mm->mmap_sem);
+    mmput(mm);
+    if (ret)
+        return ret;
+
+    /* Allocate replayable fault buffer. */
+    ret = nvif_mclass(&cli->device.object, rpfbs);
+    if (ret < 0) {
+        hmm_mirror_unregister(&cli->hmm.mirror);
+        return ret;
+    }
+
+    super = cli->base.super;
+    cli->base.super = true;
+    ret = nvif_object_init(&cli->device.object, 0,
+                           rpfbs[ret].oclass,
+                           NULL, 0, &cli->hmm.rpfb);
+    if (ret) {
+        hmm_mirror_unregister(&cli->hmm.mirror);
+        cli->base.super = super;
+        return ret;
+    }
+    nvif_object_map(&cli->hmm.rpfb, NULL, 0);
+
+    /* Request notification of pending replayable faults. */
+    ret = nvif_notify_init(&cli->hmm.rpfb, nouveau_hmm_rpfb_process,
+                           true, NVB069_VN_NTFY_FAULT, NULL, 0, 0,
+                           &cli->hmm.pending);
+    cli->base.super = super;
+    if (ret)
+        goto error_notify;
+
+    ret = nvif_notify_get(&cli->hmm.pending);
+    if (ret)
+        goto error_notify_get;
+
+    cli->hmm.mm = current->mm;
+    cli->hmm.task = current;
+    cli->hmm.enabled = true;
+    return 0;
+
+error_notify_get:
+    nvif_notify_fini(&cli->hmm.pending);
+error_notify:
+    nvif_object_fini(&cli->hmm.rpfb);
+    hmm_mirror_unregister(&cli->hmm.mirror);
+    return ret;
+}
diff --git a/drivers/gpu/drm/nouveau/nouveau_hmm.h b/drivers/gpu/drm/nouveau/nouveau_hmm.h
new file mode 100644
index 000000000000..47f31cf8ac56
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_hmm.h
@@ -0,0 +1,63 @@
+/*
+ * Copyright (C) 2018 Red Hat All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial
+ * portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Jérôme Glisse, Ben Skeggs
+ */
+#ifndef NOUVEAU_HMM_H
+#define NOUVEAU_HMM_H
+#include <nvif/object.h>
+#include <nvif/notify.h>
+#include <nouveau_vmm.h>
+#include <linux/hmm.h>
+
+#if defined(CONFIG_HMM_MIRROR) && defined(CONFIG_DEVICE_PRIVATE)
+
+struct nouveau_hmm {
+    struct nvif_object rpfb;
+    struct nvif_notify pending;
+    struct task_struct *task;
+    struct hmm_mirror mirror;
+    struct mm_struct *mm;
+    struct mutex mutex;
+    bool enabled;
+};
+
+void nouveau_hmm_fini(struct nouveau_cli *cli);
+int nouveau_hmm_init(struct nouveau_cli *cli);
+
+#else /* defined(CONFIG_HMM_MIRROR) && defined(CONFIG_DEVICE_PRIVATE) */
+
+struct nouveau_hmm {
+};
+
+static inline void nouveau_hmm_fini(struct nouveau_cli *cli)
+{
+}
+
+static inline int nouveau_hmm_init(struct nouveau_cli *cli)
+{
+    return -EINVAL;
+}
+
+#endif /* defined(CONFIG_HMM_MIRROR) && defined(CONFIG_DEVICE_PRIVATE) */
+#endif /* NOUVEAU_HMM_H */
-- 2.14.3
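The fault handler above coalesces individual fault entries into aligned 16-page windows before calling into HMM. A standalone sketch of that windowing arithmetic, with the constants copied from nouveau_hmm_rpfb_process() and `fault_addr` as an assumed input:

    const u64 max_pages = 16;
    const u64 range_mask = (max_pages << PAGE_SHIFT) - 1; /* 0xffff for 4K pages */
    u64 start = fault_addr & ~range_mask;           /* aligned window base */
    u64 end = start + range_mask + 1;               /* one past the window */
    u64 slot = (fault_addr - start) >> PAGE_SHIFT;  /* index 0..15 into pages[] */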
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 10/13] drm/nouveau: add HMM area creation
From: Jérôme Glisse <jglisse at redhat.com>

An HMM area is a virtual address range under HMM control; GPU access
inside such a range behaves like CPU access. For things to work
properly, the HMM range should cover everything except a reserved
range for GEM buffer objects.

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
Cc: Ben Skeggs <bskeggs at redhat.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 63 +++++++++++++++++++++++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |  2 +
 2 files changed, 65 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 96671987ce53..ef4b839932fa 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -1540,6 +1540,69 @@ nvkm_vmm_get_locked(struct nvkm_vmm *vmm, bool getref, bool mapref, bool sparse,
     return 0;
 }

+int
+nvkm_vmm_hmm_init(struct nvkm_vmm *vmm, u64 start, u64 end,
+                  struct nvkm_vma **pvma)
+{
+    struct nvkm_vma *vma = NULL, *tmp;
+    struct rb_node *node;
+
+    /* Locate smallest block that can possibly satisfy the allocation. */
+    node = vmm->free.rb_node;
+    while (node) {
+        struct nvkm_vma *this = rb_entry(node, typeof(*this), tree);
+
+        if (this->addr <= start && (this->addr + this->size) >= end) {
+            rb_erase(&this->tree, &vmm->free);
+            vma = this;
+            break;
+        }
+        node = node->rb_left;
+    }
+
+    if (vma == NULL) {
+        return -EINVAL;
+    }
+
+    if (start != vma->addr) {
+        if (!(tmp = nvkm_vma_tail(vma, vma->size + vma->addr - start))) {
+            nvkm_vmm_put_region(vmm, vma);
+            return -ENOMEM;
+        }
+        nvkm_vmm_free_insert(vmm, vma);
+        vma = tmp;
+    }
+
+    if (end < (vma->addr + vma->size)) {
+        if (!(tmp = nvkm_vma_tail(vma, vma->size + vma->addr - end))) {
+            nvkm_vmm_put_region(vmm, vma);
+            return -ENOMEM;
+        }
+        nvkm_vmm_free_insert(vmm, tmp);
+    }
+
+    vma->mapref = false;
+    vma->sparse = false;
+    vma->page = NVKM_VMA_PAGE_NONE;
+    vma->refd = NVKM_VMA_PAGE_NONE;
+    vma->used = true;
+    nvkm_vmm_node_insert(vmm, vma);
+    *pvma = vma;
+    return 0;
+}
+
+void
+nvkm_vmm_hmm_fini(struct nvkm_vmm *vmm, u64 start, u64 end)
+{
+    struct nvkm_vma *vma;
+    u64 size = (end - start);
+
+    vma = nvkm_vmm_node_search(vmm, start);
+    if (vma && vma->addr == start && vma->size == size) {
+        nvkm_vmm_put_locked(vmm, vma);
+    }
+}
+
 int
 nvkm_vmm_get(struct nvkm_vmm *vmm, u8 page, u64 size, struct nvkm_vma **pvma)
 {
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
index a630aa2a77e4..04d672a4dccb 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
@@ -165,6 +165,8 @@ int nvkm_vmm_get_locked(struct nvkm_vmm *, bool getref, bool mapref, bool sparse,
                         u8 page, u8 align, u64 size, struct nvkm_vma **pvma);
 void nvkm_vmm_put_locked(struct nvkm_vmm *, struct nvkm_vma *);
+int nvkm_vmm_hmm_init(struct nvkm_vmm *, u64, u64, struct nvkm_vma **);
+void nvkm_vmm_hmm_fini(struct nvkm_vmm *, u64, u64);
 void nvkm_vmm_unmap_locked(struct nvkm_vmm *, struct nvkm_vma *);
 void nvkm_vmm_unmap_region(struct nvkm_vmm *vmm, struct nvkm_vma *vma);
 void nvkm_vmm_hmm_map(struct nvkm_vmm *vmm, u64 addr, u64 npages, u64 *pages);
-- 2.14.3
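Patch 11 below adds the user-visible plumbing; the expected in-kernel calling pattern for these helpers is roughly the following sketch (error handling elided; `start` and `end` are assumed to come from the uapi arguments):

    struct nvkm_vma *vma;
    int ret;

    mutex_lock(&vmm->mutex);
    /* Carve [start, end) out of the free tree and mark it used. */
    ret = nvkm_vmm_hmm_init(vmm, start, end, &vma);
    mutex_unlock(&vmm->mutex);

    /* ... later, give the range back ... */
    mutex_lock(&vmm->mutex);
    nvkm_vmm_hmm_fini(vmm, start, end);
    mutex_unlock(&vmm->mutex);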
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 11/13] drm/nouveau: add HMM area creation user interface
From: Jérôme Glisse <jglisse at redhat.com>

Add a user API to create an HMM area.

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
Cc: Ben Skeggs <bskeggs at redhat.com>
---
 drivers/gpu/drm/nouveau/include/nvif/if000c.h  |  9 +++++
 drivers/gpu/drm/nouveau/include/nvif/vmm.h     |  2 +
 drivers/gpu/drm/nouveau/nvif/vmm.c             | 51 ++++++++++++++++++++++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 39 ++++++++++++++++++++
 4 files changed, 101 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
index 2c24817ca533..0383864b033b 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
@@ -16,6 +16,8 @@ struct nvif_vmm_v0 {
 #define NVIF_VMM_V0_UNMAP     0x04
 #define NVIF_VMM_V0_HMM_MAP   0x05
 #define NVIF_VMM_V0_HMM_UNMAP 0x06
+#define NVIF_VMM_V0_HMM_INIT  0x07
+#define NVIF_VMM_V0_HMM_FINI  0x08

 struct nvif_vmm_page_v0 {
     __u8 version;
@@ -78,4 +80,11 @@ struct nvif_vmm_hmm_unmap_v0 {
     __u64 addr;
     __u64 npages;
 };
+
+struct nvif_vmm_hmm_v0 {
+    __u8 version;
+    __u8 pad01[7];
+    __u64 start;
+    __u64 end;
+};
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvif/vmm.h b/drivers/gpu/drm/nouveau/include/nvif/vmm.h
index c5e4adaa0e3c..f11f8c510ebd 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/vmm.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/vmm.h
@@ -39,6 +39,8 @@ void nvif_vmm_put(struct nvif_vmm *, struct nvif_vma *);
 int nvif_vmm_map(struct nvif_vmm *, u64 addr, u64 size, void *argv, u32 argc,
                  struct nvif_mem *, u64 offset);
 int nvif_vmm_unmap(struct nvif_vmm *, u64);
+int nvif_vmm_hmm_init(struct nvif_vmm *vmm, u64 hstart, u64 hend);
+void nvif_vmm_hmm_fini(struct nvif_vmm *vmm, u64 hstart, u64 hend);
 int nvif_vmm_hmm_map(struct nvif_vmm *vmm, u64 addr, u64 npages, u64 *pages);
 int nvif_vmm_hmm_unmap(struct nvif_vmm *vmm, u64 addr, u64 npages);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvif/vmm.c b/drivers/gpu/drm/nouveau/nvif/vmm.c
index 27a7b95b4e9c..788e02e47750 100644
--- a/drivers/gpu/drm/nouveau/nvif/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvif/vmm.c
@@ -32,6 +32,57 @@ nvif_vmm_unmap(struct nvif_vmm *vmm, u64 addr)
                             sizeof(struct nvif_vmm_unmap_v0));
 }

+int
+nvif_vmm_hmm_init(struct nvif_vmm *vmm, u64 hstart, u64 hend)
+{
+    struct nvif_vmm_hmm_v0 args;
+    int ret;
+
+    if (hstart > PAGE_SIZE) {
+        args.version = 0;
+        args.start = PAGE_SIZE;
+        args.end = hstart;
+        ret = nvif_object_mthd(&vmm->object, NVIF_VMM_V0_HMM_INIT,
+                               &args, sizeof(args));
+        if (ret)
+            return ret;
+    }
+
+    args.version = 0;
+    args.start = hend;
+    args.end = TASK_SIZE;
+    ret = nvif_object_mthd(&vmm->object, NVIF_VMM_V0_HMM_INIT,
+                           &args, sizeof(args));
+    if (ret && hstart > PAGE_SIZE) {
+        args.version = 0;
+        args.start = PAGE_SIZE;
+        args.end = hstart;
+        nvif_object_mthd(&vmm->object, NVIF_VMM_V0_HMM_FINI,
+                         &args, sizeof(args));
+    }
+    return ret;
+}
+
+void
+nvif_vmm_hmm_fini(struct nvif_vmm *vmm, u64 hstart, u64 hend)
+{
+    struct nvif_vmm_hmm_v0 args;
+
+    if (hstart > PAGE_SIZE) {
+        args.version = 0;
+        args.start = PAGE_SIZE;
+        args.end = hstart;
+        nvif_object_mthd(&vmm->object, NVIF_VMM_V0_HMM_FINI,
+                         &args, sizeof(args));
+    }
+
+    args.version = 0;
+    args.start = hend;
+    args.end = TASK_SIZE;
+    nvif_object_mthd(&vmm->object, NVIF_VMM_V0_HMM_FINI,
+                     &args, sizeof(args));
+}
+
 int
 nvif_vmm_hmm_map(struct nvif_vmm *vmm, u64 addr, u64 npages, u64 *pages)
 {
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
index 739f2af02552..34e00aa73fd0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
@@ -274,6 +274,43 @@ nvkm_uvmm_mthd_page(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
     return 0;
 }

+static int
+nvkm_uvmm_mthd_hmm_init(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
+{
+    union {
+        struct nvif_vmm_hmm_v0 v0;
+    } *args = argv;
+    struct nvkm_vmm *vmm = uvmm->vmm;
+    struct nvkm_vma *vma;
+    int ret = -ENOSYS;
+
+    if ((ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false)))
+        return ret;
+
+    mutex_lock(&vmm->mutex);
+    ret = nvkm_vmm_hmm_init(vmm, args->v0.start, args->v0.end, &vma);
+    mutex_unlock(&vmm->mutex);
+    return ret;
+}
+
+static int
+nvkm_uvmm_mthd_hmm_fini(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
+{
+    union {
+        struct nvif_vmm_hmm_v0 v0;
+    } *args = argv;
+    struct nvkm_vmm *vmm = uvmm->vmm;
+    int ret = -ENOSYS;
+
+    if ((ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false)))
+        return ret;
+
+    mutex_lock(&vmm->mutex);
+    nvkm_vmm_hmm_fini(vmm, args->v0.start, args->v0.end);
+    mutex_unlock(&vmm->mutex);
+    return 0;
+}
+
 static int
 nvkm_uvmm_mthd_hmm_map(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
 {
@@ -321,6 +358,8 @@ nvkm_uvmm_mthd(struct nvkm_object *object, u32 mthd, void *argv, u32 argc)
     case NVIF_VMM_V0_PUT      : return nvkm_uvmm_mthd_put      (uvmm, argv, argc);
     case NVIF_VMM_V0_MAP      : return nvkm_uvmm_mthd_map      (uvmm, argv, argc);
     case NVIF_VMM_V0_UNMAP    : return nvkm_uvmm_mthd_unmap    (uvmm, argv, argc);
+    case NVIF_VMM_V0_HMM_INIT : return nvkm_uvmm_mthd_hmm_init (uvmm, argv, argc);
+    case NVIF_VMM_V0_HMM_FINI : return nvkm_uvmm_mthd_hmm_fini (uvmm, argv, argc);
     case NVIF_VMM_V0_HMM_MAP  : return nvkm_uvmm_mthd_hmm_map  (uvmm, argv, argc);
     case NVIF_VMM_V0_HMM_UNMAP: return nvkm_uvmm_mthd_hmm_unmap(uvmm, argv, argc);
     default:
-- 2.14.3
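Taken together, a successful nvif_vmm_hmm_init(vmm, hstart, hend) leaves the GPU virtual address space split as in the sketch below; the 1 << 40 ceiling for the hole comes from the display-engine limitation discussed in the cover letter:

    /*
     * [0, PAGE_SIZE)       never HMM managed (NULL page stays unmapped)
     * [PAGE_SIZE, hstart)  HMM managed, mirrors the CPU page table
     * [hstart, hend)       the "hole": GEM objects only, kept below 1 << 40
     * [hend, TASK_SIZE)    HMM managed, mirrors the CPU page table
     */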
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 12/13] drm/nouveau: HMM area creation helpers for nouveau client
From: Jérôme Glisse <jglisse at redhat.com>

Helpers to create an area of virtual addresses under HMM control for a
nouveau client. GPU accesses to the HMM area are valid as long as the
hole vma exists in the process virtual address space.

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
Cc: Ben Skeggs <bskeggs at redhat.com>
---
 drivers/gpu/drm/nouveau/nouveau_hmm.c | 28 ++++++++++++
 drivers/gpu/drm/nouveau/nouveau_hmm.h |  1 +
 drivers/gpu/drm/nouveau/nouveau_vmm.c | 83 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/nouveau/nouveau_vmm.h | 12 +++++
 4 files changed, 124 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_hmm.c b/drivers/gpu/drm/nouveau/nouveau_hmm.c
index a4c6f687f6a8..680e29bbf367 100644
--- a/drivers/gpu/drm/nouveau/nouveau_hmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_hmm.c
@@ -245,6 +245,31 @@ nouveau_vmm_sync_pagetables(struct hmm_mirror *mirror,
                             unsigned long start,
                             unsigned long end)
 {
+    struct nouveau_hmm *hmm;
+    struct nouveau_cli *cli;
+
+    hmm = container_of(mirror, struct nouveau_hmm, mirror);
+    if (!hmm->hole.vma || hmm->hole.start == hmm->hole.end)
+        return;
+
+    /* Ignore area inside hole */
+    end = min(end, TASK_SIZE);
+    if (start >= hmm->hole.start && end <= hmm->hole.end)
+        return;
+    if (start < hmm->hole.start && end > hmm->hole.start) {
+        nouveau_vmm_sync_pagetables(mirror, update, start,
+                                    hmm->hole.start);
+        start = hmm->hole.end;
+    } else if (start < hmm->hole.end && start >= hmm->hole.start) {
+        start = hmm->hole.end;
+    }
+    if (end <= start)
+        return;
+
+    cli = container_of(hmm, struct nouveau_cli, hmm);
+    mutex_lock(&hmm->mutex);
+    nvif_vmm_hmm_unmap(&cli->vmm.vmm, start, (end - start) >> PAGE_SHIFT);
+    mutex_unlock(&hmm->mutex);
 }

 static const struct hmm_mirror_ops nouveau_hmm_mirror_ops = {
@@ -254,6 +279,8 @@ static const struct hmm_mirror_ops nouveau_hmm_mirror_ops = {
 void
 nouveau_hmm_fini(struct nouveau_cli *cli)
 {
+    struct nouveau_hmm *hmm = &cli->hmm;
+
     if (!cli->hmm.enabled)
         return;

@@ -262,6 +289,7 @@ nouveau_hmm_fini(struct nouveau_cli *cli)
     nvif_object_fini(&cli->hmm.rpfb);

     hmm_mirror_unregister(&cli->hmm.mirror);
+    nvif_vmm_hmm_fini(&cli->vmm.vmm, hmm->hole.start, hmm->hole.end);
     nouveau_vmm_sync_pagetables(&cli->hmm.mirror, HMM_UPDATE_INVALIDATE,
                                 PAGE_SIZE, TASK_SIZE);
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_hmm.h b/drivers/gpu/drm/nouveau/nouveau_hmm.h
index 47f31cf8ac56..bc68dcf0748b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_hmm.h
+++ b/drivers/gpu/drm/nouveau/nouveau_hmm.h
@@ -33,6 +33,7 @@
 #if defined(CONFIG_HMM_MIRROR) && defined(CONFIG_DEVICE_PRIVATE)

 struct nouveau_hmm {
+    struct nouveau_vmm_hole hole;
     struct nvif_object rpfb;
     struct nvif_notify pending;
     struct task_struct *task;
diff --git a/drivers/gpu/drm/nouveau/nouveau_vmm.c b/drivers/gpu/drm/nouveau/nouveau_vmm.c
index f5371d96b003..8e6c47a99edb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vmm.c
@@ -115,6 +115,89 @@ nouveau_vma_new(struct nouveau_bo *nvbo, struct nouveau_vmm *vmm,
     return ret;
 }

+static int
+vmm_hole_fault(struct vm_fault *vmf)
+{
+    return VM_FAULT_SIGBUS;
+}
+
+static void
+vmm_hole_open(struct vm_area_struct *vma)
+{
+    struct nouveau_cli *cli = vma->vm_private_data;
+    struct nouveau_vmm_hole *hole = &cli->hmm.hole;
+
+    /*
+     * No need for atomics, this happens under the mmap_sem write lock.
+     * Make sure this assumption holds with a BUG_ON()
+     */
+    BUG_ON(down_read_trylock(&vma->vm_mm->mmap_sem));
+    hole->count++;
+}
+
+static void
+vmm_hole_close(struct vm_area_struct *vma)
+{
+    struct nouveau_cli *cli = vma->vm_private_data;
+    struct nouveau_vmm_hole *hole = &cli->hmm.hole;
+
+    /*
+     * No need for atomics, this happens under the mmap_sem write lock,
+     * with one exception: when a process is being killed (from
+     * do_exit()). For that reason we don't check with BUG_ON().
+     */
+    if ((--hole->count) <= 0) {
+        nouveau_hmm_fini(cli);
+        hole->vma = NULL;
+    }
+}
+
+static int
+vmm_hole_access(struct vm_area_struct *vma, unsigned long addr,
+                void *buf, int len, int write)
+{
+    return -EIO;
+}
+
+static const struct vm_operations_struct vmm_hole_vm_ops = {
+    .access = vmm_hole_access,
+    .close = vmm_hole_close,
+    .fault = vmm_hole_fault,
+    .open = vmm_hole_open,
+};
+
+int
+nouveau_vmm_hmm(struct nouveau_cli *cli, struct file *file,
+                struct vm_area_struct *vma)
+{
+    struct nouveau_vmm_hole *hole = &cli->hmm.hole;
+    unsigned long size = vma->vm_end - vma->vm_start;
+    unsigned long pgsize = size >> PAGE_SHIFT;
+    int ret;
+
+    if ((vma->vm_pgoff + pgsize) > (DRM_FILE_PAGE_OFFSET + (4UL << 30)))
+        return -EINVAL;
+
+    if (!cli->hmm.enabled)
+        return -EINVAL;
+
+    hole->vma = vma;
+    hole->cli = cli;
+    hole->file = file;
+    hole->start = vma->vm_start;
+    hole->end = vma->vm_end;
+    hole->count = 1;
+
+    ret = nvif_vmm_hmm_init(&cli->vmm.vmm, vma->vm_start, vma->vm_end);
+    if (ret)
+        return ret;
+
+    vma->vm_private_data = cli;
+    vma->vm_ops = &vmm_hole_vm_ops;
+    vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+    return 0;
+}
+
 void
 nouveau_vmm_fini(struct nouveau_vmm *vmm)
 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_vmm.h b/drivers/gpu/drm/nouveau/nouveau_vmm.h
index 5c31f43678d3..43d30feb3057 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vmm.h
+++ b/drivers/gpu/drm/nouveau/nouveau_vmm.h
@@ -13,6 +13,15 @@ struct nouveau_vma {
     struct nouveau_mem *mem;
 };

+struct nouveau_vmm_hole {
+    struct vm_area_struct *vma;
+    struct nouveau_cli *cli;
+    struct file *file;
+    unsigned long start;
+    unsigned long end;
+    int count;
+};
+
 struct nouveau_vma *nouveau_vma_find(struct nouveau_bo *, struct nouveau_vmm *);
 int nouveau_vma_new(struct nouveau_bo *, struct nouveau_vmm *,
                     struct nouveau_vma **);
@@ -26,6 +35,9 @@ struct nouveau_vmm {
     struct nvkm_vm *vm;
 };

+int nouveau_vmm_hmm(struct nouveau_cli *cli, struct file *file,
+                    struct vm_area_struct *vma);
+
 int nouveau_vmm_init(struct nouveau_cli *, s32 oclass, struct nouveau_vmm *);
 void nouveau_vmm_fini(struct nouveau_vmm *);
 #endif
-- 2.14.3
jglisse at redhat.com
2018-Mar-10 03:21 UTC
[Nouveau] [RFC PATCH 13/13] drm/nouveau: HACK FOR HMM AREA
From: Jérôme Glisse <jglisse at redhat.com>

Allow userspace to create a virtual address range hole for GEM objects.

Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
---
 drivers/gpu/drm/nouveau/nouveau_ttm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index dff51a0ee028..eafde4c6b7d4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -172,6 +172,13 @@ nouveau_ttm_mmap(struct file *filp, struct vm_area_struct *vma)
     if (unlikely(vma->vm_pgoff < DRM_FILE_PAGE_OFFSET))
         return drm_legacy_mmap(filp, vma);

+    /* Hack for HMM */
+    if (vma->vm_pgoff < (DRM_FILE_PAGE_OFFSET + (4UL << 30))) {
+        struct nouveau_cli *cli = file_priv->driver_priv;
+
+        return nouveau_vmm_hmm(cli, filp, vma);
+    }
+
     return ttm_bo_mmap(filp, vma, &drm->ttm.bdev);
 }

@@ -305,7 +312,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
                               drm->ttm.bo_global_ref.ref.object,
                               &nouveau_bo_driver,
                               dev->anon_inode->i_mapping,
-                              DRM_FILE_PAGE_OFFSET,
+                              DRM_FILE_PAGE_OFFSET + (4UL << 30),
                               drm->client.mmu.dmabits <= 32 ? true : false);
     if (ret) {
         NV_ERROR(drm, "error initialising bo driver, %d\n", ret);
-- 2.14.3
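From userspace, the hack would be driven roughly as follows. This is a sketch: HOLE_OFFSET and HOLE_SIZE are hypothetical placeholders for the reserved device-file offset range above, and the only hard requirement from the series is that the resulting hole sits below 1 << 40:

    #include <assert.h>
    #include <err.h>
    #include <stdint.h>
    #include <sys/mman.h>

    /* Sketch: reserve a CPU VA hole for GEM objects by mmap()ing the
     * reserved device-file range with PROT_NONE; the driver routes this
     * through nouveau_vmm_hmm() instead of ttm_bo_mmap().  CPU access
     * to the hole faults with SIGBUS by design. */
    void *hole = mmap(NULL, HOLE_SIZE, PROT_NONE, MAP_SHARED,
                      drm_fd, HOLE_OFFSET);
    if (hole == MAP_FAILED)
        err(1, "mmap hole");
    assert((uintptr_t)hole + HOLE_SIZE <= (1ULL << 40));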
Christian König
2018-Mar-10 15:01 UTC
[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau
Good to have an example how to use HMM with an upstream driver.

On 10.03.2018 at 04:21, jglisse at redhat.com wrote:
> This patchset adds SVM (Share Virtual Memory) using HMM (Heterogeneous
> Memory Management) to the nouveau driver. SVM means that GPU threads
> spawn by GPU driver for a specific user process can access any valid
> CPU address in that process. A valid pointer is a pointer inside an
> area coming from mmap of private, share or regular file. Pointer to
> a mmap of a device file or special file are not supported.

BTW: The recent IOMMU patches which generalized the PASID handling call
this SVA, for shared virtual address space.

We should probably sync up with those guys at some point on what naming
to use.

> This is an RFC for few reasons technical reasons listed below and also
> because we are still working on a proper open source userspace (namely
> a OpenCL 2.0 for nouveau inside mesa). Open source userspace being a
> requirement for the DRM subsystem. I pushed in [1] a simple standalone
> program that can be use to test SVM through HMM with nouveau. I expect
> we will have a somewhat working userspace in the coming weeks, work
> being well underway and some patches have already been posted on mesa
> mailing list.

You could use the OpenGL extensions to import arbitrary user pointers
as a bringup use case for this.

I was hoping to do the same for my ATC/HMM work on radeonsi, and as far
as I know there are even piglit tests for that.

> They are work underway to revamp nouveau channel creation with a new
> userspace API. So we might want to delay upstreaming until this lands.
> We can stil discuss one aspect specific to HMM here namely the issue
> around GEM objects used for some specific part of the GPU. Some engine
> inside the GPU (engine are a GPU block like the display block which
> is responsible of scaning memory to send out a picture through some
> connector for instance HDMI or DisplayPort) can only access memory
> with virtual address below (1 << 40). To accomodate those we need to
> create a "hole" inside the process address space. This patchset have
> a hack for that (patch 13 HACK FOR HMM AREA), it reserves a range of
> device file offset so that process can mmap this range with PROT_NONE
> to create a hole (process must make sure the hole is below 1 << 40).
> I feel un-easy of doing it this way but maybe it is ok with other
> folks.

Well, we have essentially the same problem with pre-gfx9 AMD hardware.
Felix might have some advice on how it was solved for HSA.

Regards,
Christian.
Jerome Glisse
2018-Mar-10 17:55 UTC
[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau
On Sat, Mar 10, 2018 at 04:01:58PM +0100, Christian König wrote:
> Good to have an example how to use HMM with an upstream driver.

I have tried to keep the hardware-specific bits and the overall HMM
logic separated, so people can use it as an example without needing to
understand NVidia GPUs. I think I can still split the patches a bit
more along that line.

> On 10.03.2018 at 04:21, jglisse at redhat.com wrote:
> > This patchset adds SVM (Share Virtual Memory) using HMM (Heterogeneous
> > Memory Management) to the nouveau driver. SVM means that GPU threads
> > spawn by GPU driver for a specific user process can access any valid
> > CPU address in that process. A valid pointer is a pointer inside an
> > area coming from mmap of private, share or regular file. Pointer to
> > a mmap of a device file or special file are not supported.
>
> BTW: The recent IOMMU patches which generalized the PASID handling calls
> this SVA for shared virtual address space.
>
> We should probably sync up with those guys at some point what naming to use.

Let's create a committee to decide on the name ;)

> > This is an RFC for few reasons technical reasons listed below and also
> > because we are still working on a proper open source userspace (namely
> > a OpenCL 2.0 for nouveau inside mesa). Open source userspace being a
> > requirement for the DRM subsystem. I pushed in [1] a simple standalone
> > program that can be use to test SVM through HMM with nouveau. I expect
> > we will have a somewhat working userspace in the coming weeks, work
> > being well underway and some patches have already been posted on mesa
> > mailing list.
>
> You could use the OpenGL extensions to import arbitrary user pointers as
> bringup use case for this.
>
> I was hoping to do the same for my ATC/HMM work on radeonsi and as far
> as I know there are even piglit tests for that.

OpenGL extensions were a bit too limited when I checked them a long
time ago. I think we would rather have something like OpenCL ready, so
that it is easier to justify some of the more compute-only features.

My timeline is 4.18 for HMM inside nouveau upstream (roughly), as some
other changes to nouveau need to land first. So I am thinking (hoping
:)) that all the stars will be properly aligned by then.

> > They are work underway to revamp nouveau channel creation with a new
> > userspace API. So we might want to delay upstreaming until this lands.
> > We can stil discuss one aspect specific to HMM here namely the issue
> > around GEM objects used for some specific part of the GPU. Some engine
> > inside the GPU (engine are a GPU block like the display block which
> > is responsible of scaning memory to send out a picture through some
> > connector for instance HDMI or DisplayPort) can only access memory
> > with virtual address below (1 << 40). To accomodate those we need to
> > create a "hole" inside the process address space. This patchset have
> > a hack for that (patch 13 HACK FOR HMM AREA), it reserves a range of
> > device file offset so that process can mmap this range with PROT_NONE
> > to create a hole (process must make sure the hole is below 1 << 40).
> > I feel un-easy of doing it this way but maybe it is ok with other
> > folks.
>
> Well we have essentially the same problem with pre gfx9 AMD hardware.
> Felix might have some advise how it was solved for HSA.

Here my concern is around the API exposed to userspace for this
"hole"/reserved area.
I considered several options:
 - Have userspace allocate all objects needed by the GPU and mmap them
   at the proper VA (Virtual Address). This needs the kernel to do
   special handling for those, like blocking userspace access to
   sensitive objects (page table, command buffer, ...), so a lot of
   kernel changes. This somewhat makes sense with some of the nouveau
   API rework that has not landed yet.
 - Have the kernel directly create a PROT_NONE vma against the device
   file. The nice thing is that it is easier in the kernel to find a
   hole of the proper size in the proper range. But this is ugly and I
   think I would be stoned to death if I were to propose that.
 - Just pick a range and cross fingers that userspace never gets any of
   its allocations in it (at least any allocation it wants to use on
   the GPU).
 - Have userspace mmap with PROT_NONE a specific region of the device
   file to create this hole (this is what this patchset does). Note
   that PROT_NONE is not strictly needed, but this is what it would get
   anyway, as the device driver blocks any access to it.

Any other solution I missed?

I don't like any of the above ... but this is more of a taste thing.
The last option is the least ugly in my view. Also, in this patchset,
if userspace munmap()s the hole or any part of it, that kills HMM for
the process. This is an important detail for security and for
consistent results in front of buggy/rogue applications.

Other aspects bother me too, like whether I should create a region in
the device file so that the mmap must happen inside that region, or
whether I should just pick a single offset that triggers the special
mmap path. Nowadays I think none of the drm drivers properly handle an
mmap that crosses the DRM_FILE_OFFSET boundary, but we don't expect
those to happen either. So the latter option (single offset) would make
sense.

Cheers,
Jérôme
Daniel Vetter
2018-Mar-12 17:30 UTC
[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau
On Sat, Mar 10, 2018 at 04:01:58PM +0100, Christian König wrote:
> Good to have an example how to use HMM with an upstream driver.
>
> On 10.03.2018 at 04:21, jglisse at redhat.com wrote:
> > This patchset adds SVM (Share Virtual Memory) using HMM (Heterogeneous
> > Memory Management) to the nouveau driver. SVM means that GPU threads
> > spawn by GPU driver for a specific user process can access any valid
> > CPU address in that process. A valid pointer is a pointer inside an
> > area coming from mmap of private, share or regular file. Pointer to
> > a mmap of a device file or special file are not supported.
>
> BTW: The recent IOMMU patches which generalized the PASID handling calls
> this SVA for shared virtual address space.
>
> We should probably sync up with those guys at some point what naming to use.
>
> > This is an RFC for few reasons technical reasons listed below and also
> > because we are still working on a proper open source userspace (namely
> > a OpenCL 2.0 for nouveau inside mesa). Open source userspace being a
> > requirement for the DRM subsystem. I pushed in [1] a simple standalone
> > program that can be use to test SVM through HMM with nouveau. I expect
> > we will have a somewhat working userspace in the coming weeks, work
> > being well underway and some patches have already been posted on mesa
> > mailing list.
>
> You could use the OpenGL extensions to import arbitrary user pointers as
> bringup use case for this.
>
> I was hoping to do the same for my ATC/HMM work on radeonsi and as far as
> I know there are even piglit tests for that.

Yeah userptr seems like a reasonable bring-up use-case for stuff like
this, makes it all a bit more manageable. I suggested the same for the
i915 efforts. Definitely has my ack for upstream HMM/SVM uapi
extensions.

> > They are work underway to revamp nouveau channel creation with a new
> > userspace API. So we might want to delay upstreaming until this lands.
> > We can stil discuss one aspect specific to HMM here namely the issue
> > around GEM objects used for some specific part of the GPU. Some engine
> > inside the GPU (engine are a GPU block like the display block which
> > is responsible of scaning memory to send out a picture through some
> > connector for instance HDMI or DisplayPort) can only access memory
> > with virtual address below (1 << 40). To accomodate those we need to
> > create a "hole" inside the process address space. This patchset have
> > a hack for that (patch 13 HACK FOR HMM AREA), it reserves a range of
> > device file offset so that process can mmap this range with PROT_NONE
> > to create a hole (process must make sure the hole is below 1 << 40).
> > I feel un-easy of doing it this way but maybe it is ok with other
> > folks.
>
> Well we have essentially the same problem with pre gfx9 AMD hardware.
> Felix might have some advise how it was solved for HSA.

Couldn't we do an in-kernel address space for those special gpu blocks?
As long as it's display the kernel needs to manage it anyway, and
adding a 2nd mapping when you pin/unpin for scanout usage shouldn't
really matter (as long as you cache the mapping until the buffer gets
thrown out of vram).

More-or-less what we do for i915 (where we have an entirely separate
address space for these things which is 4G on the latest chips).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Felix Kuehling
2018-Mar-12 18:28 UTC
[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau
On 2018-03-10 10:01 AM, Christian König wrote:
>> To accomodate those we need to
>> create a "hole" inside the process address space. This patchset have
>> a hack for that (patch 13 HACK FOR HMM AREA), it reserves a range of
>> device file offset so that process can mmap this range with PROT_NONE
>> to create a hole (process must make sure the hole is below 1 << 40).
>> I feel un-easy of doing it this way but maybe it is ok with other
>> folks.
>
> Well we have essentially the same problem with pre gfx9 AMD hardware.
> Felix might have some advise how it was solved for HSA.

For pre-gfx9 hardware we reserve address space in user mode using a big
mmap PROT_NONE call at application start. Then we manage the address
space in user mode and use MAP_FIXED to map buffers at specific
addresses within the reserved range.

The big address space reservation causes issues for some debugging
tools (clang-sanitizer was mentioned to me), so with gfx9 we're going
to get rid of this address space reservation.

Regards,
Felix
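For comparison, the pre-gfx9 reservation scheme Felix describes amounts to something like the following userspace sketch (not the actual ROCm code; RESERVE_SIZE, offset, and buf_size are assumptions):

    #include <sys/mman.h>

    /* Reserve a large address range up front without backing it. */
    void *base = mmap(NULL, RESERVE_SIZE, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    /* Later, place a buffer at a chosen address inside the range;
     * MAP_FIXED atomically replaces the PROT_NONE reservation. */
    void *buf = mmap((char *)base + offset, buf_size,
                     PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);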