search for: gmmu

Displaying 14 results from an estimated 14 matches for "gmmu".

2015 Apr 17
2
[PATCH 0/6] map big page by platform IOMMU
...e TTM_PL_TT, like GK20A, can > Nit: GK20A can *only* allocate GPU memory from TTM_PL_TT. Trying to > allocate from VRAM will result in an error. Yep. > >> *allocate* big page though the IOMMU hardware inside the SoC. This is a try >> to map the imported buffers as big pages in GMMU by the platform IOMMU. With >> some preparation work to map decreate small pages into big page(s) by IOMMU > decreate? It should be discrete. Sorry for the typo. > >> the GMMU eventually sees the imported buffer as chunks of big pages and does >> the mapping. And then we c...
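
The approach discussed in this thread can be summarized with the generic Linux IOMMU API (2015-era signatures): map each discrete 4 KiB page at consecutive I/O virtual addresses so that the range presented to the GMMU is contiguous and can be treated as a big page. The helper below is a minimal illustration of that idea, not code from Vince's series; the function name and the rollback policy are assumptions.

#include <linux/io.h>
#include <linux/iommu.h>
#include <linux/mm.h>

/* Illustrative only: lay out npages discrete small pages back to back in
 * IOVA space so the GMMU can see them as one big-page-sized region. */
static int map_small_pages_contiguously(struct iommu_domain *domain,
                                        unsigned long iova_base,
                                        struct page **pages,
                                        unsigned int npages)
{
        unsigned int i;
        int ret;

        for (i = 0; i < npages; i++) {
                ret = iommu_map(domain, iova_base + i * PAGE_SIZE,
                                page_to_phys(pages[i]), PAGE_SIZE,
                                IOMMU_READ | IOMMU_WRITE);
                if (ret) {
                        if (i)  /* undo any partial mapping */
                                iommu_unmap(domain, iova_base,
                                            i * PAGE_SIZE);
                        return ret;
                }
        }
        return 0;
}
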
2015 Apr 16
2
[PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages
...AMD one is)? Is there some sort of shared API for this stuff that you should be (or are?) using? -ilia On Thu, Apr 16, 2015 at 7:06 AM, Vince Hsu <vinceh at nvidia.com> wrote: > This patch uses IOMMU to aggregate (probably) discrete small pages as larger > big page(s) and map it to GMMU. > > Signed-off-by: Vince Hsu <vinceh at nvidia.com> > --- > drm/nouveau/nvkm/engine/device/gk104.c | 2 +- > drm/nouveau/nvkm/subdev/mmu/Kbuild | 1 + > drm/nouveau/nvkm/subdev/mmu/gk20a.c | 253 +++++++++++++++++++++++++++++++++ > 3 files changed, 255 insert...
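
On the "shared API" question: the cross-vendor interface available in mainline at the time is the generic IOMMU API. Below is a minimal sketch of how a Tegra-class GPU driver might obtain and attach a domain through it; this is context for the review comment, not part of the gk20a patch, and the function name is made up.

#include <linux/iommu.h>
#include <linux/platform_device.h>

static struct iommu_domain *gpu_iommu_domain_get(struct device *dev)
{
        struct iommu_domain *domain;

        /* allocate a domain on the bus the GPU sits on */
        domain = iommu_domain_alloc(&platform_bus_type);
        if (!domain)
                return NULL;

        if (iommu_attach_device(domain, dev)) {
                iommu_domain_free(domain);
                return NULL;
        }

        /* iommu_map()/iommu_unmap() can now be called on this domain */
        return domain;
}
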
2015 Apr 20
3
[PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
...ngs. > > Doing IOMMU mapping for the whole buffer with dma_map_sg is also faster than > mapping page by page, because you can do only one TLB invalidate in the end > of the loop instead of after every page if you use dma_map_single. > > All of these would talk for having IOMMU and GMMU mapping loops separate. > This patch set does not implement both the advantages above, but your > suggestion would take us further away from that than Vince's version. Aha, looks like both Vince and I overlooked this point. So IIUC we would need to make sure a GPU buffer is only ever map...
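
The performance point made above can be shown with a small sketch (the device and scatter-gather table are assumed to exist already): a single dma_map_sg() call maps the whole list and lets the IOMMU driver batch its TLB maintenance, whereas calling dma_map_single() once per page pays that cost on every iteration.

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Preferred: map the whole scatterlist in one call; the IOMMU driver can
 * then get away with a single TLB invalidate instead of one per page. */
static int map_whole_buffer(struct device *dev, struct sg_table *sgt)
{
        int count = dma_map_sg(dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);

        return count ? 0 : -EIO;
}
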
2015 Apr 17
2
[PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh at nvidia.com> wrote: > This patch implements a way to aggregate the small pages and make them be > mapped as big page(s) by utilizing the platform IOMMU if supported. And then > we can enable compression support for these big pages later. > > Signed-off-by: Vince Hsu <vinceh at nvidia.com> > --- >
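
Since the point of the aggregation is to hand the GMMU big-page-sized, big-page-aligned ranges, a sanity check like the one below is implied. This is only a sketch; the 128 KiB big page size is an assumption about GK20A-class GMMUs, not a value taken from the patch.

#include <linux/kernel.h>

#define GMMU_BIG_PAGE_SIZE      (128 << 10)     /* assumed big page size */

/* Only attempt big-page mapping when both the chosen IOVA and the buffer
 * length line up with the big page size. */
static bool fits_big_pages(u64 iova, u64 size)
{
        return IS_ALIGNED(iova, GMMU_BIG_PAGE_SIZE) &&
               IS_ALIGNED(size, GMMU_BIG_PAGE_SIZE);
}
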
2014 Feb 01
2
[RFC 14/16] drm/nouveau/fb: add GK20A support
Am Samstag, den 01.02.2014, 18:28 -0500 schrieb Ilia Mirkin: > On Sat, Feb 1, 2014 at 8:40 AM, Lucas Stach <dev at lynxeye.de> wrote: > > Am Samstag, den 01.02.2014, 12:16 +0900 schrieb Alexandre Courbot: > >> Add a clumsy-but-working FB support for GK20A. This chip only uses system > >> memory, so we allocate a big chunk using CMA and let the existing memory >
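
The "big chunk using CMA" mentioned here usually comes down to a coherent DMA allocation, which the CMA allocator backs on these platforms. A minimal sketch follows, with an illustrative helper name and no claim about the actual GK20A FB code.

#include <linux/dma-mapping.h>
#include <linux/gfp.h>

static void *fb_alloc_contiguous(struct device *dev, size_t size,
                                 dma_addr_t *dma_addr)
{
        /* physically contiguous, CMA-backed when CONFIG_DMA_CMA is set */
        return dma_alloc_coherent(dev, size, dma_addr, GFP_KERNEL);
        /* release with dma_free_coherent(dev, size, cpu_addr, *dma_addr) */
}
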
2015 Apr 16
15
[PATCH 0/6] map big page by platform IOMMU
...hich has memory type TTM_PL_TT are mapped as small pages probably due to lack of big page allocation. But the platform device which also use memory type TTM_PL_TT, like GK20A, can *allocate* big page through the IOMMU hardware inside the SoC. This is a try to map the imported buffers as big pages in GMMU by the platform IOMMU. With some preparation work to map decreate small pages into big page(s) by IOMMU the GMMU eventually sees the imported buffer as chunks of big pages and does the mapping. And then we can probably do the compression on the imported buffer which is composed of non-contiguous sm...
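
Before any of the IOMMU or DMA-API mapping above can happen, the imported buffer's discrete small pages are normally gathered into a scatterlist. The helper below is a sketch of that step using the stock sg_alloc_table_from_pages() helper; the wrapper name is made up.

#include <linux/mm.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

static int gather_small_pages(struct sg_table *sgt, struct page **pages,
                              unsigned int npages)
{
        /* builds sg entries, merging physically adjacent pages where possible */
        return sg_alloc_table_from_pages(sgt, pages, npages, 0,
                                         (unsigned long)npages * PAGE_SIZE,
                                         GFP_KERNEL);
}
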
2014 Feb 02
0
[RFC 14/16] drm/nouveau/fb: add GK20A support
...(you could enable the SMMU and make things more interesting/complex, but for now it seems untimely to even consider doing so). Actually even the concept of a GART is not needed here: all your memory management needs could be fulfilled by getting pages with alloc_page() and arranging them using the GMMU. No GART, no BAR (at least for the purpose of mapping objects for CPU access), no PRAMIN. I really wonder how that picture would fit within Nouveau, and it is quite likely that there is an elegant solution to this problem already that my lack of understanding of Nouveau prevents me from seeing. Th...
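
The "alloc_page() and arrange them using the GMMU" idea reduces to keeping an array of individually allocated system pages and letting the GMMU page tables provide the GPU-contiguous view. A minimal sketch of the allocation side (names and error policy are illustrative, not Nouveau code):

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/slab.h>

static struct page **alloc_backing_pages(unsigned int npages)
{
        struct page **pages;
        unsigned int i;

        pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
        if (!pages)
                return NULL;

        for (i = 0; i < npages; i++) {
                pages[i] = alloc_page(GFP_KERNEL);
                if (!pages[i])
                        goto err_free;
        }
        return pages;   /* hand these to the GMMU mapping code */

err_free:
        while (i--)
                __free_page(pages[i]);
        kfree(pages);
        return NULL;
}
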
2015 Apr 08
3
[PATCH V2] pmu/gk20a: PMU boot support.
..., 0x1000, 0, &pmu->ucode.obj); + if (ret) + goto fw_alloc_err; + + ucode_image = (u32 *)((u8 *)desc + desc->descriptor_size); + for (i = 0; i < (desc->app_start_offset + desc->app_size); i += 4) + nv_wo32(pmu->ucode.obj, i, ucode_image[i/4]); + + /* map allocated memory into GMMU */ + ret = nvkm_gpuobj_map_vm(nv_gpuobj(pmu->ucode.obj), vm, + NV_MEM_ACCESS_RW, &pmu->ucode.vma); + if (ret) + goto map_err; + + nv_debug(ppmu, "%s function end\n", __func__); + return ret; +map_err: + nvkm_gpuobj_destroy(pmu->ucode.obj); +fw_alloc_err: + nvkm_gpuobj...
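
Laid out as normal code, the flow compressed into the preview above is roughly the following. It reuses only the helpers visible in the snippet (nv_wo32, nvkm_gpuobj_map_vm, nvkm_gpuobj_destroy); the struct names for the PMU state and the firmware descriptor are guesses based on this and the v4 posting, and the allocation step is omitted because it is truncated in the preview.

/* Sketch only: copy the PMU firmware image into the gpuobj one 32-bit
 * word at a time, then map the object into the GMMU-managed VM. */
static int gk20a_pmu_load_ucode(struct gk20a_pmu_priv *pmu,
                                struct nvkm_vm *vm,
                                struct pmu_ucode_desc *desc,
                                const u32 *ucode_image)
{
        u32 i;
        int ret;

        for (i = 0; i < desc->app_start_offset + desc->app_size; i += 4)
                nv_wo32(pmu->ucode.obj, i, ucode_image[i / 4]);

        /* map the allocated memory into the GMMU */
        ret = nvkm_gpuobj_map_vm(nv_gpuobj(pmu->ucode.obj), vm,
                                 NV_MEM_ACCESS_RW, &pmu->ucode.vma);
        if (ret)
                nvkm_gpuobj_destroy(pmu->ucode.obj);

        return ret;
}
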
2015 Apr 13
3
[PATCH v4] pmu/gk20a: PMU boot support
...0x1000, 0, &priv->ucode.obj); + if (ret) + return ret; + + ucode_image = (u32 *)((u8 *)desc + desc->descriptor_size); + for (i = 0; i < (desc->app_start_offset + desc->app_size); i += 4) + nv_wo32(priv->ucode.obj, i, ucode_image[i/4]); + + /* map allocated memory into GMMU */ + ret = nvkm_gpuobj_map_vm(priv->ucode.obj, vm, NV_MEM_ACCESS_RW, + &priv->ucode.vma); + if (ret) + return ret; + + return ret; +} + +static int +gk20a_init_pmu_setup_sw(struct gk20a_pmu_priv *priv) +{ + struct nvkm_pmu_priv_vm *pmuvm = &priv->pmuvm; + int ret = 0; + + INIT...
2015 Jul 07
5
CUDA fixed VA allocations and sparse mappings
Hello, I am currently looking into ways to support fixed virtual address allocations and sparse mappings in nouveau, as a step towards supporting CUDA. CUDA requires that the GPU virtual address for a given buffer match the CPU virtual address. Therefore, when mapping a CUDA buffer, we have to have a way of specifying a particular virtual address to map to (we would ask that the CPU virtual
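
One way to satisfy the "GPU VA must equal CPU VA" rule from userspace is sketched below: reserve the CPU address range first with an inaccessible anonymous mmap(), then ask the kernel to place the GPU mapping at that same address. The ioctl request struct here is a hypothetical placeholder for whatever fixed-address interface nouveau would grow; only the mmap() reservation is standard.

#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

struct nouveau_map_fixed_req {          /* hypothetical ioctl payload */
        uint32_t handle;                /* GEM handle of the buffer */
        uint64_t va;                    /* requested GPU virtual address */
};

/* Reserve a CPU VA range without backing it; the same address would then
 * be requested as the GPU VA so that CUDA's pointer-equality rule holds. */
static void *reserve_matching_va(size_t size)
{
        void *va = mmap(NULL, size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        return va == MAP_FAILED ? NULL : va;
}
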
2015 Apr 30
2
[PATCH v4] pmu/gk20a: PMU boot support
...> + ucode_image = (u32 *)((u8 *)desc + desc->descriptor_size); >> + for (i = 0; i < (desc->app_start_offset + desc->app_size); i += 4) >> + nv_wo32(priv->ucode.obj, i, ucode_image[i/4]); >> + >> + /* map allocated memory into GMMU */ >> + ret = nvkm_gpuobj_map_vm(priv->ucode.obj, vm, NV_MEM_ACCESS_RW, >> + &priv->ucode.vma); >> + if (ret) >> + return ret; >> + >> + return ret; >> +} >> + >> +static i...
2015 Mar 11
0
[PATCH] pmu/gk20a: PMU boot support.
...(i = 0; i < (desc->app_start_offset + desc->app_size) >> 2; i++) { > + nv_wo32(pmu->ucode.pmubufobj, i << 2, ucode_image[i]); > + pr_info("writing 0x%08x\n", ucode_image[i]); > + } > + /* map allocated memory into GMMU */ > + ret = nvkm_gpuobj_map_vm(nv_gpuobj(pmu->ucode.pmubufobj), vm, > + NV_MEM_ACCESS_RW, > + &pmu->ucode.pmubufvma); > + if (ret) > + goto map_err; > + > + nv_debug(p...
2015 Mar 11
3
[PATCH] pmu/gk20a: PMU boot support.
...ge = (u32 *)((u32)desc + desc->descriptor_size); + for (i = 0; i < (desc->app_start_offset + desc->app_size) >> 2; i++) { + nv_wo32(pmu->ucode.pmubufobj, i << 2, ucode_image[i]); + pr_info("writing 0x%08x\n", ucode_image[i]); + } + /* map allocated memory into GMMU */ + ret = nvkm_gpuobj_map_vm(nv_gpuobj(pmu->ucode.pmubufobj), vm, + NV_MEM_ACCESS_RW, + &pmu->ucode.pmubufvma); + if (ret) + goto map_err; + + nv_debug(ppmu, "%s function end\n", __func__); + return ret; +map_err: + nvkm_gpuobj_destroy(pmu->ucode.pmubufobj); +...
2015 Mar 12
2
[PATCH] pmu/gk20a: PMU boot support.
...(i = 0; i < (desc->app_start_offset + desc->app_size) >> 2; i++) { > + nv_wo32(pmu->ucode.pmubufobj, i << 2, ucode_image[i]); > + pr_info("writing 0x%08x\n", ucode_image[i]); > + } > + /* map allocated memory into GMMU */ > + ret = nvkm_gpuobj_map_vm(nv_gpuobj(pmu->ucode.pmubufobj), vm, > + NV_MEM_ACCESS_RW, > + &pmu->ucode.pmubufvma); > + if (ret) > + goto map_err; > + > + nv_debug(p...