similar to: CUDA fixed VA allocations and sparse mappings

Displaying 20 results from an estimated 2000 matches similar to: "CUDA fixed VA allocations and sparse mappings"

2015 Jul 07
2
CUDA fixed VA allocations and sparse mappings
On Tue, Jul 07, 2015 at 11:29:38AM -0400, Ilia Mirkin wrote: > On Mon, Jul 6, 2015 at 8:42 PM, Andrew Chew <achew at nvidia.com> wrote: > > Hello, > > > > I am currently looking into ways to support fixed virtual address allocations > > and sparse mappings in nouveau, as a step towards supporting CUDA. > > > > CUDA requires that the GPU virtual address
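The requirement under discussion (reserve a GPU virtual address range up front, back it with physical memory later) is what CUDA's later virtual memory management driver API exposes directly. A minimal sketch of that pattern, assuming a CUDA 10.2+ toolkit and a device reporting CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED; error handling is elided:

/* Sketch: reserve a fixed GPU VA range, then sparsely back one chunk of it.
 * Uses the CUDA driver VMM API (CUDA 10.2+); error handling elided. */
#include <cuda.h>
#include <stdio.h>

int main(void)
{
    CUdeviceptr va;
    CUmemGenericAllocationHandle handle;
    CUmemAllocationProp prop = {0};
    CUmemAccessDesc access = {0};
    size_t chunk = 2 << 20;   /* one 2 MiB chunk of physical backing */
    size_t range = 64 << 20;  /* 64 MiB of reserved, mostly unbacked VA */

    CUcontext ctx;
    CUdevice dev;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    /* Reserve a contiguous VA range with no physical backing behind it. */
    cuMemAddressReserve(&va, range, 0, 0, 0);

    /* Create physical memory and map it into the first chunk of the range. */
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = dev;
    cuMemCreate(&handle, chunk, &prop, 0);
    cuMemMap(va, chunk, 0, handle, 0);

    /* Make the mapped chunk accessible before any kernel touches it. */
    access.location = prop.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    cuMemSetAccess(va, chunk, &access, 1);

    printf("reserved VA %p, backed first %zu bytes\n", (void *)va, chunk);

    cuMemUnmap(va, chunk);
    cuMemRelease(handle);
    cuMemAddressFree(va, range);
    cuCtxDestroy(ctx);
    return 0;
}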
2015 Jul 08
2
CUDA fixed VA allocations and sparse mappings
On Wed, Jul 08, 2015 at 10:37:34AM +1000, Ben Skeggs wrote: > On 8 July 2015 at 10:31, Andrew Chew <achew at nvidia.com> wrote: > > On Wed, Jul 08, 2015 at 10:18:36AM +1000, Ben Skeggs wrote: > >> > There's some minimal state that needs to be mapped into GPU address space. > >> > One thing that comes to mind are pushbuffers, which are needed to submit
2018 Feb 13
2
[drm-nouveau-mmu] question about potential NULL pointer dereference
Hi all,

While doing some static analysis I ran into the following piece of code at drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c:957:

957 #define node(root, dir) ((root)->head.dir == &vmm->list) ? NULL : \
958 	list_entry((root)->head.dir, struct nvkm_vma, head)
959
960 void
961 nvkm_vmm_unmap_region(struct nvkm_vmm *vmm, struct nvkm_vma *vma)
962 {
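The worry here is that node() evaluates to NULL whenever the neighbour is the list head, so any caller that dereferences the result unconditionally would crash. A reduced userspace sketch of the same pattern, with simplified stand-in types rather than the actual nouveau structures:

/* Reduced model of the node() pattern from vmm.c: the macro yields NULL
 * when the neighbour is the list head, so callers must check the result. */
#include <stddef.h>
#include <stdio.h>

struct vma {
    struct vma *prev, *next;   /* stand-in for the kernel's list_head */
    unsigned long addr;
};

/* Returns the neighbour in direction `dir`, or NULL at the list boundary. */
#define node(list, v, dir) ((v)->dir == (list) ? NULL : (v)->dir)

int main(void)
{
    struct vma head = { &head, &head, 0 };     /* circular list head */
    struct vma a = { &head, &head, 0x1000 };   /* sole region on the list */
    head.next = head.prev = &a;

    struct vma *prev = node(&head, &a, prev);
    if (prev)                       /* the check static analysis asks for */
        printf("prev region at %#lx\n", prev->addr);
    else
        printf("no previous region\n");
    return 0;
}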
2015 Apr 16
15
[PATCH 0/6] map big page by platform IOMMU
Hi, Generally the imported buffers which have memory type TTM_PL_TT are mapped as small pages, probably due to the lack of big page allocation. But a platform device which also uses memory type TTM_PL_TT, like GK20A, can *allocate* big pages through the IOMMU hardware inside the SoC. This is an attempt to map the imported buffers as big pages in the GMMU via the platform IOMMU. With some preparation work to
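To illustrate the aggregation idea in the cover letter: a GPU big page (128 KiB on GK20A-class hardware, assumed here) is assembled from scattered 4 KiB system pages by mapping them back-to-back in IOMMU space, so the GMMU can cover them with a single big-page PTE. A toy sketch of that address arithmetic, not the kernel code itself:

/* Toy sketch: scattered 4 KiB pages are laid out contiguously in IOMMU
 * space so the GMMU sees one 128 KiB big page. Sizes are assumptions
 * (GK20A-style big page); this is not the actual kernel code. */
#include <stdio.h>

#define SMALL_PAGE (4UL << 10)     /* 4 KiB CPU/system page */
#define BIG_PAGE   (128UL << 10)   /* 128 KiB GPU big page */

int main(void)
{
    unsigned long iova = 0x80000000UL;   /* hypothetical IOVA window base */
    unsigned long pages = BIG_PAGE / SMALL_PAGE;

    /* In the real series each small page would be iommu_map()ed at
     * iova + i * SMALL_PAGE; here we only print the layout built up. */
    for (unsigned long i = 0; i < pages; i++)
        printf("small page %2lu -> IOVA %#lx\n", i, iova + i * SMALL_PAGE);

    printf("%lu small pages aggregate into one %lu KiB big page at %#lx\n",
           pages, BIG_PAGE >> 10, iova);
    return 0;
}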
2015 Jul 08
3
CUDA fixed VA allocations and sparse mappings
On Tue, Jul 07, 2015 at 08:13:28PM -0400, Ilia Mirkin wrote: > On Tue, Jul 7, 2015 at 8:11 PM, C Bergström <cbergstrom at pathscale.com> wrote: > > On Wed, Jul 8, 2015 at 7:08 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: > >> On Tue, Jul 7, 2015 at 8:07 PM, C Bergström <cbergstrom at pathscale.com> wrote: > >>> On Wed, Jul 8, 2015 at 6:58 AM, Ben
2018 Mar 10
17
[RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau
From: Jérôme Glisse <jglisse at redhat.com> (mm is cced just to allow exposure of device driver work without ccing a long list of people. I do not think there is anything useful to discuss from the mm point of view, but I might be wrong, so just for the curious :)). git://people.freedesktop.org/~glisse/linux branch: nouveau-hmm-v00
2012 Jul 26
6
[LLVMdev] [PROPOSAL] LLVM multi-module support
Hi, a couple of weeks ago I discussed with Peter how to improve LLVM's support for heterogeneous computing. One weakness we (and others) have seen is the absence of multi-module support in LLVM. Peter came up with a nice idea for how to improve here. I would like to put this idea up for discussion. ## The problem ## LLVM-IR modules can currently only contain code for a single target
2015 Apr 20
3
[PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
On Sat, Apr 18, 2015 at 12:37 AM, Terje Bergstrom <tbergstrom at nvidia.com> wrote: > > On 04/17/2015 02:11 AM, Alexandre Courbot wrote: >> >> Tracking the PDE and PTE of each memory chunk can probably be avoided >> if you change your unmapping strategy. Currently you are going through >> the list of nvkm_vm_bp_list, but you know your PDE and PTE are always
2015 Apr 17
2
[PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh at nvidia.com> wrote: > This patch implements a way to aggregate the small pages and map them as > big page(s) by utilizing the platform IOMMU if supported. And then we can > enable compression support for these big pages later. > > Signed-off-by: Vince Hsu <vinceh at nvidia.com> > --- >
2020 Jun 22
7
[RESEND PATCH 0/3] nouveau: fixes for SVM
These are based on 5.8.0-rc2 and intended for Ben Skeggs' nouveau tree. I believe the changes can be queued for 5.8-rcX after being reviewed. These were part of a larger series but I'm resending them separately as suggested by Jason Gunthorpe. https://lore.kernel.org/linux-mm/20200619215649.32297-1-rcampbell at nvidia.com/ Note that in order to exercise/test patch 2 here, you will need a
2012 Jul 26
7
[LLVMdev] [PROPOSAL] LLVM multi-module support
In our project we combine regular binary code and LLVM IR code for kernels, embedded as a special data symbol of the ELF object. The LLVM IR for a kernel that exists at compile time is preliminary and may be optimized further at runtime (pointer analysis, Polly, etc.). During application startup, the runtime system builds an index of all kernel sources embedded into the executable. Host and kernel code
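The embedding step described above can be sketched with the LLVM-C API: serialize the kernel module to bitcode in memory and attach it to the host module as a constant byte array, which the backend then emits as a data symbol. Names such as __kernel_bitcode are illustrative, not the poster's actual code:

/* Sketch: embed a kernel module's bitcode as a data symbol of the host
 * module, roughly the scheme described above. Illustrative names only. */
#include <llvm-c/Core.h>
#include <llvm-c/BitWriter.h>

void embed_kernel(LLVMModuleRef host, LLVMModuleRef kernel)
{
    /* Serialize the (preliminary) kernel IR to an in-memory bitcode blob. */
    LLVMMemoryBufferRef buf = LLVMWriteBitcodeToMemoryBuffer(kernel);
    const char *data = LLVMGetBufferStart(buf);
    size_t size = LLVMGetBufferSize(buf);

    /* Store the blob as a private constant byte array; at startup the
     * runtime locates this symbol and hands the bitcode to the optimizer. */
    LLVMTypeRef ty = LLVMArrayType(LLVMInt8Type(), (unsigned)size);
    LLVMValueRef gv = LLVMAddGlobal(host, ty, "__kernel_bitcode");
    LLVMSetInitializer(gv, LLVMConstString(data, (unsigned)size, 1));
    LLVMSetGlobalConstant(gv, 1);
    LLVMSetLinkage(gv, LLVMPrivateLinkage);

    LLVMDisposeMemoryBuffer(buf);
}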
2012 Jul 26
0
[LLVMdev] [PROPOSAL] LLVM multi-module support
Hi Tobias, I didn't really get it. Is the idea that the same bitcode is going to be codegen'd for different architectures, or is each sub-module going to contain different bitcode? In the latter case you may as well just use multiple modules, perhaps in conjunction with a scheme to store more than one module in the same file on disk as a convenience. Ciao, Duncan. > a couple of weeks
2019 Apr 04
1
Proof of concept for GPU forwarding for Linux guest on Linux host.
Hi, This is a proof of concept of GPU forwarding for a Linux guest on a Linux host. I'd like to get comments and suggestions from the community before I put more time into it. To summarize what it is: 1. It's a solution to bring GPU acceleration to a Linux VM guest on a Linux host. It could work with different GPUs, although the current proof of concept only works with an Intel GPU. 2. The basic idea
2015 Jul 08
2
CUDA fixed VA allocations and sparse mappings
On Wed, Jul 08, 2015 at 10:18:36AM +1000, Ben Skeggs wrote: > > There's some minimal state that needs to be mapped into GPU address space. > > One thing that comes to mind are pushbuffers, which are needed to submit > > stuff to any engine. > I guess you can probably use the start of the kernel's address space > carveout for these kinds of mappings, actually?
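A toy model of the split being suggested: a fixed carveout at the bottom of the GPU VA space for driver-internal buffers such as pushbuffers, with the remainder left for user-requested fixed-VA mappings. All addresses and sizes below are illustrative assumptions, not nouveau's actual layout:

/* Toy model: a kernel carveout at the bottom of the GPU VA space for
 * driver-internal mappings (pushbuffers etc.), the rest left for
 * user-requested fixed-VA allocations. Illustrative values only. */
#include <stdio.h>

#define VA_BASE         0x0000000000100000UL
#define VA_SIZE         (1UL << 40)   /* 1 TiB of GPU VA */
#define KERNEL_CARVEOUT (1UL << 30)   /* 1 GiB reserved for the kernel */

static unsigned long kern_next = VA_BASE;

/* Bump-allocate driver-internal mappings inside the carveout. */
static unsigned long kernel_va_alloc(unsigned long size)
{
    if (kern_next + size > VA_BASE + KERNEL_CARVEOUT)
        return 0;                     /* carveout exhausted */
    unsigned long va = kern_next;
    kern_next += size;
    return va;
}

int main(void)
{
    unsigned long pushbuf = kernel_va_alloc(64 << 10);
    printf("pushbuffer VA %#lx (inside carveout)\n", pushbuf);
    printf("user fixed-VA range %#lx - %#lx\n",
           VA_BASE + KERNEL_CARVEOUT, VA_BASE + VA_SIZE);
    return 0;
}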
2016 Jun 07
3
NVPTX compilation problems - ptxas error
Hello everybody, I am currently testing the NVPTX back-end and playing around with the IR it generates. Unfortunately I have run into a compilation error I cannot solve on my own. Maybe someone reading this knows what is causing the trouble and has a possible solution. I am using Ubuntu 16.04, CUDA 7.5 and clang version 3.9.0 (https://github.com/llvm-mirror/clang.git
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
I was wondering if anyone has encountered this issue when cross-compiling CUDA for an NVIDIA TX2 running Android. The error is: In file included from <built-in>:1: In file included from prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219: ../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19: error: no matching function
2017 Aug 02
2
CUDA compilation "No available targets are compatible with this triple." problem
Yes, I followed the guide. The same error showed up: >clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 -I/usr/local/cuda/include -lcudart_static -ldl -lrt -pthread error: unable to create target: 'No available targets are compatible with this triple.' ________________________________ From: Kevin Choi <code.kchoi at gmail.com> Sent: Wednesday, August 2,
2020 Jul 30
2
Status of CUDA 11 support
Hi, I work in a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM. From the LLVM commit history, I can see that work on CUDA 11 has started. Is this currently being worked on? What is the remaining work left? And is any help needed to finish
2017 Oct 05
4
CUDA tools?
vychytraly . wrote: > On Thu, Oct 5, 2017 at 9:51 PM, <m.roth at 5-cent.us> wrote: >> >> So, kmod-nvidia is installed. Trouble is, I have no tool to test it. And my >> user might need nvcc, which, of course, is only provided by the NVIDIA >> CUDA toolkit, which won't install because it conflicts with kmod-nvidia. >> >> Has *anyone* dealt with this? If so,