search for: gpucc

Displaying 17 results from an estimated 17 matches for "gpucc".

2016 Aug 01
0
[GPUCC] link against libdevice
Directly CC-ing some folks who may be able to help. On Fri, Jul 29, 2016 at 6:27 AM Yuanfeng Peng via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > I was trying to compile scalarProd.cu (from CUDA SDK) with the following > command: > > * clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc > --cuda-gpu-arch=sm_50 scalarProd.cu* > > but ended up with
2016 Jul 29
2
[GPUCC] link against libdevice
Hi, I was trying to compile scalarProd.cu (from CUDA SDK) with the following command: * clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc --cuda-gpu-arch=sm_50 scalarProd.cu* but ended up with the following error: *ptxas fatal : Unresolved extern function '__nv_mul24'* Seems to me that libdevice was not automatically linked. I wonder what flags I need to pass to clang to have
2016 Mar 09
2
RFC: Proposing an LLVM subproject for parallelism runtime and support libraries
...inition. One of the main goals of open-sourcing StreamExecutor is to let us add this code generation capability to Clang, when the user has chosen to use StreamExecutor as their runtime for accelerator operations. Google has been using an internally developed CUDA compiler based on Clang called **gpucc** that generates code for StreamExecutor in this way. The code below shows how the example above would be written using gpucc to generate the unsafe parts of the code. The kernel is defined in a high-level language (CUDA C++ in this example) in its own file: .. code-block:: c++ // File: add...
2016 Apr 07
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Hi, I needed to compile a cuda source file (say, a.cu) into IR (a.bc), and then merge a.bc with another bitcode file (b.bc, compiled from b.cu). So I used *llvm-link a.bc b.bc -o c.bc* However, I noticed that an internal function '* _ZL21__nvvm_reflect_anchorv() *' is defined in both a.bc & b.bc, and when merging these two files, one of the two definitions was renamed to
2016 Mar 09
2
RFC: Proposing an LLVM subproject for parallelism runtime and support libraries
...n goals of open-sourcing StreamExecutor is to let us add > this code generation capability to Clang, when the user has chosen to use > StreamExecutor as their runtime for accelerator operations. > > Google has been using an internally developed CUDA compiler based on Clang > called **gpucc** that generates code for StreamExecutor in this way. The > code below shows how the example above would be written using gpucc to > generate the unsafe parts of the code. > > The kernel is defined in a high-level language (CUDA C++ in this example) > in its own file: > > .. c...
2016 Mar 10
2
RFC: Proposing an LLVM subproject for parallelism runtime and support libraries
...cing StreamExecutor is to let us add >> this code generation capability to Clang, when the user has chosen to use >> StreamExecutor as their runtime for accelerator operations. >> >> Google has been using an internally developed CUDA compiler based on >> Clang called **gpucc** that generates code for StreamExecutor in this way. >> The code below shows how the example above would be written using gpucc to >> generate the unsafe parts of the code. >> >> The kernel is defined in a high-level language (CUDA C++ in this example) >> in its own f...
2016 Apr 08
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, however, what the purpose of nvvm_reflect_anchorv() is here, and why the front-end always generates this function. Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating this function in the first place. Is there any
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
David's change makes nvvm_reflect_anchor unnecessary. The issue with dots in names generated by llvm still needs to be fixed. On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote: > Artem, > > With David's http://reviews.llvm.org/rL265060, do you think > __nvvm_reflect_anchor is still necessary? > > On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng
2016 Mar 10
4
instrumenting device code with gpucc
..._device__ void > _Cool_MemRead_Hook(uint64_t addr)". I've compiled these hooks functions > into a shared object, and linked the axpy binary with it. > > I'm really sorry to bother you again, but I wonder whether any step I did > was apparently wrong, or there's any gpucc-specific step I need to do when > instrumenting a kernel? > > Thanks! > yuanfeng > > > > On Fri, Mar 4, 2016 at 7:56 PM, Jingyue Wu <jingyue at google.com> wrote: > >> >> >> On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng < >> yuanfeng.jack....
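A minimal sketch of the device-side hook referred to above, assuming only the signature quoted in the message (_Cool_MemRead_Hook taking a uint64_t address); the body is illustrative and not taken from the thread:

    // hooks.cu -- compiled separately and linked with the instrumented kernel
    #include <cstdint>
    #include <cstdio>

    // Called by the instrumentation pass before each device memory read.
    __device__ void _Cool_MemRead_Hook(uint64_t addr) {
      // Record the accessed address; real bookkeeping is tool-specific.
      printf("device read at %llu\n", (unsigned long long)addr);
    }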
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of function '_ZL21__nvvm_reflect_anchorv'. Did I misuse the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng; I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and I've > written a pass to insert hook functions for certain function calls and > memory accesses. For example, given a CUDA program, say, axpy...
2016 Aug 01
2
[GPUCC] link against libdevice
Hi, Yuanfeng. What version of clang are you using? CUDA is only known to work at tip of head, so you must build clang yourself from source. I suspect that's your problem, but if building from source doesn't fix it, please attach the output of compiling with -v. Regards, -Justin On Sun, Jul 31, 2016 at 9:24 PM, Chandler Carruth <chandlerc at google.com> wrote: > Directly
2016 Mar 13
2
instrumenting device code with gpucc
Hey Jingyue, Thanks for being so responsive! I finally figured out a way to resolve the issue: all I have to do is to use `-only-needed` when merging the device bitcodes with llvm-link. However, since we actually need to instrument the host code as well, I encountered another issue when I tried to glue the instrumented host code and fatbin together. When I only instrumented the device code, I
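A minimal sketch of the merge step described above, using the file names from the earlier message (a.bc and b.bc) and the -only-needed flag reported as the fix:

*llvm-link -only-needed a.bc b.bc -o c.bc*

With -only-needed, llvm-link pulls in only the symbols that are actually referenced from subsequent modules, which is what keeps the second copy of _ZL21__nvvm_reflect_anchorv out of the merged module.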
2016 Mar 15
2
instrumenting device code with gpucc
Hi Jingyue, Sorry to ask again, but how exactly could I glue the fatbin with the instrumented host code? Or does it mean we actually cannot instrument both the host & device code at the same time? Thanks! yuanfeng On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote: > Including fatbin into host code should be done in frontend. > > On Mon, Mar 14, 2016
2016 Aug 01
3
[GPUCC] link against libdevice
OK, I see the problem. You were right that we weren't picking up libdevice. CUDA 7.0 only ships with the following libdevice binaries (found in /path/to/cuda/nvvm/libdevice): libdevice.compute_20.10.bc libdevice.compute_30.10.bc libdevice.compute_35.10.bc If you ask for sm_50 with CUDA 7.0, clang can't find a matching libdevice binary, and it will apparently silently give up and try to
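A minimal sketch of a workaround consistent with the diagnosis above: with CUDA 7.0 installed, target an architecture that has a matching libdevice binary (sm_35 here is illustrative), or point clang at a newer CUDA installation via --cuda-path (the 7.5 path is an assumption, not taken from the thread):

*clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc --cuda-gpu-arch=sm_35 scalarProd.cu*

*clang++ --cuda-path=/usr/local/cuda-7.5 --cuda-gpu-arch=sm_50 scalarProd.cu*

Whether either invocation resolves the missing __nv_mul24 also depends on using a recent clang built from source, as noted elsewhere in the thread.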
2016 Aug 01
0
[GPUCC] link against libdevice
Hi Justin, Thanks for your response! The clang & llvm I'm using were built from source. Below is the output of compiling with -v. Any suggestions would be appreciated! *clang version 3.9.0 (trunk 270145) (llvm/trunk 270133)* *Target: x86_64-unknown-linux-gnu* *Thread model: posix* *InstalledDir: /usr/local/bin* *Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8*
2015 Nov 10
6
[web] sub-domain
So that people have a shorter link to go to that entry page directly. It's especially useful for non-LLVM folks who want to try out LLVM's CUDA support. Many researchers fall into this category btw because LLVM used to support very little CUDA. They don't like to search llvm.org for what they want. On Tue, Nov 10, 2015 at 2:59 PM, C Bergström <cbergstrom at pathscale.com> wrote: