Displaying 14 results from an estimated 14 matches for "fatbinary".
2017 Sep 27
2
OrcJIT + CUDA Prototype for Cling
...ither clang's CUDA
frontend or NVCC to ptx.
Here is the workflow in five stages:
1. generate PTX device code (a kind of NVIDIA assembly)
2. translate PTX to SASS (the machine code assembled from PTX)
3. generate a fatbinary (a kind of wrapper for the device code)
4. generate the host code object file (using the fatbinary as input)
5. link to an executable
(The exact commands are stored in commands.txt in the GitHub repo)
The interpreter replaces the 4th and 5th steps. It interprets the host
code with pre-compiled device code as a fatbinary. The fatbinary (steps 1
to 3) will be generated with the clang compiler and...
2017 Nov 14
1
OrcJIT + CUDA Prototype for Cling
...> Here is the workflow in five stages:
>
> 1. generate PTX device code (a kind of NVIDIA assembly)
> 2. translate PTX to SASS (the machine code assembled from PTX)
> 3. generate a fatbinary (a kind of wrapper for the device code)
> 4. generate the host code object file (using the fatbinary as input)
> 5. link to an executable
>
> (The exact commands are stored in commands.txt in the GitHub repo)
>
> The interpreter replaces the 4th and 5th steps. It interprets the
> host code with pre-compiled device code as a fatbinary. The
> fatbinary (Step 1...
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
...d-a-device.bc after this step;
3) Generate IR files for b.cu: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c b.cu;
4) Link instrumented-a-device.bc with the device code generated for b.cu: llvm-link instrumented-a-device.bc b-cuda-nvptx64-nvidia-cuda-sm_35.bc -o ab-device.bc;
5) Use llc, ptxas & fatbinary on ab-device.bc to get ab-device.ptx, ab-device.o & ab-device.fatbin;
6) Call clang again to generate the host object file ab.o, with ab-device.o & ab-device.fatbin embedded;
7) Link against libraries and get the final binary: a.out.
The binary a.out fails with an exception when I run it...
2020 Nov 17
2
JIT compiling CUDA source code
...e clang command. When you add CUDA
to the equation, you add several other steps. If you use the clang front
end to compile, clang does the following:
1. compiles the device source code
2. compiles the resulting PTX code using the CUDA ptxas command
3. builds a "fat binary" using the CUDA fatbinary command
4. compiles the host source code and links in the fat binary
So my question is: how do we replicate that process in memory, to generate
modules that we can add to our JIT?
I am no CUDA expert, and not much of a clang expert either, so if anyone
out there can point me in the right directio...
2016 Mar 05
2
instrumenting device code with gpucc
...link the modified axpy-sm_20.bc to the final binary, you need several
extra steps:
1. Compile axpy-sm_20.bc to PTX assembly using llc: llc axpy-sm_20.bc -o
axpy-sm_20.ptx -march=<nvptx or nvptx64>
2. Compile the PTX assembly to SASS using ptxas
3. Make the SASS a fat binary using NVIDIA's fatbinary tool
4. Link the fat binary to the host code using ld.
Clang does steps 2-4 by invoking subcommands. Therefore, you can use "clang
-###" to dump all the subcommands and then find the ones for steps 2-4. For
example,
$ clang++ -### -O3 axpy.cu -I/usr/local/cuda/samples/common/inc
-L/usr/l...
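As a rough illustration of steps 1-3 in the list above, the sketch below assembles the three command lines in Python without running them. The llc invocation follows the one quoted in the post; the ptxas and fatbinary flags mirror those in the nvcc log quoted elsewhere in these results and may differ between CUDA versions. The function name device_pipeline and the .cubin and .fatbin output names are assumptions made for illustration.

```python
import shlex


def device_pipeline(bitcode, arch="sm_20", march="nvptx64"):
    """Return argv lists for llc -> ptxas -> fatbinary (nothing is executed).

    The output file names are hypothetical; they are derived from the
    bitcode file name by swapping the extension.
    """
    stem = bitcode[:-3] if bitcode.endswith(".bc") else bitcode
    ptx = stem + ".ptx"
    cubin = stem + ".cubin"    # assumed name for the SASS output
    fatbin = stem + ".fatbin"  # assumed name for the fat binary
    return [
        # Step 1: LLVM bitcode to PTX assembly, as quoted in the post.
        ["llc", bitcode, "-o", ptx, "-march=" + march],
        # Step 2: PTX to SASS for one GPU architecture.
        ["ptxas", "-arch=" + arch, "-m64", ptx, "-o", cubin],
        # Step 3: wrap the SASS image in a fat binary.
        ["fatbinary", "--create=" + fatbin, "-64",
         "--image=profile=" + arch + ",file=" + cubin],
    ]


commands = device_pipeline("axpy-sm_20.bc")
for argv in commands:
    print(shlex.join(argv))
```

Printing the commands rather than executing them keeps the sketch usable on machines without the CUDA toolkit; the authoritative commands are still the ones dumped by clang -###.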
2018 Sep 10
9
[RfC] A proposal of adding SPIR-V Toolchain in Clang
...can be used to generate a SPIR-V binary for OpenCL code. There was a separate thread regarding generation of SPIR-V binary and the community suggested that a translator from LLVM IR to SPIR-V can be used as an external tool, called llvm-spirv. This can be invoked similarly to tools such as ptxas and fatbinary for the CUDA toolchain:
http://lists.llvm.org/pipermail/llvm-dev/2018-February/121440.html
An example of how Clang can be used to target SPIR-V:
clang -c test.cl -target spirv[32|64]-unknown-unknown -o test.spv
This will result in the following Clang actions:
(1) clang -cc1 -triple spirv[32|64]...
2020 Nov 19
1
JIT compiling CUDA source code
...uation, you add several other steps. If you use the clang
>> front end to compile, clang does the following:
>>
>> 1. compiles the device source code
>> 2. compiles the resulting PTX code using the CUDA ptxas command
>> 3. builds a "fat binary" using the CUDA fatbinary command
>> 4. compiles the host source code and links in the fat binary
>>
>> So my question is: how do we replicate that process in memory, to
>> generate modules that we can add to our JIT?
>>
>> I am no CUDA expert, and not much of a clang expert either, so if...
2018 Sep 11
3
[RfC] A proposal of adding SPIR-V Toolchain in Clang
...rate a SPIR-V binary for OpenCL code. There was a separate
> thread regarding generation of SPIR-V binary and the community suggested
> that a translator from LLVM IR to SPIR-V can be used as an external tool,
> called llvm-spirv. This can be invoked similarly to tools such as ptxas and
> fatbinary for the CUDA toolchain:
> http://lists.llvm.org/pipermail/llvm-dev/2018-February/121440.html
>
> An example of how Clang can be used to target SPIR-V:
>
> clang -c test.cl -target spirv[32|64]-unknown-unknown -o test.spv
>
> This will result in the following Clang actions:
>...
2017 Jun 09
1
NVPTX Back-end: relocatable device code support for dynamic parallelism
...mpxft_00007040_00000000-13_cuda_id_test.cpp3.i" -o "/tmp/tmpxft_00007040_00000000-6_cuda_id_test.ptx"
#$ ptxas -arch=sm_35 -m64 --compile-only "/tmp/tmpxft_00007040_00000000-6_cuda_id_test.ptx" -o "/tmp/tmpxft_00007040_00000000-14_cuda_id_test.sm_35.cubin"
#$ fatbinary --create="/tmp/tmpxft_00007040_00000000-2_cuda_id_test.fatbin" -64 --cmdline="--compile-only " "--image=profile=sm_35,file=/tmp/tmpxft_00007040_00000000-14_cuda_id_test.sm_35.cubin" "--image=profile=compute_35,file=/tmp/tmpxft_00007040_00000000-6_cuda_id_test.ptx...
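Each line in this log carries a "#$ " prefix followed by a shell-quoted command. Below is a minimal sketch of splitting such a line into a tool name and its arguments, assuming that format holds for every line of interest. The helper name parse_logged_command is hypothetical, and the sample line is the ptxas command copied from the log above.

```python
import shlex


def parse_logged_command(line):
    """Split one '#$ '-prefixed log line into (tool, args).

    Returns None for lines that do not carry the prefix or are empty
    after it.
    """
    prefix = "#$ "
    if not line.startswith(prefix):
        return None
    argv = shlex.split(line[len(prefix):])  # honours the double quotes
    if not argv:
        return None
    return argv[0], argv[1:]


log_line = (
    '#$ ptxas -arch=sm_35 -m64 --compile-only '
    '"/tmp/tmpxft_00007040_00000000-6_cuda_id_test.ptx" '
    '-o "/tmp/tmpxft_00007040_00000000-14_cuda_id_test.sm_35.cubin"'
)
tool, args = parse_logged_command(log_line)
print(tool)
print(args)
```

Filtering the parsed lines by tool name (for example ptxas or fatbinary) is one way to pick out the device-side steps from a longer log.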
2018 Sep 12
3
[RfC] A proposal of adding SPIR-V Toolchain in Clang
...can be used to generate a SPIR-V binary for OpenCL code. There was a separate thread regarding generation of SPIR-V binary and the community suggested that a translator from LLVM IR to SPIR-V can be used as an external tool, called llvm-spirv. This can be invoked similarly to tools such as ptxas and fatbinary for the CUDA toolchain:
>>> http://lists.llvm.org/pipermail/llvm-dev/2018-February/121440.html
>>>
>>> An example of how Clang can be used to target SPIR-V:
>>>
>>> clang -c test.cl -target spirv[32|64]-unknown-unknown -...
2020 Nov 19
0
JIT compiling CUDA source code
...CUDA
> to the equation, you add several other steps. If you use the clang front
> end to compile, clang does the following:
>
> 1. compiles the device source code
> 2. compiles the resulting PTX code using the CUDA ptxas command
> 3. builds a "fat binary" using the CUDA fatbinary command
> 4. compiles the host source code and links in the fat binary
>
> So my question is: how do we replicate that process in memory, to generate
> modules that we can add to our JIT?
>
> I am no CUDA expert, and not much of a clang expert either, so if anyone
> out there c...
2018 Sep 13
2
[RfC] A proposal of adding SPIR-V Toolchain in Clang
...rate a SPIR-V binary for OpenCL code. There was a
> separate thread regarding generation of SPIR-V binary and the community
> suggested that a translator from LLVM IR to SPIR-V can be used as an
> external tool, called llvm-spirv. This can be invoked similarly to tools such
> as ptxas and fatbinary for the CUDA toolchain:
> >>>> http://lists.llvm.org/pipermail/llvm-dev/2018-February/121440.html
> >>>>
> >>>> An example of how Clang can be used to target SPIR-V:
> >>>>
> >>>> clang -c test.cl...
2018 Sep 12
3
[RfC] A proposal of adding SPIR-V Toolchain in Clang
...rate a SPIR-V binary for OpenCL code. There was a
> separate thread regarding generation of SPIR-V binary and the community
> suggested that a translator from LLVM IR to SPIR-V can be used as an
> external tool, called llvm-spirv. This can be invoked similarly to tools such
> as ptxas and fatbinary for the CUDA toolchain:
> >> http://lists.llvm.org/pipermail/llvm-dev/2018-February/121440.html
> >>
> >> An example of how Clang can be used to target SPIR-V:
> >>
> >> clang -c test.cl -target
> spirv[32|64]-unknow...
2016 Mar 10
4
instrumenting device code with gpucc
...binary, you need several
>> extra steps:
>> 1. Compile axpy-sm_20.bc to PTX assembly using llc: llc axpy-sm_20.bc -o
>> axpy-sm_20.ptx -march=<nvptx or nvptx64>
>> 2. Compile the PTX assembly to SASS using ptxas
>> 3. Make the SASS a fat binary using NVIDIA's fatbinary tool
>> 4. Link the fat binary to the host code using ld.
>>
>> Clang does steps 2-4 by invoking subcommands. Therefore, you can use
>> "clang -###" to dump all the subcommands and then find the ones for steps
>> 2-4. For example,
>>
>> $ clang++ -...