Ignacio Laguna via llvm-dev
2017-Jul-15 18:05 UTC
[llvm-dev] Linking CUDA bitcode files and generating CUDA executable
Hi everyone,

Could someone share the recipe for getting bitcode files out of a CUDA program and then linking them to generate an executable? I’m following these steps, but when I run the resulting executable no CUDA kernels are executed:

    clang++ -emit-llvm -c program.cu --cuda-path=$(CUDA_PATH) --cuda-gpu-arch=sm_35
    clang++ -c program.bc -o program.o
    llc program-cuda-nvptx64-nvidia-cuda-sm_35.bc -o program-cuda-nvptx64-nvidia-cuda-sm_35.ptx
    nvcc -arch=sm_35 --device-c program-cuda-nvptx64-nvidia-cuda-sm_35.ptx -o program-cuda-nvptx64-nvidia-cuda-sm_35.o
    nvcc -arch=sm_35 -dlink program.o program-cuda-nvptx64-nvidia-cuda-sm_35.o -o linkedcode.o
    clang++ -o program linkedcode.o program.o program-cuda-nvptx64-nvidia-cuda-sm_35.o -L$(CUDA_LIB) -lcudart_static -lcudadevrt -ldl -lrt -pthread

None of these steps reports an error, but at run time no kernels execute. When I run the program under cuda-memcheck, it tells me this:

    “Program hit cudaErrorInvalidDeviceFunction (error 8) due to "invalid device function" on CUDA API call to cudaLaunch.”

My device is a "Tesla K40m" with compute capability 3.5. I'm using clang/llvm 4.0 and CUDA 8.0.

Can someone point out what I am doing wrong?

Thank you very much in advance,
Ignacio