search for: cudagetdevice

Displaying 1 result from an estimated 1 matches for "cudagetdevice".

2017 Jun 09
1
NVPTX Back-end: relocatable device code support for dynamic parallelism
Hi everyone, CUDA allows to call some runtime functions also from the device code. On a multi-GPU system this allows the GPU to determine its device id on its own via cudaGetDevice(). Unfortunately i cannot get it working when compiling with clang. When compiling with nvcc relocatable device code needs to be set to true (-rdc=true) and the cudadevrt is needed when linking [0]. I did not found such switches to turn rdc for clang. Just compiling does not work as ptxas does...