2017 Jun 09
NVPTX Back-end: relocatable device code support for dynamic parallelism
Hi everyone,
CUDA allows some runtime functions to be called from device code as well. On
a multi-GPU system this lets a GPU determine its own device id via
cudaGetDevice().
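
A minimal sketch of such a device-side call could look like the following. The kernel name report_device, the file name example.cu, and the host loop are illustrative assumptions, not code from the original setup. The nvcc command in the comment uses the -rdc=true switch and the cudadevrt library, and may also need an -arch value matching the GPU.

```cuda
// Sketch only. Assumed build line for nvcc (example.cu is a hypothetical name):
//   nvcc -rdc=true example.cu -lcudadevrt -o example
#include <cstdio>
#include <cuda_runtime.h>

// Kernel that asks the device runtime which device it is running on.
__global__ void report_device(int *out)
{
    int dev = -1;
    // Device-side runtime call; this is what requires relocatable device code.
    cudaError_t err = cudaGetDevice(&dev);
    *out = (err == cudaSuccess) ? dev : -1;
}

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        std::fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }

    // Launch the kernel once per GPU and compare host and device views.
    for (int d = 0; d < count; ++d) {
        cudaSetDevice(d);

        int *d_out = nullptr;
        if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return 1;

        report_device<<<1, 1>>>(d_out);

        int h_out = -1;
        // cudaMemcpy waits for the kernel to finish before copying.
        cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
        std::printf("host selected device %d, kernel reported %d\n", d, h_out);

        cudaFree(d_out);
    }
    return 0;
}
```

A value of -1 reported by the kernel would mean the device-side cudaGetDevice() call returned an error.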
Unfortunately I cannot get this working when compiling with clang. When
compiling with nvcc, relocatable device code needs to be enabled
(-rdc=true) and cudadevrt is needed when linking [0]. I did not
find such switches to turn on rdc for clang. Just compiling does not work
as ptxas does...