2020 Sep 24
2
cuda __shfl_sync problem
Hi,
First of all, I'm not sure whether I should be posting this here or on
cfe-dev, but here it goes.
In order to instrument CUDA kernels, I first generate device IR with:
clang++ -x cuda --cuda-device-only -emit-llvm --cuda-gpu-arch=sm_52 -o device.bc
I also have a library that contains the instrumentation stubs. I generate
IR for it in the same way and link it with the device IR
programmatically with Linker::linkModules(..)
Then, after some analysis, I use llc to get PTX:
llc device.bc --march=nvptx64 --mcpu=sm_52 --filetype=asm...
2020 Sep 25
2
cuda __shfl_sync problem
...> Hi,
>>
>> First of all, I'm not sure whether I should be posting this here or on
>> cfe-dev, but here it goes.
>>
>> In order to instrument CUDA kernels, I first generate device IR with:
>>
>> clang++ -x cuda --cuda-device-only -emit-llvm --cuda-gpu-arch=sm_52 -o device.bc
>>
>> I also have a library that contains the instrumentation stubs. I
>> generate IR for it in the same way and link it with the device IR
>> programmatically with Linker::linkModules(..)
>>
>> Then, after some analysis, I use llc to get PTX...