search for: sm_50

Displaying 7 results from an estimated 7 matches for "sm_50".

Clang option to provide list of target-subarchs.

2017 Feb 07

2

...re are at least four clang frontends for offloading to accelerators: (1) CUDA clang, (2) OpenMP, (3) HCC, and (4) OpenCL. These frontends will want to embed object code for multiple offload targets into a single application binary to provide portability across different subarchitectures (e.g. sm_35, sm_50) and across different architectures (e.g. nvptx64, amdgcn). Problem: Different frontends are using different flags to provide a list of subarchitectures. For example, CUDA clang repeats the flag “--cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_50” and HCC uses “--amdgpu-target=gfx701 --amdgpu-target...
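The "repeated flag" convention this snippet describes can be illustrated with a small Python sketch. The helper name `repeated_flags` is hypothetical. Only the flag spellings `--cuda-gpu-arch` and `--amdgpu-target` and the subarch names come from the message.

```python
def repeated_flags(flag, subarchs):
    """Build one '<flag>=<subarch>' argument per subarchitecture."""
    return [f"{flag}={subarch}" for subarch in subarchs]


# Reproduces the CUDA clang spelling quoted in the message.
print(" ".join(repeated_flags("--cuda-gpu-arch", ["sm_35", "sm_50"])))
```

The same helper would produce the HCC spelling by passing `--amdgpu-target` as the flag, which is the duplication across frontends that the thread sets out to address.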

[GPUCC] link against libdevice

2016 Aug 01

3

OK, I see the problem. You were right that we weren't picking up libdevice. CUDA 7.0 only ships with the following libdevice binaries (found in /path/to/cuda/nvvm/libdevice): libdevice.compute_20.10.bc, libdevice.compute_30.10.bc, libdevice.compute_35.10.bc. If you ask for sm_50 with CUDA 7.0, clang can't find a matching libdevice binary, and it will apparently silently give up and try to continue compiling your program. That's a bug that we should fix. (If you want the current behavior, you should have to ask clang not to use libdevice.) I see that nvcc from cud...
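The mismatch described in this message can be checked by hand before compiling. The Python sketch below is an illustration under stated assumptions, not clang's actual lookup logic. It only looks for files named `libdevice.compute_<NN>.<MM>.bc` (the pattern of the file names quoted above) whose `<NN>` equals the number in the requested `sm_<NN>`. The helper name `find_exact_libdevice` is hypothetical.

```python
import re
from pathlib import Path


def find_exact_libdevice(libdevice_dir, gpu_arch):
    """Return names of libdevice files whose compute_NN equals the NN in sm_NN.

    Assumption: an exact numeric match is what counts as "matching" here;
    the real compiler may apply a different mapping.
    """
    arch_match = re.fullmatch(r"sm_(\d+)", gpu_arch)
    if arch_match is None:
        raise ValueError(f"expected an arch like 'sm_50', got {gpu_arch!r}")
    wanted = arch_match.group(1)

    name_pattern = re.compile(r"libdevice\.compute_(\d+)\.\d+\.bc")
    hits = []
    for path in sorted(Path(libdevice_dir).iterdir()):
        file_match = name_pattern.fullmatch(path.name)
        if file_match and file_match.group(1) == wanted:
            hits.append(path.name)
    return hits
```

Run against a directory holding only the three files listed for CUDA 7.0, a request for `sm_50` returns an empty list, which is the situation the message reports.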

[GPUCC] link against libdevice

2016 Aug 01

0

...: /usr/local/cuda* * "/usr/local/bin/clang-3.9" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -S -disable-free -main-file-name scalarProd.cu -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -no-integrated-as -fcuda-is-device -target-cpu sm_50 -v -dwarf-column-info -debugger-tuning=gdb -resource-dir /usr/local/bin/../lib/clang/3.9.0 -I ../ -I /usr/local/cuda-7.0/samples/common/inc -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-...

[GPUCC] link against libdevice

2016 Aug 01

2

...feng Peng via llvm-dev <llvm-dev at lists.llvm.org> wrote: Hi, I was trying to compile scalarProd.cu (from CUDA SDK) with the following command: clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc --cuda-gpu-arch=sm_50 scalarProd.cu but ended up with the following error: ptxas fatal : Unresolved extern function '__nv_mul24' Seems to me that libdevice was not automatically linked. I wonder what flags I need to pass to clang to have the code...

[GPUCC] link against libdevice

2016 Jul 29

2

Hi, I was trying to compile scalarProd.cu (from CUDA SDK) with the following command: clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc --cuda-gpu-arch=sm_50 scalarProd.cu but ended up with the following error: ptxas fatal : Unresolved extern function '__nv_mul24' Seems to me that libdevice was not automatically linked. I wonder what flags I need to pass to clang to have the code linked against libdevice? Thanks! Yuanfeng Peng -------...

[GPUCC] link against libdevice

2016 Aug 01

0

.... On Fri, Jul 29, 2016 at 6:27 AM Yuanfeng Peng via llvm-dev <llvm-dev at lists.llvm.org> wrote: Hi, I was trying to compile scalarProd.cu (from CUDA SDK) with the following command: clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc --cuda-gpu-arch=sm_50 scalarProd.cu but ended up with the following error: ptxas fatal : Unresolved extern function '__nv_mul24' Seems to me that libdevice was not automatically linked. I wonder what flags I need to pass to clang to have the code linked against libdevice...

Clang option to provide list of target-subarchs.

2017 Feb 07

0

...old crufty subarchs you would get with an exclusion flag. We expect that the runtime will match the most appropriate subarch. As is currently done with --cuda-gpu-arch, we expect that the triple for the arch will be implied from the context. That is, if one specifies --target-subarchs="sm_50,gfx702", the software will generate the triples "nvptx64-nvidia-cuda" and "amdgcn--cuda" from the subarchs. Collisions (different archs) for the same subarch are unlikely and indicate a poor choice of subarch names. For example, AMD should never choose sm_ prefix for i...
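The mapping this message describes, from a comma-separated subarch list to implied triples, can be sketched roughly as follows. This is only an illustration in Python. The function name `triples_for_subarchs` and the prefix table are hypothetical, and `--target-subarchs` is the option proposed in the thread, not something assumed to exist in clang. The two triples are the ones quoted in the message. Treating `gfx` as the AMD prefix is an assumption drawn from the `gfx702` example.

```python
# Hypothetical prefix table. Only these two triples appear in the message.
PREFIX_TO_TRIPLE = {
    "sm_": "nvptx64-nvidia-cuda",
    "gfx": "amdgcn--cuda",
}


def triples_for_subarchs(subarchs):
    """Map a comma-separated subarch list to a {subarch: triple} dict."""
    result = {}
    for name in subarchs.split(","):
        name = name.strip()
        if not name:
            continue
        matches = [
            triple
            for prefix, triple in PREFIX_TO_TRIPLE.items()
            if name.startswith(prefix)
        ]
        # Zero matches: unknown subarch. More than one: the kind of naming
        # collision the message calls a poor choice of subarch names.
        if len(matches) != 1:
            raise ValueError(f"cannot infer a unique triple for {name!r}")
        result[name] = matches[0]
    return result


print(triples_for_subarchs("sm_50,gfx702"))
```

Keying the lookup on a name prefix is what makes the triple "implied from the context": the caller lists only subarchs, and a name that fits no known prefix is rejected instead of guessed.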