2017 Feb 07
2
Clang option to provide list of target-subarchs.
...re are at least four clang frontends for offloading to accelerators:
(1) CUDA clang, (2) OpenMP, (3) HCC, and (4) OpenCL. These frontends will
want to embed object code for multiple offload targets into a single
application binary to provide portability across different subarchitectures
(e.g. sm_35, sm_50) and across different architectures (e.g. nvptx64, amdgcn).
Problem: Different frontends are using different flags to provide a
list of subarchitectures. For example, CUDA clang repeats the flag
“--cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_50” and HCC uses
“--amdgpu-target=gfx701 --amdgpu-target...
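To make the repeated-flag convention described above concrete, here is a minimal sketch that expands a list of subarchitectures into one flag per entry. The helper name repeated_flags is hypothetical and exists only for illustration; only the --cuda-gpu-arch spelling and the sm_35/sm_50 values come from the message.

```python
def repeated_flags(flag, subarchs):
    """Return one 'flag=value' argument per subarchitecture.

    Hypothetical helper for illustration; it only builds strings and
    does not invoke any compiler.
    """
    return [f"{flag}={subarch}" for subarch in subarchs]


# The CUDA clang example quoted in the message above.
cuda_args = repeated_flags("--cuda-gpu-arch", ["sm_35", "sm_50"])
print(" ".join(cuda_args))
```

A single comma-separated option, as the thread title proposes, would replace this per-entry repetition with one argument.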
2016 Aug 01
3
[GPUCC] link against libdevice
OK, I see the problem. You were right that we weren't picking up libdevice.
CUDA 7.0 only ships with the following libdevice binaries (found in
/path/to/cuda/nvvm/libdevice):
libdevice.compute_20.10.bc libdevice.compute_30.10.bc
libdevice.compute_35.10.bc
If you ask for sm_50 with CUDA 7.0, clang can't find a matching
libdevice binary, and it will apparently silently give up and try to
continue compiling your program. That's a bug that we should fix.
(If you want the current behavior, you should have to ask clang not to
use libdevice.)
I see that nvcc from cud...
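As a rough illustration of the mismatch described above, the sketch below checks whether a set of libdevice file names contains an entry whose compute number equals the requested sm_ number. The exact-match rule and the helper names are assumptions made for this example; the message does not say how clang actually selects a libdevice binary.

```python
import re

# File-name pattern taken from the CUDA 7.0 listing in the message.
LIBDEVICE_RE = re.compile(r"^libdevice\.compute_(\d+)\.(\d+)\.bc$")


def available_compute_versions(filenames):
    """Collect the compute numbers found in libdevice file names."""
    versions = set()
    for name in filenames:
        match = LIBDEVICE_RE.match(name)
        if match:
            versions.add(int(match.group(1)))
    return sorted(versions)


def has_exact_libdevice(gpu_arch, filenames):
    """Assumed rule: sm_NN matches only libdevice.compute_NN.*.bc."""
    number = int(gpu_arch.split("_", 1)[1])
    return number in available_compute_versions(filenames)


# The three binaries the message says ship with CUDA 7.0.
cuda_7_0 = [
    "libdevice.compute_20.10.bc",
    "libdevice.compute_30.10.bc",
    "libdevice.compute_35.10.bc",
]

for arch in ("sm_35", "sm_50"):
    print(arch, has_exact_libdevice(arch, cuda_7_0))
```

Under this assumed rule sm_35 finds a file and sm_50 does not, which is consistent with the failure reported in this thread.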
2016 Aug 01
0
[GPUCC] link against libdevice
...: /usr/local/cuda*
* "/usr/local/bin/clang-3.9" -cc1 -triple nvptx64-nvidia-cuda -aux-triple
x86_64-unknown-linux-gnu -S -disable-free -main-file-name scalarProd.cu
-mrelocation-model static -mthread-model posix -mdisable-fp-elim
-fmath-errno -no-integrated-as -fcuda-is-device -target-cpu sm_50 -v
-dwarf-column-info -debugger-tuning=gdb -resource-dir
/usr/local/bin/../lib/clang/3.9.0 -I ../ -I
/usr/local/cuda-7.0/samples/common/inc -internal-isystem
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8
-internal-isystem
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-...
2016 Aug 01
2
[GPUCC] link against libdevice
...feng Peng via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> I was trying to compile scalarProd.cu (from CUDA SDK) with the following
>> command:
>>
>> clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc
>> --cuda-gpu-arch=sm_50 scalarProd.cu
>>
>> but ended up with the following error:
>>
>> ptxas fatal : Unresolved extern function '__nv_mul24'
>>
>> Seems to me that libdevice was not automatically linked. I wonder what
>> flags I need to pass to clang to have the code...
2016 Jul 29
2
[GPUCC] link against libdevice
Hi,
I was trying to compile scalarProd.cu (from CUDA SDK) with the following
command:
    clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc
    --cuda-gpu-arch=sm_50 scalarProd.cu
but ended up with the following error:
    ptxas fatal : Unresolved extern function '__nv_mul24'
Seems to me that libdevice was not automatically linked. I wonder what
flags I need to pass to clang to have the code linked against libdevice?
Thanks!
Yuanfeng Peng
-------...
2016 Aug 01
0
[GPUCC] link against libdevice
....
On Fri, Jul 29, 2016 at 6:27 AM Yuanfeng Peng via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> I was trying to compile scalarProd.cu (from CUDA SDK) with the following
> command:
>
>     clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc
>     --cuda-gpu-arch=sm_50 scalarProd.cu
>
> but ended up with the following error:
>
>     ptxas fatal : Unresolved extern function '__nv_mul24'
>
> Seems to me that libdevice was not automatically linked. I wonder what
> flags I need to pass to clang to have the code linked against libdevice...
2017 Feb 07
0
Clang option to provide list of target-subarchs.
...old crufty subarchs you would get with an exclusion flag. We expect that the runtime will match the most appropriate subarch.
As is currently done with --cuda-gpu-arch, we expect that the triple for the arch will be implied from the context. That is, if one specifies --target-subarchs="sm_50,gfx702", the software will generate the triples "nvptx64-nvidia-cuda" and "amdgcn--cuda" from the subarchs. Collisions (different archs) for the same subarch are unlikely and indicate a poor choice of subarch names. For example, AMD should never choose sm_ prefix for i...
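The subarch-to-triple inference described above could look roughly like the following sketch. The prefix table and the function name triples_for are assumptions for illustration; only the two triples and the sample value "sm_50,gfx702" come from the message, and --target-subarchs itself is a proposed option rather than an existing one.

```python
# Assumed prefix table: only these two triples appear in the message.
PREFIX_TO_TRIPLE = {
    "sm_": "nvptx64-nvidia-cuda",
    "gfx": "amdgcn--cuda",
}


def triples_for(subarch_list):
    """Map each comma-separated subarch to a triple by its name prefix."""
    result = {}
    for subarch in subarch_list.split(","):
        subarch = subarch.strip()
        for prefix, triple in PREFIX_TO_TRIPLE.items():
            if subarch.startswith(prefix):
                result[subarch] = triple
                break
        else:
            # No known prefix matched, so the arch cannot be inferred.
            raise ValueError(f"unknown subarch prefix: {subarch}")
    return result


print(triples_for("sm_50,gfx702"))
```

Because the lookup is keyed on the name prefix, two vendors sharing a prefix would make the mapping ambiguous, which is the collision concern raised in the message.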