Displaying 5 results from an estimated 5 matches for "dcl_clang_storage_class_specifi".
2015 Sep 29
2
OpenCL toolset (for AMD GPU)
...torials very welcome. Thanks.
>>
>
> Hi,
>
> You need to include OpenCL library headers from libclc
> (http://libclc.llvm.org/) to compile most OpenCL code.
>
> Here is an example command:
>
> clang -include /path/to/libclc/headers/clc.h -I /path/to/libclc/headers -Dcl_clang_storage_class_specifiers -target amdgcn--amdhsa -mcpu=carrizo $INPUT_FILE -o $OUTPUT_FILE
Hi Tom,
to piggyback on this question: to load this kernel in OpenCL, is it sufficient to just pass $OUTPUT_FILE
to clCreateProgramWithBinary?
Also, assuming this is enough, is the code quality for recent AMD GPUs quality-wise...
2015 Sep 29
2
OpenCL toolset (for AMD GPU)
Hi LLVM,
I would like to compile an OpenCL kernel for a specific AMD GPU target. Is it
possible with the current clang/LLVM?
I started by using `clang -x cl`, but it looks like at least some OpenCL-specific
headers are missing (e.g. uint2 is not recognized as a type).
Any links to documentation / tutorials very welcome. Thanks.
- Paweł
2016 Jun 02
3
PTX generation from CUDA file for compute capability 1.0 (sm_10)
....While trying to generate PTX for sm_10, it gave
*error: unknown target CPU 'sm_10'*
*fatal error: cannot open file '/tmp/shared-395893.s': No such file or directory*
*1 error generated.*
The compilation command used is:
clang -Xclang -I$LIBCLC/include/generic -I$LIBCLC/include/ptx
-Dcl_clang_storage_class_specifiers -O3 CudaSource.cu -S -o PtxOutput.ptx
--cuda-gpu-arch=sm_10
Is there any chance that this error is being generated by the CUDA runtime alone,
since I am using CUDA 7.5, which does not support sm_10? If there is any
chance that the error is isolated from LLVM and is only due to CUDA, I have
some hope t...
2015 Feb 03
2
[LLVMdev] Example for usage of LLVM/Clang/libclc
Hi,
My goal is to use Clang/LLVM/libclc to compile an OpenCL kernel and
eventually generate PTX code. I have already done this, but I am not sure whether the
PTX code I am generating is correct (that is, the code that is supposed to be
generated).
For example, currently,
In OpenCL : get_global_id(0) translates to
In LLVM : %call = tail call i32 @get_global_id(i32 0) which translates
to
In PTX:
2016 Jun 02
5
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello,
When generating PTX output from a CUDA file (.cu file), the minimum target
that is accepted by LLVM is sm_20. But I have a specific requirement to
generate PTX output for compute capability 1.0 (sm_10). Is there any
previous version of LLVM supporting this?
Thank you,
Ginu
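The constraint described in this post (sm_20 as the lowest accepted target) can be checked before invoking the compiler. The sketch below is illustrative only: the helper names `parse_sm` and `is_accepted` are hypothetical, the minimum of 20 is taken from the post itself and may differ in other LLVM versions, and only plain `sm_NN` names are handled.

```python
import re

# Lowest compute capability the poster reports LLVM accepting (sm_20).
# This value comes from the post, not from querying any LLVM release.
MIN_SUPPORTED_SM = 20


def parse_sm(arch):
    """Extract the numeric part of a name such as 'sm_10' or 'sm_35'."""
    match = re.fullmatch(r"sm_(\d+)", arch)
    if match is None:
        raise ValueError(f"not a plain sm_NN architecture name: {arch!r}")
    return int(match.group(1))


def is_accepted(arch, minimum=MIN_SUPPORTED_SM):
    """Return True if arch is at or above the assumed minimum target."""
    return parse_sm(arch) >= minimum


if __name__ == "__main__":
    for arch in ("sm_10", "sm_20", "sm_35"):
        print(arch, "accepted" if is_accepted(arch) else "rejected")
```

Under that assumption, `sm_10` is rejected before clang is ever run, which matches the "unknown target CPU 'sm_10'" error reported in the thread.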