thr3ads.net - llvm dev - [llvm-dev] Compiling OpenMP for GPUs with there are function calls in the parallel regions [Mar 2016]

Ahmed ElTantawy via llvm-dev

2016-Mar-11 18:10 UTC

[llvm-dev] Compiling OpenMP for GPUs with there are function calls in the parallel regions

Hi,

Using the flow described here:
https://parallel-computing.pro/index.php/9-cuda/43-openmp-4-0-on-nvidia-cuda-gpus,
I can compile and run OpenMP code on GPUs when the parallel region is
self-contained (i.e., does not include calls to functions).


When the parallel region includes a call to a function (e.g., foo()), I get
this error.

nvlink error   : Undefined reference to 'foo' in '/tmp/test.o-e8741d.cubin'


"foo" is indeed declared and defined in the same file, before the main
function, but the clang driver does not include it in the final PTX file
(test.s.tgt-nvptx64sm_30-nvidia-linux).



In CUDA terminology, are "device functions" not yet supported in OpenMP?


Thanks.

Alexey Bataev via llvm-dev

2016-Mar-14 04:16 UTC

Hi,
Most probably you forgot to enclose the definition of 'foo()' in a
'pragma omp declare target' region. You can do it like this:

#pragma omp declare target
void foo() {
...
}
#pragma omp end declare target
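
For a fuller picture, here is a minimal sketch of how the pieces might fit
together. The helper square() and the wrapper sum_of_squares() are
hypothetical names chosen for illustration and are not from the original
code; the sketch assumes an OpenMP 4.0 compiler with offloading enabled.

```c
/* Hypothetical helper: the declare target region asks the compiler to
   emit a device version of the function as well as the host version,
   so calls from inside a target region can be resolved at device link time. */
#pragma omp declare target
int square(int x) {
    return x * x;
}
#pragma omp end declare target

/* Hypothetical wrapper: offloads a reduction loop that calls square(). */
int sum_of_squares(int n) {
    int sum = 0;

    /* map(tofrom: sum) copies sum to the device and back again. */
    #pragma omp target map(tofrom: sum)
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; ++i) {
        sum += square(i);
    }

    return sum;
}
```

Calling sum_of_squares(4) should give 30 whether the region actually runs on
the GPU or falls back to the host. If the pragmas are ignored because OpenMP
is not enabled, the loop simply runs sequentially with the same result.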



Best regards,
Alexey Bataev
Software Engineer
Intel Compiler Team

