Displaying 6 results from an estimated 6 matches for "eltantawi".
2016 Jan 20
4
Executing OpenMP 4.0 code on Nvidia's GPU
Hi Arpith,
That is exactly what it is :).
My bad, I thought I had copied the libraries to where LIBRARY_PATH
points, but apparently they were copied to the wrong destination.
Thanks a lot.
On Wed, Jan 20, 2016 at 4:51 AM, Arpith C Jacob <acjacob at us.ibm.com> wrote:
> Hi Ahmed,
>
> nvlink is unable to find the GPU OMP runtime library in its path. Does
> LIBRARY_PATH point to
2015 Apr 08
5
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
Hi,
I wanted to ask whether there is an ongoing effort (or an already established
tool) to convert CUDA kernels (that use CUDA-specific
intrinsics, e.g., threadIdx.x, __syncthreads(), ...) to LLVM IR. I am aware
that I can do this for OpenCL with the help of libclc, but I cannot find
something similar for CUDA.
Thanks
2015 Apr 08
2
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
On Wed, Apr 8, 2015 at 10:12 AM, Dmitry Mikushin <dmitry at kernelgen.org>
wrote:
> A tool of this kind here: https://github.com/apc-llc/nvcc-llvm-ir
>
> 2015-04-08 19:01 GMT+02:00 Ahmed ElTantawy <ahmede at ece.ubc.ca>:
>
>> Hi,
>>
>> I wanted to ask whether there is ongoing effort (or an already
>> established tool) that enables to convert CUDA
2015 Feb 03
2
[LLVMdev] Example for usage of LLVM/Clang/libclc
Hi,
My goal is to use Clang/LLVM/libclc to compile an OpenCL kernel and
eventually generate PTX code. I have already done this, but I am not sure
whether the PTX code I am generating is correct (that is, the code that is
supposed to be generated).
For example, currently,
In OpenCL : get_global_id(0) translates to
In LLVM : %call = tail call i32 @get_global_id(i32 0) which translates
to
In PTX:
2015 Jun 19
2
[LLVMdev] Performance impact of different optimization passes
Hi,
I was wondering if there is a paper or a technical report that documents
the performance impact of the different optimization passes on some set
of benchmarks. Is something like this available?
Best regards,
Ahmed
2016 Mar 23
0
__sync_synchronize() crashes when compiling OpenMP to a GPU target
Hi,
I get this error when compiling code that contains "__sync_synchronize()":
fatal error: error in backend: Cannot select: 0x85ddfb0: ch = AtomicFence
0x85fd8d8, 0x85c7890, 0x85dd9e8 [ORD=4] [ID=27]example.c:378:13
0x85c7890: i64 = Constant<7> [ID=5]example.c:378:13
0x85dd9e8: i64 = Constant<1> [ID=6]example.c:378:13
I believe it should be equivalent to