thr3ads.net - search: "eltantawy"

Displaying 6 results from an estimated 6 matches for "eltantawy".

Executing OpenMP 4.0 code on Nvidia's GPU

2016 Jan 20

Hi Arpith, That is exactly what it is :). My bad, I thought I had copied the libraries to where LIBRARY_PATH points, but apparently they were copied to the wrong destination. Thanks a lot.
On Wed, Jan 20, 2016 at 4:51 AM, Arpith C Jacob <acjacob at us.ibm.com> wrote:
> Hi Ahmed,
>
> nvlink is unable to find the GPU OMP runtime library in its path. Does
> LIBRARY_PATH point to

[LLVMdev] CUDA front-end (CUDA to LLVM IR)

2015 Apr 08

Hi, I wanted to ask whether there is an ongoing effort (or an already established tool) to convert CUDA kernels (that use CUDA-specific intrinsics, e.g., threadIdx.x, __syncthreads(), ...) to LLVM IR. I am aware that I can do this for OpenCL with the help of libclc, but I cannot find something similar for CUDA. Thanks

[LLVMdev] CUDA front-end (CUDA to LLVM IR)

2015 Apr 08

On Wed, Apr 8, 2015 at 10:12 AM, Dmitry Mikushin <dmitry at kernelgen.org> wrote:
> A tool of this kind is here: https://github.com/apc-llc/nvcc-llvm-ir
>
> 2015-04-08 19:01 GMT+02:00 Ahmed ElTantawy <ahmede at ece.ubc.ca>:
>
>> Hi,
>>
>> I wanted to ask whether there is an ongoing effort (or an already
>> established tool) to convert CUDA kernels (that use CUDA-specific
>> intrinsics, e.g., threadIdx.x, __syncthreads(), ...) to LLVM IR. I
>> ...

[LLVMdev] Example for usage of LLVM/Clang/libclc

2015 Feb 03

Hi, My goal is to use Clang/LLVM/libclc to compile an OpenCL kernel and eventually generate PTX code. I have already done this, but I am not sure whether the PTX code I am generating is correct (that is, the code that is supposed to be generated). For example, currently,
In OpenCL: get_global_id(0)
translates to
In LLVM: %call = tail call i32 @get_global_id(i32 0)
which translates to
In PTX:

[LLVMdev] Performance impact of different optimization passes

2015 Jun 19

Hi, I was wondering whether there is a paper or a technical report that documents the performance impact of the different optimization passes on some set of benchmarks. Is something like this available? Best regards, Ahmed

__sync_synchronize() crashes when compiling OpenMP to a GPU target

2016 Mar 23

Hi, I get this error when compiling code that contains "__sync_synchronize()":
fatal error: error in backend: Cannot select: 0x85ddfb0: ch = AtomicFence 0x85fd8d8, 0x85c7890, 0x85dd9e8 [ORD=4] [ID=27]example.c:378:13
  0x85c7890: i64 = Constant<7> [ID=5]example.c:378:13
  0x85dd9e8: i64 = Constant<1> [ID=6]example.c:378:13
I believe it should be equivalent to
