search for: ahmede

Displaying 13 results from an estimated 13 matches for "ahmede".

2018 Jan 30
3
Disable spilling sub-registers in LLVM
...oreRegToStackSlot()/loadRegFromStackSlot(). They decide how > registers are spilled and reloaded. Nobody is stopping you from using > super-register spills/reloads to implement spilling/reloading smaller > registers there. > > - Matthias > >> On Jan 30, 2018, at 10:21 AM, ahmede <ahmede at ece.ubc.ca> wrote: >> >> Hi Quentin, >> >> Let me clarify if I understood this correctly. >> >> If the accesses (writes and reads) to sub-registers are expressed >> always as sub-registers of the super-register register class (e.g., &g...
2018 Jan 30
0
Disable spilling sub-registers in LLVM
To make my point clear, I believe an implementation of storeRegToStackSlot()/loadRegFromStackSlot() is not sufficient (as it receives the physical register already). Does this make sense? On 2018-01-30 13:33, ahmede wrote: > Right Matthias, I am aware that an implementation for > storeRegToStackSlot()/loadRegFromStackSlot() is necessary. But these > functions receive the physical register that needs to be spilled, and they > might receive the sub-register. In this case, using the super-register > nai...
2018 Jan 30
3
Disable spilling sub-registers in LLVM
...class. > > Basically what I am saying is the spiller spills the value that > contains the accesses. > > E.g., > = v; will spill v > = v.sub1; will spill v too, but v is a super register in that case. > > Cheers, > -Quentin > >> On Jan 29, 2018, at 6:38 PM, ahmede via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >> >> Hi Matthias, >> >> No. I want the register allocator to spill the super-register (the >> large one, e.g., 64-bit) and not just the sub-register (e.g., the >> 32-bit that is a piece of th...
2018 Jan 30
0
Disable spilling sub-registers in LLVM
...applies that you have to modify storeRegToStackSlot()/loadRegFromStackSlot(). They decide how registers are spilled and reloaded. Nobody is stopping you from using super-register spills/reloads to implement spilling/reloading smaller registers there. - Matthias > On Jan 30, 2018, at 10:21 AM, ahmede <ahmede at ece.ubc.ca> wrote: > > Hi Quentin, > > Let me clarify if I understood this correctly. > > If the accesses (writes and reads) to sub-registers are expressed always as sub-registers of the super-register register class (e.g., SuperReg.sub1;), then the spilling de...
2016 Jan 20
4
Executing OpenMP 4.0 code on Nvidia's GPU
Hi Arpith, That is exactly what it is :). My bad, I thought I had copied the libraries to where LIBRARY_PATH points, but apparently they were copied to the wrong destination. Thanks a lot. On Wed, Jan 20, 2016 at 4:51 AM, Arpith C Jacob <acjacob at us.ibm.com> wrote: > Hi Ahmed, > > nvlink is unable to find the GPU OMP runtime library in its path. Does > LIBRARY_PATH point to
2018 Jan 30
2
Disable spilling sub-registers in LLVM
...but no stack loads/stores for that width. RegClass2 (super-registers): [sub_register, subregister] (64-bit) --> can be natively used in arithmetic operations and can be used in loads/stores. Thanks, Ahmed On 2018-01-29 20:20, Matthias Braun wrote: >> On Jan 29, 2018, at 1:20 PM, ahmede via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >> >> Hi, >> >> >> I wonder if there is a way in LLVM to disable spilling a >> register-class while still enabling the super-registers of this >> register-class to be spilled. > What w...
2015 Apr 08
5
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
Hi, I wanted to ask whether there is an ongoing effort (or an already established tool) that can convert CUDA kernels (that use CUDA-specific intrinsics, e.g., threadId.x, __syncthreads(), ...) to LLVM IR. I am aware that I can do this for OpenCL with the help of libclc, but I cannot find something similar for CUDA. Thanks
2018 Jan 30
0
Disable spilling sub-registers in LLVM
...es directly (via sub-regclass), then the spiller uses this class. Basically what I am saying is the spiller spills the value that contains the accesses. E.g., = v; will spill v = v.sub1; will spill v too, but v is a super register in that case. Cheers, -Quentin > On Jan 29, 2018, at 6:38 PM, ahmede via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi Matthias, > > No. I want the register allocator to spill the super-register (the large one, e.g., 64-bit) and not just the sub-register (e.g., the 32-bit that is a piece of the 64-bit register) because the stack loads/sto...
2015 Apr 08
2
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
On Wed, Apr 8, 2015 at 10:12 AM, Dmitry Mikushin <dmitry at kernelgen.org> wrote: > A tool of this kind here: https://github.com/apc-llc/nvcc-llvm-ir > > 2015-04-08 19:01 GMT+02:00 Ahmed ElTantawy <ahmede at ece.ubc.ca>: > >> Hi, >> >> I wanted to ask whether there is an ongoing effort (or an already >> established tool) that can convert CUDA kernels (that use CUDA >> specific intrinsics, e.g., threadId.x, __syncthreads(), ...) to LLVM IR. I >> am aware...
2018 Jan 30
0
Disable spilling sub-registers in LLVM
> On Jan 29, 2018, at 1:20 PM, ahmede via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi, > > > I wonder if there is a way in LLVM to disable spilling a register-class while still enabling the super-registers of this register-class to be spilled. What would you have the register allocator do when it runs ou...
2018 Jan 29
2
Disable spilling sub-registers in LLVM
Hi, I wonder if there is a way in LLVM to disable spilling a register class while still enabling the super-registers of this register class to be spilled. If not, how can we implement spilling for sub-registers when stack loads/stores can only operate on the super-registers? Is there a way, even if it is suboptimal? Thanks, Ahmed
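The question above can be illustrated with a toy model. This is only a sketch under stated assumptions, not LLVM code: ToyMachine, spill_sub and reload_sub are hypothetical names, and the layout (a 64-bit super-register made of two 32-bit halves, with stack slots that only hold 64 bits) is assumed from the thread. It shows the suboptimal route of spilling a sub-register by storing its whole super-register, then reloading only the half that is needed.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


class ToyMachine:
    """Hypothetical target: 64-bit super-registers split into two 32-bit halves."""

    def __init__(self):
        self.regs = {}   # super-register name -> 64-bit value
        self.stack = {}  # stack slot index -> 64-bit value (the only width supported)

    def write_sub(self, reg, half, value):
        """Write a 32-bit value into half 0 (low) or 1 (high) of a super-register."""
        shift = 32 * half
        cleared = self.regs.get(reg, 0) & ~(MASK32 << shift) & MASK64
        self.regs[reg] = cleared | ((value & MASK32) << shift)

    def spill_sub(self, reg, half, slot):
        """Spill a sub-register: only 64-bit stores exist, so store the whole super-register."""
        self.stack[slot] = self.regs.get(reg, 0)

    def reload_sub(self, reg, half, slot):
        """Reload a sub-register: load 64 bits, then copy back only the requested half."""
        wide = self.stack[slot]
        self.write_sub(reg, half, (wide >> (32 * half)) & MASK32)


m = ToyMachine()
m.regs["R0"] = 0x1111111122222222
m.spill_sub("R0", half=0, slot=0)   # spill the low half via a 64-bit store
m.regs["R0"] = 0xAAAAAAAABBBBBBBB   # both halves are overwritten later
m.reload_sub("R0", half=0, slot=0)  # only the low half is restored
print(hex(m.regs["R0"]))
```

The reload writes back only the requested half so that a newer value in the other half is not clobbered. The cost is a stack slot twice as wide as the spilled value.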
2015 Feb 03
2
[LLVMdev] Example for usage of LLVM/Clang/libclc
Hi, My goal is to use Clang/LLVM/libclc to compile an OpenCL kernel and eventually generate PTX code. I already did this, but I am not sure if the PTX code I am generating is correct (that is, the one that is supposed to be generated). For example, currently, in OpenCL: get_global_id(0) translates to, in LLVM: %call = tail call i32 @get_global_id(i32 0), which translates to, in PTX:
2015 Jun 19
2
[LLVMdev] Performance impact of different optimization passes
Hi, I was wondering if there is a paper or a technical report that documents the performance impact of the different optimization passes on some set of benchmarks. Is something like this available? Best regards, Ahmed