similar to: [GSoC 2016] Implementation of the packing transformation

Displaying 20 results from an estimated 200 matches similar to: "[GSoC 2016] Implementation of the packing transformation"

2016 Feb 03
3
opt with Polly doesn't find the passes
I just checked out the release_38 branches of llvm, clang and polly and built them on an x86 Ubuntu with cmake: CMAKE_BUILD_TYPE="Debug" CMAKE_INSTALL_PREFIX="$HOME/toolchain/install/llvm-3.8" LLVM_TARGETS_TO_BUILD="X86" cmake -G "Unix Makefiles" \ -DBUILD_SHARED_LIBS="ON" \ -DCMAKE_BUILD_TYPE=$CMAKE_BUILD_TYPE \
2016 Aug 03
3
Extracting the names of the variables that creates loop-carried dependencies
Hi, I would like to know if it is possible to extract the source level names of variables that create loop-carried dependencies. For example, for the following code: for (int i = 0; i < A_ROW; i++) { for (int j = 1; j < B_COL; j++) { a_matrix[i][j] = a_matrix[i][j - 1]; } } I get the following AST: #pragma omp parallel for
2012 Dec 17
1
[LLVMdev] [polly] ISL vector code generation
Hi, thanks to Tobias for doing most of the work to get the vector code generation working with ISL's code generation. Attached are two patches to port the vector code generation part of the testsuite from Cloog/CodeGen to Isl/CodeGen. Tobi, do you want to commit these two patches separately, or do you want me to combine them, as the second patch is needed to make the testsuite pass? Also, let
2016 Apr 11
2
[LICM][MemorySSA] Converting LICM pass to use MemorySSA to avoid AliasSet collapse issue
Hi All, I'm looking into converting LICM to use MemorySSA instead of AliasSets to determine when it is safe to hoist/sink/promote loads and stores to get around the issue of alias set collapse (see discussion [1]). I have a prototype implementation, but have run into two issues that I could use input from the designers of MemorySSA to resolve: 1) Is MemorySSA intended to be
2017 Sep 25
2
Potential infinite loop in MemorySSAUpdater
I understand that by changing the starting element to “InsertedPHIs.begin() + StartingPHISize” it will be finite, but only given that InsertedPHIs is finite. I have a case where one element (the same element is appended to InsertedPHIs) is added to InsertedPHIs every time fixupDefs is invoked. I traced why this was happening. template <class RangeType> MemoryAccess
2013 Jan 02
2
[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?
On 01/01/2013 02:45 PM, Duncan Sands wrote: > Hi Dmitry, > >> >> In our compiler we use a modified version of LLVM Polly, which is very >> sensitive to >> proper code generation. Among the number of limitations, the loop region >> (enclosed by phi node on induction variable and branch) is required to >> be free >> of additional memory-dependent
2017 Sep 25
0
Potential infinite loop in MemorySSAUpdater
We should only add phis that were newly inserted, not ones that were already found. There are two cases where we will have inserted phis: as part of the recursive call, or right in this function. The easiest way to differentiate new phis from old ones is whether they have 0 operands. I expect the attached will fix it. If not, please file a bug with reproducible IR. On Sun, Sep 24, 2017 at 11:40 PM,
2013 Jan 02
0
[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?
Hi Duncan & Tobi, Thanks a lot for your interest, and for pointing out differences in GIMPLE I missed. Attached is simplified test case. Is it good? Tobi, regarding runtime alias analysis: in KernelGen we already do it along with runtime values substitution. For example: <------------------ __kernelgen_main_loop_17: compile started ---------------------> Integer args substituted:
2017 Sep 23
0
Potential infinite loop in MemorySSAUpdater
On Sat, Sep 23, 2017 at 9:55 AM, Godala, Bhargav-reddy < Bhargav-reddy.Godala at amd.com> wrote: > > With regards > Bhargav Reddy Godala > Software Engineer 2 > Bangalore, India > E-mail: Bhargav-reddy.Godala at amd.com Ext 30678 > > > > On 23-Sep-2017, at 9:27 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > > On Sat, Sep 23, 2017 at
2016 Apr 20
2
[LICM][MemorySSA] Converting LICM pass to use MemorySSA to avoid AliasSet collapse issue
Hi Daniel, Thanks for the info. I’ve started looking into converting EarlyCSE to use MemorySSA first since 1) I don’t think it needs any additional MemorySSA update API and 2) the particular case I’m looking at needs EarlyCSE to catch more load cases before LICM to be profitable. I have a prototype working, but have run into two issues: 1) readonly calls are treated as clobbers by
2017 Sep 23
2
Potential infinite loop in MemorySSAUpdater
With regards Bhargav Reddy Godala Software Engineer 2 Bangalore, India E-mail: Bhargav-reddy.Godala at amd.com Ext 30678 On 23-Sep-2017, at 9:27 PM, Daniel Berlin <dberlin at dberlin.org> wrote: On Sat, Sep 23, 2017 at 8:38 AM, Godala, Bhargav-reddy via llvm-dev <llvm-dev at
2016 Apr 20
4
[LICM][MemorySSA] Converting LICM pass to use MemorySSA to avoid AliasSet collapse issue
1) Sounds good. This isn’t holding me up so I’ll just try to keep an eye out for these changes. 2) I’ve attached an example IR file and debug log of where the caching is going bad. It depends on my changes to EarlyCSE, but hopefully it is clear from the debug output what is going on. Let me know if there is a better way to get this repro case to you. Also, I’ll be on IRC for the
2011 Apr 08
1
[LLVMdev] [GSoC] Increase the coverage of Polly
How do I feed pocc with the jscop files generated by -polly-export-jscop? 2011/4/8 ether zhhb <etherzhhb at gmail.com>: > Hi, > > 2011/4/8 Vlad Krylov <krvladislav at gmail.com>: >> Hi. >> >> I see that to detect scops firstly we search for regions in CFG ( by >> RegionInfo ) and then select regions that answer some requirements ( >> in
2016 Jun 28
2
[GSoC 2016] Implementation of the packing transformation
2016-06-27 15:52 GMT+05:00 4lbert C0hen <4lbert.h.c0hen at gmail.com>: > Dear Roman and all, > > Such features would be extremely useful to implement array expansion (scalar > and array renaming, privatization with new subscript expressions of higher > dimension) and storage mapping optimization (generalizing array > contraction). It would be interesting to have these
2016 Jun 29
0
[GSoC 2016] Implementation of the packing transformation
On 06/28/2016 10:53 AM, Roman Gareev wrote: > 2016-06-27 15:52 GMT+05:00 4lbert C0hen <4lbert.h.c0hen at gmail.com>: >> Dear Roman and all, >> >> Such features would be extremely useful to implement array expansion (scalar >> and array renaming, privatization with new subscript expressions of higher >> dimension) and storage mapping optimization (generalizing
2016 Apr 21
2
[LICM][MemorySSA] Converting LICM pass to use MemorySSA to avoid AliasSet collapse issue
Hi George, After digging a little deeper, it appears that readonly calls showing up as MemoryDefs is only happening on an EarlyCSE test that is using the new pass manager (test/Transforms/EarlyCSE/basic.ll test5 if you’re curious), so I suspect it is an issue with the new pass manager setup code for either MemorySSA, my changes to EarlyCSE, the test run command line or something else not
2011 Nov 02
5
[LLVMdev] How to make Polly ignore some non-affine memory accesses
Mmm I found out a very strange behavior (to me) of the SCEV analysis of the loop bound of the external loop I posted. When in ScopDetection it gets the SCEV of the external loop bound in the "isValidLoop()" function with: const SCEV *LoopCount = SE->getBackedgeTakenCount(L); It returns a SCEVCouldNotCompute, but if I change the "if" block inside the loop from: if
2012 Apr 03
0
[LLVMdev] GSoC 2012 Proposal: Automatic GPGPU code generation for llvm
Hi Justin, the non-translatable IR with GPU code replaced by appropriate CUDA Driver > API calls. One of the CUDA driver APIs (cuLaunch) needs a PTX asm string as its input. So if I want to provide a one-touch solution and don't introduce any changes to tools outside polly, I must prepare the PTX string before I can generate the correct non-translatable IR part. As you suggested, it may
2017 Sep 01
10
[RFC] Polly Status and Integration
Hi everyone, As you may know, stock LLVM does not provide the kind of advanced loop transformations necessary to provide good performance on many applications. LLVM's Polly project provides many of the required capabilities, including loop transformations such as fission, fusion, skewing, blocking/tiling, and interchange, all powered by state-of-the-art dependence analysis. Polly also
2011 Apr 08
0
[LLVMdev] [GSoC] Increase the coverage of Polly
Hi, 2011/4/8 Vlad Krylov <krvladislav at gmail.com>: > Hi. > > I see that to detect scops firstly we search for regions in CFG ( by > RegionInfo ) and then select regions that answer some requirements ( > in ScopDetection ). Because only affine expressions in conditions and > bounds are permissible, we are trying to get scalar expressions into > affine form by