similar to: LoopVectorizer -- generating bad and unhandled shufflevector sequence

Displaying 20 results from an estimated 200 matches similar to: "LoopVectorizer -- generating bad and unhandled shufflevector sequence"

2016 Feb 09
3
[GVN] same sequence of instructions in if and else branch
Hello, I found that GVN doesn't promote identical sequence of instructions in if and else branch to their common predecessors. For example, for the following code snippet pred: … br i1 %cmp, label %if, label %else if: %incdec.ptr.1 = getelementptr inbounds i8, i8* %ptr, i64 1 %cast1 = ptrtoint i8* %incdec.ptr.1 to i64 … else: %incdec.ptr.2 = getelementptr inbounds i8, i8* %ptr,
2016 Feb 09
2
[GVN] same sequence of instructions in if and else branch
There is a phi-node "%phi = phi i64 [%cast1, %if], [%cast2, %else]" in the common successor. The actual control flow is a bit more complex, but there is a common successor block, and %cast1 and %cast2 are the two values that the phi node in the common successor takes. Does the existence of the phi node affect optimization? Thanks, Taewook From: <mats.o.petersson at
2013 Oct 21
5
[LLVMdev] First attempt at recognizing pointer reduction
Hi Nadav, Arnold, I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass. Basically what I do is to teach AddReductionVar() about pointers, saying they don't really have an exit instructions, and that (maybe) the final store is a good candidate (is it?). This makes it recognize the writes and reads, but then
2016 Feb 09
2
[GVN] same sequence of instructions in if and else branch
and by "right thing" i mean it can hoist if you want and it can prove it will not extend the live range. Note that VBE (very busy expressions) is a code size optimization only. It does not save time. On Tue, Feb 9, 2016 at 12:26 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > This GVN does not do that, this is correct. It is a very simple GVN. All > phi nodes are
2013 Aug 22
2
[LLVMdev] scev questions
Hi, I'm trying to get the following loop to vectorize (simple reduction): unsigned int sum2(unsigned int *a, int len){ unsigned int s = 0; for (int i = 0; i < len; i += 4) s += *a++; return s; } The loop fails to vectorize because SCEV could not compute the loop exit count. It appears SCEV cannot handle the non-unit increment of the loop counter. Is this a known limitation of
2013 Sep 11
0
[LLVMdev] removing unnecessary ZEXT
Hi Andrew, Thank you for the suggestion. I've looked at CodeGenPrepare.cpp and MoveExtToFormExtLoad() is never run. I also notice that the ARM target produces the same additional register usage (copy) and zero extending (of the copy). (See the usage of r3 &r5 and also r12 & r4 in attached file arm-strcspn.s, my understanding is that 'ldrb' is zero extending.) Here is a
2007 Jan 06
2
[LLVMdev] More detailed example...
> How are you compiling this? I get the following sort of output: > > llvm-gcc incdec.cpp -o incdec I am currently using LLVM 1.8 -- I was basically holding off porting until LLVM 2.0 stabilises because I want to be able to move to 64 bit Intel and don't want to have to hit a moving target. > void %inc(int* %p) { > entry: > %tmp = volatile load int* %p
2016 May 05
2
No remapping of clone instruction in CloneBasicBlock
Hi, Found CloneBasicBlock utility only does the cloning without any remapping. Consider below example: Input block: sw.epilog: ; preds = %sw.bb20, %sw.bb15, %sw.bb10, %sw.bb6, %sw.bb2, %sw.bb, %while.body, %if.end29 %no_final.1 = phi i32 [ %no_final.055, %while.body ], [ 1, %if.end29 ], [ %no_final.055, %sw.bb20 ], [ %no_final.055, %sw.bb15 ], [
2013 Feb 14
1
[LLVMdev] LiveIntervals analysis problem
Hello everyone, please I need your help. To reproduce my problem I created simple pass for backends (TestPass.cpp in attached files). That pass I call from Mips backend in this way (MipsTargetMachine.cpp): bool MipsPassConfig::addPreRegAlloc() { addPass(createTestPass()); return false; } The problem becomes, when I am trying compile file ldtoa.ll (in attached files). Compiling
2012 Jan 26
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, Jan 26, 2012 at 3:19 PM, Hal Finkel <hfinkel at anl.gov> wrote: > On Thu, 2012-01-26 at 15:12 -0600, Sebastian Pop wrote: >> On Thu, Jan 26, 2012 at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> > Thanks! Did you compile with any non-default flags other than -mllvm >> > -vectorize? >> >> I used -O3 and -vectorize, no other non-default
2012 Jan 26
3
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, 2012-01-26 at 15:12 -0600, Sebastian Pop wrote: > On Thu, Jan 26, 2012 at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > Thanks! Did you compile with any non-default flags other than -mllvm > > -vectorize? > > I used -O3 and -vectorize, no other non-default flags. If I run clang -O3 -mllvm -vectorize -S -emit-llvm -o test.ll test.c then I get no
2012 Jan 26
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, Jan 26, 2012 at 3:41 PM, Hal Finkel <hfinkel at anl.gov> wrote: > On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote: >> arm-none-linux-gnueabi > > Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get Minor remark: please use -target instead of -ccc-host-triple that is now deprecated. Thanks for looking at this testcase. Sebastian -- Qualcomm
2012 Jan 26
2
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote: > arm-none-linux-gnueabi Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get vectorization (even though I don't get vectorization when targeting x86_64). I'll let you know what I find. -Hal -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory
2013 Oct 21
0
[LLVMdev] First attempt at recognizing pointer reduction
Renato, This looks like the right direction. Did you run it on the LLVM test suite to check if it finds new loops to vectorize ? Thanks, Nadav On Oct 21, 2013, at 8:23 AM, Renato Golin <renato.golin at linaro.org> wrote: > Hi Nadav, Arnold, > > I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass. >
2007 Jan 06
3
[LLVMdev] More detailed example...
Further to that, I thought an example might be useful. In the following code, n should end up with a value that varies nondeterministically with the scheduling decisions made by the underlying run time system -- my model checker, for example, should in theory be able to enumerate all possible values. Anyway, if you look at the compiler output (see below), the volatile global variable, n, has
2013 Oct 21
0
[LLVMdev] First attempt at recognizing pointer reduction
Renato, can you post a hand-created vectorized IR of how a reduction would work on your example? I don’t think that recognizing this as a reduction is going to get you far. A reduction is beneficial if the value reduced is only truly needed outside of a loop. This is not the case here (we are storing/loading from the pointer). Your example is something like WRITEPTR = phi i8* [ outsideval,
2013 Sep 11
2
[LLVMdev] removing unnecessary ZEXT
On Sep 10, 2013, at 8:59 AM, Robert Lytton <robert at xmos.com> wrote: > Hi, > > A bit more information. > I believe my problem lies with the fact that the load is left as 'anyext from i8'. > On the XCore target we know this will become an 8bit zext load - as there is no 8bit sign extended load! > If BB#1 were to force the load to a "zext from i8" would
2006 Dec 14
5
persistant: Matlab->R
Dear list members, Could anyone tell me if there is an equivalent of the Matlab declaration 'persistant' in R? Thank you very much, Bernard Gregorry. (Matlaber converted to R). --------------------------------- [[alternative HTML version deleted]]
2013 Aug 22
0
[LLVMdev] scev questions
On 22 August 2013 13:24, Redmond, Paul <paul.redmond at intel.com> wrote: > Hi, > > I'm trying to get the following loop to vectorize (simple reduction): > > unsigned int sum2(unsigned int *a, int len){ > unsigned int s = 0; > for (int i = 0; i < len; i += 4) > s += *a++; > return s; > } > > > The loop fails to vectorize because SCEV
2012 Feb 02
0
[LLVMdev] How to improve code generated for 'getelementptr' ?
Hi all, I am working on an llvm backend for a processor with a relative simple instruction set. For small loops, the code that is produced depends heavily on how the loop is specified: The less information we provide to clang, the better the loop code becomes... Any idea how I can learn llvm that we don't have load/store instructions with register index, so that it is more efficient to