thr3ads.net - similar to: "LoopVectorizer -- generating bad and unhandled shufflevector sequence"

Displaying 20 results from an estimated 200 matches similar to: "LoopVectorizer -- generating bad and unhandled shufflevector sequence"

[GVN] same sequence of instructions in if and else branch

2016 Feb 09

[GVN] same sequence of instructions in if and else branch

Hello, I found that GVN doesn't promote identical sequence of instructions in if and else branch to their common predecessors. For example, for the following code snippet pred: … br i1 %cmp, label %if, label %else if: %incdec.ptr.1 = getelementptr inbounds i8, i8* %ptr, i64 1 %cast1 = ptrtoint i8* %incdec.ptr.1 to i64 … else: %incdec.ptr.2 = getelementptr inbounds i8, i8* %ptr,

[GVN] same sequence of instructions in if and else branch

2016 Feb 09

[GVN] same sequence of instructions in if and else branch

There is a phi-node "%phi = phi i64 [%cast1, %if], [%cast2, %else]" in the common successor. The actual control flow is a bit more complex, but there is a common successor block, and %cast1 and %cast2 are the two values that the phi node in the common successor takes. Does the existence of the phi node affect optimization? Thanks, Taewook From: <mats.o.petersson at

[LLVMdev] First attempt at recognizing pointer reduction

2013 Oct 21

[LLVMdev] First attempt at recognizing pointer reduction

Hi Nadav, Arnold, I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass. Basically what I do is to teach AddReductionVar() about pointers, saying they don't really have an exit instructions, and that (maybe) the final store is a good candidate (is it?). This makes it recognize the writes and reads, but then

[GVN] same sequence of instructions in if and else branch

2016 Feb 09

[GVN] same sequence of instructions in if and else branch

and by "right thing" i mean it can hoist if you want and it can prove it will not extend the live range. Note that VBE (very busy expressions) is a code size optimization only. It does not save time. On Tue, Feb 9, 2016 at 12:26 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > This GVN does not do that, this is correct. It is a very simple GVN. All > phi nodes are

[LLVMdev] scev questions

2013 Aug 22

[LLVMdev] scev questions

Hi, I'm trying to get the following loop to vectorize (simple reduction): unsigned int sum2(unsigned int *a, int len){ unsigned int s = 0; for (int i = 0; i < len; i += 4) s += *a++; return s; } The loop fails to vectorize because SCEV could not compute the loop exit count. It appears SCEV cannot handle the non-unit increment of the loop counter. Is this a known limitation of

[LLVMdev] removing unnecessary ZEXT

2013 Sep 11

[LLVMdev] removing unnecessary ZEXT

Hi Andrew, Thank you for the suggestion. I've looked at CodeGenPrepare.cpp and MoveExtToFormExtLoad() is never run. I also notice that the ARM target produces the same additional register usage (copy) and zero extending (of the copy). (See the usage of r3 &r5 and also r12 & r4 in attached file arm-strcspn.s, my understanding is that 'ldrb' is zero extending.) Here is a

[LLVMdev] More detailed example...

2007 Jan 06

[LLVMdev] More detailed example...

> How are you compiling this? I get the following sort of output: > > llvm-gcc incdec.cpp -o incdec I am currently using LLVM 1.8 -- I was basically holding off porting until LLVM 2.0 stabilises because I want to be able to move to 64 bit Intel and don't want to have to hit a moving target. > void %inc(int* %p) { > entry: > %tmp = volatile load int* %p

No remapping of clone instruction in CloneBasicBlock

2016 May 05

No remapping of clone instruction in CloneBasicBlock

Hi, Found CloneBasicBlock utility only does the cloning without any remapping. Consider below example: Input block: sw.epilog: ; preds = %sw.bb20, %sw.bb15, %sw.bb10, %sw.bb6, %sw.bb2, %sw.bb, %while.body, %if.end29 %no_final.1 = phi i32 [ %no_final.055, %while.body ], [ 1, %if.end29 ], [ %no_final.055, %sw.bb20 ], [ %no_final.055, %sw.bb15 ], [

[LLVMdev] LiveIntervals analysis problem

2013 Feb 14

[LLVMdev] LiveIntervals analysis problem

Hello everyone, please I need your help. To reproduce my problem I created simple pass for backends (TestPass.cpp in attached files). That pass I call from Mips backend in this way (MipsTargetMachine.cpp): bool MipsPassConfig::addPreRegAlloc() { addPass(createTestPass()); return false; } The problem becomes, when I am trying compile file ldtoa.ll (in attached files). Compiling

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2012 Jan 26

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

On Thu, Jan 26, 2012 at 3:19 PM, Hal Finkel <hfinkel at anl.gov> wrote: > On Thu, 2012-01-26 at 15:12 -0600, Sebastian Pop wrote: >> On Thu, Jan 26, 2012 at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> > Thanks! Did you compile with any non-default flags other than -mllvm >> > -vectorize? >> >> I used -O3 and -vectorize, no other non-default

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2012 Jan 26

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

On Thu, 2012-01-26 at 15:12 -0600, Sebastian Pop wrote: > On Thu, Jan 26, 2012 at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > Thanks! Did you compile with any non-default flags other than -mllvm > > -vectorize? > > I used -O3 and -vectorize, no other non-default flags. If I run clang -O3 -mllvm -vectorize -S -emit-llvm -o test.ll test.c then I get no

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2012 Jan 26

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

On Thu, Jan 26, 2012 at 3:41 PM, Hal Finkel <hfinkel at anl.gov> wrote: > On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote: >> arm-none-linux-gnueabi > > Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get Minor remark: please use -target instead of -ccc-host-triple that is now deprecated. Thanks for looking at this testcase. Sebastian -- Qualcomm

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2012 Jan 26

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote: > arm-none-linux-gnueabi Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get vectorization (even though I don't get vectorization when targeting x86_64). I'll let you know what I find. -Hal -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory

[LLVMdev] First attempt at recognizing pointer reduction

2013 Oct 21

[LLVMdev] First attempt at recognizing pointer reduction

Renato, This looks like the right direction. Did you run it on the LLVM test suite to check if it finds new loops to vectorize ? Thanks, Nadav On Oct 21, 2013, at 8:23 AM, Renato Golin <renato.golin at linaro.org> wrote: > Hi Nadav, Arnold, > > I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass. >

[LLVMdev] More detailed example...

2007 Jan 06

[LLVMdev] More detailed example...

Further to that, I thought an example might be useful. In the following code, n should end up with a value that varies nondeterministically with the scheduling decisions made by the underlying run time system -- my model checker, for example, should in theory be able to enumerate all possible values. Anyway, if you look at the compiler output (see below), the volatile global variable, n, has

[LLVMdev] First attempt at recognizing pointer reduction

2013 Oct 21

[LLVMdev] First attempt at recognizing pointer reduction

Renato, can you post a hand-created vectorized IR of how a reduction would work on your example? I don’t think that recognizing this as a reduction is going to get you far. A reduction is beneficial if the value reduced is only truly needed outside of a loop. This is not the case here (we are storing/loading from the pointer). Your example is something like WRITEPTR = phi i8* [ outsideval,

[LLVMdev] removing unnecessary ZEXT

2013 Sep 11

[LLVMdev] removing unnecessary ZEXT

On Sep 10, 2013, at 8:59 AM, Robert Lytton <robert at xmos.com> wrote: > Hi, > > A bit more information. > I believe my problem lies with the fact that the load is left as 'anyext from i8'. > On the XCore target we know this will become an 8bit zext load - as there is no 8bit sign extended load! > If BB#1 were to force the load to a "zext from i8" would

persistant: Matlab->R

2006 Dec 14

persistant: Matlab->R

Dear list members, Could anyone tell me if there is an equivalent of the Matlab declaration 'persistant' in R? Thank you very much, Bernard Gregorry. (Matlaber converted to R). --------------------------------- [[alternative HTML version deleted]]

[LLVMdev] scev questions

2013 Aug 22

[LLVMdev] scev questions

On 22 August 2013 13:24, Redmond, Paul <paul.redmond at intel.com> wrote: > Hi, > > I'm trying to get the following loop to vectorize (simple reduction): > > unsigned int sum2(unsigned int *a, int len){ > unsigned int s = 0; > for (int i = 0; i < len; i += 4) > s += *a++; > return s; > } > > > The loop fails to vectorize because SCEV

[LLVMdev] How to improve code generated for 'getelementptr' ?

2012 Feb 02

[LLVMdev] How to improve code generated for 'getelementptr' ?

Hi all, I am working on an llvm backend for a processor with a relative simple instruction set. For small loops, the code that is produced depends heavily on how the loop is specified: The less information we provide to clang, the better the loop code becomes... Any idea how I can learn llvm that we don't have load/store instructions with register index, so that it is more efficient to

similar to: LoopVectorizer -- generating bad and unhandled shufflevector sequence