thr3ads.net - similar to: "Loop Unrolling Fail in Simple Vectorized loop"

Displaying 20 results from an estimated 900 matches similar to: "Loop Unrolling Fail in Simple Vectorized loop"

Loop Unrolling Fail in Simple Vectorized loop

2016 Oct 13

Loop Unrolling Fail in Simple Vectorized loop

Thanks for the explanation. But I am a little confused with the following fact. Can't LLVM keep vectorizable_elements as a symbolic value and convert the loop to say; for(unsigned i = 0; i < vectorizable_elements ; i += 2){ //main loop } for(unsigned i=0 ; i < vectorizable_elements % 2; i++){ //fix up } Why does it have to reason about the range of vectorizable_elements? Even

Loop Unrolling Fail in Simple Vectorized loop

2016 Oct 13

Loop Unrolling Fail in Simple Vectorized loop

If count > MAX_UINT-4 your loop loops indefinitely with an increment of 4, I think. On Thu, Oct 13, 2016 at 4:42 PM, Charith Mendis via llvm-dev < llvm-dev at lists.llvm.org> wrote: > So, I tried unrolling the following simple loop. > > int unroll(unsigned * a, unsigned * b, unsigned *c, unsigned count){ > > for(unsigned i=0; i<count; i++){ > > a[i] =

Legacy Loop Pass Manager question

2018 Aug 09

Legacy Loop Pass Manager question

Hi, If we add multiple loop passes to the pass manager in PassManagerBuilder.cpp consecutively without any func/module pass in between, I used to think they would belong to the same loop pass manager. But it does not seem to be the case. For example for this code snippet PM.add(createIndVarSimplifyPass()); // Canonicalize indvars MPM.add(createLoopIdiomPass()); //

llc error

2016 Sep 03

llc error

I updated to the latest revision and now llvm does not build and quits cmake with CMake Error at cmake/modules/LLVMProcessSources.cmake:83 (message): Found unknown source file ../llvm-revec/lib/CodeGen/MachineFunctionAnalysis.cpp Please update ../llvm-revec/lib/CodeGen/CMakeLists.txt Thanks On Sat, Sep 3, 2016 at 2:09 AM, Craig Topper <craig.topper at gmail.com> wrote: >

Vectorization in LLVM x86 backend

2017 Aug 21

Vectorization in LLVM x86 backend

I isolated the LLVM IR and the X86 instructions emitted for the function and are attached herewith and it is clearly emitting vector instructions. I am having a hard time figuring out where the vector instructions are formulated. For sure SLP and Loop vectorizer is not doing anything. On Mon, Aug 21, 2017 at 11:56 AM, Craig Topper <craig.topper at gmail.com> wrote: > The X86 backend

llc error

2016 Sep 03

llc error

Hi all, The attached LLVM assembly file fails to generate x86 code when compiled using llc. compilation command - ../llvm-build/bin/llc -filetype=asm -march=x86-64 -mcpu=core-avx2 ex4.ll The error message is, LLVM ERROR: Cannot select: t95: v8f32 = X86ISD::SUBV_BROADCAST t17 t17: v4f32,ch = load<LD16[%scevgep](tbaa=<0x4dbcd98>)> t0, t16, undef:i64 t16: i64 = add t2,

Changing Alignment of global variables in LLVM

2017 Oct 03

Changing Alignment of global variables in LLVM

If I know for sure I am accessing 32 byte chunks at a time, how can I go about changing the alignment of @u? Should I use DataLayout's reset method? I couldn't find a method to change alignment of one global variable. Thanks On Tue, Oct 3, 2017 at 6:34 PM, Matthias Braun <mbraun at apple.com> wrote: > The effective alignment is part of the load and store operations. Updating

Changing Alignment of global variables in LLVM

2017 Oct 03

Changing Alignment of global variables in LLVM

What is the best way to change the alignment of global variables and allocated structures in LLVM during one of its optimization passes? For example, I want to change, @u = internal unnamed_addr global [5 x [65 x [65 x [65 x double]]]] zeroinitializer, align 16 to align to 32 bytes. How can this be accomplished so that all other references in the code accessing this structure are also

Vectorization in LLVM x86 backend

2017 Aug 21

Vectorization in LLVM x86 backend

Hi all, Recently I compiled the attached .c file using Clang with "-mavx2 -mfma -m32 -O3" optimization flags. First I used -emit-llvm and inspected the LLVM IR and there are no vector instructions. Then I got the assembly output of the file in it I can clearly see vector instructions in it. Neither the SLPVectorizer or the LoopVectorizer is however doing any vectorization (also

Getting the symbolic expression for an address calculation

2016 Oct 04

Getting the symbolic expression for an address calculation

How do you generate a SCEVAddRecExpr from a SCEV? It tried dyn_casting and it seems like that the SCEV returned by getSCEV is not a SCEVAddRecExpr. Thanks On Fri, Sep 30, 2016 at 4:16 PM, Friedman, Eli <efriedma at codeaurora.org> wrote: > On 9/30/2016 12:16 PM, Charith Mendis via llvm-dev wrote: > >> >> Hi all, >> >> What is the best way to get the symbolic

Getting the symbolic expression for an address calculation

2016 Sep 30

Getting the symbolic expression for an address calculation

Hi all, What is the best way to get the symbolic expression for an address calculation in llvm specially when memory addresses are calculated within a loop. Use case: I want to know what loop induction variables are used for a particular address calculation and in what symbolic context. Thereby, I want to identify which stores and loads will be contiguous in memory if I unroll each of the

Vector Shuffle chain lowering to X86 instructions simplification inconsistencies

2016 Oct 28

Vector Shuffle chain lowering to X86 instructions simplification inconsistencies

Hi all, Attached herewith is a fairly simple LLVM file (shuffle.ll) with lots of vector shuffles. When I use llc with -O3 -mcpu=core-avx2 the first shuffle sequence containing types of 128 wide gets reduced a single shuffle, where as the second shuffle sequence containing types of 256 wide doesn't get reduced to a single shuffle instruction in the resulting X86 code (Shuffle.s attached).

Pass ordering - GVN vs. loop optimizations

2017 Dec 21

Pass ordering - GVN vs. loop optimizations

Hi, This is Ariel from the Rust team again. I am having another pass ordering issue. Looking at the pass manager at https://github.com/llvm-mirror/llvm/blob/7034870f30320d6fbc74effff539d946018cd00a/lib/Transforms/IPO/PassManagerBuilder.cpp (the early SimplifyCfg now doesn't sink stores anymore! I can't wait until I can get to use that in rustc!) I find that the loop optimization group

[LLVMdev] [polly] pass ordering

2013 Apr 17

[LLVMdev] [polly] pass ordering

On 04/17/2013 09:04 PM, Sebastian Pop wrote: > Tobias Grosser wrote: >> As said before, we could probably add it in between those two passes: >> >> MPM.add(createReassociatePass()); // Reassociate expressions >> + addExtensionsToPM(EP_LoopOptimizerStart, MPM); >> MPM.add(createLoopRotatePass()); // Rotate Loop > > As this is in the middle of other

[LLVMdev] [Polly] Move Polly's execution later

2013 Sep 25

[LLVMdev] [Polly] Move Polly's execution later

Here is an update about moving Polly later. 1. Why does Polly generate incorrect code when we move Polly immediately after the loop rotating pass? It is mainly caused by a wrong polly merge block. When Polly detects a valid loop for Polyhedral transformations, it usually introduces a new basic block "polly.merge_new_and_old" after the original loop exit block. This new basic block

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 18

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

Andy and I briefly discussed this the other day, we have not yet got chance to list a detailed pass order for the pre- and post- IPO scalar optimizations. This is wish-list in our mind: pre-IPO: based on the ordering he propose, get rid of the inlining (or just inline tiny func), get rid of all loop xforms... post-IPO: get rid of inlining, or maybe we still need it, only

[LLVMdev] [Polly] Move Polly's execution later

2013 Sep 25

[LLVMdev] [Polly] Move Polly's execution later

On 09/25/2013 04:55 AM, Star Tan wrote: > Here is an update about moving Polly later. Hi star tan, thanks for your report. > > 1. Why does Polly generate incorrect code when we move Polly immediately after the loop rotating pass? > > It is mainly caused by a wrong polly merge block. When Polly detects a valid loop for Polyhedral transformations, it usually introduces a new basic

[LLVMdev] [polly] pass ordering

2013 Apr 17

[LLVMdev] [polly] pass ordering

Tobias Grosser wrote: > As said before, we could probably add it in between those two passes: > > MPM.add(createReassociatePass()); // Reassociate expressions > + addExtensionsToPM(EP_LoopOptimizerStart, MPM); > MPM.add(createLoopRotatePass()); // Rotate Loop As this is in the middle of other LNO passes, can you please rename s/EP_LoopOptimizerStart/EP_Polly_LNO/ or

How to run InternalizePass

2015 Dec 20

How to run InternalizePass

I'm working on a whole program optimizer that uses LLVM as a library, and one of the things I want to do is eliminate dead global functions and variables even when they are not local to a module. (This understandably doesn't happen by default because the optimizer has to assume it could be compiling a library rather than a program.) I've actually written a function to do this, but

The way Puppet installs things fail

2013 Jan 31

The way Puppet installs things fail

Basically, the way puppet installs things things with /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install <package_name> fails due to authentication WARNING: The following packages cannot be authenticated But if I do it from an ssh console normally, using apt-get install <package_name> it works fine without issues. Is there a way to change how puppet uses the

similar to: Loop Unrolling Fail in Simple Vectorized loop