similar to: [LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?

2014 Oct 14
7
[LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?
> On Oct 14, 2014, at 8:53 AM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > > >> On Oct 13, 2014, at 5:56 PM, Chandler Carruth <chandlerc at gmail.com> wrote: >> >> I've added a straw-man of some extra optimization passes that help specific benchmarks here or there by either preparing code better on the way into the vectorizer or cleaning
2014 Oct 14
4
[LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?
I’ll summarize your responses as: The new pipeline produces better results than the old, and we currently have no good mechanism for reducing the compile time overhead. I’ll summarize my criticism as: In principle, there are better ways to clean up after the vectorizer without turning it into a complicated megapass, but no one has done the engineering. I don’t think cleaning up after the
2014 Oct 14
3
[LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?
For what it is worth, I agree with the usefulness of having a concept of "cleanup pass". Another example of a situation where it would be nice is in the fence elimination patch I sent for review recently: the pass is rather expensive because it relies on several analysis passes, and is only useful if AtomicExpand introduced fences. Being able to say "Only run this pass if the code
2015 Jan 17
3
[LLVMdev] loop multiversioning
Does LLVM have loop multiversioning? It seems it does not, judging by clang++ -O3 -mllvm -debug-pass=Arguments program.c -c bash-4.1$ clang++ -O3 -mllvm -debug-pass=Arguments fast_algorithms.c -c clang-3.6: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated Pass Arguments: -datalayout -notti -basictti -x86tti -targetlibinfo -no-aa -tbaa -scoped-noalias
2015 May 11
2
[LLVMdev] about MemoryDependenceAnalysis usage
add -basicaa to your command line :) On Mon, May 11, 2015 at 7:15 AM, Willy WOLFF <willy.mh.wolff at gmail.com> wrote: > I played a bit more with MemoryDependenceAnalysis by wrapping my pass, and > explicitly called BasicAliasAnalysis. It's still using No Alias Analysis. > > How can I let MemoryDependenceAnalysis use BasicAliasAnalysis? > > Please find my pass attached. >
2015 Mar 12
3
[LLVMdev] Question about shouldMergeGEPs in InstructionCombining
I think it would make sense for (1) and (2). I am not sure if (3) is feasible in instcombine. (I am not too familiar with LoopInfo.) For Octasic's Opus platform, I modified shouldMergeGEPs in our fork to: if (GEP.hasAllZeroIndices() && !Src.hasAllZeroIndices() && !Src.hasOneUse()) return false; return Src.hasAllConstantIndices(); // was return false;
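The predicate quoted in the snippet above can be read more easily when laid out as a function. The following is only a sketch: GepModel and shouldMergeGEPsModel are hypothetical stand-ins invented for illustration, not the LLVM GEPOperator class or the actual InstCombine function, and the model captures only the three boolean queries named in the message.

```cpp
// Hypothetical stand-in for the three queries the snippet calls on a GEP.
// This is NOT llvm::GEPOperator; it only carries the boolean answers.
struct GepModel {
    bool allZeroIndices;      // every index is the constant 0
    bool allConstantIndices;  // every index is a compile-time constant
    bool oneUse;              // the GEP has exactly one user

    bool hasAllZeroIndices() const { return allZeroIndices; }
    bool hasAllConstantIndices() const { return allConstantIndices; }
    bool hasOneUse() const { return oneUse; }
};

// Mirrors the modified predicate from the quoted message (assumed layout):
// GEP is the outer instruction, Src is the GEP that feeds it.
bool shouldMergeGEPsModel(const GepModel &GEP, const GepModel &Src) {
    // Do not fold an all-zero outer GEP into a non-trivial source GEP
    // that other instructions also use.
    if (GEP.hasAllZeroIndices() && !Src.hasAllZeroIndices() && !Src.hasOneUse())
        return false;
    // The fork's change: merge only when the source indices are all constant.
    // (The message's own comment on this line reads "was return false".)
    return Src.hasAllConstantIndices();
}
```

Under this model, merging is refused whenever the source GEP has a non-constant index, regardless of how many users it has.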
2013 Jul 17
5
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions. To centralize the discussion, until we get some documentation and better APIs in place, let me throw out an oversimplified Straw Man for a new pass pipeline. It serves two purposes: (1) an overdue reorganization of
2016 May 09
4
Some questions about phase ordering in OPT and LLC
Hi, I'm a PhD student doing phase ordering as part of my PhD topic and I would like to ask some questions about LLVM. Executing the following command to see what passes OPT executes when targeting a SPARC V8 processor: /opt/clang+llvm-3.7.1-x86_64-linux-gnu-ubuntu-15.10/bin/llvm-as < /dev/null | /opt/clang+llvm-3.7.1-x86_64-linux-gnu-ubuntu-15.10/bin/opt -O3 -march=sparc -mcpu=v8
2014 Aug 07
3
[LLVMdev] How to broaden the SLP vectorizer's search
The BB vectorizer has an option 'bb-vectorizer-search-limit'. Is there a similar option for the SLP vectorizer? Maybe an analysis pass's scope that can be widened? I have large basic blocks with instructions that should be merged into packed versions. However, the blocks are optimized independently from each other. Now, if the instructions to be merged aren't too far apart the
2013 Jun 24
0
[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its
On Mon, Jun 24, 2013 at 2:59 PM, Nadav Rotem <nrotem at apple.com> wrote: > I agree. The vectorizer is a *lowering* pass, and much like LSR and it > loses information. A few months ago some of us talked about this and came > up with a general draft for the ideal pass ordering. > Where? On the mailing list? > If I remember correctly the plan was that the second half of the
2016 Aug 25
4
CFLAA
(and sys::cas_flag that STATISTIC uses is a uint32 ...) On Thu, Aug 25, 2016 at 9:54 AM, Daniel Berlin <dberlin at dberlin.org> wrote: > Okay, dumb question: > Are you really getting negative numbers in the second column? > > 526,766 -136 mem2reg # PHI nodes inserted > > http://llvm.org/docs/doxygen/html/PromoteMemoryToRegister_8cpp_source.html >
2016 May 09
2
Some questions about phase ordering in OPT and LLC
On Mon, May 09, 2016 at 01:07:07PM -0700, Mehdi Amini via llvm-dev wrote: > > > On May 9, 2016, at 10:43 AM, Ricardo Nobre via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > > Hi, > > > > I'm a PhD student doing phase ordering as part of my PhD topic and I would like to ask some questions about LLVM. > > > > Executing the following
2013 Jun 24
3
[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its
> > Just for the record, I have no real expectation that this is a good idea yet... But it's hard to collect numbers without a flag of some kind, and it's also really annoying to craft this flag given the current pass manager, so I figured I would get a skeleton in place that folks could experiment with, and we could keep or delete based on this discussion and any numbers. I agree.
2015 May 09
2
[LLVMdev] about MemoryDependenceAnalysis usage
Hi, I am trying to use MemoryDependenceAnalysis in a pass to analyse a simple function: void fct (int *restrict*restrict M, int *restrict*restrict L) { S1: M[1][1] = 1; S2: L[2][2] = 2; } When I iterate over MemoryDependenceAnalysis on the S2 statement, I get the load instruction for the first depth of the array, that’s ok. But I also get the load and store for the S1 statement. I assume the
2014 Oct 16
2
[LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?
It seems that adding -extra-vectorizer-passes doesn't help the vectorizer in my case. Re-running LoopRotation does nothing. 2014-10-15 2:54 GMT+04:00 Chandler Carruth <chandlerc at google.com>: > > On Tue, Oct 14, 2014 at 3:50 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> >> > I have and will continue to push >> > back on trying to add it until we at least
2015 Mar 12
2
[LLVMdev] Question about shouldMergeGEPs in InstructionCombining
Hi Mark, It is not clear to me at all that preventing the merging is the right solution. There are a large number of analyses, including alias analysis, and optimizations that use GetUnderlyingObject and related routines to search back through GEPs. They only do this up to some small finite depth (six, IIRC). So reducing the GEP depth is likely the right solution for InstCombine (which has the
2016 Aug 25
2
CFLAA
I gathered aggregate statistics reported by “-stats” over the ~400 test files. The following table summarizes the impact. The first column is the sum where the new analysis is enabled, the second column is the delta from baseline where no CFL alias analysis is performed. I am not experienced enough to know which of these are “good” or “bad” indicators. -david 72,250 685 SLP
2014 Mar 12
2
[LLVMdev] Autovectorization questions
Hi, I'm reading "http://llvm.org/docs/Vectorizers.html" and have a few questions. I hope someone can answer them. The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions that scatter/gathers memory. ( http://llvm.org/docs/Vectorizers.html#scatter-gather) int foo(int *A, int *B, int n, int k) { for (int i = 0; i < n; ++i) A[i*7] += B[i*k]; } I
2014 Mar 12
4
[LLVMdev] Autovectorization questions
In order to vectorize code like this LLVM needs to prove that “A[i*7]” does not wrap in the address space. It fails to do so and so LLVM doesn’t vectorize this loop even if we try to force it. The following loop will be vectorized if we force it: int foo(int * A, int * B, int n, int k) { for (int i = 0; i < 1024; ++i) A[i] += B[i*k]; } So will this loop: int foo(int * restrict A, int
2014 Oct 14
2
[LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?
----- Original Message ----- > From: "Chandler Carruth" <chandlerc at google.com> > To: "Robin Morisset" <morisset at google.com> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "James Molloy" <james at jamesmolloy.co.uk>, "LLVM Developers Mailing List" > <llvmdev at cs.uiuc.edu> > Sent: Tuesday, October 14,