Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] LLVM on ARM A15"
2013 Jan 04
0
[LLVMdev] LLVM on ARM A15
Hi Renato,
The tests below are target independent and should pass.
Did you build the x86 backend? I think the problem is that opt uses the triple from the LL file to initialize the backends and the cost model, so the tests fail if you don't have the x86 backend. I will fix the tests shortly.
Thanks,
Nadav
On Jan 4, 2013, at 3:45 PM, Renato Golin <renato.golin at
2017 Apr 14
2
Separate LoopVectorize LLVM pass
Hello.
I am trying to create my own LoopVectorize.cpp pass as a separate pass from the LLVM
trunk, as described in http://llvm.org/docs/CMake.html#embedding-llvm-in-your-project. Did
anybody try something like this?
I added close to the end of the .cpp file:
/* this line seems to be required - it allows running this pass
as an embedded pass by giving opt -my-loop-vectorize
2014 Mar 18
4
[LLVMdev] E = L->begin() in LoopVectorize
Hi,
I'm studying the loop vectorizer. I don't understand the code yet, but
assigning L->begin() to E does not look right. Is it a typo?
Thanks,
Liang
diff --git a/lib/Transforms/Vectorize/LoopVectorize.cpp
b/lib/Transforms/Vectorize/LoopVectorize.cpp
index 435c005..87b5d79 100644
--- a/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@
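To illustrate why initializing E from begin() is suspicious, here is a minimal sketch. ToyLoop and both helper functions are hypothetical stand-ins written for this note; they are not LLVM's Loop class and not the code touched by the (truncated) diff above. The sketch only shows the general iterator pattern: when E is begin(), the condition I != E is false on entry and the loop body never runs.

```cpp
#include <vector>

// Hypothetical stand-in for a container of sub-loops; NOT LLVM's Loop class.
struct ToyLoop {
  std::vector<int> SubLoops;
  std::vector<int>::const_iterator begin() const { return SubLoops.begin(); }
  std::vector<int>::const_iterator end() const { return SubLoops.end(); }
};

// Suspicious pattern: E is initialized from begin(), so I == E on entry
// and the body is never executed.
int countVisitedBuggy(const ToyLoop &L) {
  int Visited = 0;
  for (auto I = L.begin(), E = L.begin(); I != E; ++I)
    ++Visited;
  return Visited;
}

// Intended pattern: E is the end iterator, so every element is visited.
int countVisitedFixed(const ToyLoop &L) {
  int Visited = 0;
  for (auto I = L.begin(), E = L.end(); I != E; ++I)
    ++Visited;
  return Visited;
}
```

A loop that silently does nothing is easy to miss in testing, because nothing crashes; the work is simply skipped.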
2014 Mar 18
2
[LLVMdev] E = L->begin() in LoopVectorize
Looking at it now, curious why no tests failed.
On Tue, Mar 18, 2014 at 2:48 PM, Jim Grosbach <grosbach at apple.com> wrote:
> Almost certainly, yes. Nice catch!
>
>
> On Mar 18, 2014, at 2:38 PM, Liang Wang <netcasper at gmail.com> wrote:
>
> > Hi,
> >
> > I'm studying loop vectorizer. I don't understand the code yet. But
> > it
2017 Jan 22
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
Thank you for information.
I’ll build clang without the hack and re-run the benchmark tomorrow.
-Evgeny
From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Sunday, January 22, 2017 8:00 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
Subject: Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
> Do you mean to
2017 Jan 23
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
Confirm there is no change in IR if the hack is disabled in the sources.
David wrote that these instructions are created by SCEV.
Are other targets affected by the changes, e.g. X86?
Kind regards,
Evgeny Astigeevich
Senior Compiler Engineer
Compilation Tools
ARM
From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Sunday, January 22, 2017 10:45 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
2018 Feb 06
2
[RFC] Make LoopVectorize Aware of SLP Operations
Hello,
We would like to propose making LoopVectorize aware of SLP operations,
to improve the generated code for loops operating on struct fields or
doing complex math.
At the moment, LoopVectorize uses interleaving to vectorize loops that
operate on values loaded/stored from consecutive addresses: vector
loads/stores are generated to combine consecutive loads/stores and then
shufflevector
2016 Aug 01
2
LLVM Loop vectorizer - 2 vector.body blocks appear
Hello.
Mikhail, with the more recent version of the LoopVectorize.cpp code (retrieved at the
beginning of July 2016) I ran the following piece of C code:
void foo(long *A, long *B, long *C, long N) {
    for (long i = 0; i < N; ++i) {
        C[i] = A[i] + B[i];
    }
}
The vectorized LLVM program I obtain contains 2 vector.body blocks - one named
2017 Jan 24
3
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
Hi Sanjay,
Thank you for your analysis. It’s interesting that the x86 machine is not affected. Maybe the x86 backend is smarter than the AArch64 backend, or there might be micro-architectural differences.
I don’t mind keeping the changes on trunk.
What I’d like to see is who will/should be involved in solving the issue. What kind of help/support is needed? Should we (ARM Compilation Tools) start
2017 Jan 24
3
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
On Tue, Jan 24, 2017 at 1:20 PM, Sanjay Patel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
> I started looking at the log files that you attached, and I'm confused.
> The code that is supposedly causing the perf regression is created by the
> loop vectorizer, right? Except the bad code is not in the "vector.body", so
> is there something peculiar about
2017 Jan 22
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
Hi Sanjay,
The benchmark source file: http://www.llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Shootout/sieve.c?view=markup
Clang options used to produce the initial IR: clang -DNDEBUG -O3 -DNDEBUG -mcpu=cortex-a53 -fomit-frame-pointer -O3 -DNDEBUG -w -Werror=date-time -c sieve.c -S -emit-llvm -mllvm -disable-llvm-optzns --target=aarch64-arm-linux
Opt options: opt -O3
2017 Jan 24
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
> On Jan 23, 2017, at 3:48 PM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> All targets are likely affected in some way by the icmp+shl fold introduced with r292492. It's a basic pattern that occurs in lots of code. Did you see any perf wins on your targets with this commit?
>
> Sadly, it is also likely that many (all?) targets are negatively
2016 Aug 21
2
LoopVectorize module - some possible enhancements
Hello, Michael,
I'd like to ask if we can enhance the LoopVectorize LLVM module (I am currently using
a version from Jul 2016).
More exactly:
- do you plan to support, in the near future, LLVM IR gather and scatter intrinsics
(as described at http://llvm.org/docs/LangRef.html#llvm-masked-gather-intrinsics and scatter)?
I see you have defined some methods that should
2013 Apr 11
2
[LLVMdev] Decouple LoopVectorizer from O3
Hi Nadav,
I tried your suggestion by changing the condition to:
  if (LoopVectorize && OptLevel >= 0)
    MPM.add(createLoopVectorizePass());
and compiled. Then I used the following command:
opt -mtriple=x86_64-linux-gnu -vectorize-loops
-vectorizer-min-trip-count=6 -debug-only=loop-vectorize -O1 -S -o
example1_vect.s example1.s
where example1.s is IR generated by
clang -S
2018 Jun 11
2
LoopVectorize fails to vectorize code with condition on reduction
Hello.
I'm not able to vectorize this simple C loop doing basically what could be called
predicated sum-reduction:
#define NMAX 1000
int colOccupied[NMAX];
int Func(int N) {
    int numSol = 0;
    for (int c = 0; c < N; c++) {
        if (colOccupied[c] == 0)
            numSol++;
    }
    return numSol;
}
The compiler
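For comparison, the guarded increment above can also be written without a branch in the loop body. The sketch below is an illustration only: the function names countFreeBranchy and countFreeSelect are made up for this note, and whether either form is actually vectorized depends on the compiler version, target, and flags, so this is not presented as a confirmed fix for the problem described in the thread.

```cpp
#include <cassert>

constexpr int NMAX = 1000;
int colOccupied[NMAX];  // zero-initialized global, as in the original snippet

// Original shape: the increment is guarded by a condition.
int countFreeBranchy(int N) {
  assert(N >= 0 && N <= NMAX);
  int numSol = 0;
  for (int c = 0; c < N; c++) {
    if (colOccupied[c] == 0)
      numSol++;
  }
  return numSol;
}

// Branch-free shape: add 1 or 0 on every iteration, so the loop body is a
// plain sum-reduction over the comparison result.
int countFreeSelect(int N) {
  assert(N >= 0 && N <= NMAX);
  int numSol = 0;
  for (int c = 0; c < N; c++)
    numSol += (colOccupied[c] == 0) ? 1 : 0;
  return numSol;
}
```

Both functions compute the same count; the second only changes how the condition is expressed.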
2018 Feb 06
2
6 separate instances of static getPointerOperand(). Time to consolidate?
LLVM friends,
I'm currently trying to make the LoopVectorizationLegality class in Transforms/Vectorize/LoopVectorize.cpp
more modular and eventually move it to Analysis directory tree. It uses several file scope helper functions
that do not really belong to LoopVectorize. Let me start from getPointerOperand(). Within LLVM, there are
five other similar functions defined.
I think it's time to
2013 Apr 04
1
[LLVMdev] Packed instructions generated by LoopVectorize?
Thanks, that did it!
Are there any plans to enable the loop vectorizer by default?
From: Nadav Rotem [mailto:nrotem at apple.com]
Sent: Wednesday, April 03, 2013 13:33 PM
To: Nowicki, Tyler
Cc: LLVM Developers Mailing List
Subject: Re: Packed instructions generated by LoopVectorize?
Hi Tyler,
Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating
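As background on why reordering matters for floating-point reductions, here is a small sketch. Both function names are invented for this note, and the two-accumulator loop is only a hand-written imitation of what a width-2 vectorized reduction might compute; it is not output from the loop vectorizer. It assumes ordinary IEEE single-precision arithmetic without -ffast-math.

```cpp
#include <cstddef>
#include <vector>

// Sums strictly left to right, the order a scalar loop uses.
float sumSequential(const std::vector<float> &V) {
  float S = 0.0f;
  for (float X : V)
    S += X;
  return S;
}

// Sums with two interleaved partial accumulators that are combined at the
// end, imitating the association order of a width-2 vector reduction.
float sumTwoLanes(const std::vector<float> &V) {
  float Lane0 = 0.0f, Lane1 = 0.0f;
  std::size_t I = 0;
  for (; I + 1 < V.size(); I += 2) {
    Lane0 += V[I];
    Lane1 += V[I + 1];
  }
  if (I < V.size())  // leftover element when the size is odd
    Lane0 += V[I];
  return Lane0 + Lane1;
}
```

For an input such as {1e8f, 1.0f, -1e8f, 1.0f}, the two functions return different values, because adding 1.0f to 1e8f is lost to rounding in the sequential order but not in the two-lane order. Floating-point addition is not associative, so a compiler needs permission before it changes the order of such a sum.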
2013 Apr 11
2
[LLVMdev] Decouple LoopVectorizer from O3
Done.
Best,
Anadi.
On Thu, Apr 11, 2013 at 7:01 AM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Anadi,
>
> Yes, this is a bug in the loop vectorizer. The loop vectorizer expects only
> one loop counter (integer with step=1). There is no reason why we should
> not handle the case below, and it should be easy to fix. Interestingly
> enough if you reverse the order of
2018 Feb 06
1
6 separate instances of static getPointerOperand(). Time to consolidate?
What LoopVectorize.cpp has are the following. Each function may have to have a separate consolidation discussion.
I'm bringing up getPointerOperand() since I actually found multiple instances defined/used.
DependenceAnalysis.cpp has isLoadOrStore(). LoopAccessAnalysis.cpp has getAddressSpaceOperand().
I'm sure there are others that might be worth discussing within this thread or a follow
2013 Apr 11
0
[LLVMdev] Decouple LoopVectorizer from O3
Hi Anadi,
Yes, this is a bug in the loop vectorizer. The loop vectorizer expects only one loop counter (integer with step=1). There is no reason why we should not handle the case below, and it should be easy to fix. Interestingly enough if you reverse the order of iterations and count from SIZE to zero, the loop vectorizer would vectorize it. If you open a bugzilla report and assign it to me