similar to: LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

Displaying 20 results from an estimated 4000 matches similar to: "LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target"

2020 Jul 16
2
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
Tried a bunch of them there (x86-64, haswell, znver2) and they all defaulted to 4-wide - haswell additionally caused some extra loop unrolling but still with 8-wide pows. Cheers, -Neil. On Thu, Jul 16, 2020 at 2:39 PM Roman Lebedev <lebedev.ri at gmail.com> wrote: > Did you specify the target CPU the code should be optimized for? > For clang that is -march=native/znver2/... /
2020 Jul 16
4
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
So for us we use SLEEF to actually implement the libcalls (LLVM intrinsics) that LLVM by default would generate - and since SLEEF has highly optimal 8-wide pow, optimized for AVX and AVX2, we really want to use that. So we would not see 4/8 libcalls and instead see 1 call to something that lights up the ymm registers. I guess the problem then is that the default expectation is that pow would be
2020 Jan 17
3
Help with SROA throwing away no-alias information
I'm having an issue where SROA will throw away no-alias information on some loads after inlining, because the loads are derived from a store to an alloca which can be removed after inlining. The pointers that were originally stored into the alloca do *not *have any aliasing information - the only context that allowed me to assert aliasing was that the inlined-function guaranteed it to be so.
2020 Jun 24
4
[RFC] `opt-out` attribute list for intrinsics
Hi all, A while back we started annotating intrinsics with new attributes ( https://reviews.llvm.org/D65377) After some discussion it was decided it would be good to have an `opt-out` attribute list for intrinsics. Some attributes that can be added to the list could be: nosync, nofree, nounwind, willreturn For now, there are 2 approaches: 1. Filtering opt-out attributes in tablegen source (
2015 Aug 20
2
loop unrolling introduces conditional branch
Hi, I want to use loop unrolling pass, however, I find that loop unrolling will introduces conditional branch at end of every "unrolled" part. For example, consider the following code *void foo( int n, int array_x[])* *{* * for (int i=0; i < n; i++)* * array_x[i] = i; * *}* Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify -loop-rotate -lcssa
2014 Sep 02
2
[LLVMdev] Preserving NSW/NUW bits
David/All, Just a quick question about NSW/NUW bits, if you've got a second. I noticed you've been doing a little work on this as of late. I have a bit of code that looks like the following: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %2 = add i64 %indvars.iv.next, -1 %tmp = trunc i64 %2 to i32 %cmp = icmp slt i32 %tmp, %0 br i1 %cmp, label %for.body, label
2015 Aug 22
3
loop unrolling introduces conditional branch
Hi, Mehdi, For example, I have this very simple source code: void foo( int n, int array_x[]) { for (int i=0; i < n; i++) array_x[i] = i; } After I use "clang -emit-llvm -o bc_from_clang.bc -c try.cc", I get bc_from_clang.bc. With my code (using LLVM IRbuilder API), I get bc_from_api.bc. Attachment please find thse two files. I also past the IR here.
2014 Sep 24
3
[LLVMdev] Loops Prevent Function Pointer Inlining?
I've CC'ed Chad Rosier as I think this behaviour is a side-effect of his revert of IndVarSimplify.cpp (git c6b1a7e577a0b9e9cff9f9b7ac35a2cde7c448d8, SVN 217962). The change basically makes the IndVar pass change: ; <label>:4 ; preds = %6, %0 %i.0 = phi i32 [ 0, %0 ], [ %11, %6 ] %5 = icmp eq i32 %i.0, 0 br i1 %5, label %6, label %17 To:
2017 Dec 19
4
MemorySSA question
Hi, I am new to MemorySSA and wanted to understand its capabilities. Hence I wrote the following program (test.c): int N; void test(int *restrict a, int *restrict b, int *restrict c, int *restrict d, int *restrict e) { int i; for (i = 0; i < N; i = i + 5) { a[i] = b[i] + c[i]; } for (i = 0; i < N - 5; i = i + 5) { e[i] = a[i] * d[i]; } } I compiled this program using
2015 Aug 20
2
loop unrolling introduces conditional branch
Hi Xiangyang, The algorithm for loop unrolling was changed post-3.5 to do more what you'd expect. If you use 3.6 or 3.7 you'll likely get better results. Cheers, James On Thu, 20 Aug 2015 at 18:09 Philip Reames via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On 08/20/2015 07:38 AM, Xiangyang Guo via llvm-dev wrote: > > Hi, > > I want to use loop unrolling
2015 Aug 22
2
loop unrolling introduces conditional branch
Thanks for your point that out. I just add DataLayout in my code such as "mod->setDataLayout("e-m:e-i64:64-f80:128-n8:16:32:64-S128");", still no luck. I'm really confused about this. Do I need to add more passes before -loop-unroll? On Sat, Aug 22, 2015 at 11:36 AM, Mehdi Amini <mehdi.amini at apple.com> wrote: > > On Aug 22, 2015, at 7:27 AM, Xiangyang
2018 Jul 11
8
[RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete
Hi, everyone, I'd like to propose adding a nofree function attribute to indicate that a function does not, directly or indirectly, call a memory-deallocation function (e.g., free, C++'s operator delete). Clang/LLVM can currently misoptimize functions that:  1. Have a reference argument.  2. Free the memory backing the object to which the reference is bound during the function's
2015 Aug 22
2
loop unrolling introduces conditional branch
Hi, I just tried llvm-3.8 (LLVM SVN Repository). With this version, -fno-rtti can help me to compile my code and -irce can help me to do a better job for loop unrolling. However, I still have one question. If I use Clang to compile a piece of c++ code to .bc and then use 'opt -loop-rotate -loop-unroll -irce', I can get what I want. I mean, there is no conditional branch at the end of each
2012 Mar 08
2
[LLVMdev] -indvars issues?
Hi, Is the -indvars pass functional? I've done some small test to check it, but this fails to canonicalize: > int *x; > int *y; > int i; > ... > for (i = 1; i < 100; i+=2) { > x[i] = y[i] + 3; > } The IR produced after -indvars: > br label %for.cond > > for.cond: ; preds = %for.inc, %entry > %indvars.iv =
2015 Aug 21
2
loop unrolling introduces conditional branch
There's been some recent noise on the mailing list about requiring -fno-rtti; http://lists.llvm.org/pipermail/llvm-dev/2015-August/089010.html Could that be it? On Sat, Aug 22, 2015 at 12:21 AM, Xiangyang Guo via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, James and Philip, Thanks for your help. > > Based on your advice, I downloaded llvm-3.7. However, with this new
2020 Jul 19
2
Instrument intrinsic invalid
Hi, I try to use llvm-dis to disassemble the result after opt, my pass will add a intrinsic after the load instruction, like following: bool fpscan::ldAddMetadata (Instruction *Inst, StringRef c) { std::vector<Metadata *> dataTuples; // Add metadata in list dataTuples.push_back(MDString::get(Inst->getContext(), c)); MDNode* N = MDNode::get(Inst->getContext(),
2017 Dec 19
2
MemorySSA question
On Tue, Dec 19, 2017 at 9:10 AM, Siddharth Bhat via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I could be entirely wrong, but from my understanding of memorySSA, each > def defines an "abstract heap state" which has the coarsest possible > definition - any write will be modelled as a "new heap state". > This is true for def-def relationships, but
2013 Jun 26
0
[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its
Sent from my iPhone... On Jun 25, 2013, at 8:14 AM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> >> >> >> On Jun 24, 2013, at 4:24 PM, Hal Finkel < hfinkel at anl.gov > wrote: >> >> >> >> >> Indvars should ideally preserve NSW flags whenever possible. However, >> we don't want to
2016 Aug 25
4
Canonicalize induction variables
But even for a very simple loop: int test1 (int *x, int *y, int *z, int k) { int sum = 0; for (int i = 10; i < k; i++) { z[i] = x[i] / y[i]; } return sum; } The initial value of induction variable is not zero after compiling with -O3 -mcpu=power8 x.cpp -S -c -emit-llvm -fno-unroll-loops (see bottom of the email for IR) Also I can write somewhat more complicated loop where step
2013 Jun 25
2
[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its
----- Original Message ----- > > > > On Jun 24, 2013, at 4:24 PM, Hal Finkel < hfinkel at anl.gov > wrote: > > > > > Indvars should ideally preserve NSW flags whenever possible. However, > we don't want to rely on SCEV to preserve them. SCEV expressions are > implicitly reassociated and uniqued in a flow-insensitive universe > independent of the