similar to: [LLVMdev] alloca scalarization with dynamic indexing into vectors

Displaying 20 results from an estimated 500 matches similar to: "[LLVMdev] alloca scalarization with dynamic indexing into vectors"

2013 Dec 31
4
[LLVMdev] [Patch][RFC] Change R600 data layout
Hi, I've prepared patches for both LLVM and Clang to change the datalayout for R600. This may seem like a bold move, but I think it is warranted. R600/SI is a strange architecture in that it uses 64bit pointers but does not support 64 bit arithmetic except for load/store operations that roughly map onto getelementptr. The current datalayout for r600 includes n32:64, which is odd
2013 Aug 11
0
[LLVMdev] Address space extension
Hello Micah, I first apologize for the mail length, but I think that using an example would be better to clarify the case and the objections. > [Micah Villmow] In the case of OpenCL, you can't correctly use the standard C calling convention and still be OpenCL compliant, the C calling convention is too permissive. The second you use OpenCL, you are using an OpenCL specific calling
2018 Jul 29
2
Vectorizing remainder loop
Hello, I m working on a hardware with very large vector width till v2048. Now when I vectorize using llvm default vectorizer maximum 2047 iterations are scalar remainder loop. These are not vectorized by llvm which increases the cost. However these should be vectorized using next available vector width I.e v1024, v512, v256, v128, v64, v32, v16, v8, v4..... The issue of scalar remainder loop has
2014 Feb 19
2
[LLVMdev] better code for IV
Hi Andrew, The issue below refers to LSR, so I'll appreciate your feedback. It also refers to instruction combining and might impact backends other than X86, so if you know of others that might be interested you are more than welcome to add them. Thanks, Anat _____________________________________________ From: Shemer, Anat Sent: Tuesday, February 18, 2014 15:07 To: 'llvmdev at
2013 Feb 05
3
[LLVMdev] Vectorizing global struct pointers
Hi all, One of the reasons the Livermore Loops couldn't be vectorized is that it was using global structures to hold the arrays. Today, I'm investigating why is that so and how to fix it. My investigation brought me to LoopVectorizationLegality::canVectorizeMemory(): if (WriteObjects.count(*it)) { DEBUG(dbgs() << "LV: Found a possible read/write reorder:"
2014 Dec 10
2
[LLVMdev] Metadata/Value split has landed
> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: > > On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith wrote: >> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at least, the >> C++ side of it. This was a rocky day, but I suppose that's what I get >> for failing to stage the change in smaller pieces. >>
2017 Dec 19
4
MemorySSA question
Hi, I am new to MemorySSA and wanted to understand its capabilities. Hence I wrote the following program (test.c): int N; void test(int *restrict a, int *restrict b, int *restrict c, int *restrict d, int *restrict e) { int i; for (i = 0; i < N; i = i + 5) { a[i] = b[i] + c[i]; } for (i = 0; i < N - 5; i = i + 5) { e[i] = a[i] * d[i]; } } I compiled this program using
2013 Aug 10
2
[LLVMdev] Address space extension
> -----Original Message----- > From: Michele Scandale [mailto:michele.scandale at gmail.com] > Sent: Saturday, August 10, 2013 6:29 AM > To: Micah Villmow > Cc: LLVM Developers Mailing List > Subject: Re: [LLVMdev] Address space extension > > On 08/10/2013 02:47 PM, Micah Villmow wrote: > > Michele, > > The information you are trying to gather is fundamentally
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Nadav, Thanks for the quick reply !! Ok, so as of now we are lacking capability to handle flat large reductions. I did go through function vectorizeChainsInBlock() (line number 2862). In this function, we try to vectorize if we have phi nodes in the IR (several if's check for phi nodes) i.e we try to construct tree that starts at chains. Any pointers on how to join multiple trees? I
2013 Feb 05
0
[LLVMdev] Vectorizing global struct pointers
If I understand you correctly, conceptually you want two different objects to be returned for Foo.bl and Foo.al? Here is my take on this (take this with a grain of salt, Dan is the expert on this): http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds LLVM's semantic allows for arrays to be accessed out of bounds - this allows you to walk from the first
2014 Sep 19
3
[LLVMdev] [Vectorization] Mis match in code generated
Hi Arnold, Thanks for your reply. I tried test case as suggested by you. *void foo(int *a, int *sum) {*sum = a[0]+a[1]+a[2]+a[3]+a[4]+a[5]+a[6]+a[7]+a[8]+a[9]+a[10]+a[11]+a[12]+a[13]+a[14]+a[15];}* so that it has a 'store' in its IR. *IR before vectorization :*target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128" target triple =
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi, I am trying to understand LLVM vectorization implementation and was looking into both loop and SLP vectorization. test case 1: *int foo(int *a) {int sum = 0,i;for(i=0; i<16; i++) sum += a[i];return sum;}* This code is vectorized by loop vectorizer where we calculate scalar loop cost as 4 and vector loop cost as 2. Since vector loop cost is less and above reduction is legal to
2013 Oct 24
0
[LLVMdev] Vectorizing alloca instructions
Hi Tom, Thanks for working on this. The SLP-vectorizer thinks that %X %Y %Z and %W alias, so it tries to perform 4 scalar store operations (which is a bad idea). We need to figure out why AA thinks that X and Y may alias. Maybe there is a problem with the code that uses AA. Thanks, Nadav On Oct 24, 2013, at 2:04 PM, Tom Stellard <tom at stellard.net> wrote: > Hi, > >
2013 Oct 28
2
[LLVMdev] loop vectorizer says Bad stride
Verifying function running passes ... LV: Checking a loop in "bar" LV: Found a loop: L0 LV: Found an induction variable. LV: We need to do 0 pointer comparisons. LV: Checking memory dependencies LV: Bad stride - Not an AddRecExpr pointer %13 = getelementptr float* %arg2, i32 %1 SCEV: ((4 * (sext i32 {(256 + %arg0),+,1}<nw><%L0> to i64)) + %arg2) LV: Src Scev: {((4 * (sext
2014 Nov 10
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Suyog, Thanks for looking at this. This has recently got itself onto my TODO list too. > I am not sure how much all this will improve the code quality for horizontal reduction > (donno how frequently such pattern of horizontal reduction from same array occurs in real world/SPECS). Actually the main loop of 470.lbm can be SLP vectorized like this. We have three parts to it: A fully
2017 Dec 19
2
MemorySSA question
On Tue, Dec 19, 2017 at 9:10 AM, Siddharth Bhat via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I could be entirely wrong, but from my understanding of memorySSA, each > def defines an "abstract heap state" which has the coarsest possible > definition - any write will be modelled as a "new heap state". > This is true for def-def relationships, but
2013 Feb 04
0
[LLVMdev] Vectorizer using Instruction, not opcodes
The loop vectorized does not estimate the cost of vectorization by looking at the IR you list below. It does not vectorize and then run the CostAnalysis pass. It estimates the cost itself before it even performs the vectorization. The way it works is that it looks at all the scalar instructions and asks: What is the cost if I execute the scalar instruction as a vector instruction. Therefore, it
2013 Oct 28
2
[LLVMdev] Loop vectorizer dosen't find loop bounds
I am trying to vectorize the function void bar(float *c, float *a, float *b) { const int width = 256; for (int i = 0 ; i < 256 ; ++i ) { c[ i ] = a[ i ] + b[ i ]; c[ width + i ] = a[ width + i ] + b[ width + i ]; } } using the following commands clang -emit-llvm -S loop.c opt loop.ll -O3 -debug-only=loop-vectorize -S -o - LV: Checking a loop in
2016 Aug 25
4
Canonicalize induction variables
But even for a very simple loop: int test1 (int *x, int *y, int *z, int k) { int sum = 0; for (int i = 10; i < k; i++) { z[i] = x[i] / y[i]; } return sum; } The initial value of induction variable is not zero after compiling with -O3 -mcpu=power8 x.cpp -S -c -emit-llvm -fno-unroll-loops (see bottom of the email for IR) Also I can write somewhat more complicated loop where step
2013 Oct 28
0
[LLVMdev] loop vectorizer says Bad stride
Frank, It looks like the loop vectorizer is unable to tell that the two stores in your code never overlap. This is probably because of the sign-extend in your code. Can you extend the indices to 64bit ? Thanks, Nadav On Oct 28, 2013, at 1:38 PM, Frank Winter <fwinter at jlab.org> wrote: > Verifying function > running passes ... > LV: Checking a loop in "bar" > LV: