thr3ads.net - search: "canvectorizememori"

Displaying 20 results from an estimated 22 matches for "canvectorizememori".

[LLVMdev] Vectorizing global struct pointers

2013 Feb 05

Hi all, One of the reasons the Livermore Loops couldn't be vectorized is that they were using global structures to hold the arrays. Today, I'm investigating why that is so and how to fix it. My investigation brought me to LoopVectorizationLegality::canVectorizeMemory(): if (WriteObjects.count(*it)) { DEBUG(dbgs() << "LV: Found a possible read/write reorder:"
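
A minimal sketch of the kind of loop this post describes, assuming a hypothetical layout: the names `Space`, `space`, `gx`, `gy`, and the kernel itself are illustrative and are not taken from the Livermore Loops source. It only shows the two shapes being contrasted (arrays as fields of one global struct versus arrays as separate globals); it makes no claim about which one a given LLVM version vectorizes.

```c
#include <assert.h>
#include <stddef.h>

#define N 64

/* Hypothetical layout: both arrays are fields of one global struct, so the
   store to x and the load from y go through the same underlying object. */
struct Space {
    float x[N];
    float y[N];
};
struct Space space;

/* The same data as two separate global objects, for comparison. */
float gx[N];
float gy[N];

/* Struct form: x[k] = q + y[k], with both arrays inside `space`. */
void kernel_struct(float q) {
    for (size_t k = 0; k < N; ++k)
        space.x[k] = q + space.y[k];
}

/* Array form: identical arithmetic on distinct global objects. */
void kernel_arrays(float q) {
    for (size_t k = 0; k < N; ++k)
        gx[k] = q + gy[k];
}

/* Fills both inputs identically, runs both kernels, and asserts that
   every output element matches. */
void check_kernels(float q) {
    for (size_t k = 0; k < N; ++k)
        space.y[k] = gy[k] = (float)k;
    kernel_struct(q);
    kernel_arrays(q);
    for (size_t k = 0; k < N; ++k)
        assert(space.x[k] == gx[k]);
}
```

Both kernels compute the same values; the only difference is whether the written and read arrays share an underlying object, which is the property the quoted `WriteObjects` check looks at.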

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org> wrote: > LV: We don't allow storing to uniform addresses > This is triggering because it didn't recognize it as a reduction variable during canVectorizeInstrs() but did recognize that sum[q] is loop invariant in canVectorizeMemory(). I'm guessing the nested loop was unrolled because of the low trip-count, and

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

Hi Sergei, "addRuntimeCheck" inserts code that checks that two or more arrays are disjoint. I looked at the code and it looks fine. We generate PHIs in the order that they appear in a vector. The values are inserted in 'canVectorizeMemory', which also looks fine. Please let me know if you think I missed something. Thanks, Nadav On Jan 29, 2013, at 8:48 AM, Sergei Larin
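
The idea of a runtime disjointness check can be sketched in plain C. This is an illustration of the general technique only, not the code that `addRuntimeCheck` emits; the function names `ranges_disjoint` and `add_one` are hypothetical, and the hand-unrolled fast path merely stands in for a vectorized loop body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the n-element ranges starting at a and b do not overlap. */
int ranges_disjoint(const float *a, const float *b, size_t n) {
    uintptr_t a0 = (uintptr_t)a, a1 = a0 + n * sizeof(float);
    uintptr_t b0 = (uintptr_t)b, b1 = b0 + n * sizeof(float);
    return a1 <= b0 || b1 <= a0;
}

/* dst[i] = src[i] + 1 for i in [0, n), versioned on a runtime overlap check. */
void add_one(float *dst, const float *src, size_t n) {
    assert(dst != NULL && src != NULL);
    if (ranges_disjoint(dst, src, n)) {
        /* Disjoint: iterations are independent, so four at a time is safe. */
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            float t0 = src[i] + 1.0f, t1 = src[i + 1] + 1.0f;
            float t2 = src[i + 2] + 1.0f, t3 = src[i + 3] + 1.0f;
            dst[i] = t0; dst[i + 1] = t1; dst[i + 2] = t2; dst[i + 3] = t3;
        }
        for (; i < n; ++i)
            dst[i] = src[i] + 1.0f;
    } else {
        /* Possible overlap: keep the original element-by-element order. */
        for (size_t i = 0; i < n; ++i)
            dst[i] = src[i] + 1.0f;
    }
}
```

When the two ranges overlap, the fallback loop preserves the sequential result, for example a running chain when `dst` is `src + 1`.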

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

I am trying my luck on this global reduction kernel: float foo( int start , int end , float * A ) { float sum[4] = {0.,0.,0.,0.}; for (int i = start ; i < end ; ++i ) { for (int q = 0 ; q < 4 ; ++q ) sum[q] += A[i*4+q]; } return sum[0]+sum[1]+sum[2]+sum[3]; } LV: Checking a loop in "foo" LV: Found a loop: for.cond1 LV: Found an induction variable. LV: We
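
For readability, here is the kernel from this post laid out as compilable C, next to a hand-written variant. The variant `foo_scalars` is an assumption added for illustration and does not come from the thread: it keeps the four partial sums in scalar locals instead of storing to `sum[q]`, whose address does not change across iterations of the outer loop. Whether either form is vectorized by a particular LLVM version is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* The kernel as posted: the inner loop stores to sum[q] on every
   iteration of the outer loop. */
float foo(int start, int end, float *A) {
    float sum[4] = {0., 0., 0., 0.};
    for (int i = start; i < end; ++i) {
        for (int q = 0; q < 4; ++q)
            sum[q] += A[i * 4 + q];
    }
    return sum[0] + sum[1] + sum[2] + sum[3];
}

/* Hypothetical rewrite: four scalar accumulators, same order of additions. */
float foo_scalars(int start, int end, const float *A) {
    assert(A != NULL);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    for (int i = start; i < end; ++i) {
        s0 += A[i * 4 + 0];
        s1 += A[i * 4 + 1];
        s2 += A[i * 4 + 2];
        s3 += A[i * 4 + 3];
    }
    return s0 + s1 + s2 + s3;
}
```

Both functions perform the additions in the same order, so they return the same value for the same input.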

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

2013 Jan 25

On 01/25/2013 09:56 AM, Nadav Rotem wrote: > Thanks for checking the Loop Vectorizer, I am interested in hearing your > feedback. The Loop Vectorizer does not fit here. OpenCL vectorization is > completely different because the language itself is data-parallel. You > don't need all of the legality checks that the loop vectorizer has. I'm aware of this and it was my point in

[LLVMdev] Vectorizing global struct pointers

2013 Feb 05

If I understand you correctly, conceptually you want two different objects to be returned for Foo.bl and Foo.al? Here is my take on this (take this with a grain of salt, Dan is the expert on this): http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds LLVM's semantics allow for arrays to be accessed out of bounds - this allows you to walk from the first

[LLVMdev] First attempt at recognizing pointer reduction

2013 Oct 21

Hi Nadav, Arnold, I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass. Basically what I do is to teach AddReductionVar() about pointers, saying they don't really have an exit instruction, and that (maybe) the final store is a good candidate (is it?). This makes it recognize the writes and reads, but then

[LLVMdev] RFC: Loop versioning for LICM

2015 Mar 20

> On Mar 19, 2015, at 9:46 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > Thanks Adam for your reply. > > From: Adam Nemet [mailto:anemet at apple.com] > Sent: Friday, March 20, 2015 3:23 AM > To: Nema, Ashutosh > Cc: Hal Finkel; Philip Reames; llvmdev at cs.uiuc.edu > Subject:

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

Nadav, Thanks for the quick response. By now I am convinced that the given loop ends up vectorized with enough difference to cause bad things later on, but I have not found the exact cause yet. To continue with my work I'll have to simply turn off vectorization for now, but I will come back and investigate. Again, there is some indeterminism in order of PHIs processing somewhere. I'll

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

I changed the input C to use a 64-bit type for the loop index (this eliminates 'sext' instructions in the IR). Here is the IR produced with clang -O0: define float @foo(i64 %start, i64 %end, float* %A) #0 { entry: %start.addr = alloca i64, align 8 %end.addr = alloca i64, align 8 %A.addr = alloca float*, align 8 %sum = alloca [4 x float], align 16 %i = alloca i64, align 8

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

Is there a test case that you can share ? On Jan 29, 2013, at 9:24 AM, Sergei Larin <slarin at codeaurora.org> wrote: > Nadav, > > Thanks for the quick response. By now I am convinced that the given loop > ends up vectorized with enough difference to cause bad things later on, but > I have not found the exact cause yet. To continue with my work I'll have to >

[LLVMdev] RFC: Loop versioning for LICM

2015 Mar 19

Hi Ashutosh, > On Mar 16, 2015, at 9:06 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > Hi Adam, > > From: Adam Nemet [mailto:anemet at apple.com] > Sent: Wednesday, March 11, 2015 10:48 AM > To: Nema, Ashutosh > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] RFC: Loop

[LLVMdev] RFC: Loop versioning for LICM

2015 Mar 24

> On Mar 20, 2015, at 8:02 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > > Yes, this is what I was proposing above and here ;): > Thanks Adam it’s for confirming :) NP :). > > > No, not hasLoopInvariantStore but hasAccessToLoopInvariantAddress. > It's only for invariant stores [not loads], Using ‘hasLoopInvariantStore’ (or a name with invariant store)

[LLVMdev] First attempt at recognizing pointer reduction

2013 Oct 21

Renato, This looks like the right direction. Did you run it on the LLVM test suite to check if it finds new loops to vectorize ? Thanks, Nadav On Oct 21, 2013, at 8:23 AM, Renato Golin <renato.golin at linaro.org> wrote: > Hi Nadav, Arnold, > > I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass. >

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

2013 Jan 25

----- Original Message ----- > From: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi> > To: "Nadav Rotem" <nrotem at apple.com> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Friday, January 25, 2013 5:35:16 AM > Subject: Re: [LLVMdev] LoopVectorizer in OpenCL C work group autovectorization > > On

[LLVMdev] Vectorizing global struct pointers

2013 Feb 05

----- Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > If I understand you correctly, conceptually you want two different objects to be returned for Foo.bl and Foo.al? > > Here is my take on this (take this with a grain of salt, Dan is the expert on this): > > http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds > >

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

2013 Jan 25

Hi Pekka, > How I see it, the data parallel input simply makes the vectorizer's job > easier (skip some of the legality checks) while reusing most of the > implementation (e.g. cost estimation, unrolling decisions, the > vector instruction formation itself, predication/if-conversion, > speculative execution+blend, etc.). > What you need is outer loop vectorization while

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

2013 Jan 25

Hi Pekka, > Hi, > > I started to play with the LoopVectorizer of LLVM trunk > on the work-item loops produced by pocl's OpenCL C > kernel compiler, in hopes of implementing multi-work-item > work group autovectorization in a modular manner. > Thanks for checking the Loop Vectorizer, I am interested in hearing your feedback. The Loop Vectorizer does not fit here.

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

Nadav, As I peel this onion, it looks like you might know something about InnerLoopVectorizer::addRuntimeCheck. What does it do, and can it be causing the below described issue? Could resuming somehow (indeterministically) switch the order of PHIs in the original code? Thanks a lot. Sergei. --- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

2013 Jan 24

Hi, I started to play with the LoopVectorizer of LLVM trunk on the work-item loops produced by pocl's OpenCL C kernel compiler, in hopes of implementing multi-work-item work group autovectorization in a modular manner. The vectorizer seems to refuse to vectorize the loop if it sees multiple writes to the same memory object within the same iteration. In case of parallel loops such as the
