similar to: [LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation

Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation"

[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation

2013 Apr 17

0

On 04/17/2013 04:55 AM, Anadi Mishra wrote: > Hello, > > I am trying to vectorize the following loop but the vectorizer says: > "Found a possible write-write reorder" and does not vectorize. > Why? To my knowledge, the dependence analysis in the loop vectorizer is not yet able to prove the absence of dependences here. > for (j=0; j < 8; j++) > { > jj

[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation

2013 Apr 17

1

On Wed, Apr 17, 2013 at 8:08 AM, Tobias Grosser <tobias at grosser.es> wrote: > On 04/17/2013 04:55 AM, Anadi Mishra wrote: >> >> Hello, >> >> I am trying to vectorize the following loop but the vectorizer says: >> "Found a possible write-write reorder" and does not vectorize. >> Why? > > > To my knowledge, the dependence analysis in

[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation

2013 Apr 17

0

Hi Anadi Mishra, On 04/17/2013 05:55 AM, Anadi Mishra wrote: > Another question is regarding the isannotatedparallel() check. Is > there a way to make clang (or any other frontend) to generate parallel > annotated IR? Paul Redmond was adding support for "#pragma ivdep" that would use the parallel metadata, but I haven't been able to follow its progress lately. FWIW,

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 11

2

Done. Best, Anadi. On Thu, Apr 11, 2013 at 7:01 AM, Nadav Rotem <nrotem at apple.com> wrote: > Hi Anadi, > > Yes, this is a bug in the loop vectorizer. The loop vectorizer expects only > one loop counter (integer with step=1). There is no reason why we should > not handle the case below, and it should be easy to fix. Interestingly > enough if you reverse the order of

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 11

2

Hi Nadav, I tried your suggestion by changing the condition to: 189 if (LoopVectorize && OptLevel >= 0) 190 MPM.add(createLoopVectorizePass()); and compiled. Then I used the following command: opt -mtriple=x86_64-linux-gnu -vectorize-loops -vectorizer-min-trip-count=6 -debug-only=loop-vectorize -O1 -S -o example1_vect.s example1.s where example1.s is IR generated by clang -S

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 11

4

Hello, I am trying out the LoopVectorizer (LV) pass and would like to decouple it from O3, which is currently required to run LV. I want to do this because I want to understand the behaviour of LV by trying simple loops, but O3 mostly optimises away the loop body. Any ideas would be appreciated. Best, Anadi.

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 15

0

Just an FYI: it's often handy to mention the PR number when a thread is concluded by filing a bug. That way other people reading (now, or more importantly, later) can follow the issue through to the bug and its resolution. On Apr 11, 2013 4:24 PM, "Anadi Mishra" <reachanadi at gmail.com> wrote: > Done. > > Best, > Anadi. > > > On Thu, Apr 11, 2013 at 7:01

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 11

0

Hi Anadi, Yes, this is a bug in the loop vectorizer. The loop vectorizer expects only one loop counter (integer with step=1). There is no reason why we should not handle the case below, and it should be easy to fix. Interestingly enough if you reverse the order of iterations and count from SIZE to zero, the loop vectorizer would vectorize it. If you open a bugzilla report and assign it to me

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 11

1

Thanks for the suggestion Jim. I already tried to do it by 'opt' but it also requires O3. BTW I think that if I invoke 'opt' with '-vectorize-loops' option, it will figure out the passes required for LV since every pass mentions what other passes are prerequisite. Am I correct? Best, Anadi. On Thu, Apr 11, 2013 at 2:48 AM, Jim Grosbach <grosbach at apple.com>

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 11

0

Hi Anadi, In the file PassManagerBuilder.cpp you can change the lines below to get rid of the O3 restriction. 189 if (LoopVectorize && OptLevel > 2) 190 MPM.add(createLoopVectorizePass()); Nadav On Apr 10, 2013, at 5:39 PM, Anadi Mishra <reachanadi at gmail.com> wrote: > Hello, > > I am trying out the LoopVectorizer(LV) pass and would like to decouple

[LLVMdev] Decouple LoopVectorizer from O3

2013 Apr 11

0

You can take unoptimized bitcode and run it through ‘opt’ to have complete flexibility in which passes get run. It may take some fiddling to find out the pass sequence and ordering that does what you want, as some passes rely on previous passes to canonicalize code into a form they can effectively work with. -Jim On Apr 10, 2013, at 5:39 PM, Anadi Mishra <reachanadi at gmail.com> wrote:

apply() and dropped dimensions

2005 Dec 05

1

Hi I am having difficulty with apply(). I want apply() to return a matrix, but sometimes a vector is returned. Toy example follows. Function jj() takes a couple of matrices m1 and m2 as arguments and returns a matrix with r rows and c columns where r=nrow(m2) and c=nrow(m1). jj <- function(m1,m2,f,...){ apply(m1, 1, function(y) { apply(m2, 1, function(x) { f(x, y, ...)

[LLVMdev] parallel loop metadata simplification

2013 Mar 01

3

----- Original Message ----- > From: "Hal Finkel" <hfinkel at anl.gov> > To: "Paul Redmond" <paul.redmond at intel.com> > Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu> > Sent: Friday, March 1, 2013 11:13:06 AM > Subject: Re: [LLVMdev] parallel loop metadata simplification > > ----- Original Message ----- > >

[LLVMdev] parallel loop metadata simplification

2013 Mar 01

2

On 2013-03-01, at 11:35 AM, Hal Finkel wrote: > ----- Original Message ----- >> From: "Paul Redmond" <paul.redmond at intel.com> >> To: "Hal Finkel" <hfinkel at anl.gov> >> Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu> >> Sent: Friday, March 1, 2013 10:06:51 AM >> Subject: Re: [LLVMdev] parallel loop

[LLVMdev] parallel loop metadata simplification

2013 Mar 03

2

On 03/03/2013 06:43 PM, Tobias Grosser wrote: > Very good example, indeed. Is there a formal definition of what > #pragma ivdeps means? I see two options here: In the previous discussion we could not find a proper definition for #pragma ivdep so we concluded we can treat it as a statement of "treat the loop as parallel, I do not expect any dependency checking by the compiler",

[LLVMdev] parallel loop metadata simplification

2013 Mar 03

2

On 03/03/2013 02:34 PM, Tobias Grosser wrote: > Meaning they are due to an array or pointer access. What about loop-scope arrays? void foo(long *A, long b) { long i; #pragma ivdep for (i = 0; i < 100; i++) { long t[100]; t[0] = i + 2; A[i] = A[i+b] + t[0]; } } Clang places the alloca for t to the entry block, creating a new race condition.
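The concern in the message above can be sketched in C as follows. This is only an illustration under stated assumptions, not the IR that Clang emits: foo_hoisted is a hypothetical hand-written stand-in for the effect of a single alloca in the entry block, forms_agree and the 16-element padding are invented for the demonstration, and "#pragma ivdep" is left out because not every compiler accepts it.

```c
#include <assert.h>

#define N 100

/* Loop-scope form from the message: t is declared inside the loop body,
   so at the source level every iteration owns a private t. */
void foo_loop_scope(long *A, long b) {
    for (long i = 0; i < N; i++) {
        long t[100];
        t[0] = i + 2;
        A[i] = A[i + b] + t[0];
    }
}

/* Hypothetical hand-hoisted form: one t shared by all iterations, which
   approximates the effect of placing the storage for t outside the loop. */
void foo_hoisted(long *A, long b) {
    long t[100];
    for (long i = 0; i < N; i++) {
        t[0] = i + 2;
        A[i] = A[i + b] + t[0];
    }
}

/* Runs both forms sequentially on identical inputs.
   Returns 1 if every element matches, 0 otherwise. */
int forms_agree(long b) {
    assert(b >= 0 && b <= 16);   /* keeps A[i + b] inside the padded arrays */
    long x[N + 16], y[N + 16];
    for (long k = 0; k < N + 16; k++) {
        x[k] = k;
        y[k] = k;
    }
    foo_loop_scope(x, b);
    foo_hoisted(y, b);
    for (long k = 0; k < N + 16; k++) {
        if (x[k] != y[k]) {
            return 0;
        }
    }
    return 1;
}
```

Executed sequentially, the two forms agree, so forms_agree(4) returns 1. The difference only matters once iterations are allowed to run concurrently: in the hoisted form every iteration writes the same t[0], which is the kind of conflict the message describes.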

do.call("[", ...) question

2004 Sep 08

3

Hi again everyone I have an arbitrarily dimensional array "a" and a list "jj" of length length(dim(a)). The elements of jj are vectors of indexes. How do I use do.call() to extract a[ jj[[1]], jj[[2]], jj[[3]], ...] ? Toy example follows: a <- matrix(1:30,5,6) jj <- list(5:1,6:1) I want the following a[ jj[[1]],jj[[2]] ] How do I do this? OBAttempts:

[LLVMdev] parallel loop metadata simplification

2013 Mar 03

0

On 03/03/2013 03:34 PM, Pekka Jääskeläinen wrote: > On 03/03/2013 02:34 PM, Tobias Grosser wrote: >> Meaning they are due to an array or pointer access. > > What about loop-scope arrays? > > void foo(long *A, long b) { > long i; > > #pragma ivdep > for (i = 0; i < 100; i++) { > long t[100]; > t[0] = i + 2; >

top and allocation issues

2011 Mar 03

3

In a context where exceptions are caught, I ran the fragment: cerr << "allocating" << endl; char* arr[100]; for (int jj = 0; jj < 10; ++jj) { cerr << "jj = " << jj << endl; arr[jj] = new char[2000000000]; sleep (30); } sleep (10); for (int jj = 0; jj < 10; ++jj) delete[] arr[jj]; cerr

seq(0.05,0.95,by=0.002) and logical error

2000 Dec 10

1

Regardless of which version -- 1.1.1 or 1.2.0 (2000-11-27) -- with a fresh "directory" (i.e. no .RData), I am getting an extremely weird result. R : Copyright 2000, The R Development Core Team Version 1.2.0 Under development (unstable) (2000-11-27) > jj _ seq(0.05,0.95,by=0.002) > sum(jj==0.75) ## WRONG ANSWER [1] 0 > 0.05 + 350*.002 ## Double check that 0.75 is in jj [1]