thr3ads.net - similar to: "[LLVMdev] data dependency and fully loop unrolling"

[LLVMdev] data dependency and fully loop unrolling

2012 May 24

Cheng, are you looking specifically for an analysis that can 'undo' the effects of loop unrolling, or do you want dependency analysis that can run on the loop prior to unrolling? For dependency analysis on loops (prior to unrolling), Preston and Sanjoy have been working on this; see: http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-May/049769.html

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

On Mon, Jan 30, 2017 at 4:59 PM Mehdi Amini <mehdi.amini at apple.com> wrote:
> Another question is about PGO integration: is it already hooked there?
> Should we have a more aggressive threshold in a hot function? (Assuming
> we’re willing to spend some binary size there but not on the cold path).
>
> I would even wire the *unrolling* the other way: just

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

Recollected the data from trunk head with stddev data and more threshold data points attached:

Performance:

       stddev/mean  300     450     600     750
403    0.37%        0.11%   0.11%   0.09%   0.79%
433    0.14%        0.51%   0.25%   -0.63%  -0.29%
445    0.08%        0.48%   0.89%   0.12%   0.83%
447    0.16%        3.50%   2.69%   3.66%   3.59%
453    0.11%        1.49%   0.45%   -0.07%  0.78%
464    0.17%        0.75%   1.80%   1.86%   1.54%

Code size:

       300     450     600     750
403    0.56%   2.41%   2.74%   3.75%

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Currently, loop fully unroller shares the same default threshold as loop
> dynamic unroller and partial unroller. This seems conservative because
> unlike dynamic/partial

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

> On Jan 30, 2017, at 4:56 PM, Dehao Chen <dehao at google.com> wrote:
>
> On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> On Jan 30,

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks: Code
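
The idea of giving full unrolling its own, larger budget can be illustrated with a toy cost check. This is only a sketch under stated assumptions: the struct, the function names, and the size estimate (body size times trip count) are hypothetical simplifications and are not LLVM's actual heuristic. The numbers in the usage note are illustrative.

```cpp
#include <cstdint>

// Hypothetical, simplified cost model; NOT LLVM's real unrolling heuristic.
struct UnrollThresholds {
    std::uint64_t partial;  // budget shared by partial/dynamic unrolling
    std::uint64_t full;     // separate budget for full unrolling
};

// Assumed size estimate: the loop body replicated once per iteration.
inline std::uint64_t unrolled_size(std::uint64_t body_size,
                                   std::uint64_t trip_count) {
    return body_size * trip_count;
}

// True if the fully unrolled loop fits within the full-unroll budget.
inline bool fits_full_unroll(std::uint64_t body_size,
                             std::uint64_t trip_count,
                             const UnrollThresholds& t) {
    return unrolled_size(body_size, trip_count) <= t.full;
}
```

With a body of 20 units and a trip count of 10, the estimated size is 200: the loop is rejected when the full budget equals a shared budget of 150 and accepted when the full budget is doubled to 300.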

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 02

I had suggested having size metrics from somewhat larger applications such as Chrome, Webkit, or Firefox; clang itself; and maybe some of our internal binaries with rough size brackets?

On Wed, Feb 1, 2017 at 4:33 PM Dehao Chen <dehao at google.com> wrote:
> With the new data points, any comments on whether this can justify setting
> fully inline threshold to 300 (or any other

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368

[LLVMdev] How can I get the destination operand of an instruction?

2012 May 09

I am able to access the source operands of an instruction using either getOperand() or op_iterator. However, I can't find any method available for the destination operand. Someone suggests that the instruction itself can represent the destination operand: http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-January/037518.html The getOperand() returns an unsigned value like 0x9063498, while I can't
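
The suggestion that the instruction itself stands for its result can be sketched with a toy SSA model. The `Value` and `Instruction` types below are hypothetical stand-ins written for this illustration, not the LLVM classes, although they mirror the fact that in LLVM an instruction is itself a value and `getOperand()` returns a pointer to a value (which is most likely why a hex number such as 0x9063498 shows up when it is printed).

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for an SSA IR; hypothetical types, NOT the LLVM classes.
struct Value {
    std::string name;
    explicit Value(std::string n) : name(std::move(n)) {}
    virtual ~Value() = default;
};

// An instruction IS the value it defines, so there is no separate
// "destination operand": users simply point at the instruction.
struct Instruction : Value {
    std::string opcode;
    std::vector<Value*> operands;  // source operands only

    Instruction(std::string result, std::string op, std::vector<Value*> ops)
        : Value(std::move(result)),
          opcode(std::move(op)),
          operands(std::move(ops)) {}

    // Returns a pointer to the defining value, not a number or register.
    Value* getOperand(unsigned i) const { return operands.at(i); }
};
```

In this model, if `mul` uses the result of `add`, then `mul.getOperand(0)` is simply the address of `add`, and the "destination" of `add` is `add` itself.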

[LLVMdev] How can I get the destination operand of an instruction?

2012 May 09

Launcher <st.liucheng at gmail.com> writes:
> I am able to access the source operands of an instruction using either
> getOperand() or op_iterator, However, I can't find any method available for
> destination operand. Someone suggests that instruction itself can represent
> the destination operand.
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-January/037518.html

Loop Unrolling Fail in Simple Vectorized loop

2016 Oct 12

Hi all, Attached is a simple vectorized function with loops performing a simple shuffle. I want all loops (inner and outer) to be unrolled by 2, so I used -unroll-count=2. The inner loops (with k as the induction variable and constant trip counts) unroll fully, but the outer loop (with j) fails to unroll. The LLVM code is also attached, with the inner loops fully unrolled. To

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Currently, loop fully unroller shares the same default

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 07

Ping... with the updated code size impact data, any more comments? Any more data that would be interesting to collect?

Thanks,
Dehao

On Thu, Feb 2, 2017 at 2:07 PM, Dehao Chen <dehao at google.com> wrote:
> Here is the code size impact for clang, chrome and 24 google internal
> benchmarks (name omitted, 14 15 16 are encoding/decoding benchmarks similar
> to h264). There are 2

[LLVMdev] Strange loop unrolling problem

2009 Apr 22

I am having a strange problem with loop unrolling. Attached is a small example that demonstrates what happens. There is a for-loop with a known trip count, and some control flow inside the loop. If the condition of the control flow only depends on the loop index and loop invariant variables, the loop is not unrolled. However, if the condition involves potentially loop variant variables, the loop
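
For context, the kind of loop described here can be sketched as follows. The functions are hypothetical and written only for illustration; they are not the attached example from the thread. The point is that when the trip count is known and the branch depends only on the loop index, every test becomes a constant once the loop is fully unrolled.

```cpp
// Rolled form: known trip count (4) and a branch that depends only on i.
int rolled(int x) {
    int acc = 0;
    for (int i = 0; i < 4; ++i) {
        if (i % 2 == 0) {
            acc += x;   // taken for i = 0 and i = 2
        } else {
            acc -= 1;   // taken for i = 1 and i = 3
        }
    }
    return acc;
}

// Hand-unrolled equivalent: each test on i has been resolved, so the
// control flow inside the loop disappears entirely.
int unrolled(int x) {
    int acc = 0;
    acc += x;  // i = 0
    acc -= 1;  // i = 1
    acc += x;  // i = 2
    acc -= 1;  // i = 3
    return acc;
}
```

Both forms compute the same result for any input, which is what makes a loop of this shape a natural candidate for full unrolling.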

Loop Unrolling Fail in Simple Vectorized loop

2016 Oct 13

Thanks for the explanation. But I am a little confused with the following fact. Can't LLVM keep vectorizable_elements as a symbolic value and convert the loop to, say:

    for (unsigned i = 0; i < vectorizable_elements; i += 2) {
      // main loop
    }
    for (unsigned i = 0; i < vectorizable_elements % 2; i++) {
      // fix up
    }

Why does it have to reason about the range of vectorizable_elements? Even
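
One common way to arrange a main loop plus fix-up loop for an unroll factor of 2, with the element count left symbolic, is sketched below. The function name and the summing body are assumptions made for this illustration; the sketch also lets the fix-up loop continue from where the main loop stopped, which is one possible arrangement and not necessarily what the poster or LLVM would produce.

```cpp
#include <cstddef>
#include <vector>

// Sums all elements with the loop body unrolled by 2. The element count
// n is only known at run time (it stays "symbolic" in the code).
int sum_unrolled_by_2(const std::vector<int>& data) {
    const std::size_t n = data.size();
    int total = 0;
    std::size_t i = 0;

    // Main loop: handles two elements per iteration while a full pair
    // remains, so data[i + 1] is always in bounds.
    for (; i + 1 < n; i += 2) {
        total += data[i];
        total += data[i + 1];
    }

    // Fix-up loop: handles the at most n % 2 leftover elements,
    // continuing from the index where the main loop stopped.
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

The same code path works for even, odd, and empty inputs without knowing anything about the range of n ahead of time.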

[LLVMdev] Partial loop unrolling

2014 Jul 15

Hi,

PS: It is a generic question related to partial loop unrolling, and nothing specific to LLVM.

As far as partial loop unrolling is concerned, I could see the following three different possibilities. Assume that the unroll factor is 3.

Original loop:

    for (i = 0; i < 10; i++) {
      do_foo(i);
    }

1. First possibility

    i = 0;
    do_foo(i++);
    do_foo(i++);
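
As a point of reference, one way to unroll the original loop by a factor of 3 is sketched below. It is not claimed to match any of the three possibilities in the thread, which are cut off here. The `do_foo` stand-in, which records its argument in a log, is a hypothetical addition so that the call order can be compared.

```cpp
#include <vector>

// Hypothetical stand-in for do_foo: records the argument of each call.
void do_foo(std::vector<int>& log, int i) { log.push_back(i); }

// Original loop: 10 iterations, one call each.
void original_loop(std::vector<int>& log) {
    for (int i = 0; i < 10; i++) {
        do_foo(log, i);
    }
}

// Unrolled by 3: the main loop covers i = 0..8 in three iterations,
// and the remainder loop covers the single leftover iteration (i = 9),
// because 10 is not a multiple of 3.
void unrolled_by_3(std::vector<int>& log) {
    int i = 0;
    for (; i + 2 < 10; i += 3) {
        do_foo(log, i);
        do_foo(log, i + 1);
        do_foo(log, i + 2);
    }
    for (; i < 10; i++) {
        do_foo(log, i);
    }
}
```

Both functions call `do_foo` with 0 through 9 in the same order; only the number of loop-condition checks differs.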

Loop Strength Reduction Pass Does Not Work for Some Variables Related to Induction Variables

2019 Apr 15

Dear all, Hi! Recently, I have been trying to combine the passes SeparateConstOffsetFromGEP and LoopStrengthReduction to transform the multiplications in the lowered GEP IR into additions. However, it seems LoopStrengthReduction is unable to remove all the multiplications for the element offset calculation. My test code is shown below; thanks a lot in advance for your time and suggestions!

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 02

> On Feb 1, 2017, at 4:57 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> clang, chrome, and some internal large apps are good candidates for size metrics.

I'd also add the standard LLVM testsuite just because it's the suite everyone in the community can use.

Michael

> David
>
> On Wed, Feb 1, 2017 at 4:47 PM, Chandler Carruth via

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 08

On 02/07/2017 05:29 PM, Sanjay Patel via llvm-dev wrote:
> Sorry if I missed it, but what machine/CPU are you using to collect
> the perf numbers?
>
> I am concerned that what may be a win on a CPU that keeps a couple of
> hundred instructions in-flight and has many MB of caches will not hold
> for a small core.

In my experience, unrolling tends to help weaker cores even more

[LLVMdev] How to get more details from storeInst ?

2013 Jan 18

I have a loop fully unrolled and got the following store instruction:

    store i32 %add.3, i32* getelementptr inbounds ([20 x [20 x i32]]* @c, i32 0, i32 0, i32 0), align 4

I want to know exactly which element of the array is going to be stored to, which helps me transform the high-level language to hardware. Take the instruction above as an example: I know the data is stored into c[0][0]. It
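
The mapping between the getelementptr indices and the array element can be sketched independently of the LLVM API. For a `[20 x [20 x i32]]` global, the first index steps over the pointer itself and the next two select the row and the column, so `(0, 0, 0)` is `c[0][0]`. The helper names below are hypothetical, and the sketch assumes the usual row-major layout.

```cpp
#include <cassert>
#include <cstddef>

// Dimensions of the array type [20 x [20 x i32]] from the store above.
constexpr std::size_t kRows = 20;
constexpr std::size_t kCols = 20;

struct Element {
    std::size_t row;
    std::size_t col;
};

// Row-major element offset of c[row][col], counted in i32 elements.
std::size_t flat_index(std::size_t row, std::size_t col) {
    assert(row < kRows && col < kCols);
    return row * kCols + col;
}

// Inverse mapping: recover (row, col) from a flat element offset.
Element element_from_flat(std::size_t flat) {
    assert(flat < kRows * kCols);
    return Element{flat / kCols, flat % kCols};
}
```

For example, indices `(0, 1, 2)` would address `c[1][2]`, which sits 22 elements (88 bytes for 4-byte `i32`) past the start of the array.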
