Displaying 20 results from an estimated 20000 matches similar to: "[LLVMdev] data dependency and fully loop unrolling"
2012 May 24
0
[LLVMdev] data dependency and fully loop unrolling
Cheng,
Are you looking specifically for an analysis that can 'undo' the
effects of loop unrolling, or do you want dependency analysis that can
run on the loop prior to unrolling?
For dependency analysis on loops (prior to unrolling), Preston and
Sanjoy have been working on this; see:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-May/049769.html
2017 Jan 31
0
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 4:59 PM Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>
> Another question is about PGO integration: is it already hooked there?
> Should we have a more aggressive threshold in a hot function? (Assuming
> we’re willing to spend some binary size there but not on the cold path).
>
>
> I would even wire the *unrolling* the other way: just
2017 Jan 31
2
(RFC) Adjusting default loop fully unroll threshold
Re-collected the data from trunk head, with stddev data and more threshold
data points attached:
Performance:
       stddev/mean   300      450      600      750
403    0.37%         0.11%    0.11%    0.09%    0.79%
433    0.14%         0.51%    0.25%    -0.63%   -0.29%
445    0.08%         0.48%    0.89%    0.12%    0.83%
447    0.16%         3.50%    2.69%    3.66%    3.59%
453    0.11%         1.49%    0.45%    -0.07%   0.78%
464    0.17%         0.75%    1.80%    1.86%    1.54%
Code size:
       300      450      600      750
403    0.56%    2.41%    2.74%    3.75%
2017 Jan 30
2
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Currently, loop fully unroller shares the same default threshold as loop
> dynamic unroller and partial unroller. This seems conservative because
> unlike dynamic/partial
2017 Jan 31
3
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 4:56 PM, Dehao Chen <dehao at google.com> wrote:
>
>
>
> On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com <mailto:chandlerc at google.com>> wrote:
> On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>> On Jan 30,
2017 Jan 30
4
(RFC) Adjusting default loop fully unroll threshold
Currently, the loop full unroller shares the same default threshold as the
loop dynamic unroller and the partial unroller. This seems conservative
because, unlike dynamic/partial unrolling, full unrolling will not affect
LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to
double the threshold for the loop full unroller. This will change the codegen
of several SPECCPU benchmarks:
Code
2017 Feb 02
2
(RFC) Adjusting default loop fully unroll threshold
I had suggested having size metrics from somewhat larger applications such
as Chrome, Webkit, or Firefox; clang itself; and maybe some of our internal
binaries with rough size brackets?
On Wed, Feb 1, 2017 at 4:33 PM Dehao Chen <dehao at google.com> wrote:
> With the new data points, any comments on whether this can justify setting
> fully inline threshold to 300 (or any other
2017 Jan 30
0
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368
2012 May 09
4
[LLVMdev] How can I get the destination operand of an instruction?
I am able to access the source operands of an instruction using either
getOperand() or op_iterator. However, I can't find any method available for
the destination operand. Someone suggested that the instruction itself can
represent the destination operand.
http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-January/037518.html
The getOperand() returns an unsigned value like 0x9063498, while I can't
2012 May 09
0
[LLVMdev] How can I get the destination operand of an instruction?
Launcher <st.liucheng at gmail.com> writes:
> I am able to access the source operands of an instruction using either
> getOperand() or op_iterator. However, I can't find any method available for
> the destination operand. Someone suggested that the instruction itself can
> represent the destination operand.
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-January/037518.html
2016 Oct 12
2
Loop Unrolling Fail in Simple Vectorized loop
Hi all,
Attached herewith is a simple vectorized function with loops performing a
simple shuffle.
I want all loops (inner and outer) to be unrolled by 2, so I used
-unroll-count=2.
The inner loops (with k as the induction variable and constant trip
counts) unroll fully, but the outer loop (with j) fails to unroll.
The LLVM code with the inner loops fully unrolled is also attached.
To
2017 Jan 31
0
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com>
wrote:
> On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Currently, loop fully unroller shares the same default
2017 Feb 07
2
(RFC) Adjusting default loop fully unroll threshold
Ping... with the updated code size impact data, any more comments? Any more
data that would be interesting to collect?
Thanks,
Dehao
On Thu, Feb 2, 2017 at 2:07 PM, Dehao Chen <dehao at google.com> wrote:
> Here is the code size impact for clang, chrome and 24 google internal
> benchmarks (names omitted; 14, 15 and 16 are encoding/decoding benchmarks
> similar to h264). There are 2
2009 Apr 22
4
[LLVMdev] Strange loop unrolling problem
I am having a strange problem with loop unrolling. Attached is
a small example that demonstrates what happens.
There is a for-loop with a known trip count, and some control
flow inside the loop. If the condition of the control flow only
depends on the loop index and loop invariant variables, the loop
is not unrolled. However, if the condition involves potentially
loop variant variables, the loop
2016 Oct 13
2
Loop Unrolling Fail in Simple Vectorized loop
Thanks for the explanation, but I am a little confused about the following.
Can't LLVM keep vectorizable_elements as a symbolic value and convert
the loop to, say:
for (unsigned i = 0; i < vectorizable_elements; i += 2) {
    // main loop
}
for (unsigned i = 0; i < vectorizable_elements % 2; i++) {
    // fix up
}
Why does it have to reason about the range of vectorizable_elements? Even
2014 Jul 15
4
[LLVMdev] Partial loop unrolling
Hi,
PS: This is a generic question about partial loop unrolling, and nothing
specific to LLVM.
As far as partial loop unrolling is concerned, I can see the following three
different possibilities. Assume that the unroll factor is 3.
Original loop:
for (i = 0; i < 10; i++)
{
    do_foo(i);
}
1. First possibility
i = 0;
do_foo(i++);
do_foo(i++);
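For reference, one common shape for partial unrolling by a factor of 3 is a main loop that does three iterations per trip, followed by a remainder loop for the leftovers. The sketch below is not necessarily any of the three possibilities in the post (the snippet is cut off before they are all listed). The recording do_foo(), reset_calls(), and the bound n are assumptions added here so that the rolled and unrolled forms can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for do_foo() from the post: it just records
   every index it is called with, so the two loop forms can be compared. */
int calls[32];
size_t ncalls = 0;

void reset_calls(void) { ncalls = 0; }

void do_foo(int i) {
    assert(ncalls < sizeof calls / sizeof calls[0]);
    calls[ncalls++] = i;
}

/* The original loop, generalized from the fixed bound 10 to n. */
void run_rolled(int n) {
    for (int i = 0; i < n; i++)
        do_foo(i);
}

/* Unrolled by a factor of 3. The main loop handles three iterations per
   trip; the remainder loop handles the leftover n % 3 iterations. */
void run_unrolled_by_3(int n) {
    int i = 0;
    for (; i + 2 < n; i += 3) {
        do_foo(i);
        do_foo(i + 1);
        do_foo(i + 2);
    }
    for (; i < n; i++)
        do_foo(i);
}
```

With n = 10 the main loop runs for i = 0, 3, 6 and the remainder loop handles i = 9, so do_foo() sees the same indices in the same order as in the rolled loop.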
2019 Apr 15
2
Loop Strength Reduction Pass Does Not Work for Some Variables Related to Induction Variables
Dear all,
Hi! Recently, I tried to combine the passes SeparateConstOffsetFromGEP and LoopStrengthReduction to transform the multiplications in the lowered GEP IR into additions.
However, it seems that LoopStrengthReduction is unable to remove all the multiplications in the element offset calculation.
My test code is shown below. Thanks a lot in advance for your time and suggestions!
2017 Feb 02
2
(RFC) Adjusting default loop fully unroll threshold
> On Feb 1, 2017, at 4:57 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> clang, chrome, and some internal large apps are good candidates for size metrics.
I'd also add the standard LLVM testsuite just because it's the suite everyone in the community can use.
Michael
>
> David
>
> On Wed, Feb 1, 2017 at 4:47 PM, Chandler Carruth via
2017 Feb 08
2
(RFC) Adjusting default loop fully unroll threshold
On 02/07/2017 05:29 PM, Sanjay Patel via llvm-dev wrote:
> Sorry if I missed it, but what machine/CPU are you using to collect
> the perf numbers?
>
> I am concerned that what may be a win on a CPU that keeps a couple of
> hundred instructions in-flight and has many MB of caches will not hold
> for a small core.
In my experience, unrolling tends to help weaker cores even more
2013 Jan 18
2
[LLVMdev] How to get more details from storeInst ?
I fully unrolled a loop and got the following store instruction.
store i32 %add.3, i32* getelementptr inbounds ([20 x [20 x i32]]* @c, i32 0,
i32 0, i32 0), align 4
I want to know exactly which element of the array is going to be
stored to, which would help me transform the high-level language to hardware.
Taking the instruction above as an example, I know the data is stored into c[0][0].
It
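As a rough illustration of how the constant indices in such a getelementptr map to an array element, the arithmetic for a [20 x [20 x i32]] global can be sketched in plain C. This does not use the LLVM API. The function names are hypothetical, and the sketch assumes a data layout where i32 occupies 4 bytes with no padding between elements.

```c
#include <assert.h>
#include <stddef.h>

/* Dimensions and element size taken from the type [20 x [20 x i32]];
   ELEM_BYTES = 4 assumes i32 is stored in 4 bytes with no padding. */
enum { ROWS = 20, COLS = 20, ELEM_BYTES = 4 };

/* Flat element index selected by the three GEP indices: idx0 steps over
   whole [20 x [20 x i32]] objects, row and col index inside one object. */
size_t gep_element_index(size_t idx0, size_t row, size_t col) {
    assert(row < ROWS && col < COLS);
    return idx0 * (size_t)(ROWS * COLS) + row * COLS + col;
}

/* Byte offset from the base pointer @c for the same indices. */
size_t gep_byte_offset(size_t idx0, size_t row, size_t col) {
    return gep_element_index(idx0, row, col) * ELEM_BYTES;
}
```

For the store in the post, the indices (0, 0, 0) give element index 0, that is c[0][0]. Indices (0, 2, 3) would select c[2][3], which is element 43 at byte offset 172.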