thr3ads.net - llvm dev - [LLVMdev] [Polly] Question about Polly's speed up on huffbench.c without optimization and code generation [Aug 2013]

Star Tan

2013-Aug-06 03:08 UTC

[LLVMdev] [Polly] Question about Polly's speed up on huffbench.c without optimization and code generation

Hi all,


It seems that Polly can still speed up
test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c even without any
optimization or code generation. Our evaluation shows that when compiled with
"clang -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -mllvm
-polly-optimizer=none -mllvm -polly-code-generator=none", the execution
time of huffbench is reduced to 15 secs from the original 19 secs without
Polly.


By investigating Polly's canonicalization passes, I found that the speedup
mainly comes from "createIndVarSimplifyPass()", which is controlled by the
variable SCEVCodegen:


    if (!SCEVCodegen)
       PM.add(polly::createIndVarSimplifyPass());


If we remove this canonicalization pass, the performance improvement
disappears.


Could anyone give me some hints as to why Polly needs this canonicalization
pass in the normal case but skips it in the SCEVCodegen case? Is it possible to
remove this canonicalization pass altogether?


Thanks,
Star Tan

Tobias Grosser

2013-Aug-06 05:29 UTC


Hi Star,

polly::createIndVarSimplifyPass() is used in Polly to create canonical
induction variables when we do not use the SCEV-based code
generation. For SCEV-based code generation this pass is no longer needed;
in fact, one motivation for writing the SCEV-based code generation was
to remove the need for this pass. It still exists because we have not
yet fully tested the SCEV-based code generation, and the classical code
generation needs canonical induction variables.
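
For readers unfamiliar with the term: in LLVM, a canonical induction variable is a loop counter that starts at zero and increments by one. The sketch below is a hand-written, source-level illustration of that rewrite and is not output from Polly or from createIndVarSimplifyPass(); the function names sum_strided and sum_canonical are hypothetical, and the code assumes step > 0 and hi <= a.size().

```cpp
#include <cstddef>
#include <vector>

// Non-canonical form: the induction variable i starts at `lo`
// and advances by `step` on every iteration.
long sum_strided(const std::vector<long> &a, std::size_t lo,
                 std::size_t hi, std::size_t step) {
  long sum = 0;
  for (std::size_t i = lo; i < hi; i += step)
    sum += a[i];
  return sum;
}

// Canonical form: the counter k starts at 0 and advances by 1.
// The original index is recomputed from k as lo + k * step.
long sum_canonical(const std::vector<long> &a, std::size_t lo,
                   std::size_t hi, std::size_t step) {
  // Trip count of the original loop: ceil((hi - lo) / step), or 0 if empty.
  std::size_t trips = (lo < hi) ? (hi - lo + step - 1) / step : 0;
  long sum = 0;
  for (std::size_t k = 0; k < trips; ++k)
    sum += a[lo + k * step];
  return sum;
}
```

Both functions visit the same elements, so they return the same value. Which form the later optimization passes turn into faster machine code depends on the loop, which is one plausible reason the effect on a benchmark can go either way.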

Regarding the speedup due to Polly: it seems the rewrites introduced by
the createIndVarSimplifyPass happen to yield faster code. If you can
easily produce a reduced test case that shows a missing optimization,
it would be great to get a bug report for this. On the other hand, I
remember the induction variable canonicalization was removed due to
introducing unpredictable performance regressions (and possible
improvements?). Hence, I would not spend too much time tracking this
down unless there is an obvious missed optimization.

Cheers,
Tobi

Star Tan

2013-Aug-08 01:09 UTC


I see. Thanks for your explanation.
I think we could remove the induction variable canonicalization as the next
step.

Best,
Star Tan
