Star Tan
2013-Aug-06 03:08 UTC
[LLVMdev] [Polly] Question about Polly's speed up on huffbench.c without optimization and code generation
Hi all,

It seems that Polly can still speed up test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c even without any optimization or code generation. Our evaluation shows that when huffbench is compiled with "clang -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -mllvm -polly-optimizer=none -mllvm -polly-code-generator=none", its execution time drops to 15 secs from the original 19 secs without Polly.

Investigating Polly's canonicalization passes, I found that the speedup mainly comes from "createIndVarSimplifyPass()", which is controlled by the variable SCEVCodegen:

  if (!SCEVCodegen)
    PM.add(polly::createIndVarSimplifyPass());

If we remove this canonicalization pass, the performance improvement disappears.

Could anyone give me some hints as to why Polly needs this canonicalization pass in the normal case but skips it in the SCEVCodegen case? Is it possible to remove this canonicalization pass altogether?

Thanks,
Star Tan
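The gating described in the message above can be pictured with a toy pipeline. This is only a sketch under stated assumptions: the pipeline is modeled as a list of pass names, and `build_canonicalization_pipeline`, `check_pipeline_gating`, and both pass-name strings are hypothetical stand-ins, not Polly's or LLVM's actual API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a pass pipeline: an ordered list of pass names.
// Nothing here is Polly's real interface; it only mirrors the shape of
// "if (!SCEVCodegen) PM.add(...)" from the message.
std::vector<std::string> build_canonicalization_pipeline(bool scev_codegen) {
  std::vector<std::string> passes;
  // Placeholder for whatever canonicalization always runs (assumed).
  passes.push_back("other-canonicalization");
  // The induction variable pass is registered only for the classical
  // (non-SCEV) code generation path.
  if (!scev_codegen)
    passes.push_back("indvar-simplify");
  return passes;
}

// Small self-check: the extra pass appears only when the flag is off.
void check_pipeline_gating() {
  assert(build_canonicalization_pipeline(false).size() == 2);
  assert(build_canonicalization_pipeline(true).size() == 1);
}
```

With the flag off, the toy pipeline contains the extra pass; with it on, the pass is skipped, which corresponds to the configuration in which the reported speedup would not be expected.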
Tobias Grosser
2013-Aug-06 05:29 UTC
On 08/05/2013 08:08 PM, Star Tan wrote:
> Hi all,
>
> It seems that Polly could still speed up test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c even without any optimization and code generation. Our evaluation shows that when compiled with "clang -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -mllvm -polly-optimizer=none -mllvm -polly-code-generator=none", the execution time of huffbench would be reduced to 15 secs from the original 19 secs without Polly.
>
> By investigating Polly's canonicalization passes, I find the speedup mainly comes from "createIndVarSimplifyPass()", which is controlled by the variable SCEVCodegen:
>
>   if (!SCEVCodegen)
>     PM.add(polly::createIndVarSimplifyPass());
>
> If we remove this canonicalization pass, then there would be no performance improvement.
>
> Could anyone give me some hints why Polly needs this canonicalization pass in normal cases but refuses it in the SCEVCodegen case? Is it possible to remove this canonicalization pass at all?

Hi Star,

polly::createIndVarSimplifyPass() is used in Polly to create canonical induction variables when we do not use the SCEV based code generation. For SCEV based code generation this pass is no longer needed; in fact, one motivation for writing the SCEV based code generation was to remove the need for this pass. It still exists because we have not yet fully tested the SCEV based code generation, and the classical code generation needs canonical induction variables.

Regarding the speedup due to Polly: it seems the rewrites introduced by createIndVarSimplifyPass happen to yield faster code. If you can easily produce a reduced test case that shows a missed optimization, it would be great to get a bug report for it. On the other hand, I remember that the induction variable canonicalization was removed due to introducing unpredictable performance regressions (and possible improvements?). Hence, I would not spend too much time tracking this down unless there is an obvious missed optimization.

Cheers,
Tobi
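For readers unfamiliar with the term, a canonical induction variable starts at zero and steps by one. The following is a source-level sketch of what such a rewrite means, not the output of the pass itself, which operates on LLVM IR. The function names and the particular start/stride values are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original form: the induction variable starts at 2 and advances by 3.
long sum_strided(const std::vector<long> &a) {
  long sum = 0;
  for (std::size_t i = 2; i < a.size(); i += 3)
    sum += a[i];
  return sum;
}

// Canonical form: the induction variable k starts at 0 and advances by 1.
// The original index is recomputed as the affine expression 3 * k + 2.
long sum_canonical(const std::vector<long> &a) {
  const std::size_t n = a.size();
  // Trip count of the strided loop: ceil((n - 2) / 3) when n > 2, else 0.
  const std::size_t trips = n > 2 ? (n - 2 + 3 - 1) / 3 : 0;
  long sum = 0;
  for (std::size_t k = 0; k < trips; ++k)
    sum += a[3 * k + 2];
  return sum;
}

// Self-check: both loop forms visit the same elements.
void check_forms_agree() {
  const std::vector<long> a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
  assert(sum_strided(a) == sum_canonical(a));
}
```

Both functions compute the same result. The canonical form adds a multiply to each address computation, which later passes may or may not optimize away again, so it is plausible that a rewrite of this kind shifts performance in either direction, in line with the unpredictable regressions and improvements mentioned above.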
Star Tan
2013-Aug-08 01:09 UTC
At 2013-08-06 13:29:50, "Tobias Grosser" <tobias at grosser.es> wrote:
> On 08/05/2013 08:08 PM, Star Tan wrote:
>> Hi all,
>>
>> It seems that Polly could still speed up test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c even without any optimization and code generation. Our evaluation shows that when compiled with "clang -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -mllvm -polly-optimizer=none -mllvm -polly-code-generator=none", the execution time of huffbench would be reduced to 15 secs from the original 19 secs without Polly.
>>
>> By investigating Polly's canonicalization passes, I find the speedup mainly comes from "createIndVarSimplifyPass()", which is controlled by the variable SCEVCodegen:
>>
>>   if (!SCEVCodegen)
>>     PM.add(polly::createIndVarSimplifyPass());
>>
>> If we remove this canonicalization pass, then there would be no performance improvement.
>>
>> Could anyone give me some hints why Polly needs this canonicalization pass in normal cases but refuses it in the SCEVCodegen case? Is it possible to remove this canonicalization pass at all?
>
> Hi Star,
>
> polly::createIndVarSimplifyPass() is used in Polly to create canonical
> induction variables in case we do not use the SCEV based code
> generation. For SCEV based code generation this pass is not needed any
> more and one motivation for writing the SCEV based code generation was
> in fact to remove the need for this pass. It still exists as we did not
> yet fully test the SCEV based code generation and for the classical code
> generation we need canonical induction variables.
>
> Regarding the speed up due to Polly. It seems the rewrites introduced by
> the createIndVarSimplifyPass happen to yield faster code. If you can
> easily reproduce a reduced test case that shows a missing optimization,
> it would be great to get a bug report for this. On the other hand, I
> remember the induction variable canonicalization was removed due to
> introducing unpredictable performance regressions (and possible
> improvements?). Hence, I would not spend too much time tracking on this
> in case there is no obvious missed optimization.

I see. Thanks for your explanation. I think we could remove the induction variable canonicalization as the next step.

Best,
Star Tan