Kevin Qin
2014-Jul-31 08:03 UTC
[LLVMdev] Should we enable Partial unrolling and Runtime unrolling on AArch64?
Hi all,

Partial unrolling and runtime unrolling are enabled by default in AArch64 GCC, which helps performance. In LLVM, however, these two methods are enabled for only a few backends: X86, PowerPC and R600. I don't know the history of these two kinds of unrolling or why they are not more widely used. I would also like to know whether they are intentionally disabled for the AArch64 backend.

I've done some experiments and seen that performance is indeed affected. Overall, partial unrolling brings a small benefit in most benchmark cases, and the regressions are major and small. Runtime unrolling brings a huge improvement in certain cases but also a huge regression in others. The proportion of improvements to regressions varies between benchmarks. Code size also increases with both methods.

I will share more information before proposing any change. For now I just want to learn more about the background of these two unrolling methods.

--
Best Regards,
Kevin Qin
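For readers less familiar with the two transformations under discussion, here is a minimal hand-written sketch of what each one does to a simple reduction loop. It is an illustration only: the function names, the constant `kN` and the unroll factor of 4 are made up for this example, and the code a real compiler emits (for instance, whether leftover iterations run before or after the unrolled loop) may differ.

```cpp
#include <cstddef>

constexpr std::size_t kN = 1024;  // hypothetical trip count known at compile time

// Original loop: one element per iteration.
long sum_rolled(const int* a) {
    long s = 0;
    for (std::size_t i = 0; i < kN; ++i) s += a[i];
    return s;
}

// Partial unrolling by 4: the trip count is a known constant, but the loop is
// too long to unroll completely, so the body is replicated a few times.
// kN is a multiple of 4 here, so no remainder handling is needed.
long sum_partial(const int* a) {
    long s = 0;
    for (std::size_t i = 0; i < kN; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}

// Runtime unrolling by 4: n is only known at run time, so a short remainder
// loop first consumes n % 4 elements and the unrolled loop handles the rest.
long sum_runtime(const int* a, std::size_t n) {
    long s = 0;
    std::size_t i = 0;
    for (std::size_t rem = n % 4; i < rem; ++i) s += a[i];
    for (; i < n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}
```

Both unrolled forms execute fewer branches per element at the cost of a larger loop body, and the runtime variant adds the extra remainder loop on top. That is consistent with the code-size increase reported above, and the added setup work is one plausible reason short-running loops can regress.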
Kevin Qin
2014-Jul-31 08:28 UTC
[LLVMdev] Should we enable Partial unrolling and Runtime unrolling on AArch64?
Correcting a typo: the word "major" in the second paragraph should be "minor". Sorry about that.

--
Best Regards,
Kevin Qin
Hal Finkel
2014-Jul-31 14:11 UTC
[LLVMdev] Should we enable Partial unrolling and Runtime unrolling on AArch64?
These unrolling methods have been available in LLVM for several years, but the pass-manager setup and TTI hooks that let backends enable them in a target-specific way are relatively new. As you've noticed, per-target tuning is required. Patches are certainly welcome; if you have a modification for AArch64 that provides significant benefits and little downside, please send it to llvm-commits for review.

Thanks for looking at this.

-Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
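To make the idea of a target-specific hook concrete, here is a self-contained mock of the pattern. It is not LLVM code. The struct loosely mirrors the `Partial`, `Runtime` and `PartialThreshold` fields that, as far as I recall, LLVM's `TargetTransformInfo::UnrollingPreferences` exposed at the time. The class names `MockGenericTTI` and `MockAArch64TTI`, the helper `queryPreferences` and the threshold value 150 are hypothetical placeholders, not tuned or proposed settings.

```cpp
// Stand-in for the knobs a target may set; NOT the real LLVM type.
struct UnrollingPreferences {
    unsigned PartialThreshold = 0;  // size budget for a partially unrolled body
    bool Partial = false;           // allow partial unrolling
    bool Runtime = false;           // allow runtime unrolling
};

// Hypothetical generic target: leaves the conservative defaults untouched.
struct MockGenericTTI {
    virtual ~MockGenericTTI() = default;
    virtual void getUnrollingPreferences(UnrollingPreferences&) const {}
};

// Hypothetical AArch64 override that opts in to both kinds of unrolling.
struct MockAArch64TTI : MockGenericTTI {
    void getUnrollingPreferences(UnrollingPreferences& UP) const override {
        UP.Partial = true;
        UP.Runtime = true;
        UP.PartialThreshold = 150;  // placeholder value, would need tuning
    }
};

// What an unrolling pass would do in this sketch: start from the defaults,
// then let the target adjust them.
UnrollingPreferences queryPreferences(const MockGenericTTI& TTI) {
    UnrollingPreferences UP;
    TTI.getUnrollingPreferences(UP);
    return UP;
}
```

In this arrangement a backend that does nothing keeps both methods off, so enabling them for a new target is a small, local change, and most of the effort goes into choosing thresholds that hold up across benchmarks.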
Kevin Qin
2014-Aug-01 07:38 UTC
[LLVMdev] Should we enable Partial unrolling and Runtime unrolling on AArch64?
Hi Hal,

I wanted to check whether a conclusion had already been reached about these unrolling methods on the AArch64 target. It seems the answer is no, so it is worth spending more time tuning the parameters before sending out a patch. Thanks for providing some background on this.

Regards,
Kevin