Florian Hahn via llvm-dev
2021-Jun-24 17:29 UTC
[llvm-dev] Enabling Loop Distribution Pass as default in the pipeline of new pass manager
Hi,

> On Jun 24, 2021, at 17:38, Jingu Kang via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Sorry for Ping.
>
> As I mentioned on previous email, if you need more information for enabling the loop distribute pass, please let me know. @Michael @nikic

Do you have any data on how often LoopDistribute triggers on a larger set of programs (like llvm-test-suite + SPEC)? AFAIK the implementation is very limited at the moment (geared towards catching the case in hmmer) and I suspect lack of generality is one of the reasons why it is not enabled by default yet.

Also, there’s been an effort to improve the cost-modeling for LoopDistribute (https://reviews.llvm.org/D100381). Should we make progress in that direction first, before enabling by default?

Cheers,
Florian
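Editorial note: for anyone wanting to gather the trigger data asked about above, the pass can, as far as I know, already be requested without changing the default, either for a whole compilation with `-mllvm -enable-loop-distribute` or per loop with Clang's `#pragma clang loop distribute(enable)`. The sketch below is only an illustration: the function name `prefix_and_scale` and its arrays are hypothetical, and whether the loop is actually distributed depends on the compiler version and its legality and profitability checks.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the thread.
 * sum[] carries a dependence from one iteration to the next (prefix sum),
 * while the write to y[] is independent across iterations.
 * sum must have room for n + 1 elements. */
void prefix_and_scale(unsigned *sum, const unsigned *x, unsigned *y,
                      unsigned k, size_t n) {
#ifdef __clang__
/* Per-loop request for distribution; other compilers skip this line. */
#pragma clang loop distribute(enable)
#endif
    for (size_t i = 0; i < n; ++i) {
        sum[i + 1] = sum[i] + x[i]; /* loop-carried dependence */
        y[i] = x[i] * k;            /* independent, a vectorization candidate */
    }
}
```

Building with optimization remarks enabled (for example `-Rpass=loop-distribute -Rpass-missed=loop-distribute`) should report whether the loop was split, which is one way to count triggers across a benchmark suite.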
Sanne Wouda via llvm-dev
2021-Jun-25 10:23 UTC
[llvm-dev] Enabling Loop Distribution Pass as default in the pipeline of new pass manager
Hi,

> Do you have any data on how often LoopDistribute triggers on a larger set of programs (like llvm-test-suite + SPEC)? AFAIK the implementation is very limited at the moment (geared towards catching the case in hmmer) and I suspect lack of generality is one of the reasons why it is not enabled by default yet.

It would be good to have some fresh numbers on how often LoopDistribute triggers. From what I remember, there are a handful of cases in the test suite, but nothing that significantly affects performance (other than hmmer, obviously).

> Also, there’s been an effort to improve the cost-modeling for LoopDistribute (https://reviews.llvm.org/D100381) Should we make progress in that direction first, before enabling by default?

Unfortunately, there were some problems with this effort. First, the current implementation of LoopDistribute relies heavily on LoopAccessAnalysis, which made it difficult to adapt.

More importantly though, I'm not convinced that LoopDistribute will be beneficial other than in cases where it enables more vectorization. (The memcpy detection in gcc might be interesting, I didn't look at that.) It reduces both ILP and MLP, which in some cases might be made up by lower register or cache pressure, but this is hard or impossible for the compiler to know.

While working on this, with a more aggressive LoopDistribute across several benchmarks, I did not see any improvements that didn't turn out to be noise, and plenty of cases where it was actively degrading performance.

Therefore, I'm not sure this direction is worth pursuing further, and I believe the current heuristic of "distribute when it enables new vectorization" is actually pretty reasonable, if not very general.

Cheers,
Sanne
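Editorial note: the "distribute when it enables new vectorization" heuristic can be illustrated with a hand-written sketch. It is not compiler output; the function names `fused`, `distributed` and `same_result` are made up for this example. The loop body mixes a recurrence on `a[]`, which blocks vectorization of the whole loop, with an independent multiply into `c[]`. Splitting the loop isolates the recurrence so that the second loop becomes a vectorization candidate. The split is only valid if the arrays do not overlap, which a compiler would have to prove or guard with runtime checks.

```c
#include <stddef.h>
#include <string.h>

/* Original form: one loop, two statements. a must hold n + 1 elements. */
void fused(unsigned *a, const unsigned *b, unsigned *c,
           const unsigned *d, const unsigned *e, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        a[i + 1] = a[i] * b[i]; /* depends on the previous iteration */
        c[i] = d[i] * e[i];     /* independent across iterations */
    }
}

/* Distributed by hand: assumes a, b, c, d and e do not alias. */
void distributed(unsigned *a, const unsigned *b, unsigned *c,
                 const unsigned *d, const unsigned *e, size_t n) {
    for (size_t i = 0; i < n; ++i)
        a[i + 1] = a[i] * b[i]; /* stays scalar: the recurrence remains */
    for (size_t i = 0; i < n; ++i)
        c[i] = d[i] * e[i];     /* no cross-iteration dependence left */
}

/* Returns 1 if both versions produce identical arrays on a small input. */
int same_result(void) {
    enum { N = 8 };
    unsigned b[N], d[N], e[N];
    unsigned a1[N + 1] = {1}, a2[N + 1] = {1};
    unsigned c1[N] = {0}, c2[N] = {0};
    for (size_t i = 0; i < N; ++i) {
        b[i] = (unsigned)i + 2;
        d[i] = (unsigned)i + 1;
        e[i] = 3;
    }
    fused(a1, b, c1, d, e, N);
    distributed(a2, b, c2, d, e, N);
    return memcmp(a1, a2, sizeof a1) == 0 && memcmp(c1, c2, sizeof c1) == 0;
}
```

The sketch also shows the cost side raised in this message: the distributed version pays loop overhead twice, and the multiplies that could previously overlap with the recurrence in a single iteration now run in a separate loop, so any gain has to come from vectorizing the second loop.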
Jingu Kang via llvm-dev
2021-Jul-05 13:40 UTC
[llvm-dev] Enabling Loop Distribution Pass as default in the pipeline of new pass manager
Ping.

Additionally, I did not see the pass trigger on llvm-test-suite or the SPEC benchmarks, except for hmmer.

Thanks
JinGu Kang
Florian Hahn via llvm-dev
2021-Jul-14 12:18 UTC
[llvm-dev] Enabling Loop Distribution Pass as default in the pipeline of new pass manager
> On 25 Jun 2021, at 12:23, Sanne Wouda <Sanne.Wouda at arm.com> wrote:
>
> Hi,
>
>> Do you have any data on how often LoopDistribute triggers on a larger set of programs (like llvm-test-suite + SPEC)? AFAIK the implementation is very limited at the moment (geared towards catching the case in hmmer) and I suspect lack of generality is one of the reasons why it is not enabled by default yet.
>
> It would be good to have some fresh numbers on how often LoopDistribute triggers. From what I remember, there are a handful of cases in the test suite, but nothing that significantly affects performance (other than hmmer, obviously).
>
>> Also, there’s been an effort to improve the cost-modeling for LoopDistribute (https://reviews.llvm.org/D100381) Should we make progress in that direction first, before enabling by default?
>
> Unfortunately, there were some problems with this effort. First, the current implementation of LoopDistribute relies heavily on LoopAccessAnalysis, which made it difficult to adapt.
>
> More importantly though, I'm not convinced that LoopDistribute will be beneficial other than in cases where it enables more vectorization. (The memcpy detection in gcc might be interesting, I didn't look at that.) It reduces both ILP and MLP, which in some cases might be made up by lower register or cache pressure, but this is hard or impossible for the compiler to know.

I think we should be able to make an educated guess at least if we wanted to, although it won’t be straightforward. I think there can be cases where loop distribution can be beneficial on its own, especially for large loops where enough parallelism remains after distributing, but they can be highly target-specific.

> While working on this, with a more aggressive LoopDistribute across several benchmarks, I did not see any improvements that didn't turn out to be noise, and plenty of cases where it was actively degrading performance.

Thanks for the update! It might be good to close the loop on the review as well?

Cheers,
Florian