Dangeti Tharun kumar via llvm-dev
2016-Jun-23  15:20 UTC
[llvm-dev] Questions on LLVM vectorization diagnostics
Dear LLVM Community,

I am D Tharun Kumar, a master’s student at the Indian Institute of Technology Hyderabad, working in a team to improve the current vectorizer in LLVM. As an initial study, we are studying various benchmarks to analyze and compare the vectorizing capabilities of LLVM, GCC and ICC. We found that the vectorization remarks given by LLVM are vague and brief, whereas GCC and ICC give detailed diagnostics.

- I am interested to know why the LLVM diagnostics are brief and not intuitive (which makes them less helpful).

- In our analysis we have never seen LLVM try to vectorize outer loops. Is this well known? Is outer loop vectorization implemented in LLVM as in GCC (http://dl.acm.org/citation.cfm?id=1454119)? If not, is someone working on it?

- On the TSVC benchmark suite, out of a total of 151 loops, LLVM, GCC and ICC vectorized 70, 82 and 112 loops respectively. Is LLVM’s lag caused by limitations of its vectorizer, or are there (enabling) optimization passes running before GCC’s vectorizer that help GCC perform better?

- Loop peeling to enhance vectorization is present in GCC and ICC, but the LLVM remarks don’t say anything about alignment. Does LLVM have this functionality without the vectorizer remarking on it, or does it not have the functionality at all?

Finally, we would appreciate suggestions and directions for improving the vectorization framework of LLVM.

I would also like to know if anyone has worked or is working on improving vectorization remarks.

Regards,
Dangeti Tharun kumar
M.TECH Computer Science
IIT Hyderabad
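For readers who want to reproduce the kind of remarks discussed in this message, here is a minimal sketch of a test file. The file name `remarks.c` and both function names are hypothetical, chosen only for illustration. Clang’s remark flags `-Rpass=loop-vectorize`, `-Rpass-missed=loop-vectorize` and `-Rpass-analysis=loop-vectorize` ask the loop vectorizer to report what it did, what it skipped, and why; GCC offers `-fopt-info-vec-missed` for a similar purpose. The exact remark text depends on the compiler version, so none is shown here.

```c
/* remarks.c (hypothetical file name)
 *
 * One way to request vectorization remarks:
 *   clang -O3 -c remarks.c -Rpass=loop-vectorize \
 *         -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize
 *   gcc   -O3 -c remarks.c -fopt-info-vec-missed
 */
#include <assert.h>
#include <stddef.h>

/* Independent iterations and no aliasing (restrict): a typical
 * candidate for the loop vectorizer. */
void add_arrays(float *restrict a, const float *restrict b,
                const float *restrict c, size_t n) {
    assert(a && b && c); /* precondition assumed by this sketch */
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Loop-carried dependence: iteration i reads the value written by
 * iteration i - 1, so a straightforward vectorization is not legal. */
void prefix_sum(float *a, size_t n) {
    assert(a); /* precondition assumed by this sketch */
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```

Comparing the remarks for the two functions across compilers is a small-scale version of the TSVC comparison described above.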
Ramakrishna Upadrasta via llvm-dev
2016-Jun-23  15:29 UTC
[llvm-dev] Questions on LLVM vectorization diagnostics
Hi Tharun,

Thanks for this nice question! Hope this leads to good discussion and feedback.

Best,
Ramakrishna
Adam Nemet via llvm-dev
2016-Jun-23  17:45 UTC
[llvm-dev] Questions on LLVM vectorization diagnostics
Hi Dangeti,

> On Jun 23, 2016, at 8:20 AM, Dangeti Tharun kumar via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> We found that vectorization remarks given by LLVM are vague and brief, comparatively GCC and ICC are giving detailed diagnostics.

Yes, this is an area that needs further improvement. We have some immediate plans to make these more useful. See the recent llvm-dev threads [1], [2].

> I am interested to know why the LLVM diagnostics are brief and not intuitive (making them less helpful)?

I think it’s just lack of work or weakness in the analyses to provide more detailed information. It would be good to file bugs for specific cases where we fall behind.

> In our analysis we never seen llvm trying to vectorize outer loops. Is this well known? Is outer loop vectorization implemented in LLVM as in GCC? (http://dl.acm.org/citation.cfm?id=1454119) If not, is someone working on it?

I have heard various people mention this, but I am not sure whether actual work is already taking place.

> On the TSVC benchmark suite, out of a total of 151 loops, LLVM, GCC and ICC vectorized 70, 82 and 112 loops respectively. Is the cause for lag of LLVM the inability of LLVM’s vectorizer, or are there any (enabling) optimization passes running before GCC’s vectorizer that are helping GCC perform better?

I don’t know about GCC, but I’ve seen ICC perform loop transformations more aggressively, which can increase the coverage for loop vectorization. ICC performs loop distribution, fusion, interchange, etc. by default at its highest optimization level. We have some of these passes (distribution, interchange), but they are not on by default yet.

Arguably, there is also some difference between the focus areas of these compilers. I think that ICC has more of an HPC focus than LLVM or GCC. We have Polly, which is geared more toward the HPC use cases.

> Loop peeling to enhance vectorization is present in GCC and ICC, but, the LLVM remarks don’t say anything about alignment. Does LLVM has this functionality and the vectorizer doesn’t remark about it, or it doesn’t it have the functionality at all?

We don’t have it.

> Finally, we appreciate suggestions and directions for improving the vectorization framework of LLVM.

This is a pretty active area. Reading up on recent llvm-dev discussions in this area would probably be helpful to you.

> I would also like to know if anyone worked or is working on improving vectorization remarks.

Yes, we are. If you’re interested in working on this area, it would be good to coordinate.

Adam

[1] http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
[2] http://thread.gmane.org/gmane.comp.compilers.llvm.devel/99126
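To illustrate why a transformation such as loop distribution can raise vectorization coverage, here is a minimal hand-written sketch. The function names `fused` and `distributed` are hypothetical, and the sketch assumes the three arrays do not overlap; it shows the idea done by hand, not what any particular compiler pass emits.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: the recurrence on a[] is loop-carried, and as long as
 * both statements share one loop, the whole body is hard to vectorize. */
void fused(float *a, float *b, const float *c, size_t n) {
    assert(n == 0 || (a && b && c)); /* precondition assumed here */
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + c[i]; /* depends on the previous iteration */
        b[i] = c[i] * 2.0f;     /* independent across iterations */
    }
}

/* Distributed form: the same work split into two loops. The first loop
 * keeps the recurrence; the second has independent iterations and is a
 * straightforward vectorization candidate. Legal only because a, b and
 * c are assumed not to alias. */
void distributed(float *a, float *b, const float *c, size_t n) {
    assert(n == 0 || (a && b && c)); /* precondition assumed here */
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + c[i];
    for (size_t i = 1; i < n; i++)
        b[i] = c[i] * 2.0f;
}
```

Both functions compute the same values; only the second exposes a loop without a carried dependence, which is the kind of loop a vectorizer can count as vectorized.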
Gerolf Hoflehner via llvm-dev
2016-Jun-23  18:06 UTC
[llvm-dev] Questions on LLVM vectorization diagnostics
> On Jun 23, 2016, at 8:20 AM, Dangeti Tharun kumar via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On the TSVC benchmark suite, out of a total of 151 loops, LLVM, GCC and ICC vectorized 70, 82 and 112 loops respectively. Is the cause for lag of LLVM the inability of LLVM’s vectorizer, or are there any (enabling) optimization passes running before GCC’s vectorizer that are helping GCC perform better?

Can you share your table with the performance score per benchmark and the options you used for each compiler? Could you separate your results into integer and floating-point loops? You might also want to compare the compilers on the code samples at http://www.aartbik.com/SSE/code.html.
Dangeti Tharun kumar via llvm-dev
2016-Jul-05  04:46 UTC
[llvm-dev] Questions on LLVM vectorization diagnostics
Dear Adam Nemet,

On Thu, Jun 23, 2016 at 11:15 PM, Adam Nemet <anemet at apple.com> wrote:
> > - Loop peeling to enhance vectorization is present in GCC and ICC,
> >   but, the LLVM remarks don’t say anything about alignment. Does LLVM has
> >   this functionality and the vectorizer doesn’t remark about it, or it
> >   doesn’t it have the functionality at all?
>
> We don’t have it.

Regarding alignment, we tried a few examples such as the one below:

    for (int i = 0; i < N; i++) {
        A[i+3] = B[i+1] + C[i+2];
    }

The LLVM vectorizer did not respond with any remark, neither that the loop was not vectorized nor the reason for not vectorizing it.

Our team is interested in enhancing the vectorizer to support unaligned structures (http://dl.acm.org/citation.cfm?id=996853). Is anyone already working on this?

> > I would also like to know if anyone worked or is working on improving
> > vectorization remarks.
>
> Yes we are. If you’re interested working on this area it would be good to
> coordinate.

Yes, we are very much interested in coordinating in this area.

Thank you,
D Tharun kumar
CSE-IITH
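As a rough illustration of the loop peeling for alignment discussed in this thread, here is a hand-written sketch applied to the loop from the message above. Everything beyond that loop is an assumption made for illustration: the names `shifted_add`, `peel_count` and `shifted_add_peeled`, the 16-byte vector width, and the requirement that the arrays do not overlap. It shows the idea by hand and is not how GCC or ICC implement it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16u /* assumed vector width (128-bit SIMD) */

/* Reference version: the loop from the message above. */
void shifted_add(float *A, const float *B, const float *C, size_t N) {
    for (size_t i = 0; i < N; i++)
        A[i + 3] = B[i + 1] + C[i + 2];
}

/* Number of scalar iterations to run before the store address
 * &A[i + 3] becomes VEC_BYTES-aligned (capped at N). */
size_t peel_count(const float *A, size_t N) {
    uintptr_t addr = (uintptr_t)(A + 3);
    size_t misalign = (size_t)(addr % VEC_BYTES);
    /* A float pointer is expected to be at least float-aligned. */
    assert(misalign % sizeof(float) == 0);
    size_t peel = misalign ? (VEC_BYTES - misalign) / sizeof(float) : 0;
    return peel < N ? peel : N;
}

/* Peeled version: a short scalar prologue, then a main loop whose
 * first store to A is aligned. */
void shifted_add_peeled(float *A, const float *B, const float *C, size_t N) {
    size_t peel = peel_count(A, N);
    size_t i = 0;
    for (; i < peel; i++) /* prologue: peeled iterations */
        A[i + 3] = B[i + 1] + C[i + 2];
    for (; i < N; i++)    /* main loop: store stream starts aligned */
        A[i + 3] = B[i + 1] + C[i + 2];
}
```

Peeling aligns only one stream, here the stores to `A`. The loads from `B[i + 1]` and `C[i + 2]` start at different offsets, so in general they remain misaligned relative to the store stream, which is the situation the cited work on misaligned accesses addresses.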