Finkel, Hal J. via llvm-dev
2019-Aug-08 23:52 UTC
[llvm-dev] [LLVM] (RFC) Addition/Support of new Vectorization Pragmas in LLVM
On 8/8/19 2:03 PM, Hal Finkel wrote:

Hi,

First, as a high-level note, you posted a link to a Google doc, and at the end of the Google doc, you have a list of questions that you'd like answered. In the future, please put the questions directly in the email. For one thing, more people will read your email than will open your Google doc. Second, having the questions in the email should allow a better threading structure to the replies.

* Ivdep: Is clang loop vectorize(assume_safety) equivalent to ivdep? To what extent do the semantics of ivdep need to be modified for Clang to create an equally “useful pragma”? To what extent would it be helpful to have this pragma in Clang?

There is a fundamental problem with the way that ivdep is defined by Intel's current documentation, at least for C/C++. As you note in your Google doc, it essentially says that the optimizer may ignore loop-carried dependencies except for those dependencies it can definitely prove are present. These are not semantics that any other compiler can actually replicate, and they are not equivalent to "vectorize(assume_safety)" (which asserts that no loop-carried dependencies are present). The good news is that, in conversations I've had with Intel, an openness to making these semantics more concrete has been expressed. I think it would be very useful to have ivdep in Clang, but only after we nail down the semantics with Intel in some useful way.

* Nontemporal: What kind of analysis can we do in LLVM to find where to use nontemporal accesses? Any help would be greatly appreciated.

If you're asking about the pragma, then what analysis is necessary? In general, you're looking for accesses that won't benefit from caching (e.g., streaming data which is not accessed again).

* vecremainder/novecremainder: Should the pragma simply call the vectorizer to attempt to vectorize the remainder loop, or should the vectorizer use a different method?

Something like that. There were patches posted at some point to enable tail-loop vectorization. At this point, I imagine that you'd construct a VPlan with the vectorized tail.

* mask_readwrite/nomask_readwrite: Is it a good idea to implement a pragma that will generate mask intrinsics in the IR? What other architectures (except x86) have support for masked reads/writes?

ARM SVE might also fall into this category.

* Reference: https://llvm.org/devmtg/2015-04/slides/MaskedIntrinsics.pdf
  LLVM has mask intrinsics for targets with AVX, AVX2, AVX-512.
  From Slides: “Most of the targets do not support masked instructions, optimization of instructions with masks is problematic, avoid introducing new masked instructions into LLVM IR”

* aligned/unaligned: Is it worthwhile to have an LLVM-specific pragma rather than depending on OpenMP?

My opinion is that, so long as we have our own vectorization pragma, it should be as fully-featured as people request it to be.

-Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of HAPPY Mahto via llvm-dev <llvm-dev at lists.llvm.org>
Sent: Thursday, August 8, 2019 11:55 AM
To: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
Cc: BHAVYA BAGLA <cs17btech11007 at iith.ac.in>; MAMIDALA SAI PRAHARSH <es17btech11013 at iith.ac.in>; HAPPY KUMAR <cs17btech11018 at iith.ac.in>; YASHAS ANDALURI <es17btech11025 at iith.ac.in>
Subject: [llvm-dev] [LLVM] (RFC) Addition/Support of new Vectorization Pragmas in LLVM

Hello all,

We are students from the Indian Institute of Technology (IIT), Hyderabad, and we would like to propose the addition of the following pragmas in LLVM that aid in (or possibly increase the scope of) vectorization in LLVM (in comparison with other compilers).

1. ivdep
2. Nontemporal
3. [no]vecremainder
4. [no]mask_readwrite
5. [un]aligned

Could you please check the following Google document for the semantic description of these pragmas:

https://docs.google.com/document/d/1YjGnyzWFKJvqbpCsZicCUczzU8HlLHkmG9MssUw-R1A/edit?usp=sharing

Vectorization Pragmas LLVM:RFC: V2 (docs.google.com)
Vectorization Pragmas in LLVM: An RFC Yashas Andaluri, Happy Mahto, M Sai Praharsh, Bhavya Bagla IIT Hyderabad Aug 8th, 2019 [Thanks to feedback from Venugopal Raghavan, Shivarama Rao (AMD) and Michael Kruse & Hal Finkel (ANL).] Vectorization Pragmas ivdep vector(nontemporal) vector([no]vecrema...

It would be great if you could please review the above document and suggest how we should proceed further (either about the semantics, or about the code sections in LLVM).

Thank you

Yashas, Happy, Sai Praharsh, and Bhavya
B.Tech 3rd year, IITH.
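Hal's description of nontemporal candidates (streaming data that is written once and not accessed again) can be illustrated with a small C sketch. This is only an illustration under stated assumptions, not the pragma proposed in the RFC: it uses Clang's existing __builtin_nontemporal_store builtin where available and falls back to an ordinary store elsewhere. The names STREAM_STORE and scale_stream are hypothetical and were made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Use Clang's nontemporal store builtin when the compiler offers it.
   On other compilers, fall back to a plain store so the sketch still builds. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_nontemporal_store)
#    define STREAM_STORE(val, ptr) __builtin_nontemporal_store((val), (ptr))
#  endif
#endif
#ifndef STREAM_STORE
#  define STREAM_STORE(val, ptr) (*(ptr) = (val))
#endif

/* Hypothetical kernel: each dst element is written exactly once and is not
   read again inside the loop, which is the streaming pattern that does not
   benefit from caching. */
static void scale_stream(float *restrict dst, const float *restrict src,
                         float factor, size_t n) {
    for (size_t i = 0; i < n; ++i)
        STREAM_STORE(src[i] * factor, &dst[i]);
}
```

Calling scale_stream(dst, src, 2.0f, n) produces the same values as a plain loop; the nontemporal form is only a hint about cache behaviour, so whether it helps depends on the target and on the data really not being reused soon.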
Cameron McInally via llvm-dev
2019-Aug-09 00:50 UTC
[llvm-dev] [LLVM] (RFC) Addition/Support of new Vectorization Pragmas in LLVM
On Thu, Aug 8, 2019 at 7:52 PM Finkel, Hal J. via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Ivdep: Is clang loop vectorize(assume_safety) equivalent to ivdep? To
> what extent do the semantics of ivdep need to be modified for Clang to
> create an equally “useful pragma”? To what extent would it be helpful
> to have this pragma in Clang?
>
> There is a fundamental problem with the way that ivdep is defined by
> Intel's current documentation, at least for C/C++. As you note in your
> Google doc, it essentially says that the optimizer may ignore loop-carried
> dependencies except for those dependencies it can definitely prove are
> present. These are not semantics that any other compiler can actually
> replicate, and they are not equivalent to "vectorize(assume_safety)" (which
> asserts that no loop-carried dependencies are present). The good news is
> that, in conversations I've had with Intel, an openness to making these
> semantics more concrete has been expressed. I think it would be very useful
> to have ivdep in Clang, but only after we nail down the semantics with
> Intel in some useful way.

To be fair, IVDEP most likely originated at Cray. [Or maybe Control Data. The history is fuzzy that far back. I do know it predates ANSI C.]

There's a publicly available copy of the Cray C/C++ manual here:

https://pubs.cray.com/content/S-2179/9.0/cray-classic-c-and-c++-reference-manual/vectorization-directives

Scott Manley from Cray would be a good resource to tap for clarification on the semantics.
Scott Manley via llvm-dev
2019-Aug-09 15:57 UTC
[llvm-dev] [LLVM] (RFC) Addition/Support of new Vectorization Pragmas in LLVM
> There is a fundamental problem with the way that ivdep is defined by
> Intel's current documentation, at least for C/C++. As you note in your
> Google doc, it essentially says that the optimizer may ignore loop-carried
> dependencies except for those dependencies it can definitely prove are
> present. These are not semantics that any other compiler can actually
> replicate, and they are not equivalent to "vectorize(assume_safety)" (which
> asserts that no loop-carried dependencies are present). The good news is
> that, in conversations I've had with Intel, an openness to making these
> semantics more concrete has been expressed. I think it would be very useful
> to have ivdep in Clang, but only after we nail down the semantics with
> Intel in some useful way.

Agreed. I don't see a lot of value in having the compiler override a pragma that is supposed to override the compiler :)

Cray's IVDEP really means what the documentation says: Ignore Vector DEPendencies. It doesn't remove all dependencies, just dependencies that inhibit vectorization. It also does not force vectorization. If it's not possible or not profitable to vectorize, then it won't vectorize.

I will add that ivdep is well used by Cray and its users, so I'd like to see it well defined in Clang/LLVM.
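A small C sketch may help show the kind of loop this discussion is about. The compiler cannot prove that the index array below has no repeated entries, so it has to assume a possible loop-carried dependence unless the programmer says otherwise. The sketch uses hints that exist today, "#pragma clang loop vectorize(assume_safety)" on Clang and "#pragma GCC ivdep" on GCC, only as approximations; as the thread notes, their semantics are not identical to Intel's or Cray's ivdep. The function name scatter_add is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: a[idx[i]] += b[i].
   The caller promises that idx contains no repeated values. If that promise
   is broken, a vectorized version may give results that differ from the
   scalar loop, which is why the hint is an assertion by the programmer. */
static void scatter_add(float *a, const float *b, const int *idx, size_t n) {
#if defined(__clang__)
#pragma clang loop vectorize(assume_safety)
#elif defined(__GNUC__)
#pragma GCC ivdep
#endif
    for (size_t i = 0; i < n; ++i)
        a[idx[i]] += b[i];
}
```

The preprocessor guards keep the sketch free of unknown-pragma warnings on either compiler. Neither hint changes the result when the promise about idx holds; it only removes an obstacle that would otherwise stop the vectorizer from considering the loop.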
Sjoerd Meijer via llvm-dev
2019-Aug-13 16:58 UTC
[llvm-dev] [LLVM] (RFC) Addition/Support of new Vectorization Pragmas in LLVM
> vecremainder/novecremainder: Should the pragma simply call the vectorizer to attempt to vectorize the remainder loop, or should the vectorizer use a different method?
>
> Something like that. There were patches posted at some point to enable tail-loop vectorization. At this point, I imagine that you'd construct a VPlan with the vectorized tail.

Yep, committed in https://reviews.llvm.org/rL366989 and https://reviews.llvm.org/D65197. The pragma name is different, but I think it tries to achieve the same thing.
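To make the remainder-loop question concrete, here is a scalar C emulation of the two shapes a vectorized loop can take: a vector body followed by a scalar remainder loop, and a body in which the tail is folded in by predicating (masking) every lane. This is an illustration only, not LLVM's implementation; the width VF and both function names are assumptions made up for the sketch. Clang documents a loop hint "#pragma clang loop vectorize_predicate(enable)" for requesting the predicated form; that this is the pragma referred to above is an assumption.

```c
#include <assert.h>
#include <stddef.h>

enum { VF = 4 }; /* illustrative vector width, chosen arbitrarily */

/* Shape 1: full-width "vector" iterations, then a scalar remainder loop. */
static void add_with_remainder(int *dst, const int *a, const int *b, size_t n) {
    size_t i = 0;
    for (; i + VF <= n; i += VF)                 /* vector body */
        for (size_t lane = 0; lane < VF; ++lane)
            dst[i + lane] = a[i + lane] + b[i + lane];
    for (; i < n; ++i)                           /* scalar tail */
        dst[i] = a[i] + b[i];
}

/* Shape 2: the tail is folded into the body. Every lane is guarded by a
   mask bit, so the last partial iteration touches only valid elements. */
static void add_tail_folded(int *dst, const int *a, const int *b, size_t n) {
    for (size_t i = 0; i < n; i += VF)
        for (size_t lane = 0; lane < VF; ++lane) {
            int active = (i + lane) < n;         /* per-lane predicate */
            if (active)
                dst[i + lane] = a[i + lane] + b[i + lane];
        }
}
```

Both functions compute the same result for any n. The second form corresponds to what masked loads and stores make possible on hardware that supports them, which is where the mask_readwrite question earlier in the thread connects to the remainder-loop question.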