Elovikov, Andrei via llvm-dev
2019-Jun-10 18:09 UTC
[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.
Hi Francesco,> I am crafting the attribute so that it makes it explicit that we are using OpenMP and we are expecting a Vector Function ABI.I just thought that another option would be to force FE to always emit "logically"-widened alwaysinline wrapper for the vector function that does the arguments processing according to ABI inside (we need that info in the FE anyway). That way the vectorizer pass won't need to care about the tricky processing and we (possibly) will get a somewhat easier to understand IR after the vectorizer. Is that something that might work? Thanks, Andrei -----Original Message----- From: Francesco Petrogalli <Francesco.Petrogalli at arm.com> Sent: Monday, June 10, 2019 09:09 To: Elovikov, Andrei <andrei.elovikov at intel.com> Cc: llvm-dev at lists.llvm.org; Saito, Hideki <hideki.saito at intel.com> Subject: Re: [RFC] Expose user provided vector function for auto-vectorization. Hi Andrei,> On Jun 7, 2019, at 5:46 PM, Elovikov, Andrei <andrei.elovikov at intel.com> wrote: > > Hi All, > > [I'm only subscribed to digest, so the reply doesn't look great, sorry > about that] > >> The second component is a tool that other parts of LLVM (for example, the loop vectorizer) can use to query the availability of the vector function, the SVFS I have described in the original post of the RFC, which is based on interpreting the `vector-variant` attribute. >> The final component is the one that seems to have generated most of the controversies discussed in the thread, and for which I decided to move away from `declare variant`. > > Where will the mapping between parameters positions be stored? Using the example from https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-vector-variant: > > float MyAdd(float* a, int b) { return *a + b; } > __declspec(vector_variant(implements(MyAdd(float *a, int b)), > linear(a), vectorlength(8), > nomask, processor(core_2nd_gen_avx))) > __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2) > > We need somehow communicate which lanes of widened "b" would map for the b1 parameter and which would go to the b2. If we only care about single ABI (like the one mandated by the OMP) than such things could be put to TTI, but what about other ABIs? Should we encode this explicitly in the annotation too? >I think that the mapping between a scalar parameter and the correspondent vector parameter(s - there can be more than one) should be handled by the Vector Function ABI when a vector function ABI is defined. I am working out on a new proposal, I’ll keep you posted. I think that the requirements of 1. being a user feature 2. Based on a standard (OpenMP), implies the fact that a contract between the scalar functions and the vector functions must be stipulated in some document, such document being a vector function ABI for the target. I am crafting the attribute so that it makes it explicit that we are using OpenMP and we are expecting a Vector Function ABI. Kind regards, Francesco> Best Regards, > AndreiIMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Francesco Petrogalli via llvm-dev
2019-Jun-10 18:37 UTC
[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.
> On Jun 10, 2019, at 1:09 PM, Elovikov, Andrei <andrei.elovikov at intel.com> wrote: > > Hi Francesco, >Hello!>> I am crafting the attribute so that it makes it explicit that we are using OpenMP and we are expecting a Vector Function ABI. > > I just thought that another option would be to force FE to always emit "logically"-widened alwaysinline wrapper for the vector function that does the arguments processing according to ABI inside (we need that info in the FE anyway). That way the vectorizer pass won't need to care about the tricky processing and we (possibly) will get a somewhat easier to understand IR after the vectorizer. > > Is that something that might work? >I don’t know, I am not sure I understand your request. What is a `"logically"-widened alwaysinline wrapper for the vector function`? Can you provide an example? Also, what is the `tricky processing` you are referring to that the vectorizer should care about? Kind regards, Francesco> Thanks, > Andrei > > -----Original Message----- > From: Francesco Petrogalli <Francesco.Petrogalli at arm.com> > Sent: Monday, June 10, 2019 09:09 > To: Elovikov, Andrei <andrei.elovikov at intel.com> > Cc: llvm-dev at lists.llvm.org; Saito, Hideki <hideki.saito at intel.com> > Subject: Re: [RFC] Expose user provided vector function for auto-vectorization. > > Hi Andrei, > >> On Jun 7, 2019, at 5:46 PM, Elovikov, Andrei <andrei.elovikov at intel.com> wrote: >> >> Hi All, >> >> [I'm only subscribed to digest, so the reply doesn't look great, sorry >> about that] >> >>> The second component is a tool that other parts of LLVM (for example, the loop vectorizer) can use to query the availability of the vector function, the SVFS I have described in the original post of the RFC, which is based on interpreting the `vector-variant` attribute. >>> The final component is the one that seems to have generated most of the controversies discussed in the thread, and for which I decided to move away from `declare variant`. >> >> Where will the mapping between parameters positions be stored? Using the example from https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-vector-variant: >> >> float MyAdd(float* a, int b) { return *a + b; } >> __declspec(vector_variant(implements(MyAdd(float *a, int b)), >> linear(a), vectorlength(8), >> nomask, processor(core_2nd_gen_avx))) >> __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2) >> >> We need somehow communicate which lanes of widened "b" would map for the b1 parameter and which would go to the b2. If we only care about single ABI (like the one mandated by the OMP) than such things could be put to TTI, but what about other ABIs? Should we encode this explicitly in the annotation too? >> > > I think that the mapping between a scalar parameter and the correspondent vector parameter(s - there can be more than one) should be handled by the Vector Function ABI when a vector function ABI is defined. > > I am working out on a new proposal, I’ll keep you posted. > > I think that the requirements of 1. being a user feature 2. Based on a standard (OpenMP), implies the fact that a contract between the scalar functions and the vector functions must be stipulated in some document, such document being a vector function ABI for the target. > > I am crafting the attribute so that it makes it explicit that we are using OpenMP and we are expecting a Vector Function ABI. > > Kind regards, > > Francesco > >> Best Regards, >> Andrei > > IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Elovikov, Andrei via llvm-dev
2019-Jun-10 18:51 UTC
[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.
> What is a `"logically"-widened alwaysinline wrapper for the vector function`? Can you provide an example? Also, what is the `tricky processing` you are referring to that the vectorizer should care about?For the case mentioned earlier: float MyAdd(float* a, int b) { return *a + b; } __declspec(vector_variant(implements(MyAdd(float *a, int b)), linear(a), vectorlength(8), nomask, processor(core_2nd_gen_avx))) __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2) If FE emitted ;; Alwaysinline define <8 x float> @MyAddVec.abi_wrapper(float* %v_a, <8 x i32> %v_b) { ;; Not sure about the exact values in the mask parameter. %v_b1 = shufflevector <8 x i32> %v_b, <8 x i32> undef, <4 x i32><i32 0, i32 1, i32 2, i32 3> %v_b2 = shufflevector <8 x i32> %v_b, <8 x i32> undef, <4 x i32><i32 4, i32 5, i32 6, i32 7> %ret = call <8 x float> @MyAddVec(%v_a, %v_b1, %v_b2) } Then the vectorizer won't need to deal with the splitting of vector version of %b into two arguments, and the vector-attribute would only describe that kind of processing that is specific to the vectorizer and not the lowering ABI part. Note, that I don't insist on this approach, it's just an alternative to the "hardcoded" usage of OMP's Vector Function ABI. Thanks, Andrei -----Original Message----- From: Francesco Petrogalli <Francesco.Petrogalli at arm.com> Sent: Monday, June 10, 2019 11:38 To: Elovikov, Andrei <andrei.elovikov at intel.com> Cc: llvm-dev at lists.llvm.org; Saito, Hideki <hideki.saito at intel.com> Subject: Re: [RFC] Expose user provided vector function for auto-vectorization.> On Jun 10, 2019, at 1:09 PM, Elovikov, Andrei <andrei.elovikov at intel.com> wrote: > > Hi Francesco, >Hello!>> I am crafting the attribute so that it makes it explicit that we are using OpenMP and we are expecting a Vector Function ABI. > > I just thought that another option would be to force FE to always emit "logically"-widened alwaysinline wrapper for the vector function that does the arguments processing according to ABI inside (we need that info in the FE anyway). That way the vectorizer pass won't need to care about the tricky processing and we (possibly) will get a somewhat easier to understand IR after the vectorizer. > > Is that something that might work? >I don’t know, I am not sure I understand your request. What is a `"logically"-widened alwaysinline wrapper for the vector function`? Can you provide an example? Also, what is the `tricky processing` you are referring to that the vectorizer should care about? Kind regards, Francesco> Thanks, > Andrei > > -----Original Message----- > From: Francesco Petrogalli <Francesco.Petrogalli at arm.com> > Sent: Monday, June 10, 2019 09:09 > To: Elovikov, Andrei <andrei.elovikov at intel.com> > Cc: llvm-dev at lists.llvm.org; Saito, Hideki <hideki.saito at intel.com> > Subject: Re: [RFC] Expose user provided vector function for auto-vectorization. > > Hi Andrei, > >> On Jun 7, 2019, at 5:46 PM, Elovikov, Andrei <andrei.elovikov at intel.com> wrote: >> >> Hi All, >> >> [I'm only subscribed to digest, so the reply doesn't look great, >> sorry about that] >> >>> The second component is a tool that other parts of LLVM (for example, the loop vectorizer) can use to query the availability of the vector function, the SVFS I have described in the original post of the RFC, which is based on interpreting the `vector-variant` attribute. >>> The final component is the one that seems to have generated most of the controversies discussed in the thread, and for which I decided to move away from `declare variant`. >> >> Where will the mapping between parameters positions be stored? Using the example from https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-vector-variant: >> >> float MyAdd(float* a, int b) { return *a + b; } >> __declspec(vector_variant(implements(MyAdd(float *a, int b)), >> linear(a), vectorlength(8), >> nomask, processor(core_2nd_gen_avx))) >> __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2) >> >> We need somehow communicate which lanes of widened "b" would map for the b1 parameter and which would go to the b2. If we only care about single ABI (like the one mandated by the OMP) than such things could be put to TTI, but what about other ABIs? Should we encode this explicitly in the annotation too? >> > > I think that the mapping between a scalar parameter and the correspondent vector parameter(s - there can be more than one) should be handled by the Vector Function ABI when a vector function ABI is defined. > > I am working out on a new proposal, I’ll keep you posted. > > I think that the requirements of 1. being a user feature 2. Based on a standard (OpenMP), implies the fact that a contract between the scalar functions and the vector functions must be stipulated in some document, such document being a vector function ABI for the target. > > I am crafting the attribute so that it makes it explicit that we are using OpenMP and we are expecting a Vector Function ABI. > > Kind regards, > > Francesco > >> Best Regards, >> Andrei > > IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.