thr3ads.net - llvm dev - [llvm-dev] [RFC] Expose user provided vector function for auto-vectorization. [Jun 2019]


Elovikov, Andrei via llvm-dev

2019-Jun-10 18:09 UTC

[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.

Hi Francesco,
> I am crafting the attribute so that it makes it explicit that we are using
> OpenMP and we are expecting a Vector Function ABI.
I just thought that another option would be to force the FE to always emit a
"logically"-widened alwaysinline wrapper for the vector function that
does the argument processing according to the ABI inside (we need that info in
the FE anyway). That way the vectorizer pass won't need to care about the tricky
processing and we (possibly) will get somewhat easier-to-understand IR after
the vectorizer.

Is that something that might work?

Thanks,
Andrei

-----Original Message-----
From: Francesco Petrogalli <Francesco.Petrogalli at arm.com> 
Sent: Monday, June 10, 2019 09:09
To: Elovikov, Andrei <andrei.elovikov at intel.com>
Cc: llvm-dev at lists.llvm.org; Saito, Hideki <hideki.saito at intel.com>
Subject: Re: [RFC] Expose user provided vector function for auto-vectorization.

Hi Andrei,
> On Jun 7, 2019, at 5:46 PM, Elovikov, Andrei <andrei.elovikov at intel.com> wrote:
>
> Hi All,
>
> [I'm only subscribed to digest, so the reply doesn't look great, sorry
> about that]
>
>> The second component is a tool that other parts of LLVM (for example,
>> the loop vectorizer) can use to query the availability of the vector function,
>> the SVFS I have described in the original post of the RFC, which is based on
>> interpreting the `vector-variant` attribute.
>> The final component is the one that seems to have generated most of the
>> controversies discussed in the thread, and for which I decided to move away from
>> `declare variant`.
>
> Where will the mapping between parameter positions be stored? Using the
> example from
> https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-vector-variant:
>
> float MyAdd(float* a, int b) { return *a + b; } 
> __declspec(vector_variant(implements(MyAdd(float *a, int b)),
>                          linear(a), vectorlength(8),
>                          nomask, processor(core_2nd_gen_avx)))
> __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2)
>
> We need to somehow communicate which lanes of the widened "b" would map
> to the b1 parameter and which would go to b2. If we only care about a single
> ABI (like the one mandated by OMP), then such things could be put into TTI, but
> what about other ABIs? Should we encode this explicitly in the annotation too?
>
I think that the mapping between a scalar parameter and the corresponding vector
parameter(s) (there can be more than one) should be handled by the Vector
Function ABI when a vector function ABI is defined.
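
To make the question concrete, here is a minimal sketch of what such a mapping could look like if it were written down as data. Everything in it is hypothetical: `VectorParamSlice`, `myAddVecMapping` and `coversAllLanes` are illustrative names, not LLVM or OpenMP APIs, and the lane assignment for v_b1/v_b2 (lanes 0..3 and 4..7) is an assumption rather than something a particular ABI document is quoted as saying.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one vector parameter: which scalar parameter it
// carries and which lanes of the logically widened value it holds.
struct VectorParamSlice {
  std::size_t scalarParam; // index of the scalar parameter being widened
  std::size_t firstLane;   // first lane covered by this vector argument
  std::size_t numLanes;    // number of consecutive lanes covered
};

// Assumed mapping for MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2)
// with vectorlength(8).
inline std::vector<VectorParamSlice> myAddVecMapping() {
  return {
      {0, 0, 8}, // v_a: linear pointer, one argument stands for all 8 lanes
      {1, 0, 4}, // v_b1: assumed to hold lanes 0..3 of the widened b
      {1, 4, 4}, // v_b2: assumed to hold lanes 4..7 of the widened b
  };
}

// Returns true when the slices for `scalarParam` tile lanes [0, vf) in order,
// with no gaps and no overlap.
inline bool coversAllLanes(const std::vector<VectorParamSlice> &mapping,
                           std::size_t scalarParam, std::size_t vf) {
  assert(vf > 0);
  std::size_t nextLane = 0;
  for (const VectorParamSlice &slice : mapping) {
    if (slice.scalarParam != scalarParam)
      continue;
    if (slice.firstLane != nextLane)
      return false;
    nextLane += slice.numLanes;
  }
  return nextLane == vf;
}
```

With a fixed Vector Function ABI, a table like this would be implied by the ABI document and would not need to be carried in the attribute; it only becomes explicit data if several ABIs have to be supported.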

I am working on a new proposal; I’ll keep you posted.

I think that the requirements of 1. being a user feature and 2. being based on a
standard (OpenMP) imply that a contract between the scalar functions and the
vector functions must be stipulated in some document, such document being a
vector function ABI for the target.

I am crafting the attribute so that it makes it explicit that we are using
OpenMP and we are expecting a Vector Function ABI.

Kind regards,

Francesco
> Best Regards,
> Andrei
IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended recipient,
please notify the sender immediately and do not disclose the contents to any
other person, use it for any purpose, or store or copy the information in any
medium. Thank you.

Francesco Petrogalli via llvm-dev

2019-Jun-10 18:37 UTC

[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.

> On Jun 10, 2019, at 1:09 PM, Elovikov, Andrei <andrei.elovikov at intel.com> wrote:
>
> Hi Francesco,
>
Hello!
>> I am crafting the attribute so that it makes it explicit that we are
>> using OpenMP and we are expecting a Vector Function ABI.
>
> I just thought that another option would be to force the FE to always emit a
> "logically"-widened alwaysinline wrapper for the vector function that
> does the argument processing according to the ABI inside (we need that info in
> the FE anyway). That way the vectorizer pass won't need to care about the tricky
> processing and we (possibly) will get somewhat easier-to-understand IR after
> the vectorizer.
>
> Is that something that might work?
>
I don’t know, I am not sure I understand your request.

What is a `"logically"-widened alwaysinline wrapper for the vector
function`? Can you provide an example? Also, what is the `tricky processing` you
are referring to that the vectorizer should care about?

Kind regards,

Francesco
> Thanks,
> Andrei

Elovikov, Andrei via llvm-dev

2019-Jun-10 18:51 UTC

[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.

> What is a `"logically"-widened alwaysinline wrapper for the
> vector function`? Can you provide an example? Also, what is the `tricky
> processing` you are referring to that the vectorizer should care about?
For the case mentioned earlier:

 float MyAdd(float* a, int b) { return *a + b; } 
 __declspec(vector_variant(implements(MyAdd(float *a, int b)),
                         linear(a), vectorlength(8),
                         nomask, processor(core_2nd_gen_avx)))
 __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2)

If the FE emitted

;; Alwaysinline
define <8 x float> @MyAddVec.abi_wrapper(float* %v_a, <8 x i32> %v_b) {
  ;; Not sure about the exact values in the mask parameter.
  %v_b1 = shufflevector <8 x i32> %v_b, <8 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  %v_b2 = shufflevector <8 x i32> %v_b, <8 x i32> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
  %ret = call <8 x float> @MyAddVec(float* %v_a, <4 x i32> %v_b1, <4 x i32> %v_b2)
  ret <8 x float> %ret
}

Then the vectorizer won't need to deal with splitting the vector version
of %b into two arguments, and the vector attribute would only describe the kind
of processing that is specific to the vectorizer, not the ABI lowering part.
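
The same idea can be sketched at the source level. The snippet below is only an illustration under stated assumptions: `MyAddVec` is a stand-in that uses plain `std::array` values instead of `__m256`/`__m128i` so it builds on any target, `MyAddVec_abi_wrapper` is a hypothetical name, and the low-half/high-half split mirrors the shuffle masks above, which may not match the real ABI.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for the user-provided vector variant: eight lanes, `a` is linear
// (lane i reads v_a[i]), and `b` arrives as two 4-lane halves.
inline std::array<float, 8> MyAddVec(const float *v_a,
                                     const std::array<int, 4> &v_b1,
                                     const std::array<int, 4> &v_b2) {
  std::array<float, 8> out{};
  for (std::size_t i = 0; i < 4; ++i) {
    out[i] = v_a[i] + static_cast<float>(v_b1[i]);
    out[i + 4] = v_a[i + 4] + static_cast<float>(v_b2[i]);
  }
  return out;
}

// Hypothetical "logically widened" wrapper: the caller passes one 8-lane `b`,
// and the ABI-specific split into two halves happens only in here.
inline std::array<float, 8> MyAddVec_abi_wrapper(const float *v_a,
                                                 const std::array<int, 8> &v_b) {
  assert(v_a != nullptr);
  std::array<int, 4> lo{};
  std::array<int, 4> hi{};
  for (std::size_t i = 0; i < 4; ++i) {
    lo[i] = v_b[i];     // assumed: lanes 0..3 go to the first argument
    hi[i] = v_b[i + 4]; // assumed: lanes 4..7 go to the second argument
  }
  return MyAddVec(v_a, lo, hi);
}
```

A caller that only knows the widened signature would call `MyAddVec_abi_wrapper(a, b)` with one 8-element `b`; after inlining, only the call to `MyAddVec` with the split arguments remains.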

Note that I don't insist on this approach; it's just an alternative to
the "hardcoded" usage of OMP's Vector Function ABI.

Thanks,
Andrei

