thr3ads.net - llvm dev - [llvm-dev] LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target [Jul 2020]


Neil Henning via llvm-dev

2020-Jul-17 11:51 UTC

[llvm-dev] LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

Oh interesting - I hadn't even considered registering vector descriptors
for the LLVM intrinsics, but right enough when I just registered that pow
has a vector variant (itself of a bigger size) I got the correct 8-wide
variants like I was expecting - nice!

Thanks for the help!

Cheers,
-Neil.

On Fri, Jul 17, 2020 at 12:09 PM Florian Hahn <florian_hahn at apple.com>
wrote:
>
>
> On 16 Jul 2020, at 19:54, Neil Henning via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> So for us we use SLEEF to actually implement the libcalls (LLVM
> intrinsics) that LLVM by default would generate - and since SLEEF has
> highly optimal 8-wide pow, optimized for AVX and AVX2, we really want to
> use that.
>
>
> Right, the way vector versions of library functions are accessed by the
> vectoriser has changed since the last release. I think the initial patch
> was https://reviews.llvm.org/D70107.
>
> Vector functions now must be annotated with a vector-function-abi-variant
> function attribute. The -inject-tli-mappings pass is supposed to add these
> attributes for the vector functions known to TLI. It seems like this is
> currently not happening for your custom TLI mappings for some reason.
>
> For example, the Accelerate library has a vector version of log10. Running
> `opt -vector-library=Accelerate -inject-tli-mappings` on the IR below will
> add the following attribute to the llvm.log10 call-site, indicating that
> there’s a <4 x float> version of log10 called vlog10f.
>
> {
> "vector-function-abi-variant"="_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)"
> }
>
>
> To double-check: if running -inject-tli-mappings on your example does not
> add the vector-function-abi-variant attribute for `pow`, the vectorisers
> won’t know about the vector versions. If the attribute is actually created
> but the vector version is still not used, it would be great if you could
> share the IR with the attributes, as they depend on the downstream TLI.
>
> I am also CC’ing Francesco, who might be able to help you pin down
> where exactly things go wrong with the mapping.
>
> Cheers,
> Florian
>
> ——
>
> define float @call_llvm.log10.f32(float %in) {
>   %call = tail call float @llvm.log10.f32(float %in)
>   ret float %call
> }
>
> declare float @llvm.log10.f32(float)
>
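
For readers trying to read the attribute string in the log10 example above, here is a minimal Python sketch that splits it into its parts. It assumes a simplified form of the mangling: the `_LLVM_` ISA token, a mask letter (`N` for unmasked, `M` for masked), a fixed vector length, one lowercase letter per parameter (`v` meaning a vector argument), then the scalar name and the vector name in parentheses. The names `VariantMapping` and `parse_variant` are made up for this illustration and are not part of LLVM; the full Vector Function ABI grammar has more cases than this handles.

```python
import re
from typing import NamedTuple


class VariantMapping(NamedTuple):
    """Decoded pieces of one vector-function-abi-variant entry (illustrative)."""
    masked: bool       # True for mask letter 'M', False for 'N'
    vlen: int          # number of lanes in the vector variant
    params: str        # one letter per argument, e.g. "v" or "vv"
    scalar_name: str   # scalar function or intrinsic being mapped
    vector_name: str   # name of the vector function to call instead


# Simplified pattern (an assumption of this sketch): only the "_LLVM_" ISA
# token, a numeric vector length, and letter-only parameter tokens.
_PATTERN = re.compile(
    r"_ZGV_LLVM_(?P<mask>[NM])(?P<vlen>\d+)(?P<params>[a-z]+)"
    r"_(?P<scalar>[^()]+)\((?P<vector>[^()]+)\)"
)


def parse_variant(mangled: str) -> VariantMapping:
    """Split a mangled variant string into its components."""
    m = _PATTERN.fullmatch(mangled)
    if m is None:
        raise ValueError(f"not a recognised variant string: {mangled!r}")
    return VariantMapping(
        masked=(m.group("mask") == "M"),
        vlen=int(m.group("vlen")),
        params=m.group("params"),
        scalar_name=m.group("scalar"),
        vector_name=m.group("vector"),
    )


if __name__ == "__main__":
    print(parse_variant("_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)"))
```

Under the same assumptions, an 8-wide two-argument mapping for pow would presumably look like `_ZGV_LLVM_N8vv_llvm.pow.f32(<vector function name>)`, where the vector function name depends on the downstream library and is not given in this thread.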

-- 
Neil Henning
Senior Software Engineer Compiler
unity.com

Florian Hahn via llvm-dev

2020-Jul-17 14:19 UTC



> On Jul 17, 2020, at 12:51, Neil Henning <neil.henning at unity3d.com> wrote:
>
> Oh interesting - I hadn't even considered registering vector
> descriptors for the LLVM intrinsics, but right enough when I just registered
> that pow has a vector variant (itself of a bigger size) I got the correct
> 8-wide variants like I was expecting - nice!
>
> Thanks for the help!

No worries. It might be worth calling this new behavior out in the release notes
for 11.0.


Francesco Petrogalli via llvm-dev

2020-Jul-17 14:23 UTC



Hi all,

Thank you Florian for pointing Neil in the right direction!

I am glad this has been working for you, Neil. Feel free to contact me directly
if you need anything related to function vectorization again.

> It might be worth calling this new behavior out in the release notes for 11.0

How can we make sure this happens? Is there a process for contributing to the
release notes?

Kind regards,

Francesco



