thr3ads.net - llvm dev - [llvm-dev] Vectorizing minimum without function attributes [May 2019]

Nicolau Werneck via llvm-dev

2019-May-04 11:36 UTC

[llvm-dev] Vectorizing minimum without function attributes

Greetings,

The LLVM loop vectorizer does a great job handling reductions with the
`min(a, b)` function over an array of integers or floats. This finds the
smallest value of a list exploiting SIMD instructions, and works just as
well as a summation.
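
A minimal sketch of the kind of loop in question, written in C++ for concreteness. The function name `min_reduce` and the compare-then-select form of the body are assumptions made for illustration; the original message does not show source code.

```cpp
#include <cassert>
#include <cstddef>

// Minimum of a[0..n). Each iteration is a compare followed by a select,
// which is the shape a min reduction typically takes once lowered to IR
// (`icmp`/`fcmp` plus `select`).
float min_reduce(const float* a, std::size_t n) {
    assert(n > 0 && "min_reduce needs at least one element");
    float m = a[0];
    for (std::size_t i = 1; i < n; ++i) {
        m = (a[i] < m) ? a[i] : m;
    }
    return m;
}
```

The same loop over an integer array has identical structure, with the comparison done on integers instead.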

Specifically with floats, though, using the `fcmp` instruction, the
vectorization seems to require the function attribute "no-nans-fp-math"
to be set. Just setting instruction flags is not enough. This forces us
to give up on fine-grained control of fast-math in the code in order to
benefit from this vectorization.

How can this be overcome? LLVM has intrinsic functions such as `minnum`
and `minimum` (`minnan`) that accurately represent the operation. These
could permit fine-grained control of fast-math flags, but the vectorizer
seems to ignore them.

Beyond this specific case, it would be nice to know for sure when it is
actually necessary to set these function attributes, e.g.
https://github.com/llvm/llvm-project/blob/8205a814a691bfa62fed911b58b0a306ab5efe31/clang/lib/CodeGen/CGCall.cpp#L1743-L1750

What would be a way to control the vectorization for `min` without having
to rely on that function attribute? And furthermore, could LLVM
optimizations conceivably depend only on instruction flags, and not ever on
function attributes? What would be necessary to achieve this?

Thanks,

-- 
Nicolau Werneck <nwerneck at gmail.com>
http://n <http://nwerneck.sdf.org>ic.hpavc.net

Finkel, Hal J. via llvm-dev

2019-May-04 13:41 UTC

On 5/4/19 6:36 AM, llvm-dev wrote:
> Greetings,
>
> The LLVM loop vectorizer does a great job handling reductions with the
> `min(a, b)` function over an array of integers or floats. This finds the
> smallest value of a list exploiting SIMD instructions, and works just as
> well as a summation.
>
> Specifically with floats, though, using the `fcmp` instruction, the
> vectorization seems to require the function attribute "no-nans-fp-math"
> to be set. Just setting instruction flags is not enough.


fcmp takes fast-math flags now, but that wasn't always true (my recollection
is that this capability was added after the arithmetic operations). In any
case, I wonder if this is just a holdover from before fcmp took fast-math
flags, or if this is an && condition that should be an || condition.


> This forces us to give up on fine-grained control of fast-math in the code
> in order to benefit from this vectorization.
>
> How to overcome this? LLVM has intrinsic functions such as `minnum` and
> `minimum` (`minnan`) that accurately represent the operation. This could
> permit fine-grained control of fast-math flags, although the vectorizer
> seems to ignore these intrinsics.
>
> Beyond this specific case, it would be nice to be sure when is it ever
> necessary to set these function attributes, e.g.
> https://github.com/llvm/llvm-project/blob/8205a814a691bfa62fed911b58b0a306ab5efe31/clang/lib/CodeGen/CGCall.cpp#L1743-L1750
>
> What would be a way to control the vectorization for `min` without having
> to rely on that function attribute? And furthermore, could LLVM
> optimizations conceivably depend only on instruction flags, and not ever on
> function attributes? What would be necessary to achieve this?


The goal has been to eliminate the dependence on the function attributes once
all of the necessary local flags are in place. Obviously I could be missing
something, but this just seems like a bug.

 -Hal


Nicolau Werneck via llvm-dev

2019-May-04 21:51 UTC

Thanks for the reply. I should say I'm actually working on LLVM 6.0, but I
don't think this part of the code has changed much since. These are traces
I made with GDB while optimizing a loop over floats and then over integers,
showing where they diverge:
https://gist.github.com/nlw0/58ed9fda8e8944a9cb5e5a20f6038fcf

This is the point where, I believe, the function attribute has to be set for
the vectorization to work with floats:
https://github.com/llvm/llvm-project/blob/fd254e429ea103be8bab6271855c04919d33f9fb/llvm/lib/Analysis/IVDescriptors.cpp#L590

Could this be a bug? It seems to me it just hasn't been changed yet to
depend only on instruction flags.

I would gladly work on refactoring this if there's an opportunity, but I'm
a complete newbie in this project. It would be great to hear from someone
more knowledgeable what can be done about this issue, especially if it
turns out to be a very small patch!


-- 
Nicolau Werneck <nwerneck at gmail.com>
http://n <http://nwerneck.sdf.org>ic.hpavc.net