thr3ads.net - llvm dev - [llvm-dev] Vectorizing minimum without function attributes [May 2019]

If this information is useful, please help other people find it:
Share via:

Nicolau Werneck via llvm-dev

2019-May-04 21:51 UTC

[llvm-dev] Vectorizing minimum without function attributes

Thanks for the reply. I should say I'm actually working on 6.0, but I
don't
think this part of the code changed much since. These are traces I made
with GDB optimizing a loop with floats and then integers, showing where
they diverge:
https://gist.github.com/nlw0/58ed9fda8e8944a9cb5e5a20f6038fcf

This is the point I believe we need to set the function attribute for the
vectorization to work with floats:
https://github.com/llvm/llvm-project/blob/fd254e429ea103be8bab6271855c04919d33f9fb/llvm/lib/Analysis/IVDescriptors.cpp#L590

Could this be a bug? It seems to me it just wasn't changed yet to depend
only on instruction flags.

I would gladly work on refactoring this if there's an opportunity, but
I'm
a complete newbie in this project. It would be great to hear from someone
more knowledgeable what can be done about this issue, especially if turns
out to be a very small patch!


On Sat, May 4, 2019 at 3:41 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
> On 5/4/19 6:36 AM, llvm-dev wrote:
>
> Greetings,
>
> The LLVM loop vectorizer does a great job handling reductions with the
> `min(a, b)` function over an array of integers or floats. This finds the
> smallest value of a list exploiting SIMD instructions, and works just as
> well as a summation.
>
> Specifically with floats, though, using the `fcmp` instruction, the
> vectorization seems to require the function attribute
"no-nans-fp-math" to
> be set. Just setting instruction flags is not enough.
>
>
> fcmp takes fast-math flags now, but that wasn't always true (my
> recollection is that was a capability added after the arithmetic
> operations). In any case, I wonder if this is just a hold-over from before
> fcmp took fast-math flags, or if this is an && condition that
should be an
> || condition.
>
>
> This forces us to give up on fine-grained control of fast-math in the code
> in order to benefit from this vectorization.
>
> How to overcome this? LLVM has intrinsic functions such as `minnum` and
> `minimum` (`minnan`) that accurately represent the operation. This could
> permit fine-grained control of fast-math flags, although the vectorizer
> seems to ignore these intrinsics.
>
> Beyond this specific case, it would be nice to be sure when is it ever
> necessary to set these function attributes, e.g.
>
>
https://github.com/llvm/llvm-project/blob/8205a814a691bfa62fed911b58b0a306ab5efe31/clang/lib/CodeGen/CGCall.cpp#L1743-L1750
>
> What would be a way to control the vectorization for `min` without having
> to rely on that function attribute? And furthermore, could LLVM
> optimizations conceivably depend only on instruction flags, and not ever on
> function attributes? What would be necessary to achieve this?
>
>
> The goal has been to eliminate the dependence on the function attributes
> once all of the necessary local flags are in place. Obviously I could be
> missing something, but this just seems like a bug.
>
>  -Hal
>
>
> Thanks,
>
> --
> Nicolau Werneck <nwerneck at gmail.com>
> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>
> _______________________________________________
> LLVM Developers mailing listllvm-dev at
lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
-- 
Nicolau Werneck <nwerneck at gmail.com>
http://n <http://nwerneck.sdf.org>ic.hpavc.net
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190504/1d91f64c/attachment.html>

Sanjay Patel via llvm-dev

2019-May-06 13:31 UTC

head link

[llvm-dev] Vectorizing minimum without function attributes

We have several problems with FP min/max optimization, so it would help to
see an IR example. But I agree that we should not require a function
attribute to enable the optimization.

This might be the closest match based on the description:
https://bugs.llvm.org/show_bug.cgi?id=35538

Other candidates:
https://bugs.llvm.org/show_bug.cgi?id=36982
https://bugs.llvm.org/show_bug.cgi?id=34149
https://bugs.llvm.org/show_bug.cgi?id=26956
https://bugs.llvm.org/show_bug.cgi?id=35284


On Sat, May 4, 2019 at 3:51 PM Nicolau Werneck via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Thanks for the reply. I should say I'm actually working on 6.0, but I
> don't think this part of the code changed much since. These are traces
I
> made with GDB optimizing a loop with floats and then integers, showing
> where they diverge:
> https://gist.github.com/nlw0/58ed9fda8e8944a9cb5e5a20f6038fcf
>
> This is the point I believe we need to set the function attribute for the
> vectorization to work with floats:
>
>
https://github.com/llvm/llvm-project/blob/fd254e429ea103be8bab6271855c04919d33f9fb/llvm/lib/Analysis/IVDescriptors.cpp#L590
>
> Could this be a bug? It seems to me it just wasn't changed yet to
depend
> only on instruction flags.
>
> I would gladly work on refactoring this if there's an opportunity, but
I'm
> a complete newbie in this project. It would be great to hear from someone
> more knowledgeable what can be done about this issue, especially if turns
> out to be a very small patch!
>
>
> On Sat, May 4, 2019 at 3:41 PM Finkel, Hal J. <hfinkel at anl.gov>
wrote:
>
>> On 5/4/19 6:36 AM, llvm-dev wrote:
>>
>> Greetings,
>>
>> The LLVM loop vectorizer does a great job handling reductions with the
>> `min(a, b)` function over an array of integers or floats. This finds
the
>> smallest value of a list exploiting SIMD instructions, and works just
as
>> well as a summation.
>>
>> Specifically with floats, though, using the `fcmp` instruction, the
>> vectorization seems to require the function attribute
"no-nans-fp-math" to
>> be set. Just setting instruction flags is not enough.
>>
>>
>> fcmp takes fast-math flags now, but that wasn't always true (my
>> recollection is that was a capability added after the arithmetic
>> operations). In any case, I wonder if this is just a hold-over from
before
>> fcmp took fast-math flags, or if this is an && condition that
should be an
>> || condition.
>>
>>
>> This forces us to give up on fine-grained control of fast-math in the
>> code in order to benefit from this vectorization.
>>
>> How to overcome this? LLVM has intrinsic functions such as `minnum` and
>> `minimum` (`minnan`) that accurately represent the operation. This
could
>> permit fine-grained control of fast-math flags, although the vectorizer
>> seems to ignore these intrinsics.
>>
>> Beyond this specific case, it would be nice to be sure when is it ever
>> necessary to set these function attributes, e.g.
>>
>>
https://github.com/llvm/llvm-project/blob/8205a814a691bfa62fed911b58b0a306ab5efe31/clang/lib/CodeGen/CGCall.cpp#L1743-L1750
>>
>> What would be a way to control the vectorization for `min` without
having
>> to rely on that function attribute? And furthermore, could LLVM
>> optimizations conceivably depend only on instruction flags, and not
ever on
>> function attributes? What would be necessary to achieve this?
>>
>>
>> The goal has been to eliminate the dependence on the function
attributes
>> once all of the necessary local flags are in place. Obviously I could
be
>> missing something, but this just seems like a bug.
>>
>>  -Hal
>>
>>
>> Thanks,
>>
>> --
>> Nicolau Werneck <nwerneck at gmail.com>
>> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>>
>> _______________________________________________
>> LLVM Developers mailing listllvm-dev at
lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>>
>
> --
> Nicolau Werneck <nwerneck at gmail.com>
> http://n <http://nwerneck.sdf.org>ic.hpavc.net
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190506/28ce5ce5/attachment.html>

Nicolau Werneck via llvm-dev

2019-May-06 19:18 UTC

head link

[llvm-dev] Vectorizing minimum without function attributes

Thanks for checking this out. I think your second comment on 35538 must be
precisely what I'm talking about.

I came up with this IR that might be used for testing:

```
define float @minloop(float* nocapture readonly)  #0 {
top:
  %1 = load float, float* %0
  br label %loop

loop:
  %2 = phi i64 [ %8, %loop ], [ 1, %top ]
  %3 = phi float [ %7, %loop ], [ %1, %top ]
  %4 = getelementptr float, float* %0, i64 %2
  %5 = load float, float* %4, align 4
  %6 = fcmp fast olt float %3, %5
  %7 = select i1 %6, float %3, float %5
  %8 = add i64 %2, 1
  %9 = icmp eq i64 %8, 65537
  br i1 %9, label %out, label %loop

out:
  ret float %7
}

attributes #0 = { "no-nans-fp-math"="true"}
```

I can get vectorized code if I use this:
opt -S minloop.ll -O2 -force-vector-width=8

Setting the attributes is necessary, and commenting it prevents the
vectorization. Neither the fcmp fast flag or the command-line argument
-fp-contract=fast have any effect.

The code that seems relevant to this issue is 6 years old. In fact, its
birthday was just yesterday!
https://github.com/llvm/llvm-project/commit/d96e427eacdd4d48da3a1f93a64008ef4dadcc8a#diff-b3c08c1a517adb5ef635cdf9cf224904R3087

If support for fast flags in fcmp was really introduced only after that,
then this was probably just never updated, and the fix might be as simple
as replacing `HasFunNoNaNAttr` by `I->hasNoNaNs()`. Does that sound like a
possible PR?



On Mon, May 6, 2019 at 3:31 PM Sanjay Patel <spatel at rotateright.com>
wrote:
> We have several problems with FP min/max optimization, so it would help to
> see an IR example. But I agree that we should not require a function
> attribute to enable the optimization.
>
> This might be the closest match based on the description:
> https://bugs.llvm.org/show_bug.cgi?id=35538
>
> Other candidates:
> https://bugs.llvm.org/show_bug.cgi?id=36982
> https://bugs.llvm.org/show_bug.cgi?id=34149
> https://bugs.llvm.org/show_bug.cgi?id=26956
> https://bugs.llvm.org/show_bug.cgi?id=35284
>
>
> On Sat, May 4, 2019 at 3:51 PM Nicolau Werneck via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Thanks for the reply. I should say I'm actually working on 6.0, but
I
>> don't think this part of the code changed much since. These are
traces I
>> made with GDB optimizing a loop with floats and then integers, showing
>> where they diverge:
>> https://gist.github.com/nlw0/58ed9fda8e8944a9cb5e5a20f6038fcf
>>
>> This is the point I believe we need to set the function attribute for
the
>> vectorization to work with floats:
>>
>>
https://github.com/llvm/llvm-project/blob/fd254e429ea103be8bab6271855c04919d33f9fb/llvm/lib/Analysis/IVDescriptors.cpp#L590
>>
>> Could this be a bug? It seems to me it just wasn't changed yet to
depend
>> only on instruction flags.
>>
>> I would gladly work on refactoring this if there's an opportunity,
but
>> I'm a complete newbie in this project. It would be great to hear
from
>> someone more knowledgeable what can be done about this issue,
especially if
>> turns out to be a very small patch!
>>
>>
>> On Sat, May 4, 2019 at 3:41 PM Finkel, Hal J. <hfinkel at
anl.gov> wrote:
>>
>>> On 5/4/19 6:36 AM, llvm-dev wrote:
>>>
>>> Greetings,
>>>
>>> The LLVM loop vectorizer does a great job handling reductions with
the
>>> `min(a, b)` function over an array of integers or floats. This
finds the
>>> smallest value of a list exploiting SIMD instructions, and works
just as
>>> well as a summation.
>>>
>>> Specifically with floats, though, using the `fcmp` instruction, the
>>> vectorization seems to require the function attribute
"no-nans-fp-math" to
>>> be set. Just setting instruction flags is not enough.
>>>
>>>
>>> fcmp takes fast-math flags now, but that wasn't always true (my
>>> recollection is that was a capability added after the arithmetic
>>> operations). In any case, I wonder if this is just a hold-over from
before
>>> fcmp took fast-math flags, or if this is an && condition
that should be an
>>> || condition.
>>>
>>>
>>> This forces us to give up on fine-grained control of fast-math in
the
>>> code in order to benefit from this vectorization.
>>>
>>> How to overcome this? LLVM has intrinsic functions such as `minnum`
and
>>> `minimum` (`minnan`) that accurately represent the operation. This
could
>>> permit fine-grained control of fast-math flags, although the
vectorizer
>>> seems to ignore these intrinsics.
>>>
>>> Beyond this specific case, it would be nice to be sure when is it
ever
>>> necessary to set these function attributes, e.g.
>>>
>>>
https://github.com/llvm/llvm-project/blob/8205a814a691bfa62fed911b58b0a306ab5efe31/clang/lib/CodeGen/CGCall.cpp#L1743-L1750
>>>
>>> What would be a way to control the vectorization for `min` without
>>> having to rely on that function attribute? And furthermore, could
LLVM
>>> optimizations conceivably depend only on instruction flags, and not
ever on
>>> function attributes? What would be necessary to achieve this?
>>>
>>>
>>> The goal has been to eliminate the dependence on the function
attributes
>>> once all of the necessary local flags are in place. Obviously I
could be
>>> missing something, but this just seems like a bug.
>>>
>>>  -Hal
>>>
>>>
>>> Thanks,
>>>
>>> --
>>> Nicolau Werneck <nwerneck at gmail.com>
>>> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>>>
>>> _______________________________________________
>>> LLVM Developers mailing listllvm-dev at
lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>
>> --
>> Nicolau Werneck <nwerneck at gmail.com>
>> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
-- 
Nicolau Werneck <nwerneck at gmail.com>
http://n <http://nwerneck.sdf.org>ic.hpavc.net
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190506/6f1bc49c/attachment.html>

Reasonably Related Threads

Search for more possibly parallel threads

llvm dev - May 2019 - Vectorizing minimum without function attributes

[llvm-dev] Vectorizing minimum without function attributes

[llvm-dev] Vectorizing minimum without function attributes

[llvm-dev] Vectorizing minimum without function attributes

Reasonably Related Threads