[+CC Andy]

Hi Elena,

I don't have any fundamental issues with teaching SCEV about floating
point types, but given this will be a major change, I think a high
level roadmap should be discussed on llvm-dev before we start
reviewing and committing changes.

Here are some issues that I think are worth discussing:

 - Core motivation: why do we even care about optimizing floating
   point induction variables? What situations are they common in? Do
   programmers _expect_ compilers to optimize them well? (I haven't
   worked on our vectorizers, so pardon the possibly stupid question.)
   In the example you gave, why do you need SCEV to analyze the
   increment to vectorize the loop (i.e. how does it help)? What are
   some other concrete cases you'll want to optimize?

 - I presume you'll want SCEV expressions for `sitofp` and `uitofp`.
   (The most important question:) With these in the game, what is the
   canonical representation of SCEV expressions that can be expressed
   as, say, both `sitofp(A + B)` and `sitofp(A) + sitofp(B)`?

   Will we have a way to mark expressions (like we have `nsw` and
   `nuw` for `sext` and `zext`) over which we can distribute `sitofp`
   and `uitofp`?

   Same questions for `fptosi` and `fptoui`.

 - How will you partition the logic between floating point and integer
   expressions in SCEV-land? Will you have, say, `SCEVAddExpr` do
   different things based on type, or will you split it into
   `SCEVIAddExpr` and `SCEVFAddExpr`? [0]

   * There are likely to be similarities too -- e.g. the "inductive"
     or "control flow" aspect of `SCEVAddRecExpr` is likely to be
     common between floating point add recurrences [1] and integer
     add recurrences; and part of figuring out the partitioning is
     also figuring out how to re-use these bits of logic.

[0]: I'd prefer the latter since, e.g., integer addition is
associative but floating point addition isn't; and it is better to
force programmers to handle the two operations differently.

[1]: For instance, things like this:
https://github.com/llvm-mirror/llvm/blob/master/lib/Analysis/ScalarEvolution.cpp#L7564
are likely to stay common between floating point and integer add recs.

-- Sanjoy
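To make the question in [0] concrete, here is a minimal hand-written C++
sketch of the second option. SCEVIAddExpr and SCEVFAddExpr are the
hypothetical names from the question above, not classes that exist in LLVM,
and the layout is purely illustrative:

  // Hypothetical sketch only: SCEVIAddExpr/SCEVFAddExpr do not exist in
  // LLVM. It illustrates splitting integer and FP adds into separate node
  // kinds so that integer-only rewrites (e.g. reassociation under nsw/nuw)
  // cannot silently apply to FP nodes.
  #include <vector>

  struct SCEV {
    enum Kind { IAdd, FAdd /* ... other expression kinds ... */ };
    Kind K;
    explicit SCEV(Kind K) : K(K) {}
    virtual ~SCEV() = default;
  };

  // Integer add: associative/commutative, may carry nsw/nuw-style flags.
  struct SCEVIAddExpr : SCEV {
    std::vector<const SCEV *> Ops;
    bool HasNSW = false, HasNUW = false;
    SCEVIAddExpr() : SCEV(IAdd) {}
  };

  // FP add: not associative in general; any reassociation or distribution
  // of sitofp/uitofp would need explicit fast-math-style flags here.
  struct SCEVFAddExpr : SCEV {
    std::vector<const SCEV *> Ops;
    bool AllowReassoc = false; // e.g. only set under fast-math
    SCEVFAddExpr() : SCEV(FAdd) {}
  };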
> On May 16, 2016, at 2:42 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> I don't have any fundamental issues with teaching SCEV about floating
> point types, but given this will be a major change, I think a high
> level roadmap should be discussed on llvm-dev before we start
> reviewing and committing changes.

I don't think it makes sense to consider a floating point induction
variable an "add recurrence" unless the inductive value can be promoted
to an integer. (The computational properties of chained recurrences are
defined on integers.)

IndVarSimplify's handleFloatingPointIV is supposed to promote your FP
induction variables to integers so that SCEV can analyze them. After
all the loop opts run, LSR is supposed to clean up any casts during
optimizeShadowIV.

-Andy
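For reference, a hand-written before/after sketch of the promotion Andy
describes; this is illustrative C++ rather than actual IndVarSimplify
output, and the real transform (roughly) only applies when the FP start,
step, and exit values are integer-valued constants:

  // Before: floating point induction variable.
  void before(float *A) {
    for (double d = 0.0; d < 100.0; d += 1.0)
      A[(int)d] = 1.0f;
  }

  // After: the IV is an integer, so SCEV can analyze it; LSR's
  // optimizeShadowIV can later clean up casts that feed any FP uses.
  void after(float *A) {
    for (int i = 0; i < 100; ++i)
      A[i] = 1.0f;
  }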
> On May 16, 2016, at 3:03 PM, Andrew Trick <atrick at apple.com> wrote:
>
> IndVarSimplify's handleFloatingPointIV is supposed to promote your FP
> induction variables to integers so that SCEV can analyze them. After
> all the loop opts run, LSR is supposed to clean up any casts during
> optimizeShadowIV.

That said, there are probably many reasons (e.g. a non-integer initial
value) why you can't promote an FP IV to an integer, but you may still
want to vectorize the loop in fast-math mode. I'll leave it to MichaelZ
and Adam to comment on how best to prepare such loops for vectorization.

-Andy
Demikhovsky, Elena via llvm-dev
2016-May-17 10:14 UTC
[llvm-dev] Working on FP SCEV Analysis
Hi Sanjoy,

Please see my answers below:

> - Core motivation: why do we even care about optimizing floating
>   point induction variables? What situations are they common in? Do
>   programmers _expect_ compilers to optimize them well? In the example
>   you gave, why do you need SCEV to analyze the increment to vectorize
>   the loop (i.e. how does it help)? What are some other concrete cases
>   you'll want to optimize?

[Demikhovsky, Elena] I gave an example of a loop that can be vectorized
in fast-math mode. The ICC compiler vectorizes loops with *primary* and
*secondary* IVs.

This is an example of a *primary* induction:

  (1) for (float i = 0.5; i < 0.75; i += 0.05) {}   // i is a "primary" IV

And of a *secondary* one:

  (2) for (int i = 0, float x = start; i < N; i++, x += delta) {}   // x is a "secondary" IV

Right now I'm working only on (2).

> - I presume you'll want SCEV expressions for `sitofp` and `uitofp`.

[Demikhovsky, Elena] I'm adding these expressions, of course. They are
similar to "truncate" and "zext" in terms of implementation.

>   (The most important question:) With these in the game, what is the
>   canonical representation of SCEV expressions that can be expressed
>   as, say, both `sitofp(A + B)` and `sitofp(A) + sitofp(B)`?

[Demikhovsky, Elena] Meanwhile I have (start + delta * sitofp(i)). I
don't know how far we can go with FP simplification and under what
flags. The first implementation does not assume that sitofp(A + B) is
equal to sitofp(A) + sitofp(B).

>   Will we have a way to mark expressions (like we have `nsw` and
>   `nuw` for `sext` and `zext`) which we can distribute `sitofp` and
>   `uitofp` over?

[Demikhovsky, Elena] I assume that sitofp and uitofp should be 2
different operations.

>   Same questions for `fptosi` and `fptoui`.

[Demikhovsky, Elena] The same answer as above, because I don't want to
combine these operations.

> - How will you partition the logic between floating and integer
>   expressions in SCEV-land? Will you have, say, `SCEVAddExpr` do
>   different things based on type, or will you split it into
>   `SCEVIAddExpr` and `SCEVFAddExpr`? [0]

[Demikhovsky, Elena] Yes, I'm introducing SCEVFAddExpr and SCEVFMulExpr
for (start + delta * sitofp(i)).

>   * There are likely to be similarities too -- e.g. the "inductive"
>     or "control flow" aspect of `SCEVAddRecExpr` is likely to be
>     common between floating point add recurrences[1], and integer add
>     recurrences; and part of figuring out the partitioning is also
>     figuring out how to re-use these bits of logic.

[Demikhovsky, Elena] I'm adding SCEVFAddRecExpr to describe the
recurrence of the FP IV.

> [0]: I'll prefer the latter since e.g. integer addition is
> associative, but floating point addition isn't; and it is better to
> force programmers to handle the two operations differently.
>
> [1]: For instance, things like this:
> https://github.com/llvm-mirror/llvm/blob/master/lib/Analysis/ScalarEvolution.cpp#L7564
> are likely to stay common between floating point and integer add recs.
>
> -- Sanjoy
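For illustration only (not from Elena's patch): a hand-written C++ sketch of
what analyzing the secondary IV as (start + delta * sitofp(i)) enables,
namely replacing the loop-carried FP add chain with its closed form. The
function and its strip-mined inner loop are made up for the example and
stand in for real vector code; the rewrite is only valid under
fast-math-style reassociation:

  void widened(float *out, float start, float delta, int N) {
    // Scalar form carries x as a secondary FP IV:
    //   for (int i = 0; i < N; i++, x += delta) out[i] = x;
    // Vectorized-by-4 form uses the closed form x(i) = start + delta * i:
    int i = 0;
    for (; i + 4 <= N; i += 4) {
      for (int lane = 0; lane < 4; ++lane)   // stands in for one vector op
        out[i + lane] = start + delta * (float)(i + lane);
    }
    for (; i < N; ++i)                       // scalar remainder loop
      out[i] = start + delta * (float)i;
  }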
> What situations are they common in?

The ICC vectorizer made a paradigm shift a while ago: if there isn't a
clear reason why something can't be vectorized, we should try our best
to vectorize. The rest is a performance modeling (and implementation
priority) question, not a capability question. We believe this is a good
paradigm to follow in vectorizer development. It was a big departure
from "vectorize only when everything looks nice to the vectorizer".

We shouldn't give up on vectorizing simply because the programmer wrote
FP induction code. (*) The next question is then what the best solution
for that problem is, and extending SCEV appears to be one of the obvious
directions to explore.

Thanks,
Hideki Saito
Intel Compilers and Languages

----------------------
(*) Quick (and dirty) overview of vectorization legality:
Vectorization is a cross-iteration optimization, so we need a solution
for cross-iteration dependences. Forward dependences are considered
"safe for vectorization" since the vector execution order naturally
satisfies them. Backward dependences are unsafe unless the vectorizer
knows how to "break" them. Induction is a cyclic dependence by nature
and as such is considered unsafe for vectorization, unless the
vectorizer knows how to break it. [Given a CFG that executes from top
to bottom, a forward dependence is a downward data dependence edge.]
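A hand-written illustration of the legality overview in (*); the loops and
array names are made up for the example, not taken from the thread:

  void forward_dep(float *A, int N) {
    // A[i] is written before a later iteration reads it through A[i - 1]:
    // a forward (downward) dependence. Executing iterations in vector
    // groups, statement by statement, still satisfies it.
    for (int i = 1; i < N; ++i) {
      A[i] = 2.0f;
      A[i - 1] += 1.0f;
    }
  }

  void backward_dep(float *A, int N) {
    // A[i] depends on A[i - 1] produced by the previous iteration: a
    // backward (cyclic) dependence the vectorizer must break or give up on.
    for (int i = 1; i < N; ++i)
      A[i] = A[i - 1] + 1.0f;
  }

  void fp_induction(float *out, float start, float delta, int N) {
    // An FP induction is a cyclic dependence on x itself; it becomes
    // vectorizable once it can be rewritten through its closed form under
    // fast-math, which is where FP SCEV analysis would help.
    float x = start;
    for (int i = 0; i < N; ++i, x += delta)
      out[i] = x;
  }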
> On May 16, 2016, at 2:42 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> - Core motivation: why do we even care about optimizing floating
>   point induction variables? What situations are they common in? Do
>   programmers _expect_ compilers to optimize them well?

Graphics shading languages like GLSL or HLSL have floating point types
as the "default" types. Integers weren't even added until later
revisions of GLSL. In that world, it's not especially strange to
imagine a loop counter written in floating point.

--Owen
Chandler Carruth via llvm-dev
2016-May-18 04:27 UTC
[llvm-dev] Working on FP SCEV Analysis
On Tue, May 17, 2016 at 8:49 PM Owen Anderson <resistor at mac.com> wrote:

> Graphics shading languages like GLSL or HLSL have floating point types
> as the "default" types. Integers weren't even added until later
> revisions of GLSL. In that world, it's not especially strange to
> imagine a loop counter written in floating point.

But most of those are convertible to integer IVs, which, as Andy pointed
out up the thread, is something IndVarSimplify tries to do.

A potential way to show a motivating use case is what Andy already
outlined: cases where non-integer values are fundamentally used as part
of the IV.

Even then, I'd personally want to see further evidence of why the
correct solution is to model the floating point IV in SCEV rather than
to find a more powerful way of converting the IV to an integer that
models the non-integer values taken on by the IV.

As an example, if the use case is the following code with appropriate
flags to relax IEEE semantics so this looks like normal algebra, etc.:

  for (float f = 0.01f; f < 1.0f; f += 0.01f)
    ...

I'd rather see us cleverly turn it into:

  float f = 0.01f;
  for (int i = 1; i < 100; i += 1, f += 0.01f)
    ...

(or whatever the transform would be; I've not spent lots of time
thinking about the exact way to map this onto a synthetic "adding that
value N times crosses the threshold, so let's replace the IV with a
counter to N")

-Chandler
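A rough sketch of the trip-count computation such a transform would need;
this is hand-written helper code, not an existing LLVM utility, and it
assumes exact (fast-math style) algebra:

  #include <cmath>

  // For a loop of the form
  //   for (float f = start; f < bound; f += step)   // step > 0
  // compute, assuming exact algebra, the smallest N with start + N*step >=
  // bound; the loop can then be driven by an integer counter from 0 to N.
  // With real IEEE arithmetic the accumulated rounding in `f += step` can
  // shift the count by an iteration, which is exactly the kind of subtlety
  // such a transform would have to reason about.
  long long syntheticTripCount(double start, double bound, double step) {
    if (!(start < bound))
      return 0;
    return (long long)std::ceil((bound - start) / step);
  }

For Chandler's example, exact algebra gives 99 iterations (f = 0.01 through
0.99), matching the rewritten loop where i runs from 1 to 99.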