thr3ads.net - llvm dev - [llvm-dev] How to best deal with undesirable Induction Variable Simplification? [Aug 2019]


Danila Malyutin via llvm-dev

2019-Aug-13 15:25 UTC

[llvm-dev] How to best deal with undesirable Induction Variable Simplification?

I've noticed that there was an attempt to mitigate the ExitValues problem in
https://reviews.llvm.org/D12494 that went nowhere. Were there particular issues
with that approach?

--
Danila

From: Philip Reames [mailto:listmail at philipreames.com]
Sent: Saturday, August 10, 2019 02:05
To: Danila Malyutin <Danila.Malyutin at synopsys.com>; Finkel, Hal J.
<hfinkel at anl.gov>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] How to best deal with undesirable Induction Variable
Simplification?

On 8/9/19 8:27 AM, Danila Malyutin via llvm-dev wrote:
Hi Hal,

I see. So LSR could theoretically counteract undesirable Ind Var transformations
but it's not implemented at the moment?

I think I've managed to come up with a small reproducer that also
exhibits a similar problem on x86. Here it is:
https://godbolt.org/z/_wxzut

As you can see, when rewriteLoopExitValues is not disabled, Clang generates worse
code due to additional spills, because IndVars rewrites all exit values of
'a' to recompute its value instead of reusing the value from the
loop body. This requires extra registers for the new "a after the
loop" value (since it's not simply reused) and also to store the new
"offset", which leads to extra spills since they are all live across the
big loop body. When exit values are not rewritten, 'a' stays in its
`r15d` register at no extra cost.

This hits on a point I've thought some about, but haven't tried to
implement.

I think there might be room for a late pass which undoes the exit value
rewriting.  As an analogy, we have MachineLICM which sometimes undoes the
transforms performed by LICM, but we still want the IR form to hoist
aggressively for ease of optimization and analysis.

Maybe this should be part of LSR, or maybe separate.  Haven't thought about
that part extensively.

It's worth noting that the SCEVs for the exit value of the value inside the
loop and the rewritten exit value should be identical.  So recognizing the case
for potential rewriting is quite straight-forward.  The profitability reasoning
might be more involved, but the legality part should essentially be handled by
SCEV, and should be able to reuse exactly the same code as RLEV.
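
To make the "identical SCEVs" point concrete, here is a minimal toy sketch. It is not LLVM's ScalarEvolution API: `AffineRec`, `sameRecurrence`, `carriedExitValue`, `closedFormExitValue` and `isUndoCandidate` are hypothetical names invented for illustration, and the model only covers a simple affine recurrence {start,+,step} with a known trip count. The idea it illustrates is that if the in-loop value and the rewritten exit value are described by the same recurrence, they agree on exit, so the rewritten value is a candidate for being replaced by the loop-carried one.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an affine add recurrence {start,+,step}.
// The value on iteration k is start + step * k. NOT LLVM's SCEV class.
struct AffineRec {
  std::int64_t start;
  std::int64_t step;

  std::int64_t valueAt(std::uint32_t k) const {
    return start + step * static_cast<std::int64_t>(k);
  }
};

// Structural comparison, standing in for "the two SCEVs are identical".
bool sameRecurrence(const AffineRec &a, const AffineRec &b) {
  return a.start == b.start && a.step == b.step;
}

// Exit value obtained by carrying the variable around the loop.
std::int64_t carriedExitValue(const AffineRec &rec, std::uint32_t trips) {
  std::int64_t v = rec.start;
  for (std::uint32_t i = 0; i < trips; ++i)
    v += rec.step;
  return v;
}

// Exit value recomputed in closed form after the loop.
std::int64_t closedFormExitValue(const AffineRec &rec, std::uint32_t trips) {
  return rec.valueAt(trips);
}

// Hypothetical recognition step: a rewritten exit value may be replaced by
// the loop-carried value when both are described by the same recurrence.
// Profitability (register pressure, etc.) is deliberately not modelled.
bool isUndoCandidate(const AffineRec &inLoop, const AffineRec &rewritten,
                     std::uint32_t trips) {
  if (!sameRecurrence(inLoop, rewritten))
    return false;
  // Identical recurrences must yield identical exit values.
  assert(carriedExitValue(inLoop, trips) ==
         closedFormExitValue(rewritten, trips));
  return true;
}
```

The sketch covers only the recognition side. Deciding whether undoing the rewrite actually pays off is the harder part noted above and is left out.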

--
Danila

From: Finkel, Hal J. [mailto:hfinkel at anl.gov]
Sent: Thursday, August 8, 2019 21:24
To: Danila Malyutin <Danila.Malyutin at synopsys.com>
Subject: Re: [llvm-dev] How to best deal with undesirable Induction Variable
Simplification?

Hi, Danila,

Regarding the first case, this is certainly a problem that has come up before.
As I recall, and I believe this is still true, LoopStrengthReduce, where we
reason about induction variables while accounting for register pressure,
won't currently add new PHIs. People have talked about extending LSR to
consider adding new PHIs in the past.
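
As a source-level illustration of what an "extra induction variable" (an additional PHI at the IR level) looks like, here is a minimal sketch. The functions `sumDerivedIndex` and `sumExtraIV` are hypothetical examples, not code from the thread, and an optimizing compiler is free to turn either form into the other. The sketch only shows the two shapes being discussed: one IV with a derived index, versus a second loop-carried variable that costs a register but does not depend on the multiply.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One induction variable: the strided index is derived from i each time,
// so the load depends on the multiply i * stride.
long sumDerivedIndex(const std::vector<int> &a, std::size_t stride) {
  assert(stride > 0);  // a zero stride would never terminate
  long sum = 0;
  for (std::size_t i = 0; i * stride < a.size(); ++i)
    sum += static_cast<long>(a[i * stride]) * static_cast<long>(i);
  return sum;
}

// Two induction variables: idx is carried separately (a second PHI in IR
// terms). It occupies an extra register, but the load address no longer
// depends on i.
long sumExtraIV(const std::vector<int> &a, std::size_t stride) {
  assert(stride > 0);
  long sum = 0;
  std::size_t idx = 0;
  for (std::size_t i = 0; idx < a.size(); ++i, idx += stride)
    sum += static_cast<long>(a[idx]) * static_cast<long>(i);
  return sum;
}
```

Both functions compute the same result. Which form is faster depends on the target's register budget and scheduling constraints, which is exactly the trade-off discussed above.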

Regarding the second case, could you post a more-detailed description? I
don't quite understand the issue.

 -Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Danila
Malyutin via llvm-dev <llvm-dev at lists.llvm.org>
Sent: Thursday, August 8, 2019 12:36 PM
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] How to best deal with undesirable Induction Variable
Simplification?

Hello,
Recently I've come across two instances where Induction Variable
Simplification led to noticeable performance regressions.

In one case, the removal of an extra IV led to the inability to reschedule
instructions in a tight loop to reduce stalls. In that case, there were enough
registers to spare, so using an extra register for an extra induction variable
was preferable since it reduced dependencies in the loop.
In the second case, there was a big nested loop made even bigger after
unswitching. However, the inner loop body was rather simple, of the form:

loop {
  p+=n;
  ...
  p+=n;
  ...
}
use p.

Due to unswitching there were several such loops, each with a different number
of p+=n ops, so when the IndVars pass rewrote all exit values, it added a lot of
slightly different offsets to the main loop header that couldn't fit in the
available registers, which led to unnecessary spills/reloads.
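
The effect described above can be sketched at the source level. This is a hand-written approximation, not the output of IndVars: the names `exitValueReused`, `exitOffset`, `exitValueRecomputed` and `checkVariantsAgree` are hypothetical, the template parameter `Bumps` stands in for the number of p+=n operations in each unswitched copy, and the rest of the loop body is omitted. Each variant needs its own offset (n * trips, 2 * n * trips, 3 * n * trips, ...), and in the rewritten form those offsets have to be kept live alongside the original p.

```cpp
#include <cassert>
#include <cstdint>

// Exit value taken directly from the loop: p after the loop is simply the
// last value computed inside it. Bumps models the number of p+=n ops.
template <int Bumps>
std::int64_t exitValueReused(std::int64_t p, std::int64_t n,
                             std::uint32_t trips) {
  for (std::uint32_t i = 0; i < trips; ++i)
    for (int b = 0; b < Bumps; ++b)
      p += n;
  return p;
}

// Offset that a rewritten exit value needs for a given variant. Each
// variant yields a different offset, and each one must be held somewhere.
template <int Bumps>
std::int64_t exitOffset(std::int64_t n, std::uint32_t trips) {
  return Bumps * n * static_cast<std::int64_t>(trips);
}

// Exit value recomputed from the initial p and a precomputed offset,
// which is roughly the shape produced by exit value rewriting.
template <int Bumps>
std::int64_t exitValueRecomputed(std::int64_t p0, std::int64_t n,
                                 std::uint32_t trips) {
  return p0 + exitOffset<Bumps>(n, trips);
}

// Both forms agree for every variant; only their register cost differs.
void checkVariantsAgree(std::int64_t p0, std::int64_t n, std::uint32_t trips) {
  assert(exitValueReused<1>(p0, n, trips) == exitValueRecomputed<1>(p0, n, trips));
  assert(exitValueReused<2>(p0, n, trips) == exitValueRecomputed<2>(p0, n, trips));
  assert(exitValueReused<3>(p0, n, trips) == exitValueRecomputed<3>(p0, n, trips));
}
```

With three variants, the rewritten form keeps p0 plus three distinct offsets available, while the reused form needs only the running p. Whether that actually causes spills depends on the target and the rest of the loop.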

I am wondering what the usual strategy is for dealing with such
"pessimizations"? Is it possible to somehow modify the IndVarSimplify
pass to take those issues into account (for example, tell it that adding offset
computation + gep is potentially more expensive than simply reusing the last var
from the loop) or should it be recovered in some later pass? If so, is there an
easy way to revert IV elimination? Has anyone dealt with similar issues before?

--

Danila

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Philip Reames via llvm-dev

2019-Aug-13 16:01 UTC

[llvm-dev] How to best deal with undesirable Induction Variable Simplification?

Wasn't aware of this patch.  No, I don't see an obvious reason why it 
wasn't followed up on.

Philip


Danila Malyutin via llvm-dev

2019-Aug-16 13:44 UTC

[llvm-dev] How to best deal with undesirable Induction Variable Simplification?

Thanks. I've rebased this patch on top of recent LLVM (it was
straightforward) and applied it in my fork.
It seems to have solved one of the problems I was having. Would LLVM be
interested if I submitted the updated version for review?

--
Danila

