thr3ads.net - llvm dev - [llvm-dev] canonical form loops [Apr 2020]

Zaks, Ayal (Mobileye) via llvm-dev

2020-Apr-01 10:53 UTC

[llvm-dev] canonical form loops

Interesting, thanks for digging this up!
> As a consequence, any loop structure that is recognized
> by SCEV will (/should) not profit from rewriting.
As discussed in https://reviews.llvm.org/D68577#1742745, and as PR40816
showed, there is still merit and profit in further simplifying loop
induction variables, or at least the primary one, somewhat independently of
continuing to rely on SCEV for analyzing loops.

> enable-iv-rewrite=false was made the default in r139579 after finding
> that it slows down many benchmarks.
This was 8.5 years ago. Is it time to revisit and try to re-enable some of
these IV rewrites, with a better understanding of why current downstream
passes pessimize canonical IVs, if they still do?

> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Michael
> Kruse via llvm-dev
> Sent: Saturday, March 28, 2020 12:33
> To: Sjoerd Meijer <Sjoerd.Meijer at arm.com>
> Cc: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] canonical form loops
> 
> The topic came up before, e.g. https://reviews.llvm.org/D60565#1484984
> 
> > Some canonicalization passes are designed for this. In particular,
> > IndVarSimplify used to make canonical loops (i.e. start at zero,
> > increment by one). r133502 introduced -disable-iv-rewrite to rely more
> > on ScalarEvolution instead of "opcode/pattern matching" (cite from the
> > commit message). -enable-iv-rewrite=false was made the default in
> > r139579 after finding that it slows down many benchmarks. It was
> > completely removed in r153260.
> 
> The general approach in LLVM is to rely on SCEV for analyzing loops instead
> of custom handling. As a consequence, any loop structure that is recognized
> by SCEV will (/should) not profit from rewriting.
> 
> Michael
> 
> 
> 
> 
> On Thu, 26 Mar 2020 at 15:56, Sjoerd Meijer via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > Hello,
> >
> > Quick question to see if I haven't missed anything: I would like to
> > convert counting-down loops, i.e. loops with a constant -1 step value,
> > to counting-up loops, because the vectoriser is able to deal better with
> > these loops (see e.g. D76838, which I was discussing today with Ayal).
> > It looks like LoopSimplifyCFG and IndVarSimplify don't do this. So I was
> > just curious whether I have missed anything here, or in another pass I
> > haven't yet considered. I was perhaps also expecting this to be the
> > canonical form of loops, but couldn't find any evidence of that in [1]
> > or in the source code.
> > The obvious follow-up question is whether there would be any objections
> > to adding this to e.g. LoopSimplifyCFG, and adding LoopSimplifyCFG to
> > the optimisation pipeline just before the vectoriser.
> >
> > Cheers,
> > Sjoerd.
> >
> > [1] https://llvm.org/docs/LoopTerminology.html
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Michael Kruse via llvm-dev

2020-Apr-02 10:50 UTC

[llvm-dev] canonical form loops

SCEV is the de facto right approach for induction variable analysis.
Any pass not using it will be sensitive to irrelevant variations in
patterns. This decision was made years ago, like most design decisions.
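
To illustrate why an analysis based on recurrences is insensitive to the
syntactic shape of the loop, here is a toy model in C. It is only a sketch:
the names (struct addrec, addrec_at, trips_up, trips_down) are made up for
this example and are not LLVM's ScalarEvolution API. It treats an induction
variable as an affine recurrence {start,+,step} and derives the trip count
from that, so a loop counting down from N and a loop counting up to N come
out the same.

```c
#include <assert.h>

/* Toy model (hypothetical, not LLVM code) of an affine recurrence
   {start,+,step}: on iteration k = 0, 1, 2, ... the value is
   start + k * step. */
struct addrec {
    long start;
    long step;
};

long addrec_at(struct addrec r, long k) {
    return r.start + k * r.step;
}

/* Trip count of: for (i = r.start; i < bound; i += r.step), step > 0. */
long trips_up(struct addrec r, long bound) {
    if (r.step <= 0 || r.start >= bound)
        return 0;
    return (bound - r.start + r.step - 1) / r.step;
}

/* Trip count of: for (i = r.start; i > bound; i += r.step), step < 0. */
long trips_down(struct addrec r, long bound) {
    if (r.step >= 0 || r.start <= bound)
        return 0;
    long s = -r.step;
    return (r.start - bound + s - 1) / s;
}

/* Both loop shapes describe the same number of iterations. */
void addrec_demo(void) {
    struct addrec up = {0, 1};     /* for (i = 0;  i < 10; i++) */
    struct addrec down = {10, -1}; /* for (i = 10; i > 0;  i--) */
    assert(trips_up(up, 10) == trips_down(down, 0));
    /* On any iteration k, the two values always add up to 10. */
    assert(addrec_at(up, 3) + addrec_at(down, 3) == 10);
}
```

A pass that only asks "what is the trip count" gets the same answer for both
shapes, without pattern-matching the compare or the increment.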

The alternative is to canonicalize induction variables as
IndVarSimplify used to do. If you would like to change it, feel free
to send an RFC to the llvm-dev list. Justifying this change is the
proposer's obligation, including measurements of compile-time and
test-suite performance changes.

When I looked at it last time, the reason for the slowdowns was that
canonicalizing induction variables introduces yet another induction
variable. A typical example is a loop iterating over a buffer:

for (char *p = start; p < end; ++p)
  foo(p);

Canonicalization yields something like:

for (size_t i = 0; i < (end-start); ++i)
  foo(&start[i]);

That is, a new register is needed for i to determine the number of
iterations, which could otherwise be done using the pointer alone.
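
For anyone who wants to try the two forms side by side, here is a small
self-contained C sketch. The function names (sum_ptr, sum_idx, forms_demo)
and the byte-summing loop body are made up for illustration; they stand in
for foo() above. The pointer form keeps a single induction variable p,
while the canonical form carries the index i and derives the address from
it.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer form: p is the only induction variable, and it is also
   what the exit test compares against. */
unsigned sum_ptr(const char *start, const char *end) {
    unsigned sum = 0;
    for (const char *p = start; p < end; ++p)
        sum += (unsigned char)*p;
    return sum;
}

/* Canonical form: i starts at zero and increments by one. The trip
   count (end - start) is computed up front, and each address is
   derived from start and i. */
unsigned sum_idx(const char *start, const char *end) {
    unsigned sum = 0;
    size_t n = (size_t)(end - start);
    for (size_t i = 0; i < n; ++i)
        sum += (unsigned char)start[i];
    return sum;
}

/* Both forms visit the same bytes in the same order. */
void forms_demo(void) {
    const char buf[] = "abc";
    const char *start = buf;
    const char *end = buf + 3;
    assert(sum_ptr(start, end) == sum_idx(start, end));
}
```

Whether the second form really costs an extra register depends on what later
passes do with it; the sketch only shows that the two are equivalent at the
source level.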

@reames mentioned that LoopStrengthReduce is supposed to undo this
again, but it does not always seem to be successful. I could imagine one
reason is that p would overflow earlier than i.

Michael


Sjoerd Meijer via llvm-dev

2020-Apr-02 15:32 UTC

[llvm-dev] canonical form loops

Thanks Ayal and Michael for sharing these further thoughts.
Slightly moving the discussion from D76838 to here just to keep it in one place.

What I want to achieve in D76838 is to rewrite this:

for (int i=N; i>0; i--)
   foo();

into this:

for (int i=0; i<N; i++)
   foo();

because this enables more tricks in the vectoriser. The vectoriser could of
course also be taught to handle counting-down loops, but we prefer not to do
this and would rather have the more canonical counting-up form. The question
now is where this rewrite should live.
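
When the loop body does use i, the rewrite also has to remap it. The C
sketch below shows the idea at the source level. It is not the
implementation in D76838, and the names (weighted_down, weighted_up,
rewrite_demo) are made up for this example. The counting-up loop uses a new
variable j and recovers the old value as i = n - j, so the body sees the
same values in the same order.

```c
#include <assert.h>

/* Counting-down form: i takes the values n, n-1, ..., 1. The
   accumulator is order-dependent on purpose, so any change in
   iteration order would change the result. */
long weighted_down(const int *a, int n) {
    long acc = 0;
    for (int i = n; i > 0; i--)
        acc = acc * 3 + a[i - 1];
    return acc;
}

/* Counting-up form: j takes the values 0, 1, ..., n-1 and the old
   induction variable is recovered as i = n - j. */
long weighted_up(const int *a, int n) {
    long acc = 0;
    for (int j = 0; j < n; j++) {
        int i = n - j;
        acc = acc * 3 + a[i - 1];
    }
    return acc;
}

/* The two forms agree on a small input. */
void rewrite_demo(void) {
    const int a[] = {1, 2, 3, 4};
    assert(weighted_down(a, 4) == weighted_up(a, 4));
}
```

In the foo() example above the body ignores i, so only the trip count has to
be preserved and the remapping disappears.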

Please note that it is appreciated that SCEV is used for all the heavy
lifting, and if you look at the implementation in D76838 you'll see that it
is very minimal and self-contained. I am reluctant to go for IndVarSimplify
because of the impact it may have, as also raised here on the dev list, and
as this rewrite is to further enable the vectoriser, I would think a helper
in LoopInfo used in the vectoriser is a win-win.

In D76838 I have put the rewrite to a counting-up loop as late as possible,
but at that point vectorisation may still fail. The result might then be that
the loop has been rewritten but not vectorised. That might be a surprise, but
at the same time I don't see a problem with it. And of course, when
vectorisation happens, you will never notice this rewrite.

Please let me know what you think.

Cheers,
Sjoerd.

