thr3ads.net - llvm dev - [LLVMdev] Naryreassociate vs reassociate [May 2015]

If this information is useful, please help other people find it:
Share via:

Jingyue Wu

2015-May-05 20:09 UTC

[LLVMdev] Naryreassociate vs reassociate

On Tue, May 5, 2015 at 10:31 AM, Daniel Berlin <dberlin at dberlin.org>
wrote:
> On Tue, May 5, 2015 at 10:20 AM, Jingyue Wu <jingyue at google.com>
wrote:
> > Hi Daniel,
> >
> > I presume you mean, instead of assigning function arguments distinct
> ranks
> >
(http://llvm.org/docs/doxygen/html/Reassociate_8cpp_source.html#l00282),
> we
> > should group function arguments in favor of existing pairings.
>
> Existing = pairings reassociate already chose before
> *not*
> existing = pairings that already exist in the source IR
>
> Given that, we should probably group everything in favor of existing
> pairings when possible.
>
Makes sense.

>
>
> > You are not
> > suggesting discarding the entire ranking system, right?
>
> The only three cases that should matter hugely are constants,
> arguments, and non-movable instructions.
>
> The rest should hopefully already end up with consistent decisions.
> If not, we have larger problems.
>
>
> > I'll look into how that works on my benchmarks. AFAIK, we
encountered
> some
> > cases that seem beyond the fix you suggested. These cases involve
> constants,
> > and I attached one reduced example in
> > https://llvm.org/bugs/show_bug.cgi?id=22357.
> >
>
> > void foo(int a, int b, int c, int *x, int *y) {
> >   *x = (a + b);
> >   *y = (a + 2) + b;
> > }
> >
> > Reassociate assigns constants a lower rank than variables, which
prevents
> > Reassociate from transforming the above example to
> >
> > *x = a + b;
> > *y = (a + b) + 2;
> >
>
> This is a variant of the problem above, except you are getting ranks
> 0, 1, 2 vs 1, 2
>
The key difference is that constants are designed to be lowest ranked, and
then the current reassociation algorithm always groups the constant 2 with
other variables. Looks like your new solution will favor existing pairings
regardless of the ranks. Then, it should be able to solve the a+b+2 case
nicely.

>
> It thus chooses 0,1 as one tree, and 2 as the other for the A+b+2 case.
>
> If we made it make consistent decisions, you would get the right results.
>
> There is an optimal order to make these decisions in, too.  It
> probably should be making decisions in order of smallest computation
> tree to largest computation tree, because the small trees have the
> least freedom.  Otherwise, you have the same problem if you write this
> as:
>  *y = (a + 2) + b;
>  *x = a + b;
> instead :)
>
> (IE if you don't process smallest to largest, you will choose a+2+b,
> and not be able to make a decision consistent with that when you get
> to the a+b tree)
>
> The above change should not be difficult. I'll take a crack at it
> today and send you a patch.
>
>
>Great! Thank you! Things will be clearer when I see your patch.

>
> > Jingyue
> >
> > On Mon, May 4, 2015 at 4:45 PM Daniel Berlin <dberlin at
dberlin.org>
> wrote:
> >>
> >> Whoops, forgot llvmdev
> >>
> >>
> >> On Mon, May 4, 2015 at 4:12 PM, Daniel Berlin <dberlin at
dberlin.org>
> wrote:
> >> > So i started by looking at naryreassociate, whose pass
> >> > description/reason listed for doing it is actually describes
bug in
> >> > reassociate, and discovered that, in fact, reassociate seems
broken,
> >> > and should be doing the right thing on most of your
testcases.
> >> >
> >> > Let's take nary-add.ll, left_reassociate
> >> >
> >> > ; RUN: opt < %s -nary-reassociate -S | FileCheck %s
> >> >
> >> > target datalayout =
"e-i64:64-v16:16-v32:32-n16:32:64"
> >> >
> >> > declare void @foo(i32)
> >> >
> >> > ; foo(a + c);
> >> > ; foo((a + (b + c));
> >> > ;   =>
> >> > ; t = a + c;
> >> > ; foo(t);
> >> > ; foo(t + b);
> >> > define void @left_reassociate(i32 %a, i32 %b, i32 %c) {
> >> >   %1 = add i32 %a, %c
> >> >   call void @foo(i32 %1)
> >> >   %2 = add i32 %b, %c
> >> >   %3 = add i32 %a, %2
> >> >   call void @foo(i32 %3)
> >> >   ret void
> >> > }
> >> >
> >> >
> >> > normal reassociate transforms this, IMHO, wrongly, into:
> >> >
> >> > define void @left_reassociate(i32 %a, i32 %b, i32 %c) {
> >> >   %1 = add i32 %c, %a
> >> >   call void @foo(i32 %1)
> >> >   %2 = add i32 %b, %a
> >> >   %3 = add i32 %2, %c
> >> >   call void @foo(i32 %3)
> >> >   ret void
> >> > }
> >> >
> >> >
> >> > This is because for the first expression:
> >> >
> >> > RAIn: add i32 [ %a, #3] [ %c, #5]
> >> > RAOut: add i32 [ %c, #5] [ %a, #3]
> >> >
> >> >
> >> > and
> >> > for the second:
> >> > RAIn: add i32 [ %a, #3] [ %b, #4] [ %c, #5]
> >> > RAOut: add i32 [ %c, #5] [ %b, #4] [ %a, #3]
> >> >
> >> >
> >> > This makes it transform the first into add c, a
> >> > and the second into
> >> > %1 = add b, a
> >> > add c, %1
> >> >
> >> >
> >> > This is caused by these arguments having different ranks
(because they
> >> > are function arguments), and it not respecting the same order
it has
> >> > already chosen once.
> >> >
> >> >
> >> > This is, IMHO, pretty clearly a bug.
> >> >
> >> > It screws up right_reassociate, no_reassociate, and
conditional tests
> >> > for the same reason.
> >> >
> >> > If you fix this, i expect you will find a lot less use cases
for
> >> > nary-reassociate.
> >> >
> >> >
> >> > The simple way to fix this is to mark which operands it's
paired
> >> > together in the past, and always try to pair the same ones
together if
> >> > it can, regardless of rank.
> >>
> >> Or simply adjust the ranks of paired operands to be the same and
> >> different from all other ranks
> >>
> >> (BBrank is << 16, so you should have room do to this)
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150505/d10900a7/attachment.html>

Jingyue Wu

2015-May-08 17:35 UTC

head link

[LLVMdev] Naryreassociate vs reassociate

Hi Daniel,

I tried your patch. It indeed solves some (but not all) of the issues
NaryReassociate aims to address. One thing NaryReassociate will work on but
Reassociate will probably miss is to transform

p = &input[a]
q = &input[a + b]

to

p = &input[a]
q = p + b;

I don't see an easy way to merge this into Reassociate unless we peep into
GEPs. However, that doesn't say I don't like your approach.

That said, I would definitely like to see your patch in, and thanks for
working on it. Some general comments on the implementation:

1. Instead of hoisting the rank of last pairing, can we assign them a
minimum value (e.g. -1)? This allows us to maintain the original sorting
order (highest first) and at the same time group last paired values.

2. LastPairingMap doesn't reflect dominance. If you have
a+c
and
a+b+c
but a+c doesn't dominate a+b+c, it might not be beneficial to put a and c
next to each other. I suspect a simple way of making LastPairingMap per-bb
will work for most cases.

3. FYI, Ken Kennedy et al. published a paper (
http://dl.acm.org/citation.cfm?id=1454120) on a global reassociation
algorithm, which shares the same spirit as your approach. Instead of
considering last pairings, their algorithm collects more history and tends
to group the pairs that appear most frequently. It's much more expensive
than your approach, so I don't recommend to use their approach yet (unless
there's enough need). Just let you know some work explored this area.

Jingyue

On Tue, May 5, 2015 at 2:00 PM Daniel Berlin <dberlin at dberlin.org>
wrote:
> On Tue, May 5, 2015 at 1:09 PM, Jingyue Wu <jingyue at google.com>
wrote:
> >
> >
> > On Tue, May 5, 2015 at 10:31 AM, Daniel Berlin <dberlin at
dberlin.org>
> wrote:
> >>
> >> On Tue, May 5, 2015 at 10:20 AM, Jingyue Wu <jingyue at
google.com> wrote:
> >> > Hi Daniel,
> >> >
> >> > I presume you mean, instead of assigning function arguments
distinct
> >> > ranks
> >> > (
> http://llvm.org/docs/doxygen/html/Reassociate_8cpp_source.html#l00282),
> >> > we
> >> > should group function arguments in favor of existing
pairings.
> >>
> >> Existing = pairings reassociate already chose before
> >> *not*
> >> existing = pairings that already exist in the source IR
> >>
> >> Given that, we should probably group everything in favor of
existing
> >> pairings when possible.
> >
> >
> > Makes sense.
> >
> >>
> >>
> >>
> >> > You are not
> >> > suggesting discarding the entire ranking system, right?
> >>
> >> The only three cases that should matter hugely are constants,
> >> arguments, and non-movable instructions.
> >>
> >> The rest should hopefully already end up with consistent
decisions.
> >> If not, we have larger problems.
> >>
> >>
> >> > I'll look into how that works on my benchmarks. AFAIK, we
encountered
> >> > some
> >> > cases that seem beyond the fix you suggested. These cases
involve
> >> > constants,
> >> > and I attached one reduced example in
> >> > https://llvm.org/bugs/show_bug.cgi?id=22357.
> >> >
> >>
> >> > void foo(int a, int b, int c, int *x, int *y) {
> >> >   *x = (a + b);
> >> >   *y = (a + 2) + b;
> >> > }
> >> >
> >> > Reassociate assigns constants a lower rank than variables,
which
> >> > prevents
> >> > Reassociate from transforming the above example to
> >> >
> >> > *x = a + b;
> >> > *y = (a + b) + 2;
> >> >
> >>
> >> This is a variant of the problem above, except you are getting
ranks
> >> 0, 1, 2 vs 1, 2
> >
> >
> > The key difference is that constants are designed to be lowest ranked,
> and
> > then the current reassociation algorithm always groups the constant 2
> with
> > other variables. Looks like your new solution will favor existing
> pairings
> > regardless of the ranks. Then, it should be able to solve the a+b+2
case
> > nicely.
>
>
> So, right now, it rewrites it in optimally bad order to have this
> happen.  Because it puts highest ranked values first, it is guaranteed
> that it will end up putting them not next to each other.
>
> This is because in the ops == 2 case, it places the first two into the
> same expression
> In the ops > 2 case, it'll place the first two into different
expressions.
>
> So if you have
> a+c
> and
> a+b+c
>
> You are essentially guaranteed it will never put a and c next to each
> other in both cases, unless a and c are the lowest ranked elements.
>
> For the moment, i reversed the sort order, and we reassign ranks so
> that things that it has processed before should end up next to each
> other.
>
> Because the sort is stable, you are only guaranteed they will end up
> in the same expression, not necessarily ordered exactly the same way
> they were last time.
>
> This is fixable, but probably not worth it.
>
> The patch is not perfect (and it is only lightly tested), but i'd love
> to see how it goes.
>
> Some issues
> 1.  Because i reversed the sort order, it may do worse in other cases.
> I can fix the sort order reversal by changing how we assign ranks slightly.
> Basically, when we look for ops to pair with, we reassign the ranks to
> be greater than the maximum rank for that bb (so it doesn't conflict
> with anything).
>
> 2. It will only pair it the same way as the immediately last time it
> paired it.  I doubt there is anything much better unless you spend the
> time you do in naryreassociate.
>
>
> Honestly, after staring at this long enough, i'm pretty convinced we
> should take an approach similar to naryreassociate's underpinnings in
> reassociate.
>
> In particular, i think we should deliberately unbalance the tree if we
> can guarantee it will expose a redundancy (IE it will be an expression
> we know is computed by one of our dominators).
>
> This logic can be incorporated pretty easily into reassociate
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150508/5d1a9ee3/attachment.html>

llvm dev - May 2015 - [LLVMdev] Naryreassociate vs reassociate

[LLVMdev] Naryreassociate vs reassociate

[LLVMdev] Naryreassociate vs reassociate