thr3ads.net - llvm dev - [llvm-dev] [GlobalISel] Prioritizing long patterns in combiner over short ones [Jan 2021]

If this information is useful, please help other people find it:
Share via:

Daniel Sanders via llvm-dev

2021-Jan-20 18:50 UTC

[llvm-dev] [GlobalISel] Prioritizing long patterns in combiner over short ones

To be honest, hearing it was top-down surprised me too at first but I think I
was remembering the algorithm I wanted to experiment with which required
bottom-up.

I asked Aditya about this and apparently it's to address the pathological
case where we repeatedly do an entire pass only to simplify a single node of the
MIR and slowly propagate that through the MIR. Another reason was because
InstCombine is like this too.

I think we can probably address a fair portion of the pathological case by
controlling how we populate the work list. Instead of adding all MI's to it
each time, we could filter out blocks that can't possibly match on this
round. I'm thinking that a block only needs revisiting on the next pass if
it contains a use of a register that was def'd by an instruction we changed
on the current pass. Therefore the observer could track the vreg defs affected
by a pass and build a list of MBB's to revisit. I think we'd have to
revisit all the successors of those blocks too if the pass is running top-down
(in case we make changes that allow further changes in the same pass) but if we
changed it to bottom-up we wouldn't need that. This wouldn't help with
large blocks but maybe we can find a similar scheme there.
> On Jan 19, 2021, at 22:51, Amara Emerson <amara at apple.com> wrote:
> 
> +Daniel.
> 
> Daniel, do you remember if there was a strong reason for choosing to go
top-down on the combiner?
> 
> Amara
> 
>> On Dec 17, 2020, at 2:14 AM, Jay Foad via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>> 
>> I am really surprised that "the combiner runs top-down".
Surely it's
>> better to combine smaller sub-expressions before combining larger
>> expressions? Or is there actually a good reason to keep the top-down
>> order??
>> 
>> DAGCombiner also runs mostly top-down (but it can also add nodes to
>> the worklist in a bottom-up direction for certain cases). The top-down
>> order causes problems when you have deep expressions trees that could
>> be simplified away to nothing. Outer combines see these subexpression
>> unsimplified. They can use recursive helper functions like
>> computeKnownBits to try to analyse the unsimplified expression, but
>> the recursive helper functions typically have a max recursion depth of
>> 6. which is easy to hit.
>> 
>> DAGCombiner is so embedded now that it is hard to change the basic
>> top-down order, but I would hope we could change the GlobalISel
>> combiner.
>> 
>> Jay.
>> 
>> On Tue, 15 Dec 2020 at 15:09, Dominik Montada via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>> 
>>> Hi,
>>> 
>>> I'm currently writing target specific combiners with
GlobalISel. I have
>>> a case where a sub-node of a larger pattern also matches another,
>>> smaller combiner pattern. Because the combiner runs top-down, the
>>> smaller pattern is matched before the larger pattern has a chance
to be
>>> matched. Do I have to teach my larger pattern to handle this case
or is
>>> there a better way to do this?
>>> 
>>> More importantly, are there any plans to improve this behavior?
>>> 
>>> Cheers,
>>> 
>>> Dominik
>>> 
>>> --
>>>
----------------------------------------------------------------------
>>> Dominik Montada                   Email: dominik.montada at
hightec-rt.com
>>> HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
>>> Europaallee 19                    Fax:   +49-681-92613-26
>>> D-66113 Saarbrücken               WWW: http://www.hightec-rt.com
>>> 
>>> Managing Director: Vera Strothmann
>>> Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222
>>> 
>>> This e-mail may contain confidential and/or privileged information.
If
>>> you are not the intended recipient please notify the sender
immediately
>>> and destroy this e-mail. Any unauthorised copying, disclosure or
>>> distribution of the material in this e-mail is strictly forbidden.
>>> ---
>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Jay Foad via llvm-dev

2021-Jan-21 17:02 UTC

head link

[llvm-dev] [GlobalISel] Prioritizing long patterns in combiner over short ones

I think the terminology is confusing. By "bottom up" I mean visiting
the leaves of an expression tree first like in BURS
(https://en.wikipedia.org/wiki/BURS). This is "bottom up" if you draw
your trees with the root at the top. For the expression (fadd (fmul a,
b), c) it would visit the mul before the add. I think simplifying
things in this order is sane because when you visit a node, you can
assume that the operands of that node have already been simplified.

In my defence I think I got this definition from earlier discussions
about the order in which DAGCombiner runs:
https://reviews.llvm.org/D33587#1372912

Unfortunately if you think of textual IR in a basic block, then
"bottom up" order starts at the top of the block and proceeds towards
the bottom :-(
  %mul = fmul float %a, %b
  %add = fadd float %mul, %c

A quick experiment shows that InstCombine is bottom-up:
$ cat z.ll
define float @main(float %a, float %b, float %c, float %d, float %e) {
  %add = fadd float %a, %b
  %sub = fsub float %add, %c
  %mul = fmul float %sub, %d
  %div = fdiv float %mul, %e
  ret float %div
}
$ opt -instcombine -debug-only=instcombine z.ll
IC: Visiting:   %add = fadd float %a, %b
IC: Visiting:   %sub = fsub float %add, %c
IC: Visiting:   %mul = fmul float %sub, %d
IC: Visiting:   %div = fdiv float %mul, %e

DAGCombiner is top-down, which I think is wrong but it's hard to fix:
$ llc -march=aarch64 -debug-only=dagcombine z.ll
Combining: t14: f32 = fdiv t13, t10
Combining: t13: f32 = fmul t12, t8
Combining: t12: f32 = fsub t11, t6
Combining: t11: f32 = fadd t2, t4

I'm happy to see that GISel Combiner is bottom-up after all:
$ llc -march=aarch64 -global-isel -debug-only=gi-combiner z.ll
Try combining %5:_(s32) = G_FADD %0:_, %1:_
Try combining %6:_(s32) = G_FSUB %5:_, %2:_
Try combining %7:_(s32) = G_FMUL %6:_, %3:_
Try combining %8:_(s32) = G_FDIV %7:_, %4:_


Sorry if I have derailed your original questions Dominik. I think the
answers are "yes you have to teach your larger pattern to handle this
case" and "no there are no plans to improve this behaviour as far as I
know, and I'm not even sure how it would be improved".

Jay.





Jay.

On Wed, 20 Jan 2021 at 18:50, Daniel Sanders <daniel_l_sanders at
apple.com> wrote:>
> To be honest, hearing it was top-down surprised me too at first but I think
I was remembering the algorithm I wanted to experiment with which required
bottom-up.
>
> I asked Aditya about this and apparently it's to address the
pathological case where we repeatedly do an entire pass only to simplify a
single node of the MIR and slowly propagate that through the MIR. Another reason
was because InstCombine is like this too.
>
> I think we can probably address a fair portion of the pathological case by
controlling how we populate the work list. Instead of adding all MI's to it
each time, we could filter out blocks that can't possibly match on this
round. I'm thinking that a block only needs revisiting on the next pass if
it contains a use of a register that was def'd by an instruction we changed
on the current pass. Therefore the observer could track the vreg defs affected
by a pass and build a list of MBB's to revisit. I think we'd have to
revisit all the successors of those blocks too if the pass is running top-down
(in case we make changes that allow further changes in the same pass) but if we
changed it to bottom-up we wouldn't need that. This wouldn't help with
large blocks but maybe we can find a similar scheme there.
>
> > On Jan 19, 2021, at 22:51, Amara Emerson <amara at apple.com>
wrote:
> >
> > +Daniel.
> >
> > Daniel, do you remember if there was a strong reason for choosing to
go top-down on the combiner?
> >
> > Amara
> >
> >> On Dec 17, 2020, at 2:14 AM, Jay Foad via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> >>
> >> I am really surprised that "the combiner runs top-down".
Surely it's
> >> better to combine smaller sub-expressions before combining larger
> >> expressions? Or is there actually a good reason to keep the
top-down
> >> order??
> >>
> >> DAGCombiner also runs mostly top-down (but it can also add nodes
to
> >> the worklist in a bottom-up direction for certain cases). The
top-down
> >> order causes problems when you have deep expressions trees that
could
> >> be simplified away to nothing. Outer combines see these
subexpression
> >> unsimplified. They can use recursive helper functions like
> >> computeKnownBits to try to analyse the unsimplified expression,
but
> >> the recursive helper functions typically have a max recursion
depth of
> >> 6. which is easy to hit.
> >>
> >> DAGCombiner is so embedded now that it is hard to change the basic
> >> top-down order, but I would hope we could change the GlobalISel
> >> combiner.
> >>
> >> Jay.
> >>
> >> On Tue, 15 Dec 2020 at 15:09, Dominik Montada via llvm-dev
> >> <llvm-dev at lists.llvm.org> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I'm currently writing target specific combiners with
GlobalISel. I have
> >>> a case where a sub-node of a larger pattern also matches
another,
> >>> smaller combiner pattern. Because the combiner runs top-down,
the
> >>> smaller pattern is matched before the larger pattern has a
chance to be
> >>> matched. Do I have to teach my larger pattern to handle this
case or is
> >>> there a better way to do this?
> >>>
> >>> More importantly, are there any plans to improve this
behavior?
> >>>
> >>> Cheers,
> >>>
> >>> Dominik
> >>>
> >>> --
> >>>
----------------------------------------------------------------------
> >>> Dominik Montada                   Email: dominik.montada at
hightec-rt.com
> >>> HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
> >>> Europaallee 19                    Fax:   +49-681-92613-26
> >>> D-66113 Saarbrücken               WWW:
http://www.hightec-rt.com
> >>>
> >>> Managing Director: Vera Strothmann
> >>> Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222
> >>>
> >>> This e-mail may contain confidential and/or privileged
information. If
> >>> you are not the intended recipient please notify the sender
immediately
> >>> and destroy this e-mail. Any unauthorised copying, disclosure
or
> >>> distribution of the material in this e-mail is strictly
forbidden.
> >>> ---
> >>>
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>

llvm dev - Jan 2021 - [GlobalISel] Prioritizing long patterns in combiner over short ones

[llvm-dev] [GlobalISel] Prioritizing long patterns in combiner over short ones

[llvm-dev] [GlobalISel] Prioritizing long patterns in combiner over short ones