thr3ads.net - llvm dev - [llvm-dev] failing to optimize boolean ops on cmps [Jul 2017]

If this information is useful, please help other people find it:
Share via:

Hal Finkel via llvm-dev

2017-Jul-14 15:45 UTC

[llvm-dev] failing to optimize boolean ops on cmps

On 07/14/2017 10:27 AM, Daniel Berlin wrote:>
>
> On Thu, Jul 13, 2017 at 6:18 PM, Hal Finkel <hfinkel at anl.gov 
> <mailto:hfinkel at anl.gov>> wrote:
>
>
>     On 07/13/2017 06:40 PM, Daniel Berlin via llvm-dev wrote:
>>     Ah yes, it can only return one instruction and you need two.
>>
>>     NewGVN could still handle it if instsimplify was changed to
>>     return the right binary operators because it has infrastructure
>>     that can handle insertion.
>>     (it would need a little work).
>>
>>
>>     (At this point, InstCombine seems to have become worse to me than
>>     GCC's combining infrastructure, so i'd really like to see
if we
>>     can stop putting things there. Otherwise, we just go farther down
>>     the path of 'let's  just make everything one big graph
rewrite)
>
>     Perhaps something of a digression:
>
>     My general impression is that if InstCombine were actually doing a
>     principled graph rewrite, then we wouldn't have these problems.
>
>
> Define "these problems"?
>
> :)
> I mean, you are right it would be better in general.
> But the notion of adding more and more graph rewrites to a single part 
> of the compiler, to get second/third/etc order effects, is endless.  
> Pretty much everything can be expressed as a set of graph rewrites.  
> There is no principled stopping point.  IE You have to decide what you 
> want each pass to do, and if you have one pass whose job is "perform 
> all possible graph rewrites", that pass is in fact, your compiler :)
>
> That's not such a bad thing, mind you, there are entire 
> libraries/languages centered around this (see, e.g., stratego, 
> https://en.wikipedia.org/wiki/Stratego/XT and 
>
http://www.metaborg.org/en/latest/source/langdev/meta/lang/stratego/strategoxt/index.html),
> but it should be a deliberate thing.
>
>     e problem we have, however, is that we have phase-ordering
>     problems (even within the fixed-point iteration scheme): Patterns
>     being matched depend on a canonical form, instructions nearly
>     matching those patterns might be created, and then suboptimally
>     consumed (i.e. further transformed), before being subject to the
>     canonicalizing transformation(s) that would make the better
>     pattern actually match.
>
>     Except for TableGen'ing the whole thing, and making an actual
>     graph automaton to match the patterns, it's not clear to me how to
>     efficiently fix this
>
>
> That is what people used to do, make parsing grammars, etc.
> Stratego is also  quite efficient at term rewriting, from what i remember.
AFAIK, that's true.
>
>     . In the mean time, what we've been doing in practice, is that we
>     have InstCombine patterns depend on canonical forms, until we hit
>     some phase-ordering problem, and then we extend the pattern in
>     question to match some non-canonical forms. We might do this here
>     as well (if that's the problem here), but at some point I suspect
>     we'll end up redoing everything.
>
>
> The question i would have is "how do you know when you are
done?". As
> I said, pretty much everything can be expressed as a graph transform, 
> and the more it gets added here, the more people want to run 
> instcombine 50 times because cse/gvn/you name it "doesn't do what
they
> want".
>
I think you're done when you get everything that you have that fits into 
a model of local transformations and that should be run to a fixed 
point. Certainly an interesting question, but I suspect that the 
practical answer is "what's in InstCombine." To do more in this
sense
probably requires some more-general scheme (e.g. equality saturation).

  -Hal

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170714/39cede7d/attachment-0001.html>

Daniel Berlin via llvm-dev

2017-Jul-14 16:54 UTC

head link

[llvm-dev] failing to optimize boolean ops on cmps

>
>
> I think you're done when you get everything that you have that fits
into a
> model of local transformations and that should be run to a fixed point.
>
Sure, but to point out what you certainly already know, that's almost every
optimization you can think of in a compiler.
For example, I can do PRE this way with no issue (IE generate optimal
results).
Same with reassociation, most CSE, etc.

A very large set of global stuff can be done by "local transform repeated
until fixpoint is reached". It's just not optimal time-wise.

Certainly an interesting question, but I suspect that the practical
answer> is "what's in InstCombine."
>
At least to me, that's clearly the wrong answer, because instcombine
already does a lot of non-local stuff that could be done better elsewhere.

> To do more in this sense probably requires some more-general scheme (e.g.
> equality saturation).
>
>  -Hal
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170714/ff5762e8/attachment.html>

Sanjay Patel via llvm-dev

2017-Jul-14 17:44 UTC

head link

[llvm-dev] failing to optimize boolean ops on cmps

On Fri, Jul 14, 2017 at 10:54 AM, Daniel Berlin <dberlin at dberlin.org>
wrote:
>
>
>> I think you're done when you get everything that you have that fits
into
>> a model of local transformations and that should be run to a fixed
point.
>>
>
> Sure, but to point out what you certainly already know, that's almost
> every optimization you can think of in a compiler.
> For example, I can do PRE this way with no issue (IE generate optimal
> results).
> Same with reassociation, most CSE, etc.
>
> A very large set of global stuff can be done by "local transform
repeated
> until fixpoint is reached". It's just not optimal time-wise.
>
> Certainly an interesting question, but I suspect that the practical answer
>> is "what's in InstCombine."
>>
>
> At least to me, that's clearly the wrong answer, because instcombine
> already does a lot of non-local stuff that could be done better elsewhere.
>
Let me bring the discussion back to this particular example. This *is* a
local optimization; it's the most basic 2-variable boolean logic
optimization that's not an instsimplify (where we return a constant or an
existing value).

I see 3 possibilities (let me know if there are others):
1. Hard: Declare instcombine bankrupt and closed for business because it
has already gone over the cliff. We should not add the new matcher type. We
should go further and remove the existing combine [ ((~A & B) | A) -> (A
|
B) ] too because that's the responsibility of some other pass. This might
be the best theoretical solution, but practically impossible until someone
reinvents instcombine as multiple different passes that have the same
optimization power that we have today?

2. Medium: Take the suggestion from PR27431 (
https://bugs.llvm.org/show_bug.cgi?id=27431 ) to have CSE or GVN replace an
inverted compare predicate with an xor of an existing compare. I think this
would mean abandoning https://reviews.llvm.org/D35182 and maybe removing
that transform altogether to avoid conflicting with the expected canonical
form that includes an xor. How this plays out with the rest of
instcombine's existing folds and other IR transforms is not clear to me yet.

3. Easy: Add the new matcher and use it. It's then one 2-line patch after
another to ensure that instcombine won't miss these pattern variants no
matter what form they come in with. We acknowledge the added mass to
instcombine, but live with it for the perf gains...and wait for the now
better-performing machines to rewrite the whole thing for us. :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170714/96d0572a/attachment.html>

Apparently Analagous Threads

Search for more reasonably related threads

llvm dev - Jul 2017 - failing to optimize boolean ops on cmps

[llvm-dev] failing to optimize boolean ops on cmps

[llvm-dev] failing to optimize boolean ops on cmps

[llvm-dev] failing to optimize boolean ops on cmps

Apparently Analagous Threads