thr3ads.net - llvm dev - [llvm-dev] RFC: should CVP always narrow the width of lshr? [May 2018]

If this information is useful, please help other people find it:
Share via:

Roman Lebedev via llvm-dev

2018-May-23 18:53 UTC

[llvm-dev] RFC: should CVP always narrow the width of lshr?

Hi.

In https://reviews.llvm.org/D46760,
Bixia Zheng notes that given the pattern like [1]
  %za = zext i32 %a to i64
  %udiv = udiv i64 %za, <C>
  %urem = urem i64 %za, <C>
  %uadd = add i64 %udiv, %urem
  ret i64 %uadd

the CVP will reduce the width of udiv/urem,
without considerations for the new instructions,
or whether the new width is legal for the target.
That was implemented in https://reviews.llvm.org/D44102

As Sanjay Patel notes in
https://reviews.llvm.org/D47113#1106601,> We overlooked these questions in the specific case of div/rem
> (D44102) because we assumed that narrower div/rem are always
> better for analysis and codegen <cut>
Now, there is a caveat.
If <C> is constant, and is a power-of-two, instcombine
will transform udiv to lshr, and urem to and.
And CVP pass does not narrow the width of those opcodes.
And instcombine won't narrow them either, because the
zext has multiple uses / that would create more instructions,
which is illegal for instcombine.

So [1] may or may not be handled properly...

There are multiple solutions, and more than one
solution can be implemented, i suppose.
1. https://reviews.llvm.org/D46760
    Somehow relax instcombine, to check for this very specific pattern.
    I don't think we want that.
2. Extend aggressive-instcombine to narrow a sequence of arbitrary
    binops (and maybe even more than binops). This is probably
    the most powerful, but i'm not quite sure how it would look..
3. Teach CVP to always narrow the width of lshr.
    https://reviews.llvm.org/D47113
    As Sanjay Patel notes in https://reviews.llvm.org/D47113#1106601,
    > <cut> but I doubt that we can extend that reasoning to all binops
    > on all targets. This will fight with transforms in instcombine
    > (see canEvaluateSExtd / canEvaluateZExtd).
    But, this is already sufficient to handle the [1].

Thus, as suggested, i'm sending this RFC.
What do people think about "CVP should always narrow lshr" approach?
Or is that likely to be too bad, and some more concrete approach,
like the aggressive-instcombine extension is highly preferable?

Roman.

Maybe Matching Threads

Search for more maybe matching threads

llvm dev - May 2018 - RFC: should CVP always narrow the width of lshr?

[llvm-dev] RFC: should CVP always narrow the width of lshr?

Maybe Matching Threads

Wisdom of the Ancients