thr3ads.net - llvm dev - [llvm-dev] Question about canonicalizing cmp+select [Jul 2018]

If this information is useful, please help other people find it:
Share via:

Sanjay Patel via llvm-dev

2018-Jul-03 22:44 UTC

[llvm-dev] Question about canonicalizing cmp+select

I linked the wrong patch review. Here's the patch that was actually
committed:
https://reviews.llvm.org/D48508
https://reviews.llvm.org/rL335433

On Tue, Jul 3, 2018 at 4:39 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> [adding back llvm-dev and cc'ing Craig]
>
> I think you are asking if we are missing a fold (or your target is missing
> enabling another hook) to transform the sext+add into shift+or? I think the
> answer is 'yes'. We probably should add that fold. This seems like
a
> similar case as the recent: https://reviews.llvm.org/D48466
>
> Note that on x86, the sext+add becomes zext+sub:
>         t20: i8 = setcc t3, Constant:i16<-1>, setgt:ch
>       t24: i16 = zero_extend t20
>     t17: i16 = sub Constant:i16<5>, t24
>
> Would that transform help your target?
>
> On Tue, Jul 3, 2018 at 3:55 PM, Yuan Lin <yualin at google.com>
wrote:
>
>> Hi, Roman and Sanjay,
>>
>>   Thank you for your reply!  We currently do run DAGCombiner, but
didn't
>> implement this specific transformation.  I just tried turning on
>> convertSelectOfCosntantsToMath() in our ISelLowering, but that
doesn't
>> quite work because it generated a sign_extend op from i1 to i16, which
our
>> backend currently doesn't support.
>>
>>   Does the DAGCombiner already has this transformation implemented?
>>
>> Thanks,
>> --Yuan
>>
>> On Tue, Jul 3, 2018 at 11:22 AM Sanjay Patel <spatel at
rotateright.com>
>> wrote:
>>
>>> Do you run DAGCombiner? And are you overriding
>>> TLI.convertSelectOfConstantsToMath(VT) for your target?
>>>
>>> For the stated example (true val and false val constants in the
select
>>> differ by 1), that should already be converted.
>>>
>>>
>>> On Tue, Jul 3, 2018 at 12:13 PM, Roman Lebedev <lebedev.ri at
gmail.com>
>>> wrote:
>>>
>>>> On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>> > Hi, Sanjay/all,
>>>> >
>>>> >   I noticed in rL331486 that some compare-select
optimizations are
>>>> disabled
>>>> > in favor of providing canonicalized cmp+select to the
backend.
>>>> >
>>>> >   I am currently working on a private backend target, and
the target
>>>> has a
>>>> > small code size limit.  With this change, some of the apps
went over
>>>> the
>>>> > codesize limit.  As an example,
>>>> >
>>>> > C code:
>>>> >   b = (a > -1) ? 4 : 5;
>>>> >
>>>> > ll code:
>>>> > Before rL331486:
>>>> >   %0 = lshr i16 %a.0.a.0., 15
>>>> >   %1 = or i16 %0, 4
>>>> >
>>>> > After rL331486:
>>>> >   %cmp = icmp sgt i16 %a.0.a.0., -1
>>>> >   %cond = select i1 %cmp, i16 4, i16 5
>>>> >
>>>> >   With the various encoding restrictions of my particular
target, the
>>>> > cmp/select generated slightly larger code size.  However,
because the
>>>> apps
>>>> > were very close to the code size limit, this slight change
pushed
>>>> them over
>>>> > the limit.
>>>> >
>>>> >   If I still prefer to have this optimization performed,
then is my
>>>> best
>>>> > strategy moving forward to implement this optimization as
a peephole
>>>> opt in
>>>> > my backend?
>>>> I personally think it should probably be a DAGCombine
transform,
>>>> driven by a Target Transform Info hook (e.g. like hasAndNot()).
>>>>
>>>> > Thanks,
>>>> > --Yuan
>>>> Roman.
>>>>
>>>> > _______________________________________________
>>>> > LLVM Developers mailing list
>>>> > llvm-dev at lists.llvm.org
>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >
>>>>
>>>
>>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/3f9bbe36/attachment.html>

Craig Topper via llvm-dev

2018-Jul-03 22:46 UTC

head link

[llvm-dev] Question about canonicalizing cmp+select

While its true select convertSelectOfCosntantsToMath() creates add+sext i1,
there's a transform in DAGCombiner::visitADDLike that does "add (sext
i1),
X -> sub X, (zext i1)". Why didn't that fire for your target?

~Craig


On Tue, Jul 3, 2018 at 3:44 PM Sanjay Patel <spatel at rotateright.com>
wrote:
> I linked the wrong patch review. Here's the patch that was actually
> committed:
> https://reviews.llvm.org/D48508
> https://reviews.llvm.org/rL335433
>
> On Tue, Jul 3, 2018 at 4:39 PM, Sanjay Patel <spatel at
rotateright.com>
> wrote:
>
>> [adding back llvm-dev and cc'ing Craig]
>>
>> I think you are asking if we are missing a fold (or your target is
>> missing enabling another hook) to transform the sext+add into shift+or?
I
>> think the answer is 'yes'. We probably should add that fold.
This seems
>> like a similar case as the recent: https://reviews.llvm.org/D48466
>>
>> Note that on x86, the sext+add becomes zext+sub:
>>         t20: i8 = setcc t3, Constant:i16<-1>, setgt:ch
>>       t24: i16 = zero_extend t20
>>     t17: i16 = sub Constant:i16<5>, t24
>>
>> Would that transform help your target?
>>
>> On Tue, Jul 3, 2018 at 3:55 PM, Yuan Lin <yualin at google.com>
wrote:
>>
>>> Hi, Roman and Sanjay,
>>>
>>>   Thank you for your reply!  We currently do run DAGCombiner, but
didn't
>>> implement this specific transformation.  I just tried turning on
>>> convertSelectOfCosntantsToMath() in our ISelLowering, but that
doesn't
>>> quite work because it generated a sign_extend op from i1 to i16,
which our
>>> backend currently doesn't support.
>>>
>>>   Does the DAGCombiner already has this transformation implemented?
>>>
>>> Thanks,
>>> --Yuan
>>>
>>> On Tue, Jul 3, 2018 at 11:22 AM Sanjay Patel <spatel at
rotateright.com>
>>> wrote:
>>>
>>>> Do you run DAGCombiner? And are you overriding
>>>> TLI.convertSelectOfConstantsToMath(VT) for your target?
>>>>
>>>> For the stated example (true val and false val constants in the
select
>>>> differ by 1), that should already be converted.
>>>>
>>>>
>>>> On Tue, Jul 3, 2018 at 12:13 PM, Roman Lebedev <lebedev.ri
at gmail.com>
>>>> wrote:
>>>>
>>>>> On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>> > Hi, Sanjay/all,
>>>>> >
>>>>> >   I noticed in rL331486 that some compare-select
optimizations are
>>>>> disabled
>>>>> > in favor of providing canonicalized cmp+select to the
backend.
>>>>> >
>>>>> >   I am currently working on a private backend target,
and the target
>>>>> has a
>>>>> > small code size limit.  With this change, some of the
apps went over
>>>>> the
>>>>> > codesize limit.  As an example,
>>>>> >
>>>>> > C code:
>>>>> >   b = (a > -1) ? 4 : 5;
>>>>> >
>>>>> > ll code:
>>>>> > Before rL331486:
>>>>> >   %0 = lshr i16 %a.0.a.0., 15
>>>>> >   %1 = or i16 %0, 4
>>>>> >
>>>>> > After rL331486:
>>>>> >   %cmp = icmp sgt i16 %a.0.a.0., -1
>>>>> >   %cond = select i1 %cmp, i16 4, i16 5
>>>>> >
>>>>> >   With the various encoding restrictions of my
particular target, the
>>>>> > cmp/select generated slightly larger code size. 
However, because
>>>>> the apps
>>>>> > were very close to the code size limit, this slight
change pushed
>>>>> them over
>>>>> > the limit.
>>>>> >
>>>>> >   If I still prefer to have this optimization
performed, then is my
>>>>> best
>>>>> > strategy moving forward to implement this optimization
as a peephole
>>>>> opt in
>>>>> > my backend?
>>>>> I personally think it should probably be a DAGCombine
transform,
>>>>> driven by a Target Transform Info hook (e.g. like
hasAndNot()).
>>>>>
>>>>> > Thanks,
>>>>> > --Yuan
>>>>> Roman.
>>>>>
>>>>> > _______________________________________________
>>>>> > LLVM Developers mailing list
>>>>> > llvm-dev at lists.llvm.org
>>>>> >
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> >
>>>>>
>>>>
>>>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/3df7fdef/attachment.html>

Yuan Lin via llvm-dev

2018-Jul-04 01:25 UTC

head link

[llvm-dev] Question about canonicalizing cmp+select

Hi,

  "*While its true select convertSelectOfCosntantsToMath() creates add+sext
i1, there's a transform in DAGCombiner::visitADDLike that does "add
(sext
i1), X -> sub X, (zext i1)". Why didn't that fire for your
target?*"

  That's a good question! We never explicitly marked sext-on-i1 as a custom
op.  And I think the default behavior is "legal" (Am I right on that?)
So
the following transform never gets executed because it thinks sext-on-i1 as
legal op.

  // add (sext i1), X -> sub X, (zext i1)
  if (N0.getOpcode() == ISD::SIGN_EXTEND &&
      N0.getOperand(0).getValueType() == MVT::i1 &&
      !TLI.isOperationLegal(ISD::SIGN_EXTEND, MVT::i1)) {
    SDValue ZExt = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, N0.getOperand(0));
    return DAG.getNode(ISD::SUB, DL, VT, N1, ZExt);
  }

  I will try to change the isel behavior sext-on-i1, and see if that fixes
the issue.


On Tue, Jul 3, 2018 at 3:47 PM Craig Topper <craig.topper at gmail.com>
wrote:
> While its true select convertSelectOfCosntantsToMath() creates add+sext
> i1, there's a transform in DAGCombiner::visitADDLike that does
"add (sext
> i1), X -> sub X, (zext i1)". Why didn't that fire for your
target?
>
> ~Craig
>
>
> On Tue, Jul 3, 2018 at 3:44 PM Sanjay Patel <spatel at
rotateright.com>
> wrote:
>
>> I linked the wrong patch review. Here's the patch that was actually
>> committed:
>> https://reviews.llvm.org/D48508
>> https://reviews.llvm.org/rL335433
>>
>> On Tue, Jul 3, 2018 at 4:39 PM, Sanjay Patel <spatel at
rotateright.com>
>> wrote:
>>
>>> [adding back llvm-dev and cc'ing Craig]
>>>
>>> I think you are asking if we are missing a fold (or your target is
>>> missing enabling another hook) to transform the sext+add into
shift+or? I
>>> think the answer is 'yes'. We probably should add that
fold. This seems
>>> like a similar case as the recent: https://reviews.llvm.org/D48466
>>>
>>> Note that on x86, the sext+add becomes zext+sub:
>>>         t20: i8 = setcc t3, Constant:i16<-1>, setgt:ch
>>>       t24: i16 = zero_extend t20
>>>     t17: i16 = sub Constant:i16<5>, t24
>>>
>>> Would that transform help your target?
>>>
>>> On Tue, Jul 3, 2018 at 3:55 PM, Yuan Lin <yualin at
google.com> wrote:
>>>
>>>> Hi, Roman and Sanjay,
>>>>
>>>>   Thank you for your reply!  We currently do run DAGCombiner,
but
>>>> didn't implement this specific transformation.  I just
tried turning on
>>>> convertSelectOfCosntantsToMath() in our ISelLowering, but that
doesn't
>>>> quite work because it generated a sign_extend op from i1 to
i16, which our
>>>> backend currently doesn't support.
>>>>
>>>>   Does the DAGCombiner already has this transformation
implemented?
>>>>
>>>> Thanks,
>>>> --Yuan
>>>>
>>>> On Tue, Jul 3, 2018 at 11:22 AM Sanjay Patel <spatel at
rotateright.com>
>>>> wrote:
>>>>
>>>>> Do you run DAGCombiner? And are you overriding
>>>>> TLI.convertSelectOfConstantsToMath(VT) for your target?
>>>>>
>>>>> For the stated example (true val and false val constants in
the select
>>>>> differ by 1), that should already be converted.
>>>>>
>>>>>
>>>>> On Tue, Jul 3, 2018 at 12:13 PM, Roman Lebedev
<lebedev.ri at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev
>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> > Hi, Sanjay/all,
>>>>>> >
>>>>>> >   I noticed in rL331486 that some compare-select
optimizations are
>>>>>> disabled
>>>>>> > in favor of providing canonicalized cmp+select to
the backend.
>>>>>> >
>>>>>> >   I am currently working on a private backend
target, and the
>>>>>> target has a
>>>>>> > small code size limit.  With this change, some of
the apps went
>>>>>> over the
>>>>>> > codesize limit.  As an example,
>>>>>> >
>>>>>> > C code:
>>>>>> >   b = (a > -1) ? 4 : 5;
>>>>>> >
>>>>>> > ll code:
>>>>>> > Before rL331486:
>>>>>> >   %0 = lshr i16 %a.0.a.0., 15
>>>>>> >   %1 = or i16 %0, 4
>>>>>> >
>>>>>> > After rL331486:
>>>>>> >   %cmp = icmp sgt i16 %a.0.a.0., -1
>>>>>> >   %cond = select i1 %cmp, i16 4, i16 5
>>>>>> >
>>>>>> >   With the various encoding restrictions of my
particular target,
>>>>>> the
>>>>>> > cmp/select generated slightly larger code size. 
However, because
>>>>>> the apps
>>>>>> > were very close to the code size limit, this
slight change pushed
>>>>>> them over
>>>>>> > the limit.
>>>>>> >
>>>>>> >   If I still prefer to have this optimization
performed, then is my
>>>>>> best
>>>>>> > strategy moving forward to implement this
optimization as a
>>>>>> peephole opt in
>>>>>> > my backend?
>>>>>> I personally think it should probably be a DAGCombine
transform,
>>>>>> driven by a Target Transform Info hook (e.g. like
hasAndNot()).
>>>>>>
>>>>>> > Thanks,
>>>>>> > --Yuan
>>>>>> Roman.
>>>>>>
>>>>>> > _______________________________________________
>>>>>> > LLVM Developers mailing list
>>>>>> > llvm-dev at lists.llvm.org
>>>>>> >
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/66e2d5dd/attachment.html>

llvm dev - Jul 2018 - Question about canonicalizing cmp+select

[llvm-dev] Question about canonicalizing cmp+select

[llvm-dev] Question about canonicalizing cmp+select

[llvm-dev] Question about canonicalizing cmp+select