thr3ads.net - llvm dev - [llvm-dev] LLVM X86 backend combineIncDecVector's transform [Aug 2019]

If this information is useful, please help other people find it:
Share via:

Sanjay Patel via llvm-dev

2019-Aug-26 20:01 UTC

[llvm-dev] LLVM X86 backend combineIncDecVector's transform

No objections from me to make it run later. I didn't see the potential
conflicts when I added that code. Delayed combine, custom lowering, or
DAGToDAGISel all seem like viable options to me.

On Mon, Aug 26, 2019 at 2:04 PM Roman Lebedev <lebedev.ri at gmail.com>
wrote:
> I have previously posted these two patches:
>
> [X86][CodeGen][NFC] Delay `combineIncDecVector()` from DAGCombine to
> X86DAGToDAGISel
> https://reviews.llvm.org/D62327
>
> [DAGCombine][X86][AArch64][AMDGPU][MIPS][PPC] (sub x, c) -> (add x,
> -c) vector edition.
> https://reviews.llvm.org/D62341
>
> While they got stuck since i wasn't really interested in them
> (i'm mostly interested in scalars, not vectors...),
> i don't think there was any fatal roadblocks there,
> so i guess i should rebase them and see what's the status now.
>
> Roman.
>
> On Mon, Aug 26, 2019 at 8:56 PM Topper, Craig <craig.topper at
intel.com>
> wrote:
> >
> > I think DAGToDAG is too late because the build_vector has already been
> turned into a constant pool load by then so it’s a little difficult to get
> back. Maybe we can delay it to !DCI.isBeforeLegalizeOps()? That would at
> least let the first DAG combine and the post type legalization DAG combine
> see the add, 1.
> >
> >
> >
> > +Sanjay as well
> >
> >
> >
> > From: Amaury Séchet <deadalnix at gmail.com>
> > Sent: Monday, August 26, 2019 10:48 AM
> > To: Topper, Craig <craig.topper at intel.com>; llvm-dev at
redking.me.uk;
> efriedma at quicinc.com; lebedev.ri at gmail.com; llvm-dev <
> llvm-dev at lists.llvm.org>
> > Subject: LLVM X86 backend combineIncDecVector's transform
> >
> >
> >
> > Hi all,
> >
> > As you knwo already, I'm trying to change DAGCombiner so that it
process
> the nodes in topological order. Doing so is not difficult per se, but this
> creates various improvements and regression to the existing test suite.
I'd
> like to work through as many of the regressions as possible ahead of time.
> >
> > One source of such regressions is combineIncDecVector in the X86
> backend. It changes (add X, 1) into (sub X, -1) in order to be able to use
> the pcmpeq instruction.
> >
> > This is all well and good, but numerous paterns are matching an add
> rather than a sub, and in fact, DAGCombiner does the inverse transform by
> itself as it consider the add form to be canonical. An example of such
> pattern is the X86ISD::AVG node, but there are more.
> >
> > It seems to me like this transformation is useful, but doesn't
happen at
> the right place in the pipeline. Doing so later on, for instance at the DAG
> to DAG level would likely give DAGCombiner  more opportunities to do its
> job, and also ensure that all instances of the pattern are detected.
> >
> > It would be great if someone familiar with the X86 backend could look
> into this.
> >
> > Thanks in advance,
> >
> > Amaury Séchet
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190826/06d80092/attachment.html>

Craig Topper via llvm-dev

2019-Aug-26 22:16 UTC

head link

[llvm-dev] LLVM X86 backend combineIncDecVector's transform

I pushed it to a later DAG combine in r369981. Let me know if you still
have problems from it.

~Craig


On Mon, Aug 26, 2019 at 1:02 PM Sanjay Patel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> No objections from me to make it run later. I didn't see the potential
> conflicts when I added that code. Delayed combine, custom lowering, or
> DAGToDAGISel all seem like viable options to me.
>
> On Mon, Aug 26, 2019 at 2:04 PM Roman Lebedev <lebedev.ri at
gmail.com>
> wrote:
>
>> I have previously posted these two patches:
>>
>> [X86][CodeGen][NFC] Delay `combineIncDecVector()` from DAGCombine to
>> X86DAGToDAGISel
>> https://reviews.llvm.org/D62327
>>
>> [DAGCombine][X86][AArch64][AMDGPU][MIPS][PPC] (sub x, c) -> (add x,
>> -c) vector edition.
>> https://reviews.llvm.org/D62341
>>
>> While they got stuck since i wasn't really interested in them
>> (i'm mostly interested in scalars, not vectors...),
>> i don't think there was any fatal roadblocks there,
>> so i guess i should rebase them and see what's the status now.
>>
>> Roman.
>>
>> On Mon, Aug 26, 2019 at 8:56 PM Topper, Craig <craig.topper at
intel.com>
>> wrote:
>> >
>> > I think DAGToDAG is too late because the build_vector has already
been
>> turned into a constant pool load by then so it’s a little difficult to
get
>> back. Maybe we can delay it to !DCI.isBeforeLegalizeOps()? That would
at
>> least let the first DAG combine and the post type legalization DAG
combine
>> see the add, 1.
>> >
>> >
>> >
>> > +Sanjay as well
>> >
>> >
>> >
>> > From: Amaury Séchet <deadalnix at gmail.com>
>> > Sent: Monday, August 26, 2019 10:48 AM
>> > To: Topper, Craig <craig.topper at intel.com>; llvm-dev at
redking.me.uk;
>> efriedma at quicinc.com; lebedev.ri at gmail.com; llvm-dev <
>> llvm-dev at lists.llvm.org>
>> > Subject: LLVM X86 backend combineIncDecVector's transform
>> >
>> >
>> >
>> > Hi all,
>> >
>> > As you knwo already, I'm trying to change DAGCombiner so that
it
>> process the nodes in topological order. Doing so is not difficult per
se,
>> but this creates various improvements and regression to the existing
test
>> suite. I'd like to work through as many of the regressions as
possible
>> ahead of time.
>> >
>> > One source of such regressions is combineIncDecVector in the X86
>> backend. It changes (add X, 1) into (sub X, -1) in order to be able to
use
>> the pcmpeq instruction.
>> >
>> > This is all well and good, but numerous paterns are matching an
add
>> rather than a sub, and in fact, DAGCombiner does the inverse transform
by
>> itself as it consider the add form to be canonical. An example of such
>> pattern is the X86ISD::AVG node, but there are more.
>> >
>> > It seems to me like this transformation is useful, but doesn't
happen
>> at the right place in the pipeline. Doing so later on, for instance at
the
>> DAG to DAG level would likely give DAGCombiner  more opportunities to
do
>> its job, and also ensure that all instances of the pattern are
detected.
>> >
>> > It would be great if someone familiar with the X86 backend could
look
>> into this.
>> >
>> > Thanks in advance,
>> >
>> > Amaury Séchet
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190826/0d8070b1/attachment.html>

Amaury Séchet via llvm-dev

2019-Aug-27 11:33 UTC

head link

[llvm-dev] LLVM X86 backend combineIncDecVector's transform

That is a great first step. It makes things better indeed, but still causes
a few regression when nodes processing happens in a different order.

Roman patches seems to be the way to go.

Le mar. 27 août 2019 à 00:18, Craig Topper via llvm-dev <
llvm-dev at lists.llvm.org> a écrit :
> I pushed it to a later DAG combine in r369981. Let me know if you still
> have problems from it.
>
> ~Craig
>
>
> On Mon, Aug 26, 2019 at 1:02 PM Sanjay Patel via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> No objections from me to make it run later. I didn't see the
potential
>> conflicts when I added that code. Delayed combine, custom lowering, or
>> DAGToDAGISel all seem like viable options to me.
>>
>> On Mon, Aug 26, 2019 at 2:04 PM Roman Lebedev <lebedev.ri at
gmail.com>
>> wrote:
>>
>>> I have previously posted these two patches:
>>>
>>> [X86][CodeGen][NFC] Delay `combineIncDecVector()` from DAGCombine
to
>>> X86DAGToDAGISel
>>> https://reviews.llvm.org/D62327
>>>
>>> [DAGCombine][X86][AArch64][AMDGPU][MIPS][PPC] (sub x, c) -> (add
x,
>>> -c) vector edition.
>>> https://reviews.llvm.org/D62341
>>>
>>> While they got stuck since i wasn't really interested in them
>>> (i'm mostly interested in scalars, not vectors...),
>>> i don't think there was any fatal roadblocks there,
>>> so i guess i should rebase them and see what's the status now.
>>>
>>> Roman.
>>>
>>> On Mon, Aug 26, 2019 at 8:56 PM Topper, Craig <craig.topper at
intel.com>
>>> wrote:
>>> >
>>> > I think DAGToDAG is too late because the build_vector has
already been
>>> turned into a constant pool load by then so it’s a little difficult
to get
>>> back. Maybe we can delay it to !DCI.isBeforeLegalizeOps()? That
would at
>>> least let the first DAG combine and the post type legalization DAG
combine
>>> see the add, 1.
>>> >
>>> >
>>> >
>>> > +Sanjay as well
>>> >
>>> >
>>> >
>>> > From: Amaury Séchet <deadalnix at gmail.com>
>>> > Sent: Monday, August 26, 2019 10:48 AM
>>> > To: Topper, Craig <craig.topper at intel.com>; llvm-dev
at redking.me.uk;
>>> efriedma at quicinc.com; lebedev.ri at gmail.com; llvm-dev <
>>> llvm-dev at lists.llvm.org>
>>> > Subject: LLVM X86 backend combineIncDecVector's transform
>>> >
>>> >
>>> >
>>> > Hi all,
>>> >
>>> > As you knwo already, I'm trying to change DAGCombiner so
that it
>>> process the nodes in topological order. Doing so is not difficult
per se,
>>> but this creates various improvements and regression to the
existing test
>>> suite. I'd like to work through as many of the regressions as
possible
>>> ahead of time.
>>> >
>>> > One source of such regressions is combineIncDecVector in the
X86
>>> backend. It changes (add X, 1) into (sub X, -1) in order to be able
to use
>>> the pcmpeq instruction.
>>> >
>>> > This is all well and good, but numerous paterns are matching
an add
>>> rather than a sub, and in fact, DAGCombiner does the inverse
transform by
>>> itself as it consider the add form to be canonical. An example of
such
>>> pattern is the X86ISD::AVG node, but there are more.
>>> >
>>> > It seems to me like this transformation is useful, but
doesn't happen
>>> at the right place in the pipeline. Doing so later on, for instance
at the
>>> DAG to DAG level would likely give DAGCombiner  more opportunities
to do
>>> its job, and also ensure that all instances of the pattern are
detected.
>>> >
>>> > It would be great if someone familiar with the X86 backend
could look
>>> into this.
>>> >
>>> > Thanks in advance,
>>> >
>>> > Amaury Séchet
>>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190827/bf1ba4c3/attachment.html>

llvm dev - Aug 2019 - LLVM X86 backend combineIncDecVector's transform

[llvm-dev] LLVM X86 backend combineIncDecVector's transform

[llvm-dev] LLVM X86 backend combineIncDecVector's transform

[llvm-dev] LLVM X86 backend combineIncDecVector's transform