Amaury Séchet via llvm-dev
2019-Aug-26 17:48 UTC
[llvm-dev] LLVM X86 backend combineIncDecVector's transform
Hi all, As you knwo already, I'm trying to change DAGCombiner so that it process the nodes in topological order. Doing so is not difficult per se, but this creates various improvements and regression to the existing test suite. I'd like to work through as many of the regressions as possible ahead of time. One source of such regressions is combineIncDecVector in the X86 backend. It changes (add X, 1) into (sub X, -1) in order to be able to use the pcmpeq instruction. This is all well and good, but numerous paterns are matching an add rather than a sub, and in fact, DAGCombiner does the inverse transform by itself as it consider the add form to be canonical. An example of such pattern is the X86ISD::AVG node, but there are more. It seems to me like this transformation is useful, but doesn't happen at the right place in the pipeline. Doing so later on, for instance at the DAG to DAG level would likely give DAGCombiner more opportunities to do its job, and also ensure that all instances of the pattern are detected. It would be great if someone familiar with the X86 backend could look into this. Thanks in advance, Amaury Séchet -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190826/785eed4d/attachment.html>
Topper, Craig via llvm-dev
2019-Aug-26 17:56 UTC
[llvm-dev] LLVM X86 backend combineIncDecVector's transform
I think DAGToDAG is too late because the build_vector has already been turned into a constant pool load by then so it’s a little difficult to get back. Maybe we can delay it to !DCI.isBeforeLegalizeOps()? That would at least let the first DAG combine and the post type legalization DAG combine see the add, 1. +Sanjay as well From: Amaury Séchet <deadalnix at gmail.com> Sent: Monday, August 26, 2019 10:48 AM To: Topper, Craig <craig.topper at intel.com>; llvm-dev at redking.me.uk; efriedma at quicinc.com; lebedev.ri at gmail.com; llvm-dev <llvm-dev at lists.llvm.org> Subject: LLVM X86 backend combineIncDecVector's transform Hi all, As you knwo already, I'm trying to change DAGCombiner so that it process the nodes in topological order. Doing so is not difficult per se, but this creates various improvements and regression to the existing test suite. I'd like to work through as many of the regressions as possible ahead of time. One source of such regressions is combineIncDecVector in the X86 backend. It changes (add X, 1) into (sub X, -1) in order to be able to use the pcmpeq instruction. This is all well and good, but numerous paterns are matching an add rather than a sub, and in fact, DAGCombiner does the inverse transform by itself as it consider the add form to be canonical. An example of such pattern is the X86ISD::AVG node, but there are more. It seems to me like this transformation is useful, but doesn't happen at the right place in the pipeline. Doing so later on, for instance at the DAG to DAG level would likely give DAGCombiner more opportunities to do its job, and also ensure that all instances of the pattern are detected. It would be great if someone familiar with the X86 backend could look into this. Thanks in advance, Amaury Séchet -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190826/f38eb7c8/attachment.html>
Roman Lebedev via llvm-dev
2019-Aug-26 18:03 UTC
[llvm-dev] LLVM X86 backend combineIncDecVector's transform
I have previously posted these two patches: [X86][CodeGen][NFC] Delay `combineIncDecVector()` from DAGCombine to X86DAGToDAGISel https://reviews.llvm.org/D62327 [DAGCombine][X86][AArch64][AMDGPU][MIPS][PPC] (sub x, c) -> (add x, -c) vector edition. https://reviews.llvm.org/D62341 While they got stuck since i wasn't really interested in them (i'm mostly interested in scalars, not vectors...), i don't think there was any fatal roadblocks there, so i guess i should rebase them and see what's the status now. Roman. On Mon, Aug 26, 2019 at 8:56 PM Topper, Craig <craig.topper at intel.com> wrote:> > I think DAGToDAG is too late because the build_vector has already been turned into a constant pool load by then so it’s a little difficult to get back. Maybe we can delay it to !DCI.isBeforeLegalizeOps()? That would at least let the first DAG combine and the post type legalization DAG combine see the add, 1. > > > > +Sanjay as well > > > > From: Amaury Séchet <deadalnix at gmail.com> > Sent: Monday, August 26, 2019 10:48 AM > To: Topper, Craig <craig.topper at intel.com>; llvm-dev at redking.me.uk; efriedma at quicinc.com; lebedev.ri at gmail.com; llvm-dev <llvm-dev at lists.llvm.org> > Subject: LLVM X86 backend combineIncDecVector's transform > > > > Hi all, > > As you knwo already, I'm trying to change DAGCombiner so that it process the nodes in topological order. Doing so is not difficult per se, but this creates various improvements and regression to the existing test suite. I'd like to work through as many of the regressions as possible ahead of time. > > One source of such regressions is combineIncDecVector in the X86 backend. It changes (add X, 1) into (sub X, -1) in order to be able to use the pcmpeq instruction. > > This is all well and good, but numerous paterns are matching an add rather than a sub, and in fact, DAGCombiner does the inverse transform by itself as it consider the add form to be canonical. An example of such pattern is the X86ISD::AVG node, but there are more. > > It seems to me like this transformation is useful, but doesn't happen at the right place in the pipeline. Doing so later on, for instance at the DAG to DAG level would likely give DAGCombiner more opportunities to do its job, and also ensure that all instances of the pattern are detected. > > It would be great if someone familiar with the X86 backend could look into this. > > Thanks in advance, > > Amaury Séchet