Josh Sharp via llvm-dev
2019-Jan-26 00:15 UTC
[llvm-dev] Different SelectionDAGs for same CPU
Hi Tim,

> That C++ function is probably what looks for a FrameIndex node and
> has been taught that it can be folded into the load.

How do you teach a function that a node can be folded into an instruction?

________________________________
From: Tim Northover <t.p.northover at gmail.com>
Sent: Monday, January 21, 2019 11:52 PM
To: Josh Sharp
Cc: via llvm-dev
Subject: Re: [llvm-dev] Different SelectionDAGs for same CPU

Hi Josh,

On Tue, 22 Jan 2019 at 04:54, Josh Sharp via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> In the first case, node t1 is a separate node whereas in the second case, t1 is inside t4. What difference in implementation could explain this difference in behavior?

The second compiler looks like someone has added extra code to fold a stack address calculation into the load operation that accesses the variable.

> Where in the code should I look into?

It could be implemented in a couple of places. Most likely is that XYZInstrInfo.td (or some related TableGen file) defines a ComplexPattern that is used by the LDWI instruction definition. That ComplexPattern tells pattern matching to call a specific function in XYZISelDAGToDAG.cpp when deciding what to use for the LDWI operands. That C++ function is probably what looks for a FrameIndex node and has been taught that it can be folded into the load.

If you just grep the target's code for FrameIndex or frameindex you should find it pretty quickly though, even if they used some other method. There don't tend to be many uses of that particular node.

Cheers.

Tim.
Tim Northover via llvm-dev
2019-Jan-26 08:15 UTC
[llvm-dev] Different SelectionDAGs for same CPU
On Sat, 26 Jan 2019 at 00:15, Josh Sharp <mm92126 at hotmail.com> wrote:
> > That C++ function is probably what looks for a FrameIndex node and
> > has been taught that it can be folded into the load.
>
> How do you teach a function that a node can be folded into an instruction?

Well, if you look at the SelectAddrModeIndexed function in AArch64ISelDAGToDAG.cpp for example, at the top it checks whether the address we're selecting is an ISD::FrameIndex. If so, it converts it into an equivalent TargetFrameIndex (so that LLVM knows it's already been selected), makes that the base of the address operand, and adds a dummy TargetConstant 0 as the offset operand. Then it returns true to indicate it was able to match part of the DAG for that instruction.

The other key things to look at in that particular example are the am_indexed8 definition, which is where TableGen is taught about that C++ function (well, actually SelectAddrMode8, but that just immediately calls SelectAddrMode with an extra "8" argument), and the definition of LDRB, which uses that am_indexed8 in a pattern.

The definitions are quite a maze of multiclass expansions, so I sometimes find it easier to run llvm-tblgen without a backend. From my build directory:

    bin/llvm-tblgen ../llvm/lib/Target/AArch64/AArch64.td -I ../llvm/include -I ../llvm/lib/Target/AArch64

That expands everything so that you can (say) look at all the parts that make up LDRBui (the key instruction) in one place: all of its operands, patterns, bits, etc.

Cheers.

Tim.
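For readers who want to see the shape of the check described above without opening the AArch64 sources, here is a minimal sketch in plain C++. It is a toy model, not LLVM code: the Opcode enum, the Node struct, and this selectAddrModeIndexed are hypothetical stand-ins invented for illustration. They only mimic the steps Tim lists: test for a frame index, rewrite it to the target form as the base, add a zero offset, and return true.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for SelectionDAG concepts. None of these are LLVM types.
enum class Opcode { FrameIndex, TargetFrameIndex, TargetConstant, Other };

struct Node {
  Opcode opcode;
  std::int64_t value; // frame slot number or constant value
};

// Shaped like a ComplexPattern callback: inspect the address node and, on a
// match, fill in the operands the load should use and return true.
bool selectAddrModeIndexed(const Node &addr, Node &base, Node &offImm) {
  if (addr.opcode == Opcode::FrameIndex) {
    // Fold the stack slot into the load: the target-specific opcode marks
    // the node as already selected, and the offset is a dummy zero.
    base = Node{Opcode::TargetFrameIndex, addr.value};
    offImm = Node{Opcode::TargetConstant, 0};
    return true;
  }
  // Simplification of this sketch: anything else is reported as "no match".
  return false;
}

// Usage example: fold frame slot 0 into a base + offset operand pair.
void checkFrameIndexFold() {
  Node base{Opcode::Other, -1};
  Node offImm{Opcode::Other, -1};
  bool matched =
      selectAddrModeIndexed(Node{Opcode::FrameIndex, 0}, base, offImm);
  assert(matched);
  assert(base.opcode == Opcode::TargetFrameIndex && base.value == 0);
  assert(offImm.opcode == Opcode::TargetConstant && offImm.value == 0);
  (void)matched; // keeps NDEBUG builds warning-free
}
```

In a real backend the equivalent function works on SDValue operands and is hooked up to the instruction through a ComplexPattern in the target's TableGen files, so the target's own sources remain the reference.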
Josh Sharp via llvm-dev
2019-Feb-08 04:36 UTC
[llvm-dev] Different SelectionDAGs for same CPU
Tim,

I was able to fold the stack address calculation into the load operation as you said. Is the approach the same if I want to fold any target instruction into any other target instruction? Specifically, I'm trying to get from this

    t0: ch = EntryToken
    t8: i32 = MOVRI TargetConstant:i32<0>
    t1: i32,i1,i1,i1,i1 = ADDR TargetFrameIndex:i32<0>, t8
    t3: ch,glue = CopyToReg t0, Register:i32 $r4, t1
    t4: ch = JLR Register:i32 $r4, t3, t3:1

to this

    t0: ch = EntryToken
    t1: i32,i1,i1,i1,i1 = ADDR TargetFrameIndex:i32<0>, MOVRI:i32,i1,i1
    t3: ch,glue = CopyToReg t0, Register:i32 $r4, t1
    t4: ch = JLR Register:i32 $r4, t3, t3:1

Thanks.