Stephen Rogers via llvm-dev
2017-Feb-14 11:27 UTC
[llvm-dev] Ensuring chain dependencies with expansion to libcalls
Hi all,

Our target does not have native support for 64-bit integers, so we rely on library calls for certain operations (like sdiv). We recently ran into a problem where these operations that are expanded to library calls aren't maintaining the proper ordering in relation to other chains in the DAG.

The following snippet of a DAG demonstrates the problem.

  t0: ch = EntryToken
  t2: i64,ch,glue = CopyFromReg t0, Register:i64 %reg0
  t4: i64,ch,glue = CopyFromReg t2:1, Register:i64 %reg1, t2:1
  t6: i64,ch,glue = CopyFromReg t4:1, Register:i64 %reg2, t4:1
  t8: i64,ch,glue = CopyFromReg t6:1, Register:i64 %reg3, t6:1
  t11: ch = CopyToReg t0, Register:i64 %vreg0, t2
  t13: ch = CopyToReg t0, Register:i64 %vreg1, t4
  t15: ch = CopyToReg t0, Register:i64 %vreg2, t8
  t26: ch = TokenFactor t11, t13, t15, t2:1, t4:1, t6:1, t8:1
  t16: i64 = sdiv t2, t4

Before legalization, there is a single sdiv node. After legalization, this has been expanded to a call sequence:

  t0: ch = EntryToken
  t2: i64,ch,glue = CopyFromReg t0, Register:i64 %reg0
  t4: i64,ch,glue = CopyFromReg t2:1, Register:i64 %reg1, t2:1
  t6: i64,ch,glue = CopyFromReg t4:1, Register:i64 %reg2, t4:1
  t8: i64,ch,glue = CopyFromReg t6:1, Register:i64 %reg3, t6:1
  t46: ch,glue = callseq_start t0, TargetConstant:i32<0>
  t47: ch,glue = CopyToReg t46, Register:i64 %reg0, t2
  t48: ch,glue = CopyToReg t47, Register:i64 %reg1, t4, t47:1
  t50: ch,glue = SHAVEISD::CALL t48, TargetExternalSymbol:i32'__divdi3', Register:i64 %reg0, Register:i64 %reg1, RegisterMask:Untyped, t48:1
  t51: ch,glue = callseq_end t50, TargetConstant:i32<0>, TargetExternalSymbol:i32'__divdi3', t50:1
  t52: i64,ch,glue = CopyFromReg t51, Register:i64 %reg0, t51:1
  t11: ch = CopyToReg t0, Register:i64 %vreg0, t2
  t13: ch = CopyToReg t0, Register:i64 %vreg1, t4
  t15: ch = CopyToReg t0, Register:i64 %vreg2, t8
  t26: ch = TokenFactor t11, t13, t15, t2:1, t4:1, t6:1, t8:1

Since the sdiv node is not part of a chain, the EntryToken is used as the starting point of the chain for the library call. This means that the ordering between the sequence of CopyFromReg nodes and the call sequence can be changed. Specifically, we are seeing the two copies from %reg2 and %reg3 being moved to after the call sequence. This is a problem since the library call clobbers both of these registers (since they are caller-saved registers).

If I manually call __divdi3 instead of using / in the source code that generates this DAG, then we get the following snippet:

  t0: ch = EntryToken
  t2: i64,ch,glue = CopyFromReg t0, Register:i64 %reg0
  t4: i64,ch,glue = CopyFromReg t2:1, Register:i64 %reg1, t2:1
  t6: i64,ch,glue = CopyFromReg t4:1, Register:i64 %reg2, t4:1
  t8: i64,ch,glue = CopyFromReg t6:1, Register:i64 %reg3, t6:1
  t9: ch = TokenFactor t2:1, t4:1, t6:1, t8:1
  t19: ch,glue = callseq_start t9, TargetConstant:i32<0>
  t20: ch,glue = CopyToReg t19, Register:i64 %reg0, t2
  t21: ch,glue = CopyToReg t20, Register:i64 %reg1, t4, t20:1
  t23: ch,glue = SHAVEISD::CALL t21, TargetGlobalAddress:i32<i64 (i64, i64)* @__divdi3> 0, Register:i64 %reg0, Register:i64 %reg1, RegisterMask:Untyped, t21:1
  t24: ch,glue = callseq_end t23, TargetConstant:i32<0>, TargetGlobalAddress:i32<i64 (i64, i64)* @__divdi3> 0, t23:1
  t25: i64,ch,glue = CopyFromReg t24, Register:i64 %reg0, t24:1
  t11: ch = CopyToReg t0, Register:i64 %vreg0, t2
  t13: ch = CopyToReg t0, Register:i64 %vreg1, t4
  t15: ch = CopyToReg t0, Register:i64 %vreg2, t8
  t31: ch = TokenFactor t11, t13, t15, t25:1

LLVM correctly uses a TokenFactor to pull the CopyFromReg nodes together and uses it as the input chain to the call sequence.

Is there a way that I can tell LLVM that i64 sdiv operations have side-effects for our target and require an input chain?

Thanks,
Stephen
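The ordering problem in the two dumps above can be illustrated with a small stand-alone model. This is only a sketch under simplifying assumptions: it tracks chain operands alone (no glue, no data edges), and `ChainGraph`, `orderedAfter`, `libcallExpansion` and `explicitCall` are hypothetical names invented for illustration, not LLVM API. Node names are taken from the dumps.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of chain edges: each node maps to the nodes whose chain
// result it consumes. Hypothetical helper type, not part of LLVM.
using ChainGraph = std::map<std::string, std::vector<std::string>>;

// Returns true if `later` transitively depends on `earlier` through chain
// operands, i.e. a scheduler would have to keep `earlier` first.
bool orderedAfter(const ChainGraph &g, const std::string &later,
                  const std::string &earlier) {
  std::set<std::string> seen;
  std::vector<std::string> work{later};
  while (!work.empty()) {
    std::string n = work.back();
    work.pop_back();
    if (n == earlier) return true;
    if (!seen.insert(n).second) continue;  // already visited
    auto it = g.find(n);
    if (it == g.end()) continue;
    for (const auto &op : it->second) work.push_back(op);
  }
  return false;
}

// Chain edges after the sdiv libcall expansion: callseq_start (t46)
// hangs directly off the EntryToken (t0).
ChainGraph libcallExpansion() {
  return {{"t0", {}},     {"t2", {"t0"}}, {"t4", {"t2"}},
          {"t6", {"t4"}}, {"t8", {"t6"}}, {"t46", {"t0"}}};
}

// Chain edges for the explicit __divdi3 call: callseq_start (t19) is
// chained to a TokenFactor (t9) over all the argument copies.
ChainGraph explicitCall() {
  return {{"t0", {}},     {"t2", {"t0"}}, {"t4", {"t2"}},
          {"t6", {"t4"}}, {"t8", {"t6"}},
          {"t9", {"t2", "t4", "t6", "t8"}}, {"t19", {"t9"}}};
}
```

In this model, `orderedAfter(libcallExpansion(), "t46", "t8")` is false, so nothing pins the copy from %reg3 before the call sequence, whereas `orderedAfter(explicitCall(), "t19", "t8")` is true because of the TokenFactor.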
Friedman, Eli via llvm-dev
2017-Feb-14 18:21 UTC
[llvm-dev] Ensuring chain dependencies with expansion to libcalls
On 2/14/2017 3:27 AM, Stephen Rogers via llvm-dev wrote:
> Since the sdiv node is not part of a chain, the EntryToken is used as
> the starting point of the chain for the library call. This means that
> the ordering between the sequence of CopyFromReg nodes and the call
> sequence can be changed. Specifically, we are seeing the two copies
> from %reg2 and %reg3 being moved to after the call sequence. This is a
> problem since the library call clobbers both of these registers (since
> they are caller-saved registers).

"CopyFromReg t0, Register:i64 %reg0" is your problem: as you've discovered, you can't model function arguments like this. If you look at the debug output for an in-tree target (e.g. 32-bit ARM), you'll find that the CopyFromReg nodes for register arguments are copies from virtual registers. See MachineFunction::addLiveIn.

-Eli

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
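To see why copying arguments into virtual registers sidesteps the clobbering, here is a toy simulation. It is a sketch under stated assumptions, not LLVM code: `ToyFunction`, its `addLiveIn` member, `callClobbering` and the two `read...` helpers are hypothetical and only loosely mirror the idea behind MachineFunction::addLiveIn (mark a physical register live on entry and get a virtual register for its value). In a real backend the usual pattern in argument lowering is to call `MachineFunction::addLiveIn` with the physical register and a register class, then build the node with `SelectionDAG::getCopyFromReg` on the returned virtual register.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Toy machine: calls can overwrite physical registers, but never virtual
// ones. Every name in this block is hypothetical, not LLVM API.
struct ToyFunction {
  std::map<int, int64_t> phys;               // physical register file
  std::map<int, int64_t> virt;               // virtual register values
  std::vector<std::pair<int, int>> liveIns;  // (physical, virtual) pairs
  int nextVReg = 0;

  // Record that physReg is live on entry and return a fresh virtual
  // register that holds its entry value.
  int addLiveIn(int physReg) {
    int vreg = nextVReg++;
    liveIns.push_back({physReg, vreg});
    virt[vreg] = phys[physReg];
    return vreg;
  }

  // A call overwrites every caller-saved physical register it is given.
  void callClobbering(const std::vector<int> &callerSaved) {
    for (int r : callerSaved) phys[r] = -1;
  }
};

// Reads argument register 3 directly, after a call that clobbers it.
int64_t readPhysAfterCall() {
  ToyFunction f;
  f.phys = {{0, 10}, {1, 20}, {2, 30}, {3, 40}};
  f.callClobbering({0, 1, 2, 3});
  return f.phys[3];  // the entry value has been overwritten
}

// Copies argument register 3 into a virtual register at entry, then
// reads the virtual register after the same call.
int64_t readVirtAfterCall() {
  ToyFunction f;
  f.phys = {{0, 10}, {1, 20}, {2, 30}, {3, 40}};
  int v3 = f.addLiveIn(3);
  f.callClobbering({0, 1, 2, 3});
  return f.virt[v3];  // the entry value survives the call
}
```

Here `readPhysAfterCall()` yields the clobbered value while `readVirtAfterCall()` yields the original argument, which is the behaviour the virtual-register approach relies on.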
Stephen Rogers via llvm-dev
2017-Feb-15 11:46 UTC
[llvm-dev] Ensuring chain dependencies with expansion to libcalls
Thanks Eli,

This was exactly the problem. Using virtual registers for our function arguments has fixed it.

All the best,
Stephen

On 14 February 2017 at 18:21, Friedman, Eli <efriedma at codeaurora.org> wrote:
> "CopyFromReg t0, Register:i64 %reg0" is your problem: as you've
> discovered, you can't model function arguments like this. If you look at
> the debug output for an in-tree target (e.g. 32-bit ARM), you'll find that
> the CopyFromReg nodes for register arguments are copies from virtual
> registers. See MachineFunction::addLiveIn.
>
> -Eli