林政宗 via llvm-dev
2021-May-31 02:48 UTC
[llvm-dev] question about flag register through instruction selection and instruction scheduling
Hi,

I am trying to support int64 addition on my target, which has only 32-bit registers. The target has only a plain add instruction, which adds two 32-bit registers, writes the result to a register, and updates the carry flag in the flag register. It has no add-with-carry instruction, which would add two registers plus the carry flag, write the result to a register, and update the carry flag.

I have referred to the implementation of 64-bit addition on Cortex-A7 (ARM architecture) and have some questions. The C source, the .ll file, and the debug dump are attached.

-----------------------------------------debug dump-------------------------------------------------------------
Optimized legalized selection DAG: %bb.0 'foo:entry'
SelectionDAG has 16 nodes:
  t0: ch = EntryToken
  t17: ch,glue = CopyToReg t0, Register:i32 $r0, t29
  t8: i32,ch = CopyFromReg t0, Register:i32 %3
  t4: i32,ch = CopyFromReg t0, Register:i32 %1
  t25: i32,i32 = ARMISD::ADDE t8, t4, t29:1
  t19: ch,glue = CopyToReg t17, Register:i32 $r1, t25, t17:1
  t6: i32,ch = CopyFromReg t0, Register:i32 %2
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t29: i32,i32 = ARMISD::ADDC t6, t2
  t20: ch = ARMISD::RET_FLAG t19, Register:i32 $r0, Register:i32 $r1, t19:1
---------------------------------------------------------------------------------------------------------------------
Question 1: During lowering, ARM Cortex-A7 models the carry flag as a data dependence, for example t25: i32,i32 = ARMISD::ADDE t8, t4, t29:1. Would it be safer to model the carry flag with MVT::Glue instead? If the DAG is large and the carry is treated as a data dependence, is it possible that another instruction is inserted between the two nodes/instructions that have the carry-flag dependence, after the schedule is emitted in CodeGenAndEmitDAG() (SelectionDAGISel.cpp)? If the inserted instruction happens to update the carry flag, the second node, ARMISD::ADDE, will produce an incorrect result.
Or is it guaranteed that this case will not happen?

------------------------------------debug dump-----------------------------------------------------------------
ISEL: Starting selection on root node: t25: i32,i32 = ARMISD::ADDE t8, t4, t29:1
ISEL: Starting pattern match
  Initial Opcode index to 94735
  Match failed at index 94748
  Continuing at 94777
  Match failed at index 94778
  Continuing at 94806
  Continuing at 94807
  Match failed at index 94808
  Continuing at 94838
  Continuing at 94839
  Match failed at index 94842
  Continuing at 95134
Creating new node: t35: ch,glue = CopyToReg t0, Register:i32 $cpsr, t29:1
  Morphed node: t25: i32,i32 = ADCrr t8, t4, TargetConstant:i32<14>, Register:i32 $noreg, Register:i32 $noreg, t35:1
ISEL: Match complete!
----------------------------------td file(ARMInstrInfo.td)-------------------------------------------------------------------------------
3717 let isAdd = 1 in
3718 defm ADC : AI1_adde_sube_irs<0b0101, "adc", ARMadde, 1>;

1740 /// AI1_adde_sube_irs - Define instructions and patterns for adde and sube.
1741 let TwoOperandAliasConstraint = "$Rn = $Rd" in
1742 multiclass AI1_adde_sube_irs<bits<4> opcod, string opc, SDNode opnode,
1743                              bit Commutable = 0> {
1744   let hasPostISelHook = 1, Defs = [CPSR], Uses = [CPSR] in {
1745   def ri : AsI1<opcod, (outs GPR:$Rd), (ins GPR:$Rn, mod_imm:$imm),
1746                 DPFrm, IIC_iALUi, opc, "\t$Rd, $Rn, $imm",
1747                 [(set GPR:$Rd, CPSR, (opnode GPR:$Rn, mod_imm:$imm, CPSR))]>,
1748                 Requires<[IsARM]>,
1749                 Sched<[WriteALU, ReadALU]> {
1750     bits<4> Rd;
1751     bits<4> Rn;
1752     bits<12> imm;
1753     let Inst{25} = 1;
1754     let Inst{15-12} = Rd;
1755     let Inst{19-16} = Rn;
1756     let Inst{11-0} = imm;
1757   }
1758   def rr : AsI1<opcod, (outs GPR:$Rd), (ins GPR:$Rn, GPR:$Rm),
1759                 DPFrm, IIC_iALUr, opc, "\t$Rd, $Rn, $Rm",
1760                 [(set GPR:$Rd, CPSR, (opnode GPR:$Rn, GPR:$Rm, CPSR))]>,
1761                 Requires<[IsARM]>,
1762                 Sched<[WriteALU, ReadALU, ReadALU]> {
1763     bits<4> Rd;
1764     bits<4> Rn;
1765     bits<4> Rm;
1766     let Inst{11-4} = 0b00000000;
1767     let Inst{25} = 0;
1768     let isCommutable = Commutable;
1769     let Inst{3-0} = Rm;
1770     let Inst{15-12} = Rd;
1771     let Inst{19-16} = Rn;
1772   }
-------------------------------------------------------------------------------------------------------------------------------------------
Question 2: When ARMISD::ADDE is selected, a new node (t35) is created. Why does TableGen create a new node? Is this behavior related to the code "(opnode GPR:$Rn, GPR:$Rm, CPSR))" at line 1760? Is it because CPSR is a parameter of opnode?
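For context on the problem stated at the top of this message: when a target has no add-with-carry instruction, the carry out of the low-half add does not have to be read back from the flag register. It can be recomputed as an ordinary register value with an unsigned compare, because a 32-bit add wrapped exactly when its result is smaller than one of its operands. The following is a minimal C++ sketch of that arithmetic only, not LLVM code; U64Parts and add64 are hypothetical names introduced here for illustration.

```cpp
#include <cstdint>

// Hypothetical helper type (not part of LLVM): a 64-bit value held as two
// 32-bit halves, the way it would sit in a pair of 32-bit registers.
struct U64Parts {
    std::uint32_t lo;
    std::uint32_t hi;
};

// 64-bit addition using only 32-bit adds and one unsigned compare.
// No add-with-carry operation and no flag register are involved.
U64Parts add64(U64Parts a, U64Parts b) {
    std::uint32_t lo = a.lo + b.lo;                // wraps modulo 2^32
    std::uint32_t carry = (lo < a.lo) ? 1u : 0u;   // 1 exactly when the add wrapped
    std::uint32_t hi = a.hi + b.hi + carry;        // carry is an ordinary value here
    return {lo, hi};
}
```

For example, add64({0xFFFFFFFF, 0}, {1, 0}) yields lo = 0 and hi = 1. In this formulation the link between the two adds is a plain value dependence, at the cost of one extra compare.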
---------------------------------------------debug dump------------------------------------------------------------------------------
Total amount of phi nodes to update: 0
*** MachineFunction at end of ISel ***
# Machine code for function foo: IsSSA, TracksLiveness
Function Live Ins: $r0 in %0, $r1 in %1, $r2 in %2, $r3 in %3

bb.0.entry:
  liveins: $r0, $r1, $r2, $r3
  %3:gpr = COPY $r3
  %2:gpr = COPY $r2
  %1:gpr = COPY $r1
  %0:gpr = COPY $r0
  %4:gpr = ADDrr %2:gpr, %0:gpr, 14, $noreg, def $cpsr
  %5:gpr = ADCrr %3:gpr, %1:gpr, 14, $noreg, $noreg, implicit $cpsr
  $r0 = COPY %4:gpr
  $r1 = COPY %5:gpr
  BX_RET 14, $noreg, implicit $r0, implicit $r1

# End machine code for function foo.
-----------------------------------------------------------------------------------------------------------------------------------------------
Question 3: If the basic block bb.0.entry has a lot of instructions and the Machine Instruction Scheduler is enabled, the instruction sequence will be reordered by latency. Is it possible that another instruction is placed between ADDrr and ADCrr? If the placed instruction happens to update the carry flag, we will get an incorrect result.

Thanks,
Jerry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.arm.debug
Type: application/octet-stream
Size: 89545 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210531/8cc53384/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.arm.ll
Type: application/octet-stream
Size: 1380 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210531/8cc53384/attachment-0003.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.c
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210531/8cc53384/attachment-0001.c>
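The hazard asked about in Questions 1 and 3 can be made concrete with a toy model. The sketch below is not LLVM code and says nothing about what LLVM's schedulers actually allow; it only shows, under the assumption of a machine with a single carry flag that every add overwrites, why an instruction placed between the low-half add and the add-with-carry would corrupt the high word. Op, Inst, Machine, run, highWordAfter, adjacent and interleaved are all hypothetical names; the register numbers follow the ADDrr/ADCrr pair in the dump (r0/r2 hold the low halves, r1/r3 the high halves).

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Toy model only: two opcodes, eight 32-bit registers, one carry flag.
enum class Op { Add, Adc };

struct Inst {
    Op op;
    int dst;
    int lhs;
    int rhs;
};

struct Machine {
    std::array<std::uint32_t, 8> reg{};
    bool carry = false;
};

// Executes the program in order. Both opcodes overwrite the carry flag;
// only Adc reads it.
void run(Machine &m, const std::vector<Inst> &prog) {
    for (const Inst &i : prog) {
        std::uint64_t wide = std::uint64_t{m.reg[i.lhs]} + m.reg[i.rhs];
        if (i.op == Op::Adc)
            wide += m.carry ? 1 : 0;
        m.reg[i.dst] = static_cast<std::uint32_t>(wide);
        m.carry = (wide >> 32) != 0;
    }
}

// Computes 0x00000001'FFFFFFFF + 0x00000000'00000001 with the given
// instruction order and returns the high word of the result (r5).
std::uint32_t highWordAfter(const std::vector<Inst> &prog) {
    Machine m;
    m.reg[0] = 0xFFFFFFFFu;  // low half of the first operand
    m.reg[1] = 1u;           // high half of the first operand
    m.reg[2] = 1u;           // low half of the second operand
    m.reg[3] = 0u;           // high half of the second operand
    run(m, prog);
    return m.reg[5];
}

// Add immediately followed by Adc, as in the dump.
const std::vector<Inst> adjacent = {
    {Op::Add, 4, 2, 0},
    {Op::Adc, 5, 3, 1},
};

// An unrelated Add (r6 = r7 + r7) placed in between clears the carry.
const std::vector<Inst> interleaved = {
    {Op::Add, 4, 2, 0},
    {Op::Add, 6, 7, 7},
    {Op::Adc, 5, 3, 1},
};
```

With these inputs, highWordAfter(adjacent) gives the correct high word 2, while highWordAfter(interleaved) gives 1, because the unrelated add overwrote the carry before the add-with-carry read it.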