Boris Boesler
2014-Oct-29 17:35 UTC
[LLVMdev] Virtual register def doesn't dominate all uses
Hi Quentin,

yes, this happens quite late. With the option --debug-pass=Structure it's in or after "Assembly Printer".
I do have a very simple DAGToDAGISel::Select() method:

  SDNode *MyTargetDAGToDAGISel::Select(SDNode *N)
  {
    SDLoc dl(N);
    // default implementation
    if (N->isMachineOpcode()) {
      N->setNodeId(-1);
      return NULL;   // Already selected.
    }
    SDNode *res = SelectCode(N);
    return res;
  }

Is that too simple? There are no further passes that eliminate anything.

Anyway, I have another test program that could point to my bug:

  int formal_args_3(int p1, int p2, int p3)
  {
    int v1 = p1;
    int v2 = p2;
    int v3 = p3;
    int res = v1 + v2;
    return(res);
  }

I can compile this test and I get correct code. With option -view-sched-dags I can verify that all arguments are stored in the stack-frame, the local variables are initialized (several store operations in stack-frame) and the return value is evaluated.

But if I use the statement "int res = v1 + v2 + v3;" something strange happens: all arguments are stored in the stack-frame and the local variables are initialized. Now, the variables v1 and v2 are loaded, but they are not used (no ADD instructions) and a MOVE instruction register to register is generated that uses itself as an operand. This register should be stored and should be used as function result.

Well, the LOAD instruction uses another register class for the destination register than the ADD instruction uses for its operands, but both classes share some registers. That should not be a problem.

Any hints where I can search for my bug?

Thanks,
Boris


On 24.10.2014 at 19:27, Quentin Colombet <qcolombet at apple.com> wrote:

> Hi Boris,
>
> I don't see any phis in your machine code whereas the IR had some. This means you are already pretty late in the pipeline of the backend (i.e., after SSA form has been deconstructed).
> Do you have any custom pass between instruction selection and the PHIElimination pass?
>
> If so, I would look into them.
>
> Cheers,
> -Quentin
>
>> On Oct 24, 2014, at 7:53 AM, Boris Boesler <baembel at gmx.de> wrote:
>>
>> Hi!
>>
>> During my backend development I get the error message for some tests:
>> *** Bad machine code: Virtual register def doesn't dominate all uses. ***
>>
>> (C source-code, byte-code disassembly and printed machine code at the end of the email)
>>
>> The first USE of vreg4 in BB#1 has no previous DEF in BB#0 or #1. But why? I can't see how the LLVM byte-code is transformed to the lower machine code.
>>
>> One possible reason could be that I haven't implemented all operations, eg I didn't implement MUL at this stage. Their "state" is LEGAL and not CUSTOM or EXPAND. But it fails with implemented operations as well.
>>
>> What did I do wrong? Missing implementation for some operations? What did I miss to implement?
>>
>> Thanks in advance,
>> Boris
>>
>> ----8<----
>>
>> C source-code:
>> int simple_loop(int end_loop_index)
>> {
>>   int sum = 0;
>>   for(int i = 0; i < end_loop_index; i++) {
>>     sum += i;
>>   }
>>   return(sum);
>> }
>>
>>
>> LLVM byte-code disassembly:
>> ; Function Attrs: nounwind readnone
>> define i32 @simple_loop(i32 %end_loop_index) #1 {
>> entry:
>>   %cmp4 = icmp sgt i32 %end_loop_index, 0
>>   br i1 %cmp4, label %for.cond.for.end_crit_edge, label %for.end
>>
>> for.cond.for.end_crit_edge:                       ; preds = %entry
>>   %0 = add i32 %end_loop_index, -2
>>   %1 = add i32 %end_loop_index, -1
>>   %2 = zext i32 %0 to i33
>>   %3 = zext i32 %1 to i33
>>   %4 = mul i33 %3, %2
>>   %5 = lshr i33 %4, 1
>>   %6 = trunc i33 %5 to i32
>>   %7 = add i32 %6, %end_loop_index
>>   %8 = add i32 %7, -1
>>   br label %for.end
>>
>> for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
>>   %sum.0.lcssa = phi i32 [ %8, %for.cond.for.end_crit_edge ], [ 0, %entry ]
>>   ret i32 %sum.0.lcssa
>> }
>>
>>
>> The emitted blocks are:
>> Function Live Ins: %R0 in %vreg2
>>
>> BB#0: derived from LLVM BB %entry
>>     Live Ins: %R0
>>         %vreg2<def> = COPY %R0; IntRegs:%vreg2
>>         %vreg3<def> = MV 0; SRegs:%vreg3
>>         CMP %vreg2, 1, %FLAG<imp-def>; IntRegs:%vreg2
>>         %vreg6<def> = COPY %vreg3; SRegs:%vreg6,%vreg3
>>         BR_cc <BB#2>, 20, %FLAG<imp-use,kill>
>>         BR <BB#1>
>>     Successors according to CFG: BB#1(20) BB#2(12)
>>
>> BB#1: derived from LLVM BB %for.cond.for.end_crit_edge
>>     Predecessors according to CFG: BB#0
>>         %vreg4<def> = MV %vreg4; IntRegs:%vreg4
>>         %vreg5<def> = ADD %vreg4<kill>, -1; IntRegs:%vreg5,%vreg4
>>         %vreg0<def> = COPY %vreg5<kill>; SRegs:%vreg0 IntRegs:%vreg5
>>         %vreg6<def> = COPY %vreg0; SRegs:%vreg6,%vreg0
>>     Successors according to CFG: BB#2
>>
>> BB#2: derived from LLVM BB %for.end
>>     Predecessors according to CFG: BB#0 BB#1
>>         %vreg1<def> = COPY %vreg6<kill>; SRegs:%vreg1,%vreg6
>>         %R0<def> = COPY %vreg1; SRegs:%vreg1
>>         RETURN %R0<imp-use>
>>
>> # End machine code for function simple_loop.
>>
>> *** Bad machine code: Virtual register def doesn't dominate all uses. ***
>> - function:    simple_loop
>> - basic block: BB#1 for.cond.for.end_crit_edge (0x7fd7cb025250)
>> - instruction: %vreg4<def> = MV %vreg4; IntRegs:%vreg4
>> LLVM ERROR: Found 1 machine code errors.
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
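For readers following the thread, the verifier message quoted above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the MachineVerifier code: the types `Instr` and `Block` and the functions `undefinedOnSomePath` and `simpleLoopBlocks` are hypothetical names made up for this example, and the model only tracks which virtual register numbers each instruction reads and writes. It reports every register that some path from the entry block can read before any definition has been seen, which is the situation `%vreg4<def> = MV %vreg4` creates in BB#1.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// One machine instruction reduced to the virtual registers it reads and writes.
struct Instr {
  std::vector<int> uses;
  std::vector<int> defs;
};

// A basic block: its instructions plus the indices of its predecessor blocks.
struct Block {
  std::vector<Instr> instrs;
  std::vector<int> preds;
};

// Hypothetical helper to build an instruction from its use and def lists.
static Instr mi(std::vector<int> uses, std::vector<int> defs) {
  Instr i;
  i.uses = std::move(uses);
  i.defs = std::move(defs);
  return i;
}

// Returns the registers that can be read on some path starting at the entry
// block (index 0) before any instruction on that path has defined them.
std::set<int> undefinedOnSomePath(const std::vector<Block> &blocks) {
  const std::size_t n = blocks.size();
  std::vector<std::set<int>> defined(n), required(n);
  // Local scan: a use is upward-exposed if no earlier instruction in the
  // same block defined the register. An instruction reads before it writes.
  for (std::size_t b = 0; b < n; ++b) {
    for (const Instr &inst : blocks[b].instrs) {
      for (int r : inst.uses)
        if (!defined[b].count(r)) required[b].insert(r);
      for (int r : inst.defs) defined[b].insert(r);
    }
  }
  // Push requirements back into predecessors until nothing changes.
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 0; b < n; ++b)
      for (int p : blocks[b].preds)
        for (int r : required[b])
          if (!defined[p].count(r) && required[p].insert(r).second)
            changed = true;
  }
  return required[0];
}

// The three blocks of simple_loop from the dump, vregs as plain numbers.
std::vector<Block> simpleLoopBlocks() {
  Block bb0, bb1, bb2;
  bb0.instrs = {mi({}, {2}), mi({}, {3}), mi({2}, {}), mi({3}, {6})};
  bb1.instrs = {mi({4}, {4}), mi({4}, {5}), mi({5}, {0}), mi({0}, {6})};
  bb1.preds = {0};
  bb2.instrs = {mi({6}, {1}), mi({1}, {})};
  bb2.preds = {0, 1};
  std::vector<Block> blocks;
  blocks.push_back(bb0);
  blocks.push_back(bb1);
  blocks.push_back(bb2);
  return blocks;
}
```

Calling `undefinedOnSomePath(simpleLoopBlocks())` yields a set containing only register 4: BB#1 reads vreg4 before writing it, and its sole predecessor BB#0 never defines it. Physical registers such as %R0 and %FLAG are deliberately left out of the model.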
Quentin Colombet
2014-Oct-30 18:23 UTC
[LLVMdev] Virtual register def doesn't dominate all uses
Hi Boris,

On Oct 29, 2014, at 10:35 AM, Boris Boesler <baembel at gmx.de> wrote:

> But if I use the statement "int res = v1 + v2 + v3;" something strange happens: all arguments are stored in the stack-frame and the local variables are initialized. Now, the variables v1 and v2 are loaded, but they are not used (no ADD instructions) and a MOVE instruction register to register is generated that uses itself as an operand. This register should be stored and should be used as function result.
>
> Well, the LOAD instruction uses another register class for the destination register than the ADD instruction uses for its operands, but both classes share some registers. That should not be a problem.

Like you said, that shouldn't be a problem.

> Any hints where I can search for my bug?

Try using -print-machineinstrs and check where the Machine IR diverges from what you expected.
Then, you can use -debug-only <the offending pass> to get more details.

Cheers,
-Quentin
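When comparing the dumps produced by -print-machineinstrs by eye, a small text filter can help locate the first dump in which the suspicious instruction shows up. The following is a minimal sketch, assuming the dump lines look like the machine code quoted earlier in this thread (`%vregN<def> = OPCODE operands; RegClass:%vregN`). The function names `containsRegister` and `selfReferencingDefs` are hypothetical and are not part of LLVM.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// True if `name` occurs in `text` as a whole token, so that %vreg4 does not
// match inside %vreg40.
static bool containsRegister(const std::string &text, const std::string &name) {
  for (std::size_t pos = text.find(name); pos != std::string::npos;
       pos = text.find(name, pos + 1)) {
    std::size_t end = pos + name.size();
    if (end == text.size() ||
        !std::isalnum(static_cast<unsigned char>(text[end])))
      return true;
  }
  return false;
}

// Returns every line of `dump` whose defined register also appears among the
// operands of the same instruction.
std::vector<std::string> selfReferencingDefs(const std::string &dump) {
  std::vector<std::string> hits;
  std::istringstream in(dump);
  std::string line;
  while (std::getline(in, line)) {
    std::size_t defPos = line.find("<def>");
    std::size_t eqPos = line.find('=');
    if (defPos == std::string::npos || eqPos == std::string::npos ||
        eqPos < defPos)
      continue;
    // The register name is the %-token that ends right before "<def>".
    std::size_t start = line.rfind('%', defPos);
    if (start == std::string::npos) continue;
    std::string reg = line.substr(start, defPos - start);
    // Operands stop at ';'. The text after it is the register-class
    // annotation, which repeats the register names and must be ignored.
    std::size_t semi = line.find(';', eqPos);
    std::string operands =
        semi == std::string::npos
            ? line.substr(eqPos + 1)
            : line.substr(eqPos + 1, semi - eqPos - 1);
    if (containsRegister(operands, reg)) hits.push_back(line);
  }
  return hits;
}
```

A hit is only suspicious while the function is still in SSA form. After PHI elimination an instruction may legitimately read and redefine the same virtual register, so this filter is a starting point for inspection rather than a verdict.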
Boris Boesler
2014-Oct-31 18:00 UTC
[LLVMdev] Virtual register def doesn't dominate all uses
Hi Quentin,

I added some debug output (N->dump()) in ::Select(SDNode *N) and compared it to the dot/Graphviz output (-view-legalize-types-dags; the last one with correct code). I found out that some SDNodes are not passed to ::Select(SDNode *N); approximately 11 nodes are missing. The first add node (v1+v2) is missing.

Is it normal that not all nodes are passed to ::Select()?

Thanks,
Boris


On 30.10.2014 at 19:23, Quentin Colombet <qcolombet at apple.com> wrote:

> Try using -print-machineinstrs and check where the Machine IR diverge from what you were expected.
> Then, you can use -debug-only <the offending pass> to have more details.
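As background to the question above, and not as a diagnosis of this particular bug: to my understanding, the SelectionDAG instruction selector walks the DAG from the root towards the leaves and skips nodes that have no remaining uses, so a node that was absorbed into a larger matched pattern is never handed to Select() on its own. The toy model below sketches that effect. Everything in it is hypothetical (the `Node` type, `visitedBySelect`, `exampleDag`, and the single folding rule that lets an "add" absorb a single-use "const" as an immediate), and it is far simpler than the real selector. Note also that -view-legalize-types-dags shows the DAG before legalization and combining, while -view-isel-dags shows the DAG that instruction selection actually receives, so node counts can differ between the two views.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A DAG node: an opcode name, the indices of its operand nodes, and a use
// count that is filled in by visitedBySelect.
struct Node {
  std::string op;
  std::vector<int> operands;
  int uses;
};

// Walks the nodes from the root (last index) towards the leaves and returns
// the opcodes of the nodes that a Select()-like callback would see.
// Nodes are assumed to be stored in topological order, operands first.
std::vector<std::string> visitedBySelect(std::vector<Node> dag) {
  for (Node &n : dag) n.uses = 0;
  for (std::size_t i = 0; i < dag.size(); ++i)
    for (int o : dag[i].operands) ++dag[o].uses;
  if (!dag.empty()) dag.back().uses = 1; // the root is always kept alive

  std::vector<std::string> visited;
  for (int i = static_cast<int>(dag.size()) - 1; i >= 0; --i) {
    Node &n = dag[i];
    if (n.uses == 0) continue; // dead node: never passed to the callback
    visited.push_back(n.op);
    // Hypothetical pattern: an "add" absorbs a single-use "const" operand
    // as an immediate, which leaves that operand node without users.
    if (n.op == "add") {
      std::vector<int> kept;
      for (int o : n.operands) {
        if (dag[o].op == "const" && dag[o].uses == 1)
          --dag[o].uses;
        else
          kept.push_back(o);
      }
      n.operands = kept;
    }
  }
  return visited;
}

// ret(add(arg, const)), stored operands-first.
std::vector<Node> exampleDag() {
  std::vector<Node> dag;
  dag.push_back(Node{"arg", {}, 0});
  dag.push_back(Node{"const", {}, 0});
  dag.push_back(Node{"add", {0, 1}, 0});
  dag.push_back(Node{"ret", {2}, 0});
  return dag;
}
```

In this model the "const" node is never visited because the "add" pattern consumed it. A missing "add" node, as described in the message above, is a different situation and would still need to be tracked down in the target's patterns or lowering.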