On Jun 13, 2012, at 1:15 PM, Sergei Larin <slarin at codeaurora.org> wrote:> Andy, > > You are probably right here – look at this – before phi elimination this code looks much more sane: > > # *** IR Dump After Live Variable Analysis ***: > # Machine code for function push: SSA > Function Live Outs: %R0 > > BB#0: derived from LLVM BB %entry > %vreg5<def> = IMPLICIT_DEF; IntRegs:%vreg5 > %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 > Successors according to CFG: BB#1 > > BB#1: derived from LLVM BB %for.cond > Predecessors according to CFG: BB#0 BB#1 > %vreg0<def> = PHI %vreg4, <BB#0>, %vreg3, <BB#1>; IntRegs:%vreg0,%vreg4,%vreg3 > %vreg1<def> = PHI %vreg5, <BB#0>, %vreg2, <BB#1>; IntRegs:%vreg1,%vreg5,%vreg2 > %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 > %vreg3<def> = ADD_ri %vreg2, 8; IntRegs:%vreg3,%vreg2 > %vreg6<def> = CMPEQri %vreg2, 0; PredRegs:%vreg6 IntRegs:%vreg2 > JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 > JMP <BB#2> > Successors according to CFG: BB#2 BB#1 > > BB#2: derived from LLVM BB %for.end > Predecessors according to CFG: BB#1 > %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 > STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 > %vreg8<def> = IMPLICIT_DEF; IntRegs:%vreg8 > %R0<def> = COPY %vreg8<kill>; IntRegs:%vreg8 > JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,kill> > > Right after the dead vreg is introduced: > > # *** IR Dump After Eliminate PHI nodes for register allocation ***: > # Machine code for function push: Post SSA > Function Live Outs: %R0 > > BB#0: derived from LLVM BB %entry > %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 > %vreg9<def> = COPY %vreg4<kill>; IntRegs:%vreg9,%vreg4 > Successors according to CFG: BB#1 > > BB#1: derived from LLVM BB %for.cond > Predecessors according to CFG: BB#0 BB#1 > %vreg0<def> = COPY %vreg9<kill>; IntRegs:%vreg0,%vreg9 > %vreg1<def> = COPY %vreg10<kill>; IntRegs:%vreg1,%vreg10 <<<<<<<<<<<<<<<<<<<<<<<<<<< Not defined on first iteration…. > %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 > %vreg3<def> = ADD_ri %vreg2, 8; IntRegs:%vreg3,%vreg2 > %vreg6<def> = CMPEQri %vreg2, 0; PredRegs:%vreg6 IntRegs:%vreg2 > %vreg9<def> = COPY %vreg3<kill>; IntRegs:%vreg9,%vreg3 > %vreg10<def> = COPY %vreg2<kill>; IntRegs:%vreg10,%vreg2 > JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 > JMP <BB#2> > Successors according to CFG: BB#2 BB#1 > > BB#2: derived from LLVM BB %for.end > Predecessors according to CFG: BB#1 > %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 > STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 > %vreg8<def> = IMPLICIT_DEF; IntRegs:%vreg8 > %R0<def> = COPY %vreg8<kill>; IntRegs:%vreg8 > JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,kill> > > # End machine code for function push. > > So the problem is elsewhere after all…Sergei, If you don't want an undefined variable here, that's something you should look into. Otherwise, the machine code before phi elimination looks ok. Phi elimination is not doing everything I would like it to do in this case. I'll have to try constructing a test case by hand. Until I figure out the right fix, you may want to work around by disabling the SSA check in the scheduler. Thanks -Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120613/01a87da6/attachment.html>
Ok, after a long detour I am back to where I have started. I think there is a problem at dep DAG construction. Let me try to convince you. Here is the C code we are dealing with: push () { struct xx_stack *stack, *top; for (stack = xx_stack; stack; stack = stack->next) top = stack; yy_instr = top->first; } If the loop never iterates, "top" will have garbage in it. If it iterates even once, it will presumably have valid pointer. Bad, but perfectly valid code. In SSA it looked like this: BB#0: derived from LLVM BB %entry %vreg5<def> = IMPLICIT_DEF; IntRegs:%vreg5 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<Dummy def. %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 Successors according to CFG: BB#1 BB#1: derived from LLVM BB %for.cond Predecessors according to CFG: BB#0 BB#1 %vreg0<def> = PHI %vreg4, <BB#0>, %vreg3, <BB#1>; IntRegs:%vreg0,%vreg4,%vreg3 %vreg1<def> = PHI %vreg5, <BB#0>, %vreg2, <BB#1>; IntRegs:%vreg1,%vreg5,%vreg2 <<<<<<<<<<< Use of that dummy value. %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 By the time it has gotten to the scheduler, it became this: BB#0: derived from LLVM BB %entry %vreg9<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg9 Successors according to CFG: BB#1 BB#1: derived from LLVM BB %for.cond Predecessors according to CFG: BB#0 BB#1 %vreg1<def> = COPY %vreg10<kill>; IntRegs:%vreg1,%vreg10 <<<<<<<<<<<<< First use uninitialized vreg10 %vreg10<def> = LDriw %vreg9<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg10,%vreg9 %vreg9<def> = ADD_ri %vreg10, 8; IntRegs:%vreg9,%vreg10 %vreg6<def> = CMPEQri %vreg10, 0; PredRegs:%vreg6 IntRegs:%vreg10 JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 JMP <BB#2> Successors according to CFG: BB#2 BB#1 BB#2: derived from LLVM BB %for.end Predecessors according to CFG: BB#1 %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,undef> If loop never iterates, vreg10 and vreg1 will hold garbage (as they should!). If it iterates even once, vreg1 use in BB2 will be fine. This exactly matches original C code behavior. If it is a legitimate code sequence, we need to handle it in DAG dep construction, and introduce anti dep between %vreg1<def> = COPY %vreg10<kill>; IntRegs:%vreg1,%vreg10 %vreg10<def> = LDriw %vreg9<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg10,%vreg9 .because reordering them is wrong. The rest of this thread discusses my logic about how to do it. By the way, it is exactly same code on trunk. I am testing your suggested workaround, and will update with results. Thanks for your patience. Sergei -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. From: Andrew Trick [mailto:atrick at apple.com] Sent: Wednesday, June 13, 2012 3:29 PM To: Sergei Larin Cc: 'Jakob Olesen'; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Assert in live update from MI scheduler. On Jun 13, 2012, at 1:15 PM, Sergei Larin <slarin at codeaurora.org> wrote: Andy, You are probably right here - look at this - before phi elimination this code looks much more sane: # *** IR Dump After Live Variable Analysis ***: # Machine code for function push: SSA Function Live Outs: %R0 BB#0: derived from LLVM BB %entry %vreg5<def> = IMPLICIT_DEF; IntRegs:%vreg5 %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 Successors according to CFG: BB#1 BB#1: derived from LLVM BB %for.cond Predecessors according to CFG: BB#0 BB#1 %vreg0<def> = PHI %vreg4, <BB#0>, %vreg3, <BB#1>; IntRegs:%vreg0,%vreg4,%vreg3 %vreg1<def> = PHI %vreg5, <BB#0>, %vreg2, <BB#1>; IntRegs:%vreg1,%vreg5,%vreg2 %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 %vreg3<def> = ADD_ri %vreg2, 8; IntRegs:%vreg3,%vreg2 %vreg6<def> = CMPEQri %vreg2, 0; PredRegs:%vreg6 IntRegs:%vreg2 JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 JMP <BB#2> Successors according to CFG: BB#2 BB#1 BB#2: derived from LLVM BB %for.end Predecessors according to CFG: BB#1 %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 %vreg8<def> = IMPLICIT_DEF; IntRegs:%vreg8 %R0<def> = COPY %vreg8<kill>; IntRegs:%vreg8 JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,kill> Right after the dead vreg is introduced: # *** IR Dump After Eliminate PHI nodes for register allocation ***: # Machine code for function push: Post SSA Function Live Outs: %R0 BB#0: derived from LLVM BB %entry %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 %vreg9<def> = COPY %vreg4<kill>; IntRegs:%vreg9,%vreg4 Successors according to CFG: BB#1 BB#1: derived from LLVM BB %for.cond Predecessors according to CFG: BB#0 BB#1 %vreg0<def> = COPY %vreg9<kill>; IntRegs:%vreg0,%vreg9 %vreg1<def> = COPY %vreg10<kill>; IntRegs:%vreg1,%vreg10 <<<<<<<<<<<<<<<<<<<<<<<<<<< Not defined on first iteration.. %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 %vreg3<def> = ADD_ri %vreg2, 8; IntRegs:%vreg3,%vreg2 %vreg6<def> = CMPEQri %vreg2, 0; PredRegs:%vreg6 IntRegs:%vreg2 %vreg9<def> = COPY %vreg3<kill>; IntRegs:%vreg9,%vreg3 %vreg10<def> = COPY %vreg2<kill>; IntRegs:%vreg10,%vreg2 JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 JMP <BB#2> Successors according to CFG: BB#2 BB#1 BB#2: derived from LLVM BB %for.end Predecessors according to CFG: BB#1 %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 %vreg8<def> = IMPLICIT_DEF; IntRegs:%vreg8 %R0<def> = COPY %vreg8<kill>; IntRegs:%vreg8 JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,kill> # End machine code for function push. So the problem is elsewhere after all. Sergei, If you don't want an undefined variable here, that's something you should look into. Otherwise, the machine code before phi elimination looks ok. Phi elimination is not doing everything I would like it to do in this case. I'll have to try constructing a test case by hand. Until I figure out the right fix, you may want to work around by disabling the SSA check in the scheduler. Thanks -Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120613/b7906618/attachment.html>
Sergei, Absolutely right, the copy/ldriw should not be reordered. I was attempting to explain that I consider it a phi-elimination bug, not a DAG builder bug. Liveness will also have problems with this code in the long run. To avoid confusion, I filed PR13112: Phi elimination generates uninitialized vreg uses, and disabled the SSA check until its fixes in r158461. However, your C code is also fishy. I'm not sure what xx_stack is. In the machine code, if you only execute the loop body once, never taking the backedge, then you load from a garbage pointer after exiting the loop. I think you're saying that's intended behavior, which is fine. Just thought I would point it out. -Andy On Jun 13, 2012, at 3:25 PM, Sergei Larin <slarin at codeaurora.org> wrote:> Ok, after a long detour I am back to where I have started. I think there is a problem at dep DAG construction. Let me try to convince you… > > Here is the C code we are dealing with: > > push () > { > struct xx_stack *stack, *top; > for (stack = xx_stack; stack; stack = stack->next) > top = stack; > yy_instr = top->first; > } > > If the loop never iterates, “top” will have garbage in it. If it iterates even once, it will presumably have valid pointer. Bad, but perfectly valid code. > > In SSA it looked like this: > BB#0: derived from LLVM BB %entry > %vreg5<def> = IMPLICIT_DEF; IntRegs:%vreg5 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<Dummy def. > %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 > Successors according to CFG: BB#1 > > BB#1: derived from LLVM BB %for.cond > Predecessors according to CFG: BB#0 BB#1 > %vreg0<def> = PHI %vreg4, <BB#0>, %vreg3, <BB#1>; IntRegs:%vreg0,%vreg4,%vreg3 > %vreg1<def> = PHI %vreg5, <BB#0>, %vreg2, <BB#1>; IntRegs:%vreg1,%vreg5,%vreg2 <<<<<<<<<<< Use of that dummy value. > %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 > > > By the time it has gotten to the scheduler, it became this: > > BB#0: derived from LLVM BB %entry > %vreg9<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg9 > Successors according to CFG: BB#1 > > BB#1: derived from LLVM BB %for.cond > Predecessors according to CFG: BB#0 BB#1 > %vreg1<def> = COPY %vreg10<kill>; IntRegs:%vreg1,%vreg10 <<<<<<<<<<<<< First use uninitialized vreg10 > %vreg10<def> = LDriw %vreg9<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg10,%vreg9 > %vreg9<def> = ADD_ri %vreg10, 8; IntRegs:%vreg9,%vreg10 > %vreg6<def> = CMPEQri %vreg10, 0; PredRegs:%vreg6 IntRegs:%vreg10 > JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 > JMP <BB#2> > Successors according to CFG: BB#2 BB#1 > > BB#2: derived from LLVM BB %for.end > Predecessors according to CFG: BB#1 > %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 > STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 > JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,undef> > > If loop never iterates, vreg10 and vreg1 will hold garbage (as they should!). If it iterates even once, vreg1 use in BB2 will be fine. > This exactly matches original C code behavior. > > If it is a legitimate code sequence, we need to handle it in DAG dep construction, and introduce anti dep between > %vreg1<def> = COPY %vreg10<kill>; IntRegs:%vreg1,%vreg10 > %vreg10<def> = LDriw %vreg9<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg10,%vreg9 > > …because reordering them is wrong. The rest of this thread discusses my logic about how to do it. > By the way, it is exactly same code on trunk… > > I am testing your suggested workaround, and will update with results. > > Thanks for your patience. > > Sergei > -- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. > > From: Andrew Trick [mailto:atrick at apple.com] > Sent: Wednesday, June 13, 2012 3:29 PM > To: Sergei Larin > Cc: 'Jakob Olesen'; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Assert in live update from MI scheduler. > > > On Jun 13, 2012, at 1:15 PM, Sergei Larin <slarin at codeaurora.org> wrote: > > > Andy, > > You are probably right here – look at this – before phi elimination this code looks much more sane: > > # *** IR Dump After Live Variable Analysis ***: > # Machine code for function push: SSA > Function Live Outs: %R0 > > BB#0: derived from LLVM BB %entry > %vreg5<def> = IMPLICIT_DEF; IntRegs:%vreg5 > %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 > Successors according to CFG: BB#1 > > BB#1: derived from LLVM BB %for.cond > Predecessors according to CFG: BB#0 BB#1 > %vreg0<def> = PHI %vreg4, <BB#0>, %vreg3, <BB#1>; IntRegs:%vreg0,%vreg4,%vreg3 > %vreg1<def> = PHI %vreg5, <BB#0>, %vreg2, <BB#1>; IntRegs:%vreg1,%vreg5,%vreg2 > %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 > %vreg3<def> = ADD_ri %vreg2, 8; IntRegs:%vreg3,%vreg2 > %vreg6<def> = CMPEQri %vreg2, 0; PredRegs:%vreg6 IntRegs:%vreg2 > JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 > JMP <BB#2> > Successors according to CFG: BB#2 BB#1 > > BB#2: derived from LLVM BB %for.end > Predecessors according to CFG: BB#1 > %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 > STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 > %vreg8<def> = IMPLICIT_DEF; IntRegs:%vreg8 > %R0<def> = COPY %vreg8<kill>; IntRegs:%vreg8 > JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,kill> > > Right after the dead vreg is introduced: > > # *** IR Dump After Eliminate PHI nodes for register allocation ***: > # Machine code for function push: Post SSA > Function Live Outs: %R0 > > BB#0: derived from LLVM BB %entry > %vreg4<def> = TFRI_V4 <ga:@xx_stack>; IntRegs:%vreg4 > %vreg9<def> = COPY %vreg4<kill>; IntRegs:%vreg9,%vreg4 > Successors according to CFG: BB#1 > > BB#1: derived from LLVM BB %for.cond > Predecessors according to CFG: BB#0 BB#1 > %vreg0<def> = COPY %vreg9<kill>; IntRegs:%vreg0,%vreg9 > %vreg1<def> = COPY %vreg10<kill>; IntRegs:%vreg1,%vreg10 <<<<<<<<<<<<<<<<<<<<<<<<<<< Not defined on first iteration…. > %vreg2<def> = LDriw %vreg0<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg2,%vreg0 > %vreg3<def> = ADD_ri %vreg2, 8; IntRegs:%vreg3,%vreg2 > %vreg6<def> = CMPEQri %vreg2, 0; PredRegs:%vreg6 IntRegs:%vreg2 > %vreg9<def> = COPY %vreg3<kill>; IntRegs:%vreg9,%vreg3 > %vreg10<def> = COPY %vreg2<kill>; IntRegs:%vreg10,%vreg2 > JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 > JMP <BB#2> > Successors according to CFG: BB#2 BB#1 > > BB#2: derived from LLVM BB %for.end > Predecessors according to CFG: BB#1 > %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 > STriw_GP <ga:@yy_instr>, 0, %vreg7<kill>; mem:ST4[@yy_instr](tbaa=!"any pointer") IntRegs:%vreg7 > %vreg8<def> = IMPLICIT_DEF; IntRegs:%vreg8 > %R0<def> = COPY %vreg8<kill>; IntRegs:%vreg8 > JMPR %PC<imp-def>, %R31<imp-use>, %R0<imp-use,kill> > > # End machine code for function push. > > So the problem is elsewhere after all… > > Sergei, > > If you don't want an undefined variable here, that's something you should look into. Otherwise, the machine code before phi elimination looks ok. Phi elimination is not doing everything I would like it to do in this case. I'll have to try constructing a test case by hand. Until I figure out the right fix, you may want to work around by disabling the SSA check in the scheduler. > > Thanks > -Andy >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120614/276050f5/attachment.html>