Curtis Faith
2010-May-28 13:35 UTC
[LLVMdev] Combining Branch Statements - Missing Optimization Pass?
I have some LLVM IR after the optimization passes defined in createStandardModulePasses with the optimization level set to 3. It contains what appears to me to be an easily optimizable branch statement.

In particular, note in the code below that at the end of the "loop" BasicBlock there is a conditional branch that, in the false case, branches to the label "%loop.endif_crit_edge". This label contains only the single statement "br label %endif". Therefore, the conditional branch could simply branch directly to the "%endif" label in one step instead of taking the two steps it currently does. I have bolded the relevant code below which starts 13 lines into the routine.

It appears that the BasicBlock "%loop.endif_crit_edge" was created by one of the optimization passes which also created the phi node at the beginning of the "%endif" block. In this particular case, the phi node could just as easily be using "%loop" as the predecessor. So it appears that the "%loop.endif_crit_edge" block and associated branch statement are completely unnecessary.

Am I missing something here?

Is there an optimization pass that I should add which should handle this case?

Finally, I want to look at the machine code to see if perhaps this sort of thing is handled by the optimization passes during machine code emission but I can't figure out how to easily print out the machine code. Is there an easy way to add a createMachineFunctionPrinterPass pass to the ExecutionEngine's JIT so that it will print out the machine code after all the optimizations have been done and the code has been generated for MachineCodeEmission?
Thank you in advance,

Curtis

define void @main() nounwind {
bb.nph:
  br label %loop

loop: ; preds = %bb.nph, %endif15
  %lsr.iv = phi i32 [ 1, %bb.nph ], [ %lsr.iv.next, %endif15 ] ; <i32> [#uses=4]
  %fp = sitofp i32 %lsr.iv to double ; <double> [#uses=1]
  %r_fmul = fmul double %fp, 3.151900e+00 ; <double> [#uses=1]
  %r_fadd = fadd double %r_fmul, 2.800000e+00 ; <double> [#uses=1]
  %r_fmul5 = fmul double %r_fadd, 5.320000e+00 ; <double> [#uses=2]
  %r_srem = srem i32 %lsr.iv, 17 ; <i32> [#uses=1]
  %r_icmpeq = icmp eq i32 %r_srem, 1 ; <i1> [#uses=1]
  br i1 %r_icmpeq, label %then, label %loop.endif_crit_edge

loop.endif_crit_edge: ; preds = %loop
  br label %endif

then: ; preds = %loop
  %r_fmul8 = fmul double %r_fmul5, 1.000000e+02 ; <double> [#uses=1]
  br label %endif

endif: ; preds = %loop.endif_crit_edge, %then
  %anotherFloat.1 = phi double [ %r_fmul8, %then ], [ %r_fmul5, %loop.endif_crit_edge ] ; <double> [#uses=1]
  %r_srem10 = srem i32 %lsr.iv, 16 ; <i32> [#uses=2]
  %r_icmpsgt = icmp sgt i32 %r_srem10, 10 ; <i1> [#uses=1]
  br i1 %r_icmpsgt, label %then11, label %endif.endif15_crit_edge

endif.endif15_crit_edge: ; preds = %endif
  br label %endif15

then11: ; preds = %endif
  %phitmp = sitofp i32 %r_srem10 to double ; <double> [#uses=1]
  br label %endif15

endif15: ; preds = %endif.endif15_crit_edge, %then11
  %storemerge26 = phi double [ %phitmp, %then11 ], [ 1.000000e+01, %endif.endif15_crit_edge ] ; <double> [#uses=1]
  %lsr.iv.next = add i32 %lsr.iv, 1 ; <i32> [#uses=2]
  %exitcond = icmp eq i32 %lsr.iv.next, 10001 ; <i1> [#uses=1]
  br i1 %exitcond, label %afterloop, label %loop

afterloop: ; preds = %endif15
  %r_fadd19 = fadd double %anotherFloat.1, %storemerge26 ; <double> [#uses=1]
  %phitmp32 = fcmp ogt double %r_fadd19, 3.535200e+03 ; <i1> [#uses=1]
  %storemerge27 = zext i1 %phitmp32 to i32 ; <i32> [#uses=1]
  store i32 %storemerge27, i32* @testSuccessful
  ret void
}
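The fold described in this message can be checked on a toy model of the control-flow graph. The Python sketch below is not LLVM code and does not reproduce any LLVM pass; the names Block, is_forwarding, fold_forwarding_blocks and build_example are hypothetical, and only the block names, successors and phi incoming values from the IR above are modeled. It redirects branches around a block that holds nothing but an unconditional branch, and it skips the fold when a predecessor already branches to the same target, because the phi could then no longer tell the two edges apart.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    succs: list                                # successor names, in branch order
    phis: dict = field(default_factory=dict)   # phi name -> {pred name: value}
    body: list = field(default_factory=list)   # other (non-branch) instructions


def is_forwarding(block):
    """True if the block is nothing but a single unconditional branch."""
    return not block.phis and not block.body and len(block.succs) == 1


def fold_forwarding_blocks(cfg):
    """Redirect branches around forwarding blocks where that is safe."""
    for name in list(cfg):
        blk = cfg.get(name)
        if blk is None or not is_forwarding(blk) or blk.succs[0] == name:
            continue
        target = cfg[blk.succs[0]]
        preds = [p for p in cfg.values() if name in p.succs]
        # No predecessors: entry (or unreachable) block, leave it alone.
        # A predecessor that already reaches the target directly would end
        # up with two edges into it, which the phi cannot distinguish.
        if not preds or any(target.name in p.succs for p in preds):
            continue
        for p in preds:
            p.succs = [target.name if s == name else s for s in p.succs]
        for incoming in target.phis.values():
            value = incoming.pop(name)
            for p in preds:
                incoming[p.name] = value
        del cfg[name]
    return cfg


def build_example():
    """Block structure of @main from the IR above (instruction bodies abbreviated)."""
    blocks = [
        Block("bb.nph", ["loop"]),
        Block("loop", ["then", "loop.endif_crit_edge"],
              phis={"lsr.iv": {"bb.nph": "1", "endif15": "lsr.iv.next"}},
              body=["fp", "r_fmul", "r_fadd", "r_fmul5", "r_srem", "r_icmpeq"]),
        Block("loop.endif_crit_edge", ["endif"]),
        Block("then", ["endif"], body=["r_fmul8"]),
        Block("endif", ["then11", "endif.endif15_crit_edge"],
              phis={"anotherFloat.1": {"then": "r_fmul8",
                                       "loop.endif_crit_edge": "r_fmul5"}},
              body=["r_srem10", "r_icmpsgt"]),
        Block("endif.endif15_crit_edge", ["endif15"]),
        Block("then11", ["endif15"], body=["phitmp"]),
        Block("endif15", ["afterloop", "loop"],
              phis={"storemerge26": {"then11": "phitmp",
                                     "endif.endif15_crit_edge": "1.000000e+01"}},
              body=["lsr.iv.next", "exitcond"]),
        Block("afterloop", [], body=["r_fadd19", "phitmp32", "storemerge27", "store"]),
    ]
    return {b.name: b for b in blocks}


folded = fold_forwarding_blocks(build_example())
print(folded["loop"].succs)
print(folded["endif"].phis["anotherFloat.1"])
```

In this model both "_crit_edge" blocks disappear and the phi in "endif" takes its value from "loop" directly. The guard matters in the other case: if "then" were also empty, folding both arms would leave the phi with two indistinguishable edges from "loop".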
Kenneth Uildriks
2010-May-28 13:57 UTC
[LLVMdev] Combining Branch Statements - Missing Optimization Pass?
On Fri, May 28, 2010 at 8:35 AM, Curtis Faith <curtis at curtisfaith.com> wrote:
> Finally, I want to look at the machine code to see if perhaps this sort of
> thing is handled by the optimization passes during machine code emission but
> I can't figure out how to easily print out the machine code. Is there an
> easy way to add a createMachineFunctionPrinterPass pass to the
> ExecutionEngine's JIT so that it will print out the machine code after all
> the optimizations have been done and the code has been generated for
> MachineCodeEmission?
> Thank you in advance,
> Curtis

You could always have your code save the module you're JITting off of and
then run that module through llc.
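The suggestion above could be scripted roughly as follows. This is a minimal sketch, assuming the JIT host has already written its module to a bitcode file and that an llc binary is on the PATH; the file names main.bc and main.s and the helper names llc_command and emit_assembly are hypothetical. Only llc's -O3 and -o options are used.

```python
import shutil
import subprocess


def llc_command(bitcode_path, asm_path, opt_level=3):
    """Build the llc invocation that compiles a saved module to assembly."""
    return ["llc", f"-O{opt_level}", bitcode_path, "-o", asm_path]


def emit_assembly(bitcode_path, asm_path):
    """Run llc if it is installed; return the assembly path, or None if not."""
    if shutil.which("llc") is None:
        return None  # llc is not on PATH, so nothing was generated
    subprocess.run(llc_command(bitcode_path, asm_path), check=True)
    return asm_path


# Show the command that would be run for the hypothetical file names.
print(" ".join(llc_command("main.bc", "main.s")))
```

Calling emit_assembly("main.bc", "main.s") would then leave the generated assembly in main.s, where the branches in question can be inspected.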
Dale Johannesen
2010-May-28 17:17 UTC
[LLVMdev] Combining Branch Statements - Missing Optimization Pass?
The thread here should help.

http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-May/031624.html

On May 28, 2010, at 6:35 AM PDT, Curtis Faith wrote:

> I have some LLVM IR after the optimization passes defined in createStandardModulePasses with the optimization level set to 3. It contains what appears to me to be an easily optimizable branch statement.
>
> In particular, note in the code below that at the end of the "loop" BasicBlock there is a conditional branch that, in the false case, branches to the label "%loop.endif_crit_edge". This label contains only the single statement "br label %endif". Therefore, the conditional branch could simply branch directly to the "%endif" label in one step instead of taking the two steps it currently does. I have bolded the relevant code below which starts 13 lines into the routine.
>
> It appears that the BasicBlock "%loop.endif_crit_edge" was created by one of the optimization passes which also created the phi node at the beginning of the "%endif" block. In this particular case, the phi node could just as easily be using "%loop" as the predecessor. So it appears that the "%loop.endif_crit_edge" block and associated branch statement are completely unnecessary.
>
> Am I missing something here?
>
> Is there an optimization pass that I should add which should handle this case?
>
> Finally, I want to look at the machine code to see if perhaps this sort of thing is handled by the optimization passes during machine code emission but I can't figure out how to easily print out the machine code. Is there an easy way to add a createMachineFunctionPrinterPass pass to the ExecutionEngine's JIT so that it will print out the machine code after all the optimizations have been done and the code has been generated for MachineCodeEmission?
Curtis Faith
2010-May-29 13:57 UTC
[LLVMdev] Combining Branch Statements - Missing Optimization Pass?
Thanks Dale,

It appears, based on a read of that thread and my neophyte's perusal of the machine code, that the branches are eliminated as part of the target-dependent optimization during the machine-code-generation process.

Now if I can only figure out a way to easily print just one pass of the machine code at the end of machine code emission instead of the many that get printed when you set PrintMachineCode=1 :)

- Curtis

On May 28, 2010, at 1:17 PM, Dale Johannesen wrote:

> The thread here should help.
>
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-May/031624.html
>
> On May 28, 2010, at 6:35 AM PDT, Curtis Faith wrote:
>
>> I have some LLVM IR after the optimization passes defined in createStandardModulePasses with the optimization level set to 3. It contains what appears to me to be an easily optimizable branch statement.
>>
>> In particular, note in the code below that at the end of the "loop" BasicBlock there is a conditional branch that, in the false case, branches to the label "%loop.endif_crit_edge". This label contains only the single statement "br label %endif". Therefore, the conditional branch could simply branch directly to the "%endif" label in one step instead of taking the two steps it currently does. I have bolded the relevant code below which starts 13 lines into the routine.
>>
>> It appears that the BasicBlock "%loop.endif_crit_edge" was created by one of the optimization passes which also created the phi node at the beginning of the "%endif" block. In this particular case, the phi node could just as easily be using "%loop" as the predecessor. So it appears that the "%loop.endif_crit_edge" block and associated branch statement are completely unnecessary.
>>
>> Am I missing something here?
>>
>> Is there an optimization pass that I should add which should handle this case?