Hi, I have two questions about optimizations performed by llvm. Consider these simple functions: int x(int b) { return b?4:6; } int y() { return x(0); } int x2() { return 5; } int y2() { return x2(); } the optimized bitcode (with clang + opt -std-compiler-opts) is: define i32 @y(...) nounwind { entry: ret i32 6 } define i32 @y2(...) nounwind { entry: %call = call i32 (...)* @x2( ) ; <i32> [#uses=0] ret i32 5 } So why does LLVM optimizes the more difficult case, but leaves behind the function call in the easiest case? :) Second question: int f(); int g() { if (f() == 5) return 3; return 4; } int h() { if (g() == 6) return 2; return 1; } gives the following optimized bc: define i32 @g(...) nounwind { entry: %call = call i32 (...)* @f( ) ; <i32> [#uses=1] %cmp = icmp eq i32 %call, 5 ; <i1> [#uses=1] %retval = select i1 %cmp, i32 3, i32 4 ; <i32> [#uses=1] ret i32 %retval } define i32 @h(...) nounwind { entry: %call = call i32 (...)* @g( ) ; <i32> [#uses=1] %cmp = icmp eq i32 %call, 6 ; <i1> [#uses=1] %retval = select i1 %cmp, i32 2, i32 1 ; <i32> [#uses=1] ret i32 %retval } In function h(), llvm doesn't realize that g() never returns 6, and thus it could reduce that function to { g(); return 1; }. Doesn't llvm produce some sort of summary for functions with e.g. their possible return values? If not, is there any pass where I could implement this? (I have some code that has many opportunities for this kind of optimization) Also, shouldn't the function g() get inlined in the h() function? It is inlined only if I change the g() function to be static. So isn't llvm being too conservative when inlining global functions? Thanks, Nuno
Nuno Lopes a écrit :> Hi, > > I have two questions about optimizations performed by llvm. > > Consider these simple functions: > int x(int b) { return b?4:6; } > int y() { return x(0); } > > int x2() { return 5; } > int y2() { return x2(); } > > the optimized bitcode (with clang + opt -std-compiler-opts) is: > define i32 @y(...) nounwind { > entry: > ret i32 6 > } > > define i32 @y2(...) nounwind { > entry: > %call = call i32 (...)* @x2( ) ; <i32> [#uses=0] > ret i32 5 > } > > So why does LLVM optimizes the more difficult case, but leaves behind the > function call in the easiest case? :) >I don't know why vararg would inhibit optimisation but maybe it is the problem. You should try with: int x2(void) { return 5; } int y2(void) { return x2(); } it may solve the problem (or use C++ as langage) Cédric
The LLVM browser test optimizes both examples as expected... What optimization options are different? Although that optimization could be gcc-llvm level and not actual llvm level... On Thu, Sep 4, 2008 at 10:19 AM, Cédric Venet <cedric.venet at laposte.net> wrote:> Nuno Lopes a écrit : >> Hi, >> >> I have two questions about optimizations performed by llvm. >> >> Consider these simple functions: >> int x(int b) { return b?4:6; } >> int y() { return x(0); } >> >> int x2() { return 5; } >> int y2() { return x2(); } >> >> the optimized bitcode (with clang + opt -std-compiler-opts) is: >> define i32 @y(...) nounwind { >> entry: >> ret i32 6 >> } >> >> define i32 @y2(...) nounwind { >> entry: >> %call = call i32 (...)* @x2( ) ; <i32> [#uses=0] >> ret i32 5 >> } >> >> So why does LLVM optimizes the more difficult case, but leaves behind the >> function call in the easiest case? :) >> > > I don't know why vararg would inhibit optimisation but maybe it is the > problem. > You should try with: > > int x2(void) { return 5; } > int y2(void) { return x2(); } > > it may solve the problem (or use C++ as langage) > > Cédric > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
On Thu, Sep 4, 2008 at 8:39 AM, Nuno Lopes <nunoplopes at sapo.pt> wrote:> Hi, > > I have two questions about optimizations performed by llvm. > > Consider these simple functions: > int x(int b) { return b?4:6; } > int y() { return x(0); } > > int x2() { return 5; } > int y2() { return x2(); } > > the optimized bitcode (with clang + opt -std-compiler-opts) is: > define i32 @y(...) nounwind { > entry: > ret i32 6 > } > > define i32 @y2(...) nounwind { > entry: > %call = call i32 (...)* @x2( ) ; <i32> [#uses=0] > ret i32 5 > } > > So why does LLVM optimizes the more difficult case, but leaves behind the > function call in the easiest case? :)Pretty simple: LLVM doesn't know how to eliminate the call. There are a couple of ways it could be eliminated: DCE or inlining. The current DCE infrastructure isn't clever enough to do interprocedural analysis, so that doesn't eliminate it. And the inliner immediately gives up on varargs functions, so that doesn't eliminate it. And That said, clang really should be turning int x2() { return x(0); } into "define i32 @x2()" rather than "define i32 @x2(...)"; the function isn't varargs, and marking it as such could lead to wrong code for exotic calling conventions.> > Second question: > int f(); > > int g() { > if (f() == 5) return 3; > return 4; > } > > int h() { > if (g() == 6) return 2; > return 1; > } > > gives the following optimized bc: > > define i32 @g(...) nounwind { > entry: > %call = call i32 (...)* @f( ) ; <i32> [#uses=1] > %cmp = icmp eq i32 %call, 5 ; <i1> [#uses=1] > %retval = select i1 %cmp, i32 3, i32 4 ; <i32> [#uses=1] > ret i32 %retval > } > > define i32 @h(...) nounwind { > entry: > %call = call i32 (...)* @g( ) ; <i32> [#uses=1] > %cmp = icmp eq i32 %call, 6 ; <i1> [#uses=1] > %retval = select i1 %cmp, i32 2, i32 1 ; <i32> [#uses=1] > ret i32 %retval > } > > In function h(), llvm doesn't realize that g() never returns 6, and thus it > could reduce that function to { g(); return 1; }. Doesn't llvm produce some > sort of summary for functions with e.g. their possible return values? If > not, is there any pass where I could implement this? (I have some code that > has many opportunities for this kind of optimization)No, there isn't any such infrastructure at the moment. The LLVM pass that might do that sort of thing is predsimplify, but it isn't an interprocedural pass. It's not necessarily difficult to implement, depending on how precise you want it to be, but nobody's implemented it; as far as I know, it doesn't provide significant benefits for normal C/C++ code. If there's some specific pattern in your code, you could probably throw together a specialized pass pretty quickly.> Also, shouldn't the function g() get inlined in the h() function? It is > inlined only if I change the g() function to be static. So isn't llvm being > too conservative when inlining global functions?The inliner doesn't like varargs; see above. -Eli
Hi Eli,> That said, clang really should be turning int x2() { return x(0); } > into "define i32 @x2()" rather than "define i32 @x2(...)"; the > function isn't varargs, and marking it as such could lead to wrong > code for exotic calling conventions.I always understood that this is correct per C language specification. For functions that are internal (static), the DeadArgElim pass removes the vararg spec if it's unused, but that's not applicable in this case. Any particular reason the inliner doesn't touch varargs? I think that when you call a varargs function without any varargs, it should be able to inline it just fine? Or at least when the function in question also doesn't call the va_* functions.> > Also, shouldn't the function g() get inlined in the h() function? It is > > inlined only if I change the g() function to be static. So isn't llvm being > > too conservative when inlining global functions? > > The inliner doesn't like varargs; see above.And the fact that it is inlined when it's static, is probably due to the DeadArgElim I mentioned above. Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080905/f0ce343d/attachment.sig>
Hi, As a follow up of this thread I've made a patch that implements a simple approach to propagate the function return values as described previously. It can transform e.g. the following program: define i32 @f(...) nounwind { (...) %cond = select i1 %tobool, i32 2, i32 3 ; <i32> [#uses=1] ret i32 %cond } define i32 @g(...) nounwind { entry: %call = call i32 (...)* @f() ; <i32> [#uses=1] switch i32 %call, label %sw.default [ i32 5, label %sw.bb i32 2, label %sw.bb1 i32 3, label %sw.bb2 ] sw.bb: ; preds = %entry ret i32 -1 sw.bb1: ; preds = %entry ret i32 3 sw.bb2: ; preds = %entry ret i32 55 sw.default: ; preds = %entry ret i32 0 } into: define i32 @g(...) nounwind { entry: %call = call i32 (...)* @f() ; <i32> [#uses=1] %cond = icmp eq i32 %call, 2 ; <i1> [#uses=1] %retval = select i1 %cond, i32 3, i32 55 ; <i32> [#uses=1] ret i32 %retval } This kind of transformation isn't currently done by LLVM (note that here this pass is only removing case statements. other transformations are not of my responsibility :). The patch is available at http://web.ist.utl.pt/nuno.lopes/llvm_function_ret_infer.txt I would love to ear some feedback, so that this pass can eventually be merged to the LLVM tree. In particular I'm not sure if there's a better way to track the possible return values of each function (e.g. is there any code that I can reuse?) Thanks, Nuno