OK, I wanted to understand function inlining in LLVM but had avoided going to the effort of finding out if the inlining was really happening. The advice I got to "use the assembly source, Luke" suggested I go ahead and investigate inlining for a bit of practice, since (so I figured) even a monkey with really weak x86-fu could tell whether a function call was happening or not.

If this monkey can tell, it isn't happening. :-) I'll try to provide all useful information.

For my null test, I attempted to specify no inlining in a little program that computes a Very Important Number :-) :

---
define fastcc i32 @foo(i32 %arg) noinline {
        %result = mul i32 %arg, 7
        ret i32 %result
}

define i32 @main(i32 %argc, i8 **%argv) {
        %retVal = call fastcc i32 @foo(i32 6) noinline
        ret i32 %retVal
}
---

and after my Makefile executed the following commands:

gemini:~/Projects/Nil/nil(0)$ make testInline.s testInline
llvm-as testInline.ll
llc -O0 -f testInline.bc
cc testInline.s -o testInline
rm testInline.bc
gemini:~/Projects/Nil/nil(0)$

we can compute that Very Important Number

gemini:~/Projects/Nil/nil(0)$ ./testInline ; echo $?
42
gemini:~/Projects/Nil/nil(0)$

and the generated assembly (with much red tape snipped for now):

---
        .file   "testInline.bc"
        .text
        .align  16
        .globl  foo
        .type   foo,@function
foo:                                    # @foo
.Leh_func_begin1:
.LBB1_0:
        imull   $7, %edi, %eax
        ret
        .size   foo, .-foo
.Leh_func_end1:

        .align  16
        .globl  main
        .type   main,@function
main:                                   # @main
.Leh_func_begin2:
.LBB2_0:
        subq    $8, %rsp
.Llabel1:
        movl    $6, %edi
        call    foo
        addq    $8, %rsp
        ret
        .size   main, .-main
.Leh_func_end2:
---

Even this monkey (thinks he) can see the constant 6 being passed to foo in %edi. So far so good.

Now I tried to get it to inline, without much luck. Putting together everything I tried into one test, I changed 'noinline' to 'alwaysinline' (and changing the linkage, as I gather that would be appropriate for multiple files)

---
; testInline.ll -- test code for inlining.

define linkonce fastcc i32 @foo(i32 %arg) alwaysinline {
        %result = mul i32 %arg, 7
        ret i32 %result
}

define i32 @main(i32 %argc, i8 **%argv) {
        %retVal = call fastcc i32 @foo(i32 6) alwaysinline
        ret i32 %retVal
}
---

and bumped up the optimization level to O3:

rm -f nil c_defs c_defs.llh *.bc *.s *.o testInline # *.ll
gemini:~/Projects/Nil/nil(0)$ make testInline.s testInline
llvm-as testInline.ll
llc -O3 -f testInline.bc
cc testInline.s -o testInline
rm testInline.bc
gemini:~/Projects/Nil/nil(0)$

which generates

---
        .file   "testInline.bc"

        .section        .gnu.linkonce.t.foo,"ax",@progbits
        .align  16
        .weak   foo
        .type   foo,@function
foo:                                    # @foo
.Leh_func_begin1:
.LBB1_0:
        imull   $7, %edi, %eax
        ret
        .size   foo, .-foo
.Leh_func_end1:

        .text
        .align  16
        .globl  main
        .type   main,@function
main:                                   # @main
.Leh_func_begin2:
.LBB2_0:
        subq    $8, %rsp
.Llabel1:
        movl    $6, %edi
        call    foo
        addq    $8, %rsp
        ret
        .size   main, .-main
.Leh_func_end2:
---

Which only differs in putting foo in a linkonce section instead of in .text and in specifying (what I think is) .weak linkage instead of .globl, so apparently the multiplication was not inlined. There are no other differences in the snipped red-tape, I checked with diff.

What did monkey do wrong?

Dustin

On Jan 8, 2010, at 1:52 PM, Dustin Laurence wrote:

> gemini:~/Projects/Nil/nil(0)$ make testInline.s testInline
> llvm-as testInline.ll
> llc -O3 -f testInline.bc

'llc' is an IR-to-assembly compiler; at -O3 it does some pretty neat machine-code and object-file optimizations, but it does not apply high-level optimizations like CSE or inlining. 'opt' is the tool which does IR-to-IR optimization.

John.

On 01/08/2010 02:10 PM, John McCall wrote:

> 'llc' is an IR-to-assembly compiler; at -O3 it does some pretty neat
> machine-code and object-file optimizations, but it does not apply
> high-level optimizations like CSE or inlining. 'opt' is the tool
> which does IR-to-IR optimization.

A vital clue, but I'm still not getting it:

---
gemini:~/Projects/Nil/nil(0)$ make testInline.optdis.ll
llvm-as testInline.ll
opt -always-inline testInline.bc -o testInline.optbc
llvm-dis -f testInline.optbc -o testInline.optdis.ll
rm testInline.bc testInline.optbc
gemini:~/Projects/Nil/nil(0)$ cat testInline.optdis.ll
; ModuleID = 'testInline.optbc'

define linkonce fastcc i32 @foo(i32 %arg) alwaysinline {
        %result = mul i32 %arg, 7               ; <i32> [#uses=1]
        ret i32 %result
}

define i32 @main(i32 %argc, i8** %argv) {
        %retVal = call fastcc i32 @foo(i32 6) alwaysinline ; <i32> [#uses=1]
        ret i32 %retVal
}
gemini:~/Projects/Nil/nil(0)$
---

Perhaps the -always-inline pass has a prerequisite pass? I also tried it with "-O3 -always-inline", which got halfway there:

---
; ModuleID = 'testInline.optbc'

define linkonce fastcc i32 @foo(i32 %arg) alwaysinline {
        %result = mul i32 %arg, 7               ; <i32> [#uses=1]
        ret i32 %result
}

define i32 @main(i32 %argc, i8** nocapture %argv) {
        %retVal = tail call fastcc i32 @foo(i32 6) alwaysinline ; <i32> [#uses=1]
        ret i32 %retVal
}
---

I'm pleased to get the tailcall optimization, but in this case was looking for the 'no call at all' optimization. :-)

Dustin

Hi Dustin,

> define linkonce fastcc i32 @foo(i32 %arg) alwaysinline

linkonce implies that the function body may change at link time. Thus it would be wrong to inline it, since the code being inlined would not be the final code. Use linkonce_odr to tell the compiler that the function body can be replaced only by an equivalent function body.

Ciao,

Duncan.
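
A minimal sketch of that suggestion, assuming the testInline.ll from earlier in the thread with only the linkage keyword on @foo changed. The IR syntax is the same 2010-era form used above, and the comments are illustrative rather than part of the original test:

---
; testInline.ll -- same test, with linkonce_odr in place of linkonce.
; linkonce_odr states that any body substituted at link time is equivalent
; to this one, so inlining this body is safe.
define linkonce_odr fastcc i32 @foo(i32 %arg) alwaysinline {
        %result = mul i32 %arg, 7
        ret i32 %result
}

define i32 @main(i32 %argc, i8 **%argv) {
        ; With the linkage changed, the inliner is permitted to replace
        ; this call with the body of @foo.
        %retVal = call fastcc i32 @foo(i32 6) alwaysinline
        ret i32 %retVal
}
---

Run through the same "opt -always-inline" step shown earlier in the thread, the expectation is that @main no longer contains a call to @foo. That outcome has not been verified here, and newer LLVM releases spell both the pass option and the pointer types differently.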

Hi Duncan-

Forgive my confusion, but I can't help notice that LangRef states:

    Globals with "linkonce" linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it. Unreferenced linkonce globals are allowed to be discarded.

Why would linkonce be used to implement inline functions if it's not safe to inline linkonce functions?

Alastair

On 9 Jan 2010, at 08:15, Duncan Sands wrote:

> Hi Dustin,
>
>> define linkonce fastcc i32 @foo(i32 %arg) alwaysinline
>
> linkonce implies that the function body may change at link time. Thus it would
> be wrong to inline it, since the code being inlined would not be the final code.
> Use linkonce_odr to tell the compiler that the function body can be replaced
> only by an equivalent function body.
>
> Ciao,
>
> Duncan.