thr3ads.net - llvm dev - [LLVMdev] Optimization feasibility [Jan 2008]

Arnold Schwaighofer

2007-Dec-25 17:07 UTC

[LLVMdev] Optimization feasibility

On 25 Dec 2007, at 03:29, Gordon Henriksen wrote:
> Hi Jo,
>
> On 2007-12-24, at 14:43, Joachim Durchholz wrote:
>
>> I'm in a very preliminary phase of a language project which requires
>> some specific optimizations to be reasonably efficient.
>>
>> LLVM already looks very good; I'd just like to know whether I can
>> push these optimizations through LLVM to the JIT phase (which, as
>> far as I understand the docs, is a pretty powerful part of LLVM).
>
> Cool.
>
>> The optimizations that I need to get to work are:
>>
>> * Tail call elimination.
> It also supports emitting tail calls on
> x86, but its support is somewhat weak. This is partially mandated by
> calling conventions, but those implementing functional languages might
> be disappointed. Check the llvmdev archives for details.

Hi Joachim,

I am the person to blame for tail call support and its deficiencies
on x86.

The current constraints for tail calls on x86 are:

At most 2 registers are used for argument passing (inreg). Tail call
optimization is performed provided:
  * option tailcallopt is enabled
  * caller/callee are fastcc
  * elf/pic is disabled (this should be the case on Mac OS X?) OR
  * elf/pic is enabled + callee is in the same module as caller + callee
    has visibility protected or hidden
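
Read as a single condition, the list above amounts to: tailcallopt AND both sides fastcc AND (no elf/pic OR (same module AND protected/hidden visibility)). A minimal C++ sketch of that reading follows. It is not the code in the X86 backend; every type and function name in it (`Options`, `CallSite`, `mayTailCall`, and so on) is made up for illustration.

```cpp
// Hypothetical model of the constraints listed above; not LLVM's real API.
enum class CallConv { C, Fast };
enum class Visibility { Default, Hidden, Protected };

struct Options {
    bool tailCallOpt;  // llc was run with -tailcallopt
    bool elfPic;       // ELF target compiled as position-independent code
};

struct CallSite {
    CallConv callerCC;
    CallConv calleeCC;
    bool calleeInSameModule;
    Visibility calleeVisibility;
};

// Returns true when the listed conditions would allow the tail call.
bool mayTailCall(const Options& opts, const CallSite& cs) {
    if (!opts.tailCallOpt) return false;
    if (cs.callerCC != CallConv::Fast || cs.calleeCC != CallConv::Fast)
        return false;
    if (!opts.elfPic) return true;  // non-PIC: no further restriction listed
    // ELF/PIC: callee must be local to the module and not default-visible.
    return cs.calleeInSameModule &&
           (cs.calleeVisibility == Visibility::Hidden ||
            cs.calleeVisibility == Visibility::Protected);
}
```

The two-register limit on argument passing is a separate property of the calling convention and is not modelled here.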


A (pointless) example would be:


<<---tailcall.ll --->>
@.str = internal constant [12 x i8] c"result: %d\0A\00"		; <[12 x i8]*> [#uses=1]

define fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
entry:
         ret i32 %a3
}

define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
entry:
         %tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2 )             ; <i32> [#uses=1]
         ret i32 %tmp11
}



define i32 @main(i32 %argc, i8** %argv) {
entry:
	%argc_addr = alloca i32		; <i32*> [#uses=1]
	%argv_addr = alloca i8**		; <i8***> [#uses=1]
	%retval = alloca i32, align 4		; <i32*> [#uses=2]
	%tmp = alloca i32, align 4		; <i32*> [#uses=2]
	%res = alloca i32, align 4		; <i32*> [#uses=2]
	"alloca point" = bitcast i32 0 to i32		; <i32> [#uses=0]
	store i32 %argc, i32* %argc_addr
	store i8** %argv, i8*** %argv_addr
	%tmp1 = call fastcc i32 @tailcaller( i32 1, i32 2)		; <i32> [#uses=1]
	store i32 %tmp1, i32* %res
	%tmp2 = getelementptr [12 x i8]* @.str, i32 0, i32 0		; <i8*> [#uses=1]
	%tmp3 = load i32* %res		; <i32> [#uses=1]
	%tmp4 = call i32 (i8*, ...)* @printf( i8* %tmp2, i32 %tmp3 )		; <i32> [#uses=0]
	store i32 0, i32* %tmp
	%tmp5 = load i32* %tmp		; <i32> [#uses=1]
	store i32 %tmp5, i32* %retval
	br label %return

return:		; preds = %entry
	%retval6 = load i32* %retval		; <i32> [#uses=1]
	ret i32 %retval6
}

declare i32 @printf(i8*, ...)
<<---tailcall.ll --->>

x86Shell:>  llvm-as < tailcall.ll | llc  -tailcallopt | gcc -x assembler -
x86Shell:> ./a.out
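
For readers who do not read LLVM assembly, here is a rough source-level picture of what tailcall.ll expresses, written in C++ purely as an illustration. C++ has no direct equivalent of `fastcc` or of the `tail` marker, so whether a C++ compiler turns this call into a jump is up to that compiler; the sketch only shows the shape of the call.

```cpp
// Source-level analogue of tailcall.ll (illustrative only).
int tailcallee(int a1, int a2, int a3, int a4) {
    (void)a1; (void)a2; (void)a4;  // unused, as in the IR version
    return a3;
}

int tailcaller(int in1, int in2) {
    // The call is in tail position: its result is returned unchanged,
    // so the caller's frame is not needed once the callee starts.
    return tailcallee(in1, in2, in1, in2);
}
```

With the arguments used in `@main` (1 and 2), the callee's third argument is `%in1`, so the program prints "result: 1". Note that the callee takes four arguments while the caller takes only two.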

If you have any questions regarding tail calls, I would be happy to
help.

regards arnold

Joachim Durchholz

2007-Dec-25 20:47 UTC

[LLVMdev] Optimization feasibility

Hi Arnold,

Arnold Schwaighofer wrote:
> On 25 Dec 2007, at 03:29, Gordon Henriksen wrote:
> 
>> It also supports emitting tail calls on
>> x86, but its support is somewhat weak. This is partially mandated by
>> calling conventions, but those implementing functional languages might
>> be disappointed. Check the llvmdev archives for details.
>>
> Hi Joachim,
> I am  the person to blame for tail call support and its deficiencies  
> on x86.
> 
> The current constraints for tail calls on x86 are:
> 
> Max 2 registers are used for argument passing (inreg).
On a register-starved architecture like the x86, that's not too serious 
of a problem.

> Tail call optimization is performed provided:
> * option tailcallopt is enabled
OK.
> * caller/callee are fastcc
Not sure what that means - I have to dig into the docs yet.
> * elf/pic is disabled (this should be the case on mac os x?) OR
Dunno, I don't have (or want) a Mac. (They are fine machines and all, 
but too expensive for my budget.)
> * elf/pic enabled + callee is in the same module as caller + callee has
>   visibility protected or hidden
Well, I'm not sure whether I will need or want to work with LLVM 
modules, so I don't know whether that will become a problem or not.

The audience I have in mind would mostly use Intel 32-bit; however, it 
would probably be a good idea to cover the 64-bit variants as well, 
since 32-bit will be dead or dying when (if) my project gets traction. 
In other words, I'll probably want to stay away from 
architecture-specific optimizations, unless they are really, really 
important, cover a really, really large part of the user base, and 
really, really don't place architectural constraints on the rest of the 
system.
> if you have got any questions regarding tail call stuff  i would be  
> happy to help
Not at the moment, no. My priorities are more on the frontend side right 
now; after that, I'll see how much interest the thing will generate, and 
*then* I'll start to look into optimization possibilities.
Besides, most of the project is just vaporware in my head right now. I 
wouldn't even know which questions to ask yet.

The one answer I really needed to know was whether the LLVM community is 
interested in this kind of problem. The response has convinced me that 
this is the case, so I'll use LLVM as a foundation.

Regards,
Jo

Evan Cheng

2008-Jan-02 19:23 UTC

[LLVMdev] Optimization feasibility

On Dec 25, 2007, at 9:07 AM, Arnold Schwaighofer wrote:
> On 25 Dec 2007, at 03:29, Gordon Henriksen wrote:
>
>> Hi Jo,
>>
>> On 2007-12-24, at 14:43, Joachim Durchholz wrote:
>>
>>> I'm in a very preliminary phase of a language project which requires
>>> some specific optimizations to be reasonably efficient.
>>>
>>> LLVM already looks very good; I'd just like to know whether I can
>>> push these optimizations through LLVM to the JIT phase (which, as
>>> far as I understand the docs, is a pretty powerful part of LLVM).
>>
>> Cool.
>>
>>> The optimizations that I need to get to work are:
>>>
>>> * Tail call elimination.
>
>> It also supports emitting tail calls on
>> x86, but its support is somewhat weak. This is partially mandated by
>> calling conventions, but those implementing functional languages might
>> be disappointed. Check the llvmdev archives for details.
>>
> Hi Joachim,
> I am  the person to blame for tail call support and its deficiencies
> on x86.
Hi Arnold,

Speaking of tail call support... Do you have any plan to enhance it in  
the near future?

Thanks,

Evan
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Arnold Schwaighofer

2008-Jan-03 15:45 UTC

[LLVMdev] Optimization feasibility

On 2 Jan 2008, at 20:23, Evan Cheng wrote:
> Hi Arnold,
>
> Speaking of tail call support... Do you have any plan to enhance it in
> the near future?
I was thinking about improving the way arguments are lowered (see the
readme.txt in the Target/X86 directory). But at the moment I am quite
busy finishing my studies, so probably not in the near future (e.g.
before the upcoming LLVM 2.2 release). If there is an immediate need,
though, I could be motivated to improve it sooner.

regards arnold
