similar to: [LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates"

2013 Feb 15
0
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
Hey Eli, On Thu, Feb 14, 2013 at 5:45 PM, Eli Bendersky <eliben at google.com> wrote: > Hello, > > While investigating one of the existing tests > (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some > interesting code. The IR is very straightforward: > > define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 > %a4) { > entry: >
2013 Feb 15
2
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
>> While investigating one of the existing tests >> (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some >> interesting code. The IR is very straightforward: >> >> define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 >> %a4) { >> entry: >> ret i32 %a3 >> } >> >> define fastcc i32 @tailcaller(i32
2013 Feb 15
0
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
When you enable -tailcallopt you get support for tail calls between functions with arbitrary stack space requirements. That means the calling convention has to change slightly. E.g the callee is responsible for removing it's arguments of the stack. The caller cannot transitively know the tail callee's tailcallee's requirement. Also care must be taken to make sure the stack stays
2013 Feb 15
1
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
Hi Arnold, Thanks for the insights. My comments below: On Thu, Feb 14, 2013 at 5:30 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > When you enable -tailcallopt you get support for tail calls between functions with arbitrary stack space requirements. That means the calling convention has to change slightly. E.g the callee is responsible for removing it's arguments of
2007 Dec 25
3
[LLVMdev] Optimization feasibility
On 25 Dec 2007, at 03:29, Gordon Henriksen wrote: > Hi Jo, > > On 2007-12-24, at 14:43, Joachim Durchholz wrote: > >> I'm in a very preliminary phase of a language project which requires >> some specific optimizations to be reasonably efficient. >> >> LLVM already looks very good; I'd just like to know whether I can >> push these optimizations
2007 Dec 25
0
[LLVMdev] Optimization feasibility
Hi Jo, On 2007-12-24, at 14:43, Joachim Durchholz wrote: > I'm in a very preliminary phase of a language project which requires > some specific optimizations to be reasonably efficient. > > LLVM already looks very good; I'd just like to know whether I can > push these optimizations through LLVM to the JIT phase (which, as > far as I understand the docs, is a
2007 Dec 24
3
[LLVMdev] Optimization feasibility
Hi all, I'm in a very preliminary phase of a language project which requires some specific optimizations to be reasonably efficient. LLVM already looks very good; I'd just like to know whether I can push these optimizations through LLVM to the JIT phase (which, as far as I understand the docs, is a pretty powerful part of LLVM). The optimizations that I need to get to work are: *
2008 Jan 02
0
[LLVMdev] Optimization feasibility
On Dec 25, 2007, at 9:07 AM, Arnold Schwaighofer wrote: > On 25 Dec 2007, at 03:29, Gordon Henriksen wrote: > >> Hi Jo, >> >> On 2007-12-24, at 14:43, Joachim Durchholz wrote: >> >>> I'm in a very preliminary phase of a language project which requires >>> some specific optimizations to be reasonably efficient. >>> >>> LLVM
2007 Aug 09
1
[LLVMdev] Tail call optimization thoughts
Implementing tail call opt could look like the following: 0.)a fast calling convention (maybe use the current CallingConv::Fast, or create a CallingConv::TailCall) 1.) lowering of formal arguments like for example x86_LowerCCCArguments in stdcall mode we need to make sure that later mentioned CALL_CLOBBERED_REG is not used (remove it from available registers in callingconvention for
2007 Aug 11
1
[LLVMdev] Tail call optimization deeds
Okay so i implemented an(other :) initial version for X86-32 backend, this time based on TOT: It is not very generic at the moment. Only functions with callingconv::fastcc and the tail call attribute will be optimized. Maybe the next step should be to integrate the code into the other calling convention lowering. Here is what i have at the moment: If callingconv::fastcc is used the
2007 Aug 13
0
[LLVMdev] Tail call optimization deeds
Hi Arnold and Anton, Sorry I have been ignoring your emails on this topic. It's an important task and I really need sometime to think about it (and talk to Chris about it!) But this has been an especially hectic week. I am also going to vacation soon so I am not sure when I would get around to it. If Chris has time, I am sure he has lots to say on this topic. :-) Otherwise, please
2017 Jun 24
1
musttail & alwaysinline interaction
Consider this program: @globalSideEffect = global i32 0 define void @tobeinlined() #0 { entry: store i32 3, i32* @globalSideEffect, align 4 musttail call fastcc void @tailcallee(i32 3) ret void } define fastcc void @tailcallee(i32 %i) { entry: call void @tobeinlined() ret void } attributes #0 = { alwaysinline } Clearly, if this is processed with opt -alwaysinline, it will lead
2007 Aug 09
4
[LLVMdev] Tail call optimization thoughts
Hello, Arnold. Only quick comments, I'll try to make a full review a little bit later. > 0.)a fast calling convention (maybe use the current > CallingConv::Fast, or create a CallingConv::TailCall) > 1.) lowering of formal arguments > like for example x86_LowerCCCArguments in stdcall mode > we need to make sure that later mentioned CALL_CLOBBERED_REG is >
2007 Sep 24
2
[LLVMdev] RFC: Tail call optimization X86
On 24 Sep 2007, at 09:18, Evan Cheng wrote: > +; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 -stats -info- > output-file - | grep asm-printer | grep 9 > +; change preceeding line form ... | grep 8 to ..| grep 9 since > +; with new fastcc has std call semantics causing a stack adjustment > +; after the function call > > Not sure if I understand this. Can you illustrate
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc, Thanks a lot for reviewing this huge assembly function! silk_warped_autocorrelation_FIX_c()'s kernel part is for( n = 0; n < length; n++ ) { tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS ); /* Loop over allpass sections */ for( i = 0; i < order; i++ ) { /* Output of allpass section */ tmp2_QS = silk_SMLAWB(
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
This is a great idea. But the order (psEncC->shapingLPCOrder) can be configured to 12, 14, 16, 20 and 24 according to complexity parameter. It's hard to get a universal function to handle all these orders efficiently. Any suggestions? Thanks, Linfeng On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > On 06/02/17 02:51 PM,
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc, Thanks for your suggestions. Will get back to you once we have some updates. Linfeng On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > On 06/02/17 07:18 PM, Linfeng Zhang wrote: > > This is a great idea. But the order (psEncC->shapingLPCOrder) can be > > configured to 12, 14, 16, 20 and 24 according to
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
I attached a new patch with small cleanup (disassembly is identical as the last patch). We have done the same internal testing as usual. Also, attached 2 failed temporary versions which try to reduce code size (just for code review reference purpose). The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of 3,228 bytes (with gcc). smaller_slower.c has a code size of 2,304
2016 Aug 29
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Hello everyone, I think I have found an gvn / alias analysis related bug, but before opening an issue on the tracker I wanted to see if I am missing something. I have the following testcase: define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) { > entry: > ; Just some temporary storage > %tmp.0 = alloca i32 > %tmp.1 = alloca i32 > %tmp.i =
2007 Sep 24
0
[LLVMdev] RFC: Tail call optimization X86
Hi Arnold, This is a very good first step! Thanks! Comments below. Evan Index: test/CodeGen/X86/constant-pool-remat-0.ll =================================================================== --- test/CodeGen/X86/constant-pool-remat-0.ll (revision 42247) +++ test/CodeGen/X86/constant-pool-remat-0.ll (working copy) @@ -1,8 +1,10 @@ ; RUN: llvm-as < %s | llc -march=x86-64 | grep LCPI | count 3 ;