thr3ads.net - similar to: "Understanding tail call"

Optimizing assembly generated for tail call

2020 Oct 06

Hello, I recently found that LLVM generates sub-optimal assembly for a tail call optimization case. Below is an example (https://godbolt.org/z/ao15xE):
> void g1();
> void g2();
> void f(bool v) {
>   if (v) {
>     g1();
>   } else {
>     g2();
>   }
> }
The assembly generated is as follows:
> f(bool): # @f(bool)
>   testb %dil, %dil
>   je .LBB0_2
>

Another tail call optimization question

2020 Oct 03

Hello, Could anyone kindly explain to me why the 'g()' in the following function cannot have tail call optimization?
> void f(int* x);
> void g();
> void h(int v) {
>   f(&v);
>   g();
> }
A while ago I was taught that tail call optimization cannot apply if local variables need to be kept alive, but 'g()' doesn't seem to require anything to be
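A minimal sketch of one way the local in the excerpt above can still matter to g(), assuming hypothetical bodies for f and g (the post only declares them, and the globals `saved` and `observed` are invented for illustration). If f stores the pointer it receives and g later reads through it, then v has to stay alive until g returns, so h's stack frame cannot be released before the call to g:

```cpp
// Hypothetical definitions; the original post only declares f and g.
static int* saved = nullptr;  // where this f stashes the pointer it is given
static int observed = 0;      // what this g reads back through that pointer

void f(int* x) { saved = x; }      // the address of h's local escapes here
void g() { observed = *saved; }    // g reads h's local through the escaped pointer

void h(int v) {
    f(&v);
    g();  // v must still be alive here, so h's frame has to outlive this call
}
```

Because f and g are opaque in the original example, a compiler that sees only their declarations has to allow for definitions like these.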

Custom calling convention & ARM target

2019 Jul 17

Hi Tim, Thank you for your reply. Actually, I already played with various target triples including what sys::getProcessTriple() returns when I tried to compile it on a Raspberry Pi 3 device. Yes, changing the triple to armv7-unknown-linux-gnueabi changes the emitted return instruction to 'bx lr'. But this is not the issue. Let me describe it based on an example I prepared to demonstrate

Performance of JIT execution

2020 Sep 04

Hello, I recently noticed a performance issue of JIT execution vs native code of the following simple logic which computes the Fibonacci sequence:
uint64_t fib(int n) {
  if (n <= 2) {
    return 1;
  } else {
    return fib(n-1) + fib(n-2);
  }
}
When compiled natively using clang++ with -O3, it took 0.17s to compute fib(40). However, when executing using LLJIT, fed with the IR output of "clang++

[LLVMdev] Tail call optimization thoughts

2007 Aug 09

Implementing tail call opt could look like the following:
0.) a fast calling convention (maybe use the current CallingConv::Fast, or create a CallingConv::TailCall)
1.) lowering of formal arguments like for example x86_LowerCCCArguments in stdcall mode we need to make sure that later mentioned CALL_CLOBBERED_REG is not used (remove it from available registers in callingconvention for

JIT interaction with linkonce_odr global variables

2020 Aug 07

Hello, I recently hit an issue when JIT'ing my generated IR using llvm::orc::LLJIT. My IR contains the following definition of a global variable:
> $_ZZ23TestStaticVarInFunctionbE1x = comdat any
> @_ZZ23TestStaticVarInFunctionbE1x = linkonce_odr dso_local global i32 123,
> comdat, align 4
And in my host process, there exists the same symbol. I would expect LLJIT to resolve the

[LLVMdev] Destination register needs to be valid after callee saved register restore when tail calling

2007 Aug 08

Hello, Arnold.
> with the sentence i tried to express the question whether there is a
> way to persuade the code generator to use another register to load (or
> move) the function pointer to (right before the callee saved register
> restore) but thinking a little further that's nonsense.
Why don't define some special op for callee address and custom lower it? I really

target-features attribute prevents inlining?

2020 Jun 13

Thank you so much David! After thinking a bit more I agree with you that attempting to add 'target-features' to my functions seem to be the safest approach of all. I noticed that if I mark the clang++ function as 'AlwaysInline', the inlining is performed normally. Is this a potential bug, given what you said that LLVM may accidentally move code using advanced cpu features outside

LLJIT global constants string becomes invalid in generated code

2020 Nov 05

Hi, Recently I hit an issue that LLJIT crashes when CodeGenOpt::Less or higher is given. After investigation, it turned out that the issue is some global constant string in the IR, like
> @.str.117 = private unnamed_addr constant [9 x i8] c"lineitem\00", align 1
becomes an invalid pointer in the generated code.
> $1 = 0xf7fab054 <error: Cannot access memory at address

[LLVMdev] Destination register needs to be valid after callee saved register restore when tail calling

2007 Aug 08

Hello, Arnold.
> Is there a way to indicate that the register the tail call
> instruction uses as destination needs to be valid after the callee
> saved registers have been restored? (some X86InstrInfo.td foo magic
> maybe ?)
It's wrong way to do the things. Because in this case you either violate the ABI for callee, or you're restricted to do tail call lowering only for

target-features attribute prevents inlining?

2020 Jun 13

Hi David, Thanks for your quick response! I now understand the reason that inlining cannot be done on functions with different target-attributes. Thanks for your explanation! However, I think I didn't fully understand your solution; it would be nice if you would like to elaborate a bit more. Here's a bit more info on my current workflow: (1) The clang++ compiler builds C++ source file

[LLVMdev] Destination register needs to be valid after callee saved register restore when tail calling

2007 Aug 08

Hi Anton and Dale first thanks for your answers.
On 8 Aug 2007, at 16:43, Anton Korobeynikov wrote:
> Hello, Arnold.
>
>> Is there a way to indicate that the register the tail call
>> instruction uses as destination needs to be valid after the callee
>> saved registers have been restored? (some X86InstrInfo.td foo magic
>> maybe ?)
> It's wrong way to do the

R.DLL mapping by P/Invoke

2006 Nov 27

After a long processing, I was able to create a version of a small C# class that was able to emulate the rproxy by P/Invoke. This is mostly to find a workaround a performance problem of the StatConnector. It's almost work but ... I have strange memory exception when I call the print function. The variable seems to not survive from one call to the other. As there is no debug symbol for

target-features attribute prevents inlining?

2020 Jun 13

Hello, I'm new to LLVM and I recently hit a weird problem about inlining behavior. I managed to get a minimal repro and the symptom of the issue, but I couldn't understand the root cause or how I should properly handle this issue. Below is an IR code consisting of two functions '_Z2fnP10TestStructi' and 'testfn', with the latter calling the former. One would expect the

[LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets

2015 Jan 11

Hello, find enclosed a first patch for adding tail call optimizations for thumb1 targets. I assume that this list is the right place for publishing patches for review? Since this is my first proposal for LLVM, I'd very much appreciate your feedback. What the patch is meant to do: For Tail calls identified during DAG generation, the target address will be loaded into a register by use

[LLVMdev] Calling Conventions Cont'd

2008 Apr 12

What is the correct procedure for translating a function signature from a high-order language to LLVM? It looks like I replace each struct/array parameter with a 'byval' pointer parameter, and I replace a result struct/array with an 'sret' pointer parameter. The reason I ask is that each calling convention has subtle variations for each architecture and platform. For
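As a rough source-level analogy for the rewrite described in the excerpt above, the sketch below hand-lowers a by-value struct parameter and a struct result into pointer parameters. It is written in C++ rather than LLVM IR. The type `Triple` and all function names are hypothetical, and the exact rules differ per architecture and platform, as the post itself notes.

```cpp
// Hypothetical aggregate type, for illustration only.
struct Triple { long a; long b; long c; };

// Source-level signatures: the aggregate is returned and passed by value.
Triple make_triple(long x) { return Triple{x, x + 1, x + 2}; }
long sum(Triple t) { return t.a + t.b + t.c; }

// Hand-lowered analogue of a struct result: the caller supplies storage
// and the callee writes through the pointer (the role played by 'sret').
void make_triple_lowered(Triple* result, long x) {
    result->a = x;
    result->b = x + 1;
    result->c = x + 2;
}

// Hand-lowered analogue of a by-value struct parameter: the callee gets a
// pointer to a private copy (the role played by 'byval').
long sum_lowered(const Triple* t_copy) {
    return t_copy->a + t_copy->b + t_copy->c;
}

long call_lowered(long x) {
    Triple result;                     // storage for the returned aggregate
    make_triple_lowered(&result, x);
    Triple copy = result;              // explicit copy keeps by-value semantics
    return sum_lowered(&copy);
}
```

In this sketch the copy is written out explicitly; with an actual 'byval' attribute the copy is implied by the call rather than spelled out by the front end.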

Exceptions and performance

2020 Aug 14

On Thu, Aug 13, 2020 at 6:11 PM Haoran Xu <haoranxu510 at gmail.com> wrote:
>
> Thanks for the insights David!
>
> For your first 3 points, is it correct to understand it as following: the external function prototypes are missing reliable information on whether the function throws and what exceptions it may throw (due to C++'s design failures and that it is impractical to

Tail-Loop Folding/Predication

2019 Jul 15

I am looking for feedback to add support for a new loop pragma to Clang/LLVM. With "#pragma tail_predicate" the idea would be to indicate that a loop epilogue/tail can, or should be, folded into the main loop. I see two use cases for this pragma. First, this could be interesting for the vectorizer. It currently supports tail folding by masking all loop instructions/blocks, but does this

[LLVMdev] CDECL Calling Convention

2011 Mar 20

Hello all, I am a beginner of LLVM and I want to add a new backend into LLVM. The calling convention of the target I ported is CDECL. I would like to know whether a CDECL calling convention is already implemented in LLVM. Which CallingConv.td file should I copy and modify for my target? thanks a lot Mitnick

Exceptions and performance

2020 Aug 13

There is a fair amount of dispute and detail here, and real benchmarks can be difficult to write, because you often end up in arguments about whether or not the two styles of coding are equivalent or not. But I agree with Dave--exceptions generally inhibit optimization. One way to think about this is that, generally speaking, the less the compiler can prove about a program, the less aggressive
