thr3ads.net - similar to: "[LLVMdev] Need a clue to improve the optimization of some C code"

Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Need a clue to improve the optimization of some C code"

[LLVMdev] Need a clue to improve the optimization of some C code

2015 Mar 03

On 03.03.2015 at 19:49, Philip Reames <listmail at philipreames.com> wrote: Hi Philip, first, thanks for your response. > You'll need to provide a bit more information to get any useful response. Questions: > 1) What's your use case? Are you using clang to compile C code? Are you manually generating LLVM IR? Yes, the "inline function C code" will be compiled

Expected constant simplification not happening

2016 Feb 11

Hi, the appended IR code does not optimize to my liking :) This is the interesting part in x86_64, produced via clang -Os:
---
    movq    -16(%r12), %rax
    movl    -4(%rax), %ecx
    andl    $2298949, %ecx          ## imm = 0x231445
    cmpq    $2298949, (%rax,%rcx)   ## imm = 0x231445
    leaq    8(%rax,%rcx), %rax
    cmovneq %r15, %rax
    movl    $2298949, %esi          ## imm = 0x231445
    movq    %r12, %rdi
    movq    %r14,

Expected constant simplification not happening

2016 Dec 07

Hello, has there been any progress on this topic? The 3.9 optimizer output is still the same, as I just checked. https://llvm.org/bugs/show_bug.cgi?id=24448 Ciao Nat! Sanjay Patel wrote: > [cc'ing Zia] > > We have this transform with -Os for some cases after: > http://reviews.llvm.org/rL244601 > http://reviews.llvm.org/D11363 > > but something in this example is

A code layout related side-effect introduced by rL318299

2017 Dec 19

Hi, Recently a 10% performance regression on an important benchmark showed up after we integrated https://reviews.llvm.org/rL318299. The analysis showed that rL318299 triggered loop rotation on a multi-exit loop, and the loop rotation introduced a code layout issue. The performance regression is a side-effect of rL318299. I have attached two testcases, a.ll and b.ll, to illustrate the problem. a.ll

A code layout related side-effect introduced by rL318299

2017 Dec 19

On Mon, Dec 18, 2017 at 5:46 PM Xinliang David Li <davidxl at google.com> wrote: > The introduction of cleanup.cond block in b.ll without loop-rotation > already makes the layout worse than a.ll. > > > Without introducing cleanup.cond block, the layout is > > entry->while.cond -> while.body->ret > > All the arrows are hot fall through edges which is

[LLVMdev] the clang 3.5 loop optimizer seems to jump in unintentional for simple loops

2014 Jul 23

the clang 3.5 loop optimizer seems to jump in unintentional for simple loops
the very simple example
----
const int SIZE = 3;
int the_func(int* p_array)
{
  int dummy = 0;
#if defined(ITER)
  for(int* p = &p_array[0]; p < &p_array[SIZE]; ++p) dummy += *p;
#else
  for(int i = 0; i < SIZE; ++i) dummy += p_array[i];
#endif
  return dummy;
}
int main(int argc, char** argv) {

Nowaday Scalar Evolution's Problem.

2017 Nov 20

The Problem? Nowadays SCEV ("Scalar Evolution") only evaluates instructions that have predictable, constant-based operands, i.e. ones that can be evaluated as a constant. Otherwise we cannot evaluate it as a SCEV node, and it is evaluated as SCEVUnknown. The important thing to remember is that we do not use SCEV only for Loop Deletion, which isn't really needed on natural loops

Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)

2018 Nov 06

Hi @ll, while clang/LLVM recognizes common bit-twiddling idioms/expressions like unsigned int rotate(unsigned int x, unsigned int n) { return (x << n) | (x >> (32 - n)); } and typically generates "rotate" machine instructions for this expression, it fails to recognize other also common bit-twiddling idioms/expressions. The standard IEEE CRC-32 for "big
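
A side note on the rotate idiom quoted above: as written, the expression shifts by 32 when n is 0, which is undefined behavior in C for a 32-bit unsigned int. The following is a minimal sketch of a variant that masks the shift counts. The name rotl32 is hypothetical and does not come from the thread, and no claim is made here about which instructions a particular clang/LLVM version emits for it.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): rotate x left by n bits.
   Both shift counts are masked into the range 0..31, so n == 0 and
   n >= 32 never produce an out-of-range shift. */
static inline uint32_t rotl32(uint32_t x, unsigned int n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

With this form, rotl32(x, 0) and rotl32(x, 32) both return x unchanged, and a rotate count of 33 behaves like a count of 1.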

[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions

2011 Feb 18

On Feb 17, 2011, at 10:35 PM, Вадим Марковцев wrote: > Hello everyone, > > I've added the "S" suffixed versions of ARM and Thumb2 instructions to tablegen. Those are, for example, "movs" or "muls". > Of course, some instructions have already had their twins, such as add/adds, and I left them untouched. Adding separate "s" instructions is

LLVM Loop vectorizer - 2 vector.body blocks appear

2016 Aug 01

Hello. Mikhail, with the more recent version of the LoopVectorize.cpp code (retrieved at the beginning of July 2016) I ran the following piece of C code: void foo(long *A, long *B, long *C, long N) { for (long i = 0; i < N; ++i) { C[i] = A[i] + B[i]; } } The vectorized LLVM program I obtain contains 2 vector.body blocks - one named

[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions

2011 Feb 18

Hello everyone, I've added the "S" suffixed versions of ARM and Thumb2 instructions to tablegen. Those are, for example, "movs" or "muls". Of course, some instructions have already had their twins, such as add/adds, and I left them untouched. Besides, I propose the codegen optimization based on them, which removes the redundant comparison in patterns like orr

[LLVMdev] LICM promoting memory to scalar

2014 Sep 02

All, If we can speculatively execute a load instruction, why isn’t it safe to hoist it out by promoting it to a scalar in LICM pass? There is a comment in LICM pass that if a load/store is conditional then it is not safe because it would break the LLVM concurrency model (See commit 73bfa4a). It has an IR test for checking this in test/Transforms/LICM/scalar-promote-memmodel.ll However, I have

Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)

2018 Nov 27

"Sanjay Patel" <spatel at rotateright.com> wrote: > IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like > this: > unsigned int foo(unsigned int crc) { > if (crc & 0x80000000) > crc <<= 1, crc ^= 0xEDB88320; > else > crc <<= 1; > return crc; > } To document this for x86 too: rewrite the function

[LLVMdev] LICM promoting memory to scalar

2014 Sep 02

I think gcc is right. It inserted a branch for n == 0 (the cbz at the top), so that's not a problem. In all other regards, this is safe: if you examine the sequence of loads and stores, it eliminated all but the first load and all but the last store. How's that unsafe? If I had to guess, the bug here is that LLVM doesn't want to hoist the load over the condition (which it is right

RFC: Speculative Load Hardening (a Spectre variant #1 mitigation)

2018 Mar 23

Hello all, I've been working for the last month or so on a comprehensive mitigation approach to variant #1 of Spectre. There are a bunch of reasons why this is desirable: - Critical software that is unlikely to be easily hand-mitigated (or where the performance tradeoff isn't worth it) will have a compelling option. - It gives us a baseline on performance for hand-mitigation. - Combined

[LLVMdev] LICM promoting memory to scalar

2014 Sep 03

Thanks for the background on the concurrent memory model. So, is it sufficient that the loop entry is guarded by a condition (cbz at top) to prevent the race? The loop entry will be guarded by a condition if the loop has been rotated by the loop rotate pass. Since LICM runs after loop rotate, we can use ScalarEvolution::isLoopEntryGuardedByCond to check if we can speculatively execute load without

[LLVMdev] equivalent IR, different asm

2010 Sep 01

The attached .ll files seem equivalent, but the resulting asm from 'opt-fail.ll' causes a crash to webkit. I suspect the usage of registers is wrong, can someone take a look?
$ llc opt-pass.ll -o -
    .section __TEXT,__text,regular,pure_instructions
    .globl __ZN7WebCore6kolos1ERiS0_PKNS_20RenderBoxModelObjectEPNS_10StyleImageE
    .align 4, 0x90

[LLVMdev] equivalent IR, different asm

2010 Sep 01

On Sep 1, 2010, at 6:25 AM, Argyrios Kyrtzidis wrote: > The attached .ll files seem equivalent, but the resulting asm from 'opt-fail.ll' causes a crash to webkit. > I suspect the usage of registers is wrong, can someone take a look ? The difference is that there is a shift right after the multiply, before the divide. In IR, the difference is: %5 = mul nsw i32 %4, %tmp1

[LLVMdev] [cfe-dev] Code generation for noexcept functions

2014 May 11

On Sun, May 11, 2014 at 8:19 AM, Stephan Tolksdorf <st at quanttec.com> wrote: > Hi, > > When clang/LLVM can't prove that a noexcept function only contains > non-throwing code, it seems to insert an explicit exception handler that > calls std::terminate. Why doesn't clang leave it to the eh personality > function to call std::terminate when an exception is thrown

Help required regarding IPRA and Local Function optimization

2016 Jun 30

Hello Mentors, I am currently tracking down a bug in the Local Function related optimization that causes runtime failures in some test cases. Because those test cases contain very large functions with recursion and object-oriented code, I have not been able to find the pattern that causes the failure. So I tried the following simple case to understand the expected behavior of this optimization. Consider
