similar to: [LLVMdev] GCC vs. LLVM difference on simple code example

Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] GCC vs. LLVM difference on simple code example"

2013 Sep 04
6
[LLVMdev] Aliasing of volatile and non-volatile
A customer has reported a performance problem, which I have eventually tracked down to the following situation: Consider this program: int foo(int *p, volatile int *q, int n) { int i, s = 0; for (i = 0; i < n; ++i) s += *p + *q; return s; } LLVM's analysis indicates that *p and *q can alias, even though *p is non-volatile whereas *q is volatile. I don't have the
2012 Jan 04
1
[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?
I write a small function and test it under clang and gcc, filet test.c: double X[100]; double Y[100]; double DA = 0.3; int f() { int i; for (i = 0; i < 100; i++) Y[i] = Y[i] - DA * X[i]; return 0; } clang -S -O3 -o test.s test.c -march=native -ccc-echo result: "D:/work/trunk/bin/Release/clang.exe" -cc1 -triple i686-pc-win32 -S -disable-fr e -disable-llvm-verifier
2013 Sep 07
0
[LLVMdev] Aliasing of volatile and non-volatile
Are you sure this is an alias problem? What is happening is LLVM is leaving the code looking like this: int foo(int *p, volatile int *q, int n) { int i, s = 0; for (i = 0; i < n; ++i) s += *p + *q; return s; } but GCC is changing to code to look like this: int foo(int *p, volatile int *q, int n) { int i, s = 0; int t; t = *p; for (i = 0; i < n; ++i) s += t + *q;
2011 Dec 14
2
[LLVMdev] Failure to optimize ? operator
I don't understand your point. Which version is better does NOT depend on what inputs are passed to the function. The compiled code for (as per llvm) f1 will always take less time to execute than f2. for x > 0 => T(f1) < T(f2) for x <= 0 => T(f1) = T(f2) where T() is the time to execute the given function. So always T(f1) <= T(f2). I would call this a missed
2017 Aug 21
3
DragonEgg for GCC v8.x and LLVM v6.x is just able to work
Hi LLVM and GCC developers, My sincere thanks will goto: * Duncan, the core developer of llvm-gcc and dragonegg http://llvm.org/devmtg/2009-10/Sands_LLVMGCCPlugin.pdf * David, the innovator and developer of GCC https://dmalcolm.fedorapeople.org/gcc/global-state/requirements.html and others who give me kind response for teaching me patiently and carefully about how to migrate GCC v4.8.x to
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
I found a something that I quite not understand when compiling a common piece of code using the -Os flags. I found it while testing my own backend but then I got deeper and found that at least the x86 is affected as well. This is the referred code: char pp[3]; char *scscx = pp; int tst( char i, char j, char k ) { scscx[0] = i; scscx[1] = j; scscx[2] = k; return 0; } The above gets
2012 Mar 20
0
[LLVMdev] Runtime linker issue wtih X11R6 on i386 with -O3 optimization
I was told that my writeup lacked an example and details so I reproduced the code that X uses and I was able to boil down the issue to a couple of lines of code. Sorry again for the length of this email. Code was compiled on OpenBSD with clang 3.0-release. ======================================================================== With -O0 which works as X expects:
2012 Mar 28
2
[LLVMdev] Suboptimal code due to excessive spilling
Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after
2012 Apr 05
0
[LLVMdev] Suboptimal code due to excessive spilling
I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: den 28 mars 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior
2020 Aug 05
10
[RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data
Greetings, We present “Machine Function Splitter”, a codegen optimization pass which splits functions into hot and cold parts. This pass leverages the basic block sections feature recently introduced in LLVM from the Propeller project. The pass targets functions with profile coverage, identifies cold blocks and moves them to a separate section. The linker groups all cold blocks across functions
2016 Jun 30
4
Help required regarding IPRA and Local Function optimization
Hello Mentors, I am currently finding bug in Local Function related optimization due to which runtime failures are observed in some test cases, as those test cases are containing very large function with recursion and object oriented code so I am not able to find a pattern which is causing failure. So I tried following simple case to understand expected behavior from this optimization. Consider
2020 Aug 05
3
[RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data
On Tue, Aug 4, 2020 at 10:51 PM aditya kumar <hiraditya at gmail.com> wrote: > Glad to hear that there is an interest in a function splitting pass. There > are advantages to splitting functions at different stages as you've already > noted. > Right -- with slightly different objectives. Machine Function Splitting Pass's main focus is on performance improvement. > -
2006 May 25
2
Compilation issues with s390
Hi all, I'm trying to compile asterisk on the mainframe (s390 / s390x) and I am running into issues. I was wondering if somebody could give a hand? I'm thinking that I should be able to do this. I have noticed that Debian even has binary RPM's out for Asterisk now. I'm trying to do this on SuSE SLES8 (with the 2.4 kernel). What I see is, an issue that arch=s390 isn't
2014 Dec 21
5
[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences
Hello all, In r223757 I've committed a patch that performs, for the 32-bit x86 calling convention, the transformation of MOV instructions that push function arguments onto the stack into actual PUSH instructions. For example, it will transform this: subl $16, %esp movl $4, 12(%esp) movl $3, 8(%esp) movl $2, 4(%esp) movl $1, (%esp) calll _func addl $16, %esp
2012 Dec 01
0
[LLVMdev] operator overloading fails while debugging with gdb for i386
Problem seems not only with operator overloading, It occurs with struct value returning also. gdb while debugging expects the return value in eax, gcc does returns in eax, But Clang returns in edx(it can be checked in gdb by printing the contents of edx). Code(sample code) struct A1 { int x; int y; }; A1 sum(const A1 one, const A1 two) { A1 plus = {0,0}; plus.x = one.x + two.x; plus.y
2014 Dec 21
2
[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences
Which performance guidelines are you referring to? I'm not that familiar with decade-old CPUs, but to the best of my knowledge, this is not true on current hardware. There is one specific circumstance where PUSHes should be avoided - for Atom/Silvermont processors, the memory form of PUSH is inefficient, so the register-freeing optimization below may not be profitable (see 14.3.3.6 and
2011 Dec 14
0
[LLVMdev] Failure to optimize ? operator
On Tue, Dec 13, 2011 at 5:59 AM, Brent Walker <brenthwalker at gmail.com> wrote: > The following seemingly identical functions, get compiled to quite > different machine code.  The first is correctly optimized (the > computation of var y is nicely moved into the else branch of the "if" > statement), which the second one is not (the full computation of var y > is
2016 Apr 04
2
How to call an (x86) cleanup/catchpad funclet
I've modified llvm to emit vc++ compatible SEH structures for my personality on x86/Windows and my handler works fine, but the only thing I can't figure out is how to call these funclets, they look like: Catch: "?catch$3@?0?m3 at 4HA": LBB4_3: # %BasicBlock26 pushl %ebp pushl %eax addl $12, %ebp movl %esp, -28(%ebp) movl $LBB4_5, %eax
2018 Sep 20
3
Comparing Clang and GCC: only clang stores updated value in each iteration.
Hi, I have a benchmark (mcf) that is currently slower when compiled with clang compared to gcc 8 (~10%). It seems that a hot loop has a few differences, where one interesting one is that while clang stores an incremented value in each iteration, gcc waits and just stores the final value just once after the loop. The value is a global variable. I wonder if this is something clang does not do
2012 Dec 01
2
[LLVMdev] operator overloading fails while debugging with gdb for i386
Hi, Structures are passed by pointer, so the return value is not actually in eax. That code gets transformed into something like: void sum(A1 *out, const A1 one, const A1 two) { out->x = one.x + two.x out->y = one.y + two.y } So actually the function ends up returning void and operating on a hidden parameter, so %eax is dead at the end of the function and should not be being relied