thr3ads.net - similar to: "[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?"

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

2013 Oct 28

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

On Oct 27, 2013 2:16 PM, "David Nadlinger" <code at klickverbot.at> wrote: > > The following piece of IR is a fixed point for opt -std-compile-opts/-O3: > > --- > target datalayout = > "e-p:64:64:64-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f16:16:16-f32:32:32-f64:64:64-f128:128:128-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"

[LLVMdev] Labels

2008 Jan 12

[LLVMdev] Labels

I'm attempting to modify a parser generator to emit LLVM code instead of C. So far the experience has been trivial, but I am now running into an error regarding labels that I can't seem to solve. Situation 1: A label is used immediately after a void function call (l6 in this case): <snip> %tmp26 = load i32* @yybegin, align 4 %tmp27 = load i32* @yyend, align 4 call void

[LLVMdev] Hoisting elements of array argument into registers

2010 Nov 07

[LLVMdev] Hoisting elements of array argument into registers

David Peixotto <dmp <at> rice.edu> writes: > I am seeing the wf loop get optimized just fine with llvm 2.8 (and almost as good with head). I rechecked this and am I actually seeing the same results as you. I think I must have made a stupid mistake in my tests before - sorry for the noise. However, I found that we have a phase ordering problem which is preventing us getting as much

[LLVMdev] Unrolling loops into constant-time expressions

2010 Nov 23

[LLVMdev] Unrolling loops into constant-time expressions

Hello, I've come across another example: I'm compiling with clang -S -emit-llvm -std=gnu99 -O3 clang version 2.9 (trunk 118238) Target: x86_64-unknown-linux-gnu Thread model: posix I take the code: int loops(int x) { int ret = 0; for(int i = 0; i < x; i++) { for(int j = 0; j < x; j++) { ret += 1; } } return ret; } and the

[LLVMdev] Missing Optimization Opportunities

2010 Sep 10

[LLVMdev] Missing Optimization Opportunities

Hi, I'm using LLVM 2.7 right now, and I found "opt -std-compile-opts" has missed some opportunities for optimization: define void @spa.main() readonly { entry: %tmp = load i32* @dst-ip ; <i32> [#uses=3] %tmp1 = and i32 %tmp, -16777216 ; <i32> [#uses=1] %tmp2 = icmp eq i32 %tmp1, 167772160 ; <i1> [#uses=2]

[LLVMdev] Hoisting elements of array argument into registers

2010 Nov 06

[LLVMdev] Hoisting elements of array argument into registers

I am seeing the wf loop get optimized just fine with llvm 2.8 (and almost as good with head). I'm running on Mac OS X 10.6. I have an apple supplied llvm-gcc and a self compiled llvm 2.8. When I run $ llvm-gcc -emit-llvm -S M.c $ opt -O2 M.s | llvm-dis I see that: 1. Tail recursion has been eliminated from wf 2. The accesses to sp have been promoted to registers 3. The loop has

[LLVMdev] ARM backend problem ?

2007 Jun 12

[LLVMdev] ARM backend problem ?

Hello, I want to compile a LLVM file into an executable running on ARM platform. I use LLVM 2.0 with the following command lines: llvm-as -f -o test.bc test.ll llc -march=arm -mcpu=arm1136j-s -mattr=+v6 -f -o test.s test.bc arm-linux-gnu-as -mcpu=arm1136j-s test.s With the last command, I obtain the following error: rd and rm should be different in mul The bad instruction is

[LLVMdev] ARM backend problem ?

2007 Jun 12

[LLVMdev] ARM backend problem ?

Hi Mikael, You are obtaining warning, not an error, right? The most arm cores, including arm1136, can execute mul with rd = rm. So, you can ignore this warning. Lauro 2007/6/12, Peltier, Mikael <m-peltier at ti.com>: > > > > > Hello, > > > > I want to compile a LLVM file into an executable running on ARM platform. > > I use LLVM 2.0 with the following

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

2015 Jun 11

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

[+Arnold] > On Jun 10, 2015, at 1:29 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > [+CC Andy] > >> Can anyone familiar with ScalarRevolution tell me whether this is an >> expected behavior or a bug? > > Assuming you're talking about 2*k, this is a bug. ScalarEvolution > should be able to prove that {0,+,4} is <nsw> and

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

2015 Jun 10

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

I am testing vectorization on the following test case: float x[1024], y[1024]; void myloop1() { for (long int k = 0; k < 512; k++) { x[2*k] = x[2*k]+y[k]; } } Vectorization failed due to "unsafe dependent memory operation". I traced the LoopAccessAnalysis.cpp and found the reason is the NoWrapFlag for SCEVAddRecExpr is not set and consequently the

[LLVMdev] spilling & xmm register usage

2010 Sep 29

[LLVMdev] spilling & xmm register usage

On Sep 29, 2010, at 8:35 AMPDT, Ralf Karrenberg wrote: > Hello everybody, > > I have stumbled upon a test case (the attached module is a slightly > reduced version) that shows extremely reduced performance on linux > compared to windows when executed using LLVM's JIT. > > We narrowed the problem down to the actual code being generated, the > source IR on both systems

[LLVMdev] spilling & xmm register usage

2010 Sep 29

[LLVMdev] spilling & xmm register usage

Hello everybody, I have stumbled upon a test case (the attached module is a slightly reduced version) that shows extremely reduced performance on linux compared to windows when executed using LLVM's JIT. We narrowed the problem down to the actual code being generated, the source IR on both systems is the same. Try compiling the attached module: llc -O3 -filetype=asm -o BAD.s BAD.ll Under

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

2015 Jun 11

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

> On Jun 10, 2015, at 6:17 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > I'm not sure if inbounds can be used to prove <nuw>. If an object > %OBJ is allocated at address -1 then "gep inbounds %OBJ 1" is not > poison, but the underlying computation unsigned-overflows. I think that this should yield poison per langref because the signed

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

2013 Oct 29

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

On Mon, Oct 28, 2013 at 10:09 AM, James Courtier-Dutton <james.dutton at gmail.com> wrote: > My guess is that this is a missed optimization, but in real life, all > projects i have worked fix this in the C or C++ code using macros that > change what instructions are used based on target platform and its > endedness. One reason for writing code like this, i.e. explicitly spelling

[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }

2010 Jan 29

[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }

Hey Duncan, hey everybody else, I just stumbled upon a problem in the latest llvm-gcc trunk which is related to my previous problem with the 64bit ABI and structs: Given the following code: struct float3 { float x, y, z; }; extern "C" void __attribute__((noinline)) test(float3 a, float3* res) { res->y = a.y; } int main(void) { float3 a; float3 res; test(a,

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

2013 Oct 29

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

----- Original Message ----- > On Mon, Oct 28, 2013 at 10:09 AM, James Courtier-Dutton > <james.dutton at gmail.com> wrote: > > My guess is that this is a missed optimization, but in real life, > > all > > projects i have worked fix this in the C or C++ code using macros > > that > > change what instructions are used based on target platform and its >

[LLVMdev] llvm-gcc + abi stuff

2008 Jan 24

[LLVMdev] llvm-gcc + abi stuff

<moving this to llvmdev instead of commits> On Jan 22, 2008, at 11:23 PM, Duncan Sands wrote: >> Okay, well we already get many other x86-64 issues wrong already, but >> Evan is chipping away at it. How do you pass an array by value in C? >> Example please, > > I find the x86-64 ABI hard to interpret, but it seems to say that > aggregates are classified

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

2013 Oct 30

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

I wrote up this optimization as an LLVM IR pass last month, actually: https://code.google.com/p/foster/source/browse/compiler/llvm/passes/BitcastLoadRecognizer.cpp It recognizes trees of `or' operations where the leaves are (buf[v+c] << c * sizeof(buf[0])). There are a few improvements needed to make it fit for general consumption; it assumes (without checking) that it's targeting

[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }

2010 Jan 29

[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }

Hi Ralf, > llvm-gcc -c -emit-llvm -O3 produces this: > > %struct.float3 = type { float, float, float } > define void @test(double %a.0, float %a.1, %struct.float3* nocapture > %res) nounwind noinline { > entry: > %tmp8 = bitcast double %a.0 to i64 ; <i64> [#uses=1] > %tmp9 = zext i64 %tmp8 to i96 ; <i96> [#uses=1] >

[LLVMdev] Another memory fun

2008 Jan 06

[LLVMdev] Another memory fun

hm.... I think, that is valid in c but next code too doesn't works right: ; ModuleID = 'sample.lz' @.str1 = internal global [6 x i8] c"world\00" ; <[6 x i8]*> [#uses=1] @.str2 = internal global [7 x i8] c"hello \00" ; <[7 x i8]*> [#uses=1] @.str7 = internal global [7 x i8] c"father\00" ; <[7 x i8]*> [#uses=1]

similar to: [LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?