thr3ads.net - search: "getelementptring"

Displaying 20 results from an estimated 2665 matches for "getelementptring".

[LLVMdev] bb-vectorizer transforms only part of the block

2015 Jun 22

[LLVMdev] bb-vectorizer transforms only part of the block

The loads, stores and float arithmetic in attached function should be completely vectorizable. The bb-vectorizer does a good job at first, but from instruction %96 on it messes up by adding unnecessary vectorshuffles. (The function was designed so that no shuffle would be needed in order to vectorize it). I tested this with llvm 3.6 with the following command:

[LLVMdev] SLP vectorizer on AVX feature

2015 Jul 01

[LLVMdev] SLP vectorizer on AVX feature

I seem to have problem to get the SLP vectorizer to make use of the full 8 floats available in a SIMD vector on a Sandy Bridge CPU with AVX. The function is attached, the CPU flags are: flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good

[LLVMdev] MCJIT generates MOVAPS on unaligned address

2014 Aug 07

[LLVMdev] MCJIT generates MOVAPS on unaligned address

MCJIT when lowering to x86-64 generates a MOVAPS (Move Aligned Packed Single-Precision Floating-Point Values) on a non-aligned memory address: movaps 88(%rdx), %xmm0 where %rdx comes in as a function argument with only natural alignment (float*). This x86 instruction requires the memory address to be 16 byte aligned which 88 plus something aligned to 4 byte isn't. Here the

[MemorySSA] Potential CachingMemorySSAWalker bug

2016 Apr 29

[MemorySSA] Potential CachingMemorySSAWalker bug

Hi guys, I think I have run into another CachingMemorySSAWalker cache bug. It's a bit tricky to reproduce, so I'd like to start by trying to show you what is happening when running EarlyCSE with my local changes to use MemorySSA. I've attached a debug log that shows that the value returned by getClobberingMemoryAccess(Inst) after a call to removeMemoryAccess is wrong. The

[LLVMdev] How to broaden the SLP vectorizer's search

2014 Aug 07

[LLVMdev] How to broaden the SLP vectorizer's search

On 7 August 2014 17:33, Chad Rosier <mcrosier at codeaurora.org> wrote: > You might consider filing a bug (llvm.org/bugs) requesting a flag, but I > don't know if the code owners want to expose such a flag. I'm not sure that's a good idea as a raw access to that limit, as there are no guarantees that it'll stay the same. But maybe a flag turning some

[LLVMdev] GVN Infinite loop

2011 May 03

[LLVMdev] GVN Infinite loop

Hi, GVN seems to be running in an infinite loop on my example. I have attached the output of one iteration. I cant seem to reduce the testcase either. Any pointers to how to reduce the test case. THanks, Arushi GVN iteration: 8 GVN WIDENED LOAD: %0 = load i8* getelementptr inbounds (%struct.CHESS_POSITION* @search, i64 0, i32 23), align 2, !dbg !875 TO: %1 = load i16* bitcast (i8*

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

2012 Feb 16

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

Something must be wrong, more probable on my side. So the C source code is unchanged, I just did another experiment to first extract all the GEPs in the code, and call AliasAnalysis::alias on each pair of GEPs. Here is the code: AliasAnalysis &AA = getAnalysis<AliasAnalysis>(); TargetData &TD = getAnalysis<TargetData>(); for (Module::iterator it = M.begin();

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

2012 Feb 15

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

Just want to test out the LLVM's AliasAnalysis::getModRefInfo API. The input C code is very simple: void foo(int *a, int *b) { for(int i=0; i<10; i++) b[i] = a[i]*a[i]; } int main() { int a[10]; int b[10]; for(int i=0; i<10; i++) a[i] = i; foo(a,b); return 0; } Obviously, for "foo", it only reads from array "a" and only writes to array

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

2012 Feb 16

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

Thanks Duncan! You are right! If I type the command "opt -basicaa -mypass ...", then the output makes sense. Now, how can you specify which AA to use in the code? Regards, Welson On Thu, Feb 16, 2012 at 12:05 PM, Duncan Sands <baldrick at free.fr> wrote: > Hi Welson, the default alias analysis is -no-aa. As the name suggests it > just returns MayAlias for everything.

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

2012 Feb 16

[LLVMdev] Wrong AliasAnalysis::getModRefInfo result

Hi Welson, the default alias analysis is -no-aa. As the name suggests it just returns MayAlias for everything. Maybe you are using that one? Best wishes, Duncan.

[LLVMdev] GVN Infinite loop

2011 May 04

[LLVMdev] GVN Infinite loop

On May 3, 2011, at 3:25 PM, Arushi Aggarwal wrote: > Hi, > > GVN seems to be running in an infinite loop on my example. I have attached the output of one iteration. I cant seem to reduce the testcase either. > > Any pointers to how to reduce the test case. Bugzilla can reduce testcases that cause infinite loops (it has a -timeout flag), I'd try it. Even if this doesn't

LoopVectorizer -- generating bad and unhandled shufflevector sequence

2016 Oct 06

LoopVectorizer -- generating bad and unhandled shufflevector sequence

Hi, I have experimented with enabling the LoopVectorizer for SystemZ. I have come across a loop which, when vectorized, seems to have been poorly generated. In short, there seems to be a completely unnecessary sequence of shufflevector instructions, that doesn't get optimized away anywhere. In other words, there is a shuffling so that leads back to the original vector: [0 1 2 3

[LLVMdev] [Vectorization] Mis match in code generated

2014 Sep 19

[LLVMdev] [Vectorization] Mis match in code generated

Hi Arnold, Thanks for your reply. I tried test case as suggested by you. *void foo(int *a, int *sum) {*sum = a[0]+a[1]+a[2]+a[3]+a[4]+a[5]+a[6]+a[7]+a[8]+a[9]+a[10]+a[11]+a[12]+a[13]+a[14]+a[15];}* so that it has a 'store' in its IR. *IR before vectorization :*target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128" target triple =

[LLVMdev] [DragonEgg] Mysterious FRAME coming from gimple to LLVM

2012 Jul 31

[LLVMdev] [DragonEgg] Mysterious FRAME coming from gimple to LLVM

Hi Duncan, A DragonEgg/GCC-related question: do you know where these strange FRAME tokens originate from (e.g. %struct.FRAME.matmul)? Compiling simple Fortran code with DragonEgg: > cat matmul.f90 subroutine matmul(nx, ny, nz) implicit none integer :: nx, ny, nz real, dimension(nx, ny) :: A real, dimension(ny, nz) :: B real, dimension(nx, nz) :: C integer :: i, j, k real,

[LLVMdev] LLVM constant propagation optimization question

2011 Oct 18

[LLVMdev] LLVM constant propagation optimization question

Hi all, I'm writting following LLVM assembly: ; ModuleID = 'structaccess.ll' %struct._anon0 = type <{ i32, i32, i32 }> @s = common global %struct._anon0 zeroinitializer define arm_aapcscc void @foo() nounwind { L.entry: store i32 5, i32* getelementptr inbounds (%struct._anon0* @s, i32 0, i32 0) store i32 10, i32* getelementptr inbounds (%struct._anon0* @s, i32 0, i32 1)

[LLVMdev] LiveIntervals analysis problem

2013 Feb 14

[LLVMdev] LiveIntervals analysis problem

Hello everyone, please I need your help. To reproduce my problem I created simple pass for backends (TestPass.cpp in attached files). That pass I call from Mips backend in this way (MipsTargetMachine.cpp): bool MipsPassConfig::addPreRegAlloc() { addPass(createTestPass()); return false; } The problem becomes, when I am trying compile file ldtoa.ll (in attached files). Compiling

[LLVMdev] [Vectorization] Mis match in code generated

2014 Sep 18

[LLVMdev] [Vectorization] Mis match in code generated

Hi Nadav, Thanks for the quick reply !! Ok, so as of now we are lacking capability to handle flat large reductions. I did go through function vectorizeChainsInBlock() (line number 2862). In this function, we try to vectorize if we have phi nodes in the IR (several if's check for phi nodes) i.e we try to construct tree that starts at chains. Any pointers on how to join multiple trees? I

Optimization of successive constant stores

2015 Dec 11

Optimization of successive constant stores

Hmm... found an interesting issue: Given: %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0 store i8 1, i8* %2, align 8 %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1 store i8 2, i8* %3, align 1 %4 = getelementptr inbounds %UodStructType* %0, i32 0, i32 2 store i8 3, i8* %4, align 2 %5 = getelementptr inbounds %UodStructType* %0, i32 0, i32 3

[LLVMdev] [Vectorization] Mis match in code generated

2014 Sep 18

[LLVMdev] [Vectorization] Mis match in code generated

Hi, I am trying to understand LLVM vectorization implementation and was looking into both loop and SLP vectorization. test case 1: *int foo(int *a) {int sum = 0,i;for(i=0; i<16; i++) sum += a[i];return sum;}* This code is vectorized by loop vectorizer where we calculate scalar loop cost as 4 and vector loop cost as 2. Since vector loop cost is less and above reduction is legal to

[LLVMdev] Vectorized LLVM IR

2010 May 29

[LLVMdev] Vectorized LLVM IR

Le 29 mai 2010 à 01:08, Bill Wendling a écrit : > Hi Stéphane, > > The SSE support is the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? > > -bw > > On May 28, 2010, at 12:13 PM, Stéphane Letz wrote: We are actually testing LLVM for the Faust language

search for: getelementptring