similar to: Optimization of successive constant stores

Displaying 20 results from an estimated 10000 matches similar to: "Optimization of successive constant stores"

2015 Dec 11
2
Optimization of successive constant stores
Hmm... found an interesting issue: Given: %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0 store i8 1, i8* %2, align 8 %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1 store i8 2, i8* %3, align 1 %4 = getelementptr inbounds %UodStructType* %0, i32 0, i32 2 store i8 3, i8* %4, align 2 %5 = getelementptr inbounds %UodStructType* %0, i32 0, i32 3
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
I found a something that I quite not understand when compiling a common piece of code using the -Os flags. I found it while testing my own backend but then I got deeper and found that at least the x86 is affected as well. This is the referred code: char pp[3]; char *scscx = pp; int tst( char i, char j, char k ) { scscx[0] = i; scscx[1] = j; scscx[2] = k; return 0; } The above gets
2018 Sep 11
2
Byte-wide stores aren't coalesced if interspersed with other stores
Andres: FWIW, codegen will do the merge if you turn on global alias analysis for it "-combiner-global-alias-analysis". That said, we should be able to do this merging earlier. -Nirav On Mon, Sep 10, 2018 at 8:33 PM, Andres Freund via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > On 2018-09-10 13:42:21 -0700, Andres Freund wrote: > > I have, in postres,
2005 Feb 22
0
[LLVMdev] Area for improvement
When I increased COLS to the point where the loop could no longer be unrolled, the selection dag code generator generated effectively the same code as the default X86 code generator. Lots of redundant imul/movl/addl sequences. It can't clean it up either. Only unrolling all nested loops permits it to be optimized away, regardless of code generator. Jeff Cohen wrote: > I noticed
2005 Feb 22
5
[LLVMdev] Area for improvement
I noticed that fourinarow is one of the programs in which LLVM is much slower than GCC, so I decided to take a look and see why that is so. The program has many loops that look like this: #define ROWS 6 #define COLS 7 void init_board(char b[COLS][ROWS+1]) { int i,j; for (i=0;i<COLS;i++) for (j=0;j<ROWS;j++) b[i][j]='.';
2015 Mar 08
2
[LLVMdev] Optimizing out redundant alloca involving byval params
errata: I am on 3.6 full stop. I *thought* there was a 3.7 available, based on the title of http://llvm.org/docs/ ("LLVM 3.7 documentation"). I suppose the docs are ahead of the release schedule? On Sun, Mar 8, 2015 at 9:44 AM Mircea Trofin <mtrofin at google.com> wrote: > Sorry, that phase is part of the PNaCl toolchain. This would be LLVM 3.6, > would your comments still
2015 May 12
2
[LLVMdev] i1 types in MergeConsecutiveStores
Hello LLVM, In DAGCombiner.cpp, MergeConsecutiveStores uses int64_t ElementSizeBytes = MemVT.getSizeInBits()/8; https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10669 which is broken for i1 types where getSizeInBits() == 1. My out-of-tree target hits this case and eventually LLVM asserts in Type.cpp. Is there some reason MergeConsecutiveStores should
2011 May 03
3
[LLVMdev] GVN Infinite loop
Hi, GVN seems to be running in an infinite loop on my example. I have attached the output of one iteration. I cant seem to reduce the testcase either. Any pointers to how to reduce the test case. THanks, Arushi GVN iteration: 8 GVN WIDENED LOAD: %0 = load i8* getelementptr inbounds (%struct.CHESS_POSITION* @search, i64 0, i32 23), align 2, !dbg !875 TO: %1 = load i16* bitcast (i8*
2016 Oct 26
2
RFC: a more detailed design for ThinLTO + vcall CFI
Hi all, As promised, here is a brain dump on how I see CFI for vcalls working under ThinLTO. Most of this has been prototyped, so the design does appear to be sound. For context on how CFI currently works under regular LTO, please read: http://llvm.org/docs/TypeMetadata.html http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html http://clang.llvm.org/docs/LTOVisibility.html ==== Summary
2017 Nov 08
5
Is it ok to allocate > half of address space?
Hi, I was looking into the semantics of GEP inbounds and some BasicAA rules and I'm wondering if it's valid in LLVM IR to allocate more than half of the address space with a global variable or an alloca. If that's a scenario want to consider, then we have problems :) Consider this C code (32 bits): #include <string.h> char obj[0x80000008]; char f() { char *p = obj +
2015 May 03
2
[LLVMdev] Semantics of an Inbounds GetElementPtr
Hi - I've got a question about what optimizations the "inbounds" keyword of "getelementptr" allows you to use. In the code below, %five is loaded from and inbounds offset of either a null pointer or %mem: target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128" define i8 @func(i8* %mem) { %test = icmp eq i8* %mem, null br i1 %test, label %done, label
2011 May 04
0
[LLVMdev] GVN Infinite loop
On May 3, 2011, at 3:25 PM, Arushi Aggarwal wrote: > Hi, > > GVN seems to be running in an infinite loop on my example. I have attached the output of one iteration. I cant seem to reduce the testcase either. > > Any pointers to how to reduce the test case. Bugzilla can reduce testcases that cause infinite loops (it has a -timeout flag), I'd try it. Even if this doesn't
2010 Feb 14
0
[LLVMdev] A very basic doubt about LLVM Alias Analysis
Hi Ambika, > Oh m sorry for that mistake as I had points to in mind. > But still what about the following prog: > > int main() > { > int *i,*j,k; > i=&k; > j=&k; > k=4; > printf("%d,%d,%d",*i,*j,k); > return 0; > } > > > here too i dont get <i,j> alias each other. how are you
2012 Jan 19
4
[LLVMdev] What happened to "malloc" in LLVM 3.0 IR?
Hello folks, I have a compiler written with LLVM 2.6 by a student that produces .ll files, It behaved fine at the time. Trying to take the work over using version 3.0, I run into the problem that "malloc" in the IR is no longer valid: semac1 menu > llvm-as Carre.ll llvm-as: Carre.ll:68:14: error: expected instruction opcode %_malloc = malloc i8, i32 %2 ;
2012 Jul 31
3
[LLVMdev] [DragonEgg] Mysterious FRAME coming from gimple to LLVM
Hi Duncan, A DragonEgg/GCC-related question: do you know where these strange FRAME tokens originate from (e.g. %struct.FRAME.matmul)? Compiling simple Fortran code with DragonEgg: > cat matmul.f90 subroutine matmul(nx, ny, nz) implicit none integer :: nx, ny, nz real, dimension(nx, ny) :: A real, dimension(ny, nz) :: B real, dimension(nx, nz) :: C integer :: i, j, k real,
2012 Jan 26
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, Jan 26, 2012 at 3:41 PM, Hal Finkel <hfinkel at anl.gov> wrote: > On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote: >> arm-none-linux-gnueabi > > Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get Minor remark: please use -target instead of -ccc-host-triple that is now deprecated. Thanks for looking at this testcase. Sebastian -- Qualcomm
2012 Jan 26
2
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote: > arm-none-linux-gnueabi Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get vectorization (even though I don't get vectorization when targeting x86_64). I'll let you know what I find. -Hal -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory
2018 May 29
4
My own codegen is 2.5x slower than llc?
My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64. If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000 instructions, which is a pathological case for MergeConsecutiveStores.) If, instead, I dump out the LLVM IR, and manually
2011 Sep 15
2
[LLVMdev] problem with sgt's on Sparc machine
Hi guys, Thanks for the input. However, it seems that the code still produces the wrong output on a Sparc machine. My current llvm_print.bc code is: -------------------------------------------------------------------------------------------------- ; MduleID = '<stdin>' target datalayout =
2011 Sep 15
2
[LLVMdev] problem with sgt's on Sparc machine
Hi, I compiled the following code on a Sparc machine, basically it produce different results than a X86 machine. ------------------------------------------------------------------------------------------------------------------- ; MduleID = '<stdin>' target datalayout =