Displaying 15 results from an estimated 15 matches for "mergeconsecutivestores".

2015 May 12
2
[LLVMdev] i1 types in MergeConsecutiveStores
Hello LLVM, In DAGCombiner.cpp, MergeConsecutiveStores uses int64_t ElementSizeBytes = MemVT.getSizeInBits()/8; https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10669 which is broken for i1 types where getSizeInBits() == 1. My out-of-tree target hits this case and eventually LLVM asserts in Type.cpp. Is the...
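The arithmetic behind this report can be sketched in isolation. The helper names below are hypothetical and are not LLVM API; they only model the quoted expression `MemVT.getSizeInBits()/8` and one possible round-up alternative, which is an assumption rather than the fix the thread settled on.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; these are not LLVM functions.

// Mirrors the quoted expression: integer division truncates, so any type
// narrower than 8 bits reports a size of zero bytes.
constexpr std::int64_t truncatedSizeInBytes(std::uint64_t SizeInBits) {
  return static_cast<std::int64_t>(SizeInBits / 8);
}

// Assumed alternative: round up to whole bytes, so every non-zero bit width
// occupies at least one byte.
constexpr std::int64_t roundedUpSizeInBytes(std::uint64_t SizeInBits) {
  return static_cast<std::int64_t>((SizeInBits + 7) / 8);
}

// An i1 value has a bit width of 1.
static_assert(truncatedSizeInBytes(1) == 0, "i1 truncates to zero bytes");
static_assert(roundedUpSizeInBytes(1) == 1, "i1 rounds up to one byte");
// For byte-multiple widths the two computations agree.
static_assert(truncatedSizeInBytes(32) == roundedUpSizeInBytes(32),
              "i32 is four bytes either way");
```

A zero element size is a plausible way for later size computations to go wrong, which would be consistent with the assertion the poster describes, though the snippet does not show the exact failure path.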
2020 Mar 19
2
large slowdown in DAGCombiner::MergeConsecutiveStores
Hello all, We are seeing a large compiler performance regression in moving from LLVM 6.0.1 to 8.0.1. We have a long function (~50000 instructions) that used to compile in about a minute but now takes at least an hour. All the time is in MergeConsecutiveStores, I believe due to super-linear behavior in analyzing very long chains of stores. For example, this change makes the problem go away: ``` --- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp @@ -16011,6 +16011,9 @@ bool DAGCombiner::MergeConsecutiv...
2013 Nov 22
2
[LLVMdev] DAGCompiler::MergeConsecutiveStores Question
In DAGCombiner::MergeConsecutiveStores, there is this check: if (Index->getAlignment() != St->getAlignment()) break; Apparently this check ensures that all of the stores have the same alignment. Why is that necessary? This seems very overly restrictive to me. -David
2013 Nov 22
0
[LLVMdev] DAGCompiler::MergeConsecutiveStores Question
Hi David, You are right. This check is overly restrictive. We can replace this check with code that uses the alignment of the first store. Thanks, Nadav On Nov 22, 2013, at 9:31 AM, dag at cray.com wrote: > In DAGCombiner::MergeConsecutiveStores, there is this check: > > if (Index->getAlignment() != St->getAlignment()) > break; > > Apparently this check ensures that all of the stores have the same > alignment. Why is that necessary? This seems very overly restrictive > to me. > >...
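The effect of the check discussed in this thread can be modeled with a small standalone sketch. `StoreInfo`, `countMergeable`, and `mergedAlignment` are hypothetical names, not LLVM types or functions, and the relaxed variant is only one reading of "uses the alignment of the first store".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a store node; not an LLVM type.
struct StoreInfo {
  unsigned OffsetBytes; // offset from a common base pointer
  unsigned SizeBytes;   // number of bytes written
  unsigned Alignment;   // alignment recorded on the store
};

// Counts how many stores, starting from the front, are address-consecutive.
// With RequireEqualAlignment set, this mimics the quoted check and stops at
// the first store whose alignment differs from the first store's.
std::size_t countMergeable(const std::vector<StoreInfo> &Stores,
                           bool RequireEqualAlignment) {
  if (Stores.empty())
    return 0;
  std::size_t Count = 1;
  for (std::size_t I = 1; I < Stores.size(); ++I) {
    const StoreInfo &Prev = Stores[I - 1];
    if (Stores[I].OffsetBytes != Prev.OffsetBytes + Prev.SizeBytes)
      break; // not adjacent in memory
    if (RequireEqualAlignment && Stores[I].Alignment != Stores[0].Alignment)
      break; // the restrictive check being questioned
    ++Count;
  }
  return Count;
}

// Assumption: the merged store begins at the first store's address, so it
// can carry the first store's alignment.
unsigned mergedAlignment(const std::vector<StoreInfo> &Stores) {
  assert(!Stores.empty() && "need at least one store");
  return Stores.front().Alignment;
}
```

For four adjacent one-byte stores with alignments 4, 1, 2, 1, the strict variant stops after the first store, while the relaxed variant accepts all four and reports an alignment of 4 for the merged store.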
2018 May 29
4
My own codegen is 2.5x slower than llc?
My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64. If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000 instructions, which is a pathological case for MergeConsecutiveStores.) If, instead, I dump out the LLVM IR, and manually run both opt and llc on it with -O2, the whole affair takes only 2 minutes. I am using a dynamically linked LLVM library. I hav...
2015 Feb 13
2
[LLVMdev] DAGCombiner::MergeConsecutiveStores
Hi, I'm quite puzzled by a little bit of code in the DAGCombiner where it merges loads in MergeConsecutiveStores. Two 16bit loads have been merged to one 32bit load, and two 16bit stores have been combined to one 32bit store. And then the code goes like this: // Replace one of the loads with the new load. LoadSDNode *Ld = cast<LoadSDNode>(LoadNodes[0].MemNode); DAG.ReplaceAllUsesOfValueWit...
2018 May 29
0
My own codegen is 2.5x slower than llc?
...22:02, David Jones via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64. > > If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000 instructions, which is a pathological case for MergeConsecutiveStores.) > > If, instead, I dump out the LLVM IR, and manually run both opt and llc on it with -O2, the whole affair takes only 2 minutes. > > I am using a dynamically linked...
2018 May 29
0
My own codegen is 2.5x slower than llc?
..., David Jones via llvm-dev < llvm-dev at lists.llvm.org> wrote: > My back-end code generator uses LLVM 5.0.1 to optimize and generate code > for x86_64. > > If I run it on a given sample of IR, it takes almost 5 minutes to generate > object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One > function has a basic block with 14000 instructions, which is a pathological > case for MergeConsecutiveStores.) > > If, instead, I dump out the LLVM IR, and manually run both opt and llc on > it with -O2, the whole affair takes only 2 minutes. > > I am using a dynami...
2013 Jul 27
2
[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
...if (!VectorizableTree.size()) { > - assert(!ExternalUses.size() && "We should not have any external > users"); > + // Don't vectorize tiny trees. Small load/store chains or consecutive > stores > + // of constants will be vectoried in SelectionDAG in > MergeConsecutiveStores. > + if (VectorizableTree.size() < 3) { > + if (!VectorizableTree.size()) { > + assert(!ExternalUses.size() && "We should not have any external > users"); > + } > return 0; > } > > -------------- next part -------------- An HTML at...
2013 Jul 27
0
[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
...; - if (!VectorizableTree.size()) { >> - assert(!ExternalUses.size() && "We should not have any external users"); >> + // Don't vectorize tiny trees. Small load/store chains or consecutive stores >> + // of constants will be vectoried in SelectionDAG in MergeConsecutiveStores. >> + if (VectorizableTree.size() < 3) { >> + if (!VectorizableTree.size()) { >> + assert(!ExternalUses.size() && "We should not have any external users"); >> + } >> return 0; >> } -------------- next part -------------- A...
2013 Jul 27
1
[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
...size()) { >> - assert(!ExternalUses.size() && "We should not have any external >> users"); >> + // Don't vectorize tiny trees. Small load/store chains or consecutive >> stores >> + // of constants will be vectoried in SelectionDAG in >> MergeConsecutiveStores. >> + if (VectorizableTree.size() < 3) { >> + if (!VectorizableTree.size()) { >> + assert(!ExternalUses.size() && "We should not have any external >> users"); >> + } >> return 0; >> } >> >> --------------...
2015 Dec 11
2
Optimization of successive constant stores
...hat all combined instructions have to have the same alignment. Why? On Fri, Dec 11, 2015 at 11:37 AM, Hal Finkel <hfinkel at anl.gov> wrote: > Hi David, > > We generally handle this (early) in the backend where we have more > information about target capabilities and costs. See MergeConsecutiveStores > in lib/CodeGen/SelectionDAG/DAGCombiner.cpp > > -Hal > > ----- Original Message ----- > > From: "David Jones via llvm-dev" <llvm-dev at lists.llvm.org> > > To: llvm-dev at lists.llvm.org > > Sent: Friday, December 11, 2015 10:32:50 AM > > Su...
2015 Dec 11
2
Optimization of successive constant stores
Consider the following: target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %UodStructType = type { i8, i8, i8, i8, i32, i8* } define void @test(%UodStructType*) { %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0 store i8 1, i8* %2, align 8 %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1
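As a rough illustration of what merging successive constant stores means here: the data layout string starts with `e`, so the target is little-endian, and four adjacent `i8` constants could in principle be written as one 32-bit constant. The snippet only shows the first store (`i8 1`); the other three byte values below are assumptions, and `packLittleEndian` is a hypothetical helper, not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical helper: builds the 32-bit value whose little-endian byte
// representation is B0, B1, B2, B3 at increasing addresses.
constexpr std::uint32_t packLittleEndian(std::uint8_t B0, std::uint8_t B1,
                                         std::uint8_t B2, std::uint8_t B3) {
  return static_cast<std::uint32_t>(B0) |
         (static_cast<std::uint32_t>(B1) << 8) |
         (static_cast<std::uint32_t>(B2) << 16) |
         (static_cast<std::uint32_t>(B3) << 24);
}

// Assumed byte values 1, 2, 3, 4 for the four i8 fields of the struct.
static_assert(packLittleEndian(1, 2, 3, 4) == 0x04030201u,
              "lowest-address byte lands in the low bits");
```

Whether the backend actually emits a single 32-bit store for this input depends on the target and on the alignment conditions raised elsewhere in the thread.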
2015 Apr 03
2
[LLVMdev] How to enable use of 64bit load/store for 32bit architecture
> On Apr 2, 2015, at 2:07 PM, Tom Stellard <tom at stellard.net> wrote: > > On Thu, Apr 02, 2015 at 01:35:55PM -0700, Pete Cooper wrote: >> Hi James, Jim >> >> If you *really* want this to work in selection DAG then there is a solution, but it's not pretty. >> >> First make i64 not be legal. Then, assuming the regclass you gave has some subregs, you
2017 Dec 04
2
[RFC] - Deduplication of debug information in linkers (LLD)
...> documented, so well worth a read. > [r319164](http://reviews.llvm.org/rL319164). > > * The new `-stack-size-section` flag causes metadata to be emitted in an > ELF > section with information on function stack sizes. > [r319430](http://reviews.llvm.org/rL319430). > > * MergeConsecutiveStores is now run a second time, just before instruction > selection. This allows lowered intrinsics to be merged as well. > [r319036](http://reviews.llvm.org/rL319036). > > * The MachineVerifier PHI and register operand checking has been improved. > [r319140](http://reviews.llvm.org/rL3191...