thr3ads.net - search: "mergeconsecutivestores"

[LLVMdev] i1 types in MergeConsecutiveStores

2015 May 12

2

Hello LLVM, In DAGCombiner.cpp, MergeConsecutiveStores uses int64_t ElementSizeBytes = MemVT.getSizeInBits()/8; https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10669 which is broken for i1 types where getSizeInBits() == 1. My out-of-tree target hits this case and eventually LLVM asserts in Type.cpp. Is the...
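The arithmetic behind this report can be checked in isolation. The following is a minimal standalone sketch, not LLVM code: the helper names `truncatedSizeInBytes` and `roundedUpSizeInBytes` are hypothetical, and rounding up to whole bytes is only one possible way to avoid a zero element size, not necessarily the fix that was adopted upstream.

```cpp
#include <cassert>
#include <cstdint>

// Truncating division, mirroring the quoted expression: bits / 8.
// For a 1-bit type this yields 0 bytes.
constexpr int64_t truncatedSizeInBytes(int64_t sizeInBits) {
  return sizeInBits / 8;
}

// Hypothetical alternative: round up to whole bytes, so that any
// non-empty type occupies at least one byte.
constexpr int64_t roundedUpSizeInBytes(int64_t sizeInBits) {
  return (sizeInBits + 7) / 8;
}

// Runtime checks for the cases discussed in the post.
inline void checkElementSizes() {
  assert(truncatedSizeInBytes(1) == 0);   // i1: element size collapses to 0
  assert(roundedUpSizeInBytes(1) == 1);   // i1: one byte after rounding up
  assert(truncatedSizeInBytes(32) == 4);  // i32: both forms agree
  assert(roundedUpSizeInBytes(32) == 4);
}
```

Any later computation that divides by, or strides by, the element size would see 0 in the truncating form, which is consistent with the kind of failure the post describes.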

large slowdown in DAGCombiner::MergeConsecutiveStores

2020 Mar 19

2

Hello all, We are seeing a large compiler performance regression in moving from LLVM 6.0.1 to 8.0.1. We have a long function (~50000 instructions) that used to compile in about a minute but now takes at least an hour. All the time is in MergeConsecutiveStores, I believe due to super-linear behavior in analyzing very long chains of stores. For example, this change makes the problem go away:
```
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -16011,6 +16011,9 @@ bool DAGCombiner::MergeConsecutiv...
```

[LLVMdev] DAGCompiler::MergeConsecutiveStores Question

2013 Nov 22

2

In DAGCombiner::MergeConsecutiveStores, there is this check: if (Index->getAlignment() != St->getAlignment()) break; Apparently this check ensures that all of the stores have the same alignment. Why is that necessary? This seems very overly restrictive to me. -David

[LLVMdev] DAGCompiler::MergeConsecutiveStores Question

2013 Nov 22

0

Hi David, You are right. This check is overly restrictive. We can replace this check with code that uses the alignment of the first store. Thanks, Nadav

On Nov 22, 2013, at 9:31 AM, dag at cray.com wrote:
> In DAGCombiner::MergeConsecutiveStores, there is this check:
>
> if (Index->getAlignment() != St->getAlignment())
>   break;
>
> Apparently this check ensures that all of the stores have the same
> alignment. Why is that necessary? This seems very overly restrictive
> to me.
>
>...
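The difference between the two rules discussed in this thread can be sketched with a simplified model. This is not the DAGCombiner implementation: `StoreInfo`, `mergeablePrefixStrict`, `mergeablePrefixRelaxed` and `mergedAlignment` are hypothetical names, and the model ignores everything else a real combiner must check (aliasing, legality of the wide type, target cost).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified description of one store (not an LLVM type).
struct StoreInfo {
  int64_t offset;     // byte offset from a common base pointer
  unsigned sizeBytes; // number of bytes written
  unsigned align;     // known alignment of the store address
};

// True if `next` begins exactly where `prev` ends.
inline bool isContiguous(const StoreInfo &prev, const StoreInfo &next) {
  return next.offset == prev.offset + static_cast<int64_t>(prev.sizeBytes);
}

// Strict rule: stop at the first store whose alignment differs from the
// alignment of the first store in the run.
inline std::size_t mergeablePrefixStrict(const std::vector<StoreInfo> &stores) {
  if (stores.empty()) return 0;
  std::size_t n = 1;
  while (n < stores.size() && isContiguous(stores[n - 1], stores[n]) &&
         stores[n].align == stores[0].align)
    ++n;
  return n;
}

// Relaxed rule: only contiguity matters when collecting candidates.
inline std::size_t mergeablePrefixRelaxed(const std::vector<StoreInfo> &stores) {
  if (stores.empty()) return 0;
  std::size_t n = 1;
  while (n < stores.size() && isContiguous(stores[n - 1], stores[n]))
    ++n;
  return n;
}

// The merged store starts at the first store's address, so the alignment
// known for it is the alignment of the first store (0 if there is none).
inline unsigned mergedAlignment(const std::vector<StoreInfo> &stores) {
  return stores.empty() ? 0 : stores[0].align;
}
```

For four one-byte stores at offsets 0 to 3 with alignments 4, 1, 2, 1, the strict rule stops after the first store, while the relaxed rule keeps all four and reports alignment 4 for the merged store.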

My own codegen is 2.5x slower than llc?

2018 May 29

4

My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64. If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000 instructions, which is a pathological case for MergeConsecutiveStores.) If, instead, I dump out the LLVM IR, and manually run both opt and llc on it with -O2, the whole affair takes only 2 minutes. I am using a dynamically linked LLVM library. I hav...

[LLVMdev] DAGCombiner::MergeConsecutiveStores

2015 Feb 13

2

Hi, I'm quite puzzled by a little bit of code in the DAGCombiner where it merges loads in MergeConsecutiveStores. Two 16bit loads have been merged to one 32bit load, and two 16bit stores have been combined to one 32bit store. And then the code goes like this:

// Replace one of the loads with the new load.
LoadSDNode *Ld = cast<LoadSDNode>(LoadNodes[0].MemNode);
DAG.ReplaceAllUsesOfValueWit...

My own codegen is 2.5x slower than llc?

2018 May 29

0

...22:02, David Jones via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64.
>
> If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000 instructions, which is a pathological case for MergeConsecutiveStores.)
>
> If, instead, I dump out the LLVM IR, and manually run both opt and llc on it with -O2, the whole affair takes only 2 minutes.
>
> I am using a dynamically linked...

My own codegen is 2.5x slower than llc?

2018 May 29

0

..., David Jones via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> My back-end code generator uses LLVM 5.0.1 to optimize and generate code
> for x86_64.
>
> If I run it on a given sample of IR, it takes almost 5 minutes to generate
> object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One
> function has a basic block with 14000 instructions, which is a pathological
> case for MergeConsecutiveStores.)
>
> If, instead, I dump out the LLVM IR, and manually run both opt and llc on
> it with -O2, the whole affair takes only 2 minutes.
>
> I am using a dynami...

[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.

2013 Jul 27

2

...if (!VectorizableTree.size()) {
> - assert(!ExternalUses.size() && "We should not have any external users");
> + // Don't vectorize tiny trees. Small load/store chains or consecutive stores
> + // of constants will be vectoried in SelectionDAG in MergeConsecutiveStores.
> + if (VectorizableTree.size() < 3) {
> + if (!VectorizableTree.size()) {
> + assert(!ExternalUses.size() && "We should not have any external users");
> + }
> return 0;
> }
>
> -------------- next part -------------- An HTML at...

[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.

2013 Jul 27

0

...;
>> - if (!VectorizableTree.size()) {
>> - assert(!ExternalUses.size() && "We should not have any external users");
>> + // Don't vectorize tiny trees. Small load/store chains or consecutive stores
>> + // of constants will be vectoried in SelectionDAG in MergeConsecutiveStores.
>> + if (VectorizableTree.size() < 3) {
>> + if (!VectorizableTree.size()) {
>> + assert(!ExternalUses.size() && "We should not have any external users");
>> + }
>> return 0;
>> }
-------------- next part -------------- A...

[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.

2013 Jul 27

1

...size()) {
>> - assert(!ExternalUses.size() && "We should not have any external users");
>> + // Don't vectorize tiny trees. Small load/store chains or consecutive stores
>> + // of constants will be vectoried in SelectionDAG in MergeConsecutiveStores.
>> + if (VectorizableTree.size() < 3) {
>> + if (!VectorizableTree.size()) {
>> + assert(!ExternalUses.size() && "We should not have any external users");
>> + }
>> return 0;
>> }
>>
>> --------------...

Optimization of successive constant stores

2015 Dec 11

2

...hat all combined instructions have to have the same alignment. Why?

On Fri, Dec 11, 2015 at 11:37 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi David,
>
> We generally handle this (early) in the backend where we have more
> information about target capabilities and costs. See MergeConsecutiveStores
> in lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>
> -Hal
>
> ----- Original Message -----
> > From: "David Jones via llvm-dev" <llvm-dev at lists.llvm.org>
> > To: llvm-dev at lists.llvm.org
> > Sent: Friday, December 11, 2015 10:32:50 AM
> > Su...

Optimization of successive constant stores

2015 Dec 11

2

Consider the following:

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%UodStructType = type { i8, i8, i8, i8, i32, i8* }

define void @test(%UodStructType*) {
  %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0
  store i8 1, i8* %2, align 8
  %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1
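The idea of replacing several adjacent constant byte stores with one wider store can be illustrated outside LLVM. The sketch below assumes a little-endian layout, as the `e-` prefix in the datalayout above indicates. The byte values 1, 2, 3, 4 used in the usage note are made up for illustration, since the snippet is cut off after the first store, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Pack four byte constants into the 32-bit value whose little-endian
// memory image equals the four bytes stored at increasing addresses.
constexpr uint32_t packLittleEndian(uint8_t b0, uint8_t b1,
                                    uint8_t b2, uint8_t b3) {
  return static_cast<uint32_t>(b0) |
         (static_cast<uint32_t>(b1) << 8) |
         (static_cast<uint32_t>(b2) << 16) |
         (static_cast<uint32_t>(b3) << 24);
}

// Write a 32-bit value in little-endian byte order. Done with shifts so
// the sketch behaves the same on any host.
inline void storeLittleEndian32(uint8_t *dst, uint32_t value) {
  for (int i = 0; i < 4; ++i)
    dst[i] = static_cast<uint8_t>(value >> (8 * i));
}

// Compare four separate byte stores against one merged 32-bit store.
inline bool mergedMatchesSeparate(uint8_t b0, uint8_t b1,
                                  uint8_t b2, uint8_t b3) {
  uint8_t separate[4];
  separate[0] = b0;  // four individual i8 stores
  separate[1] = b1;
  separate[2] = b2;
  separate[3] = b3;

  uint8_t merged[4];
  storeLittleEndian32(merged, packLittleEndian(b0, b1, b2, b3));

  return std::memcmp(separate, merged, sizeof(separate)) == 0;
}
```

With the illustrative constants, `packLittleEndian(1, 2, 3, 4)` yields `0x04030201`, and `mergedMatchesSeparate(1, 2, 3, 4)` confirms that the single wide store leaves memory in the same state as the four byte stores.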

[LLVMdev] How to enable use of 64bit load/store for 32bit architecture

2015 Apr 03

2

> On Apr 2, 2015, at 2:07 PM, Tom Stellard <tom at stellard.net> wrote:
>
> On Thu, Apr 02, 2015 at 01:35:55PM -0700, Pete Cooper wrote:
>> Hi James, Jim
>>
>> If you *really* want this to work in selection DAG then there is a solution, but it's not pretty.
>>
>> First make i64 not be legal. Then, assuming the regclass you gave has some subregs, you

[RFC] - Deduplication of debug information in linkers (LLD)

2017 Dec 04

2

...
> documented, so well worth a read.
> [r319164](http://reviews.llvm.org/rL319164).
>
> * The new `-stack-size-section` flag causes metadata to be emitted in an ELF
> section with information on function stack sizes.
> [r319430](http://reviews.llvm.org/rL319430).
>
> * MergeConsecutiveStores is now run a second time, just before instruction
> selection. This allows lowered intrinsics to be merged as well.
> [r319036](http://reviews.llvm.org/rL319036).
>
> * The MachineVerifier PHI and register operand checking has been improved.
> [r319140](http://reviews.llvm.org/rL3191...
