similar to: large slowdown in DAGCombiner::MergeConsecutiveStores

Displaying 20 results from an estimated 200 matches similar to: "large slowdown in DAGCombiner::MergeConsecutiveStores"

2015 May 12
2
[LLVMdev] i1 types in MergeConsecutiveStores
Hello LLVM, In DAGCombiner.cpp, MergeConsecutiveStores uses
  int64_t ElementSizeBytes = MemVT.getSizeInBits()/8;
(https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10669)
which is broken for i1 types where getSizeInBits() == 1. My out-of-tree target hits this case and eventually LLVM asserts in Type.cpp. Is there some reason MergeConsecutiveStores should
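A minimal sketch of one possible guard against this, written against the EVT call quoted above; the helper name is hypothetical and this is not the actual in-tree fix:

  #include "llvm/CodeGen/ValueTypes.h"
  #include <cstdint>

  // Hypothetical helper, not LLVM API: refuse to compute a byte size for i1
  // and other sub-byte memory types instead of silently rounding down to 0.
  static bool getElementSizeInBytes(const llvm::EVT &MemVT,
                                    int64_t &ElementSizeBytes) {
    uint64_t SizeInBits = MemVT.getSizeInBits();
    if (SizeInBits % 8 != 0)
      return false;                  // caller should skip the merge entirely
    ElementSizeBytes = SizeInBits / 8;
    return true;
  }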
2013 Nov 22
2
[LLVMdev] DAGCombiner::MergeConsecutiveStores Question
In DAGCombiner::MergeConsecutiveStores, there is this check:
  if (Index->getAlignment() != St->getAlignment())
    break;
Apparently this check ensures that all of the stores have the same alignment. Why is that necessary? This seems very overly restrictive to me. -David
2015 Feb 13
2
[LLVMdev] DAGCombiner::MergeConsecutiveStores
Hi, I'm quite puzzled by a little bit of code in the DAGCombiner where it merges loads in MergeConsecutiveStores. Two 16bit loads have been merged to one 32bit load, and two 16bit stores have been combined to one 32bit store. And then the code goes like this:
  // Replace one of the loads with the new load.
  LoadSDNode *Ld = cast<LoadSDNode>(LoadNodes[0].MemNode);
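As a standalone illustration of the transformation being described (plain C++, not DAGCombiner code): two adjacent 16-bit loads feeding two adjacent 16-bit stores are equivalent to one 32-bit load plus one 32-bit store when the accesses really are consecutive; memcpy is used here to sidestep alignment and aliasing questions.

  #include <cassert>
  #include <cstdint>
  #include <cstring>

  int main() {
    uint16_t src[2] = {0x1234, 0x5678};
    uint16_t dstNarrow[2], dstWide[2];

    // Original form: two 16-bit loads and two 16-bit stores.
    dstNarrow[0] = src[0];
    dstNarrow[1] = src[1];

    // Merged form: one 32-bit load and one 32-bit store.
    uint32_t wide;
    std::memcpy(&wide, src, sizeof wide);
    std::memcpy(dstWide, &wide, sizeof wide);

    assert(std::memcmp(dstNarrow, dstWide, sizeof dstNarrow) == 0);
    return 0;
  }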
2013 Nov 22
0
[LLVMdev] DAGCombiner::MergeConsecutiveStores Question
Hi David, You are right. This check is overly restrictive. We can replace this check with code that uses the alignment of the first store. Thanks, Nadav
On Nov 22, 2013, at 9:31 AM, dag at cray.com wrote:
> In DAGCombiner::MergeConsecutiveStores, there is this check:
>
>   if (Index->getAlignment() != St->getAlignment())
>     break;
>
> Apparently this check
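A minimal sketch of the relaxation Nadav describes, using hypothetical names rather than the actual patch: instead of rejecting chains with mixed alignments, track an alignment that is safe to put on the merged store. Starting from the first store's alignment and taking the minimum over the chain is always conservative.

  #include <algorithm>
  #include <cstddef>
  #include <vector>

  // Hypothetical stand-in for the per-store data the combiner collects.
  struct StoreInfo {
    unsigned Align; // alignment of this store, in bytes
  };

  // Pick an alignment for the merged store from a non-empty chain: begin with
  // the first store's alignment and never promise more than the weakest store.
  static unsigned chooseMergedAlignment(const std::vector<StoreInfo> &Chain) {
    unsigned Align = Chain.front().Align;
    for (std::size_t I = 1, E = Chain.size(); I != E; ++I)
      Align = std::min(Align, Chain[I].Align);
    return Align;
  }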
2018 May 29
4
My own codegen is 2.5x slower than llc?
My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64. If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000 instructions, which is a pathological case for MergeConsecutiveStores.) If, instead, I dump out the LLVM IR, and manually
2018 May 29
0
My own codegen is 2.5x slower than llc?
> On 29 May 2018, at 22:02, David Jones via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64. > > If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000
2018 May 29
0
My own codegen is 2.5x slower than llc?
What percentage of performance advantage do you expect to get from having a basic block with 14000 instructions, rather than breaking it up a bit? On Wed, May 30, 2018 at 12:02 AM, David Jones via llvm-dev < llvm-dev at lists.llvm.org> wrote: > My back-end code generator uses LLVM 5.0.1 to optimize and generate code > for x86_64. > > If I run it on a given sample of IR, it
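One way to "break it up a bit" at the IR level, sketched against the LLVM 5-era SplitBlock utility (the helper and the 500-instruction threshold are assumptions, not an existing LLVM facility): SelectionDAG, and with it MergeConsecutiveStores, runs one basic block at a time, so capping block length also caps the number of store-merge candidates considered per DAG.

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"
  #include "llvm/Transforms/Utils/BasicBlockUtils.h"
  #include <iterator>

  using namespace llvm;

  // Hypothetical helper: split any oversized block so each SelectionDAG stays
  // small. Semantics are preserved; SplitBlock only inserts an unconditional
  // branch to the new tail block.
  static void splitLargeBlocks(Function &F, unsigned MaxInsts = 500) {
    SmallVector<BasicBlock *, 8> Work;
    for (BasicBlock &BB : F)
      Work.push_back(&BB);            // collect first; splitting adds blocks

    for (BasicBlock *BB : Work) {
      while (BB->size() > MaxInsts) {
        auto It = BB->begin();
        std::advance(It, MaxInsts);   // instruction to split in front of
        if (isa<PHINode>(&*It) || It->isTerminator())
          break;                      // don't split among PHIs or at the end
        BB = SplitBlock(BB, &*It);    // keep trimming the new tail block
      }
    }
  }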
2013 Jul 27
2
[LLVMdev] [llvm] r187267 - SLP Vectorizer: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
Hey Nadav, I'd humbly suggest that rather than use 3 directly, you should add a shared constant between these two passes, so when one changes, the other doesn't need to be updated. It would also ensure this bit of info about what needs to be updated isn't only contained in the comments. On Fri, Jul 26, 2013 at 4:07 PM, Nadav Rotem <nrotem at apple.com> wrote: > Author:
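A sketch of that shared-constant idea; the header path and constant name below are made up for illustration and do not exist in LLVM:

  // Hypothetical shared header, e.g. llvm/Analysis/VectorizeThresholds.h,
  // included by both SLPVectorizer.cpp and DAGCombiner.cpp.
  namespace llvm {
  // Store chains shorter than this are left to the SelectionDAG store
  // vectorizer, which currently only merges pairs.
  static const unsigned MinStoreChainLengthForSLP = 3;
  } // namespace llvm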
2013 Apr 06
2
[LLVMdev] Integer divide by zero
A division intrinsic with defined behavior on all arguments would be awesome! Thanks for considering this. On Sat, Apr 6, 2013 at 11:27 AM, Joe Groff <arcata at gmail.com> wrote: > On Saturday, April 6, 2013, Jeff Bezanson wrote: >> >> >> Presumably the optimizer benefits from taking advantage of the >> undefined behavior, but to get a consistent result you need
2013 Apr 06
0
[LLVMdev] Integer divide by zero
On Sat, Apr 6, 2013 at 3:22 PM, Jeff Bezanson <jeff.bezanson at gmail.com> wrote: > A division intrinsic with defined behavior on all arguments would be > awesome! Thanks for considering this. 'Tis a good compromise. If there are no objections/concerns, I would like to move forward with it. Thanks, Joe! -Cameron
2004 Aug 04
4
Using answering machine in my phone
Is this supported? I have a very simple setup where I have 2 X100P cards and a TDM10B. The TDM10B is connected to a phone that has a digital answering machine built into it. If I make an inbound call on either X100P interface it gets transferred to the TDM10B interface. If I let it ring the TDM10B interface answers the call and the greeting message of the answering machine starts. Then shortly
2013 Jul 27
0
[LLVMdev] [llvm] r187267 - SLP Vectorizer: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
Hi Daniel, Maybe my commit message was not clear. The idea is that the SelectionDAG store vectorizer can only handle pairs. So, the number three means "more than a pair". Thanks, Nadav Sent from my iPhone > On Jul 26, 2013, at 17:48, Daniel Berlin <dberlin at dberlin.org> wrote: > > Hey Nadav, > I'd humbly suggest that rather than use 3 directly, you should
2013 Jul 27
1
[LLVMdev] [llvm] r187267 - SLP Vectorizer: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
Hi Nadav, Okay.
1. The comment doesn't make this clear. I would suggest, at a minimum, updating it to mention pairs specifically, to avoid the issue in #2.
2. If the day comes when the SelectionDAG store vectorizer handles more than pairs, and does so better, is anyone really going to remember this random 3 exists in the other vectorizer? I would posit, based on experience, the answer is
2015 Dec 11
2
Optimization of successive constant stores
Hmm... found an interesting issue: Given:
  %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0
  store i8 1, i8* %2, align 8
  %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1
  store i8 2, i8* %3, align 1
  %4 = getelementptr inbounds %UodStructType* %0, i32 0, i32 2
  store i8 3, i8* %4, align 2
  %5 = getelementptr inbounds %UodStructType* %0, i32 0, i32 3
2012 Dec 25
2
[LLVMdev] 3.2 version string
LLVM 3.2 came as a nice Christmas present. Just one minor question: I noticed that the version string (used to name the shared library etc.) is "3.2svn" instead of the expected "3.2". This violates our build system's expectations of what things are called. It would be easy for us to change, but I want to make sure this is not a mistake. I am fairly certain I downloaded the
2013 Apr 06
3
[LLVMdev] Integer divide by zero
I'm also not fully happy with LLVM's behavior here. There is another undefined case too, which is the minimum integer divided by -1. In Julia I can get "random" answers by doing:
  julia> sdiv_int(-9223372036854775808, -1)
  87106304
  julia> sdiv_int(-9223372036854775808, -1)
  87108096
In other contexts where the arguments are not constant, this typically gives an FPE trap.
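The two undefined cases in this thread, division by zero and the minimum integer divided by -1, are exactly what a division intrinsic with defined behavior on all arguments would have to pin down. A standalone sketch in plain C++; the chosen results are illustrative, not what was proposed on the list:

  #include <cstdint>

  // Both x / 0 and INT64_MIN / -1 are undefined for the built-in operator;
  // this helper assigns them explicit results (0 and the two's-complement
  // wrap) so every input has a defined answer.
  int64_t safe_sdiv(int64_t a, int64_t b) {
    if (b == 0)
      return 0;                 // one possible convention for x / 0
    if (a == INT64_MIN && b == -1)
      return INT64_MIN;         // -INT64_MIN overflows; wrap instead
    return a / b;
  }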
2015 Dec 11
2
Optimization of successive constant stores
Consider the following:
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"
  %UodStructType = type { i8, i8, i8, i8, i32, i8* }
  define void @test(%UodStructType*) {
    %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0
    store i8 1, i8* %2, align 8
    %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1
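The snippet stores small constants into consecutive i8 fields; assuming the truncated IR continues the pattern with stores of 2, 3, and 4, the expected merged form on this little-endian x86_64 target is a single 32-bit store of 0x04030201. A standalone illustration in plain C++; the struct is just a stand-in for the first four fields of %UodStructType:

  #include <cassert>
  #include <cstdint>
  #include <cstring>

  struct UodHeader {
    uint8_t a, b, c, d;         // mirrors the four leading i8 fields
  };

  int main() {
    UodHeader byByte{}, byWord{};

    // What the IR above does: four separate one-byte stores.
    byByte.a = 1; byByte.b = 2; byByte.c = 3; byByte.d = 4;

    // What the merged form does: one 32-bit store of a little-endian constant.
    uint32_t merged = 0x04030201u;
    std::memcpy(&byWord, &merged, sizeof merged);

    assert(std::memcmp(&byByte, &byWord, sizeof byByte) == 0);
    return 0;
  }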
2012 Mar 01
2
Julia
My purpose in mentioning the Julia language (julialang.org) here is not to start a flame war. I find it to be a very interesting development and others who read this list may want to read about it too. It is still very much early days for this language - about the same stage as R was in 1995 or 1996 when only a few people knew about it - but Julia holds much potential. There is a thread about
2013 Apr 06
0
[LLVMdev] Integer divide by zero
On Saturday, April 6, 2013, Jeff Bezanson wrote: > > Presumably the optimizer benefits from taking advantage of the > undefined behavior, but to get a consistent result you need to check > for both zero and this case, which is an awful lot of checks. Yes they > will branch predict well, but this still can't be good, for code size > if nothing else. How much performance can
2013 Apr 07
2
[LLVMdev] Integer divide by zero
Hi Cameron, On 06/04/13 22:52, Cameron McInally wrote: > On Sat, Apr 6, 2013 at 3:22 PM, Jeff Bezanson <jeff.bezanson at gmail.com> wrote: > > A division intrinsic with defined behavior on all arguments would be > awesome! Thanks for considering this. > > > 'Tis a good compromise. If there are no