similar to: [LLVMdev] DAGCompiler::MergeConsecutiveStores Question


2013 Nov 22
0
[LLVMdev] DAGCompiler::MergeConsecutiveStores Question
Hi David, You are right. This check is overly restrictive. We can replace this check with code that uses the alignment of the first store. Thanks, Nadav On Nov 22, 2013, at 9:31 AM, dag at cray.com wrote: > In DAGCombiner::MergeConsecutiveStores, there is this check: > > if (Index->getAlignment() != St->getAlignment()) > break; > > Apparently this check
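The effect of the alignment check quoted above can be illustrated with a small standalone model. This is only a sketch, not LLVM code: `StoreInfo`, `mergeableStrict`, `mergeableRelaxed`, and `MergeResult` are hypothetical names, and the model assumes the candidate stores are already known to be consecutive in memory. It contrasts stopping at the first alignment mismatch with taking the alignment of the first store for the merged store, as the reply suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a store node; only the alignment is modeled.
struct StoreInfo {
  unsigned alignment; // in bytes
};

// Strict policy: stop collecting candidates at the first store whose
// alignment differs from that of the first store.
std::size_t mergeableStrict(const std::vector<StoreInfo> &stores) {
  if (stores.empty())
    return 0;
  std::size_t count = 1;
  for (std::size_t i = 1; i < stores.size(); ++i) {
    if (stores[i].alignment != stores[0].alignment)
      break;
    ++count;
  }
  return count;
}

// Result of the relaxed policy: how many stores are candidates and
// which alignment the merged store would carry.
struct MergeResult {
  std::size_t count;
  unsigned alignment;
};

// Relaxed policy: every consecutive store is a candidate, and the merged
// store takes the alignment of the first (lowest-address) store.
MergeResult mergeableRelaxed(const std::vector<StoreInfo> &stores) {
  assert(!stores.empty() && "need at least one store");
  return {stores.size(), stores.front().alignment};
}
```

For four byte-sized stores at offsets 0 through 3 from an 8-byte-aligned base, the individual alignments are 8, 1, 2, and 1. In this model the strict policy accepts only the first store, while the relaxed policy accepts all four and keeps alignment 8.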
2015 May 12
2
[LLVMdev] i1 types in MergeConsecutiveStores
Hello LLVM, In DAGCombiner.cpp, MergeConsecutiveStores uses int64_t ElementSizeBytes = MemVT.getSizeInBits()/8; https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10669 which is broken for i1 types where getSizeInBits() == 1. My out-of-tree target hits this case and eventually LLVM asserts in Type.cpp. Is there some reason MergeConsecutiveStores should
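The arithmetic behind this report can be shown without any LLVM headers. The helpers below are hypothetical and only model the integer math: `truncatedSizeInBytes` mirrors the `getSizeInBits()/8` expression quoted above, while `roundedUpSizeInBytes` and `isByteSized` sketch two possible ways a caller might guard against sub-byte types. Neither is claimed to be the fix that was adopted.

```cpp
#include <cassert>
#include <cstdint>

// Truncating division, as in `MemVT.getSizeInBits()/8`.
// For a 1-bit type this yields 0.
inline std::int64_t truncatedSizeInBytes(std::uint64_t sizeInBits) {
  return static_cast<std::int64_t>(sizeInBits / 8);
}

// Round up to the number of whole bytes needed to hold the value.
inline std::int64_t roundedUpSizeInBytes(std::uint64_t sizeInBits) {
  return static_cast<std::int64_t>((sizeInBits + 7) / 8);
}

// True when the type occupies a whole number of bytes; a caller could
// use this to skip types such as i1 before doing byte-offset math.
inline bool isByteSized(std::uint64_t sizeInBits) {
  return sizeInBits % 8 == 0;
}
```

With a size of 1 bit, the truncating version returns 0, so any later computation that multiplies or divides by the element size in bytes works with a zero-sized element. Rounding up gives 1, and `isByteSized` lets the caller bail out early instead.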
2020 Mar 19
2
large slowdown in DAGCombiner::MergeConsecutiveStores
Hello all, We are seeing a large compiler performance regression in moving from LLVM 6.0.1 to 8.0.1. We have a long function (~50000 instructions) that used to compile in about a minute but now takes at least an hour. All the time is in MergeConsecutiveStores, I believe due to super-linear behavior in analyzing very long chains of stores. For example, this change makes the problem go away:
2015 Feb 13
2
[LLVMdev] DAGCombiner::MergeConsecutiveStores
Hi, I'm quite puzzled by a little bit of code in the DAGCombiner where it merges loads in MergeConsecutiveStores. Two 16-bit loads have been merged into one 32-bit load, and two 16-bit stores have been combined into one 32-bit store. And then the code goes like this: // Replace one of the loads with the new load. LoadSDNode *Ld = cast<LoadSDNode>(LoadNodes[0].MemNode);
2013 Jul 27
2
[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
Hey Nadav, I'd humbly suggest that rather than use 3 directly, you should add a shared constant between these two passes, so when one changes, the other doesn't need to be updated. It would also ensure this bit of info about what needs to be updated isn't only contained in the comments. On Fri, Jul 26, 2013 at 4:07 PM, Nadav Rotem <nrotem at apple.com> wrote: > Author:
2009 Feb 19
0
[LLVMdev] Possible DAGCombiner or TargetData Bug
I agree, that doesn't look right. It looks like this is what was intended: Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp =================================================================== --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000) +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy) @@ -4903,9 +4903,9 @@ // resultant store does not need a higher alignment than
2013 Nov 15
4
[LLVMdev] Limit loop vectorizer to SSE
Something like: index 6db7f68..68564cb 100644 --- a/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr Type *DataTy = VectorType::get(ScalarDataTy, VF); Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand(); unsigned Alignment = LI ?
2009 Feb 19
3
[LLVMdev] Possible DAGCombiner or TargetData Bug
I got bitten by this in LLVM 2.4 DAGCombiner.cpp and it's still in trunk: SDValue DAGCombiner::visitSTORE(SDNode *N) { [...] // If this is a store of a bit convert, store the input value if the // resultant store does not need a higher alignment than the original. if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() && ST->isUnindexed()) {
2009 Feb 20
2
[LLVMdev] Possible DAGCombiner or TargetData Bug
On Wednesday 18 February 2009 21:43, Dan Gohman wrote: > I agree, that doesn't look right. It looks like this > is what was intended: > > Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp > =================================================================== > --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000) > +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2013 Jul 27
0
[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
Hi Daniel, Maybe my commit message was not clear. The idea is that the SelectionDAG store vectorizer can only handle pairs. So, the number three means "more than a pair". Thanks, Nadav Sent from my iPhone > On Jul 26, 2013, at 17:48, Daniel Berlin <dberlin at dberlin.org> wrote: > > Hey Nadav, > I'd humbly suggest that rather than use 3 directly, you should
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
----- Original Message ----- > From: "Arnold Schwaighofer" <aschwaighofer at apple.com> > To: "Joshua Klontz" <josh.klontz at gmail.com> > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> > Sent: Friday, November 15, 2013 4:05:53 PM > Subject: Re: [LLVMdev] Limit loop vectorizer to SSE > > > Something like: > > index
2013 Jul 27
1
[LLVMdev] [llvm] r187267 - SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
Hi Nadav, Okay. 1. The comment doesn't make this clear. I would suggest, at a minimum, updating it to mention pairs specifically, to avoid the issue in #2. 2. If the day comes when the SelectionDAG store vectorizer handles more than pairs, and does so better, is anyone really going to remember this random 3 exists in the other vectorizer? I would posit, based on experience, the answer is
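The shared-constant suggestion made in this thread could look roughly like the sketch below. All names here (`sketch`, `MaxDAGStoreChainLength`, `MinSLPStoreChainLength`, and the two predicates) are hypothetical and are not taken from the LLVM tree. The only facts assumed from the thread are that the SelectionDAG store vectorizer handles pairs and that 3 means "more than a pair".

```cpp
#include <cassert>

namespace sketch {

// Hypothetical shared constant: the longest store chain the DAG-level
// store merger is assumed to handle (a pair, per the thread).
constexpr unsigned MaxDAGStoreChainLength = 2;

// The SLP threshold is derived from the constant above instead of being
// written as a literal 3, so the two passes cannot drift apart silently.
constexpr unsigned MinSLPStoreChainLength = MaxDAGStoreChainLength + 1;

// Chains this short are left to the DAG-level merger.
inline bool handledByDAGMerger(unsigned chainLength) {
  return chainLength >= 2 && chainLength <= MaxDAGStoreChainLength;
}

// Longer chains are candidates for the SLP vectorizer.
inline bool handledBySLP(unsigned chainLength) {
  return chainLength >= MinSLPStoreChainLength;
}

} // namespace sketch
```

If the DAG-level merger later learned to handle longer chains, only `MaxDAGStoreChainLength` would need to change in this arrangement, and the SLP threshold would follow.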
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
Yes, I was just about to send out: DL->getABITypeAlignment(ScalarDataTy); The question is whether "… ABI alignment for the target …" refers to getPrefTypeAlignment or getABITypeAlignment. I would have thought the latter. On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> From: "Arnold Schwaighofer"
2013 Mar 11
0
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
> > Line 4501 in trunk DAGCombiner.cpp… I changed the ISD::SELECT to the VT.isVector() ? ISD::VSELECT : ISD::SELECT... > Thanks. From the commit message I think that we should only run this optimization on scalars. >> Can you write down the input SDNode ? What types are inputs ? > > 0x107046d10: v2i8 = vselect 0x107046c10, 0x107046b10, 0x107045e10 [ID=-3]
2013 Mar 11
3
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
On Mar 11, 2013, at 9:41 AM, Nadav Rotem <nrotem at apple.com> wrote: Hi Richard, I did… It originates from an icmp ne on <2 x i8> against zeroinitializer, followed by a sext of the <2 x i1> result to <2 x i8>. When we visit the SIGN_EXTEND, we generate the ISD::SELECT even though the selector and both operands are vectors. It sounds like a bug in the dag combine
2015 Dec 11
2
Optimization of successive constant stores
Hmm... found an interesting issue: Given: %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0 store i8 1, i8* %2, align 8 %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1 store i8 2, i8* %3, align 1 %4 = getelementptr inbounds %UodStructType* %0, i32 0, i32 2 store i8 3, i8* %4, align 2 %5 = getelementptr inbounds %UodStructType* %0, i32 0, i32 3
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
Nadav, I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorization. I can't tell if it's the loop vectorizer or the codegen at fault, but the alignment assumption seems to sneak in somewhere. v/r, Josh [1]
2013 Nov 15
6
[LLVMdev] Limit loop vectorizer to SSE
On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! > > I think
2013 Mar 11
2
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
On Mar 8, 2013, at 2:29 PM, Nadav Rotem <nrotem at apple.com> wrote: Hi Richard, visitSIGN_EXTEND() in DAGCombiner.cpp generates an ISD::SELECT even if VT is a vector, which causes ExpandSELECT() to assert during legalization. I think what's required is to have visitSIGN_EXTEND generate a VSELECT if VT is a vector… ISD::SELECT should be used for
2018 May 29
4
My own codegen is 2.5x slower than llc?
My back-end code generator uses LLVM 5.0.1 to optimize and generate code for x86_64. If I run it on a given sample of IR, it takes almost 5 minutes to generate object code. 95%+ of this time is spent in MergeConsecutiveStores(). (One function has a basic block with 14000 instructions, which is a pathological case for MergeConsecutiveStores.) If, instead, I dump out the LLVM IR, and manually