similar to: MatchLoadCombine(): handling for vectorized loop.

Displaying 20 results from an estimated 6000 matches similar to: "MatchLoadCombine(): handling for vectorized loop."

2018 Dec 04
2
MatchLoadCombine(): handling for vectorized loop.
Hi Eli, On 2018-12-04 00:37, Friedman, Eli wrote: > On 12/3/2018 8:20 AM, Jonas Paulsson wrote: >> Hi, >> >> I have noticed some loops that build a wider element by loading small >> elements, zero-extending them, shifting them (with different amounts) >> to then 'or' them all together. They are either equivalent of a wider >> load, or to that of a
2018 Sep 04
2
LoopVectorizer: shufflevectors
Hi, I have been discussing a bit with Sanjay on how to handle the poor sequences of shufflevector instructions produced by the loop vectorizer and he suggested we bring this up on llvm-dev. I have run into this in the past also and it surprised me to again see (on SystemZ) that the vectorized loop did many seemingly unnecessary shuffles. In this case (see
2009 Jan 20
2
[LLVMdev] Shouldn't DAGCombine insert legal nodes?
Duncan: DAGCombine is inserting an IllegalOperation after target-specific instruction legalization has occurred. I'm inserting the fabs and the bitconvert during instruction legalization; DAGCombine is converting the fabs/bitconvert to an 'and' on its second (third?) pass. -scooter On Jan 20, 2009, at 12:24 AM, Duncan Sands wrote: > On Tuesday 20 January 2009 07:52:37
2011 Jul 27
2
[LLVMdev] Avoiding load narrowing in DAGCombiner
Hi All, I'm writing a backend for a target which only supports 4-byte, 4-byte-aligned loads and stores. I custom-lower all {*EXT}LOAD and STORE nodes in TargetISelLowering.cpp to take advantage of all alignment information available to the backend, rather than treat each load and store conservatively, which takes O(10) instructions. My target's allowsUnalignedMemoryOperations()
2013 Apr 17
2
[LLVMdev] alias analysis in backend
Hi Hal, Thanks. How about a symbol with two different immediate offsets - the Value* would be the same, right? I don't see how AliasAnalysis::Location would handle this... And BasicAliasAnalysis does if (V1 == V2) return MustAlias; , so I'm not sure how this would be done .. ? /Jonas > -----Original Message----- > From: Hal Finkel [mailto:hfinkel at anl.gov] > Sent:
2013 Apr 18
2
[LLVMdev] alias analysis in backend
On Apr 17, 2013, at 2:33 AM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> From: "Jonas Paulsson" <jonas.paulsson at ericsson.com> >> To: "Hal Finkel" <hfinkel at anl.gov> >> Cc: llvmdev at cs.uiuc.edu >> Sent: Wednesday, April 17, 2013 12:22:49 AM >> Subject: RE: [LLVMdev] alias analysis in backend
2009 Feb 20
2
[LLVMdev] Possible DAGCombiner or TargetData Bug
On Wednesday 18 February 2009 21:43, Dan Gohman wrote: > I agree, that doesn't look right. It looks like this > is what was intended: > > Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp > =================================================================== > --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000) > +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2009 Jan 26
2
[LLVMdev] DAGCombiner rant
Yes, it was I who put that rant in the commit log and it's justified. Worse, it's unreasonable to actually go through all of DAGCombiner's code and check to see if certain kinds of constants, e.g., i64, are legal during a particular phase of DAGCombiner. DAGCombiner does good work and the backends are supposed to be good citizens. CellSPU is certainly trying to be a good citizen, no
2009 Jan 28
0
[LLVMdev] DAGCombiner rant
Hi Scott, I'm not clear on what you're saying here; some of your points below seem to be contradictory. The advice to use target-independent nodes when feasible seems sound to me, so I wrote up a comment about it in SelectionDAGNodes.h. If you can formulate your thoughts in the form of specific documentation changes, that would be helpful. In theory, DAGCombiner is supposed to check if
2011 Jul 27
2
[LLVMdev] Avoiding load narrowing in DAGCombiner
Hi Eli, On 07/27/2011 04:59 PM, Eli Friedman wrote: > On Wed, Jul 27, 2011 at 2:28 PM, Matt Johnson > <johnso87 at crhc.illinois.edu> wrote: >> Hi All, >> I'm writing a backend for a target which only supports 4-byte, >> 4-byte-aligned loads and stores. I custom-lower all {*EXT}LOAD and >> STORE nodes in TargetISelLowering.cpp to take advantage of
2019 Feb 08
2
Unfolded additions of constants after promotion of @llvm.ctlz.i16 on SystemZ
Hi, SystemZ supports @llvm.ctlz.i64() natively with a single instruction (FLOGR), and lesser bitwidth versions of the intrinsic are promoted to i64. For some reason, this leads to unfolded additions of constants as shown below: This function: define i16 @fun(i16 %arg) {   %1 = tail call i16 @llvm.ctlz.i16(i16 %arg, i1 false)   ret i16 %1 } ,gives this optimized DAG as input to instruction
2009 Aug 23
4
[LLVMdev] Problems with DAG Combiner
Hi all, i'm writing an back-end for a new research processor architecture and have problems with the DAG Combiner. The processor architecture supports i1 and i32 registers. 1-bit registers are mainly used as comparison result but basic operations like OR are not possible between i1 registers. So I wrote custom lowering for i1 OR operations and replaced it by (trunc (or (aext x), (aext
2012 Aug 26
3
[LLVMdev] Illegal node introduced by DAGCombiner after legal phase
Hello, I'm getting an instruction selection error because DAGCombiner is introducing an illegal node after the legalizeDAG phase. Basically this is what is going on: 1) During legalization, BR_JT gets expanded introducing a (mul x, 2). 2) After legalization (AfterLegalizeDAG), that (mul x, 2) is converted to an (shl x, 1). However, that shl node introduced is illegal, and since my custom
2019 Aug 26
2
LLVM X86 backend combineIncDecVector's transform
No objections from me to make it run later. I didn't see the potential conflicts when I added that code. Delayed combine, custom lowering, or DAGToDAGISel all seem like viable options to me. On Mon, Aug 26, 2019 at 2:04 PM Roman Lebedev <lebedev.ri at gmail.com> wrote: > I have previously posted these two patches: > > [X86][CodeGen][NFC] Delay `combineIncDecVector()` from
2011 Jul 27
0
[LLVMdev] Avoiding load narrowing in DAGCombiner
On Wed, Jul 27, 2011 at 2:28 PM, Matt Johnson <johnso87 at crhc.illinois.edu> wrote: > Hi All, >     I'm writing a backend for a target which only supports 4-byte, > 4-byte-aligned loads and stores.  I custom-lower all {*EXT}LOAD and > STORE nodes in TargetISelLowering.cpp to take advantage of all alignment > information available to the backend, rather than treat each
2013 Apr 17
0
[LLVMdev] alias analysis in backend
----- Original Message ----- > From: "Jonas Paulsson" <jonas.paulsson at ericsson.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: llvmdev at cs.uiuc.edu > Sent: Wednesday, April 17, 2013 12:22:49 AM > Subject: RE: [LLVMdev] alias analysis in backend > > Hi Hal, > > Thanks. How about a symbol with two different immediate offsets - the
2011 Jan 28
3
[LLVMdev] Post-inc combining
Hi, I would like to transform a LLVM function containing a load and an add of the base address inside a loop to a post-incremented load. In DAGCombiner.cpp::CombineToPostIndexedLoadStore(), it says it cannot fold the add for instance if it is a predecessor/successor of the load. I find this odd, as this is exactly what I would like to handle: a simple loop with an address that is inremented in
2017 May 15
2
Disabling DAGCombine's specific optimization
Hi Vivek, You could work around this by creating a custom ISD node, e.g. MyTargetISD::MyLSHR, with the same type as the general ISD::LSHR. This custom node will then be ignored by the generic DAGCombiner. Convert ISD::LSHR to MyTargetISD::MyLSHR in DAGCombine, optimise it as you see fit, convert it back or lower it directly. I've done this for ISD::CONCAT_VECTORS to avoid an inconvenient
2018 Jul 03
4
Question about canonicalizing cmp+select
Hi, Sanjay/all, I noticed in rL331486 that some compare-select optimizations are disabled in favor of providing canonicalized cmp+select to the backend. I am currently working on a private backend target, and the target has a small code size limit. With this change, some of the apps went over the codesize limit. As an example, C code: b = (a > -1) ? 4 : 5; ll code: Before rL331486:
2019 Aug 26
2
LLVM X86 backend combineIncDecVector's transform
Hi all, As you knwo already, I'm trying to change DAGCombiner so that it process the nodes in topological order. Doing so is not difficult per se, but this creates various improvements and regression to the existing test suite. I'd like to work through as many of the regressions as possible ahead of time. One source of such regressions is combineIncDecVector in the X86 backend. It