thr3ads.net - similar to: "[LLVMdev] DAGCombiner::MergeConsecutiveStores"

Displaying 20 results from an estimated 500 matches similar to: "[LLVMdev] DAGCombiner::MergeConsecutiveStores"

[LLVMdev] Custom Lowering of ARM zero-extending loads

2013 Mar 04

Hi, For my research, I need to reshape the current ARM backend to support armv2a. The zero-extending halfword load (ldrh) is not supported by armv2a, so I need to keep code generation from emitting ldrh instructions. I want to replace all those instances with a 32-bit load (ldr) and then AND the result with 0xffff to mask out the upper bits. These are the modifications that I have made to
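The load-and-mask idea in this post can be sketched in plain C++, outside of LLVM. This is only an illustration of the arithmetic, not the poster's backend change: `load32LE` and `loadHalfZext` are hypothetical names, a little-endian byte order is assumed, and the sketch ignores alignment and the fact that the wider load touches two extra bytes that must be readable.

```cpp
#include <cassert>
#include <cstdint>

// Models a little-endian 32-bit load (the role ldr plays in the post).
inline std::uint32_t load32LE(const unsigned char *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: a zero-extending halfword load built from a
// 32-bit load followed by an AND with 0xffff.
inline std::uint32_t loadHalfZext(const unsigned char *p) {
  assert(p != nullptr);
  return load32LE(p) & 0xffffu; // mask out the upper 16 bits
}
```

The mask discards whatever sits in the two bytes above the halfword, which is why the result matches what a zero-extending 16-bit load would produce.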

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 02

Hi, Joseph, I guess getLoad() will either find an existing SDValue *OR* create a new one if none exists, depending on the actual parameters. Since you pass exactly the same attributes that dupVal/dupNode have, getLoad() no doubt returns the old one. I am not sure that it is *volatile* that gets you a new result; you might want to try changing some other parameters and see what happens. Regards.

ISelDAGToDAG breaks node ordering

2017 Jul 29

Hi, During instruction selection, I have the following code for certain LOAD instructions: const LoadSDNode *LD = cast<LoadSDNode>(N); SDNode* LDW = CurDAG->getMachineNode(AVR::LDWRdPtr, SDLoc(N), VT, PtrVT, MVT::Other, LD->getBasePtr(), LD->getChain()); // Honestly, I have no idea what this does, but other memory // accessing instructions

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 02

So I think I have made some progress. SDValue dupVal = consumer->getOperand(OpNo); LoadSDNode *dupNode = (LoadSDNode*) dupVal.getNode(); SDValue newLoad = CurDAG->getLoad(dupVal.getValueType(), dupVal.getDebugLoc(), dupVal.getOperand(0), dupVal.getOperand(1), dupNode->getPointerInfo(),

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 02

Hi, Joe. I am sorry I did not catch your point. Can you provide more details? Since an SDValue/SDNode can be used multiple times, why would you want to create two identical objects instead of referring to the same one? 2012/12/2 Joseph Pusdesris <joe at pusdesris.com>: > Yes, changing parameters will create a new Node, but is there some way I can > force a new node with the same

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 01

Hi, I am writing an llvm target and I need both loads for isel reasons, but I am struggling to find the right way. I have been trying to use DAG.getLoad() to make a copy, then just change the operand in the consumers, but I cannot seem to get all of the arguments needed for that function in order to make the copy. Any help would be great, thanks! -Joe -------------- next part -------------- An

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 02

Yes, changing parameters will create a new Node, but is there some way I can force a new node with the same parameters? -Joe On Sat, Dec 1, 2012 at 10:57 PM, Triple Yang <triple.yang at gmail.com> wrote: > Hi, Joseph, I guess getLoad() will either search an existed SDValue > *OR* create a new one for a non-existed one depending on real > parameters. > > Since you use

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 02

OK, I get it. The essence of this problem is that, in tree-pattern-matching isel, a node can be covered exactly once, but its result can be referenced multiple times. So duplicating a load node (only if we can!) is a convenient way to handle that case. The truth is, in the pattern (add (load) (load)), the source operands are memory addresses, and thus it can be treated as (addmm address,

Pseudo-instruction that overwrites its input register

2017 May 30

The reason the ones in PPCInstrInfo.td don't have the patterns to match is the reason they are more analogous to your problem. Namely, tblgen does not have a way to produce nodes with more than one result. The load-with-update instructions do exactly that - one of the inputs is also an output, but the other output is independent (and necessarily a separate register). The FMA variants have

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 02

I am writing a target for an odd CISC-like architecture which has no support for keeping most values in registers. As such, memory-memory operations are needed, but for isel to generate a memory-memory operation, the pattern must be of the form (store (op (load) (load))). Let's use a simple example to show how this can be problematic: %0 = load i32* %a.addr, align 4 store i32 %0, i32* %other, align

[LLVMdev] Splitting a load with 2 consumers into 2 loads.

2012 Dec 02

I'll give that a shot, thanks! -Joe On Sun, Dec 2, 2012 at 12:06 PM, Triple Yang <triple.yang at gmail.com> wrote: > OK, I get it. > > The essence of this problem is that a node can be covered exactly and > just once but its result can be referred multiple times for a tree > pattern matching isel. So to duplicate a load node (only if we can!) > is convenient to

ISelDAGToDAG breaks node ordering

2017 Jul 31

On 7/29/2017 1:28 AM, Dr. ERDI Gergo via llvm-dev wrote: > Hi, > > During instruction selection, I have the following code for certain > LOAD instructions: > > const LoadSDNode *LD = cast<LoadSDNode>(N); > SDNode* LDW = CurDAG->getMachineNode(AVR::LDWRdPtr, SDLoc(N), > VT, PtrVT, MVT::Other, > LD->getBasePtr(), LD->getChain()); >

large slowdown in DAGCombiner::MergeConsecutiveStores

2020 Mar 19

Hello all, We are seeing a large compiler performance regression in moving from LLVM 6.0.1 to 8.0.1. We have a long function (~50000 instructions) that used to compile in about a minute but now takes at least an hour. All the time is in MergeConsecutiveStores, I believe due to super-linear behavior in analyzing very long chains of stores. For example, this change makes the problem go away: ```

Pseudo-instruction that overwrites its input register

2017 May 30

On Tue, 30 May 2017, Nemanja Ivanovic wrote: > This is typically accomplished with something like PPC's `RegConstraint` and > `NoEncode`. You can see examples of it that are very similar to what you're after in > PPC's load/store with update forms (i.e. load a value and update the base register > with the effective address - these are used for pre-increment loads/stores).

[LLVMdev] Avoiding load narrowing in DAGCombiner

2011 Jul 27

On Wed, Jul 27, 2011 at 3:50 PM, Matt Johnson <johnso87 at crhc.illinois.edu> wrote: > Hi Eli, > > On 07/27/2011 04:59 PM, Eli Friedman wrote: >> >> On Wed, Jul 27, 2011 at 2:28 PM, Matt Johnson >> <johnso87 at crhc.illinois.edu> wrote: >>> >>> Hi All, >>> I'm writing a backend for a target which only supports 4-byte,

RFC: atomic operations on SI+

2016 Mar 25

Hi Tom, Matt, I'm working on a project that needs a few coherent atomic operations (HSA mode: load, store, compare-and-swap) for std::atomic_uint in HCC. The attached patch implements atomic compare and swap for SI+ (untested). I tried to stay within what was available, but there are a few issues that I was unsure how to address: 1.) it currently uses v2i32 for both input and output. This

[LLVMdev] i1 types in MergeConsecutiveStores

2015 May 12

Hello LLVM, In DAGCombiner.cpp, MergeConsecutiveStores uses int64_t ElementSizeBytes = MemVT.getSizeInBits()/8; https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10669 which is broken for i1 types where getSizeInBits() == 1. My out-of-tree target hits this case and eventually LLVM asserts in Type.cpp. Is there some reason MergeConsecutiveStores should
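The arithmetic behind this report can be shown in a few lines of standalone C++. This is a minimal sketch and not LLVM code: `sizeBytesTruncating` mirrors the quoted `getSizeInBits()/8` expression, and `sizeBytesRoundedUp` is a hypothetical alternative that rounds up to whole bytes. Whether rounding up is the right fix for MergeConsecutiveStores is not something the snippet settles.

```cpp
#include <cassert>
#include <cstdint>

// Truncating division, as in the quoted expression: 1 bit becomes 0 bytes.
constexpr std::int64_t sizeBytesTruncating(std::uint64_t bits) {
  return static_cast<std::int64_t>(bits / 8);
}

// Hypothetical alternative: round up to whole bytes, so 1 bit becomes 1 byte.
constexpr std::int64_t sizeBytesRoundedUp(std::uint64_t bits) {
  return static_cast<std::int64_t>((bits + 7) / 8);
}

// Small self-check showing where the two computations differ.
inline void demoElementSizes() {
  assert(sizeBytesTruncating(1) == 0);  // the i1 case from the post
  assert(sizeBytesRoundedUp(1) == 1);
  assert(sizeBytesTruncating(32) == sizeBytesRoundedUp(32));
}
```

A zero element size is the kind of value that can later trip an assertion once it is used as a divisor or a type width, which matches the failure the poster describes.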

Re: NUMA issues on virtualized hosts

2018 Sep 17

On 09/14/2018 03:36 PM, Lukas Hejtmanek wrote: > Hello, > > ok, I found that cpu pinning was wrong, so I corrected it to be 1:1. The issue > with iozone remains the same. > > The spec is running, however, it runs slower than 1-NUMA case. > > The corrected XML looks like follows: [Reformatted XML for better reading] <cpu mode="host-passthrough">

EnableFastISel

2017 Oct 23

Hi, In SelectionDAGISel::SelectAllBasicBlocks if (TM.Options.EnableFastISel) FastIS = TLI->createFastISel(*FuncInfo, LibInfo); followed by if (!FastIS) { LowerArguments(Fn); } else { The above implies that implementing FastIS is optional. In contrast to that, testing whether FastIS is actually being used is done by testing whether TM.Options.EnableFastISel is set. For example

[LLVMdev] DAGCompiler::MergeConsecutiveStores Question

2013 Nov 22

In DAGCombiner::MergeConsecutiveStores, there is this check: if (Index->getAlignment() != St->getAlignment()) break; Apparently this check ensures that all of the stores have the same alignment. Why is that necessary? This seems overly restrictive to me. -David
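To make the effect of the quoted check concrete, here is a small standalone sketch of a scan that stops at the first store whose alignment differs from the first one. It is an assumption-laden model, not the DAGCombiner code: `StoreInfo` and `countMergeCandidates` are hypothetical stand-ins, and the real function considers much more than alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a store node; only the alignment is modeled.
struct StoreInfo {
  unsigned Alignment;
};

// Counts the leading stores whose alignment equals that of the first store.
// The scan stops at the first mismatch, like the `break` in the quoted check.
inline std::size_t countMergeCandidates(const std::vector<StoreInfo> &Stores) {
  if (Stores.empty())
    return 0;
  std::size_t Count = 0;
  for (const StoreInfo &S : Stores) {
    if (S.Alignment != Stores.front().Alignment)
      break;
    ++Count;
  }
  return Count;
}

// Small self-check: a less-aligned store ends the candidate run early,
// even though a matching store follows it.
inline void demoAlignmentScan() {
  assert(countMergeCandidates({{4}, {4}, {2}, {4}}) == 2);
}
```

In this model a single differently aligned store cuts the run short, which is the restrictiveness the poster is asking about.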
