thr3ads.net - similar to: "[LLVMdev] PerformDAGCombine vs. DAG to DAG"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] PerformDAGCombine vs. DAG to DAG"

2009 Jan 26

[LLVMdev] DAGCombiner rant

Yes, it was I who put that rant in the commit log and it's justified. Worse, it's unreasonable to actually go through all of DAGCombiner's code and check to see if certain kinds of constants, e.g., i64, are legal during a particular phase of DAGCombiner. DAGCombiner does good work and the backends are supposed to be good citizens. CellSPU is certainly trying to be a good citizen, no

2009 Jan 28

[LLVMdev] DAGCombiner rant

Hi Scott, I'm not clear on what you're saying here; some of your points below seem to be contradictory. The advice to use target-independent nodes when feasible seems sound to me, so I wrote up a comment about it in SelectionDAGNodes.h. If you can formulate your thoughts in the form of specific documentation changes, that would be helpful. In theory, DAGCombiner is supposed to check if

2015 Jul 10

[LLVMdev] Why change "sub x, 5" to "add x, -5" ?

2015-07-08 17:58 GMT+02:00 escha <escha at apple.com>: > [...] > > If you want to "revert" this sort of thing, you can do it at Select() time > or PreprocessISelDAG(), which is what I did on an out-of-tree backend to > turn add X, -C into sub X, C on selection time. This still lets all the > intermediate optimizations take advantage of the canonicalization. > >
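As a rough illustration of the rewrite escha describes, the toy function below turns an add with a negative immediate back into a sub with the positive one. It is a standalone sketch, not LLVM code: `Op`, `BinOpImm`, and `preferSub` are hypothetical names, and a real backend would operate on SelectionDAG nodes in Select() or PreprocessISelDAG().

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a DAG node of the form "op X, imm".
enum class Op { Add, Sub };
struct BinOpImm {
  Op op;
  std::int64_t imm;
};

// Rewrite "add X, -C" as "sub X, C"; leave everything else unchanged.
// INT64_MIN is skipped because negating it would overflow.
BinOpImm preferSub(BinOpImm n) {
  if (n.op == Op::Add && n.imm < 0 &&
      n.imm != std::numeric_limits<std::int64_t>::min()) {
    return BinOpImm{Op::Sub, -n.imm};
  }
  return n;
}
```

Because the rewrite runs only at selection time in this scheme, earlier combines would still see the canonical add form.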

2009 Feb 11

[LLVMdev] Prevent node from being combined

How can I prevent some nodes from being combined in DAGCombine.cpp? Maybe what I want to do below doesn't follow the philosophy of LLVM, but I'd like to know if there is any way to keep a node from being combined. TargetLowering::PerformDAGCombine() is only called if DAGCombiner cannot combine a specific node. It seems that there is no chance to stop it from combining a node. I need the

2009 Jun 17

[LLVMdev] possible PowerPC (32bits) backend bug

I have been doing some playing with the patterns that define complex instructions, and I saw a behavior that doesn't look right. I think it's a bug in the PPC backend. The 32-bit PPC .td file defines a pattern for the fnmsubs instruction like this: def : Pat<(fsub F4RC:$B, (fmul F4RC:$A, F4RC:$C)), (FNMSUBS F4RC:$A, F4RC:$C, F4RC:$B)>,

2017 May 07

How does one match undef in tablegen?

I would like to specialise build_vector for the case when one of the operands is undefined. How do I describe this? This is looking for an analog of specialisations like: def : Pat <v2i32 (build_vector i32:$x, (i32 0)),...>; but for an undefined, rather than zero, value. I can work around my ignorance in performDAGCombine but would prefer to add to the existing pattern matching. Thanks,

2015 Sep 08

Inserting MachineInstr's

Hi, I have a task to complete and I'm getting stuck. I can't find anything comparable in the documentation. The shortest explanation I can give is as follows: I need to use double-precision floating point values for floating-point multiplies. I'll not go into why: That would take the discussion away from the essential problem. E.g. Replace: fmuls %f20,%f21,%f8 with the

2015 Sep 19

AArch64 fmul/fadd fusion

Hi All, Recently I was doing some AArch64 work and noticed some cases where fmuls were not getting fused with fadds. Is there any particular reason that the AArch64 machine combiner doesn't do this like it does for add/mul? I am happy to work up a patch for this, but I wanted to make sure that there wasn't a good reason for it not already being there. FWIW, I see where GCC is doing

2012 May 11

[LLVMdev] TableGen pattern for negated operand

I've been unable to come up with the TableGen recipe to match a negated operand. My target asm syntax allows the following transform: FNEG r8, r5 MUL r6, r8, r9 to MUL r6, -r5, r9 Is there a Pattern<> syntax that would allow matching *any* opcode (or even some subset), not just MUL, with a FNEG'd operand? I expect I can define a PatFrag: def fneg_su : PatFrag<(ops

2017 Jun 15

About CodeGen quality

Hi Mats, It's a private backend. I will try describing what I am dealing with. struct S { unsigned int a : 8; unsigned int b : 8; unsigned int c : 8; unsigned int d : 8; unsigned int e; } We want to read S->b for example. The size of struct S is 64 bits, and it seems LLVM treats it as i64. Below is the IR corresponding to S->b, IIRC. %0 = load

2013 Aug 08

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

I would like to transform X/Y -> X*1/Y. Specifically, I would like to convert: define void @t1a(double %a, double %b, double %d) { entry: %div = fdiv fast double %a, %d %div1 = fdiv fast double %b, %d %call = tail call i32 @foo(double %div, double %div1) ret void } to: define void @t1b(double %a, double %b, double %d) { entry: %div = fdiv fast double 1.000000e+00, %d %mul = fmul

2013 Aug 08

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

Hi Chad, This is a great transform to do, but you’re right that it’s only safe under fast-math. This is particularly interesting when the original divisor is a constant so you can materialize the reciprocal at compile-time. You’re right that in either case, this optimization should only kick in when there is more than one divide instruction that will be changed to a mul. I don’t have a strong
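The reason the transform is restricted to fast-math can be seen in a small standalone sketch. Multiplying by a rounded reciprocal adds a second rounding step, so the result may differ slightly from a true division. The function and struct names below are hypothetical and this is plain C++, not the LLVM implementation.

```cpp
#include <cmath>

struct Quotients {
  double first;
  double second;
};

// Two divisions by the same divisor, as in the original IR.
Quotients divideTwice(double a, double b, double d) {
  return Quotients{a / d, b / d};
}

// One division to form the reciprocal, then two multiplies.
// Each product is rounded after the reciprocal was already rounded,
// so results can differ slightly from true division.
Quotients multiplyByReciprocal(double a, double b, double d) {
  const double recip = 1.0 / d;
  return Quotients{a * recip, b * recip};
}

// Relative difference between two nonzero finite values.
double relativeDifference(double x, double y) {
  return std::fabs(x - y) / std::fabs(x);
}
```

With a single divide the rewrite trades one division for a division plus a multiply, which is why it pays off only when several divides share the divisor. When the divisor is a power of two, such as 4.0, the reciprocal is exact and both forms agree bit for bit.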

2018 Jun 20

Node deletion during DAG Combination ?

Hi, I'm trying to optimize the 'extract_vector_elt' for my SIMD microcontroller. The idea is, during DAG combination, to merge load/extract sequence into an architecture specific node. During Instruction Selection, this specific node will be target selected to an architecture specific instruction. By 'combination of DAG nodes' I understand 'replacing a set of DAG nodes by

2010 Oct 08

[LLVMdev] Flag output used by two other nodes in DAG

I recently filed this bug: http://llvm.org/bugs/show_bug.cgi?id=8323 It's a dodgy one because you have to patch LLVM to demonstrate it. I suspect that the cause of the problem in that "bug" is that the peephole optimisation in PerformDAGCombine results in a Flag output from one node being used as input by two other nodes in the DAG, and the scheduler then can't cope with that.

2015 Dec 01

Endianness for multi-word types

> -----Original Message----- > From: Hal Finkel [mailto:hfinkel at anl.gov] > Sent: Tuesday, December 01, 2015 1:01 AM > To: Tim Shen > Cc: Gao, Yunzhong; llvm-dev at lists.llvm.org; Kit Barton; Nemanja Ivanovic > Subject: Re: [llvm-dev] Endianness for multi-word types > > ----- Original Message ----- > > From: "Tim Shen via llvm-dev" <llvm-dev at

2012 May 11

[LLVMdev] TableGen pattern for negated operand

Hi Joe, On 11/05/2012 02:13, Joe Matarazzo wrote: > I've been unable to come up with the TableGen recipe to match a > negated operand. My target asm syntax allows the following transform: > > FNEG r8, r5 > MUL r6, r8, r9 > > to > > MUL r6, -r5, r9 > > Is there a Pattern<> syntax that would allow matching *any* opcode (or > even some

2015 May 12

[LLVMdev] i1 types in MergeConsecutiveStores

Hello LLVM, In DAGCombiner.cpp, MergeConsecutiveStores uses int64_t ElementSizeBytes = MemVT.getSizeInBits()/8; https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10669 which is broken for i1 types where getSizeInBits() == 1. My out-of-tree target hits this case and eventually LLVM asserts in Type.cpp. Is there some reason MergeConsecutiveStores should
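The arithmetic behind the report can be reproduced outside LLVM. Truncating integer division of a 1-bit size by 8 yields 0 bytes, whereas rounding up yields 1. The helper names below are hypothetical, and the rounded-up variant is shown only for comparison, not as the fix LLVM adopted.

```cpp
#include <cstdint>

// Truncating division, the form quoted from MergeConsecutiveStores.
std::int64_t sizeInBytesTruncating(std::int64_t sizeInBits) {
  return sizeInBits / 8;
}

// Rounding up instead, shown only for comparison.
std::int64_t sizeInBytesRoundedUp(std::int64_t sizeInBits) {
  return (sizeInBits + 7) / 8;
}
```

For sizes that are whole multiples of 8 bits the two helpers agree. They differ only for sub-byte or odd-width types such as i1.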

2018 Aug 06

Lowering ISD::TRUNCATE

I'm working on defining the instructions and implementing the lowering code for a Z80 backend. For now, the backend supports only the native CPU-supported datatypes, which are 8 and 16 bits wide (i.e. no 32 bit long, float, ... yet). So far, a lot of the simple stuff like immediate loads and return values is very straightforward, but now I got stuck with ISD::TRUNCATE, as in:

2018 Aug 20

Condition code in DAGCombiner::visitFADDForFMACombine?

I'm curious why the condition to fuse is this: // Floating-point multiply-add with intermediate rounding. bool HasFMAD = (LegalOperations && TLI.isOperationLegal(ISD::FMAD, VT)); static bool isContractable(SDNode *N) { SDNodeFlags F = N->getFlags(); return F.hasAllowContract() || F.hasAllowReassociation(); } bool CanFuse = Options.UnsafeFPMath || isContractable(N); bool
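For background on why fusion is gated on contraction flags at all, the standalone sketch below contrasts a multiply-add with intermediate rounding against a fused multiply-add, which rounds once. It is ordinary C++ using `std::fma`, not DAGCombiner code. The function names are hypothetical, and it makes no claim about why the condition in the snippet is written as it is.

```cpp
#include <cmath>

// Multiply, round to double, then add: two rounding steps.
// The volatile store keeps the compiler from contracting this into an FMA.
double mulThenAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused multiply-add: a * b + c with a single rounding at the end.
double fusedMulAdd(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-27 and c = -(1 + 2^-26), the exact product is 1 + 2^-26 + 2^-54. The two-step version rounds the product to 1 + 2^-26 and returns 0, while the fused version returns 2^-54. Fusing therefore changes observable results, which is why it needs permission from the flags.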

2013 Aug 08

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

On Aug 8, 2013, at 9:56 AM, Jim Grosbach <grosbach at apple.com> wrote: > Hi Chad, > > This is a great transform to do, but you’re right that it’s only safe under fast-math. This is particularly interesting when the original divisor is a constant so you can materialize the reciprocal at compile-time. You’re right that in either case, this optimization should only kick in when
