thr3ads.net - similar to: "RFC: Moving DAG heuristic-based transforms to MI passes"

Displaying 20 results from an estimated 2000 matches similar to: "RFC: Moving DAG heuristic-based transforms to MI passes"

RFC: Moving DAG heuristic-based transforms to MI passes

2017 Jan 28

In fact, committing the change before dealing with worst-case performance is a good idea because we have two different issues here. But the main idea of this RFC is to show a better approach to these kinds of transformations and to suggest using this approach in the future. At the same time, I'm trying to explain that this patch is not a performance patch because the

[LLVMdev] Proposal: New DAG node type for reciprocal operation

2012 Sep 24

Yes, what I mean is a target-independent node in the ISD::NodeType enum. I already did the node transformation in DAGCombiner and the target-specific lowering in the first place. It worked. But introducing a specific node will make the logic clearer. For example, on ARM, FDIV is a scalar operation. So, after DAGCombiner and vector type legalization, a vectorized FDIV has been expanded into scalar versions,

Pattern transformation between scalar and vector on IR.

2016 Sep 08

Hi All, I'm trying to use the RSQRT instructions in the following case for ARM (sqrt is what is used now): 1.0 / sqrt(x). The RSQRT instructions (VRSQRTE/VRSQRTS) are vector instructions, but the operation above is scalar. So a transformation must be done (transform the sqrt pattern to rsqrt). I have completed a patch for this, but I made the transformation in the backend, which will lead to additional
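For readers unfamiliar with the estimate/step pairing named in this snippet, the following is a minimal C sketch of the Newton-Raphson refinement that such instruction pairs are typically used for. It is not the patch discussed in the thread. The names `rsqrt_step` and `rsqrt_refine`, the seed value, and the step count are illustrative assumptions; the seed stands in for a hardware estimate such as the one VRSQRTE would supply.

```c
#include <assert.h>

/* One Newton-Raphson step for y ~= 1/sqrt(x):
 *   y_next = y * (3 - x * y * y) / 2
 * The (3 - a * b) / 2 factor is the quantity a reciprocal-sqrt "step"
 * instruction computes; here it is written as plain scalar C. */
static double rsqrt_step(double x, double y) {
    return y * (3.0 - x * y * y) / 2.0;
}

/* Hypothetical helper: refine a rough seed a fixed number of times.
 * The seed must already be reasonably close for the iteration to converge. */
static double rsqrt_refine(double x, double seed, int steps) {
    assert(x > 0.0 && steps >= 0);
    double y = seed;
    for (int i = 0; i < steps; ++i)
        y = rsqrt_step(x, y);
    return y;
}
```

Calling `rsqrt_refine(4.0, 0.4, 4)` moves the rough seed 0.4 to within about 1e-9 of 0.5. Each step roughly squares the relative error, which is why a low-precision hardware estimate plus one or two steps is usually enough for single precision.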

Re: [RFC] Improve iteration of estimating divisions

2019 Aug 08

Hal, Yes, speed is an important factor in making the decision. Here I just put the numerator into the estimation, so it won't add any more instructions. A simple benchmark below keeps the same running time between the demo and current master: ``` float fdiv(unsigned int a, unsigned int b) { return (float)a / (float)b; } float m; __attribute__((noinline)) void foo() { m = 0.0; } int main() {

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

2013 Aug 08

I remember why I didn't implement this rule in InstCombine. It adds one instruction. So, this xform should be driven by a redundancy eliminator if you care about code size. On 8/8/13 10:13 AM, Shuxin Yang wrote: > I did few transformation in Instruction *InstCombiner::visitFDiv() in > an attempt to remove some divs. > I may miss this case. If you need to implement this rule, it is >

[LLVMdev] Proposal: New DAG node type for reciprocal operation

2012 Sep 21

--- On Thu, 9/20/12, Jim Grosbach <grosbach at apple.com> wrote: From: Jim Grosbach <grosbach at apple.com> Subject: Re: [LLVMdev] Proposal: New DAG node type for reciprocal operation To: "Weiming Zhao" <weimingz at codeaurora.org> Cc: llvmdev at cs.uiuc.edu Date: Thursday, September 20, 2012, 3:32 PM Sounds like a reasonable fit for a target-specific DAG combine. I

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

2013 Aug 08

I did a few transformations in Instruction *InstCombiner::visitFDiv() in an attempt to remove some divs. I may have missed this case. If you need to implement this rule, it is better done in InstCombine than in DAG combine. Doing such an xform early exposes the redundancy of 1/y, which has a positive impact on neighboring code, while DAG combine is a bit blind. You should be very careful, reciprocal is very

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

2013 Aug 08

I believe we were under the impression that InstCombine, as a canonicalizer/optimizer, should not increase code size but only reduce it. Minor aside, but you don't need all of fast-math for the IR, just the "arcp" flag, which allows for replacement of division with reciprocal-multiply. On Aug 8, 2013, at 10:21 AM, Shuxin Yang <shuxin.llvm at gmail.com> wrote: > I remember

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

2013 Aug 08

Hi Chad, This is a great transform to do, but you’re right that it’s only safe under fast-math. This is particularly interesting when the original divisor is a constant so you can materialize the reciprocal at compile-time. You’re right that in either case, this optimization should only kick in when there is more than one divide instruction that will be changed to a mul. I don’t have a strong

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

2013 Aug 08

On Aug 8, 2013, at 9:56 AM, Jim Grosbach <grosbach at apple.com> wrote: > Hi Chad, > > This is a great transform to do, but you’re right that it’s only safe under fast-math. This is particularly interesting when the original divisor is a constant so you can materialize the reciprocal at compile-time. You’re right that in either case, this optimization should only kick in when

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

2013 Aug 08

I would like to transform X/Y -> X*1/Y. Specifically, I would like to convert: define void @t1a(double %a, double %b, double %d) { entry: %div = fdiv fast double %a, %d %div1 = fdiv fast double %b, %d %call = tail call i32 @foo(double %div, double %div1) ret void } to: define void @t1b(double %a, double %b, double %d) { entry: %div = fdiv fast double 1.000000e+00, %d %mul = fmul
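The thread's point that this rewrite is only safe under fast-math style flags can be illustrated with a small C sketch. It assumes IEEE-754 double arithmetic with round-to-nearest, no excess precision, and a build without fast-math options. The function names `div_direct` and `div_via_reciprocal` are hypothetical and mirror the shape of the two IR functions only loosely.

```c
#include <assert.h>

/* Direct form: one division per quotient, each correctly rounded. */
static void div_direct(double a, double b, double d, double out[2]) {
    assert(d != 0.0);
    out[0] = a / d;
    out[1] = b / d;
}

/* Reciprocal form: one division, then one multiply per quotient.
 * Each result is rounded twice (once for 1/d, once for the product),
 * so it can differ from the direct quotient in the last bit. */
static void div_via_reciprocal(double a, double b, double d, double out[2]) {
    assert(d != 0.0);
    double r = 1.0 / d;
    out[0] = a * r;
    out[1] = b * r;
}
```

With `a = 49.0` and `d = 49.0`, the direct form yields exactly 1.0, while the reciprocal form yields the next double below 1.0, because 1.0/49.0 is rounded before the multiply. When `d` is a power of two, such as 4.0, the reciprocal is exact and both forms agree. The payoff of the reciprocal form is replacing a division with a multiply once at least two quotients share the divisor.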

[LLVMdev] X86 rsqrt instruction generated

2012 Dec 03

Hi, Please find attached the modified patch and description. We have modified and retested the patch taking into consideration the comments and inputs provided earlier. Thanks & Regards, soham -----Original Message----- From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Thursday, November 15, 2012 12:59 PM To: Chakraborty, Soham Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev]

[LLVMdev] Polyhedron 2005 results for dragonegg 3.3svn

2013 Jun 03

Actually, this kind of opportunity, as outlined below, was one of my contrived motivating examples for fast-math. But last year we didn't see such opportunities in the real applications we care about. t1 = x1/y ... t2 = x2/y. I think it is better taken care of by GVN/PRE -- blindly converting x/y => x *1/y is not necessarily beneficial. Or maybe we can blindly perform such

[RFC] Improve iteration of estimating divisions

2019 Aug 06

Hi there, I noticed that our current implementation of the fast division transformation (turning `a / b` into `a * (1/b)`) is less precise than GCC's. Take this case on ppc64le: float fdiv(unsigned int a, unsigned int b) { return (float)a / (float)b; } The result of Clang -Ofast is 41A00001 (in hex), while GCC produces 41A00000, which is the same as no

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

2013 Aug 08

On 08.08.2013, at 18:25, Chad Rosier <chad.rosier at gmail.com> wrote: > I would like to transform X/Y -> X*1/Y. Specifically, I would like to convert: > > define void @t1a(double %a, double %b, double %d) { > entry: > %div = fdiv fast double %a, %d > %div1 = fdiv fast double %b, %d > %call = tail call i32 @foo(double %div, double %div1) > ret void >

how to simplify FP ops with an undef operand?

2018 Feb 28

Yes, if %x is a NaN, we should expect that NaN is propagated. I'm still not sure what to do here. We can take comfort in knowing that whatever we do is likely an improvement over the current situation though. :) That's because the code in InstSimplify is inconsistent with the LangRef: http://llvm.org/docs/LangRef.html#undefined-values (UB for fdiv by 0?) ...and both of those are

[LLVMdev] "Anti" scheduling with OoO cores?

2014 Nov 02

Hi Andy, Dave, I've been doing a bit of experimentation trying to understand the schedmodel a bit better and improving modelling of FDIV (on Cortex-A57). FDIV is not pipelined, and blocks other FDIV operations (FDIVDrr and FDIVSrr). This seems to be already semi-modelled, with a "ResourceCycles=[18]" line in the SchedWriteRes for this instruction. This doesn't seem to work (a

[LLVMdev] Question on Machine Combiner Pass

2015 Feb 04

Ping From: Mandeep Singh Grang [mailto:mgrang at codeaurora.org] Sent: Tuesday, February 03, 2015 4:34 PM To: 'llvmdev at cs.uiuc.edu' Cc: 'ghoflehner at apple.com'; 'apazos at codeaurora.org'; mgrang at codeaurora.org Subject: Question on Machine Combiner Pass Hi, In the file lib/CodeGen/MachineCombiner.cpp I see that in the function

how to simplify FP ops with an undef operand?

2018 Feb 28

I’m not sure the transformation happening with fdiv is correct. If we have “%y = fdiv float %x, undef” and %x is a NaN then the result will be NaN for any value of the undef, right? So if I understand the undef rules correctly (never a certainty) then we can’t safely replace the expression with undef. We could, I think, replace it with “%y = %x” though. I think the same is true for fadd, fsub,

how to simplify FP ops with an undef operand?

2018 Feb 28

Why is NaN “just ‘undef’ in IR”? NaN is a specific value with well-defined behavior. I would think that unless the no-NaNs flag is used we need to preserve the behavior of NaNs. From: Sanjay Patel [mailto:spatel at rotateright.com] Sent: Wednesday, February 28, 2018 12:08 PM To: Kaylor, Andrew <andrew.kaylor at intel.com> Cc: llvm-dev <llvm-dev at lists.llvm.org>; Nuno Lopes