thr3ads.net - search: "visitfptrunc"

Displaying 6 results from an estimated 6 matches for "visitfptrunc".

[LLVMdev] [RFC] How to fix sqrt vs llvm.sqrt optimization asymmetry

2013 Nov 10

...and so I modified Clang to emit calls to llvm.sqrt in fast-math mode for sqrt (and sqrt[fl]). This makes it similar to the libm pow and fma calls, which Clang always transforms into the llvm.pow and llvm.fma intrinsics.

Here's the problem: There is an InstCombine optimization for sqrt (inside visitFPTrunc), and a bunch of optimizations inside SimplifyLibCalls that apply only to the sqrt libm call, and not to the intrinsics. The result, among other things, is PR17758, where fast-math mode actually produces slower code for non-vectorized sqrt calls.

Some questions:

- Is the asymmetry between optimi...

[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)

2011 Sep 08

...ith optimisations turned on:

-----
define void @dpdiv(float* nocapture %result, float %x, float %y) nounwind uwtable {
entry:
  %conv3 = fdiv float %x, %y
  store float %conv3, float* %result, align 4, !tbaa !1
  ret void
}
-----

The main optimisation applied here is near the top of InstCombiner::visitFPTrunc, which simplifies fptrunc(fdiv (fpextend x), (fpextend y)) to fdiv(x, y). Because double precision floating point divides are accurate in OpenCL, the single precision divide in the optimised code must also be accurate, unlike a "direct" single precision divide. I would imagine that creat...
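As a rough illustration of why that fold is value-preserving when every divide is correctly rounded, the sketch below emulates the "extend, divide in double, truncate" path in Python and checks that the result is the float32 value nearest to the exact quotient. It is not LLVM code: the helper names (to_f32, f32_neighbors, wide_then_truncate, is_nearest_f32) are made up for this example, and it assumes positive, normal-range inputs and IEEE-754 round-to-nearest. Under those assumptions a correctly rounded single precision divide would give the same answer, so the concern in the snippet only arises when the direct single precision divide is allowed to be less accurate.

```python
import struct
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_neighbors(v: float) -> tuple[float, float]:
    """Return the two float32 values adjacent to v.

    Assumes v is a positive, normal float32 value, so stepping the
    bit pattern by one moves to the adjacent representable value.
    """
    bits = struct.unpack("<I", struct.pack("<f", v))[0]
    below = struct.unpack("<f", struct.pack("<I", bits - 1))[0]
    above = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return below, above


def wide_then_truncate(x: float, y: float) -> float:
    """Emulate fptrunc(fdiv(fpext x, fpext y)).

    x and y are assumed to already hold float32 values; Python's
    division is the double precision divide.
    """
    return to_f32(x / y)


def is_nearest_f32(candidate: float, x: float, y: float) -> bool:
    """Check that no adjacent float32 is strictly closer to the exact x / y."""
    exact = Fraction(x) / Fraction(y)
    err = abs(Fraction(candidate) - exact)
    return all(abs(Fraction(n) - exact) >= err for n in f32_neighbors(candidate))


if __name__ == "__main__":
    x, y = to_f32(1.0), to_f32(3.0)
    q = wide_then_truncate(x, y)
    print(q, is_nearest_f32(q, x, y))
```

The check uses exact rational arithmetic so that it does not depend on the floating point behaviour it is testing.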

[LLVMdev] [RFC] How to fix sqrt vs llvm.sqrt optimization asymmetry

2013 Nov 11

...modified Clang to emit calls to llvm.sqrt in fast-math mode for sqrt (and sqrt[fl]). This makes it similar to the libm pow and fma calls, which Clang always transforms into the llvm.pow and llvm.fma intrinsics.
>
> Here's the problem: There is an InstCombine optimization for sqrt (inside visitFPTrunc), and a bunch of optimizations inside SimplifyLibCalls that apply only to the sqrt libm call, and not to the intrinsics. The result, among other things, is PR17758, where fast-math mode actually produces slower code for non-vectorized sqrt calls.
>
> Some questions:
>
> - Is the asymm...

[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)

2011 Sep 08

Peter,

Is there a way to make this flag globally available? Metadata can be fairly expensive to handle at each node when in many cases it is a global flag and not a per-operation flag.

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Robert Quill
> Sent: Thursday, September 08, 2011 3:24 AM
> To: Peter

[LLVMdev] [RFC] How to fix sqrt vs llvm.sqrt optimization asymmetry

2013 Nov 12

...-math
> > mode for sqrt (and sqrt[fl]). This makes it similar to the libm
> > pow and fma calls, which Clang always transforms into the llvm.pow
> > and llvm.fma intrinsics.
> >
> > Here's the problem: There is an InstCombine optimization for sqrt
> > (inside visitFPTrunc), and a bunch of optimizations inside
> > SimplifyLibCalls that apply only to the sqrt libm call, and not to
> > the intrinsics. The result, among other things, is PR17758, where
> > fast-math mode actually produces slower code for non-vectorized
> > sqrt calls.
> >
> > >...

[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)

2011 Sep 08

Hi Peter,

This sounds like a really good idea. One thing that did occur to me, though, from an OpenCL point of view is that ULP accuracy requirements can differ for embedded and full profile, so that may need to be handled somehow.

Thanks,
Rob

On Wed, 2011-09-07 at 21:55 +0100, Peter Collingbourne wrote:
> Hi,
>
> This is my proposal to add floating point accuracy support to LLVM.
>
