similar to: [LLVMdev] Clarifying FMA-related TargetOptions

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] Clarifying FMA-related TargetOptions"

2012 Feb 08
0
[LLVMdev] Clarifying FMA-related TargetOptions
Hi Owen, Having looked into this due to Clang failing PlumHall with it recently I can give an opinion... I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If c == d, it is still possible for that result not to equal a*b, as "+c
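To make the failure mode concrete, here is a minimal sketch (the values are chosen for illustration and are not from the thread; std::fma stands in for what a hardware FMA contraction computes):

    #include <cmath>
    #include <cstdio>

    int main() {
        // Illustrative values: the exact product a*a needs more than 53 bits.
        double a = 1.0 + 0x1p-27;
        double p = a * a;                  // p = round(a*a); the low bits are lost

        // Unfused evaluation: round(a*a) - p == 0, as most code expects.
        double unfused = a * a - p;

        // Contraction turns a*a - p into fma(a, a, -p): the product stays
        // exact inside the fma, so the result is the nonzero rounding error.
        double fused = std::fma(a, a, -p);

        printf("unfused = %g, fused = %g\n", unfused, fused);  // 0 vs ~5.6e-17
        return 0;
    }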
2012 Feb 08
1
[LLVMdev] Clarifying FMA-related TargetOptions
On Feb 8, 2012, at 10:44 AM, James Molloy wrote:
> Hi Owen,
>
> Having looked into this due to Clang failing PlumHall with it recently I can give an opinion...
>
> I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating-point comparisons such as: a * b + c - d. If
2012 Feb 08
0
[LLVMdev] Clarifying FMA-related TargetOptions
On Wed, 2012-02-08 at 10:11 -0800, Owen Anderson wrote:
> Hello everyone,
>
> I'd like to propose the attached patch to form FMA intrinsics aggressively, but in order to do so I need some clarification on the intended semantics for the various FP precision-related TargetOptions. I've summarized the three relevant ones below:
>
> UnsafeFPMath -
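For readers without the thread's context, a minimal sketch of the knobs under discussion, assuming the 3.x-era llvm::TargetOptions fields named in these messages (the struct has changed considerably since, so treat the field set as an assumption):

    #include "llvm/Target/TargetOptions.h"

    // Sketch only: 3.x-era field names taken from this thread, not current LLVM.
    llvm::TargetOptions relaxedFPOptions() {
        llvm::TargetOptions Opts;
        Opts.UnsafeFPMath = true;            // allow value-changing FP rewrites
        Opts.NoExcessFPPrecision = false;    // !NoExcessFPPrecision: extra
                                             // intermediate precision (e.g. FMA) is OK
        Opts.LessPreciseFPMADOption = true;  // permit less-precise multiply-add
        return Opts;
    }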
2012 Feb 08
1
[LLVMdev] Clarifying FMA-related TargetOptions
On Feb 8, 2012, at 10:42 AM, Hal Finkel wrote:
> In my experience, users of numerical codes expect that the compiler will use FMA instructions where it can, unless specifically asked to avoid doing so by the user. Even though this can sometimes produce a different result (*almost* always a better one), the performance gain is too large to be ignored by default. I highly
2013 Jul 18
2
[LLVMdev] LLVM 3.3 JIT code speed
Hi,

Our DSL's emitted LLVM IR (optimized with -O3-style IR-to-IR passes) runs slower when executed with the LLVM 3.3 JIT, compared to what we had with LLVM 3.1. What could be the reason?

I tried to play with TargetOptions without any success…

Here is the kind of code we use to allocate the JIT:

    EngineBuilder builder(fResult->fModule);
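For context, a sketch of what that 3.3-era JIT construction typically looked like (fResult->fModule is from the original post; the surrounding setup and option values here are assumptions):

    #include "llvm/ExecutionEngine/ExecutionEngine.h"
    #include "llvm/ExecutionEngine/JIT.h"
    #include "llvm/Support/TargetSelect.h"

    // Sketch of a 3.3-era JIT setup; error handling trimmed for brevity.
    llvm::ExecutionEngine* buildJIT(llvm::Module* module) {
        llvm::InitializeNativeTarget();

        std::string err;
        llvm::TargetOptions opts;
        opts.UnsafeFPMath = true;  // illustrative: one knob the poster tried

        llvm::EngineBuilder builder(module);
        return builder.setErrorStr(&err)
                      .setOptLevel(llvm::CodeGenOpt::Aggressive)
                      .setTargetOptions(opts)
                      .create();
    }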
2016 Nov 18
2
what does -ffp-contract=fast allow?
On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> From: "Warren
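For readers skimming this thread, the usual short answer, sketched below, is that -ffp-contract=on only licenses fusion within a single source expression (per the C FP_CONTRACT rules), while -ffp-contract=fast may also fuse across statements; the functions are illustrative:

    // Sketch; exact behavior depends on compiler and target.
    // Compile with: clang++ -O2 -ffp-contract=fast demo.cpp

    double within_expr(double a, double b, double c) {
        return a * b + c;   // may become fma under -ffp-contract=on or fast
    }

    double across_stmts(double a, double b, double c) {
        double p = a * b;   // separate statement...
        return p + c;       // ...only -ffp-contract=fast may fuse these two
    }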
2016 Nov 18
2
what does -ffp-contract=fast allow?
----- Original Message -----
> From: "Sanjay Patel" <spatel at rotateright.com>
> To: "Hal J. Finkel" <hfinkel at anl.gov>
> Cc: "Mehdi Amini" <mehdi.amini at apple.com>, "llvm-dev" <llvm-dev at lists.llvm.org>, "cfe-dev" <cfe-dev at lists.llvm.org>, "andrew kaylor" <andrew.kaylor at
2013 Jul 18
0
[LLVMdev] LLVM 3.3 JIT code speed
On Thu, Jul 18, 2013 at 9:07 AM, Stéphane Letz <letz at grame.fr> wrote:
> Hi,
>
> Our DSL's emitted LLVM IR (optimized with -O3-style IR-to-IR passes) runs slower when executed with the LLVM 3.3 JIT, compared to what we had with LLVM 3.1. What could be the reason?
>
> I tried to play with TargetOptions without any success…
>
> Here is the kind of code we use to
2017 Jun 28
2
Override TargetOptions for block of code?
Hi,

We generally run our JIT with UnsafeFPMath enabled, but there are a few specific instances where a block of code needs to follow strict FPMath. Is there a way to temporarily override TargetOptions for a specific block of IR?
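One common answer, sketched here as an assumption rather than a confirmed recipe: TargetOptions itself is fixed per TargetMachine, but fast-math behavior can be pinned per function with string attributes (the names below match the attribute list quoted in the 3.3 thread further down), so the strict block can be outlined into its own function:

    #include "llvm/IR/Function.h"

    // Sketch: outline the strict block into its own function, then pin its
    // FP semantics regardless of the JIT-wide UnsafeFPMath default.
    void makeStrictFP(llvm::Function &F) {
        F.addFnAttr("unsafe-fp-math", "false");
        F.addFnAttr("no-infs-fp-math", "false");
        F.addFnAttr("no-nans-fp-math", "false");
    }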
2013 Jul 18
2
[LLVMdev] LLVM 3.3 JIT code speed
On 18 Jul 2013, at 19:07, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Thu, Jul 18, 2013 at 9:07 AM, Stéphane Letz <letz at grame.fr> wrote:
>> Hi,
>>
>> Our DSL's emitted LLVM IR (optimized with -O3-style IR-to-IR passes) runs slower when executed with the LLVM 3.3 JIT, compared to what we had with LLVM 3.1. What could be the reason?
>>
2013 Jul 25
2
[LLVMdev] Clang/LLVM 3.3 unwanted attributes being added: NoFramePointerElim
Since updating to LLVM 3.3, the system is generating attributes such as:

    attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"="true" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false"
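These attributes are attached per function by the frontend from its codegen defaults. If they are unwanted, one option, sketched below for a reasonably modern LLVM (older releases spelled the attribute APIs differently), is to strip them after module creation:

    #include "llvm/IR/Module.h"

    // Sketch: remove the frame-pointer attributes from every function.
    void stripFramePointerAttrs(llvm::Module &M) {
        for (llvm::Function &F : M) {
            F.removeFnAttr("no-frame-pointer-elim");
            F.removeFnAttr("no-frame-pointer-elim-non-leaf");
        }
    }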
2018 Aug 20
3
Condition code in DAGCombiner::visitFADDForFMACombine?
I'm curious why the condition to fuse is this:

    // Floating-point multiply-add with intermediate rounding.
    bool HasFMAD = (LegalOperations && TLI.isOperationLegal(ISD::FMAD, VT));

    static bool isContractable(SDNode *N) {
      SDNodeFlags F = N->getFlags();
      return F.hasAllowContract() || F.hasAllowReassociation();
    }

    bool CanFuse = Options.UnsafeFPMath || isContractable(N);
    bool
2014 Sep 19
2
[LLVMdev] More careful treatment of floating point exceptions
Hi Sanjay,

Thanks, I saw this flag and it definitely should be considered, but it appeared to me to be a static characteristic of the target platform. I'm not sure how appropriate it would be to change its value from a front-end. It says "Has", while an optional flag would rather say "Uses", meaning that the implementation cares about floating point exceptions.

Regards, Sergey
2018 Aug 22
4
Condition code in DAGCombiner::visitFADDForFMACombine?
On 22.08.2018 13:29, Ryan Taylor wrote:
> The example starts as SPIR-V with the NoContraction decoration flag on the fmul.
>
> I think what you are saying seems valid: if the user had put the flag on the fadd instead of the fmul, it would not contract. So in this example the user needs to put the NoContraction on the fadd, though I'm not sure
2018 Aug 22
2
Condition code in DAGCombiner::visitFADDForFMACombine?
On 21.08.2018 16:08, Ryan Taylor via llvm-dev wrote:
> So I have a test case where:
>
>   %20 = fmul nnan arcp float %15, %19
>   %21 = fadd reassoc nnan arcp contract float %20, -1.000000e+00
>
> is being contracted in the DAG to fmad. Is this correct, since the fmul has no reassoc or contract fast-math flag?

By having the reassoc and contract flags on fadd, the frontend is
2015 Jan 09
5
[LLVMdev] Enable changing UnsafeFPMath on a per-function basis
To continue the discussion I started last year (see the link below) on embedding command-line options in bitcode, I came up with a plan to improve the way the backend changes UnsafeFPMath on a per-function basis. The code in trunk currently resets TargetOptions::UnsafeFPMath at the beginning of SelectionDAGISel::runOnMachineFunction to enable compiling one function with “unsafe-fp-math=true” and
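The mechanism being improved, reduced to a sketch (this is a simplification of what SelectionDAGISel::runOnMachineFunction does, not the actual trunk code):

    #include "llvm/IR/Function.h"
    #include "llvm/Target/TargetOptions.h"

    // Sketch: reset the global flag from the per-function string attribute.
    void applyFunctionFPAttrs(const llvm::Function &F,
                              llvm::TargetOptions &Opts) {
        if (F.hasFnAttribute("unsafe-fp-math"))
            Opts.UnsafeFPMath =
                F.getFnAttribute("unsafe-fp-math").getValueAsString() == "true";
    }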
2012 Dec 12
3
[LLVMdev] Question about FMA formation
Hi, Dear All:

I'm going to implement FMA formation. On some architectures, "FMA a, b, c" is more precise than "a * b + c". I'm wondering if FMA could ever be less precise. If it cannot be, can we enable FMA formation despite a restrictive FP mode?

Thanks
Shuxin
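Background for the question: for a single operation, fma(a, b, c) rounds once where a * b + c rounds twice, so the fused form is never less accurate in isolation; it can still change program results, as in this illustrative sketch:

    #include <cmath>
    #include <cstdio>

    int main() {
        // a*b is just below 1; the unfused product rounds up to exactly 1.
        double a = 1.0 + 0x1p-27, b = 1.0 - 0x1p-27, c = -1.0;

        double unfused = a * b + c;          // round(round(a*b) + c) == 0
        double fused   = std::fma(a, b, c);  // round(a*b + c) == -2^-54, exact

        printf("unfused = %g, fused = %g\n", unfused, fused);
        return 0;
    }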
2014 Dec 10
2
[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?
Thanks! That's probably close enough for practical purposes. I looked at the overrides on various targets, and they all return true if the FMA hardware exists.

- Arch

From: Jingyue Wu [mailto:jingyue at google.com]
Sent: Wednesday, December 10, 2014 2:56 PM
To: Robison, Arch
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?

Does
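The hook in question is TargetLoweringBase::isFMAFasterThanFMulAndFAdd. A sketch of the query from JIT setup code, assuming the 3.5-era API where TargetMachine still exposed getTargetLowering() (the exact signatures have moved around between releases):

    #include "llvm/Target/TargetLowering.h"
    #include "llvm/Target/TargetMachine.h"

    // Sketch: ask the target whether llvm.fma.f64 should lower to real
    // FMA hardware rather than a libcall or mul+add expansion.
    bool hasFastFMA(const llvm::TargetMachine &TM) {
        const llvm::TargetLowering *TLI = TM.getTargetLowering();
        return TLI && TLI->isFMAFasterThanFMulAndFAdd(llvm::MVT::f64);
    }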
2014 Dec 10
2
[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?
For the Julia language JIT, we'd like to be able to tell whether the llvm.fma.* intrinsic has hardware support. What's the best way to query LLVM (JIT) for this information? The information would be used in situations where the user wants to use different algorithms depending on whether FMA hardware is present or not.

- Arch D. Robison
2012 Dec 13
0
[LLVMdev] Question about FMA formation
On Dec 12, 2012, at 3:40 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Dear All:
>
> I'm going to implement FMA formation. On some architectures, "FMA a, b, c" is more precise than "a * b + c". I'm wondering if FMA could ever be less precise. If it cannot be, can we enable FMA formation despite a restrictive FP mode?

I believe