thr3ads.net - similar to: "[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?"

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?"

[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?

2014 Dec 10

Thanks! That’s probably close enough for practical purposes. I looked at the overrides on various targets, and they all return true if the FMA hardware exists. - Arch From: Jingyue Wu [mailto:jingyue at google.com] Sent: Wednesday, December 10, 2014 2:56 PM To: Robison, Arch Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Best way for JIT to query whether llvm.fma.* is fast? Does

[LLVMdev] API break for out-of-tree targets implementing TargetLoweringBase::isFMAFasterThanMulAndAdd

2013 Jul 08

Hello, To any out-of-tree targets, please be aware that I intend to commit a patch that will break the build of any target implementing TargetLoweringBase::isFMAFasterThanMulAndAdd, for the reasons described below. (Basically, the current interface definition is broken and not followed, and no in-tree target was doing the right thing with it, so it is unlikely any out-of-tree target is either...)

what does -ffp-contract=fast allow?

2016 Nov 18

On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > >> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> >>> From: "Warren

what does -ffp-contract=fast allow?

2016 Nov 18

----- Original Message ----- > From: "Sanjay Patel" <spatel at rotateright.com> > To: "Hal J. Finkel" <hfinkel at anl.gov> > Cc: "Mehdi Amini" <mehdi.amini at apple.com>, "llvm-dev" > <llvm-dev at lists.llvm.org>, "cfe-dev" <cfe-dev at lists.llvm.org>, > "andrew kaylor" <andrew.kaylor at

[LLVMdev] Question about FMA formation

2012 Dec 12

Hi, Dear All: I'm going to implement FMA formation. On some architectures, "FMA a, b, c" is more precise than "a * b + c". I'm wondering if FMA could be less precise. In the former case, can we enable FMA formation despite restrictive FP mode? Thanks Shuxin
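
A minimal sketch of why a fused multiply-add can be more precise than a separate multiply and add, assuming IEEE 754 double precision. `fma_emulated` is a hypothetical helper (not an LLVM API and not from the thread) that models the single rounding of an FMA with exact rational arithmetic; the input values are chosen only for illustration.

```python
from fractions import Fraction


def fma_emulated(a, b, c):
    """Hypothetical helper: compute a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = b = 1.0 + 2.0**-30
c = -(1.0 + 2.0**-29)

# Unfused: a*b is rounded to a double first, then the add is rounded again.
unfused = a * b + c

# Fused: the exact product feeds the add, and only the final sum is rounded.
fused = fma_emulated(a, b, c)

# Exact mathematical result, for comparison.
exact = Fraction(a) * Fraction(b) + Fraction(c)
```

With these inputs the separate multiply rounds away the low-order term of the product, so the unfused form cancels to zero, while the fused form keeps the exact residual.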

[LLVMdev] Clarifying FMA-related TargetOptions

2012 Feb 08

On Feb 8, 2012, at 10:42 AM, Hal Finkel wrote: > In my experience, users of numerical codes expect that the compiler will > use FMA instructions where it can, unless specifically asked to avoid > doing so by the user. Even though this can sometimes produce a different > result (*almost* always a better one), the performance gain is too large > to be ignored by default. I highly

[LLVMdev] Question about FMA formation

2012 Dec 13

On Dec 12, 2012, at 3:40 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote: > Hi, Dear All: > > I'm going to implement FMA formation. On some architectures, "FMA a, b, c" is more precise than > "a * b + c". I'm wondering if FMA could be less precise. In the former case, can we enable FMA > formation despite restrictive FP mode? > I believe

[LLVMdev] Question about FMA formation

2012 Dec 13

Hi Michael, Shuxin, > Shuxin was showing some more complicated patterns that required > re-association to match (fast-math flags permitting). For those, we're > considering if having a re-associate-for-FMA functionality in > codegen-prepare would solve that problem. Thus, we can re-associate in > codegen-prepare and emit FMA in fast-isel. > Yep. I misread the association

[LLVMdev] Clarifying FMA-related TargetOptions

2012 Feb 08

On Feb 8, 2012, at 10:44 AM, James Molloy wrote: > Hi Owen, > > Having looked into this due to Clang failing PlumHall with it recently I can give an opinion... > > I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If

[LLVMdev] Question about FMA formation

2012 Dec 13

On Dec 12, 2012, at 5:20 PM, Eric Christopher <echristo at gmail.com> wrote: > Why not just form them via a fast IR level pass and just have patterns match in fast isel instead of trying to form code? Or are we saying the same thing? (Your words of "fast isel spot"ting and "form better code" caused me to think you mean to do optimizations within the fast isel pass).

[X86] FMA transformation restrictions

2016 Sep 12

I noticed that the operand commuting code in X86InstrInfo.cpp treats scalar FMA intrinsics specially. It prevents operand commuting on these scalar instructions because the scalar FMA instructions preserve the upper bits of the vector. Presumably, the restrictions are there because commuting operands potentially changes the result upper bits. However, AFAIK the Intel and GNU FMA intrinsics

[LLVMdev] Question about FMA formation

2012 Dec 13

Hi, Eli, Mike and Lang: Thank you all for the input. This is one example that might be difficult for isel: a*b + c*d + e => a*b + (c*d + e). Thanks Shuxin On 12/12/12 4:43 PM, Lang Hames wrote: > A little background: > > The fmuladd intrinsic was introduced to support the FP_CONTRACT pragma > in C. llvm.fmuladd.* is generated by clang when it sees an expression > of the
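
The reassociation a*b + c*d + e => a*b + (c*d + e) mentioned above can be sketched as two nested fused operations. This is only an illustration under the assumption of IEEE 754 doubles; `fma_emulated`, `left_to_right`, and `reassociated_with_fma` are hypothetical names, not LLVM functions.

```python
from fractions import Fraction


def fma_emulated(a, b, c):
    """Hypothetical helper: compute a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def left_to_right(a, b, c, d, e):
    # Source order: two rounded multiplies, then two rounded adds.
    return (a * b + c * d) + e


def reassociated_with_fma(a, b, c, d, e):
    # After reassociation to a*b + (c*d + e), each multiply-add pair fuses.
    return fma_emulated(a, b, fma_emulated(c, d, e))


# Small integers are exact in every form, so both orders agree.
same_args = (2.0, 3.0, 4.0, 5.0, 6.0)

# Inputs where the low bits of a*b matter: the two forms disagree.
x = 1.0 + 2.0**-30
diff_args = (x, x, 1.0, -(1.0 + 2.0**-29), 0.0)
```

Because the two forms can produce different values, a rewrite of this kind changes results, which is consistent with the thread's remark that it needs fast-math flags to permit reassociation.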

FMA canonicalization in IR

2016 Nov 19

On Nov 19, 2016 10:26 AM, Sanjay Patel <spatel at rotateright.com> wrote: > If I have my FMA intrinsics story straight now (thanks for the explanation, Hal!), I think it raises another question about IR canonicalization (and may affect the proposed revision to IR FMF): No, I think that we specifically

[LLVMdev] Clarifying FMA-related TargetOptions

2012 Feb 08

Hi Owen, Having looked into this due to Clang failing PlumHall with it recently I can give an opinion... I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If c == d, it is still possible for that result not to equal a*b, as "+c
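
A small sketch of the kind of discrepancy that can arise in a * b + c - d with c == d, assuming IEEE 754 doubles with round-to-nearest-even. `fma_emulated` is a hypothetical helper that models a single-rounding FMA, and the values are picked deliberately so that a*b + c lands on a rounding tie; this illustrates the general effect and is not the example from the original message.

```python
from fractions import Fraction


def fma_emulated(a, b, c):
    """Hypothetical helper: compute a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = b = 1.0 + 2.0**-30
c = d = 2.0**24  # c == d

# The rounded product on its own: 1 + 2**-29.
product = a * b

# Unfused: the multiply, the add, and the subtract are each rounded.
unfused = a * b + c - d

# Contracted: a*b + c becomes one FMA, then d is subtracted.
fused = fma_emulated(a, b, c) - d
```

Here neither form returns the rounded product a*b, and the fused and unfused forms also disagree with each other, so a comparison written against one can fail under the other.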

[LLVMdev] Clarifying FMA-related TargetOptions

2012 Feb 08

On Wed, 2012-02-08 at 10:11 -0800, Owen Anderson wrote: > Hello everyone, > > > I'd like to propose the attached patch to form FMA intrinsics > aggressively, but in order to do so I need some clarification on the > intended semantics for the various FP precision-related > TargetOptions. I've summarized the three relevant ones below: > > > UnsafeFPMath -

[LLVMdev] Question about FMA formation

2012 Dec 13

A little background: The fmuladd intrinsic was introduced to support the FP_CONTRACT pragma in C. llvm.fmuladd.* is generated by clang when it sees an expression of the form 'a * b + c' within a single source statement. If you want to opportunistically form FMA target instructions my inclination would be to skip llvm.fmuladd.* and just form them from a*b+c expressions at isel time. I

[LLVMdev] Question about FMA formation

2012 Dec 13

Why not just form them via a fast IR level pass and just have patterns match in fast isel instead of trying to form code? Or are we saying the same thing? (Your words of "fast isel spot"ting and "form better code" caused me to think you mean to do optimizations within the fast isel pass). -eric On Wed, Dec 12, 2012 at 5:14 PM, Michael Ilseman <milseman at apple.com>

[LLVMdev] Question about FMA formation

2012 Dec 13

Right now we're leaning towards having a re-association helper in codegen-prepare that will re-associate expressions (if allowed). This would allow fast-isel to more easily spot FMA opportunities, and form better code. On Dec 12, 2012, at 5:11 PM, Eric Christopher <echristo at gmail.com> wrote: > > > > You hit send right when I did! > For your example, do you mean that

AVX2 codegen - question reg. FMA generation

2019 Sep 02

On Mon, 2 Sep 2019 at 16:59, Roman Lebedev <lebedev.ri at gmail.com> wrote: > > It appears you need 'reassoc' on fmul/fadd: > https://godbolt.org/z/nuTzx2 Thanks very much, that was it. Either that or providing -enable-unsafe-fp-math to llc yielded FMAs. I didn't expect this since using FMAs here instead of mul/add appears to be safer (the reverse is unsafe). ~ Uday

[LLVMdev] Clarifying FMA-related TargetOptions

2012 Feb 08

Hello everyone, I'd like to propose the attached patch to form FMA intrinsics aggressively, but in order to do so I need some clarification on the intended semantics for the various FP precision-related TargetOptions. I've summarized the three relevant ones below: UnsafeFPMath - Defaults to off, enables "less precise" results than permitted by IEEE754. Comments specifically