Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?"
2014 Dec 10
2
[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?
Thanks! That’s probably close enough for practical purposes. I looked at the overrides on various targets, and they all return true if the FMA hardware exists.
- Arch
From: Jingyue Wu [mailto:jingyue at google.com]
Sent: Wednesday, December 10, 2014 2:56 PM
To: Robison, Arch
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?
Does
2013 Jul 08
1
[LLVMdev] API break for out-of-tree targets implementing TargetLoweringBase::isFMAFasterThanMulAndAdd
Hello,
To any out-of-tree targets, please be aware that I intend to commit a
patch that will break the build of any target implementing
TargetLoweringBase::isFMAFasterThanMulAndAdd, for the reasons
described below. (Basically, the current interface definition is
broken and not followed, and no in-tree target was doing the right
thing with it, so it is unlikely any out-of-tree target is either...)
2016 Nov 18
2
what does -ffp-contract=fast allow?
On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>
>> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>
>> ________________________________
>>>
>>> From: "Warren
2016 Nov 18
2
what does -ffp-contract=fast allow?
----- Original Message -----
> From: "Sanjay Patel" <spatel at rotateright.com>
> To: "Hal J. Finkel" <hfinkel at anl.gov>
> Cc: "Mehdi Amini" <mehdi.amini at apple.com>, "llvm-dev"
> <llvm-dev at lists.llvm.org>, "cfe-dev" <cfe-dev at lists.llvm.org>,
> "andrew kaylor" <andrew.kaylor at
2012 Dec 12
3
[LLVMdev] Question about FMA formation
Hi, Dear All:
I'm going to implement FMA formation. On some architectures, "FMA a, b,
c" is more precise than
"a * b + c". I'm wondering if FMA could be less precise. In the former
case, can we enable FMA
formation despite restrictive FP mode?
Thanks
Shuxin
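As a rough illustration of the precision point raised in this message, the sketch below compares a multiply followed by an add (two roundings) with a fused multiply-add (one rounding). It is not taken from the thread: `fused_multiply_add` and `separate_multiply_add` are hypothetical helper names, the fused result is emulated with exact rational arithmetic rather than a hardware FMA, and the inputs are chosen by hand. It assumes IEEE 754 binary64 floats with round-to-nearest-even.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a*b + c with a single rounding at the end.

    Hypothetical helper: the product and sum are computed exactly as
    rationals, and only the final conversion to float rounds.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c with two roundings, as separate fmul and fadd would."""
    product = a * b      # first rounding
    return product + c   # second rounding


# Inputs chosen so that the exact product 1 - 2**-54 is not representable
# and rounds to 1.0 when the multiply is performed on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = separate_multiply_add(a, b, c)  # rounded product is 1.0, so the sum is 0.0
fused = fused_multiply_add(a, b, c)       # exact result -2**-54 is representable

print("separate multiply and add:", unfused)
print("fused multiply-add:       ", fused)
```

For these inputs the fused form keeps the low-order bits that the separate multiply discards. The two results differ, which is why contraction is observable even when the fused result is the more accurate one.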
2012 Feb 08
1
[LLVMdev] Clarifying FMA-related TargetOptions
On Feb 8, 2012, at 10:42 AM, Hal Finkel wrote:
> In my experience, users of numerical codes expect that the compiler will
> use FMA instructions where it can, unless specifically asked to avoid
> doing so by the user. Even though this can sometimes produce a different
> result (*almost* always a better one), the performance gain is too large
> to be ignored by default. I highly
2012 Dec 13
0
[LLVMdev] Question about FMA formation
On Dec 12, 2012, at 3:40 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Dear All:
>
> I'm going to implement FMA formation. On some architectures, "FMA a, b, c" is more precise than
> "a * b + c". I'm wondering if FMA could be less precise. In the former case, can we enable FMA
> formation despite restrictive FP mode?
>
I believe
2012 Dec 13
0
[LLVMdev] Question about FMA formation
Hi Michael, Shuxin,
> Shuxin was showing some more complicated patterns that required
> re-association to match (fast-math flags permitting). For those, we're
> considering if having a re-associate-for-FMA functionality in
> codegen-prepare would solve that problem. Thus, we can re-associate in
> codegen-prepare and emit FMA in fast-isel.
>
Yep. I misread the association
2012 Feb 08
1
[LLVMdev] Clarifying FMA-related TargetOptions
On Feb 8, 2012, at 10:44 AM, James Molloy wrote:
> Hi Owen,
>
> Having looked into this due to Clang failing PlumHall with it recently I can give an opinion...
>
> I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If
2012 Dec 13
2
[LLVMdev] Question about FMA formation
On Dec 12, 2012, at 5:20 PM, Eric Christopher <echristo at gmail.com> wrote:
> Why not just form them via a fast IR level pass and just have patterns match in fast isel instead of trying to form code? Or are we saying the same thing? (Your words of "fast isel spot"ting and "form better code" caused me to think you mean to do optimizations within the fast isel pass).
2016 Sep 12
4
[X86] FMA transformation restrictions
I noticed that the operand commuting code in X86InstrInfo.cpp treats
scalar FMA intrinsics specially. It prevents operand commuting on these
scalar instructions because the scalar FMA instructions preserve the
upper bits of the vector. Presumably, the restrictions are there
because commuting operands potentially changes the result upper bits.
However, AFAIK the Intel and GNU FMA intrinsics
2012 Dec 13
0
[LLVMdev] Question about FMA formation
Hi, Eli, Mike and Lang:
Thank you all for the input. Here is one example that might be
difficult for isel:
a*b + c*d + e => a*b + (c*d + e).
Thanks
Shuxin
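A small sketch, not from the thread, of why a rewrite like the one above needs permission from fast-math flags: floating-point addition is not associative, so regrouping `a*b + c*d + e` as `a*b + (c*d + e)` can change the result even before any fused instruction is formed. The function names and input values are assumptions chosen for illustration, and the code assumes IEEE 754 binary64 floats with round-to-nearest-even.

```python
def left_to_right(ab: float, cd: float, e: float) -> float:
    """Evaluate (a*b + c*d) + e, the order the source expression implies."""
    return (ab + cd) + e


def reassociated(ab: float, cd: float, e: float) -> float:
    """Evaluate a*b + (c*d + e), the grouping that exposes a multiply-add pair."""
    return ab + (cd + e)


# Hand-picked values: both products are exact, and 1e16 is large enough
# that adding 1.0 to it is lost to rounding while adding 2.0 is not.
ab = 1e8 * 1e8   # exactly 1e16
cd = 1.0 * 1.0   # exactly 1.0
e = 1.0

original_order = left_to_right(ab, cd, e)
regrouped = reassociated(ab, cd, e)

print("(a*b + c*d) + e =", original_order)
print("a*b + (c*d + e) =", regrouped)
print("results differ: ", original_order != regrouped)
```

Here the regrouped form happens to be the more accurate one, but with other inputs the reverse can hold, so a compiler may only perform this step when re-association has been explicitly allowed.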
On 12/12/12 4:43 PM, Lang Hames wrote:
> A little background:
>
> The fmuladd intrinsic was introduced to support the FP_CONTRACT pragma
> in C. llvm.fmuladd.* is generated by clang when it sees an expression
> of the
2016 Nov 19
2
FMA canonicalization in IR
On Nov 19, 2016 10:26 AM, Sanjay Patel <spatel at rotateright.com> wrote:
>
> If I have my FMA intrinsics story straight now (thanks for the explanation, Hal!), I think it raises another question about IR canonicalization (and may affect the proposed revision to IR FMF):
No, I think that we specifically
2012 Feb 08
0
[LLVMdev] Clarifying FMA-related TargetOptions
Hi Owen,
Having looked into this due to Clang failing PlumHall with it recently I can give an opinion...
I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If c == d, it is still possible for that result not to equal a*b, as "+c
2012 Feb 08
0
[LLVMdev] Clarifying FMA-related TargetOptions
On Wed, 2012-02-08 at 10:11 -0800, Owen Anderson wrote:
> Hello everyone,
>
>
> I'd like to propose the attached patch to form FMA intrinsics
> aggressively, but in order to do so I need some clarification on the
> intended semantics for the various FP precision-related
> TargetOptions. I've summarized the three relevant ones below:
>
>
> UnsafeFPMath -
2012 Dec 13
3
[LLVMdev] Question about FMA formation
A little background:
The fmuladd intrinsic was introduced to support the FP_CONTRACT pragma in
C. llvm.fmuladd.* is generated by clang when it sees an expression of the
form 'a * b + c' within a single source statement.
If you want to opportunistically form FMA target instructions my
inclination would be to skip llvm.fmuladd.* and just form them from a*b+c
expressions at isel time. I
2012 Dec 13
0
[LLVMdev] Question about FMA formation
Why not just form them via a fast IR level pass and just have patterns
match in fast isel instead of trying to form code? Or are we saying the
same thing? (Your words of "fast isel spot"ting and "form better code"
caused me to think you mean to do optimizations within the fast isel pass).
-eric
On Wed, Dec 12, 2012 at 5:14 PM, Michael Ilseman <milseman at apple.com>
2012 Dec 13
2
[LLVMdev] Question about FMA formation
Right now we're leaning towards having a re-association helper in codegen-prepare that will re-associate expressions (if allowed). This would allow fast-isel to more easily spot FMA opportunities, and form better code.
On Dec 12, 2012, at 5:11 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>
>
> You hit send right when I did!
> For your example, do you mean that
2019 Sep 02
2
AVX2 codegen - question reg. FMA generation
On Mon, 2 Sep 2019 at 16:59, Roman Lebedev <lebedev.ri at gmail.com> wrote:
>
> It appears you need 'reassoc' on fmul/fadd:
> https://godbolt.org/z/nuTzx2
Thanks very much, that was it. Either that or providing
-enable-unsafe-fp-math to llc yielded FMAs. I didn't expect this since
using FMAs here instead of mul/add appears to be safer (the reverse is
unsafe).
~ Uday
2012 Feb 08
6
[LLVMdev] Clarifying FMA-related TargetOptions
Hello everyone,
I'd like to propose the attached patch to form FMA intrinsics aggressively, but in order to do so I need some clarification on the intended semantics for the various FP precision-related TargetOptions. I've summarized the three relevant ones below:
UnsafeFPMath - Defaults to off, enables "less precise" results than permitted by IEEE754. Comments specifically