Displaying 20 results from an estimated 5000 matches similar to: "[X86] FMA transformation restrictions"
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Kay,
My patch will partially address your bug. For now I'm just looking to
switch the default FMA from vfmadd213xx to vfmadd231xx. That will
cause the code in PR17229 to compile as desired, but would regress
code like:
double foo(double a, double b, double c) {
  return a * b + c;
}
Which will now require a vmovaps + vfmadd231.
If this impacts real benchmarks we could add an
2013 Dec 23
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Elena,
Thank you very much for looking in to that.
I'll go ahead and remove the isCommutable flag from those
instructions, since it sounds like that's the right thing to do. I
would still like to change the default from the 231 variant to 213
too, as this will reduce code-size for accumulator-style loops. I have
at least one benchmark that shows significant speedups when this
change
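A minimal sketch of the accumulator-style loop being discussed (the kernel and names here are illustrative, not taken from the benchmark in the thread): each iteration computes acc = x[i]*y[i] + acc, which matches the 231 form (dest = src2*src3 + dest), so the accumulator can stay in one register across iterations with no extra vmovaps. Whether FMAs are formed for this pattern still depends on the fp-contract / fast-math settings.

double dot(const double *x, const double *y, int n) {
  double acc = 0.0;
  for (int i = 0; i < n; ++i)
    acc = x[i] * y[i] + acc;  /* candidate for vfmadd231sd once FMAs are formed */
  return acc;
}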
2013 Dec 20
0
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Lang,
Unfortunately, I don't have an answer on the commutability question, but I
wanted to let you know that I filed a bug on this:
http://llvm.org/bugs/show_bug.cgi?id=17229
This also shows a memory operand variant of the fma that you may want to
consider in your patch and testcases.
Thanks!
On Thu, Dec 19, 2013 at 10:45 PM, Lang Hames <lhames at gmail.com> wrote:
> Hi all,
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi all,
The 213 variant of the FMA3 instructions is currently marked
commutable (see X86InstrFMA.td). Is that safe? According to the ISA
the FMA3 instructions aren't commutable for non-numeric results, so
I'd have thought commuting this would only be valid in fast-math mode?
For the curious, the reason that I'm asking is that we currently
always select the 213 variant, but this
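For illustration only, one way to observe the non-numeric (NaN) sensitivity that makes commuting the multiplicands bit-inexact: feed fma() two quiet NaNs with distinct payloads and compare the raw result bits for the two operand orders. Whether the payloads actually differ depends on the hardware's and libm's NaN propagation rules, so treat this as a probe rather than a proof.

#include <math.h>
#include <stdio.h>
#include <string.h>

static unsigned long long bits(double x) {
  unsigned long long u;
  memcpy(&u, &x, sizeof u);   /* reinterpret the double as raw bits */
  return u;
}

int main(void) {
  double n1 = nan("1");                        /* quiet NaN, payload 1 */
  double n2 = nan("2");                        /* quiet NaN, payload 2 */
  printf("%llx\n", bits(fma(n1, n2, 0.0)));    /* which NaN propagates...       */
  printf("%llx\n", bits(fma(n2, n1, 0.0)));    /* ...may depend on operand order */
  return 0;
}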
2012 Dec 12
3
[LLVMdev] Question about FMA formation
Hi, Dear All:
I'm going to implement FMA formation. On some architectures, "FMA a, b, c"
is more precise than "a * b + c". I'm wondering if FMA could be less
precise. In the former case, can we enable FMA formation despite a
restrictive FP mode?
Thanks
Shuxin
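To make the precision question concrete, here is a small self-contained example (not from the thread) showing that the fused operation performs a single rounding: fma(a, b, -p) recovers the rounding error of the product exactly, something the separately rounded multiply and subtract cannot do. In this sense, for a single a*b + c, the fused result is never less accurate than the unfused pair.

#include <math.h>
#include <stdio.h>

int main(void) {
  double a = 1.0 + ldexp(1.0, -30);  /* 1 + 2^-30, exactly representable */
  double b = a;
  double p = a * b;                  /* rounded product: the 2^-60 term is lost */
  double err = fma(a, b, -p);        /* exact a*b - p with one rounding: 2^-60  */
  printf("%a\n", err);               /* prints 0x1p-60 */
  return 0;
}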
2012 Dec 13
0
[LLVMdev] Question about FMA formation
On Dec 12, 2012, at 3:40 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Dear All:
>
> I'm going to implement FMA formation. On some architectures, "FMA a, b, c" is more precise than
> "a * b + c". I'm wondering if FMA could be less precise. In the former case, can we enable FMA
> formation despite a restrictive FP mode?
>
I believe
2012 Feb 08
0
[LLVMdev] Clarifying FMA-related TargetOptions
Hi Owen,
Having looked into this due to Clang failing PlumHall with it recently I can give an opinion...
I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If c == d, it is still possible for that result not to equal a*b, as "+c
2012 Dec 13
0
[LLVMdev] Question about FMA formation
Hi Michael, Shuxin,
> Shuxin was showing some more complicated patterns that required
> re-association to match (fast-math flags permitting). For those, we're
> considering if having a re-associate-for-FMA functionality in
> codegen-prepare would solve that problem. Thus, we can re-associate in
> codegen-prepare and emit FMA in fast-isel.
>
Yep. I misread the association
2012 Dec 13
0
[LLVMdev] Question about FMA formation
Hi, Eli, Mike and Lang:
Thank you all for the input. This is one example which might be
difficult for isel:
a*b + c*d + e => a*b + (c*d + e).
Thanks
Shuxin
On 12/12/12 4:43 PM, Lang Hames wrote:
> A little background:
>
> The fmuladd intrinsic was introduced to support the FP_CONTRACT pragma
> in C. llvm.fmuladd.* is generated by clang when it sees an expression
> of the
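To spell out the grouping issue with a*b + c*d + e (purely illustrative, using the C fma() function to stand in for the target instruction): with the source grouping (a*b + c*d) + e, one multiply survives and the outer add cannot be fused, whereas the re-associated a*b + (c*d + e) maps onto two fused operations. The rewrite changes rounding, which is why it needs fast-math-style permission.

#include <math.h>

/* Source grouping (a*b + c*d) + e: fmul + fma + fadd. */
double as_written(double a, double b, double c, double d, double e) {
  return fma(a, b, c * d) + e;
}

/* Re-associated a*b + (c*d + e): two FMAs, no standalone mul or add. */
double reassociated(double a, double b, double c, double d, double e) {
  return fma(a, b, fma(c, d, e));
}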
2012 Dec 13
2
[LLVMdev] Question about FMA formation
On Dec 12, 2012, at 5:20 PM, Eric Christopher <echristo at gmail.com> wrote:
> Why not just form them via a fast IR level pass and just have patterns match in fast isel instead of trying to form code? Or are we saying the same thing? (Your words of "fast isel spot"ting and "form better code" caused me to think you mean to do optimizations within the fast isel pass).
2012 Feb 08
1
[LLVMdev] Clarifying FMA-related TargetOptions
On Feb 8, 2012, at 10:42 AM, Hal Finkel wrote:
> In my experience, users of numerical codes expect that the compiler will
> use FMA instructions where it can, unless specifically asked to avoid
> doing so by the user. Even though this can sometimes produce a different
> result (*almost* always a better one), the performance gain is too large
> to be ignored by default. I highly
2014 Dec 10
2
[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?
Thanks! That’s probably close enough for practical purposes. I looked at the overrides on various targets, and they all return true if the FMA hardware exists.
- Arch
From: Jingyue Wu [mailto:jingyue at google.com]
Sent: Wednesday, December 10, 2014 2:56 PM
To: Robison, Arch
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?
Does
2016 Nov 19
2
FMA canonicalization in IR
On Nov 19, 2016 10:26 AM, Sanjay Patel <spatel at rotateright.com> wrote:
>
> If I have my FMA intrinsics story straight now (thanks for the explanation, Hal!), I think it raises another question about IR canonicalization (and may affect the proposed revision to IR FMF):
No, I think that we specifically
2012 Dec 13
3
[LLVMdev] Question about FMA formation
A little background:
The fmuladd intrinsic was introduced to support the FP_CONTRACT pragma in
C. llvm.fmuladd.* is generated by clang when it sees an expression of the
form 'a * b + c' within a single source statement.
If you want to opportunistically form FMA target instructions my
inclination would be to skip llvm.fmuladd.* and just form them from a*b+c
expressions at isel time. I
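A small sketch of the single-statement rule being described (function names are made up; behavior assumes clang-style handling of the FP_CONTRACT pragma): the multiply and add must occur in the same statement for the front end to emit llvm.fmuladd; splitting them across statements keeps a separate fmul and fadd.

#pragma STDC FP_CONTRACT ON
double one_statement(double a, double b, double c) {
  return a * b + c;   /* single statement: clang may emit llvm.fmuladd */
}

double two_statements(double a, double b, double c) {
  double p = a * b;   /* separate statements: stays as fmul...                 */
  return p + c;       /* ...followed by fadd (no contraction under the pragma) */
}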
2012 Feb 08
1
[LLVMdev] Clarifying FMA-related TargetOptions
On Feb 8, 2012, at 10:44 AM, James Molloy wrote:
> Hi Owen,
>
> Having looked into this due to Clang failing PlumHall with it recently I can give an opinion...
>
> I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If
2012 Feb 08
0
[LLVMdev] Clarifying FMA-related TargetOptions
On Wed, 2012-02-08 at 10:11 -0800, Owen Anderson wrote:
> Hello everyone,
>
>
> I'd like to propose the attached patch to form FMA intrinsics
> aggressively, but in order to do so I need some clarification on the
> intended semantics for the various FP precision-related
> TargetOptions. I've summarized the three relevant ones below:
>
>
> UnsafeFPMath -
2014 Dec 10
2
[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?
For the Julia language JIT, we'd like to be able to tell whether the llvm.fma.* intrinsic has hardware support. What's the best way to query LLVM (JIT) for this information?
The information would be used in situations where the user wants to use different algorithms depending on whether FMA hardware is present or not.
- Arch D. Robison
2012 Feb 08
6
[LLVMdev] Clarifying FMA-related TargetOptions
Hello everyone,
I'd like to propose the attached patch to form FMA intrinsics aggressively, but in order to do so I need some clarification on the intended semantics for the various FP precision-related TargetOptions. I've summarized the three relevant ones below:
UnsafeFPMath - Defaults to off, enables "less precise" results than permitted by IEEE754. Comments specifically
2019 Sep 02
2
AVX2 codegen - question reg. FMA generation
On Mon, 2 Sep 2019 at 16:59, Roman Lebedev <lebedev.ri at gmail.com> wrote:
>
> It appears you need 'reassoc' on fmul/fadd:
> https://godbolt.org/z/nuTzx2
Thanks very much, that was it. Either that or providing
-enable-unsafe-fp-math to llc yielded FMAs. I didn't expect this since
using FMAs here instead of mul/add appears to be safer (the reverse is
unsafe).
~ Uday
2012 Dec 13
2
[LLVMdev] Question about FMA formation
On Dec 12, 2012, at 4:49 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Eli, Mike and Lang:
>
> Thank you all for the input. This is one example which might be difficult for isel:
> a*b + c*d + e => a*b + (c*d + e).
>
You hit send right when I did!
For your example, do you mean that it's grouped like:
(fadd (fadd (fmul a b) (fmul c d)) e)
How would your