Displaying 7 results from an estimated 7 matches for "vfmadd".
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...muting this would only be valid in fast-math mode?
For the curious, the reason that I'm asking is that we currently
always select the 213 variant, but this introduces extra copies in
accumulator-style loops. Something like:
  while (...)
    accumulator = x * y + accumulator;
yields:
  loop:
    vfmadd.213 y, x, acc
    vmovaps acc, x
    decl count
    jne loop
instead of
  loop:
    vfmadd.231 acc, x, y
    decl count
    jne loop
I have started writing a patch to generate the 231 variant by default,
and I want to know whether I need to go to the trouble of adding
custom commute logic. If these things are...
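For readers skimming the thread, the extra copy can be sketched in plain C. This is only a toy model, not LLVM code: the names LoopStats, accumulate_213 and accumulate_231 are made up for illustration, and ordinary multiply and add stand in for the fused operation, so only the operand roles are modelled, not the single rounding of a real FMA. It assumes the usual FMA3 semantics, where the 213 form computes dst = src2 * dst + src3 and the 231 form computes dst = src2 * src3 + dst.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for this sketch: final sum plus operation counts. */
typedef struct {
    double acc;     /* value of the accumulator after the loop */
    size_t fmas;    /* multiply-add operations issued */
    size_t copies;  /* register copies (the vmovaps in the thread) */
} LoopStats;

/* 213 form: dst = src2 * dst + src3. The destination starts out holding a
   multiplicand, so the sum lands there and must be copied back into acc. */
static LoopStats accumulate_213(const double *x, const double *y, size_t n,
                                double acc0) {
    LoopStats s = {acc0, 0, 0};
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++) {
        double dst = x[i];          /* destination doubles as a multiplicand */
        dst = y[i] * dst + s.acc;   /* models vfmadd213 */
        s.fmas++;
        s.acc = dst;                /* models the extra copy per iteration */
        s.copies++;
    }
    return s;
}

/* 231 form: dst = src2 * src3 + dst. The accumulator is the destination,
   so no copy is needed. */
static LoopStats accumulate_231(const double *x, const double *y, size_t n,
                                double acc0) {
    LoopStats s = {acc0, 0, 0};
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++) {
        s.acc = x[i] * y[i] + s.acc;  /* models vfmadd231 */
        s.fmas++;
    }
    return s;
}
```

Both functions return the same sum for the same inputs; under this model they differ only in the copies counter, which is the per-iteration cost the post is asking about.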
2013 Dec 20
0
[LLVMdev] Commutability of X86 FMA3 instructions.
...> For the curious, the reason that I'm asking is that we currently
> always select the 213 variant, but this introduces extra copies in
> accumulator-style loops. Something like:
>
>   while (...)
>     accumulator = x * y + accumulator;
>
> yields:
>
>   loop:
>     vfmadd.213 y, x, acc
>     vmovaps acc, x
>     decl count
>     jne loop
>
> instead of
>
>   loop:
>     vfmadd.231 acc, x, y
>     decl count
>     jne loop
>
> I have started writing a patch to generate the 231 variant by default,
> and I want to know whether I need to go to t...
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Kay,
My patch will partially address your bug. For now I'm just looking to
switch the default FMA from vfmadd213xx to vfmadd231xx. That will
cause the code in PR17229 to compile as desired, but would regress
code like:
  double foo(double a, double b, double c) {
    return a * b + c;
  }
Which will now require a vmovaps + vfmadd231.
If this impacts real benchmarks we could add an optimization to change
the F...
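The regression described in this message can be illustrated with a toy register model. Everything here is an assumption made for the sketch: the Regs struct and helper names are hypothetical, the arguments are taken to arrive in xmm0, xmm1 and xmm2 with the result expected in xmm0 (as in the x86-64 System V calling convention), and the vfmadd231-then-vmovaps ordering is just one possible sequence, since the message does not spell it out. Plain multiply and add stand in for the fused operation.

```c
#include <assert.h>

/* Toy register file: xmm[0..2] stand in for %xmm0..%xmm2. */
typedef struct {
    double xmm[3];
    int insns;  /* instructions "executed" so far */
} Regs;

static void check_reg(int r) { assert(r >= 0 && r < 3); }

/* vfmadd213: dst = src2 * dst + src3 */
static void vfmadd213(Regs *r, int dst, int src2, int src3) {
    check_reg(dst); check_reg(src2); check_reg(src3);
    r->xmm[dst] = r->xmm[src2] * r->xmm[dst] + r->xmm[src3];
    r->insns++;
}

/* vfmadd231: dst = src2 * src3 + dst */
static void vfmadd231(Regs *r, int dst, int src2, int src3) {
    check_reg(dst); check_reg(src2); check_reg(src3);
    r->xmm[dst] = r->xmm[src2] * r->xmm[src3] + r->xmm[dst];
    r->insns++;
}

/* vmovaps: dst = src */
static void vmovaps(Regs *r, int dst, int src) {
    check_reg(dst); check_reg(src);
    r->xmm[dst] = r->xmm[src];
    r->insns++;
}

/* foo(a, b, c) = a * b + c using the 213 form: the multiplicand in xmm0 is
   overwritten with the result, which is exactly where the caller wants it. */
static Regs foo_with_213(double a, double b, double c) {
    Regs r = {{a, b, c}, 0};
    vfmadd213(&r, 0, 1, 2);  /* xmm0 = xmm1 * xmm0 + xmm2 */
    return r;
}

/* The same function using the 231 form: the result lands in the addend's
   register (xmm2), so an extra move is needed to return it in xmm0. */
static Regs foo_with_231(double a, double b, double c) {
    Regs r = {{a, b, c}, 0};
    vfmadd231(&r, 2, 0, 1);  /* xmm2 = xmm0 * xmm1 + xmm2 */
    vmovaps(&r, 0, 2);       /* xmm0 = xmm2 */
    return r;
}
```

In this model both variants leave a * b + c in xmm0, but the 231 variant takes two instructions where the 213 variant takes one.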
2013 Dec 23
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...23:03
> To: Kay Tiong Khoo
> Cc: LLVM Developers Mailing List; Demikhovsky, Elena; Craig Topper
> Subject: Re: [LLVMdev] Commutability of X86 FMA3 instructions.
>
> Hi Kay,
>
> My patch will partially address your bug. For now I'm just looking to switch the default FMA from vfmadd213xx to vfmadd231xx. That will cause the code in PR17229 to compile as desired, but would regress code like:
>
>   double foo(double a, double b, double c) {
>     return a * b + c;
>   }
>
> Which will now require a vmovaps + vfmadd231.
>
> If this impacts real benchmarks we coul...
2019 Sep 02
3
AVX2 codegen - question reg. FMA generation
...I assume
isn't needed given the cpu type.) The result is the same even with
-mcpu=haswell.
This is a common pattern involved in a reduction with two things on
the RHS. The three things in play here are (%rax,%rcx), (%rdi,%rcx),
and %ymm0. If another register is used to hold a loaded value, the
vfmadd instruction could be used in multiple ways. I suspect I'm
missing something, which is why I'm not already posting this on
llvm-bugs. Is this expected behavior?
-------------------------------------------------------------------------------------------
; ModuleID = 'LLVMDialectModule'...
2019 Jul 10
2
RFC: change -fp-contract=off to actually disable FMAs
...TRACT pragma, default) | off (never fuse)
Current behaviour in LLVM 8.0 below:
$ cat fma.ll
define double @fmadd(double %a, double %b, double %c) {
  %mul = fmul fast double %b, %a
  %add = fadd fast double %mul, %c
  ret double %add
}
$ llc -mattr=+fma fma.ll -fp-contract=off -o - | grep vfmadd
vfmadd213sd %xmm2, %xmm1, %xmm0 # xmm0 = (xmm1 * xmm0) + xmm2
It still generates an fma due to the logic in DAGCombiner:
bool CanFuse = Options.UnsafeFPMath || isContractable(N);
bool AllowFusionGlobally = (Options.AllowFPOpFusion == FPOpFusion::Fast ||
CanF...
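As background on why "never fuse" is an observable property and not just a code-size detail, here is a small C sketch. It is not taken from the RFC; mul_add_unfused and check_unfused_example are hypothetical names, and the sketch assumes IEEE 754 double precision with round-to-nearest. Storing the product in a volatile variable should force it to be rounded to double before the addition, whatever contraction setting the compiler uses.

```c
#include <assert.h>

/* Hypothetical helper: a * b + c with the product rounded to double first.
   The volatile store keeps the multiply and the add as separate steps. */
static double mul_add_unfused(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

/* Inputs chosen so that one rounding and two roundings give different sums. */
static void check_unfused_example(void) {
    const double a = 0x1.0000002p+0;   /* 1 + 2^-27 */
    const double c = -0x1.0000004p+0;  /* -(1 + 2^-26) */
    /* Exactly, a * a = 1 + 2^-26 + 2^-54. Rounded to double, the 2^-54 term
       is lost, so the separately rounded sum cancels to zero. */
    assert(mul_add_unfused(a, a, c) == 0.0);
}
```

A fused multiply-add rounds only once, so for the same inputs it would keep the low-order term and produce 2^-54 instead of zero. That difference is what a user asking for fp-contract=off is trying to rule out.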
2019 Sep 02
2
AVX2 codegen - question reg. FMA generation
...s the same even with
> > -mcpu=haswell.
> >
> > This is a common pattern involved in a reduction with two things on
> > the RHS. The three things in play here are (%rax,%rcx), (%rdi,%rcx),
> > and %ymm0. If another register is used to hold a loaded value, the
> > vfmadd instruction could be used in multiple ways. I suspect I'm
> > missing something, which is why I'm not already posting this on
> > llvm-bugs. Is this expected behavior?
> >
> > -------------------------------------------------------------------------------------------...