search for: vfmadd

Displaying 7 results from an estimated 7 matches for "vfmadd".

2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...muting this would only be valid in fast-math mode? For the curious, the reason that I'm asking is that we currently always select the 213 variant, but this introduces an extra copy in accumulator-style loops. Something like: while (...) accumulator = x * y + accumulator; yields: loop: vfmadd.213 y, x, acc vmovaps acc, x decl count jne loop instead of loop: vfmadd.231 acc, x, y decl count jne loop I have started writing a patch to generate the 231 variant by default, and I want to know whether I need to go to the trouble of adding custom commute logic. If these things are...
2013 Dec 20
0
[LLVMdev] Commutability of X86 FMA3 instructions.
...> For the curious, the reason that I'm asking is that we currently > always select the 213 variant, but this introduces an extra copy in > accumulator-style loops. Something like: > > while (...) > accumulator = x * y + accumulator; > > yields: > > loop: > vfmadd.213 y, x, acc > vmovaps acc, x > decl count > jne loop > > instead of > > loop: > vfmadd.231 acc, x, y > decl count > jne loop > > I have started writing a patch to generate the 231 variant by default, > and I want to know whether I need to go to t...
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Kay, My patch will partially address your bug. For now I'm just looking to switch the default FMA from vfmadd213xx to vfmadd231xx. That will cause the code in PR17229 to compile as desired, but would regress code like: double foo(double a, double b, double c) { return a * b + c; } Which will now require a vmovaps + vfmadd231. If this impacts real benchmarks we could add an optimization to change the F...
2013 Dec 23
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...23:03 > To: Kay Tiong Khoo > Cc: LLVM Developers Mailing List; Demikhovsky, Elena; Craig Topper > Subject: Re: [LLVMdev] Commutability of X86 FMA3 instructions. > > Hi Kay, > > My patch will partially address your bug. For now I'm just looking to switch the default FMA from vfmadd213xx to vfmadd231xx. That will cause the code in PR17229 to compile as desired, but would regress code like: > > double foo(double a, double b, double c) { > return a * b + c; > } > > Which will now require a vmovaps + vfmadd231. > > If this impacts real benchmarks we coul...
2019 Sep 02
3
AVX2 codegen - question reg. FMA generation
...I assume isn't needed given the cpu type.) The result is the same even with -mcpu=haswell. This is a common pattern involved in a reduction with two things on the RHS. The three things in play here are (%rax,%rcx), (%rdi,%rcx), and %ymm0. If another register is used to hold a loaded value, the vfmadd instruction could be used in multiple ways. I suspect I'm missing something, which is why I'm not already posting this on llvm-bugs. Is this expected behavior? ------------------------------------------------------------------------------------------- ; ModuleID = 'LLVMDialectModule'...
2019 Jul 10
2
RFC: change -fp-contract=off to actually disable FMAs
...TRACT pragma, default) | off (never fuse) Current behaviour in LLVM 8.0 below: $ cat fma.ll define double @fmadd(double %a, double %b, double %c) { %mul = fmul fast double %b, %a %add = fadd fast double %mul, %c ret double %add } $ llc -mattr=+fma fma.ll -fp-contract=off -o - | grep vfmadd vfmadd213sd %xmm2, %xmm1, %xmm0 # xmm0 = (xmm1 * xmm0) + xmm2 It still generates an fma due to the logic in DAGCombiner: bool CanFuse = Options.UnsafeFPMath || isContractable(N); bool AllowFusionGlobally = (Options.AllowFPOpFusion == FPOpFusion::Fast || CanF...
2019 Sep 02
2
AVX2 codegen - question reg. FMA generation
...s the same even with > > -mcpu=haswell. > > > > This is a common pattern involved in a reduction with two things on > > the RHS. The three things in play here are (%rax,%rcx), (%rdi,%rcx), > > and %ymm0. If another register is used to hold a loaded value, the > > vfmadd instruction could be used in multiple ways. I suspect I'm > > missing something, which is why I'm not already posting this on > > llvm-bugs. Is this expected behavior? > > > > -------------------------------------------------------------------------------------------...