search for: fmadd

Displaying 20 results from an estimated 21 matches for "fmadd".

2015 Sep 19
3
AArch64 fmul/fadd fusion
On Fri, Sep 18, 2015 at 10:34 PM, Tim Northover <t.p.northover at gmail.com> wrote: > AArch64's fmadd instruction is fused, which means it can produce a > different result to the two operations executed separately. The C and > C++ standards do not allow such changes. Sorry, sloppy language on my part. I was aware of fmadd, but I was really asking about turning sequences like: fmul s0, s0...
2016 Sep 11
3
defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang`s optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]
On Sep 10, 2016, at 3:33 AM, Steve Canon <scanon at apple.com> wrote: >>> >>> Pretty much. In particular, imagine a user trying to debug an unexpected floating point result caused by conversion of a*b + c into fma(a, b, c). >> >> I think that’s unavoidable, because of the way the optimization levels work. Even fma contraction is on by default (something I’d
2015 Sep 19
2
AArch64 fmul/fadd fusion
Hi All, Recently I was doing some AArch64 work and noticed some cases where fmuls were not getting fused with fadds. Is there any particular reason that the AArch64 machine combiner doesn't do this like it does for add/mul? I am happy to work up a patch for this, but I wanted to make sure that there wasn't a good reason for it not already being there. FWIW, I see where GCC is doing
2016 Jun 28
2
Question about Instruction Selection
...t's used as a tie-breaker when the input complexity is equal. > Sorry that I didn’t state the question clear: “cost” I’m asking is the hardware instruction(or operation) cost. Or is it possible that backend developers express the cost model by mean of TableGen patterns? E.g. If there is an fmadd instruction and consume less cycle, developers need to add a pattern which map fmul + fadd into fmadd. So one doesn’t need to provide numeric cost values for tablegen to select optimized instruction. Cheers McClane > TableGen then uses that cost to order the matching tables; I'm not >...
2016 Jun 28
0
Question about Instruction Selection
...r when the input complexity is equal. > > > Sorry that I didn’t state the question clear: “cost” I’m asking is the > hardware instruction(or operation) cost. > Or is it possible that backend developers express the cost model by mean of > TableGen patterns? > E.g. If there is an fmadd instruction and consume less cycle, developers > need to add a pattern which map fmul + fadd into fmadd. So one doesn’t need > to provide numeric cost values for tablegen to select optimized instruction. Yes, the general thinking is that number of instructions (and also the size of each inst...
2017 Mar 15
5
[RFC] FP Contract = fast?
...lly behaving like it, at least not as I would expect: int foo(float a, float b, float c) { return a*b+c; } $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o - (...) fmul s0, s0, s1 fadd s0, s0, s2 (...) $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o - (...) fmadd s0, s0, s1, s2 (...) I'm not sure this works in Fortran either, but defaulting to "on" when (I believe) the language should allow contraction and not doing it is not a good default. i haven't worked out what would be necessary to make it work on a case-by-case basis (what kinds...
2016 Nov 17
2
what does -ffp-contract=fast allow?
This is just paraphrasing from D26602, so credit to Nicolai for first raising the issue there. float foo(float x, float y) { return x * (y + 1); } $ ./clang -O2 xy1.c -S -o - -target aarch64 -ffp-contract=fast | grep fm fmadd s0, s1, s0, s0 Is this a bug? We transformed the original expression into: x * y + x When x=INF and y=0, the code returns INF if we don't reassociate. With reassociation to FMA, it returns NAN because 0 * INF = NAN. 1. I used aarch64 as the example target, but this is not target-dependent...
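The INF/NAN case described in this snippet can be reproduced without relying on any particular compiler flag. The sketch below is an illustration, not code from the thread: the function names are hypothetical, and the distributed form is written out by hand rather than produced by the optimizer.

```c
#include <math.h>

/* The expression as written in the source: x * (y + 1). */
float original_form(float x, float y) {
    volatile float sum = y + 1.0f;  /* keep the add as a separate step */
    return x * sum;
}

/* The distributed form x * y + x that the thread says the compiler produced.
   Fused or not, the x * y term is evaluated, so INF * 0 yields NaN. */
float distributed_form(float x, float y) {
    return x * y + x;
}
```

With x = INFINITY and y = 0, `original_form` computes INF * 1 = INF, while `distributed_form` computes INF * 0 + INF, which is NaN. For ordinary finite inputs such as x = 2, y = 3 both forms agree.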
2016 Jun 28
0
Question about Instruction Selection
On Tue, Jun 28, 2016 at 4:42 AM, Bekket McClane via llvm-dev <llvm-dev at lists.llvm.org> wrote: > Hi, > I'm new to LLVM and I'm doing research on factors of compilation time, > especially instruction selection and scheduling. One of the academic papers > I read, > https://llvm.org/svn/llvm-project/www-pubs/trunk/2008-CGO-DagISel.pdf (Koes, > David Ryan, and Seth
2017 Mar 15
2
[cfe-dev] [RFC] FP Contract = fast?
...) { return a*b+c; } >> >> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o - >> (...) >> fmul s0, s0, s1 >> fadd s0, s0, s2 >> (...) >> >> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o - >> (...) >> fmadd s0, s0, s1, s2 >> (...) >> >> I'm not sure this works in Fortran either, but defaulting to "on" when >> (I believe) the language should allow contraction and not doing it is >> not a good default. >> >> i haven't worked out what would be n...
2016 Jun 28
2
Question about Instruction Selection
Hi, I'm new to LLVM and I'm doing research on factors of compilation time, especially instruction selection and scheduling. One of the academic papers I read, https://llvm.org/svn/llvm-project/www-pubs/trunk/2008-CGO-DagISel.pdf (Koes, David Ryan, and Seth Copen Goldstein. "Near-optimal instruction selection on dags."), which is also said to be the algorithm LLVM currently
2017 Mar 15
2
[cfe-dev] [RFC] FP Contract = fast?
...ux-gnu -O2 -S fma.c -ffp-contract=on -o - >>>> (...) >>>> fmul s0, s0, s1 >>>> fadd s0, s0, s2 >>>> (...) >>>> >>>> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o - >>>> (...) >>>> fmadd s0, s0, s1, s2 >>>> (...) >>>> >>>> I'm not sure this works in Fortran either, but defaulting to "on" when >>>> (I believe) the language should allow contraction and not doing it is >>>> not a good default. >>>>...
2017 Mar 15
2
[cfe-dev] [RFC] FP Contract = fast?
...> (...) >>>>>> fmul s0, s0, s1 >>>>>> fadd s0, s0, s2 >>>>>> (...) >>>>>> >>>>>> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o - >>>>>> (...) >>>>>> fmadd s0, s0, s1, s2 >>>>>> (...) >>>>>> >>>>>> I'm not sure this works in Fortran either, but defaulting to "on" when >>>>>> (I believe) the language should allow contraction and not doing it is >>>>>>...
2016 Nov 18
2
what does -ffp-contract=fast allow?
...hrasing from D26602, so credit to Nicolai for first raising the issue there. >>> >>> float foo(float x, float y) { >>> return x * (y + 1); >>> } >>> >>> $ ./clang -O2 xy1.c -S -o - -target aarch64 -ffp-contract=fast | grep fm >>> fmadd s0, s1, s0, s0 >>> >>> Is this a bug? We transformed the original expression into: >>> x * y + x >>> >>> When x=INF and y=0, the code returns INF if we don't reassociate. With reassociation to FMA, it returns NAN because 0 * INF = NAN. >>>...
2019 Jul 10
2
RFC: change -fp-contract=off to actually disable FMAs
...Clang option suggests this to be the case: $ clang --help | grep fp-contract -ffp-contract=<value> Form fused FP ops (e.g. FMAs): fast (everywhere) | on (according to FP_CONTRACT pragma, default) | off (never fuse) Current behaviour in LLVM 8.0 below: $ cat fma.ll define double @fmadd(double %a, double %b, double %c) { %mul = fmul fast double %b, %a %add = fadd fast double %mul, %c ret double %add } $ llc -mattr=+fma fma.ll -fp-contract=off -o - | grep vfmadd vfmadd213sd %xmm2, %xmm1, %xmm0 # xmm0 = (xmm1 * xmm0) + xmm2 It still generates an fma due to the logic in...
2017 Mar 16
2
[cfe-dev] [RFC] FP Contract = fast?
...>>>> (...) >>>>>>>>>> >>>>>>>>>> $ clang -target aarch64-linux-gnu -O2 -S fma.c >>>>>>>>>> -ffp-contract=fast -o - >>>>>>>>>> (...) >>>>>>>>>> fmadd s0, s0, s1, s2 >>>>>>>>>> (...) >>>>>>>>>> >>>>>>>>>> I'm not sure this works in Fortran either, but defaulting to >>>>>>>>>> "on" when >>>>>>>...
2017 Mar 15
3
[cfe-dev] [RFC] FP Contract = fast?
...1 >>>>>>>> fadd s0, s0, s2 >>>>>>>> (...) >>>>>>>> >>>>>>>> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o - >>>>>>>> (...) >>>>>>>> fmadd s0, s0, s1, s2 >>>>>>>> (...) >>>>>>>> >>>>>>>> I'm not sure this works in Fortran either, but defaulting to "on" when >>>>>>>> (I believe) the language should allow contraction and not do...
2016 Nov 18
2
what does -ffp-contract=fast allow?
...x, float y) { > > > >>> return x * (y + 1); > > > >>> } > > > >>> > > > >>> $ ./clang -O2 xy1.c -S -o - -target aarch64 -ffp-contract=fast > > >>> | > > >>> grep fm > > > >>> fmadd s0, s1, s0, s0 > > > >>> > > > >>> Is this a bug? We transformed the original expression into: > > > >>> x * y + x > > > >>> > > > >>> When x=INF and y=0, the code returns INF if we don't > > >...
2017 Mar 16
2
[cfe-dev] [RFC] FP Contract = fast?
...>>>>> (...) >>>>>>>>>>>> >>>>>>>>>>>> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o - >>>>>>>>>>>> (...) >>>>>>>>>>>> fmadd s0, s0, s1, s2 >>>>>>>>>>>> (...) >>>>>>>>>>>> >>>>>>>>>>>> I'm not sure this works in Fortran either, but defaulting to "on" when >>>>>>>>>>>>...
2018 Aug 22
2
Condition code in DAGCombiner::visitFADDForFMACombine?
On 22.08.2018 17:52, Ryan Taylor wrote: > This is probably going to effect on other backends and break llvm-lit > for them? Very likely, yes. Can you take a look at how big the fallout is? This might give us a hint about what other frontends might expect, and who needs to be involved in the discussion (if one is needed). Cheers, Nicolai > > On Wed, Aug 22, 2018 at 11:41 AM
2019 Feb 04
7
[RFC] Vector Predication
On Mon, 4 Feb 2019 at 22:04, Simon Moll <moll at cs.uni-saarland.de> wrote: > On 2/4/19 9:18 PM, Robin Kruppe wrote: > > > > On Mon, 4 Feb 2019 at 18:15, David Greene via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Simon Moll <moll at cs.uni-saarland.de> writes: >> >> > You are referring to the sub-vector sizes, if i am