Hi All,

Recently I was doing some AArch64 work and noticed some cases where fmuls were not getting fused with fadds. Is there any particular reason that the AArch64 machine combiner doesn't do this like it does for add/mul?

I am happy to work up a patch for this, but I wanted to make sure that there wasn't a good reason for it not already being there. FWIW, I see where GCC is doing this optimization.

Cheers,
Meador

On 18 September 2015 at 20:14, Meador Inge via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Recently I was doing some AArch64 work and noticed some cases where
> fmuls were not getting fused with fadds. Is there any particular
> reason that the AArch64 machine combiner doesn't do this like it does
> for add/mul?

AArch64's fmadd instruction is fused, which means it can produce a different result to the two operations executed separately. The C and C++ standards do not allow such changes.

We support them via various flags (-ffast-math is the obvious one, though an argument could be made for supporting -mfused-madd and -mnofused-madd as well). But in the backend we definitely have to check *something* before doing the substitution.

> I am happy to work up a patch for this, but I wanted to make sure that
> there wasn't a good reason for it not already being there. FWIW, I
> see where GCC is doing this optimization.

You might want to get together with Ana Pazos, who just asked similar questions today.

Personally, I'd be sad to see Clang's default execution become less conforming to the standard. But it's pretty undeniable that "-std=gnu99/gnu11/..." do allow it by default.

Cheers.

Tim.

On Fri, Sep 18, 2015 at 10:34 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> AArch64's fmadd instruction is fused, which means it can produce a
> different result to the two operations executed separately. The C and
> C++ standards do not allow such changes.

Sorry, sloppy language on my part. I was aware of fmadd, but I was really asking about turning sequences like:

    fmul s0, s0, s2
    fadd s0, s1, s0

into a fmadd:

    fmadd s0, s0, s2, s1

> We support them via various flags (-ffast-math is the obvious one,
> though an argument could be made for supporting -mfused-madd and
> -mnofused-madd as well). But in the backend we definitely have to
> check *something* before doing the substitution.

Support in what way? I don't see any patterns or machine combiners to do the above replacement. Did I miss something? If I didn't miss something, is there interest in adding this to the AArch64 machine combiners assuming it was guarded by the right flag?

> You might want to get together with Ana Pazos, who just asked similar
> questions today.

I see that on cfe-dev now. Thanks for pointing that out.

> Personally, I'd be sad to see Clang's default execution become less
> conforming to the standard. But it's pretty undeniable that
> "-std=gnu99/gnu11/..." do allow it by default.

Agreed. Thanks for the reply!

Cheers,
Meador
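For readers who want a source-level picture of the sequence discussed above, here is a small C sketch of the kind of function that could lead to it. The function name and parameter order are hypothetical, not from the thread. The order is chosen so that, under the AArch64 procedure call standard, the three float arguments would arrive in s0, s1 and s2. The exact instructions a given compiler emits depend on its version and flags, so the comments describe a plausible outcome only.

```c
/* Hypothetical example: a arrives in s0, addend in s1, b in s2.
 *
 * Without contraction, a compiler might emit something like:
 *     fmul  s0, s0, s2        ; s0 = a * b, rounded
 *     fadd  s0, s1, s0        ; s0 = addend + s0, rounded again
 *
 * With contraction allowed (for example under -ffast-math), the pair
 * could become a single instruction:
 *     fmadd s0, s0, s2, s1    ; s0 = addend + a * b, rounded once
 */
float mul_then_add(float a, float addend, float b) {
    return addend + a * b;
}
```

For inputs where no rounding occurs, such as mul_then_add(2.0f, 4.0f, 3.0f), both instruction sequences give the same result; they can only differ when the intermediate product has to be rounded.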