On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> As I understand it, the fundamental property being addressed here is: Are
> the semantics of scalar FP math the same as vector FP math? TTI seems like a
> good place to expose that information. If the semantics are indeed different,
> then the vectorizer would require fast-math flags in order to vectorize FP
> operations (similarly, gcc's man page says it requires
> -funsafe-math-optimizations for vectorization unless -mfpu=neon or similar is
> specified). In this context, this different-semantics query would return true
> if:
The semantics are indeed different: VFP is IEEE-754 compliant while
NEON is not. We don't want to stop the compiler from using VFP for FP
math, but we want to be cautious when using NEON in the same way.
> !(isDarwin OR ARMISA >= v8 OR fpMath == NEON)
>
> and then we need to teach people to use -mfpu=neon ;)
So, there's the catch. In GCC, -mfpu=neon means to use NEON, which is
not enabled by default, so the compiler assumes that the user is aware
that NEON FP is not IEEE compliant. I don't think that's a safe
assumption, but I also don't want to have a slightly different
behaviour than GCC gratuitously.
Clang defaults to -mfpu=neon when we choose -mcpu=cortex-a* or
-march=armv7a, so our current behaviour is on par with GCC. But I
think that's a dangerous assumption.
Furthermore, the only alternatives we have at the moment are to use
NEON for everything or for nothing. It would be good to have an option
to use NEON for integer arithmetic and VFP for FP if the user requires
IEEE compliance.
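
A minimal sketch of the quoted condition, written in Python for brevity. The `Target` record and its field names are hypothetical stand-ins, not LLVM's TTI interface. The query returns true when vector FP semantics differ from scalar FP semantics, in which case the vectorizer would need fast-math flags.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical target description; not a real LLVM/Clang type."""
    is_darwin: bool
    arm_isa: int   # e.g. 7 for ARMv7, 8 for ARMv8
    fp_math: str   # "vfp" or "neon", as an explicit -mfpmath-style choice


def vector_fp_differs_from_scalar(t: Target) -> bool:
    # Direct transcription of: !(isDarwin OR ARMISA >= v8 OR fpMath == NEON)
    return not (t.is_darwin or t.arm_isa >= 8 or t.fp_math == "neon")
```

Under this reading, an ARMv7 non-Darwin target that has not opted into NEON FP is the only case where the query reports a difference.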
> P.S. Looking at gcc's man page, gcc seems to use -mfpu for ARM and
> -mfpmath for x86. Do we use -mfpmath for both?
We already support -mfpmath=vfp/neon in Clang, but it's bogus. My
proposal is to make it count.
The best way I can think of is to let -mfpmath=vfp *disable* only FP
NEON and -mfpmath=neon *enable* only FP NEON, both orthogonal to
integer math.
Examples:
Works today:
-mfpu=soft -> Int (ALU), FP (LIB), no VFP/NEON instructions
-mfpu=softfp -> Int (ALU), FP (LIB), VFP/NEON instructions allowed
-mfpu=vfp -> Int (ALU), FP (VFP)
-mfpu=neon -> Int (NEON), FP (NEON)
Change proposed:
-mfpmath=neon -mfpu=vfp -> Int (ALU), FP (NEON)
-mfpmath=vfp -mfpu=neon -> Int (NEON), FP (VFP)
This would be similar enough to GCC, and would allow the small number
of users that care about IEEE-754 compliance to disable FP NEON on
demand.
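
The mapping above can be sketched as a small lookup, written in Python for brevity. This only illustrates the proposal and is not how Clang resolves these flags. The function name `select_units` is hypothetical, the table copies the "Works today" list as written, and the rule that -mfpmath is ignored when FP goes through the library is an assumption the proposal does not spell out.

```python
# (Int unit, FP unit) per -mfpu value, copied from the "Works today" list.
TODAY = {
    "soft": ("ALU", "LIB"),
    "softfp": ("ALU", "LIB"),
    "vfp": ("ALU", "VFP"),
    "neon": ("NEON", "NEON"),
}


def select_units(mfpu, mfpmath=None):
    """Return (int_unit, fp_unit) for a flag combination (hypothetical helper)."""
    int_unit, fp_unit = TODAY[mfpu]
    # Proposed change: -mfpmath overrides only the FP unit and leaves
    # integer math alone. Assumption: no effect when FP uses the library.
    if mfpmath is not None and fp_unit != "LIB":
        fp_unit = mfpmath.upper()
    return int_unit, fp_unit
```

With this sketch, `select_units("neon", "vfp")` keeps integer math on NEON while moving FP to VFP, which is the IEEE-754-friendly combination described above.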
cheers,
--renato