James Molloy via llvm-dev
2016-Feb-15 08:34 UTC
[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
Hi,

> James, is that a correct assessment?

Yes, it is also my belief that the only way ARMv7 NEON differs from IEEE754 is lack of denormal support.

James

> On 11 Feb 2016, at 10:53, Renato Golin <renato.golin at linaro.org> wrote:
>
> Hal,
>
> I had a read on the ARM ARM about VFP and SIMD FP semantics and my
> analysis is that NEON's only problem is the Flush-to-zero behaviour,
> which is non-compliant.
>
> NEON deals with NaNs and Infs in the way specified by the standard and
> should not cause any concern to us. But we don't seem to have a flag
> specifically to denormals, so I think using the UnsafeMath is the
> safest option for now.
>
> On 11 February 2016 at 01:15, Hal Finkel <hfinkel at anl.gov> wrote:
>> nsz
>> No Signed Zeros - Allow optimizations to treat the sign of a zero argument or result as insignificant.
>
> In both VFP and NEON, zero signs are significant. In NEON, the
> flush-to-zero's zero will have the same sign as the input denormal.
>
>
>> nnan
>> No NaNs - Allow optimizations to assume the arguments and result are not NaN. Such optimizations are required to retain defined behavior over NaNs, but the value of the result is undefined.
>
> Both VFP and NEON treat NaNs as the standard requires, ie. [ NaN op ? ] = NaN.
>
>
>> ninf
>> No Infs - Allow optimizations to assume the arguments and result are not +/-Inf. Such optimizations are required to retain defined behavior over +/-Inf, but the value of the result is undefined.
>
> Same here. Operations with Inf generate Inf or NaNs on both units.
>
> The flush-to-zero behaviour has an effect on both NaNs and Infs, since
> it happens before. So a denormal operation with an Inf in VFP will not
> generate a NaN, while in NEON it'll be flushed to zero first, thus
> generating NaNs.
>
> James, is that a correct assessment?
>
> cheers,
> --renato
>
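To make the denormal-times-Inf case above concrete, here is a minimal sketch in portable C that models flush-to-zero in software. It does not run any NEON instruction. The helper names (`flush_to_zero`, `mul_ieee`, `mul_ftz`, and so on) are hypothetical and exist only for this illustration, and the code assumes `float` is IEEE 754 binary32 on the host and that the host itself does not flush denormals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE 754 binary32
 * (1 sign bit, 8 exponent bits, 23 fraction bits). */
static uint32_t float_bits(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

static float bits_float(uint32_t u) {
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Subnormal: exponent field all zeros, fraction field nonzero. */
static int is_subnormal(float x) {
    uint32_t u = float_bits(x);
    return (u & 0x7F800000u) == 0 && (u & 0x007FFFFFu) != 0;
}

/* Hypothetical software model of flush-to-zero: a subnormal becomes
 * a zero that keeps the sign of the original value. */
static float flush_to_zero(float x) {
    return is_subnormal(x) ? bits_float(float_bits(x) & 0x80000000u) : x;
}

/* Multiply as the host does it (assumed IEEE 754 with denormals). */
static float mul_ieee(float a, float b) { return a * b; }

/* Multiply with inputs and result flushed, as in the model above. */
static float mul_ftz(float a, float b) {
    return flush_to_zero(flush_to_zero(a) * flush_to_zero(b));
}

/* Self-check covering the cases discussed in the thread. */
static void ftz_demo_self_check(void) {
    const float tiny = bits_float(0x80000001u); /* negative subnormal */
    const float inf  = bits_float(0x7F800000u); /* +Inf */

    assert(is_subnormal(tiny));
    /* The flushed zero keeps the sign of the denormal input: -0.0. */
    assert(float_bits(flush_to_zero(tiny)) == 0x80000000u);
    /* With denormals: tiny * Inf is -Inf, not a NaN. */
    assert(float_bits(mul_ieee(tiny, inf)) == 0xFF800000u);
    /* With flushing: (-0.0) * Inf is a NaN. */
    float n = mul_ftz(tiny, inf);
    assert(n != n);
}
```

Calling `ftz_demo_self_check()` from a `main` exercises all three claims: the flushed zero is signed, the unflushed product is an infinity, and the flushed product is a NaN.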
Stephen Canon via llvm-dev
2016-Feb-15 14:22 UTC
[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
> On Feb 15, 2016, at 3:34 AM, James Molloy via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
>> James, is that a correct assessment?
>
> Yes, it is also my belief that the only way ARMv7 NEON differs from IEEE754 is lack of denormal support.

- ARMv7 NEON ignores the rounding mode set in bits 23:22 of FPSCR and always uses round to nearest.
- ARMv7 NEON ignores the trap enable bits (15:8) in FPSCR and always uses default exception handling.

As with denormal support, the issue at hand is not so much that these differ from IEEE 754 as it is that they differ from the behavior of the scalar (VFP) arithmetic.

- Steve
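The FPSCR fields named above can be decoded with plain bit operations. The sketch below takes an FPSCR value as an ordinary integer and reports whether the scalar (VFP) configuration asks for something NEON would not honor. The rounding-mode and trap-enable positions come from the message above. The FZ bit at position 24 and the encoding of round to nearest as 0 are taken from my reading of the ARMv7 FPSCR layout and should be checked against the ARM ARM. All function names are hypothetical, and the check covers only the three differences raised in this thread, so it is not an exhaustive test.

```c
#include <assert.h>
#include <stdint.h>

/* RMode field, bits 23:22. Assumption: 0 encodes round to nearest. */
static unsigned fpscr_rounding_mode(uint32_t fpscr) {
    return (fpscr >> 22) & 0x3u;
}

/* Trap enable field, bits 15:8, returned as an 8-bit mask. */
static unsigned fpscr_trap_enables(uint32_t fpscr) {
    return (fpscr >> 8) & 0xFFu;
}

/* FZ (flush-to-zero) bit. Assumption: bit 24 in the ARMv7 FPSCR. */
static int fpscr_flush_to_zero(uint32_t fpscr) {
    return (int)((fpscr >> 24) & 0x1u);
}

/* Hypothetical helper: nonzero when the VFP settings in `fpscr`
 * request behavior that NEON ignores, so a vectorized loop could
 * give different results from the scalar one. */
static int neon_may_diverge_from_vfp(uint32_t fpscr) {
    return fpscr_rounding_mode(fpscr) != 0   /* not round to nearest */
        || fpscr_trap_enables(fpscr) != 0    /* some trap is enabled */
        || !fpscr_flush_to_zero(fpscr);      /* VFP keeps denormals  */
}

/* Self-check with a few hand-built FPSCR values. */
static void fpscr_demo_self_check(void) {
    const uint32_t fz = 1u << 24;

    assert(neon_may_diverge_from_vfp(0u));             /* denormals kept */
    assert(!neon_may_diverge_from_vfp(fz));            /* matches NEON   */
    assert(neon_may_diverge_from_vfp(fz | 1u << 22));  /* other rounding */
    assert(neon_may_diverge_from_vfp(fz | 1u << 8));   /* trap enabled   */
}
```

Reading the live FPSCR on hardware needs a target-specific instruction and is left out here, so the sketch stays portable and can be compiled on any host.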
Renato Golin via llvm-dev
2016-Feb-21 12:14 UTC
[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
On 15 February 2016 at 14:22, Stephen Canon <scanon at apple.com> wrote:
> - ARMv7 NEON ignores the rounding mode set in bits 23:22 of FPSCR and always uses round to nearest.
> - ARMv7 NEON ignores the trap enable bits (15:8) in FPSCR and always uses default exception handling.

If I read the manuals correctly, these are not strictly defined in IEEE 754 to be one way or another, so these don't violate the standard. The subnormal treatment does.

> As with denormal support, the issue at hand is not so much that these differ from IEEE 754 as it is that they differ from the behavior of the scalar (VFP) arithmetic.

This is one of the practical consequences, yes, but of no relevance to this work. Right now, I'm only trying to avoid surprises. If a user has different results using -ffast-math, it's expected. Without, not so much.

cheers,
--renato