Renato Golin via llvm-dev
2016-Feb-08 19:15 UTC
[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
On 8 February 2016 at 16:33, James Molloy <James.Molloy at arm.com> wrote:
> The loop vectorizer does indeed require -ffast-math, but the IEEE-nonconformant transforms it does are far greater than using an ISA which may FTZ. It needs -ffast-math because any FP reductions necessarily have their execution order shuffled, due to executing some of them in parallel and reducing to scalar at the end. Therefore the LV doesn’t need to be changed - it will only work when “fast” is given and will only emit “fast” vector instructions.

Good point. This seems to be a much more rigorous definition in the new 2008 standard.

Right now, the loop vectorizer produces vector code without -ffast-math. Are you saying we should disable it altogether for all architectures that claim to follow the new standard? Inner loops can be "vectorized" by SLP using only VFP instructions.

The implementation seems to have moved to Inst->hasUnsafeAlgebra(), so we may need to return false in the legalization phase if the flag is omitted and any instruction has unsafe algebra.

> The SLP vectoriser however should theoretically take non-fast scalars and produce non-fast vectors. Similarly people will hand-write vector IR, or generate it from other frontends.

We can't guarantee the semantics of the unsafe-math flag in any IR that was not generated by a front-end which knows about it. So, it follows that we'll stop vectorizing their basic blocks, and there could be some outcry.

We need some general consensus if that's what people want. I don't think we do.

> Because of this, I think it’s important that we shouldn’t change the semantics of the IR currently. Making vector IR targeting ARM produce scalar instructions unless a modifier is given will undoubtedly cause problems down the line with frontends being out of sync or not being updated. Even worse, the symptom of this would just be “LLVM produces poor code for ARM” / “LLVM’s vector codegen is terrible for ARM” - performance errata and not conformance.

That’s why I think changing to a full-strict-by-default approach would be bad for the project.

> It would also violate the principle of least surprise - I wrote vector instructions and picked a vector ISA… but they’re being scalarized?

Right, this is opposed to marking an instruction as unsafe by default (i.e. my second option). If that's so, I agree with you that it's not trivial and may create more problems than it solves.

Hand-written IR, inline ASM and intrinsics should remain what they are. So 16274 is probably a "won't fix"?

> My experience is that the number of people who care about full IEEE compatibility on ARMv7 hardware is limited, and the set of people who care about exact ULP constraints even more limited. I think we absolutely should make a solution that solves PR16274, but I think it would have to be opt-in, not opt-out.

And I'm guessing this is related to SLP and others. If so, I agree.

So, for 16275, the fix is to disable loop vect. for no-fast-math + hasUnsafeAlgebra.

For 16274, disabling NEON emission in SLP would be one way, but we must avoid any fiddling with inline asm and intrinsics, so I don't think we should be doing that in any generic way. Certainly not related to the example, from IR to instruction.

Makes sense?

--renato
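
[Illustration, not part of the original message.] The point about FP reductions having their execution order shuffled can be sketched in a few lines of Python. This is only a model, assuming IEEE double arithmetic and a hypothetical 4-lane vector unit; `sequential_sum` and `lane_sum` are made-up names and do not correspond to anything in LLVM.

```python
import math

def sequential_sum(values):
    """Scalar loop: add the elements strictly in source order."""
    total = 0.0
    for v in values:
        total += v
    return total

def lane_sum(values, lanes=4):
    """Model of a vectorized reduction: each lane keeps its own partial
    sum, and the partial sums are reduced to a scalar at the end."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total

# Large values cancel, small values should survive: the exact sum is 8.0.
data = [1e17, 1.0, -1e17, 1.0] * 4

scalar_result = sequential_sum(data)   # in-order evaluation
vector_result = lane_sum(data)         # reassociated evaluation
exact_result = math.fsum(data)         # correctly rounded reference

print(scalar_result, vector_result, exact_result)
```

Both orders lose information here, but they lose different amounts, which is why the reordering counts as a fast-math transform regardless of whether the target flushes denormals.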
James Molloy via llvm-dev
2016-Feb-08 19:25 UTC
[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
Sorry, on phone so cherry picking what I reply to:

> On 8 Feb 2016, at 19:15, Renato Golin <renato.golin at linaro.org> wrote:
>
> For 16275, the fix is to disable loop vect. for no-fast-math + hasUnsafeAlgebra.

Do you think there is a set of people that care about IEEE accuracy in so far that they don't want FTZ, but *are* happy to reassociate FP operations? That seems fairly niche to me?
Stephen Canon via llvm-dev
2016-Feb-08 19:44 UTC
[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
> On Feb 8, 2016, at 2:25 PM, James Molloy via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Sorry, on phone so cherry picking what I reply to:
>
>> On 8 Feb 2016, at 19:15, Renato Golin <renato.golin at linaro.org> wrote:
>>
>> For 16275, the fix is to disable loop vect. for no-fast-math + hasUnsafeAlgebra.
>
> Do you think there is a set of people that care about IEEE accuracy in so far that they don't want FTZ, but *are* happy to reassociate FP operations? That seems fairly niche to me?

I agree. FZ is usually relatively benign (it only causes major problems when programs expect x != y to imply that x - y != 0, an axiom of floating-point that’s broken in FZ). Re-association more frequently causes significant instability. I think it’s reasonable for unsafeAlgebra to imply “FZ is an allowed mode”.

– Steve
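
[Illustration, not part of the original message.] The axiom mentioned above (x != y implies x - y != 0) can be checked with a small Python model. Python does not expose the hardware FZ bit, so `flush_to_zero` below is a hypothetical helper that imitates it by replacing subnormal results with zero; the sketch also assumes the interpreter itself runs with gradual underflow enabled, as is usual for CPython.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min  # smallest positive normal double

def flush_to_zero(v):
    """Imitate FZ mode: any subnormal result becomes a signed zero."""
    if v != 0.0 and abs(v) < MIN_NORMAL:
        return math.copysign(0.0, v)
    return v

# Two distinct normal numbers whose difference is subnormal.
x = 1.5 * MIN_NORMAL
y = 1.25 * MIN_NORMAL

ieee_diff = x - y                     # gradual underflow keeps it nonzero
fz_diff = flush_to_zero(x - y)        # FZ model flushes it to zero

print(x != y, ieee_diff != 0.0, fz_diff != 0.0)
```

Code that guards a division with `if x != y` and then divides by `x - y` is the typical case that is safe under gradual underflow and breaks under this model.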
Renato Golin via llvm-dev
2016-Feb-08 20:51 UTC
[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
On 8 February 2016 at 19:25, James Molloy <James.Molloy at arm.com> wrote:
>> For 16275, the fix is to disable loop vect. for no-fast-math + hasUnsafeAlgebra.
>
> Do you think there is a set of people that care about IEEE accuracy in so far that they don't want FTZ, but *are* happy to reassociate FP operations? That seems fairly niche to me?

No. But I also don't want to disable the vectorizer for integer arithmetic.

I'm guessing hasUnsafeAlgebra is not just for FZ but also NaNs and Infs, so disabling the vectorization of loops that have any of those unless safe-math is chosen seems simple enough to me.

cheers,
--renato