Sanjay Patel via llvm-dev
2020-Dec-02 13:16 UTC
[llvm-dev] llvm 10: Why is float experimental_vector_reduce_fmin not tried?
I agree with your guess: the loop vectorizer doesn't know how to match the 'minnum' intrinsics into a reduction yet. The SLP vectorizer is missing that functionality too.

We need to update/consolidate both to recognize the FP min/max intrinsics as well as the recently added integer min/max intrinsics ( http://llvm.org/docs/LangRef.html#llvm-smax-intrinsic ).

cc'ing Craig to see if anything has happened since:
https://reviews.llvm.org/rGc195ae2

I just changed the x86 cost model to remove what could have been another roadblock:
https://reviews.llvm.org/rG136f98e52365

We may need to extend that kind of cost model fix-up to other targets.

On Tue, Nov 24, 2020 at 4:17 PM Mark Schimmel via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> LLVM vectorizes this same function for floating point addition just fine
> (uses experimental_vector_reduce_v2_fadd), but refuses to do the same for
> minf(). Does anyone have any insight why that would be? I’m using
> -ffast-math but that doesn’t seem to help.
>
> From grep’ing the sources the best I can figure is that some logic exists
> for Instruction::FCmp but perhaps not for Intrinsic::minnum. Is that the
> case?
>
> ; Function Attrs: norecurse nounwind readonly
> define float @f(float addrspace(4)* noalias nocapture readonly %a, float addrspace(4)* noalias nocapture readonly %b, float %m) local_unnamed_addr #0 {
> entry:
>   br label %for.body
>
> for.cond.cleanup:                                 ; preds = %for.body
>   ret float %3
>
> for.body:                                         ; preds = %entry, %for.body
>   %m.addr.024 = phi float [ %m, %entry ], [ %3, %for.body ] ; [#uses=1 type=float]
>   %i.023 = phi i32 [ 0, %entry ], [ %inc, %for.body ] ; [#uses=3 type=i32]
>   %arrayidx = getelementptr inbounds float, float addrspace(4)* %a, i32 %i.023 ; [#uses=1 type=float addrspace(4)*]
>   %0 = load float, float addrspace(4)* %arrayidx, align 4, !tbaa !3 ; [#uses=1 type=float]
>   %arrayidx1 = getelementptr inbounds float, float addrspace(4)* %b, i32 %i.023 ; [#uses=1 type=float addrspace(4)*]
>   %1 = load float, float addrspace(4)* %arrayidx1, align 4, !tbaa !3 ; [#uses=1 type=float]
>   %2 = tail call fast float @llvm.minnum.f32(float %0, float %1) ; [#uses=1 type=float]
>   %3 = tail call fast float @llvm.minnum.f32(float %m.addr.024, float %2) ; [#uses=2 type=float]
>   %inc = add nuw nsw i32 %i.023, 1 ; [#uses=2 type=i32]
>   %cmp = icmp ult i32 %inc, 8192 ; [#uses=1 type=i1]
>   br i1 %cmp, label %for.body, label %for.cond.cleanup, !llvm.loop !7
> }
>
> LV: Checking a loop in "f" from /path/to/x.c
> LV: Loop hints: force=enabled width=0 unroll=0 optspace=0
> LV: Found a loop: for.body
> *LV: Not vectorizing: Found an unidentified PHI %m.addr.024 = phi float [ %m, %entry ], [ %3, %for.body ] ; [#uses=1 type=float]*
> LV: Interleaving disabled by the pass manager
> LV: Can't vectorize the instructions or CFG
> LV: Not vectorizing: Cannot prove legality.
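
For readers following along, here is a rough sketch of the two source-level forms being contrasted in this thread. The original source file was not posted, so everything below is an assumption reconstructed from the quoted IR: the names min_reduce_call, min_reduce_select, check_forms_agree and kN are hypothetical, the trip count 8192 is taken from the icmp in the IR, and the sketch is written in C++ although the log refers to x.c. The first function uses the library min, which is the shape that shows up as llvm.minnum.f32 in the IR above. The second spells the reduction as an explicit compare and select, which is closer to the Instruction::FCmp pattern mentioned in the question. Whether either form is actually vectorized depends on the compiler version, target and flags, and has not been checked here.

```cpp
#include <cassert>
#include <cmath>

// Trip count taken from the quoted IR (icmp ult i32 %inc, 8192).
constexpr unsigned kN = 8192;

// Form 1: min reduction through the library call. In the quoted IR this
// shape appears as two llvm.minnum.f32 calls per iteration.
float min_reduce_call(const float *a, const float *b, float m) {
  for (unsigned i = 0; i < kN; ++i)
    m = std::fmin(m, std::fmin(a[i], b[i]));
  return m;
}

// Form 2: the same reduction written as explicit compare + select.
// Note: this differs from fmin when NaNs are present; the two forms are
// only interchangeable under the no-NaN assumption of fast-math.
float min_reduce_select(const float *a, const float *b, float m) {
  for (unsigned i = 0; i < kN; ++i) {
    float x = a[i] < b[i] ? a[i] : b[i];
    m = x < m ? x : m;
  }
  return m;
}

// Sanity check on NaN-free data: both forms must return the same value.
void check_forms_agree() {
  static float a[kN], b[kN];
  for (unsigned i = 0; i < kN; ++i) {
    a[i] = static_cast<float>(i % 97) + 1.0f;
    b[i] = static_cast<float>((i * 7) % 89) + 0.5f;
  }
  assert(min_reduce_call(a, b, 100.0f) == min_reduce_select(a, b, 100.0f));
}
```

Comparing the vectorizer's debug output for the two functions is one way to see which pattern a given LLVM build recognizes as a min reduction.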