Sanjay Patel via llvm-dev

2020-Dec-02 13:16 UTC

[llvm-dev] llvm 10: Why is float experimental_vector_reduce_fmin not tried?

I agree with your guess: the loop vectorizer doesn't know how to match the
'minnum' intrinsics into a reduction yet. The SLP vectorizer is missing
that functionality too. We need to update/consolidate both to recognize the
FP min/max intrinsics as well as the recently added integer min/max
intrinsics ( http://llvm.org/docs/LangRef.html#llvm-smax-intrinsic ).

cc'ing Craig to see if anything has happened since:
https://reviews.llvm.org/rGc195ae2

I just changed the x86 cost model to remove what could have been another
roadblock:
https://reviews.llvm.org/rG136f98e52365
We may need to extend that kind of cost model fix-up to other targets.

On Tue, Nov 24, 2020 at 4:17 PM Mark Schimmel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> LLVM vectorizes this same function for floating point addition just fine
> (uses experimental_vector_reduce_v2_fadd), but refuses to do the same for
> minf(). Does anyone have any insight why that would be? I’m using
> -ffast-math but that doesn’t seem to help.
>
>
>
> From grep’ing the sources, the best I can figure is that some logic exists
> for Instruction::FCmp but perhaps not for Intrinsic::minnum. Is that the
> case?
>
>
>
> ; Function Attrs: norecurse nounwind readonly
> define float @f(float addrspace(4)* noalias nocapture readonly %a, float addrspace(4)* noalias nocapture readonly %b, float %m) local_unnamed_addr #0 {
> entry:
>   br label %for.body
>
> for.cond.cleanup:                                 ; preds = %for.body
>   ret float %3
>
> for.body:                                         ; preds = %entry, %for.body
>   %m.addr.024 = phi float [ %m, %entry ], [ %3, %for.body ] ; [#uses=1 type=float]
>   %i.023 = phi i32 [ 0, %entry ], [ %inc, %for.body ] ; [#uses=3 type=i32]
>   %arrayidx = getelementptr inbounds float, float addrspace(4)* %a, i32 %i.023 ; [#uses=1 type=float addrspace(4)*]
>   %0 = load float, float addrspace(4)* %arrayidx, align 4, !tbaa !3 ; [#uses=1 type=float]
>   %arrayidx1 = getelementptr inbounds float, float addrspace(4)* %b, i32 %i.023 ; [#uses=1 type=float addrspace(4)*]
>   %1 = load float, float addrspace(4)* %arrayidx1, align 4, !tbaa !3 ; [#uses=1 type=float]
>   %2 = tail call fast float @llvm.minnum.f32(float %0, float %1) ; [#uses=1 type=float]
>   %3 = tail call fast float @llvm.minnum.f32(float %m.addr.024, float %2) ; [#uses=2 type=float]
>   %inc = add nuw nsw i32 %i.023, 1                ; [#uses=2 type=i32]
>   %cmp = icmp ult i32 %inc, 8192                  ; [#uses=1 type=i1]
>   br i1 %cmp, label %for.body, label %for.cond.cleanup, !llvm.loop !7
> }
>
>
>
> LV: Checking a loop in "f" from /path/to/x.c
> LV: Loop hints: force=enabled width=0 unroll=0 optspace=0
> LV: Found a loop: for.body
> *LV: Not vectorizing: Found an unidentified PHI   %m.addr.024 = phi float [ %m, %entry ], [ %3, %for.body ] ; [#uses=1 type=float]*
> LV: Interleaving disabled by the pass manager
> LV: Can't vectorize the instructions or CFG
> LV: Not vectorizing: Cannot prove legality.