search for: vecinit

Displaying 10 results from an estimated 10 matches for "vecinit".

2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
...ather than on the input vectors themselves. In the PR you linked, there is an example that shows the difference (simplified to <2 x double> for brevity): define dso_local <2 x double> @test(i64 %a, i64 %b) { entry: %conv = uitofp i64 %a to double %conv1 = uitofp i64 %b to double %vecinit = insertelement <2 x double> undef, double %conv, i32 0 %vecinit2 = insertelement <2 x double> %vecinit, double %conv1, i32 1 ret <2 x double> %vecinit2 } The inputs here are scalars so I suppose it is quite possible (perhaps likely) that on some targets, doing the insert wit...
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
...you linked, there is an example that shows the difference >> (simplified to <2 x double> for brevity): >> define dso_local <2 x double> @test(i64 %a, i64 %b) { >> entry: >> %conv = uitofp i64 %a to double >> %conv1 = uitofp i64 %b to double >> %vecinit = insertelement <2 x double> undef, double %conv, i32 0 >> %vecinit2 = insertelement <2 x double> %vecinit, double %conv1, i32 1 >> ret <2 x double> %vecinit2 >> } >> >> The inputs here are scalars so I suppose it is quite possible (perhaps >>...
2012 Feb 28
1
[LLVMdev] How to vectorize a vector type cast?
...it-llvm" and then through "opt -O2 -S", produces the following IR: define <4 x float> @to_float4(i32 %in.coerce) nounwind uwtable readnone { entry: %0 = bitcast i32 %in.coerce to <4 x i8> %1 = extractelement <4 x i8> %0, i32 0 %conv = uitofp i8 %1 to float %vecinit = insertelement <4 x float> undef, float %conv, i32 0 %2 = extractelement <4 x i8> %0, i32 1 %conv2 = uitofp i8 %2 to float %vecinit3 = insertelement <4 x float> %vecinit, float %conv2, i32 1 %3 = extractelement <4 x i8> %0, i32 2 %conv4 = uitofp i8 %3 to float %...
2020 Jan 10
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
I have added a few PPC-specific DAG combines in the past that follow this pattern on specific operations. Now that it appears that this would be useful to do on yet another operation, I'm wondering what people think about doing this in the target-independent DAG Combiner for any legal/custom operation on the target. TL;DR: The generic pattern would look like this: (build_vector (op
2017 Sep 13
2
RFC phantom memory intrinsic
...nd we don't want to keep it. BTW: Looks like SLP could not recognize the case either : define <4 x double> @vsht_d4_fold(double* %ptr, i64 %i) local_unnamed_addr #0 { entry: %arrayidx = getelementptr inbounds double, double* %ptr, i64 %i %0 = load double, double* %arrayidx, align 8 %vecinit = insertelement <4 x double> undef, double %0, i32 0 %add = add i64 %i, 1 %arrayidx1 = getelementptr inbounds double, double* %ptr, i64 %add %1 = load double, double* %arrayidx1, align 8 %vecinit2 = insertelement <4 x double> %vecinit, double %1, i32 1 %add3 = add i64 %i, 2...
2017 Sep 13
2
RFC phantom memory intrinsic
...SLP could not recognize the case either : >> define <4 x double> @vsht_d4_fold(double* %ptr, i64 %i) local_unnamed_addr #0 { >> entry: >> %arrayidx = getelementptr inbounds double, double* %ptr, i64 %i >> %0 = load double, double* %arrayidx, align 8 >> %vecinit = insertelement <4 x double> undef, double %0, i32 0 >> %add = add i64 %i, 1 >> %arrayidx1 = getelementptr inbounds double, double* %ptr, i64 %add >> %1 = load double, double* %arrayidx1, align 8 >> %vecinit2 = insertelement <4 x double> %vecinit, dou...
2017 Sep 26
0
RFC phantom memory intrinsic
...e the case either : >>> define <4 x double> @vsht_d4_fold(double* %ptr, i64 %i) local_unnamed_addr #0 { >>> entry: >>> %arrayidx = getelementptr inbounds double, double* %ptr, i64 %i >>> %0 = load double, double* %arrayidx, align 8 >>> %vecinit = insertelement <4 x double> undef, double %0, i32 0 >>> %add = add i64 %i, 1 >>> %arrayidx1 = getelementptr inbounds double, double* %ptr, i64 %add >>> %1 = load double, double* %arrayidx1, align 8 >>> %vecinit2 = insertelement <4 x doub...
2017 Sep 26
2
RFC phantom memory intrinsic
...define <4 x double> @vsht_d4_fold(double* %ptr, i64 %i) >>>> local_unnamed_addr #0 { >>>> entry: >>>> %arrayidx = getelementptr inbounds double, double* %ptr, i64 %i >>>> %0 = load double, double* %arrayidx, align 8 >>>> %vecinit = insertelement <4 x double> undef, double %0, i32 0 >>>> %add = add i64 %i, 1 >>>> %arrayidx1 = getelementptr inbounds double, double* %ptr, i64 %add >>>> %1 = load double, double* %arrayidx1, align 8 >>>> %vecinit2 = insertelem...
2017 Sep 26
0
RFC phantom memory intrinsic
...> @vsht_d4_fold(double* %ptr, i64 %i) >>>>> local_unnamed_addr #0 { >>>>> entry: >>>>> %arrayidx = getelementptr inbounds double, double* %ptr, i64 %i >>>>> %0 = load double, double* %arrayidx, align 8 >>>>> %vecinit = insertelement <4 x double> undef, double %0, i32 0 >>>>> %add = add i64 %i, 1 >>>>> %arrayidx1 = getelementptr inbounds double, double* %ptr, i64 %add >>>>> %1 = load double, double* %arrayidx1, align 8 >>>>> %v...
2017 Sep 12
3
RFC phantom memory intrinsic
Hi, To solve PR21780, I plan to add new functionality to restore memory operations that were once deleted. In this particular case, these are the load operations that were deleted by InstCombine. Note that once a load has been removed there is no way to restore it, and that prevents us from vectorizing the shuffle operation. There are probably more similar issues where this approach could