search for: d8943

Displaying 3 results from an estimated 3 matches for "d8943".

2015 Nov 19
5
[RFC] Introducing a vector reduction add instruction.
...1, %xmm0 # xmm0 = xmm1[2,3,0,1]
paddd %xmm1, %xmm0
pshufd $229, %xmm0, %xmm1 # xmm1 = xmm0[1,1,2,3]
paddd %xmm0, %xmm1
movd %xmm1, %eax
retq
Note that because of the small VF we currently use (4), we cannot get the full benefit of psadbw. The patch in http://reviews.llvm.org/D8943 enables us to use bigger VFs based on the smallest type in a loop. The follow-up work is refining the cost model so that bigger VFs have a lower cost. For the example above, the best result comes from VF >= 16.
The draft of the patch is here: http://reviews.llvm.org/D14840 I will refine the patch l...
2015 Nov 25
2
[RFC] Introducing a vector reduction add instruction.
...%xmm0, %xmm1 # xmm1 = xmm0[1,1,2,3]
>> paddd %xmm0, %xmm1
>> movd %xmm1, %eax
>> retq
>>
>>
>> Note that because of the small VF we currently use (4), we cannot
>> get the full benefit of psadbw. The patch in
>> http://reviews.llvm.org/D8943 enables us to use bigger VFs based
>> on the smallest type in a loop. The follow-up work is refining the
>> cost model so that bigger VFs have a lower cost. For the example
>> above, the best result comes from VF >= 16.
>>
>> The draft of the patch is here: http...
2015 Nov 25
2
[RFC] Introducing a vector reduction add instruction.
...addd %xmm0, %xmm1
> >> movd %xmm1, %eax
> >> retq
> >>
> >>
> >> Note that because of the small VF we currently use (4), we cannot
> >> get the full benefit of psadbw. The patch in
> >> http://reviews.llvm.org/D8943 enables us to use bigger VFs
> >> based on the smallest type in a loop. The follow-up work is
> >> refining the cost model so that bigger VFs have a lower cost.
> >> For the example above, the best result comes from VF >= 16.
> >> >...