Displaying 3 results from an estimated 3 matches for "masked_add_1".
2013 Feb 26
2
[LLVMdev] passing vector of booleans to functions
...p, <4 x float> %add, <4 x float> %a
ret <4 x float> %sel
}
I will get (on SSE):
movaps %xmm0, %xmm2
cmpltps %xmm1, %xmm0
addps %xmm2, %xmm1
blendvps %xmm1, %xmm2
movaps %xmm2, %xmm0
ret
great :)
But now, let us try to pass a mask to a function.
define <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
entry:
%add = fadd <4 x float> %a, %b
%sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
ret <4 x float> %sel
}
I will get:
addps %xmm1, %xmm2
pslld $31, %xmm0
blendvps %xmm2...
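A note on the extra `pslld $31` in the snippet above: `blendvps` selects each 32-bit lane based on that lane's most significant bit, so a mask whose boolean lives in the low bit of each lane has to be shifted into the sign bit first. The following is only a minimal scalar sketch of that behaviour in portable C, assuming the `<4 x i1>` mask arrives widened to one 32-bit lane per element with the boolean in bit 0. The helper names `blend_lane` and `masked_add_model` are hypothetical and do not come from the thread.

```c
#include <stdint.h>

/* Scalar model of one blendvps lane: the selection looks only at the
 * sign bit (bit 31) of the corresponding mask lane. */
static float blend_lane(uint32_t mask_lane, float on_true, float on_false) {
    return (mask_lane & 0x80000000u) ? on_true : on_false;
}

/* Hypothetical model of masked_add_1. Assumption: each i1 mask element
 * sits in bit 0 of its own 32-bit lane when the function is entered. */
static void masked_add_model(const uint32_t mask[4], const float a[4],
                             const float b[4], float out[4]) {
    for (int i = 0; i < 4; ++i) {
        /* Models pslld $31: move bit 0 into the sign bit. */
        uint32_t m = mask[i] << 31;
        /* Models select: a + b where the mask is set, otherwise a. */
        out[i] = blend_lane(m, a[i] + b[i], a[i]);
    }
}
```

Calling `masked_add_model` with a mask of {1, 0, 1, 0} keeps `a` unchanged in lanes 1 and 3 and writes `a + b` to lanes 0 and 2. If the mask lanes already held all-ones or all-zeros, as a `cmpltps` result does, the sign bit would already be correct and the shift would be unnecessary.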
2013 Feb 26
0
[LLVMdev] passing vector of booleans to functions
Hi Roland,
> define <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
> entry:
> %add = fadd <4 x float> %a, %b
> %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
> ret <4 x float> %sel
> }
>
> I will get:
>
> addps %x...
2013 Feb 26
1
[LLVMdev] passing vector of booleans to functions
Hi Duncan,
Thanks for the hint. I tried both variants:
define <4 x float> @masked_add_1(<4 x i1> signext %mask, <4 x float> %a, <4 x float> %b)
define <4 x float> @masked_add_32(<4 x i32> %mask, <4 x float> %a, <4 x float> %b)
Unfortunately, this will raise an assertion:
Wrong types for attribute: zeroext signext noalias nocapture sret byval...