2013 Feb 26
[LLVMdev] passing vector of booleans to functions
...<4 x float> %a, %b
%sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
ret <4 x float> %sel
}
I will get:
addps %xmm1, %xmm2
pslld $31, %xmm0
blendvps %xmm2, %xmm1
movaps %xmm1, %xmm0
ret
While this is correct and works, I'm unhappy with the pslld. Apparently,
LLVM uses a <4 x i32> to hold the <4 x i1>, with the LSB holding the mask
bit. But blendvps expects the mask bit in the MSB, hence the shift.
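To make the LSB-versus-MSB point concrete, here is a minimal scalar sketch in C of what the generated code does per lane. It is only a model under stated assumptions, not what LLVM emits: the helper names (pslld31_lane, blendv_lane, masked_add_model) are made up for illustration, and each 32-bit lane is handled with plain integer and float operations instead of SSE registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `pslld $31` on one 32-bit lane:
   shifts left by 31, so bit 0 ends up in bit 31 and all other bits are lost. */
static uint32_t pslld31_lane(uint32_t lane) {
    return lane << 31;
}

/* Hypothetical model of one blendvps lane: only bit 31 of the mask lane
   is inspected. If it is set, take `src`; otherwise keep `dst`. */
static float blendv_lane(float dst, float src, uint32_t mask_lane) {
    return (mask_lane & 0x80000000u) ? src : dst;
}

/* Per-lane model of the addps / pslld / blendvps sequence:
   out[i] = mask[i] ? a[i] + b[i] : a[i], with the boolean held in the LSB. */
static void masked_add_model(const uint32_t mask[4], const float a[4],
                             const float b[4], float out[4]) {
    for (int i = 0; i < 4; ++i) {
        float sum = a[i] + b[i];                     /* addps */
        uint32_t m = pslld31_lane(mask[i]);          /* pslld $31: LSB -> MSB */
        out[i] = blendv_lane(a[i], sum, m);          /* blendvps */
    }
}

/* Small self-check: without the shift, a mask lane of 1 has bit 31 clear,
   so the blend would keep `dst` and ignore the boolean entirely. */
static void check_shift_is_needed(void) {
    assert(blendv_lane(1.0f, 2.0f, 1u) == 1.0f);
    assert(blendv_lane(1.0f, 2.0f, pslld31_lane(1u)) == 2.0f);
}
```

In this model, a mask of {1, 0, 1, 0} selects the sum in lanes 0 and 2 and keeps %a in lanes 1 and 3. The shift is unnecessary only if the caller already guarantees that bit 31 of each lane carries the boolean.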
OK, let's try to do better. This time, I will directly use <4 x i32>:
define <4 x float> @masked_add_32(<4 x i32>...
2013 Feb 26
[LLVMdev] passing vector of booleans to functions
...k, <4 x float> %add, <4 x float> %a
> ret <4 x float> %sel
> }
>
> I will get:
>
> addps %xmm1, %xmm2
> pslld $31, %xmm0
> blendvps %xmm2, %xmm1
> movaps %xmm1, %xmm0
> ret
>
> While this is correct and works, I'm unhappy with the pslld. Apparently,
> LLVM uses a <4 x i32> to hold the <4 x i1>, with the LSB holding the mask
> bit. But blendvps expects the mask bit in the MSB, hence the shift.
Try plunking a signext attribute on the mask parameter. That's supposed to tell
the code generators that the caller...
2013 Feb 26
[LLVMdev] passing vector of booleans to functions
...loat> %sel
> >
> > }
> >
> > I will get:
> >
> > addps %xmm1, %xmm2
> > pslld $31, %xmm0
> > blendvps %xmm2, %xmm1
> > movaps %xmm1, %xmm0
> > ret
> >
> > While this is correct and works, I'm unhappy with the pslld. Apparently,
> > LLVM uses a <4 x i32> to hold the <4 x i1>, with the LSB holding the mask
> > bit. But blendvps expects the mask bit in the MSB, hence the shift.
>
> Try plunking a signext attribute on the mask parameter. That's supposed to
> tell the code g...