thr3ads.net - llvm dev - [LLVMdev] passing vector of booleans to functions [Feb 2013]

Roland Leißa

2013-Feb-26 02:48 UTC

[LLVMdev] passing vector of booleans to functions

Hi all,

I'm currently trying to figure out the best way to pass a vector of
booleans to other functions. Take this small example:

define <4 x float> @vcmp_add(<4 x float> %a, <4 x float> %b) {
entry:
  %cmp = fcmp olt <4 x float> %a, %b
  %add = fadd <4 x float> %a, %b
  %sel = select <4 x i1> %cmp, <4 x float> %add, <4 x float> %a
  ret <4 x float> %sel
}

I will get (on SSE):
	movaps	%xmm0, %xmm2
	cmpltps	%xmm1, %xmm0
	addps	%xmm2, %xmm1
	blendvps	%xmm1, %xmm2
	movaps	%xmm2, %xmm0
	ret

great :)
But now, let us try to pass a mask to a function.

define <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
entry:
  %add = fadd <4 x float> %a, %b
  %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
  ret <4 x float> %sel
}

I will get:

addps   %xmm1, %xmm2
pslld   $31, %xmm0
blendvps    %xmm2, %xmm1
movaps  %xmm1, %xmm0
ret

While this is correct and works, I'm unhappy with the pslld. Apparently,
LLVM uses a <4 x i32> to hold the <4 x i1>, with the LSB of each element
holding the mask bit. But blendvps expects the mask bit in the MSB, hence
the shift.
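
To make the bit positions concrete, here is a rough scalar model in C of a
single 32-bit lane. It is only a sketch: `blendv_lane` and `pslld_lane` are
hypothetical helper names that imitate what blendvps and pslld do to one
lane. They are not LLVM or intrinsic functions.

```c
#include <stdint.h>

/* Hypothetical model of one blendvps lane: only bit 31 (the MSB) of the
   mask lane is inspected. Returns on_true if it is set, else on_false. */
static float blendv_lane(uint32_t mask, float on_true, float on_false) {
    return (mask >> 31) ? on_true : on_false;
}

/* Hypothetical model of one pslld lane: logical left shift, where a
   count above 31 clears the lane. */
static uint32_t pslld_lane(uint32_t x, unsigned count) {
    return count > 31 ? 0u : x << count;
}
```

With a mask lane of 1 (true stored in the LSB), `blendv_lane(1u, add, a)`
picks `a`, because bit 31 is clear. After `pslld_lane(1u, 31)` the same call
picks `add`, which is the job the `pslld $31` does in the generated code.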

OK, let's try to do better. This time, I will directly use <4 x i32>:

define <4 x float> @masked_add_32(<4 x i32> %mask, <4 x float> %a, <4 x float> %b) {
entry:
  %add = fadd <4 x float> %a, %b
  %trunc = trunc <4 x i32> %mask to <4 x i1>
  %sel = select <4 x i1> %trunc, <4 x float> %add, <4 x float> %a
  ret <4 x float> %sel
}

But damn, I have to truncate the mask in order to use the select. So in
the end, LLVM will produce the same code as above. So what code do I
have to use, in order to get rid of the shift? 

If only there were a way to tell LLVM that each element of %mask is
guaranteed to be either 0xFFFFFFFF or 0x0...
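
The reason that guarantee would help can be sketched with a small scalar
model in C. All helper names below are hypothetical and only illustrate the
bit-level argument. They say nothing about what LLVM actually emits.

```c
#include <stdint.h>

/* The bit that `trunc <4 x i32> ... to <4 x i1>` keeps per lane: the LSB. */
static int lane_lsb(uint32_t m) { return (int)(m & 1u); }

/* The bit that blendvps inspects per lane: the MSB. */
static int lane_msb(uint32_t m) { return (int)(m >> 31); }

/* Returns 1 if every lane is 0x0 or 0xFFFFFFFF (the wished-for guarantee). */
static int is_all_zero_or_all_one(const uint32_t m[4]) {
    for (int i = 0; i < 4; ++i)
        if (m[i] != 0u && m[i] != 0xFFFFFFFFu)
            return 0;
    return 1;
}

/* Returns 1 if LSB and MSB agree in every lane, so that moving the LSB
   into the MSB would not change what blendvps selects. */
static int shift_is_redundant(const uint32_t m[4]) {
    for (int i = 0; i < 4; ++i)
        if (lane_lsb(m[i]) != lane_msb(m[i]))
            return 0;
    return 1;
}
```

For a mask such as {0xFFFFFFFF, 0, 0, 0xFFFFFFFF} both checks return 1, so
the shift carries no information. For {1, 0, 0, 0} the LSB and MSB disagree,
which is the general case the code generator has to assume.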

Thanks,
Roland

Duncan Sands

2013-Feb-26 09:02 UTC

Hi Roland,

> define <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
> entry:
>    %add = fadd <4 x float> %a, %b
>    %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
>    ret <4 x float> %sel
> }
>
> I will get:
>
> addps   %xmm1, %xmm2
> pslld   $31, %xmm0
> blendvps    %xmm2, %xmm1
> movaps  %xmm1, %xmm0
> ret
>
> While this is correct and works, I'm unhappy with the pslld. Apparently,
> LLVM uses a <4 x i32> to hold the <4 x i1> while the LSB holds the mask
> bit. But blendvps expects the MSB as mask bit and therefore the shift.

Try plunking a signext attribute on the mask parameter. That's supposed to
tell the code generators that the caller passed in an all-zero or all-one
value.

Ciao, Duncan.

Roland Leißa

2013-Feb-26 13:18 UTC

Hi Duncan,

thanks for the hint. I tried both variants:

define <4 x float> @masked_add_1(<4 x i1> signext %mask, <4 x float> %a, <4 x float> %b)
define <4 x float> @masked_add_32(<4 x i32> %mask, <4 x float> %a, <4 x float> %b)

Unfortunately, this will raise an assertion:
Wrong types for attribute: zeroext signext noalias nocapture sret byval nest


Should I file a bug report?

-- 
Roland

