thr3ads.net - llvm dev - [LLVMdev] branch on vector compare? [Sep 2012]

Stephen

2012-Sep-02 22:40 UTC

[LLVMdev] branch on vector compare?

Hi all, llvm newbie here.

I'm trying to branch based on a vector compare. I've found a slow way (below)
which goes through memory. Is there some idiom I'm missing so that it would use
for instance movmsk for SSE or vcmpgt & cr6 for altivec?

Or do I need to resort to calling the intrinsic directly? 

Thanks,
Stephen.

  %16 = fcmp ogt <4 x float> %15, %cr
  %17 = extractelement <4 x i1> %16, i32 0
  %18 = extractelement <4 x i1> %16, i32 1
  %19 = extractelement <4 x i1> %16, i32 2
  %20 = extractelement <4 x i1> %16, i32 3
  %21 = or i1 %17, %18
  %22 = or i1 %19, %20
  %23 = or i1 %21, %22
  br i1 %23, label %true1, label %false2

Duncan Sands

2012-Sep-03 08:12 UTC

Hi Stephen,
> Hi all, llvm newbie here.
welcome!
> I'm trying to branch based on a vector compare. I've found a slow way (below)
> which goes through memory. Is there some idiom I'm missing so that it would use
> for instance movmsk for SSE or vcmpgt & cr6 for altivec?
I don't think you are missing anything: LLVM IR has no support for horizontal
operations such as or'ing together the elements of a vector of booleans. The
code generators do try to recognize a few idioms and synthesize horizontal
operations from them, but I think only addition is currently recognized, and
it expects the addition to be done (IIRC) by repeatedly using shufflevector to
split the vector in two and then adding the two halves. In fact, for your case
you could do something similar:
   %lo1 = shufflevector <4 x i1> %16, <4 x i1> undef, <2 x i32> <i32 0, i32 1>
   %hi1 = shufflevector <4 x i1> %16, <4 x i1> undef, <2 x i32> <i32 2, i32 3>
   %join = or <2 x i1> %lo1, %hi1
   %lo2 = extractelement <2 x i1> %join, i32 0
   %hi2 = extractelement <2 x i1> %join, i32 1
   %final = or i1 %lo2, %hi2
Currently I would expect the code generators to produce something nasty for
this. Feel free to open a bug report requesting that the code generators do
something better.

Ciao, Duncan.
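
One other way to express "any lane true" in plain IR is to reinterpret the
<4 x i1> compare result as a 4-bit integer and test it against zero. The
sketch below is only an illustration: the function name and the return values
are made up, and whether a given code generator turns this into a movmsk-style
sequence rather than going through memory is an assumption that has not been
checked against the LLVM version discussed in this thread.

```llvm
; Hypothetical helper: returns 1 if any lane of %a is greater than %b, else 0.
define i32 @branch_on_mask(<4 x float> %a, <4 x float> %b) {
entry:
  ; Lane-wise ordered greater-than compare, one i1 per lane.
  %cmp = fcmp ogt <4 x float> %a, %b
  ; Reinterpret the four i1 lanes as a single 4-bit integer.
  %bits = bitcast <4 x i1> %cmp to i4
  ; A nonzero value means at least one lane compared true.
  %any = icmp ne i4 %bits, 0
  br i1 %any, label %true1, label %false2

true1:
  ret i32 1

false2:
  ret i32 0
}
```

Comparing %bits against -1 instead of 0 (with icmp eq) would give the
"all lanes true" variant of the same test.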

Stephen

2012-Sep-03 22:08 UTC

> > which goes through memory. Is there some idiom I'm missing so that it would
> > use for instance movmsk for SSE or vcmpgt & cr6 for altivec?
>
> I don't think you are missing anything: LLVM IR has no support for horizontal
> operations like or'ing the elements of a vector of boolean together. The code
> generators do try to recognize a few idioms and synthesize horizontal
> operations from them, but I think only addition is currently recognized, and

Thanks Duncan,

you're right - that does compile to a mess of spills to memory not
unlike the original.

I had a further look at this: the existing SelectInst seems to be pretty
close to what is needed.

   Value *IRBuilder::CreateSelect(Value *C, Value *True, Value *False,
                                  const Twine &Name)

Currently, this asserts that True and False are both vector types of the
same size as "C". I was thinking of weakening this condition so that if True
and False are both i1 types, it will be allowed and will result in something
which can be branched on.

I have quite a bit of reading ahead it seems!
Stephen.
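
For readers on a much newer toolchain: LLVM releases well after this thread
added vector reduction intrinsics, which produce exactly the kind of
branchable i1 described above without changing select. The sketch below
assumes a release recent enough to provide llvm.vector.reduce.or (it did not
exist when this thread was written); the function name and return values are
hypothetical.

```llvm
; Horizontal OR over the lanes of a <4 x i1> vector (newer LLVM only).
declare i1 @llvm.vector.reduce.or.v4i1(<4 x i1>)

; Hypothetical helper: returns 1 if any lane of %a is greater than %b, else 0.
define i32 @branch_on_any_ogt(<4 x float> %a, <4 x float> %b) {
entry:
  ; Lane-wise ordered greater-than compare, one i1 per lane.
  %cmp = fcmp ogt <4 x float> %a, %b
  ; Collapse the four lanes into a single i1 that is true if any lane is true.
  %any = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %cmp)
  br i1 %any, label %true1, label %false2

true1:
  ret i32 1

false2:
  ret i32 0
}
```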
