thr3ads.net - llvm dev - [LLVMdev] XOR optimization [Jul 2011]

If this information is useful, please help other people find it:
Share via:

Daniel Nicácio

2011-Jul-26 01:25 UTC

[LLVMdev] XOR optimization

Hi folks,

I couldn't find a specific XOR (OR and AND) optimization on llvm, and
therefore I am about to implement it.
But first I would like to check with you guys that it really does not exist.

For a simple loop like this:

nbits = 128;
bit_addr = 0;
 while(nbits--)
{
 bindex=bit_addr>>5;     /* Index is number /32 */
bitnumb=bit_addr % 32;  /* Bit number in longword */
 bitmap[bindex]^=(1L<<bitnumb);
bit_addr++;
}


The -O3 set of optimizations generates a code like this:


entry:
  br label %while.body

while.body:                                       ; preds = %while.body,
%entry
  %0 = phi i32 [ 0, %entry ], [ %inc.3, %while.body ]
  %shr = lshr i32 %0, 5
  %rem = and i32 %0, 28
  %shl = shl i32 1, %rem
  %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr
  %tmp6 = load i32* %arrayidx, align 4
  %xor = xor i32 %tmp6, %shl
  %rem.1 = or i32 %rem, 1
  %shl.1 = shl i32 1, %rem.1
  %xor.1 = xor i32 %xor, %shl.1
  %rem.2 = or i32 %rem, 2
  %shl.2 = shl i32 1, %rem.2
  %xor.2 = xor i32 %xor.1, %shl.2
  %rem.3 = or i32 %rem, 3
  %shl.3 = shl i32 1, %rem.3
  %xor.3 = xor i32 %xor.2, %shl.3
  store i32 %xor.3, i32* %arrayidx, align 4
  %inc.3 = add i32 %0, 4
  %exitcond.3 = icmp eq i32 %inc.3, 128
  br i1 %exitcond.3, label %while.end, label %while.body

while.end:                                        ; preds = %while.body
  ret void



It is clear that we are able to fold all XORs into a single XOR, and the
same happens to all SHLs and ORs.
I am using -O3, but the code is not optimized, so I am assuming there is no
optimization for this case. Am I correct?

If yes, I have a few other questions:

 - Do you know of any other similar optimization that could do something
here but is not being triggered for some reason??
 - Do you know why a OR instruction is used for increments? instead of using
a INC or ADD?
 - Is there a straight forward way to know if an instruction belongs to a
loop? (just curiosity)


Thanks very much

Daniel Nicacio
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110725/6955c76b/attachment.html>

FlyLanguage

2011-Jul-26 10:21 UTC

head link

[LLVMdev] XOR optimization

Den 26.07.2011 03:25, skrev Daniel Nicácio:>   - Do you know why a OR instruction is used for increments? instead of
> using a INC or ADD?
You mean "bit_addr++;" ?

It's add'ing (+4 due to loop unrolling i guess)

   %inc.3 = add i32 %0, 4

Rafael Auler

2011-Jul-26 15:22 UTC

head link

[LLVMdev] XOR optimization

I think he mean the intermediate steps:

%rem.1 = or i32 %rem, 1
%rem.2 = or i32 %rem, 2
%rem.3 = or i32 %rem, 3

When calculating bit_addr % 32. Inside the loop, %rem should be incremented
because of the loop unrolling, but instead, the optimizer thought it's
easier to OR it. This gives the same result, but it's weird.

2011/7/26 FlyLanguage <flylanguage at gmail.com>
> Den 26.07.2011 03:25, skrev Daniel Nicácio:
> >   - Do you know why a OR instruction is used for increments? instead
of
> > using a INC or ADD?
>
> You mean "bit_addr++;" ?
>
> It's add'ing (+4 due to loop unrolling i guess)
>
>   %inc.3 = add i32 %0, 4
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110726/1a5bd89c/attachment.html>

Maybe Matching Threads

Search for more apparently analagous threads

llvm dev - Jul 2011 - [LLVMdev] XOR optimization

[LLVMdev] XOR optimization

[LLVMdev] XOR optimization

[LLVMdev] XOR optimization

Maybe Matching Threads