On Tue, Mar 8, 2011 at 6:19 AM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:> Hi!
>
> I've attached a patch which takes care of the issues mentioned (and
adds two
> tests).
> Index: test/Transforms/InstCombine/sext.ll
> ==================================================================> ---
test/Transforms/InstCombine/sext.ll	(revision 127153)
> +++ test/Transforms/InstCombine/sext.ll	(working copy)
> @@ -126,3 +126,16 @@
>  ; CHECK-NEXT: store <2 x i16>
>  ; CHECK-NEXT: ret
>  }
> +
> +define i64 @test12(i32 %x) {
> +  %a = and i32 %x, -5
> +  %b = sext i32 %a to i64
> +  %c = add i64 %b, 1
> +  ret i64 %c
> +
> +; CHECK: @test12
> +; CHECK-NEXT: and i32
> +; CHECK-NEXT: add nsw
Why not check for 'i32' after the add? Isn't the entire point of
this
patch to shrink the add in cases like this?
> +; CHECK-NEXT: sext i32
> +; CHECK-NEXT: ret i64
> +}
> Index: test/Transforms/InstCombine/add-sitofp.ll
> ==================================================================> ---
test/Transforms/InstCombine/add-sitofp.ll	(revision 127153)
> +++ test/Transforms/InstCombine/add-sitofp.ll	(working copy)
> @@ -1,4 +1,5 @@
>  ; RUN: opt < %s -instcombine -S | grep {add nsw i32}
> +; RUN: opt < %s -instcombine -S | grep sitofp | count 2
When adding to old tests like this one it's better to migrate them
from grep to FileCheck.
RUN: opt < %s -instcombine -S | FileCheck
>
>  define double @x(i32 %a, i32 %b) nounwind {
CHECK: @x
CHECK: add nsw i32
>    %m = lshr i32 %a, 24
> @@ -7,3 +8,12 @@
>    %p = fadd double %o, 1.0
>    ret double %p
>  }
> +
> +define double @y(i32 %x, i32 %y) {
CHECK: @y
CHECK: <something>
Note that FileCheck is a nice way to perform more precise checking here :).
In particular, if I recall the transformation this is being used for
correctly, you want to check this does an integer addition instead of
a floating-point one?
Though in this case the 'add' later gets optimized to an 'or',
so
maybe you should replace -4 to -5 to keep the test clearer (and be
able to check for the 'nsw' flag).
> +  %p = and i32 %x, -4
> +  %q = and i32 %y,  1
CHECK: add nsw i32
CHECK: sitofp
CHECK-NOT: sitofp
CHECK-NOT: fadd
(Those last two are to confirm you removed the other sitofp and the
floating-point add)
> +  %a = sitofp i32 %p to double
> +  %b = sitofp i32 %q to double
> +  %result = fadd double %a, %b
> +  ret double %result
> +}
> Index: lib/Transforms/InstCombine/InstCombineAddSub.cpp
> ==================================================================> ---
lib/Transforms/InstCombine/InstCombineAddSub.cpp	(revision 127153)
> +++ lib/Transforms/InstCombine/InstCombineAddSub.cpp	(working copy)
> @@ -57,6 +57,30 @@
>  }
>
>
> +static bool RippleBucketExists(APInt &thisKnownZero, APInt
&thisKnownOne,
> +                               APInt &otherKnownZero, unsigned width)
{
> +  APInt mask;
> +  // First try to to take care of the case
> +  //  (X & ~4) + (Y & 1)
> +  int32_t power = (~thisKnownZero).exactLogBase2();
> +  if (power == -1) {
> +    if (~thisKnownZero == thisKnownOne)
> +      power = thisKnownOne.exactLogBase2();
Why did you introduce the extra log here? This assignment is a no-op
when ~thisKnownZero == thisKnownOne...
> +    if (power == -1)
... which means this check is redundant.
> +      return false;
> +  }
> +
> +  if (power == (width - 1))
Hmm.. I know I said 'width' should be unsigned, but now I get a
warning here for comparing a signed integer to an unsigned one (and
LLVM is supposed to compile without warnings).
So I guess changing width back to int isn't a disaster. (Who needs
integers with over 2 million bits anyway?)
Alternatively, since you know 'power' is non-negative here you can
safely cast it to unsigned for the comparison.
> +    mask = APInt::getSignBit(width);
> +  else
> +    mask = APInt::getBitsSet(width, power + 1, width - 2);
You removed the 'if (power < width-2)' case from the code I mentioned
as equivalent to your loop.
Because of this, if power + 1 > width - 2 (or equivalently in this
case, if power == width - 2) this will create a "wrapped" bit set: the
upper bit and the lower width-2 bits will be set, leaving only bit
width-2 zero. That isn't what you want, right?
> +
> +  if ((mask & otherKnownZero).getBoolValue())
> +    return true;
> +
> +  return false;
This can be more concisely written as just
  return (mask & otherKnownZero).getBoolValue();
> +}
> +
>  /// WillNotOverflowSignedAdd - Return true if we can prove that:
>  ///    (sext (add LHS, RHS))  === (add (sext LHS), (sext RHS))
>  /// This basically requires proving that the add in the original type
would not
> @@ -77,9 +101,19 @@
>    // has a known-zero bit in a more significant place than it (not
including the
>    // sign bit) the ripple may go up to and fill the zero, but won't
change the
>    // sign.  For example, (X & ~4) + 1.
> -
> -  // TODO: Implement.
> -
> +
> +  unsigned width = LHS->getType()->getScalarSizeInBits();
> +  APInt mask(width, -1, true), LHSKnownZero(width, 0), LHSKnownOne(width,
0),
> +    RHSKnownZero(width, 0), RHSKnownOne(width, 0);
> +
> +  ComputeMaskedBits(LHS, mask, LHSKnownZero, LHSKnownOne);
> +  ComputeMaskedBits(RHS, mask, RHSKnownZero, RHSKnownOne);
> +
> +  if (RippleBucketExists(LHSKnownZero, LHSKnownOne, RHSKnownZero, width))
> +    return true;
> +  if (RippleBucketExists(RHSKnownZero, RHSKnownOne, LHSKnownZero, width))
> +    return true;
> +
>    return false;
>  }