On Tue, Mar 8, 2011 at 6:19 AM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:> Hi!
>
> I've attached a patch which takes care of the issues mentioned (and
adds two
> tests).
> Index: test/Transforms/InstCombine/sext.ll
> ==================================================================> ---
test/Transforms/InstCombine/sext.ll (revision 127153)
> +++ test/Transforms/InstCombine/sext.ll (working copy)
> @@ -126,3 +126,16 @@
> ; CHECK-NEXT: store <2 x i16>
> ; CHECK-NEXT: ret
> }
> +
> +define i64 @test12(i32 %x) {
> + %a = and i32 %x, -5
> + %b = sext i32 %a to i64
> + %c = add i64 %b, 1
> + ret i64 %c
> +
> +; CHECK: @test12
> +; CHECK-NEXT: and i32
> +; CHECK-NEXT: add nsw
Why not check for 'i32' after the add? Isn't the entire point of
this
patch to shrink the add in cases like this?
> +; CHECK-NEXT: sext i32
> +; CHECK-NEXT: ret i64
> +}
> Index: test/Transforms/InstCombine/add-sitofp.ll
> ==================================================================> ---
test/Transforms/InstCombine/add-sitofp.ll (revision 127153)
> +++ test/Transforms/InstCombine/add-sitofp.ll (working copy)
> @@ -1,4 +1,5 @@
> ; RUN: opt < %s -instcombine -S | grep {add nsw i32}
> +; RUN: opt < %s -instcombine -S | grep sitofp | count 2
When adding to old tests like this one it's better to migrate them
from grep to FileCheck.
RUN: opt < %s -instcombine -S | FileCheck
>
> define double @x(i32 %a, i32 %b) nounwind {
CHECK: @x
CHECK: add nsw i32
> %m = lshr i32 %a, 24
> @@ -7,3 +8,12 @@
> %p = fadd double %o, 1.0
> ret double %p
> }
> +
> +define double @y(i32 %x, i32 %y) {
CHECK: @y
CHECK: <something>
Note that FileCheck is a nice way to perform more precise checking here :).
In particular, if I recall the transformation this is being used for
correctly, you want to check this does an integer addition instead of
a floating-point one?
Though in this case the 'add' later gets optimized to an 'or',
so
maybe you should replace -4 to -5 to keep the test clearer (and be
able to check for the 'nsw' flag).
> + %p = and i32 %x, -4
> + %q = and i32 %y, 1
CHECK: add nsw i32
CHECK: sitofp
CHECK-NOT: sitofp
CHECK-NOT: fadd
(Those last two are to confirm you removed the other sitofp and the
floating-point add)
> + %a = sitofp i32 %p to double
> + %b = sitofp i32 %q to double
> + %result = fadd double %a, %b
> + ret double %result
> +}
> Index: lib/Transforms/InstCombine/InstCombineAddSub.cpp
> ==================================================================> ---
lib/Transforms/InstCombine/InstCombineAddSub.cpp (revision 127153)
> +++ lib/Transforms/InstCombine/InstCombineAddSub.cpp (working copy)
> @@ -57,6 +57,30 @@
> }
>
>
> +static bool RippleBucketExists(APInt &thisKnownZero, APInt
&thisKnownOne,
> + APInt &otherKnownZero, unsigned width)
{
> + APInt mask;
> + // First try to to take care of the case
> + // (X & ~4) + (Y & 1)
> + int32_t power = (~thisKnownZero).exactLogBase2();
> + if (power == -1) {
> + if (~thisKnownZero == thisKnownOne)
> + power = thisKnownOne.exactLogBase2();
Why did you introduce the extra log here? This assignment is a no-op
when ~thisKnownZero == thisKnownOne...
> + if (power == -1)
... which means this check is redundant.
> + return false;
> + }
> +
> + if (power == (width - 1))
Hmm.. I know I said 'width' should be unsigned, but now I get a
warning here for comparing a signed integer to an unsigned one (and
LLVM is supposed to compile without warnings).
So I guess changing width back to int isn't a disaster. (Who needs
integers with over 2 million bits anyway?)
Alternatively, since you know 'power' is non-negative here you can
safely cast it to unsigned for the comparison.
> + mask = APInt::getSignBit(width);
> + else
> + mask = APInt::getBitsSet(width, power + 1, width - 2);
You removed the 'if (power < width-2)' case from the code I mentioned
as equivalent to your loop.
Because of this, if power + 1 > width - 2 (or equivalently in this
case, if power == width - 2) this will create a "wrapped" bit set: the
upper bit and the lower width-2 bits will be set, leaving only bit
width-2 zero. That isn't what you want, right?
> +
> + if ((mask & otherKnownZero).getBoolValue())
> + return true;
> +
> + return false;
This can be more concisely written as just
return (mask & otherKnownZero).getBoolValue();
> +}
> +
> /// WillNotOverflowSignedAdd - Return true if we can prove that:
> /// (sext (add LHS, RHS)) === (add (sext LHS), (sext RHS))
> /// This basically requires proving that the add in the original type
would not
> @@ -77,9 +101,19 @@
> // has a known-zero bit in a more significant place than it (not
including the
> // sign bit) the ripple may go up to and fill the zero, but won't
change the
> // sign. For example, (X & ~4) + 1.
> -
> - // TODO: Implement.
> -
> +
> + unsigned width = LHS->getType()->getScalarSizeInBits();
> + APInt mask(width, -1, true), LHSKnownZero(width, 0), LHSKnownOne(width,
0),
> + RHSKnownZero(width, 0), RHSKnownOne(width, 0);
> +
> + ComputeMaskedBits(LHS, mask, LHSKnownZero, LHSKnownOne);
> + ComputeMaskedBits(RHS, mask, RHSKnownZero, RHSKnownOne);
> +
> + if (RippleBucketExists(LHSKnownZero, LHSKnownOne, RHSKnownZero, width))
> + return true;
> + if (RippleBucketExists(RHSKnownZero, RHSKnownOne, LHSKnownZero, width))
> + return true;
> +
> return false;
> }