On Sun, Mar 6, 2011 at 10:11 AM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:> While I was at it, I also took a stab at finishing up one of the TODOs.
I've
> attached the patch for review.
Comments inline.
For those of you following at home, this code is in
InstCombiner::WillNotOverflowSignedAdd(), and the first line of the
initial comment is:
// If one of the operands only has one non-zero bit, and if the
other operand> --- lib/Transforms/InstCombine/InstCombineAddSub.cpp (revision 126747)
> +++ lib/Transforms/InstCombine/InstCombineAddSub.cpp (working copy)
> @@ -77,9 +77,55 @@
> // has a known-zero bit in a more significant place than it (not
including the
> // sign bit) the ripple may go up to and fill the zero, but won't
change the
> // sign. For example, (X & ~4) + 1.
> -
> - // TODO: Implement.
> -
> +
> + int32_t power;
> +
> + {
> + int width = LHS->getType()->getScalarSizeInBits();
This should be an unsigned, like the result type of getScalarSizeInBits().
> + APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
> + mask.setAllBits();
An easier way to get an all-one mask would be to construct it as
'APInt mask(width, -1, true)'.
> + ComputeMaskedBits(LHS, mask, zeroes, ones);
> + zeroes.flipAllBits();
> +
> + if ((power = ones.exactLogBase2()) != -1 && zeroes == ones) {
It would probably be cleaner to assign 'power' outside the condition.
I don't think you're handling the case where the only set bit is the
sign bit correctly; in that case the value is the most negative number
possible, so you only need to check that the other side is
non-negative. See below for how to handle this.
This is also the only place where above 'zeroes' is used at the
moment. It might be better not change zeroes itself here so it can be
reused in the next block (see below). Use something like '~zeroes
=ones'. Alternatively something like '(zeroes |
ones).isAllOnesValue()'
would work too, since they won't share any bits.
> + int width = RHS->getType()->getScalarSizeInBits();
This is the same as the previous width since the types of the LHS and
RHS are required to be equal for instructions like add. Just delete
this line.
> + APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
These shadow (have the same name as) the ones before. It's better to
reuse the mask and rename the other two. The earlier ones should be
LHSKnownZero and LHSKnownOne and these should be RHSKnownZero and
RHSKnownOne. This is more consistent with the way they're named
elsewhere in LLVM.
> + mask.clearAllBits();
This is redundant here because it's already been initialized to zero
in the constructor. If you'd reuse the old mask variable as suggested
above you'd need this, but see below.
> +
> + // Disregarding the sign bit
> + for (int i = (width - 2); i > power; i--)
> + mask.setBit(i);
I think this is equivalent to
if (power < width-2)
mask = APInt::getBitsSet(width, power+1, width-2);
else
mask.clearAllBits();
(This would mean the clearAllBits() above would again be redundant)
However, a nice way to handle the signbit-only case would be to wrap
that in an extra if as follows:
if (power == width - 1)
mask = APInt::getSignBit(width); // Alternatively: LHSKnownOne,
which should be equivalent.
else if // ... the code above ...
so that for signbit-only LHS the check below tests whether the RHS is
non-negative.
> + ComputeMaskedBits(RHS, mask, zeroes, ones);
> +
> + // At least one 0
> + if (zeroes.countPopulation())
This should be 'if (RHSKnownZero.getBoolValue())' / 'if
(!!RHSKnownZero)' or similar. You don't need the actual number of set
bits here, you just want to know whether it's zero.
> + return true;
> + }
> + }
> +
> + {
> + int width = RHS->getType()->getScalarSizeInBits();
This has already been calculated in the previous block. Reuse it.
> + APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
> + mask.setAllBits();
> + ComputeMaskedBits(RHS, mask, zeroes, ones);
If you calculated this beforehand and stored it in RHSKnownZero and
RHSKnownOne, you'd only need two ComputeMaskedBits calls instead of
four in this code. Inside the 'if's you can use e.g. (Mask &
RHSKnownZero) to get the values you're currently getting there.
> + zeroes.flipAllBits();
> +
> + if ((power = ones.exactLogBase2()) != -1 && zeroes == ones) {
> + int width = LHS->getType()->getScalarSizeInBits();
> + APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
> + mask.clearAllBits();
> +
> + // Disregarding the sign bit
> + for (int i = (width - 2); i > power; i--)
> + mask.setBit(i);
> + ComputeMaskedBits(LHS, mask, zeroes, ones);
> +
> + // At least one 0
> + if (zeroes.countPopulation())
> + return true;
> + }
> + }
Much of the same comments apply to this block as to the one above it
since it appears to be duplicated code with LHS and RHS switched.
Depending on how much code remains in the end, it might be better to
factor it out to a static function taking the changing values as
parameters.
> +
> return false;
> }
You're missing tests for this new functionality. Add some to
test/Transforms/InstCombine/ after searching for
'WillNotOverflowSignedAdd' calls to see the patterns it's used for.