thr3ads.net - llvm dev - [llvm-dev] Branch is not optimized because of right shift [Jul 2020]

If this information is useful, please help other people find it:
Share via:

Craig Topper via llvm-dev

2020-Apr-06 02:22 UTC

[llvm-dev] Branch is not optimized because of right shift

On Sun, Apr 5, 2020 at 6:34 PM Stefanos Baziotis <
stefanos.baziotis at gmail.com> wrote:
> Hi Craig,
>
> > Adding a nuw to the add -8 is incorrect.
> Yeah, I didn't mean to say it was correct. It was just an observation
that
> with nuw the optimization was happened and I asked if someone thought it
> was somehow connected.
>
> > From the perspective of the unsigned math, -8 is treated a very large
> positive number. The input to the add is [8,13) and adding a large positive
> number to it wraps around past 0. So that is guaranteed unsigned wrap
> I understand yes, but I don't think it is guaranteed. Unless I miss
> something, for values in [0, 7] it won't wrap. But past that and up to
(and
> including in the original source code) 13, it will wrap yes.
>
But the value can't be [0,7] due to the earlier branch. When I said it was
guaranteed to wrap, I only meant for the range of values that were possible
after the first branch.

In theory, the CorrelatedValuePropagation pass should have been able to
optimize the select. It was able to see that the input to add was in the
range [8,14) in the call to LVI->getConstantRange in processBinOp.
processCmp skips calling LVI for the select's icmp because the input
isn't
in the same basic block and isn't a phi. And the call to LVI->getConstant
for the select in processSelect didn't return a constant. I think this is
because the code executed for getConstant doesn't handle icmp even when it
can prove the input is in a constant range. I tried removing the local
value check in processCmp so that getPredicateAt would called, but that
didn't help either.

> Best,
> - Stefanos
>
>
> Στις Δευ, 6 Απρ 2020 στις 3:10 π.μ., ο/η Craig Topper <
> craig.topper at gmail.com> έγραψε:
>
>> Adding a nuw to the add -8 is incorrect. From the perspective of the
>> unsigned math, -8 is treated a very large positive number. The input to
the
>> add is [8,13) and adding a large positive number to it wraps around
past 0.
>> So that is guaranteed unsigned wrap. On the other hand, a sub nuw 8
would
>> be correct.
>>
>> ~Craig
>>
>>
>> On Sun, Apr 5, 2020 at 3:27 PM Stefanos Baziotis via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Thanks, I didn't know that! Indeed, it's instruction
combine that does
>>> the job.
>>>
>>> - Stefanos
>>>
>>> Στις Δευ, 6 Απρ 2020 στις 12:38 π.μ., ο/η Florian Hahn <
>>> florian_hahn at apple.com> έγραψε:
>>>
>>>>
>>>>
>>>> > On Apr 5, 2020, at 22:20, Stefanos Baziotis <
>>>> stefanos.baziotis at gmail.com> wrote:
>>>> >
>>>> > > Any idea about how the compiler could remove the lshr
and use a add
>>>> -16?
>>>> > Actually, I just figured that doing this test is like
solving this:
>>>> >
>>>> > 8 <= x/2 <= 13
>>>> > 16 <= x <= 26
>>>> > 0 <= x - 16 <= 10 => 0 <= x < 11
>>>> > The left part is know since it's unsigned
>>>> > The right part could be done x <= 11 => x < 12
because it's actually
>>>> an integer division.
>>>> > Wow... I would be really happy to know what pass does
that.
>>>>
>>>> I’d guess a combination of instcombine rules together with some
other
>>>> transforms. You could probably use `-print-after-all` (`clang
-mllvm
>>>> -print-after-all` if you are using clang) to track down the
relevant
>>>> passes/steps.
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200405/df366aa8/attachment-0001.html>

Stefanos Baziotis via llvm-dev

2020-Apr-06 15:53 UTC

head link

[llvm-dev] Branch is not optimized because of right shift

Hi,
> But the value can't be [0,7] due to the earlier branch. When I said itwas guaranteed to wrap, I only meant for the range of values that were
possible after the first branch.
Indeed, for some reason I remembered that the left check is >= 0 when I
wrote this.

Thanks a lot for the breakdown in CorrelatedValuePropagation. I had no idea
about this pass.
> It was able to see that the input to add was in the range [8,14) in thecall to LVI->getConstantRange in processBinOp.

I could not reproduce that. I get:
BinOp: %4 = add nsw i32 %2, -8
LRange: [0,-2147483648)
RRange: [-8,-7)
> processCmp skips calling LVI for the select's icmp because the inputisn't in the same basic block and isn't a phi.
Do you maybe mean: _is_ in the same BB (and isn't a PHI).

By the way, I don't get the reasoning in the comment above:
// As a policy choice, we choose not to waste compile time on anything where
// the comparison is testing local values.
> I think this is because the code executed for getConstant doesn't
handleicmp even when it can prove the input is in a constant range.

Maybe we ended up on the same thing. I'm not sure I followed that correctly
but getValueFromICmpCondition() should have been able to handle that.

Best,
Stefanos


Στις Δευ, 6 Απρ 2020 στις 5:22 π.μ., ο/η Craig Topper <
craig.topper at gmail.com> έγραψε:
> On Sun, Apr 5, 2020 at 6:34 PM Stefanos Baziotis <
> stefanos.baziotis at gmail.com> wrote:
>
>> Hi Craig,
>>
>> > Adding a nuw to the add -8 is incorrect.
>> Yeah, I didn't mean to say it was correct. It was just an
observation
>> that with nuw the optimization was happened and I asked if someone
thought
>> it was somehow connected.
>>
>> > From the perspective of the unsigned math, -8 is treated a very
large
>> positive number. The input to the add is [8,13) and adding a large
positive
>> number to it wraps around past 0. So that is guaranteed unsigned wrap
>> I understand yes, but I don't think it is guaranteed. Unless I miss
>> something, for values in [0, 7] it won't wrap. But past that and up
to (and
>> including in the original source code) 13, it will wrap yes.
>>
>
> But the value can't be [0,7] due to the earlier branch. When I said it
was
> guaranteed to wrap, I only meant for the range of values that were possible
> after the first branch.
>
> In theory, the CorrelatedValuePropagation pass should have been able to
> optimize the select. It was able to see that the input to add was in the
> range [8,14) in the call to LVI->getConstantRange in processBinOp.
> processCmp skips calling LVI for the select's icmp because the input
isn't
> in the same basic block and isn't a phi. And the call to
LVI->getConstant
> for the select in processSelect didn't return a constant. I think this
is
> because the code executed for getConstant doesn't handle icmp even when
it
> can prove the input is in a constant range. I tried removing the local
> value check in processCmp so that getPredicateAt would called, but that
> didn't help either.
>
>
>> Best,
>> - Stefanos
>>
>>
>> Στις Δευ, 6 Απρ 2020 στις 3:10 π.μ., ο/η Craig Topper <
>> craig.topper at gmail.com> έγραψε:
>>
>>> Adding a nuw to the add -8 is incorrect. From the perspective of
the
>>> unsigned math, -8 is treated a very large positive number. The
input to the
>>> add is [8,13) and adding a large positive number to it wraps around
past 0.
>>> So that is guaranteed unsigned wrap. On the other hand, a sub nuw 8
would
>>> be correct.
>>>
>>> ~Craig
>>>
>>>
>>> On Sun, Apr 5, 2020 at 3:27 PM Stefanos Baziotis via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Thanks, I didn't know that! Indeed, it's instruction
combine that does
>>>> the job.
>>>>
>>>> - Stefanos
>>>>
>>>> Στις Δευ, 6 Απρ 2020 στις 12:38 π.μ., ο/η Florian Hahn <
>>>> florian_hahn at apple.com> έγραψε:
>>>>
>>>>>
>>>>>
>>>>> > On Apr 5, 2020, at 22:20, Stefanos Baziotis <
>>>>> stefanos.baziotis at gmail.com> wrote:
>>>>> >
>>>>> > > Any idea about how the compiler could remove the
lshr and use a
>>>>> add -16?
>>>>> > Actually, I just figured that doing this test is like
solving this:
>>>>> >
>>>>> > 8 <= x/2 <= 13
>>>>> > 16 <= x <= 26
>>>>> > 0 <= x - 16 <= 10 => 0 <= x < 11
>>>>> > The left part is know since it's unsigned
>>>>> > The right part could be done x <= 11 => x <
12 because it's actually
>>>>> an integer division.
>>>>> > Wow... I would be really happy to know what pass does
that.
>>>>>
>>>>> I’d guess a combination of instcombine rules together with
some other
>>>>> transforms. You could probably use `-print-after-all`
(`clang -mllvm
>>>>> -print-after-all` if you are using clang) to track down the
relevant
>>>>> passes/steps.
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200406/be5ff004/attachment.html>

Florian Hahn via llvm-dev

2020-Jul-11 18:12 UTC

head link

[llvm-dev] Branch is not optimized because of right shift

Hi, 

I think this should be optimized as expected on latest trunk:
https://godbolt.org/z/vW7K7f

Cheers
> On Apr 6, 2020, at 16:53, Stefanos Baziotis <stefanos.baziotis at
gmail.com> wrote:
> 
> Hi,
> 
> > But the value can't be [0,7] due to the earlier branch. When I
said it was guaranteed to wrap, I only meant for the range of values that were
possible after the first branch.
> Indeed, for some reason I remembered that the left check is >= 0 when I
wrote this.
> 
> Thanks a lot for the breakdown in CorrelatedValuePropagation. I had no idea
about this pass.
> 
> > It was able to see that the input to add was in the range [8,14) in
the call to LVI->getConstantRange in processBinOp.
> 
> I could not reproduce that. I get:
> BinOp: %4 = add nsw i32 %2, -8
> LRange: [0,-2147483648)
> RRange: [-8,-7)
> 
> > processCmp skips calling LVI for the select's icmp because the
input isn't in the same basic block and isn't a phi.
> Do you maybe mean: _is_ in the same BB (and isn't a PHI).
> 
> By the way, I don't get the reasoning in the comment above:
> // As a policy choice, we choose not to waste compile time on anything
where
> // the comparison is testing local values.
> 
> > I think this is because the code executed for getConstant doesn't
handle icmp even when it can prove the input is in a constant range.
> 
> Maybe we ended up on the same thing. I'm not sure I followed that
correctly but getValueFromICmpCondition() should have been able to handle that.
> 
> Best,
> Stefanos
> 
> 
> Στις Δευ, 6 Απρ 2020 στις 5:22 π.μ., ο/η Craig Topper <craig.topper at
gmail.com <mailto:craig.topper at gmail.com>> έγραψε:
> On Sun, Apr 5, 2020 at 6:34 PM Stefanos Baziotis <stefanos.baziotis at
gmail.com <mailto:stefanos.baziotis at gmail.com>> wrote:
> Hi Craig,
> 
> > Adding a nuw to the add -8 is incorrect.
> Yeah, I didn't mean to say it was correct. It was just an observation
that with nuw the optimization was happened and I asked if someone thought it
was somehow connected.
> 
> > From the perspective of the unsigned math, -8 is treated a very large
positive number. The input to the add is [8,13) and adding a large positive
number to it wraps around past 0. So that is guaranteed unsigned wrap
> I understand yes, but I don't think it is guaranteed. Unless I miss
something, for values in [0, 7] it won't wrap. But past that and up to (and
including in the original source code) 13, it will wrap yes.
> 
> But the value can't be [0,7] due to the earlier branch. When I said it
was guaranteed to wrap, I only meant for the range of values that were possible
after the first branch.
>  
> In theory, the CorrelatedValuePropagation pass should have been able to
optimize the select. It was able to see that the input to add was in the range
[8,14) in the call to LVI->getConstantRange in processBinOp. processCmp skips
calling LVI for the select's icmp because the input isn't in the same
basic block and isn't a phi. And the call to LVI->getConstant for the
select in processSelect didn't return a constant. I think this is because
the code executed for getConstant doesn't handle icmp even when it can prove
the input is in a constant range. I tried removing the local value check in
processCmp so that getPredicateAt would called, but that didn't help either.
> 
> 
> Best,
> - Stefanos
> 
> 
> Στις Δευ, 6 Απρ 2020 στις 3:10 π.μ., ο/η Craig Topper <craig.topper at
gmail.com <mailto:craig.topper at gmail.com>> έγραψε:
> Adding a nuw to the add -8 is incorrect. From the perspective of the
unsigned math, -8 is treated a very large positive number. The input to the add
is [8,13) and adding a large positive number to it wraps around past 0. So that
is guaranteed unsigned wrap. On the other hand, a sub nuw 8 would be correct.
> 
> ~Craig
> 
> 
> On Sun, Apr 5, 2020 at 3:27 PM Stefanos Baziotis via llvm-dev <llvm-dev
at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> Thanks, I didn't know that! Indeed, it's instruction combine that
does the job.
> 
> - Stefanos
> 
> Στις Δευ, 6 Απρ 2020 στις 12:38 π.μ., ο/η Florian Hahn <florian_hahn at
apple.com <mailto:florian_hahn at apple.com>> έγραψε:
> 
> 
> > On Apr 5, 2020, at 22:20, Stefanos Baziotis <stefanos.baziotis at
gmail.com <mailto:stefanos.baziotis at gmail.com>> wrote:
> > 
> > > Any idea about how the compiler could remove the lshr and use a
add -16?
> > Actually, I just figured that doing this test is like solving this:
> > 
> > 8 <= x/2 <= 13
> > 16 <= x <= 26
> > 0 <= x - 16 <= 10 => 0 <= x < 11
> > The left part is know since it's unsigned
> > The right part could be done x <= 11 => x < 12 because
it's actually an integer division.
> > Wow... I would be really happy to know what pass does that.
> 
> I’d guess a combination of instcombine rules together with some other
transforms. You could probably use `-print-after-all` (`clang -mllvm
-print-after-all` if you are using clang) to track down the relevant
passes/steps.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200711/d8362118/attachment.html>

llvm dev - Jul 2020 - Branch is not optimized because of right shift

[llvm-dev] Branch is not optimized because of right shift

[llvm-dev] Branch is not optimized because of right shift

[llvm-dev] Branch is not optimized because of right shift