thr3ads.net - llvm dev - [llvm-dev] InstCombine wrongful (?) optimization on BinOp with SameOperands [Sep 2015]

Nicolas Brunie via llvm-dev

2015-Sep-30 06:01 UTC

[llvm-dev] InstCombine wrongful (?) optimization on BinOp with SameOperands

Hi all,
     I have been looking at the way LLVM optimizes code before
forwarding it to the backend I develop for my company, and while building
the following function:
define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 {
entry:
   %conv = zext i32 %x to i64
   %conv1 = zext i32 %y to i64
   %mul = mul nuw i64 %conv1, %conv
   %shr = lshr i64 %mul, 32
   %xor = xor i64 %shr, %mul
   %conv2 = trunc i64 %xor to i32
   ret i32 %conv2
}

I came upon the following optimization (during instcombine):
IC: Visiting:   %mul = mul nuw i64 %conv, %conv1
IC: Visiting:   %shr = lshr i64 %mul, 32
IC: Visiting:   %conv2 = trunc i64 %shr to i32
IC: Visiting:   %conv3 = trunc i64 %mul to i32
IC: Visiting:   %xor = xor i32 %conv3, %conv2
IC: ADD:   %xor6 = xor i64 %mul, %shr
IC: Old =   %xor = xor i32 %conv3, %conv2
     New =   <badref> = trunc i64 %xor6 to i32
which seems to be performed by SDValue 
DAGCombiner::SimplifyBinOpWithSameOpcodeHands(SDNode *N)
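
For readers following the trace, the rewrite from `xor i32 (trunc a), (trunc b)` to `trunc (xor i64 a, b)` is value-preserving because xor works bit by bit, so the low 32 bits of the result depend only on the low 32 bits of the operands. The sketch below is only an illustration of that equivalence in C++, not the code InstCombine runs; all function names in it are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Narrow form: truncate both 64-bit values, then xor in 32 bits.
std::uint32_t xorOfTruncs(std::uint64_t a, std::uint64_t b) {
    return static_cast<std::uint32_t>(a) ^ static_cast<std::uint32_t>(b);
}

// Wide form: xor in 64 bits, then truncate once.
std::uint32_t truncOfXor(std::uint64_t a, std::uint64_t b) {
    return static_cast<std::uint32_t>(a ^ b);
}

// C++ rendering of test_extract_subreg_func from the message:
// xor the high and low halves of the 64-bit product of x and y.
std::uint32_t testExtractSubreg(std::uint32_t x, std::uint32_t y) {
    std::uint64_t mul = static_cast<std::uint64_t>(y) * static_cast<std::uint64_t>(x);
    std::uint64_t shr = mul >> 32;
    return static_cast<std::uint32_t>(shr ^ mul);
}

// Hypothetical self-check: both forms agree on a sample pair.
void checkSamples() {
    const std::uint64_t a = 0x123456789ABCDEF0ULL;
    const std::uint64_t b = 0x0FEDCBA987654321ULL;
    assert(xorOfTruncs(a, b) == truncOfXor(a, b));
}
```

Both forms return the same value for any inputs; which one is cheaper depends on the target, which is the question raised in the rest of the message.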

In my backend's architecture truncate is free, but zext is not (and i64 
is not a desirable type for xor or any binary operation in general), so 
I would expect this optimization to be bypassed but because of the 
following statement   :
(N0.getOpcode() == ISD::TRUNCATE && (!TLI.isZExtFree(VT, Op0VT)  || 
!TLI.isTruncateFree(Op0VT, VT))
it is not (as isZExtFree returns false for my architecture while
isTruncateFree returns true). The comment on binop simplification says
that a binop over truncs should be optimized only if trunc is not free, so
I do not understand the point of adding !isZExtFree here.
Can someone enlighten me on this optimization?
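
To make the question concrete, here is a minimal model of just the quoted fragment, assuming the two hooks are reduced to plain booleans. `TargetCosts`, `quotedTruncClause` and `truncOnlyClause` are hypothetical names for this sketch; the real condition in DAGCombiner has further terms that are not shown in the message and are not modelled here.

```cpp
#include <cassert>

// Hypothetical stand-in for the answers a target gives to
// isZExtFree and isTruncateFree for the type pair in question.
struct TargetCosts {
    bool zextFree;
    bool truncFree;
};

// The fragment as quoted: !isZExtFree || !isTruncateFree.
bool quotedTruncClause(const TargetCosts &t) {
    return !t.zextFree || !t.truncFree;
}

// The reading suggested by the comment: fire only when trunc is not free.
bool truncOnlyClause(const TargetCosts &t) {
    return !t.truncFree;
}

// Hypothetical self-check for the target described in the message
// (zext not free, trunc free).
void checkDescribedTarget() {
    const TargetCosts target{false, true};
    assert(quotedTruncClause(target));   // the clause holds, so the combine can fire
    assert(!truncOnlyClause(target));    // the trunc-only reading would skip it
}
```

With zext not free and trunc free, the quoted clause evaluates to true while the trunc-only reading evaluates to false, which is the discrepancy the message asks about.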

best regards,
Nicolas Brunie

Hal Finkel via llvm-dev

2015-Oct-26 17:40 UTC

----- Original Message -----
> From: "Nicolas Brunie via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Wednesday, September 30, 2015 1:01:52 AM
> Subject: [llvm-dev] InstCombine wrongful (?) optimization on BinOp with SameOperands
> 
> 
> Hi all,
> I have been looking at the way LLVM optimizes code before forwarding
> it to the backend I develop for my company and while building
> define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 {
> entry:
> %conv = zext i32 %x to i64
> %conv1 = zext i32 %y to i64
> %mul = mul nuw i64 %conv1, %conv
> %shr = lshr i64 %mul, 32
> %xor = xor i64 %shr, %mul
> %conv2 = trunc i64 %xor to i32
> ret i32 %conv2
> }
> 
> I came upon the following optimization (during instcombine):
> IC: Visiting: %mul = mul nuw i64 %conv, %conv1
> IC: Visiting: %shr = lshr i64 %mul, 32
> IC: Visiting: %conv2 = trunc i64 %shr to i32
> IC: Visiting: %conv3 = trunc i64 %mul to i32
> IC: Visiting: %xor = xor i32 %conv3, %conv2
> IC: ADD: %xor6 = xor i64 %mul, %shr
> IC: Old = %xor = xor i32 %conv3, %conv2
> New = <badref> = trunc i64 %xor6 to i32
> 
> which seems to be performed by SDValue
> DAGCombiner::SimplifyBinOpWithSameOpcodeHands(SDNode *N)

You might have figured this out by now, but no, InstCombine and DAGCombine are
two completely different pieces of code. One is driven by the code in
lib/Transforms/InstCombine/* and the other in
lib/CodeGen/SelectionDAG/DAGCombiner.cpp. InstCombine's job is to move the
IR toward our chosen canonical form, which is designed to simplify operations in
a way that exposes further optimization opportunities (as well as being
generally beneficial). It does not take target costs into account.

> 
> In my backend's architecture truncate is free, but zext is not (and
> i64 is not a desirable type for xor or any binary operation in
> general),
Why, then, have you listed i64 as a legal type?

 -Hal
> so I would expect this optimization to be bypassed but
> because of the following statement :
> (N0.getOpcode() == ISD::TRUNCATE && (!TLI.isZExtFree(VT, Op0VT) ||
> !TLI.isTruncateFree(Op0VT, VT))
> it is not (as isZExtFree return false for my architecture while
> isTruncateFree returns true). The comment on binop simplification
> says that binop over truncs should be optimize only if trunc is not
> free, so I do not understand the point of adding !isZExtFree at this
> point.
> Can someone enlighten my ignorance on this optimization ?
> 
> best regards,
> Nicolas Brunie
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Nicolas Brunie via llvm-dev

2015-Oct-26 19:05 UTC

----- Original Message -----
From: "Hal Finkel" <hfinkel at anl.gov>
To: "Nicolas Brunie" <nicolas.brunie at kalray.eu>
Cc: llvm-dev at lists.llvm.org
Sent: Monday, October 26, 2015 18:40:54
Subject: Re: [llvm-dev] InstCombine wrongful (?) optimization on BinOp with SameOperands

You might have figured this out by now, but no, InstCombine and DAGCombine are
two completely different pieces of code. One is driven by the code in
lib/Transforms/InstCombine/* and the other in
lib/CodeGen/SelectionDAG/DAGCombiner.cpp. InstCombine's job is to move the
IR toward our chosen canonical form, which is designed to simplify operations in
a way that exposes further optimization opportunities (as well as being
generally beneficial). It does not take target costs into account.


Yes indeed, I went through the LLVM sources and discovered my mistake. Your
explanation still helps my understanding, thank you.
> 
> In my backend's architecture truncate is free, but zext is not (and
> i64 is not a desirable type for xor or any binary operation in
> general),
Why, then, have you listed i64 as a legal type?

Because for operations such as mul, add, and in fact xor, the target does
support i64; it is just more costly than i32: the target is a VLIW which
can do two 32b adds or a single 64b one each cycle.
So when possible I would like LLVM to forward the information it gathers about
how a result is used: i.e. if the 32 MSBs of an i64 result are not used, it
would be better to perform only the 32b operation, and to apply this
optimization recursively through the 64b DAG until reaching a node whose full
64 bits are actually required.
  It may well be that I did not describe my target correctly to LLVM and thus
the 64b DAG is not simplified to 32b. I was under the impression that I should
declare i32 as a "preferred" type for these operations and i64 as
legal, because I do not want i64 operations to be legalized/expanded, just
simplified (but maybe this is the point of the "legal" declaration).
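
The narrowing described above rests on an arithmetic fact: for add, mul and xor, the low 32 bits of a 64-bit result depend only on the low 32 bits of the operands. The following is a minimal sketch of that fact in C++, with hypothetical function names; it says nothing about how LLVM would or would not perform such a narrowing for a given target. It also shows why the mul in the example function cannot be narrowed: the lshr by 32 reads the upper half of the product.

```cpp
#include <cassert>
#include <cstdint>

// Low 32 bits of a 64-bit add, computed wide and computed narrow.
std::uint32_t low32AddWide(std::uint64_t a, std::uint64_t b) {
    return static_cast<std::uint32_t>(a + b);
}
std::uint32_t low32AddNarrow(std::uint64_t a, std::uint64_t b) {
    return static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
}

// Same pair for mul: unsigned arithmetic wraps modulo 2^32 and 2^64.
std::uint32_t low32MulWide(std::uint64_t a, std::uint64_t b) {
    return static_cast<std::uint32_t>(a * b);
}
std::uint32_t low32MulNarrow(std::uint64_t a, std::uint64_t b) {
    return static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
}

// Counter-case: the high half of a product needs the full 64-bit multiply.
std::uint32_t high32OfMul(std::uint32_t x, std::uint32_t y) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(x) * static_cast<std::uint64_t>(y)) >> 32);
}

// Hypothetical self-check on values whose upper halves are non-zero.
void checkNarrowing() {
    const std::uint64_t a = 0x0000000100000003ULL;
    const std::uint64_t b = 0x0000000200000005ULL;
    assert(low32AddWide(a, b) == low32AddNarrow(a, b));
    assert(low32MulWide(a, b) == low32MulNarrow(a, b));
}
```

In the function from the first message, only the final xor has unused upper bits, so under this reasoning it is the one operation that could run in 32 bits.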

Thank you very much for digging up this thread, and for the info.

Regards,
Nicolas
