Jonas Paulsson via llvm-dev
2019-Feb-08 17:20 UTC
[llvm-dev] Unfolded additions of constants after promotion of @llvm.ctlz.i16 on SystemZ
Hi,

SystemZ supports @llvm.ctlz.i64() natively with a single instruction (FLOGR), and lesser bitwidth versions of the intrinsic are promoted to i64.

For some reason, this leads to unfolded additions of constants as shown below.

This function:

    define i16 @fun(i16 %arg) {
      %1 = tail call i16 @llvm.ctlz.i16(i16 %arg, i1 false)
      ret i16 %1
    }

gives this optimized DAG as input to instruction selection:

    SelectionDAG has 15 nodes:
      t0: ch = EntryToken
      t2: i32,ch = CopyFromReg t0, Register:i32 %0
      t10: i32 = and t2, Constant:i32<65535>
      t16: i64 = zero_extend t10
      t17: i64 = ctlz t16
      t22: i64 = add t17, Constant:i64<-32>
      t20: i32 = truncate t22
      t15: i32 = add t20, Constant:i32<-16>
      t7: ch,glue = CopyToReg t0, Register:i32 $r2l, t15
      t8: ch = SystemZISD::RET_FLAG t7, Register:i32 $r2l, t7:1

It seems that SelectionDAG::computeKnownBits() has a case for ISD::CTLZ, and it seems to figure out that the high bits of t17 are zero, as expected.

t17 is guaranteed to have a value between 48 and 64, so there could not be any overflow here, even though I am not sure if that's the problem or not... Should DAGCombiner::visitADD() handle this, or perhaps visitTRUNCATE()?

Thanks for any help,

Jonas
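For readers following the arithmetic, here is a minimal Python sketch (not LLVM code) that models the DAG above node by node and compares it with a version where the two constant additions are merged into a single add of -48. The helper names (ctlz, promoted_ctlz16, folded_ctlz16) are hypothetical and exist only for this illustration; the model assumes plain two's-complement wraparound at 64 and 32 bits.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def ctlz(value, width):
    """Count leading zeros of `value` in a `width`-bit word; ctlz(0) == width."""
    return width - value.bit_length()


def promoted_ctlz16(arg):
    """Follow the DAG from the mail: and/zext, ctlz i64, add -32, trunc, add -16."""
    t16 = arg & 0xFFFF            # t10/t16: mask to 16 bits, zero-extend to i64
    t17 = ctlz(t16, 64)           # t17: i64 ctlz
    t22 = (t17 - 32) & MASK64     # t22: i64 add of -32 (wraps at 64 bits)
    t20 = t22 & MASK32            # t20: truncate to i32
    t15 = (t20 - 16) & MASK32     # t15: i32 add of -16 (wraps at 32 bits)
    return t15


def folded_ctlz16(arg):
    """Hypothetical folded form: truncate first, then one add of -48."""
    t17 = ctlz(arg & 0xFFFF, 64)
    return ((t17 & MASK32) - 48) & MASK32


# Exhaustively compare both forms against a direct 16-bit ctlz.
mismatches = [
    a for a in range(1 << 16)
    if not (promoted_ctlz16(a) == folded_ctlz16(a) == ctlz(a, 16))
]
t17_values = [ctlz(a, 64) for a in range(1 << 16)]
print(len(mismatches), min(t17_values), max(t17_values))
```

The exhaustive loop covers all 65536 inputs, so it also confirms the range of t17 stated in the mail (48 for inputs with the top bit set, 64 for zero).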
Sanjay Patel via llvm-dev
2019-Feb-08 21:23 UTC
[llvm-dev] Unfolded additions of constants after promotion of @llvm.ctlz.i16 on SystemZ
If I'm seeing it correctly, (part of?) the fold you're looking for is here:
https://reviews.llvm.org/rL350006

...but it's restricted to pre-legalization. I don't remember exactly what the problem was allowing that fold post-legalization, but maybe you can loosen that restriction?
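As a rough illustration of why a fold of this kind is arithmetically safe, the sketch below assumes the fold has the general shape trunc(op(X, C)) -> op(trunc(X), trunc(C)); that shape is an assumption made here, not a quotation of the patch. The opcode list is illustrative only, and all names (OPS, trunc, wide_then_trunc, trunc_then_narrow, fold_is_sound) are hypothetical. It models integers as wrapping 64-bit and 32-bit values in Python and says nothing about legality or profitability on a given target.

```python
import operator
import random

# Illustrative opcode set (an assumption, not taken from the patch).
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def trunc(value, bits):
    """Keep the low `bits` bits, i.e. wrap to an unsigned `bits`-bit value."""
    return value & ((1 << bits) - 1)


def wide_then_trunc(op, x, c, wide=64, narrow=32):
    """Compute op in the wide type, then truncate: trunc(op(X, C))."""
    return trunc(trunc(op(x, c), wide), narrow)


def trunc_then_narrow(op, x, c, narrow=32):
    """Truncate the operands first, then compute op in the narrow type."""
    return trunc(op(trunc(x, narrow), trunc(c, narrow)), narrow)


def fold_is_sound(trials=10000, seed=0):
    """Randomly test that both orders agree for every opcode in OPS."""
    rng = random.Random(seed)
    for op in OPS.values():
        for _ in range(trials):
            x = rng.getrandbits(64)
            c = rng.getrandbits(64)
            if wide_then_trunc(op, x, c) != trunc_then_narrow(op, x, c):
                return False
    return True


print(fold_is_sound())
```

The two orders agree because the low 32 bits of these operations depend only on the low 32 bits of their operands. Operations such as right shifts or division do not have this property, which is why they are left out of the sketch.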
Jonas Paulsson via llvm-dev
2019-Feb-12 03:17 UTC
[llvm-dev] Unfolded additions of constants after promotion of @llvm.ctlz.i16 on SystemZ
Hi Sanjay,

> If I'm seeing it correctly, (part of?) the fold you're looking for is
> here:
> https://reviews.llvm.org/rL350006
>
> ...but it's restricted to pre-legalization.
> I don't remember exactly what the problem was allowing that fold
> post-legalization, but maybe you can loosen that restriction?

Thanks! I tried just to remove the !LegalOperations condition (DAGCombiner.cpp:10056), and indeed my problem was solved.

Doing this on SystemZ (for all of the opcodes) did not affect SPEC that much. Opcode counts (trunk to left):

    Opcode           trunk    patched    diff
    aghi         :   38759      38742     -17
    ahi          :   34921      34936     +15
    risbgn       :   37104      37092     -12
    nill         :    2172       2183     +11
    lr           :   29731      29735      +4
    sr           :    6055       6059      +4
    srk          :    3743       3741      -2
    lhi          :   89566      89568      +2
    risblg       :    6528       6529      +1
    la           :  192375     192374      -1
    Spill|Reload :  189670     189670      +0

So, to me it seems this could be the default on SystemZ at least.

/Jonas