Jay Foad via llvm-dev
2020-Oct-22 14:37 UTC
[llvm-dev] SelectionDAG: target-specific simplification of generic nodes using demanded bits
All,

The AMDGPU target has a 24-bit multiply instruction. In SimplifyDemandedBits I'd like to be able to turn a generic i32 ISD::MUL into AMDGPUISD::MUL_I24 if only the low-order 24 bits are demanded. Currently it seems like there's no way to do that, because the target hook SimplifyDemandedBitsForTargetNode is only called for target-specific nodes, not for generic nodes like MUL.

Would it be acceptable to call the target hook for generic nodes as well? Here's a patch to show the general idea:
https://reviews.llvm.org/D89964
(It probably needs a bit of polish, e.g. the name "SimplifyDemandedBitsForTargetNode" is misleading now.)

In the future perhaps "demanded bits" could become a cached analysis that could be queried from anywhere. Then I could write a target-specific DAG combine for MUL that would query the demanded bits for the node to do this transformation.

Thanks,
Jay.
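For readers following the thread, here is a minimal standalone sketch of why the proposed rewrite is sound when only the low 24 bits are demanded. It does not use any LLVM API. The names `sext24`, `mulI24Model`, `mul32` and `agreeOnDemandedBits` are hypothetical, and the 24-bit multiply is modelled under the assumption that it sign-extends the low 24 bits of each operand and returns the low 32 bits of the product.

```cpp
#include <cstdint>

// Hypothetical helper: interpret the low 24 bits of v as a signed value.
constexpr int64_t sext24(uint32_t v) {
  return (v & 0x800000u)
             ? static_cast<int64_t>(v & 0xFFFFFFu) - 0x1000000
             : static_cast<int64_t>(v & 0xFFFFFFu);
}

// Assumed model of a 24-bit multiply: only the low 24 bits of each operand
// are read; the low 32 bits of the product are returned.
constexpr uint32_t mulI24Model(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(static_cast<uint64_t>(sext24(a) * sext24(b)));
}

// Ordinary 32-bit multiply, i.e. what a generic i32 multiply computes.
constexpr uint32_t mul32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(static_cast<uint64_t>(a) * b);
}

// True if the two multiplies produce the same value on every demanded bit.
constexpr bool agreeOnDemandedBits(uint32_t a, uint32_t b, uint32_t demanded) {
  return ((mul32(a, b) ^ mulI24Model(a, b)) & demanded) == 0;
}
```

With a demanded mask of 0x00FFFFFF the two results always agree, because the low 24 bits of a product depend only on the low 24 bits of the operands. With a wider mask they can differ (for example a = 0x01000000, b = 1), which is why the transformation needs the demanded-bits information rather than being a plain DAG combine.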
Simon Pilgrim via llvm-dev
2020-Oct-23 08:57 UTC
[llvm-dev] SelectionDAG: target-specific simplification of generic nodes using demanded bits
X86 would definitely benefit from this as we often use generic opcodes as part of more complex patterns that could then be handled through SimplifyDemandedBits/SimplifyDemandedVectorElts/SimplifyMultipleUse. Is this likely to slow compile times down?

I've tried but struggled to come up with an acceptable way to determine the accumulated demanded bits/elts across all users of a SDValue - I was always worried that caching would miss nodes that had recently been added/removed.

I've often thought it a shame that the standard combines don't include the demandedbits/elts masks directly instead of having a separate SimplifyDemanded* mechanism.

Simon.