On 5/16/18 1:58 PM, Sanjay Patel via llvm-dev wrote:
> An informal metric might be: if the operation is supported as a
> primitive op or built-in in source languages and it is supported as a
> single target instruction, can we guarantee that 1-to-1 translation
> through optimization?

It seems perfectly reasonable for LLVM users to expect this to happen
reliably.

I'd like to take a look at the other side of the equation: the cost of
adding a new intrinsic in terms of teaching passes to see through it, so
we don't lose optimizations that worked before the intrinsic was added.

For example, clearly ValueTracking needs a few lines added so that
computeKnownBits and friends don't get stopped by a rotate. Anyone have
a reasonably complete list of files that need similar changes?

John
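To make the ValueTracking point concrete, here is a minimal sketch of why
a rotate by a constant amount need not stop a known-bits analysis: every
known bit simply moves to a new position. The KnownBits32 struct and the
knownBitsOfRotl helper are hypothetical names invented for this
illustration; they are not LLVM's KnownBits class or its computeKnownBits
code, and the sketch only assumes a fixed 32-bit width and a constant
rotate amount.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a known-bits record (NOT LLVM's KnownBits).
struct KnownBits32 {
    std::uint32_t zero; // bits known to be 0
    std::uint32_t one;  // bits known to be 1
};

// Rotate left with the amount reduced modulo 32, so no shift by 32 occurs.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned amt) {
    amt &= 31u;
    return (x << amt) | (x >> ((32u - amt) & 31u));
}

// For a rotate by a constant, rotate both masks by the same amount:
// a bit known at position i becomes known at position (i + amt) mod 32.
constexpr KnownBits32 knownBitsOfRotl(KnownBits32 in, unsigned amt) {
    return KnownBits32{rotl32(in.zero, amt), rotl32(in.one, amt)};
}

// Usage: top 24 bits known zero, low 4 bits known one, rotated left by 8.
inline void exampleUsage() {
    const KnownBits32 in{0xFFFFFF00u, 0x0000000Fu};
    const KnownBits32 out = knownBitsOfRotl(in, 8);
    assert(out.zero == 0xFFFF00FFu);
    assert(out.one == 0x00000F00u);
}
```

A variable rotate amount would need a more conservative rule; this sketch
does not attempt that case.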
What do we want to do about masking the rotate amount on this proposed
intrinsic? Do we need an explicit AND to keep it in bounds?

X86 can delete the AND during isel, since the hardware is well behaved
for out-of-range values. For 8/16-bit rotates the hardware masks the
amount to only 5 bits for the purpose of computing flags, but the data
result is still rotated modulo the bit width. Since we don't use the
flags from rotates, we can remove the mask. But if the mask is explicit
in IR, then LICM might hoist it, and isel won't see it to remove it.

~Craig

On Wed, May 16, 2018 at 1:21 PM John Regehr via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> I'd like to take a look at the other side of the equation: the cost of
> adding a new intrinsic in terms of teaching passes to see through it, so
> we don't lose optimizations that worked before the intrinsic was added.
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
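For readers following the masking question, the source-level shape under
discussion can be sketched as below. This is only an illustration under
stated assumptions: rotl32_masked and rotate_all are hypothetical names,
the width is fixed at 32 bits, and nothing here claims to show what any
particular LLVM pass actually does with the code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rotate-left written with an explicit AND on the amount, so the shifts
// are always in range regardless of what the caller passes.
constexpr std::uint32_t rotl32_masked(std::uint32_t x, std::uint32_t amt) {
    const std::uint32_t n = amt & 31u;            // the explicit mask
    return (x << n) | (x >> ((32u - n) & 31u));   // never shifts by 32
}

// The amount is loop-invariant here, so `amt & 31u` is too. A compiler is
// free to compute the mask once before the loop, which separates it from
// the shifts it guards; that separation is the concern raised above.
inline void rotate_all(std::uint32_t *data, std::size_t len,
                       std::uint32_t amt) {
    for (std::size_t i = 0; i < len; ++i)
        data[i] = rotl32_masked(data[i], amt);
}

// Usage: out-of-range amounts behave as the amount modulo 32.
inline void exampleUsage() {
    assert(rotl32_masked(0x80000001u, 1) == 0x00000003u);
    assert(rotl32_masked(0x80000001u, 33) == 0x00000003u);
    assert(rotl32_masked(0x12345678u, 32) == 0x12345678u);
}
```

An intrinsic whose semantics define the amount as taken modulo the bit
width would make the explicit AND unnecessary in the first place.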
A rotate intrinsic should be relatively close in cost/complexity to the
existing bswap.

A grep of Intrinsic::bswap says we'd probably add code in:
InstCombine
InstructionSimplify
ConstantFolding
DemandedBits
ValueTracking
VectorUtils
SelectionDAGBuilder

But I don't think it's fair to view those additions as pure added cost.
As an example, consider that we have to add hacks to EarlyCSE to
recognize multi-IR-instruction min/max/abs patterns. Intrinsics just
work as-is there. So if you search for 'matchSelectPattern', you get an
idea (I see 32 hits in 10 files) of the cost of *not* having intrinsics
for those operations that we've decided are not worthy of intrinsics.

On Wed, May 16, 2018 at 2:20 PM, John Regehr via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> For example, clearly ValueTracking needs a few lines added so that
> computeKnownBits and friends don't get stopped by a rotate. Anyone have a
> reasonably complete list of files that need similar changes?
Thanks Sanjay! At this point the cost/benefit tradeoff for rotate
intrinsics seems pretty good.

John

On 05/17/2018 11:14 AM, Sanjay Patel wrote:
> A rotate intrinsic should be relatively close in cost/complexity to the
> existing bswap.