Sam Parker via llvm-dev
2020-Apr-28 15:13 UTC
[llvm-dev] [RFC] Being explicit in the cost model
Hi,

I have been working in the TargetTransformInfo layer for the last few weeks with several ongoing goals:

> Remove ambiguity.
> Improve cost modelling performance, accuracy and utility.
> Help empower the backends.
> Reduce the surface area of the API.
> Reduce the dependencies between the layers - of which there are many!

My latest patch is an NFC to help address the issue of ambiguity; I have uploaded it here: https://reviews.llvm.org/D79002. It's a biggy, adding TargetCostKind as an argument to almost all the get*Cost methods. My hope is that, as well as clarity, it will allow some backends to re-use and correlate costs if they wish. It would also allow adding a hook to ask the backend which cost kind is the most important to it, which could then be passed around. But as the patch currently stands, even as an NFC, it could still lead users astray because nothing else is set up to differentiate and return a specific cost.

I'd like to hear from the backend people about what kind of costs they were modelling and how we should handle the 'default' behaviour. For instance, whether they're happy to return the same cost for all cost kinds, or whether a check should be performed on the cost kind first, returning something specific only if it's the expected cost kind.

There's also the issue of the value of default arguments - I've used RecipThroughput for the calls used by the vectorizers and SizeAndLatency for the rest, but I wonder if default values should be used at all.

Regards,
Sam

Sam Parker
Compilation Tools Engineer | Arm
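To make the two 'default' behaviours asked about above concrete, here is a minimal standalone sketch. It is not the LLVM API: `CostKind`, `Op`, `costSameForAllKinds` and `costForModelledKind` are hypothetical names, and the cost numbers are invented for illustration. The first function returns one number whatever kind is requested; the second answers only for the kind its table is assumed to have been tuned for.

```cpp
#include <cassert>
#include <optional>

// Toy mirror of the four cost kinds discussed in the thread.
// This is a standalone illustration, not the LLVM header.
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// Hypothetical opcode set, for illustration only.
enum class Op { Add, Mul, Div };

// Option A: the backend ignores the requested kind and returns the same
// (invented) number for every cost kind.
int costSameForAllKinds(Op op, CostKind /*kind*/) {
  switch (op) {
  case Op::Add: return 1;
  case Op::Mul: return 2;
  case Op::Div: return 10;
  }
  assert(false && "unhandled Op");
  return 0;
}

// Option B: the backend checks the kind first and only answers for the one
// its table is assumed to model (reciprocal throughput here). Any other
// kind yields "no answer", so the caller has to decide what to do.
std::optional<int> costForModelledKind(Op op, CostKind kind) {
  if (kind != CostKind::RecipThroughput)
    return std::nullopt;
  return costSameForAllKinds(op, kind);
}
```

With Option A a caller asking for CodeSize silently receives a throughput-flavoured number; with Option B the mismatch is visible at the call site, at the price of every caller needing a fallback.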
Finkel, Hal J. via llvm-dev
2020-Apr-29 18:20 UTC
[llvm-dev] [RFC] Being explicit in the cost model
Hi, Sam,

Thanks for sending this out and I think that it's great that this is being worked on.

It looks like we currently have this:

    enum TargetCostKind {
      TCK_RecipThroughput, ///< Reciprocal throughput.
      TCK_Latency,         ///< The latency of instruction.
      TCK_CodeSize,        ///< Instruction code size.
      TCK_SizeAndLatency   ///< The weighted sum of size and latency.
    };

Three of the four of these are measurable quantities, at least in an average sense, independent of their use in the optimizer. I think that's a good thing, in part because we can continue to create good tools to measure them.

TCK_SizeAndLatency, however, is not like the others. Is there a way to set this value independent of all of its uses in the optimizer? I suspect that the answer is: no. Moreover, it's not just the use by a single transformation that matters, but use by all users in the pipeline. The only way to tune these costs is to run large test sets through the entire optimization pipeline.

While I understand that this reflects the current state of things, it seems less than ideal. There's no way that a single blended ratio of size and latency will be optimal for all potential users (inliner, unroller, PRE, etc.), or at least it seems very unlikely. The transformation also won't know what the right mixture is across all targets. If we really want to be able to tune these things, maintain stability as we evolve the optimizer, and understand what's going on, I think we'll want more flexibility in the system than this.

Also, the "size and latency" mixture is, as I recall, mostly measuring # of uops. Maybe we should understand how much it deviates from that and, if appropriate, call it that? We might also consider a system allowing for per-pass TTI modifications of the costs returned by the model, for example a TCK_Blend taking an explicit pass ID (and some discriminator code). This way a backend can return explicit heuristic costs for particular clients when there's really no other way to reason about the needed costs.
To your other questions: the vectorizers should certainly use recip throughput -- that's how they're designed. Targets do specifically return throughput costs for these separate from the costs used by the inliner. "size and latency" is the right default for the others (although, as I mention above, I think this might be better called TCK_MicroOps or similar).

It is certainly a good idea to be explicit everywhere about what costs are being requested. Please do continue with that.

Thanks,
Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
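The objection to a single size/latency ratio, and the per-client blend suggested in its place, can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: `Client`, `BlendWeights`, `weightsFor` and `blendedCost` are hypothetical names, the weights are invented, and nothing below is part of TargetTransformInfo.

```cpp
#include <cassert>

// Hypothetical clients of a blended cost. The thread names the inliner,
// the unroller and PRE as examples of users with different needs.
enum class Client { Inliner, Unroller, PRE };

// Integer weights for the two measurable inputs of the blend.
struct BlendWeights {
  int size;
  int latency;
};

// Invented per-client weights: a size-sensitive client leans on code size,
// a speed-sensitive one on latency.
BlendWeights weightsFor(Client c) {
  switch (c) {
  case Client::Inliner:  return {3, 1};
  case Client::Unroller: return {1, 3};
  case Client::PRE:      return {1, 1};
  }
  return {1, 1};
}

// Weighted sum of size and latency, with the ratio chosen per client
// instead of being fixed once for the whole pipeline.
int blendedCost(int sizeCost, int latencyCost, Client c) {
  assert(sizeCost >= 0 && latencyCost >= 0 && "costs are non-negative");
  const BlendWeights w = weightsFor(c);
  return w.size * sizeCost + w.latency * latencyCost;
}
```

For an instruction with size cost 2 and latency cost 4, these invented weights give 10 for the inliner and 14 for the unroller, which is the kind of divergence a single fixed ratio cannot express.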
Sam Parker via llvm-dev
2020-Apr-30 08:04 UTC
[llvm-dev] [RFC] Being explicit in the cost model
Hi Hal,

Thanks for the support. Yes, I actually only added SizeAndLatency at the start of the week to make it explicit when we're trying to use that vague cost and to highlight that it's falsely interchanged with CodeSize too.

On the naming, I took it from the comments in TTI, though I'm sure each backend has derived slightly different meanings. I'm not sure whether reporting micro-ops translates to all targets, certainly not the small microcontrollers that I work on where latency is key, and a micro-op cost seems more related to throughput to me. I'd happily receive more feedback, though, so that we can reach a consensus.

I think the way forward is, like you say, for the backend to be given the opportunity to report a cost for each transform. Ideally the backend can then choose which cost is important to it, as well as a suitable threshold, and be given a larger region of code if necessary. It would be great to move the ad-hoc cost modelling out of the analysis and transform passes, while they can always fall back to their original costing if a backend doesn't exist or doesn't return anything.

Thanks,
Sam

Sam Parker
Compilation Tools Engineer | Arm
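The "backend chooses, pass falls back" arrangement described in this reply could look roughly like the sketch below. It is only an illustration under stated assumptions: `Backend`, `adviceForUnroll`, `TransformAdvice`, `LatencyBoundBackend` and `unrollAdvice` are hypothetical names, the thresholds are invented, and no such hook is claimed to exist in LLVM.

```cpp
#include <cassert>
#include <optional>

// Toy mirror of the cost kinds from the thread (not the LLVM header).
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// What a backend might report for one transform: which cost kind matters
// to it and the threshold to compare against.
struct TransformAdvice {
  CostKind kind;
  int threshold;
};

// Hypothetical backend hook. The default implementation returns nothing,
// meaning "no opinion".
struct Backend {
  virtual ~Backend() = default;
  virtual std::optional<TransformAdvice> adviceForUnroll() const {
    return std::nullopt;
  }
};

// A made-up latency-bound target, e.g. a small microcontroller.
struct LatencyBoundBackend : Backend {
  std::optional<TransformAdvice> adviceForUnroll() const override {
    return TransformAdvice{CostKind::Latency, 40};
  }
};

// The pass keeps its original costing as the fallback, used when there is
// no backend or the backend returns nothing.
TransformAdvice unrollAdvice(const Backend *backend) {
  const TransformAdvice legacy{CostKind::SizeAndLatency, 150};
  const TransformAdvice advice =
      backend ? backend->adviceForUnroll().value_or(legacy) : legacy;
  assert(advice.threshold > 0 && "threshold must be positive");
  return advice;
}
```

Passing `nullptr` or a plain `Backend` yields the pass's own SizeAndLatency default, while `LatencyBoundBackend` switches the query to Latency with its own threshold.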