Duan Bing via llvm-dev
2019-May-13 01:58 UTC
[llvm-dev] How shall I evaluate the latency of each instruction in LLVM IR?
Inspired by https://www.agner.org/optimize/instruction_tables.pdf, which gives the latency and reciprocal throughput of each instruction on the different x86 microarchitectures: is anybody working on a similar table for LLVM IR? Thanks!
Cranmer, Joshua via llvm-dev
2019-May-13 14:52 UTC
[llvm-dev] How shall I evaluate the latency of each instruction in LLVM IR?
There is no fixed latency or throughput for LLVM IR instructions. IR instructions are lowered to target instructions based on the target and the subtarget, and lowering can coalesce several IR instructions into a single target instruction or split one IR instruction into several target instructions. Latency and throughput are provided for target instructions in the *Schedule.td files, but there is no easy mapping between those instructions and LLVM IR. Some ad hoc estimation at the IR level is provided by TargetLowering and especially TargetTransformInfo, but these give only coarse estimates and do not provide complete coverage.
Matt Davis via llvm-dev
2019-May-13 15:04 UTC
[llvm-dev] How shall I evaluate the latency of each instruction in LLVM IR?
Hi Duan,

It sounds like you might be interested in LLVM’s Machine Code Analyzer (llvm-mca). It is an LLVM tool that can calculate reciprocal throughput by simulating an out-of-order instruction pipeline. It also provides other useful cycle information: https://llvm.org/docs/CommandGuide/llvm-mca.html

Your question asks about IR, and llvm-mca currently only works on assembly. However, you can easily generate assembly from IR, or make use of MCA as a library. I suggest just converting your IR to assembly and running that through llvm-mca.

MCA is built on LLVM's instruction scheduling information, which contains cycle latency information. If you’re curious about what that data looks like outside of MCA, take a peek at the X86 instruction scheduling information located around ‘llvm/lib/Target/X86/X86Schedule.td’.

-Matt
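The IR-to-assembly-to-llvm-mca route suggested above could look roughly like the sketch below. The file names, the toy function, and the choice of `skylake` as the CPU model are all illustrative assumptions, not something from this thread; it also assumes `llc` and `llvm-mca` from an X86-enabled LLVM build are on the PATH, and skips the analysis otherwise.

```shell
#!/bin/sh
# Hypothetical toy input: add_mul.ll and its contents are made up for illustration.
cat > add_mul.ll <<'EOF'
define i32 @add_mul(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  %prod = mul i32 %sum, %b
  ret i32 %prod
}
EOF

if command -v llc >/dev/null 2>&1 && command -v llvm-mca >/dev/null 2>&1; then
  # Step 1: lower the IR to x86-64 assembly for one specific CPU model.
  # Step 2: let llvm-mca simulate that assembly using the same CPU's
  #         scheduling model and print its throughput/latency report.
  llc -mtriple=x86_64-unknown-linux-gnu -mcpu=skylake -O2 add_mul.ll -o add_mul.s &&
    llvm-mca -mtriple=x86_64-unknown-linux-gnu -mcpu=skylake add_mul.s ||
    echo "llc/llvm-mca failed (is the X86 target built?)" >&2
else
  echo "llc or llvm-mca not found; skipping analysis" >&2
fi
```

The numbers reported describe the generated machine instructions for the chosen `-mcpu`, not the IR instructions themselves, so they change with the target and with whatever the backend decides to emit.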
Finkel, Hal J. via llvm-dev
2019-May-13 16:57 UTC
[llvm-dev] How shall I evaluate the latency of each instruction in LLVM IR?
We have an interface for estimating this, although it depends on the targets providing appropriate information through the TargetTransformInfo interface. Call TTI->getInstructionCost with the kind parameter set to TCK_Latency.

-Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
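One way to see the TargetTransformInfo latency estimates without writing a pass is the cost-model printer in `opt`, which queries the cost model once per IR instruction. The sketch below is only an assumption-laden illustration: the file name, the toy function, and the `skylake` CPU are made up, and the `-passes='print<cost-model>'` spelling assumes a recent LLVM release with the new pass manager (older releases spell this differently).

```shell
#!/bin/sh
# Hypothetical toy input: chain.ll and its contents are made up for illustration.
cat > chain.ll <<'EOF'
target triple = "x86_64-unknown-linux-gnu"

define i32 @chain(i32 %a, i32 %b) {
entry:
  %q = sdiv i32 %a, %b
  %s = add i32 %q, %a
  ret i32 %s
}
EOF

if command -v opt >/dev/null 2>&1; then
  # Print the cost model's estimate for every IR instruction, asking for
  # the latency cost kind instead of the default reciprocal throughput.
  opt -mcpu=skylake -passes='print<cost-model>' -cost-kind=latency \
      -disable-output chain.ll ||
    echo "opt failed (pass spelling or X86 target may differ in this build)" >&2
else
  echo "opt not found; skipping cost-model query" >&2
fi
```

As noted earlier in the thread, these are coarse per-instruction estimates whose quality depends on what the target implements, so they are better suited to comparing alternatives than to predicting cycle counts.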