Phipps, Alan via llvm-dev
2020-Sep-23 16:24 UTC
[llvm-dev] Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td
Thanks, Peter, for your response. Right -- certainly not incorrect in the sense of generating an incorrect schedule, but it definitely seems suboptimal.

I've also noticed that if I experimentally base the v7-r model on the Cortex-R52 ProcessorModel (or even build for Cortex-R52), I achieve a better schedule than if it were based on cortex-a8, and I see a 2%-3% performance improvement on benchmarks like Coremark running on cortex-r5 hardware. Do you know why that might be the case?

Can you suggest other, more straightforward ways one might improve performance scheduling for cortex-r5 if there aren't any plans to develop a custom model for v7-r?

Thanks for your help,
-Alan

-----Original Message-----
From: Peter Smith [mailto:Peter.Smith at arm.com]
Sent: Wednesday, September 23, 2020 11:06 AM
To: llvm-dev at lists.llvm.org; Phipps, Alan
Subject: [EXTERNAL] Re: Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td

Hello Alan,

Using a cortex-a8 scheduling model for v7-r CPUs may not be optimal, but I wouldn't go as far as to call it incorrect. The cortex-r4, cortex-r4f and cortex-r5 are in-order cores, and cortex-a8 (another in-order core) is the closest match. We don't have any current plans to develop a custom scheduling model for r4, r4f or r5.

Peter

________________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Phipps, Alan via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 23 September 2020 15:27
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td

In ARM.td, I see that the ProcessorModel for cortex-r4, cortex-r4f, and cortex-r5 (as well as r7 and r8) is based on "CortexA8Model", which seems incorrect. When this was added in 2015, there were also comments associated with this configuration, such as "// FIXME: R5 has currently the same ProcessorModel as A8" (later removed). The processor model for Cortex-r52 appears to be correct and corresponds to an associated "CortexR52Model".

Does anyone know why r4/r4f/r5 were set up based on "CortexA8Model"? Is there a plan to upstream a fix to correct this?

Thanks!
Alan Phipps
Peter Smith via llvm-dev
2020-Sep-23 16:55 UTC
[llvm-dev] Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td
Hello Alan,

Looking at the public information for Cortex-R5 (https://developer.arm.com/ip-products/processors/cortex-r/cortex-r5) and Cortex-R52 (https://developer.arm.com/ip-products/processors/cortex-r/cortex-r52) shows that both are in-order with similar length pipelines. It is possible that the Cortex-R52 scheduling model may match the Cortex-R5 more closely than the choices available at the time that Cortex-R5 was upstreamed.

I haven't written a schedule model myself. My understanding of the process is that the technical reference manual (TRM) or any other publicly available information about the micro-architecture is used to provide initial values for the model. Then it is a matter of refinement against as many benchmarks as you can run.

I think if empirically the Cortex-R52 model is producing better results than the Cortex-A8 model, then it could be possible to adapt the model for the Cortex-R5 by removing the parts specific to V8-R and tweaking parameters based on cycle times from the TRM. I'm sure we could find someone to review a patch if there is a good enough set of benchmarks showing that a model is better than the Cortex-A8 model.

The technical reference manual for the Cortex-R5: https://developer.arm.com/documentation/ddi0460/c/

Peter
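
[Editor's sketch, not part of the original thread.] The adaptation Peter describes might start from a TableGen fragment along the following lines. This is only an illustration under stated assumptions: the name "CortexR5ModelSketch" is hypothetical, every numeric value is a placeholder that would need to be checked against the Cortex-R5 TRM and then tuned on benchmarks, and the feature list on the ProcessorModel entry is deliberately abbreviated rather than copied from ARM.td. The fragment is meant to live inside the ARM backend's .td files, where it would replace the existing cortex-r5 entry; it is not a standalone file.

```tablegen
// Hypothetical starting point for a Cortex-R5 scheduling model.
// "CortexR5ModelSketch" is an illustrative name, not an existing LLVM def.
// All numbers are placeholders: take initial values from the TRM, then
// refine them against benchmark runs.
def CortexR5ModelSketch : SchedMachineModel {
  let IssueWidth        = 2; // placeholder: instructions issued per cycle
  let MicroOpBufferSize = 0; // 0 selects fully in-order issue
  let LoadLatency       = 1; // placeholder: load-to-use latency in cycles
  let MispredictPenalty = 8; // placeholder: branch mispredict cost in cycles
  let CompleteModel     = 0; // not every instruction has scheduling info yet
}

// Points cortex-r5 at the sketch model instead of CortexA8Model.
// Assumption: the feature list is shortened here for readability; the real
// entry in ARM.td carries more subtarget features, which should be kept.
def : ProcessorModel<"cortex-r5", CortexR5ModelSketch, [ARMv7r, ProcR5]>;
```

A model like this only sets the machine-level parameters. The per-instruction latencies and resource usage (the part that would be adapted from the Cortex-R52 model, minus anything specific to V8-R) still have to be written and validated before it could be expected to beat the Cortex-A8 model.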