Grang, Mandeep Singh via llvm-dev
2016-Feb-24 03:15 UTC
[llvm-dev] Performance degradation on ARMv7 (cortex-a9)
Hi Bradley,

I was doing some performance analysis for ARMv7 (cortex-a9) and I noticed that one of my benchmarks degraded by 93%. I have tracked the regression down to the following commit by you:

    commit 7c1b77248baaeafec5d6433c3d1da9a2e2b69595
    Author: Bradley Smith <bradley.smith at arm.com>
    Date: Mon Nov 16 11:10:19 2015 +0000

        [ARM] Introduce subtarget features per ARM architecture.

        This allows for accurate architecture targeting as well as removing
        duplicate information (hardcoded feature strings) from MCTargetDesc.

I see that in lib/Target/ARM/ARM.td all the features have been removed from the Proc definitions (e.g. ProcA9) and added to the ProcessorModel definitions (e.g. ProcessorModel<"cortex-a9", ...>). But I find that the features from Proc are still being read and set in MCSubtargetInfo through the ARMFeatureKV table. So if the Proc definition is empty, the corresponding feature is not set. In my case, if I add FeatureFP16 back to the ProcA9 definition in ARM.td, I get back all the lost performance.

Could you please give me some insight into how, after your change, the Proc features get correctly set in MCSubtargetInfo and the other places that access Proc?

Thanks,
Mandeep
Bradley Smith via llvm-dev
2016-Feb-24 10:42 UTC
[llvm-dev] Performance degradation on ARMv7 (cortex-a9)
Hi,

The idea behind that change was to make ARM.td clearer: architecture features were moved into new per-architecture subtarget features, and the CPUs inherit from those. As far as I could tell, ProcA9 (and similar) were only being used for their enum value in making codegen decisions, so I moved all of the features they inherit over to the actual CPUs for clarity. The idea is that all features a given target uses come from a combination of the architecture it inherits from and the target itself, not from any intermediary features like ProcA9.

I’m not aware of any place where ProcA9 is used to get subtarget features like this, and after a quick look I still can’t find anything. Where exactly are you seeing ProcA9 being used to get features? Even so, the cortex-a9 processor model itself now inherits FeatureFP16, so I would expect it to use FP16, unless you’re not using cortex-a9 directly? (In that case, all CPUs that used to inherit ProcA9 now need to inherit all of the features ProcA9 used to inherit, as well as ProcA9 itself, which is what I did in the change you mention.)

Regards,
Bradley Smith
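As an illustration of the scheme Bradley describes, here is a minimal C++ sketch of a CPU's feature set being built from its architecture's features plus the CPU's own list. This is a toy model and not code from LLVM: the FeatureSet alias, the merge and archFeatures helpers, and the "arch-feature-*" names are hypothetical placeholders. Only ProcA9 and FeatureFP16 are names taken from the thread.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model: a feature set is just a set of feature names.
using FeatureSet = std::set<std::string>;

// Union of two feature sets.
FeatureSet merge(FeatureSet a, const FeatureSet &b) {
  a.insert(b.begin(), b.end());
  return a;
}

// Hypothetical architecture-level features (placeholder names, not the
// real ARMv7 feature list).
FeatureSet archFeatures() { return {"arch-feature-1", "arch-feature-2"}; }

// After the change, the CPU itself lists the ProcA9 marker plus the
// features that ProcA9 used to carry (only FeatureFP16 is shown here).
FeatureSet cortexA9Features() {
  FeatureSet own = {"ProcA9", "FeatureFP16"};
  FeatureSet all = merge(archFeatures(), own);
  assert(all.count("ProcA9") == 1); // the marker is still present
  return all;
}
```

In this model, selecting the CPU yields FeatureFP16 directly, while ProcA9 contributes nothing beyond its own name.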
Grang, Mandeep Singh via llvm-dev
2016-Feb-24 20:28 UTC
[llvm-dev] Performance degradation on ARMv7 (cortex-a9)
Thanks Bradley.

I see that the features set in ARM.td get written to the generated file <build>/llvm/lib/Target/ARM/ARMGenSubtargetInfo.inc. There, the ProcA9 features appear in the ARMFeatureKV table:

    { "a9", "Cortex-A9 ARM processors", { ARM::ProcA9 }, { ARM::FeatureFP16 } },

With your change, the features for ProcA9 in the above entry (the last field, previously ARM::FeatureFP16) are empty. This ARMFeatureKV table is then read in MC/MCSubtargetInfo.cpp in the getFeatures() function.

Thanks,
Mandeep
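The effect Mandeep describes can be sketched with a toy feature table in C++. This is a simplified model under stated assumptions, not the LLVM implementation: FeatureEntry, enableFeature, hasFP16, tableBefore and tableAfter are hypothetical names, and each feature is reduced to one bit in a 64-bit mask. Each entry carries the feature's own bit and the bits it implies, mirroring the last two fields of the ARMFeatureKV entry quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-ins for the real feature enums: one bit per feature.
constexpr std::uint64_t ProcA9 = 1ull << 0;
constexpr std::uint64_t FeatureFP16 = 1ull << 1;

// Hypothetical, simplified table entry: a key, the feature's own bit,
// and the bits it implies when it is enabled.
struct FeatureEntry {
  std::string key;
  std::uint64_t value;
  std::uint64_t implies;
};

// Enable the feature called `name` and, transitively, what it implies.
std::uint64_t enableFeature(const std::vector<FeatureEntry> &table,
                            const std::string &name) {
  std::uint64_t bits = 0;
  for (const FeatureEntry &e : table)
    if (e.key == name)
      bits |= e.value | e.implies;
  assert(bits != 0 && "unknown feature name");

  // Keep propagating implied bits until the mask stops changing.
  bool changed = true;
  while (changed) {
    changed = false;
    for (const FeatureEntry &e : table) {
      if ((bits & e.value) && (bits & e.implies) != e.implies) {
        bits |= e.implies;
        changed = true;
      }
    }
  }
  return bits;
}

bool hasFP16(std::uint64_t bits) { return (bits & FeatureFP16) != 0; }

// Before the commit: the "a9" entry implies FP16.
std::vector<FeatureEntry> tableBefore() {
  return {{"a9", ProcA9, FeatureFP16}, {"fp16", FeatureFP16, 0}};
}

// After the commit: the implied set of "a9" is empty.
std::vector<FeatureEntry> tableAfter() {
  return {{"a9", ProcA9, 0}, {"fp16", FeatureFP16, 0}};
}
```

In this model, enabling "a9" alone sets the FP16 bit with tableBefore() but not with tableAfter(). With the newer table, FP16 is only set if it is enabled by some other route, for example through the CPU's own feature list.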