Andy, et al., In ScheduleDAGInstrs::buildSchedGraph, the code for handling stores has this: if (!ExitSU.isPred(SU)) // Push store's up a bit to avoid them getting in between cmp // and branches. ExitSU.addPred(SDep(SU, SDep::Artificial)); This code does not seem to be in any way specific to compares; and in any case, at least on the PPC A2, scheduling stores in between the compare and the branch would not be a bad thing (because the core is in order, and the compare has a 2-cycle latency, so if there is nothing else to do, a store would not be a bad thing to put there). Can you explain the motivation for this (why or for what it is bad), and what else it might do (aside from the commented cmp/branch pairing)? I'm wondering if we should make this target dependent. Thanks again, Hal -- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
On Jan 17, 2014, at 3:54 PM, Hal Finkel <hfinkel at anl.gov> wrote:> Andy, et al., > > In ScheduleDAGInstrs::buildSchedGraph, the code for handling stores has this: > > if (!ExitSU.isPred(SU)) > // Push store's up a bit to avoid them getting in between cmp > // and branches. > ExitSU.addPred(SDep(SU, SDep::Artificial)); > > This code does not seem to be in any way specific to compares; and in any case, at least on the PPC A2, scheduling stores in between the compare and the branch would not be a bad thing (because the core is in order, and the compare has a 2-cycle latency, so if there is nothing else to do, a store would not be a bad thing to put there). > > Can you explain the motivation for this (why or for what it is bad), and what else it might do (aside from the commented cmp/branch pairing)? I'm wondering if we should make this target dependent.I don’t agree with the existing comment. It’s possible that somewhere, maybe in target specific code, we make use of the extra store->exit edge, but I can’t remember any reason for it now. You could (a) Just nuke it and verify PPC and x86 benchmarks. If all looks good, then I’ll confirm by running arm. (b) Make it target specific and add a FIXME to remove after performance analysis -Andy> Thanks again, > Hal > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory
On Jan 17, 2014, at 4:03 PM, Andrew Trick <atrick at apple.com> wrote:> > On Jan 17, 2014, at 3:54 PM, Hal Finkel <hfinkel at anl.gov> wrote: > >> Andy, et al., >> >> In ScheduleDAGInstrs::buildSchedGraph, the code for handling stores has this: >> >> if (!ExitSU.isPred(SU)) >> // Push store's up a bit to avoid them getting in between cmp >> // and branches. >> ExitSU.addPred(SDep(SU, SDep::Artificial)); >> >> This code does not seem to be in any way specific to compares; and in any case, at least on the PPC A2, scheduling stores in between the compare and the branch would not be a bad thing (because the core is in order, and the compare has a 2-cycle latency, so if there is nothing else to do, a store would not be a bad thing to put there). >> >> Can you explain the motivation for this (why or for what it is bad), and what else it might do (aside from the commented cmp/branch pairing)? I'm wondering if we should make this target dependent. > > I don’t agree with the existing comment. It’s possible that somewhere, maybe in target specific code, we make use of the extra store->exit edge, but I can’t remember any reason for it now.Do you have another mechanism for encouraging macro fusion? Thanks, /jakob
----- Original Message -----> From: "Andrew Trick" <atrick at apple.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Friday, January 17, 2014 6:03:46 PM > Subject: Re: Artificial deps and stores > > > On Jan 17, 2014, at 3:54 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > > Andy, et al., > > > > In ScheduleDAGInstrs::buildSchedGraph, the code for handling stores > > has this: > > > > if (!ExitSU.isPred(SU)) > > // Push store's up a bit to avoid them getting in between > > cmp > > // and branches. > > ExitSU.addPred(SDep(SU, SDep::Artificial)); > > > > This code does not seem to be in any way specific to compares; and > > in any case, at least on the PPC A2, scheduling stores in between > > the compare and the branch would not be a bad thing (because the > > core is in order, and the compare has a 2-cycle latency, so if > > there is nothing else to do, a store would not be a bad thing to > > put there). > > > > Can you explain the motivation for this (why or for what it is > > bad), and what else it might do (aside from the commented > > cmp/branch pairing)? I'm wondering if we should make this target > > dependent. > > I don’t agree with the existing comment. It’s possible that > somewhere, maybe in target specific code, we make use of the extra > store->exit edge, but I can’t remember any reason for it now. You > could > > (a) Just nuke it and verify PPC and x86 benchmarks. If all looks > good, then I’ll confirm by running arm.r206795, thanks! Maybe one day I won't be so far behind on my TODO list ;) -Hal> > (b) Make it target specific and add a FIXME to remove after > performance analysis > > -Andy > > > Thanks again, > > Hal > > > > -- > > Hal Finkel > > Assistant Computational Scientist > > Leadership Computing Facility > > Argonne National Laboratory > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory