thr3ads.net - search: "schedmodel"

Displaying 20 results from an estimated 22 matches for "schedmodel".

MachineScheduler not scheduling for latency

2019 Sep 10

...on of issue limitation on in-order CPUs, or on CPUs that don't define instruction groups, or some similar condition? Something like:

--- a/lib/CodeGen/MachineScheduler.cpp
+++ b/lib/CodeGen/MachineScheduler.cpp
@@ -2062,10 +2062,13 @@ getOtherResourceCount(unsigned &OtherCritIdx) {
   if (!SchedModel->hasInstrSchedModel())
     return 0;
-  unsigned OtherCritCount = Rem->RemIssueCount
-    + (RetiredMOps * SchedModel->getMicroOpFactor());
-  LLVM_DEBUG(dbgs() << " " << Available.getName() << " + Remain MOps: "
-                    << OtherC...

DFAPacketizer, Scheduling and LoadLatency

2015 Nov 16

I'm unclear how DFAPacketizer and the scheduler know that a given instruction is a load. Here is what I'm talking about. Let's assume my VLIW target is described as follows:

def MyTargetItineraries : ProcessorItineraries<[Slot0, Slot1], [], [
  ..............................
  InstrItinData<RI, [InstrStage<1, [Slot0, Slot1]>]>,

[MachineScheduler] Question about IssueWidth / NumMicroOps

2018 May 09

...modeled as a single micro-op, and SB can decode 4 // instructions per cycle. // FIXME: Identify instructions that aren't a single fused micro-op. let IssueWidth = 4; , which also seem to indicate (1). What's more, I see that checkHazard() returns true if '(CurrMOps + uops > SchedModel->getIssueWidth())'. This means that the SU will be put in Pending instead of Available based on the number of microops it uses. To me this seems like an in-order decoding hazard check, since an OOO machine will rearrange the microops during execution, so there is not much use in checking f...
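
The condition quoted in this excerpt can be sketched in isolation. This is a minimal illustration, assuming a simplified per-cycle state; `CycleState` and `exceedsIssueWidth` are hypothetical names, not LLVM APIs, and the real checkHazard() may apply further conditions that the excerpt does not show.

```cpp
#include <cassert>

// Hypothetical stand-in for the scheduler's per-cycle bookkeeping; not an LLVM type.
struct CycleState {
  unsigned IssueWidth; // micro-ops that may issue per cycle (4 in the quoted model)
  unsigned CurrMOps;   // micro-ops already issued in the current cycle
};

// Mirrors the quoted condition: issuing UOps more micro-ops in this cycle
// is a hazard if the running total would exceed the issue width.
inline bool exceedsIssueWidth(const CycleState &S, unsigned UOps) {
  assert(S.IssueWidth > 0 && "issue width must be positive");
  return S.CurrMOps + UOps > S.IssueWidth;
}
```

Under this sketch, with IssueWidth = 4 and three micro-ops already issued, a two-micro-op instruction would be treated as a hazard for the current cycle, while a one-micro-op instruction would not.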

[RFC] Moving tools/llvm-mca/lib into lib/MCA

2018 Dec 12

...m-mca/lib into lib/MCA and create a new MCA library in LLVM. llvm-mca has recently been split <https://bugs.llvm.org/show_bug.cgi?id=37696> into its core part and the tool part. - The core part simulates the execution of a basic block of machine instructions as modeled by the llvm SchedModel. - The tool part deals with the plumbing and interacting with the user. The core part can be used by parts of LLVM that deal with cost modeling (e.g. scheduling and vectorization). MCA provides a more realistic target for optimization than the heuristics typically used to drive these passe...

Fwd: MachineScheduler not scheduling for latency

2019 Sep 09

Hi, I'm trying to understand why MachineScheduler does a poor job in straight line code in cases like the one in the attached debug dump. This is on AMDGPU, an in-order target, and the problem is that the IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in the resulting schedule they are often placed right next to their uses like this: 1784B %140:vgpr_32 =

[LLVMdev] Question about per-operand machine model

2014 Feb 18

Hi Andy and all, I have a question about per-operand machine model. I am finding some relations between 'MCWriteLatencyEntry' and 'MCWriteProcResEntry'. For example, class InstTEST<..., InstrItinClass itin> : Instruction { let Itinerary = Itin; } // I assume this MI writes 2 registers. def TESTINST : InstTEST<..., II_TEST> // schedule info II_TEST:

DFAPacketizer, Scheduling and LoadLatency

2015 Nov 17

> In particular, the LoadLatency is used in defaultDefLatency:
>
> /// Return the default expected latency for a def based on it's opcode.
> unsigned TargetInstrInfo::defaultDefLatency(
>     const MCSchedModel &SchedModel, const MachineInstr *DefMI) const {
>   if (DefMI->isTransient())
>     return 0;
>   if (DefMI->mayLoad())
>     return SchedModel.LoadLatency;
>   if (isHighLatencyDef(DefMI->getOpcode()))
>     return SchedModel.HighLatency;
>   return 1;
> }
>...

[LLVMdev] Question about per-operand machine model

2014 Feb 18

...==================================================================
--- utils/TableGen/SubtargetEmitter.cpp (revision 201607)
+++ utils/TableGen/SubtargetEmitter.cpp (working copy)
@@ -932,12 +932,7 @@
       WLEntry.Cycles = 0;
       unsigned WriteID = WriteSeq.back();
       WriterNames.push_back(SchedModels.getSchedWrite(WriteID).Name);
-      // If this Write is not referenced by a ReadAdvance, don't distinguish it
-      // from other WriteLatency entries.
-      if (!SchedModels.hasReadOfWrite(
-            SchedModels.getSchedWrite(WriteID).TheDef)) {
-        WriteID = 0;
-      }
+      W...

[MachineScheduler] Question about IssueWidth / NumMicroOps

2018 May 09

...and SB can decode 4 > // instructions per cycle. > // FIXME: Identify instructions that aren't a single fused micro-op. > let IssueWidth = 4; > > , which also seem to indicate (1). > > What's more, I see that checkHazard() returns true if '(CurrMOps + uops > SchedModel->getIssueWidth())'. > This means that the SU will be put in Pending instead of Available based on the number of microops it uses. > To me this seems like an in-order decoding hazard check, since an OOO machine will rearrange the microops > during execution, so there is not much use...

[LLVMdev] SchedMachineModel clarifications

2013 Nov 21

Dear All, The attached files relate to the changes made to add the Schedmodel for an AMD Bulldozer target. Please note that the model is incomplete but has some of the valuable features implemented. I request comments on the implementation from the group or from someone at AMD. Thanks ~umesh On Wed, Nov 13, 2013 at 8:14 PM, Umesh Kalappa <umesh.kalappa0 at gma...

[LLVMdev] "Anti" scheduling with OoO cores?

2014 Nov 02

Hi Andy, Dave, I've been doing a bit of experimentation trying to understand the schedmodel a bit better and improving modelling of FDIV (on Cortex-A57). FDIV is not pipelined, and blocks other FDIV operations (FDIVDrr and FDIVSrr). This seems to be already semi-modelled, with a "ResourceCycles=[18]" line in the SchedWriteRes for this instruction. This doesn't seem to work...
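
One way to picture the effect the poster expects from an unpipelined unit is sketched here, assuming a single unit that stays busy for the full ResourceCycles count. `UnpipelinedUnit` is a hypothetical type for illustration only; it is not how MachineScheduler tracks resources.

```cpp
#include <algorithm>

// Hypothetical model of one non-pipelined functional unit; not an LLVM class.
struct UnpipelinedUnit {
  unsigned NextFreeCycle = 0; // first cycle at which the unit is idle again

  // Issue an operation that occupies the unit for ResourceCycles cycles.
  // Returns the cycle at which the operation actually starts.
  unsigned issue(unsigned ReadyCycle, unsigned ResourceCycles) {
    unsigned Start = std::max(ReadyCycle, NextFreeCycle);
    NextFreeCycle = Start + ResourceCycles;
    return Start;
  }
};
```

With an 18-cycle occupancy, a second FDIV that is ready at cycle 1 would start no earlier than cycle 18 under this model.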

[LLVMdev] SchedMachineModel clarifications

2013 Nov 13

Dear Andrew and the Group, I'm trying to come up with a SchedMachineModel for the AMD Bulldozer http://en.wikipedia.org/wiki/Bulldozer_(microarchitecture). No such model exists yet; please correct me if I am wrong here. I was going through your reference at https://llvm.org/svn/llvm-project/llvm/trunk/include/llvm/Target/TargetSchedule.td . But I couldn't model some of the

A question about AArch64 Cortex-A57 subtarget definition

2016 May 13

...A57UnitM : ProcResource<1>; // Type M micro-ops
def A57UnitL : ProcResource<1>; // Type L micro-ops
def A57UnitS : ProcResource<1>; // Type S micro-ops
def A57UnitX : ProcResource<1>; // Type X micro-ops
def A57UnitW : ProcResource<1>; // Type W micro-ops
let SchedModel = CortexA57Model in {
  def A57UnitV : ProcResGroup<[A57UnitX, A57UnitW]>; // Type V micro-ops
}
```
According to the Cortex-A57 software optimization manual, Cortex-A57 has 8 function units in the backend,
- Branch(B)
- Integer 0(I0)
- Integer 1(I1)
- Integer Multi-Cycle(M)
- Load...

help/hints/suggestions/tips please: how to give _generic_ compilation for a particular ISA a non-zero LoopMicroOpBufferSize?

2016 Dec 16

...'ve tried going as far down the rabbit hole as I can, I haven't found a way to set DefaultLoopMicroOpBufferSize on a per-ISA basis or to change the generic AArch64 model. I've searched for anything that might create an ISA-specific default scheduling model or that might initialize the default MCSchedModel in an ISA-specific way, but I've come up empty. The closest I found to what I was looking for seems to be a barrier for what I am trying to do; quoting "lib/Target/AArch64/AArch64GenSubtargetInfo.inc" in a build dir.: { "generic", (const void *)&NoSchedModel }, [...

[LLVMdev] Instruction Scheduling - migration from v3.1 to v3.2

2013 Apr 30

...mputeLatency' and > 'computeOperandLatency'. However, these methods have been removed from > 'ScheduleDAG' and 'ScheduleDAGInstrs' so are no longer invoked on our > implementation. Instead, the correct approach seems to implement a > sub-class of 'TargetSchedModel'. > > When I had a look at how other targets dealt with this transition I found > that none of them had implemented the latency scheduler the way we had, so I > couldn't just mimic a strategy used by other targets moving from v3.1 to > v3.2. Unfortunately the people who imp...

Verifying Backend Schedule (Over)Coverage

2017 Jun 21

...he apply method to output <idx, pat> pairs and <idx, name> pairs and then joined them together using a script. However, I couldn't easily determine from within that method what specific subtarget the patterns came from. Is there a better place to do this check? It seems that CodeGenSchedModels::checkCompleteness would be the logical place. Joel Jones

[LLVMdev] [PATCH] Bulldozer SchedMachineModel

2013 Nov 22

...should send patches to llvm-commits at cs.uiuc.edu, also each patch > should be its own plain-text attachment. > > -Tom > > On Thu, Nov 21, 2013 at 11:22:36PM +0530, Umesh Kalappa wrote: > > Dear All, > > > > Attached files is related to the changes made to add the Schedmodel for > a > > AMD bulldozer target, > > > > Please note that , the model is incomplete but has some of the valuables > > features implemented. > > > > Request to the group or someone from AMD for the comments on the > > implementation. > > > &...

How to get started with instruction scheduling? Advice needed.

2016 Apr 26

...----===//
// The following definitions describe the simpler per-operand machine model.
// This works with MachineScheduler and will eventually replace itineraries.

class A9WriteLMOpsListType<list<WriteSequence> writes> {
  list <WriteSequence> Writes = writes;
  SchedMachineModel SchedModel = ?;
}

// Cortex-A9 machine model for scheduling and other instruction cost heuristics.
def CortexA9Model : SchedMachineModel {
  let IssueWidth = 2; // 2 micro-ops are dispatched per cycle.
  let MicroOpBufferSize = 56; // Based on available renamed registers.
  let LoadLatency = 2; // Optimistic...

How to get started with instruction scheduling? Advice needed.

2016 Apr 20

So if I use the SchedMachineModel method, can I just skip itineraries? Phil On Wed, Apr 20, 2016 at 12:29 PM, Sergei Larin <slarin at codeaurora.org> wrote: > Target does make a difference. VLIW needs more hand-holding. For what you > are describing it should be fairly simple. > > > > Best strategy – see what other targets do. ARM might be a good start for > generic

what can cause a "CPU table is not sorted" assertion

2015 Oct 15

...decode 2 instructions per cycle.
  let IssueWidth = 2;
  let LoadLatency = 4;
  let MispredictPenalty = 16;
  // This flag is set to allow the scheduler to assign a default model to
  // unrecognized opcodes.
  let CompleteModel = 0;
}

def WriteALU : SchedWrite;
def WriteBranch : SchedWrite;

let SchedModel = MyTargetModel in {
  // SLOT0 can handle everything
  def Slot0 : ProcResource<1>;
  // SLOT1 can't handle branches
  def Slot1 : ProcResource<1>;
  // Many micro-ops are capable of issuing on multiple ports.
  def SlotAny : ProcResGroup<[Slot0, Slot1]>;
  def : WriteRes<WriteALU,...