Displaying 20 results from an estimated 22 matches for "schedmodel".
2019 Sep 10
2
MachineScheduler not scheduling for latency
...on of issue limitation on
in-order CPUs, or on CPUs that don't define instruction groups, or
some similar condition? Something like:
--- a/lib/CodeGen/MachineScheduler.cpp
+++ b/lib/CodeGen/MachineScheduler.cpp
@@ -2062,10 +2062,13 @@ getOtherResourceCount(unsigned &OtherCritIdx) {
if (!SchedModel->hasInstrSchedModel())
return 0;
- unsigned OtherCritCount = Rem->RemIssueCount
- + (RetiredMOps * SchedModel->getMicroOpFactor());
- LLVM_DEBUG(dbgs() << " " << Available.getName() << " + Remain MOps: "
- << OtherC...
2015 Nov 16
3
DFAPacketizer, Scheduling and LoadLatency
I'm unclear on how DFAPacketizer and the scheduler know that a given
instruction is a load.
Here is what I'm talking about:
Let's assume my VLIW target is described as follows:
def MyTargetItineraries :
ProcessorItineraries<[Slot0, Slot1], [], [
..............................
InstrItinData<RI, [InstrStage<1, [Slot0, Slot1]>]>,
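As a rough illustration of the question above: LLVM's `TargetInstrInfo::defaultDefLatency` picks `LoadLatency` based on the instruction's `mayLoad()` property rather than on its itinerary class. The sketch below mirrors that decision order with mock types. `MockSchedModel`, `MockInstr`, and `defaultDefLatencySketch` are hypothetical names for illustration only, not LLVM APIs.

```cpp
// Hypothetical stand-in for the two MCSchedModel fields used here.
struct MockSchedModel {
  unsigned LoadLatency; // default latency assumed for loads
  unsigned HighLatency; // default latency assumed for "high latency" defs
};

// Hypothetical stand-in for the instruction properties that are queried.
struct MockInstr {
  bool Transient;      // e.g. a copy-like instruction expected to disappear
  bool MayLoad;        // the instruction description says it may read memory
  bool HighLatencyDef; // the target flags this opcode as high latency
};

// Mirrors the decision order of the default def-latency helper:
// transient -> 0, load -> LoadLatency, high latency -> HighLatency, else 1.
unsigned defaultDefLatencySketch(const MockSchedModel &Model,
                                 const MockInstr &Instr) {
  if (Instr.Transient)
    return 0;
  if (Instr.MayLoad)
    return Model.LoadLatency;
  if (Instr.HighLatencyDef)
    return Model.HighLatency;
  return 1;
}
```

In this sketch the "is it a load" information comes from a per-instruction flag, so an itinerary entry such as `RI` would not need to encode it.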
2018 May 09
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
...modeled as a single micro-op, and SB can
decode 4
// instructions per cycle.
// FIXME: Identify instructions that aren't a single fused micro-op.
let IssueWidth = 4;
, which also seem to indicate (1).
What's more, I see that checkHazard() returns true if '(CurrMOps + uops
> SchedModel->getIssueWidth())'.
This means that the SU will be put in Pending instead of Available based
on the number of microops it uses.
To me this seems like an in-order decoding hazard check, since an OOO
machine will rearrange the microops
during execution, so there is not much use in checking f...
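The comparison quoted in this post can be sketched in isolation. This is not the MachineScheduler implementation: `IssueState` and `exceedsIssueWidth` are hypothetical names, and only the expression `CurrMOps + uops > IssueWidth` is taken from the post.

```cpp
#include <cassert>

// Hypothetical per-cycle scheduler state; not an LLVM type.
struct IssueState {
  unsigned IssueWidth; // micro-ops that may issue per cycle (4 in the quoted model)
  unsigned CurrMOps;   // micro-ops already issued in the current cycle
};

// Returns true when issuing UOps more micro-ops in this cycle would exceed
// the issue width, i.e. the condition under which the post says an SU is
// placed in Pending instead of Available.
bool exceedsIssueWidth(const IssueState &State, unsigned UOps) {
  assert(State.IssueWidth > 0 && "this sketch assumes a positive issue width");
  return State.CurrMOps + UOps > State.IssueWidth;
}
```

With `IssueWidth = 4` and two micro-ops already issued, a two-micro-op instruction still fits in the cycle, while a three-micro-op instruction would be deferred.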
2018 Dec 12
4
[RFC] Moving tools/llvm-mca/lib into lib/MCA
...m-mca/lib into lib/MCA and
create a new MCA library in LLVM.
llvm-mca has recently been split
<https://bugs.llvm.org/show_bug.cgi?id=37696> into its core part and the
tool part.
- The core part simulates the execution of a basic block of machine
  instructions as modeled by the llvm SchedModel.
- The tool part deals with the plumbing and interacting with the user.
The core part can be used by parts of LLVM that deal with cost modeling
(e.g. scheduling and vectorization). MCA provides a more realistic target
for optimization than the heuristics typically used to drive these passe...
2019 Sep 09
2
Fwd: MachineScheduler not scheduling for latency
Hi,
I'm trying to understand why MachineScheduler does a poor job in
straight line code in cases like the one in the attached debug dump.
This is on AMDGPU, an in-order target, and the problem is that the
IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in
the resulting schedule they are often placed right next to their uses
like this:
1784B %140:vgpr_32 =
2014 Feb 18
2
[LLVMdev] Question about per-operand machine model
Hi Andy and all,
I have a question about per-operand machine model. I am finding some
relations between 'MCWriteLatencyEntry' and 'MCWriteProcResEntry'.
For example,
class InstTEST<..., InstrItinClass itin> : Instruction {
let Itinerary = Itin;
}
// I assume this MI writes 2 registers.
def TESTINST : InstTEST<..., II_TEST>
// schedule info
II_TEST:
2015 Nov 17
2
DFAPacketizer, Scheduling and LoadLatency
> In particular, the LoadLatency is used in defaultDefLatency:
>
> /// Return the default expected latency for a def based on it's opcode.
> unsigned TargetInstrInfo::defaultDefLatency(
> const MCSchedModel &SchedModel, const MachineInstr *DefMI) const {
> if (DefMI->isTransient())
> return 0;
> if (DefMI->mayLoad())
> return SchedModel.LoadLatency;
> if (isHighLatencyDef(DefMI->getOpcode()))
> return SchedModel.HighLatency;
> return 1;
> }
>...
2014 Feb 18
2
[LLVMdev] Question about per-operand machine model
...==================================================================
--- utils/TableGen/SubtargetEmitter.cpp (revision 201607)
+++ utils/TableGen/SubtargetEmitter.cpp (working copy)
@@ -932,12 +932,7 @@
WLEntry.Cycles = 0;
unsigned WriteID = WriteSeq.back();
WriterNames.push_back(SchedModels.getSchedWrite(WriteID).Name);
- // If this Write is not referenced by a ReadAdvance, don't distinguish it
- // from other WriteLatency entries.
- if (!SchedModels.hasReadOfWrite(
- SchedModels.getSchedWrite(WriteID).TheDef)) {
- WriteID = 0;
- }
+
W...
2018 May 09
0
[MachineScheduler] Question about IssueWidth / NumMicroOps
...and SB can decode 4
> // instructions per cycle.
> // FIXME: Identify instructions that aren't a single fused micro-op.
> let IssueWidth = 4;
>
> , which also seem to indicate (1).
>
> What's more, I see that checkHazard() returns true if '(CurrMOps + uops > SchedModel->getIssueWidth())'.
> This means that the SU will be put in Pending instead of Available based on the number of microops it uses.
> To me this seems like an in-order decoding hazard check, since an OOO machine will rearrange the microops
> during execution, so there is not much use...
2013 Nov 21
0
[LLVMdev] SchedMachineModel clarifications
Dear All,
The attached files contain the changes made to add the Schedmodel for the
AMD Bulldozer target.
Please note that the model is incomplete but has some of the valuable
features implemented.
I would welcome comments on the implementation from the group or from
someone at AMD.
Thanks
~umesh
On Wed, Nov 13, 2013 at 8:14 PM, Umesh Kalappa <umesh.kalappa0 at gma...
2014 Nov 02
3
[LLVMdev] "Anti" scheduling with OoO cores?
Hi Andy, Dave,
I've been doing a bit of experimentation trying to understand the
schedmodel a bit better and improving modelling of FDIV (on Cortex-A57).
FDIV is not pipelined, and blocks other FDIV operations (FDIVDrr and
FDIVSrr). This seems to be already semi-modelled, with a
"ResourceCycles=[18]" line in the SchedWriteRes for this instruction. This
doesn't seem to work...
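The behaviour the poster expects from a non-pipelined FDIV can be sketched as a unit that stays reserved for the full resource-cycle count. This is an idealized model under that assumption, not a description of how MachineScheduler actually treats `ResourceCycles`; `UnpipelinedUnit` and `issueOn` are hypothetical names.

```cpp
#include <algorithm>

// Hypothetical model of a functional unit that cannot overlap operations.
struct UnpipelinedUnit {
  unsigned NextFreeCycle = 0; // first cycle at which the unit is free again
};

// Issues one operation on the unit at the earliest cycle that is both
// >= ReadyCycle and >= the unit's next free cycle, then reserves the unit
// for ResourceCycles cycles. Returns the cycle at which the operation issues.
unsigned issueOn(UnpipelinedUnit &Unit, unsigned ReadyCycle,
                 unsigned ResourceCycles) {
  unsigned IssueCycle = std::max(ReadyCycle, Unit.NextFreeCycle);
  Unit.NextFreeCycle = IssueCycle + ResourceCycles;
  return IssueCycle;
}
```

Under this model, with a resource-cycle count of 18, a second FDIV that is ready at cycle 1 cannot issue before cycle 18, which is the blocking effect described in the post.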
2013 Nov 13
2
[LLVMdev] SchedMachineModel clarifications
Dear Andrew and the Group,
I’m trying to come up with a SchedMachineModel for the AMD Bulldozer
http://en.wikipedia.org/wiki/Bulldozer_(microarchitecture).
No such model exists yet. Please correct me if I am wrong here.
I was going through your reference at
https://llvm.org/svn/llvm-project/llvm/trunk/include/llvm/Target/TargetSchedule.td
.
But I couldn’t model some of the
2016 May 13
2
A question about AArch64 Cortex-A57 subtarget definition
...A57UnitM : ProcResource<1>; // Type M micro-ops
def A57UnitL : ProcResource<1>; // Type L micro-ops
def A57UnitS : ProcResource<1>; // Type S micro-ops
def A57UnitX : ProcResource<1>; // Type X micro-ops
def A57UnitW : ProcResource<1>; // Type W micro-ops
let SchedModel = CortexA57Model in {
def A57UnitV : ProcResGroup<[A57UnitX, A57UnitW]>; // Type V micro-ops
}
```
According to the Cortex-A57 software optimization manual, Cortex-A57 has 8
function units in the backend:
- Branch(B)
- Integer 0(I0)
- Integer 1(I1)
- Integer Multi-Cycle(M)
- Load...
2016 Dec 16
1
help/hints/suggestions/tips please: how to give _generic_ compilation for a particular ISA a non-zero LoopMicroOpBufferSize?
...'ve tried going as far down the rabbit hole as I can, I haven't found a way to set
DefaultLoopMicroOpBufferSize on a per-ISA basis or to change the generic AArch64 model. I've
searched for anything that might create an ISA-specific default scheduling model or that might
initialize the default MCSchedModel in an ISA-specific way, but I've come up empty.
The closest I found to what I was looking for seems to be a barrier for what I am trying to do;
quoting "lib/Target/AArch64/AArch64GenSubtargetInfo.inc" in a build dir.:
{ "generic", (const void *)&NoSchedModel },
[...
2013 Apr 30
1
[LLVMdev] Instruction Scheduling - migration from v3.1 to v3.2
...mputeLatency' and
> 'computeOperandLatency'. However, these methods have been removed from
> 'ScheduleDAG' and 'ScheduleDAGInstrs' so are no longer invoked on our
> implementation. Instead, the correct approach seems to implement a
> sub-class of 'TargetSchedModel'.
>
> When I had a look at how other targets dealt with this transition I found
> that none of them had implemented the latency scheduler the way we had, so I
> couldn't just mimic a strategy used by other targets moving from v3.1 to
> v3.2. Unfortunately the people who imp...
2017 Jun 21
2
Verifying Backend Schedule (Over)Coverage
...he apply method to output
<idx, pat> pairs and <idx, name> pairs and then joined them together
using a script. However, I couldn't easily determine from within that method
what specific subtarget the patterns came from. Is there a better place to do
this check? It seems that CodeGenSchedModels::checkCompleteness would be the
logical place.
Joel Jones
2013 Nov 22
0
[LLVMdev] [PATCH] Bulldozer SchedMachineModel
...should send patches to llvm-commits at cs.uiuc.edu, also each patch
> should be its own plain-text attachment.
>
> -Tom
>
> On Thu, Nov 21, 2013 at 11:22:36PM +0530, Umesh Kalappa wrote:
> > Dear All,
> >
> > Attached files is related to the changes made to add the Schedmodel for
> a
> > AMD bulldozer target,
> >
> > Please note that , the model is incomplete but has some of the valuables
> > features implemented.
> >
> > Request to the group or someone from AMD for the comments on the
> > implementation.
> >
> &...
2016 Apr 26
3
How to get started with instruction scheduling? Advice needed.
...----===//
// The following definitions describe the simpler per-operand machine model.
// This works with MachineScheduler and will eventually replace itineraries.
class A9WriteLMOpsListType<list<WriteSequence> writes> {
list <WriteSequence> Writes = writes;
SchedMachineModel SchedModel = ?;
}
// Cortex-A9 machine model for scheduling and other instruction cost heuristics.
def CortexA9Model : SchedMachineModel {
let IssueWidth = 2; // 2 micro-ops are dispatched per cycle.
let MicroOpBufferSize = 56; // Based on available renamed registers.
let LoadLatency = 2; // Optimistic...
2016 Apr 20
2
How to get started with instruction scheduling? Advice needed.
So if I use the SchedMachineModel method, can I just skip itineraries?
Phil
On Wed, Apr 20, 2016 at 12:29 PM, Sergei Larin <slarin at codeaurora.org>
wrote:
> Target does make a difference. VLIW needs more hand-holding. For what you
> are describing it should be fairly simple.
>
>
>
> Best strategy – see what other targets do. ARM might be a good start for
> generic
2015 Oct 15
3
what can cause a "CPU table is not sorted" assertion
...decode 2 instructions per cycle.
let IssueWidth = 2;
let LoadLatency = 4;
let MispredictPenalty = 16;
// This flag is set to allow the scheduler to assign a default model to
// unrecognized opcodes.
let CompleteModel = 0;
}
def WriteALU : SchedWrite;
def WriteBranch : SchedWrite;
let SchedModel = MyTargetModel in {
// SLOT0 can handle everything
def Slot0 : ProcResource<1>;
// SLOT1 can't handle branches
def Slot1 : ProcResource<1>;
// Many micro-ops are capable of issuing on multiple ports.
def SlotAny : ProcResGroup<[Slot0, Slot1]>;
def : WriteRes<WriteALU,...