Displaying 17 results from an estimated 17 matches for "resourcecycl".
2013 Dec 23
2
[LLVMdev] [RFC] Iterative compilation framework for Clang/LLVM
On Dec 16, 2013, at 4:26 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> At the end of each iteration, the quality of the generated code is estimated
>> by executing a newly introduced target-dependent pass. Based on the results,
>> the path for the following iteration is calculated. At the moment, this has
>> been proven for MIPS only and it is based on code
2014 Nov 02
3
[LLVMdev] "Anti" scheduling with OoO cores?
Hi Andy, Dave,
I've been doing a bit of experimentation trying to understand the
schedmodel a bit better and improving modelling of FDIV (on Cortex-A57).
FDIV is not pipelined, and blocks other FDIV operations (FDIVDrr and
FDIVSrr). This seems to be already semi-modelled, with a
"ResourceCycles=[18]" line in the SchedWriteRes for this instruction. This
doesn't seem to work (a poor schedule is produced) so I changed it to also
require another resource that I modelled as unbuffered (BufferSize=0), in
the hope that this would "block" other FDIVs... no joy.
Then I notice...
2014 Mar 03
2
[LLVMdev] Question about per-operand machine model
...s no attempt to insert nops currently. However, at the very least, you will want to implement your own MachineSchedStrategy. It would be natural to handle nop insertion within your implementation.
In fact, the interpretation of most machine model properties (MicroOpBufferSize, resource BufferSize, ResourceCycles, ResourceDelay) is handled within the MachineSchedStrategy. In past emails I have been explaining how the GenericScheduler interprets the model, but it is really up to your custom strategy to implement the model.
> I have attached a patch that adds the 'ResourceDelays' field in tableg...
2014 Feb 28
2
[LLVMdev] Question about per-operand machine model
...to be efficient for the common case, and per-operand resources don’t really make sense most of the time.
It sounds like you want to model the pipeline stage at which a resource is used. To do that with the per-operand machine model (misnomer), I think we need a ResourceDelay vector in addition to ResourceCycles, which we could easily add.
However, overall, I think your target is interesting enough that you may be better off augmenting the standard machine model with your own model. Your scheduler plugin could keep your own tables or state machine to model the constraints.
If you want to be clever, y...
2014 Feb 19
2
[LLVMdev] Question about per-operand machine model
Hi JinGu,
We currently have the ResourceCycles list to indicate the number of cpu cycles during which a resource is reserved. We could simply add a ResourceDelay with similar grammar. The MachineScheduler could be taught to keep track of the first and last time that a resource is reserved.
Note that the MachineScheduler will work with the in...
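The first/last reservation bookkeeping described in this excerpt can be sketched in a few lines of Python. This is only an illustration of the idea, not LLVM code: ResourceDelay is the field proposed in the thread, and the ResourceUse class, the function names, and the "FPUnit" resource are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ResourceUse:
    """One resource requirement of a scheduling class (hypothetical type)."""
    resource: str
    cycles: int      # ResourceCycles entry: how long the unit stays reserved
    delay: int = 0   # proposed ResourceDelay entry: offset from issue to first use


def reserved_interval(issue_cycle: int, use: ResourceUse) -> Tuple[int, int]:
    """Return the (first, last) cycle during which the resource is reserved."""
    first = issue_cycle + use.delay
    last = first + use.cycles - 1
    return first, last


def conflicts(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """Two closed intervals overlap if neither ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]


# An instruction issued at cycle 10 that holds the unit for 2 cycles,
# starting 3 cycles after issue.
late_use = ResourceUse("FPUnit", cycles=2, delay=3)
print(reserved_interval(10, late_use))  # (13, 14)
```

A scheduler following this scheme would compare the interval of a candidate instruction against the intervals already recorded for the same resource and delay issue while conflicts() reports an overlap.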
2014 Mar 04
2
[LLVMdev] Question about per-operand machine model
...preRA and postRA. So, if you want to do nop insertion within MachineScheduler (as opposed to a separate pass) you could enable it only during postRA scheduling.
-Andy
> Pete
>>
>> In fact, the interpretation of most machine model properties (MicroOpBufferSize, resource BufferSize, ResourceCycles, ResourceDelay) is handled within the MachineSchedStrategy. In past emails I have been explaining how the GenericScheduler interprets the model, but it is really up to your custom strategy to implement the model.
>>
>>> I have attached a patch that adds the 'ResourceDelays'...
2017 Apr 03
2
Scheduler: modelling long register reservations?
.../MyTargetSchedule.td:
//
def DesGCv3GenericModel : SchedMachineModel
{
  let IssueWidth = 1;
  let MicroOpBufferSize = 0;
  let CompleteModel = 1;
}
// ...
def FlexU : ProcResource<64> { let BufferSize = 1; }
def : WriteRes<IIFlexRead, [FlexU]> { let Latency = 25; let ResourceCycles = [25]; }
class SchedFlexRead : Sched< [IIFlexRead] >; // I apply this to the definition of FXLV instruction
// ...
2015 Oct 15
3
what can cause a "CPU table is not sorted" assertion
...everything
def Slot0 : ProcResource<1>;
// SLOT1 can't handle branches
def Slot1 : ProcResource<1>;
// Many micro-ops are capable of issuing on multiple ports.
def SlotAny : ProcResGroup<[Slot0, Slot1]>;
def : WriteRes<WriteALU, [SlotAny]> {
  let Latency = 1;
  let ResourceCycles = [1];
}
def : WriteRes<WriteBranch, [Slot0]> {
  let Latency = 1;
  let ResourceCycles = [1];
}
}
I've also changed OR1K.td to have
def : ProcessorModel<"generic", MyTargetModel, [FeatureDiv, FeatureMul]>;
def : ProcessorModel<"or1200", MyTargetModel, [Fe...
2014 Feb 18
2
[LLVMdev] Question about per-operand machine model
Hi Andy and all,
I have a question about per-operand machine model. I am finding some
relations between 'MCWriteLatencyEntry' and 'MCWriteProcResEntry'.
For example,
class InstTEST<..., InstrItinClass itin> : Instruction {
  let Itinerary = itin;
}
// I assume this MI writes 2 registers.
def TESTINST : InstTEST<..., II_TEST>
// schedule info
II_TEST:
2013 Sep 26
1
[LLVMdev] [llvm] r190717 - Adds support for Atom Silvermont (SLM) - -march=slm
..."throughput" (the number of cycles which must elapse before an instruction of the same type can start) of an instruction?
The new model that works with MachineScheduler (not PostRA) lets you specify throughput in two dimensions, horizontally as a functional unit list, and vertically as a ResourceCycles attribute.
Horizontal:
def : WriteRes<WriteTwoPorts, [Port1, Port2]>;
Vertical:
def : WriteRes<WriteTwoCycles, [Port1]> { let ResourceCycles = [2]; }
> Do you expect that the new machine model will produce a better schedule than the current PostRA scheduler?
Yes. If not, then...
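To make the horizontal/vertical distinction in this excerpt concrete, here is a small Python sketch that estimates the minimum number of cycles a block needs from resource pressure alone. It is a back-of-the-envelope model, not how MachineScheduler computes anything: the dictionaries mirror the two WriteRes definitions quoted above, an omitted ResourceCycles list is treated as one cycle per listed resource, and the function names and the one-unit-per-port table are assumptions.

```python
from collections import Counter

# Resource usage per write type, mirroring the two WriteRes definitions above.
WRITE_TWO_PORTS = {"Port1": 1, "Port2": 1}   # horizontal: two units, one cycle each
WRITE_TWO_CYCLES = {"Port1": 2}              # vertical: one unit, two cycles

# Assumed number of units per resource, as in ProcResource<1>.
UNITS = {"Port1": 1, "Port2": 1}


def busy_cycles(block):
    """Sum the cycles each resource is held by all writes in the block."""
    total = Counter()
    for uses in block:
        total.update(uses)
    return dict(total)


def min_cycles(block, units):
    """Lower bound on block length: the most heavily used resource decides."""
    busy = busy_cycles(block)
    return max(cycles / units[res] for res, cycles in busy.items())


print(min_cycles([WRITE_TWO_PORTS], UNITS))                    # 1.0
print(min_cycles([WRITE_TWO_CYCLES], UNITS))                   # 2.0
print(min_cycles([WRITE_TWO_PORTS, WRITE_TWO_CYCLES], UNITS))  # 3.0
```

The last line shows why the two dimensions interact: the horizontal write also occupies Port1, so it adds to the pressure created by the vertical one.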
2013 Jul 23
0
[LLVMdev] Questions about MachineScheduler
...ou do away with most of the complexity in ConvergingScheduler::SchedBoundary and implement a straightforward reservation table. If it’s fully pipelined then you just count resource units for the current cycle until one reaches the latency factor. If it’s not fully pipelined, then you need to define ResourceCycles in the machine’s SchedWrite definitions and implement a simple reservation table (mark earliest cycle at which a resource is used for bottom-up scheduling). Some of this can be made a generic utility, but it’s not much to implement.
Since the strategy defines the priority queues, you can do what...
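The "simple reservation table" mentioned in this excerpt might look roughly like the following Python sketch. It is an assumption-laden illustration rather than LLVM's implementation: it schedules top-down for simplicity (the excerpt talks about bottom-up), and the ReservationTable class and the "DivUnit" and "ALU" resource names are hypothetical.

```python
class ReservationTable:
    """Tracks, per resource, the first cycle at which it is free again."""

    def __init__(self):
        self.next_free = {}  # resource name -> first free cycle

    def earliest_issue(self, ready_cycle, uses):
        """Earliest cycle >= ready_cycle at which every needed resource is free."""
        return max([ready_cycle] + [self.next_free.get(res, 0) for res in uses])

    def issue(self, ready_cycle, uses):
        """Reserve each resource for its ResourceCycles and return the issue cycle."""
        cycle = self.earliest_issue(ready_cycle, uses)
        for res, cycles in uses.items():
            self.next_free[res] = cycle + cycles
        return cycle


# A non-pipelined divide holds its unit for 18 cycles, so back-to-back
# divides are spaced 18 cycles apart.
table = ReservationTable()
print([table.issue(0, {"DivUnit": 18}) for _ in range(3)])  # [0, 18, 36]

# A fully pipelined unit is held for one cycle and accepts one op per cycle.
table = ReservationTable()
print([table.issue(0, {"ALU": 1}) for _ in range(3)])  # [0, 1, 2]
```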
2013 Jul 22
2
[LLVMdev] Questions about MachineScheduler
Hi,
I'm working on defining a SchedMachineModel for the Southern Islands
family of GPUs, and I have two questions related to the
MachineScheduler.
1. I have a resource that can process 15 instructions at the same time.
In the TableGen definitions, should I do:
def HWVMEM : ProcResource<15>;
or
let BufferSize = 15 in {
def HWVMEM : ProcResource<1>;
}
2. Southern Islands has
2018 Feb 26
0
Suggestions on modeling a micro architecture with per-operand machine model
Hi everyone,
I would like to know how to model an instruction that waits for a pipeline
unit to be empty for some number of cycles.
For example, I have a vstr that waits for the FP pipelines to be empty for at
most 3 cycles. I made FP instructions use a resource unit called FPPipe with
resourceCycle=3 and made vstr use FPPipe with resourceCycle=0, so the scheduler
knows that a vstr will wait 3 cycles if it is scheduled right after an FP
instruction.
However, this also causes one FP instruction to wait 3 cycles for another FP
instruction.
Machine Resource model doesn't seem to support resource usag...
2020 May 09
2
[llvm-mca] Resource consumption of ProcResGroups
...ase would decide to dispatch it on Port5.
This (I believe) explains the following reported timings on a basic block which consists of a single instruction with no dependencies and a small NumMicroOps (i.e., only bottlenecked by resource availability), where I have tried out different port maps and ResourceCycles (all of these are for 100 iterations):
• When the resource mapping is: { HWPort0: 2 cycles, HWPort01: 2 cycles }, the instruction has a Total Cycles of 200, because the round-robin scheduler always assigns the HWPort01 resource to execute on HWPort1, so each iteration requires 2 cycles total.
•...
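The arithmetic behind the first bullet can be reproduced with a rough Python sketch. This is not llvm-mca's algorithm: picking the least-busy member of a group is only a stand-in for the round-robin selection described in the excerpt, latency and dispatch effects are ignored, and the GROUPS table and function names are assumptions made for illustration.

```python
# Assumed group membership: HWPort01 may execute on either HWPort0 or HWPort1.
GROUPS = {"HWPort01": ["HWPort0", "HWPort1"]}


def per_port_cycles(mapping):
    """Resolve group entries to concrete ports and return busy cycles per port."""
    busy = {}
    # Direct port uses are fixed, so account for them first.
    for res, cycles in mapping.items():
        if res not in GROUPS:
            busy[res] = busy.get(res, 0) + cycles
    # Assign each group use to its currently least-busy member (an assumption).
    for res, cycles in mapping.items():
        if res in GROUPS:
            member = min(GROUPS[res], key=lambda port: busy.get(port, 0))
            busy[member] = busy.get(member, 0) + cycles
    return busy


def total_cycles(mapping, iterations):
    """Resource-bound estimate: the busiest port limits every iteration."""
    return iterations * max(per_port_cycles(mapping).values())


mapping = {"HWPort0": 2, "HWPort01": 2}
print(per_port_cycles(mapping))    # {'HWPort0': 2, 'HWPort1': 2}
print(total_cycles(mapping, 100))  # 200
```

With HWPort0 held for 2 cycles directly and the group's 2 cycles landing on HWPort1, both ports are busy for 2 cycles per iteration, which matches the 200 total cycles reported for 100 iterations.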
2014 Feb 18
2
[LLVMdev] Question about per-operand machine model
>Resources and latency are not tied. An instruction is mapped to a scheduling class. A scheduling class is mapped to a set of resources and a per-operand list of latencies.
Thanks for your kind explanation.
Our heuristic algorithm needs the latency and the resource per operand to check resource conflicts per cycle. In order to support this with LLVM, I expected a per-operand list of
2018 May 09
0
[MachineScheduler] Question about IssueWidth / NumMicroOps
...ency. To model those delays, the abstract model has various tools like ReadAdvance (bypassing) and the ability to extend the model with arbitrary "resources" and associate a cycle count with those resources for each instruction. (One tool currently missing is the ability to add a delay to ResourceCycles, but that would be easy to add).
Now we come to out-of-order execution, or, more generally, instruction buffers. Part of the CPU pipeline is always in-order. The issue point, which is the point of reference for counting cycles, only makes sense as an in-order part of the pipeline. Other parts of...
2018 May 09
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi,
I would like to ask what IssueWidth and NumMicroOps refer to in
MachineScheduler, just to be 100% sure what the intent is.
Are we modeling the decoder phase or the execution stage?
Background:
First of all, there seem to be different meanings of "issue" depending
on which platform you're on: