Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] SchedMachineModel clarifications"
2013 Nov 21
0
[LLVMdev] SchedMachineModel clarifications
Dear All,
The attached files contain the changes made to add a SchedMachineModel for the
AMD Bulldozer target.
Please note that the model is incomplete, but it has some of the valuable
features implemented.
I would welcome comments on the implementation from the group or from
someone at AMD.
Thanks
~umesh
On Wed, Nov 13, 2013 at 8:14 PM, Umesh Kalappa <umesh.kalappa0 at gmail.com> wrote:
>
2013 Nov 22
0
[LLVMdev] [PATCH] Bulldozer SchedMachineModel
Tom,
Thank you for correcting me here.
All,
Please review the changes. Is it OK to commit?
Thanks
~Umesh
On Thu, Nov 21, 2013 at 11:47 PM, Tom Stellard <tom at stellard.net> wrote:
> Hi Umesh,
>
> You should send patches to llvm-commits at cs.uiuc.edu, also each patch
> should be its own plain-text attachment.
>
> -Tom
>
> On Thu, Nov 21, 2013 at
2013 Nov 22
2
[LLVMdev] SchedMachineModel clarifications
If you haven't found it yet, the last public AMD Software Optimization
Guide for Family 15h is here:
http://developer.amd.com/wordpress/media/2012/03/47414_15h_sw_opt_guide.pdf
This one describes both Bulldozer and Piledriver processors. Chapter 2
gives an overview of the microarchitecture, and Appendix B gives some
additional details on which pipes are used for what.
I haven't yet
2013 Nov 22
1
[LLVMdev] SchedMachineModel clarifications
I made a quick cross check with information in the SWOG (Software
Optimization Guide). The port assignments look consistent. A few of the
latency values are slightly different from the SWOG, e.g. WriteFRcp --> 6,
WriteFSqrt --> 29 and WriteCvt* --> 4 seem to be suggested instead.
Others are in a better position to describe how to use the LLVM performance
framework.
--mev, Mike Vermeulen
2013 Nov 22
0
[LLVMdev] SchedMachineModel clarifications
Hi Mike,
Thank you for the link. My mistake: my last mail had the old patch file.
Please have a look at the attached patch file, which has the latest
changes.
I'm new to the LLVM testing framework and to cross compilation. Could you
shed some light, e.g. with references, on how I can
cross compile LLVM for Bulldozer and run the performance tests
against my
2015 Nov 16
3
DFAPacketizer, Scheduling and LoadLatency
It's unclear to me how DFAPacketizer and the scheduler know that a given
instruction is a load.
Here is what I'm talking about.
Let's assume my VLIW target is described as follows:
def MyTargetItineraries :
ProcessorItineraries<[Slot0, Slot1], [], [
..............................
InstrItinData<RI, [InstrStage<1, [Slot0, Slot1]>]>,
2016 May 13
2
A question about AArch64 Cortex-A57 subtarget definition
Hello everybody,
I'm reading the .td files defining the Cortex-A57 processor,
which is a subtarget of AArch64 target, and there is something
confusing me in the `AArch64SchedA57.td` file.
At the top of `AArch64SchedA57.td`, various processor resources are
defined, as follows:
```
def A57UnitB : ProcResource<1>; // Type B micro-ops
def A57UnitI : ProcResource<2>; // Type
2018 May 09
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi,
I would like to ask what IssueWidth and NumMicroOps refer to in
MachineScheduler, just to be 100% sure what the intent is.
Are we modeling the decoder phase or the execution stage?
Background:
First of all, there seem to be different meanings of "issue" depending
on which platform you're on:
2016 Dec 16
1
help/hints/suggestions/tips please: how to give _generic_ compilation for a particular ISA a non-zero LoopMicroOpBufferSize?
Dear all,
Some benchmarking experimentation I've done recently -- all on AArch64 -- has shown that it
might be beneficial for all AArch64 targets to have a positive LoopMicroOpBufferSize, whereas
the default that applies to all ISAs seems to be zero.
Although I've tried going as far down the rabbit hole as I can, I haven't found a way to set
DefaultLoopMicroOpBufferSize on a per-ISA basis or
2016 Apr 26
3
How to get started with instruction scheduling? Advice needed.
Hi Phil.
You more or less answered your own question, but let me give you some more info. Maybe it is of use.
From what I understand, the SchedMachineModel is the future, although it is not as powerful as itineraries at present. The mi-scheduler is mostly developed around out-of-order cores, I believe (I'd love to hear arguments to the contrary). Some of the constraints that can be found in
2016 Apr 20
2
How to get started with instruction scheduling? Advice needed.
So if I use the SchedMachineModel method, can I just skip itineraries?
Phil
On Wed, Apr 20, 2016 at 12:29 PM, Sergei Larin <slarin at codeaurora.org>
wrote:
> Target does make a difference. VLIW needs more hand-holding. For what you
> are describing it should be fairly simple.
>
>
>
> Best strategy – see what other targets do. ARM might be a good start for
> generic
2013 Apr 30
1
[LLVMdev] Instruction Scheduling - migration from v3.1 to v3.2
On Apr 26, 2013, at 3:53 AM, Martin J. O'Riordan <Martin.ORiordan at movidius.com> wrote:
> I am migrating the llvm/clang derived compiler for our processor from the
> v3.1 to v3.2 codebase. This has mostly gone well except that instruction
> latency scheduling is no longer happening.
>
> The people who implemented this previously sub-classed 'ScheduleDAGInstrs'
2015 Nov 07
2
Is there a way to convert between SchedMachineModel and Itineraries?
Is there a way to convert between SchedMachineModel and Itineraries?
I was trying to write a very simple VLIW packetizer (Hexagon was my
starting point). It turns out that the current DFAPacketizer uses
itineraries, but my scheduling is based on SchedMachineModel (I was
advised to use it since the itineraries are being phased out). I was
wondering if there is an automated tool that would
2018 Apr 05
1
A9 Scheduler
Hi,
I am having some trouble understanding the scheduling scheme for the C-A9.
Looking at the ARMScheduleA9.td file, I find this line, which overrides the
target SchedWrite with processor-specific latencies.
def : SchedAlias<WriteALU, A9WriteALU>;
However, in this same file, I find the lines presented below, which are
mapping the SchedReadWrite to, for example, the ANDri instruction.
//
2020 Sep 14
2
Simulation of load-store forwarding with MI scheduler on AArch64
Hi list,
Is it possible to simulate load-to-store forwarding with the MI scheduling model on AArch64?
For instance, the data latency of $x0 in the example below should be 1 cycle:
ldr $x0, [$x1]
str $x0, [$x2]
But it should be 4 cycles if we have another instruction:
ldr $x0, [$x1]
add $x0, $x0, 4
For ALU instructions it’s possible to use either ReadAdvance or SchedReadAdvance, but I don’t see
2015 Nov 09
2
Is there a way to convert between SchedMachineModel and Itineraries?
----- Original Message -----
> From: "Rail Shafigulin via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Monday, November 9, 2015 10:09:07 AM
> Subject: Re: [llvm-dev] Is there a way to convert between SchedMachineModel and Itineraries?
>
>
> Anybody? Does anyone at all know how to do it?
There is
2019 Sep 10
2
MachineScheduler not scheduling for latency
Hi Andy,
Thanks for the explanations. Yes, AMDGPU is in-order and has
MicroOpBufferSize = 1.
Re "issue limited" and instruction groups: could it make sense to
disable the generic scheduler's detection of issue limitation on
in-order CPUs, or on CPUs that don't define instruction groups, or
some similar condition? Something like:
--- a/lib/CodeGen/MachineScheduler.cpp
+++
2018 May 09
0
[MachineScheduler] Question about IssueWidth / NumMicroOps
> On May 9, 2018, at 9:43 AM, Jonas Paulsson <paulsson at linux.vnet.ibm.com> wrote:
>
> Hi,
>
> I would like to ask what IssueWidth and NumMicroOps refer to in MachineScheduler, just to be 100% sure what the intent is.
> Are we modeling the decoder phase or the execution stage?
>
> Background:
>
> First of all, there seem to be different meanings of
2020 Sep 15
2
[EXTERNAL] Re: Simulation of load-store forwarding with MI scheduler on AArch64
Thanks for the prompt response, Andy.
This will work for cases where the address is not modified. However, it doesn’t seem to work for pre/post-increment loads and stores.
Consider data to address forwarding:
$x0 = ldr x0, [x1]
$x0, $x2 = ldr x2, [x0, 16]!
The second instruction will have its own latency for address modification ($x0 register). So I don’t see how we can use ReadAdr stuff
here. May be
2011 Nov 30
3
[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg
On 30.11.2011, at 08:33, Duncan Sands wrote:
> Hi Jan,
>
>> if I compile with dragonegg and -march=native I get this message:
>> 'bdver1' is not a recognized processor for this target (ignoring processor)
>
> this is coming directly from LLVM which doesn't know about bulldozer yet.
>
>> Is there any plan to support this cpu ?
>
> I don't