similar to: [LLVMdev] SchedMachineModel clarifications

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] SchedMachineModel clarifications"

2013 Nov 21
0
[LLVMdev] SchedMachineModel clarifications
Dear All, Attached files is related to the changes made to add the Schedmodel for a AMD bulldozer target, Please note that , the model is incomplete but has some of the valuables features implemented. Request to the group or someone from AMD for the comments on the implementation. Thanks ~umesh On Wed, Nov 13, 2013 at 8:14 PM, Umesh Kalappa <umesh.kalappa0 at gmail.com>wrote: >
2013 Nov 22
0
[LLVMdev] [PATCH] Bulldozer SchedMachineModel
Tom , Thank you for correcting me here , All , Please review the changes made and is it ok to commit ?? Thanks ~Umesh On Thu, Nov 21, 2013 at 11:47 PM, Tom Stellard <tom at stellard.net> wrote: > Hi Umesh, > > You should send patches to llvm-commits at cs.uiuc.edu, also each patch > should be its own plain-text attachment. > > -Tom > > On Thu, Nov 21, 2013 at
2013 Nov 22
2
[LLVMdev] SchedMachineModel clarifications
If you haven't found it yet, the last public AMD Software Optimization Guide for Family 15h is here: http://developer.amd.com/wordpress/media/2012/03/47414_15h_sw_opt_guide.pdf This one describes both Bulldozer and Piledriver processors. Chapter 2 will given an overview of the Microarchitecture and Appendix B gives some additional details on which pipes are used for where. I haven't yet
2013 Nov 22
1
[LLVMdev] SchedMachineModel clarifications
I made a quick cross check with information in the SWOG (Software Optimization Guide). The port assignments look consistent. A few of the latency values are slightly different from the SWOG, e.g. WriteFRcp --> 6, WriteFSqrt --> 29 and WriteCvt* --> 4 seem to be suggested instead. Others are in better position to describe how to use llvm performance framework. --mev, Mike Vermeulen
2013 Nov 22
0
[LLVMdev] SchedMachineModel clarifications
Hi Mike, Thank you for the link and my bad last mail has the old patch file. Please have look at the attached patch file herewith,which has the latest changes. i'm new to llvm testing framework and cross compilation as such ,Please can you through some lights like references etc ,Which states that how can i cross compile the llvm for Bulldozer and run the performance test against my
2015 Nov 16
3
DFAPacketizer, Scheduling and LoadLatency
I'm unclear how does DFAPacketizer and the scheduler know a given instruction is a load. Here is what I'm talking about Let's assume my VLIW target is described as follows: def MyTargetItineraries : ProcessorItineraries<[Slot0, Slot1], [], [ .............................. InstrItinData<RI, [InstrStage<1, [Slot0, Slot1]>]>,
2016 May 13
2
A question about AArch64 Cortex-A57 subtarget definition
Hello everybody, I'm reading the .td files defining the Cortex-A57 processor, which is a subtarget of AArch64 target, and there is something confusing me in the `AArch64SchedA57.td` file. In the top of `AArch64SchedA57.td`, various processor resource are defined, as follows ``` def A57UnitB : ProcResource<1>; // Type B micro-ops def A57UnitI : ProcResource<2>; // Type
2018 May 09
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi, I would like to ask what IssueWidth and NumMicroOps refer to in MachineScheduler, just to be 100% sure what the intent is. Are we modeling the decoder phase or the execution stage? Background: First of all, there seems to be different meanings of "issue" depending on which platform you're on:
2016 Dec 16
1
help/hints/suggestions/tips please: how to give _generic_ compilation for a particular ISA a non-zero LoopMicroOpBufferSize?
Dear all, Some benchmarking experimentation I`ve done recently -- all on AArch64 -- has shown that it might be beneficial for all AArch64 targets to have a positive LoopMicroOpBufferSize, whereas the default that applies to all ISAs seems to be zero. Although I`ve tried going as far down the rabbit hole as I can, I haven`t found a way to set DefaultLoopMicroOpBufferSize on a per-ISA basis or
2016 Apr 26
3
How to get started with instruction scheduling? Advice needed.
Hi Phil. You more or less answered your own question, but let me give you some more info. Maybe it is of use. >From what I understand the SchedMachineModel is the future, although it is not as powerful as itineraries at present. The mi-scheduler is mostly developed around out-of-orders cores, I believe (I love to hear arguments on the contrary). Some of the constraints that can be found in
2016 Apr 20
2
How to get started with instruction scheduling? Advice needed.
So if I use the SchedMachineModel method, can I just skip itineraries? Phil On Wed, Apr 20, 2016 at 12:29 PM, Sergei Larin <slarin at codeaurora.org> wrote: > Target does make a difference. VLIW needs more hand-holding. For what you > are describing it should be fairly simple. > > > > Best strategy – see what other targets do. ARM might be a good start for > generic
2013 Apr 30
1
[LLVMdev] Instruction Scheduling - migration from v3.1 to v3.2
On Apr 26, 2013, at 3:53 AM, Martin J. O'Riordan <Martin.ORiordan at movidius.com> wrote: > I am migrating the llvm/clang derived compiler for our processor from the > v3.1 to v3.2 codebase. This has mostly gone well except that instruction > latency scheduling is no longer happening. > > The people who implemented this previously sub-classed 'ScheduleDAGInstrs'
2015 Nov 07
2
Is there a way to convert between SchedMachineModel and Itineraries?
Is there a way to convert between SchedMachineModel and Itineraries? I was trying to write a very simple VLIW packetizer (Hexagon was my starting point). It turns out that current DFAPacketizer is using itineraries, but my schedule is based on SchedMachineModel (I was recommended to use it since the itineraries are being phased out). I was wondering if there is an automated tool that would
2018 Apr 05
1
A9 Scheduler
Hi, I am having some trouble understanding the scheduling scheme for the C-A9. Looking at the ARMScheduleA9.td file I find this line that overrides the target SchedWrite with processor specific latencies. def : SchedAlias<WriteALU, A9WriteALU>; However, in this same file, I find the lines presented below, which are mapping the SchedReadWrite to, for example, the ANDri instruction. //
2020 Sep 14
2
Simulation of load-store forwarding with MI scheduler on AArch64
Hi list, Is it possible to simulate load to store forwarding on aarch64 with MI scheduling model on AArch64? For instance $x0 data latency in the example below should be 1 cycle ldr $x0, [$x1] str $x0, [$x2] But it should be 4 cycles if we have another instruction: ldr $x0, [$x1] add $x0, $x0, 4 For ALU instructions it’s possible to use either ReadAdvance or SchedReadAdvance, but I don’t see
2015 Nov 09
2
Is there a way to convert between SchedMachineModel and Itineraries?
----- Original Message ----- > From: "Rail Shafigulin via llvm-dev" <llvm-dev at lists.llvm.org> > To: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Monday, November 9, 2015 10:09:07 AM > Subject: Re: [llvm-dev] Is there a way to convert between SchedMachineModel and Itineraries? > > > Anybody? Does anyone at all know how to do it? There is
2019 Sep 10
2
MachineScheduler not scheduling for latency
Hi Andy, Thanks for the explanations. Yes AMDGPU is in-order and has MicroOpBufferSize = 1. Re "issue limited" and instruction groups: could it make sense to disable the generic scheduler's detection of issue limitation on in-order CPUs, or on CPUs that don't define instruction groups, or some similar condition? Something like: --- a/lib/CodeGen/MachineScheduler.cpp +++
2018 May 09
0
[MachineScheduler] Question about IssueWidth / NumMicroOps
> On May 9, 2018, at 9:43 AM, Jonas Paulsson <paulsson at linux.vnet.ibm.com> wrote: > > Hi, > > I would like to ask what IssueWidth and NumMicroOps refer to in MachineScheduler, just to be 100% sure what the intent is. > Are we modeling the decoder phase or the execution stage? > > Background: > > First of all, there seems to be different meanings of
2020 Sep 15
2
[EXTERNAL] Re: Simulation of load-store forwarding with MI scheduler on AArch64
Thanks for prompt response, Andy This will work for cases when address is not modified. However this doesn’t seem to work for pre/post increment load stores. Consider data to address forwarding: $x0 = ldr x0, [x1] $x0, $x2 = ldr x2, [x0, 16]! The second instruction will have it’s own latency for address modification ($x0 register). So I don’t see how we can use ReadAdr stuff here. May be
2011 Nov 30
3
[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg
On 30.11.2011, at 08:33, Duncan Sands wrote: > Hi Jan, > >> if I compile with dragonegg and -march=native I get this message: >> 'bdver1' is not a recognized processor for this target (ignoring processor) > > this is coming directly from LLVM which doesn't know about bulldozer yet. > >> Is there any plan to support this cpu ? > > I don't