Displaying 20 results from an estimated 200 matches similar to: "Fwd: MachineScheduler not scheduling for latency"
2019 Sep 10
2
MachineScheduler not scheduling for latency
Hi Andy,
Thanks for the explanations. Yes AMDGPU is in-order and has
MicroOpBufferSize = 1.
Re "issue limited" and instruction groups: could it make sense to
disable the generic scheduler's detection of issue limitation on
in-order CPUs, or on CPUs that don't define instruction groups, or
some similar condition? Something like:
--- a/lib/CodeGen/MachineScheduler.cpp
+++
2015 Mar 27
2
[LLVMdev] Question about load clustering in the machine scheduler
On Thu, Mar 26, 2015 at 11:50:20PM -0700, Andrew Trick wrote:
>
> > On Mar 26, 2015, at 7:36 PM, Tom Stellard <tom at stellard.net> wrote:
> >
> > Hi,
> >
> > I have a program with over 100 loads (each with a 10 cycle latency)
> > at the beginning of the program, and I can't figure out how to get
> > the machine scheduler to intermix ALU
2017 May 16
2
Bug in TableGen RegisterBankEmitter
On 05/16/2017 11:57 AM, Daniel Sanders wrote:
>> If that's right, one possible fix would be to rename some of the subregister indices but that's likely to be quite painful. I'll have a think and see if I can come up with something nicer.
>
> I haven't been able to come up with a better answer for this, just an alternate choice as to where the complexity is. If we were
2016 Mar 28
0
RFC: atomic operations on SI+
On Fri, Mar 25, 2016 at 02:22:11PM -0400, Jan Vesely wrote:
> Hi Tom, Matt,
>
> I'm working on a project that needs a few coherent atomic operations (HSA
> mode: load, store, compare-and-swap) for std::atomic_uint in HCC.
>
> The attached patch implements atomic compare and swap for SI+
> (untested). I tried to stay within what was available, but there are
> a few issues
2019 Sep 09
2
LiveInterval error with 2 dead defs
Hi,
I’m hitting a machine verifier error in a trivial testcase which I don’t understand. There are 2 dead defs of the same register:
---
name: multiple_connected_compnents_dead
tracksRegLiveness: true
body: |
bb.0:
dead %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
dead %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
...
The live intervals look OK to me with 1 valno
2016 Mar 25
2
RFC: atomic operations on SI+
Hi Tom, Matt,
I'm working on a project that needs a few coherent atomic operations (HSA
mode: load, store, compare-and-swap) for std::atomic_uint in HCC.
The attached patch implements atomic compare and swap for SI+
(untested). I tried to stay within what was available, but there are
a few issues that I was unsure how to address:
1.) it currently uses v2i32 for both input and output. This
2017 May 10
2
Bug in TableGen RegisterBankEmitter
Hi Tom,
The output:
Added VReg_64(explicit)
Added VS_32(explicit (VS_32) VReg_64 class-with-subregs: VReg_64)
is saying that VS_32 was added because VReg_64 was explicitly specified and that while inspecting VS_32, it noticed that every register in VS_32 was a subregister of a register from VReg_64 using a single common subregister index.
I've added some more tracing to my local copy and
2018 May 09
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi,
I would like to ask what IssueWidth and NumMicroOps refer to in
MachineScheduler, just to be 100% sure what the intent is.
Are we modeling the decoder phase or the execution stage?
Background:
First of all, there seem to be different meanings of "issue" depending
on which platform you're on:
2019 Oct 07
2
LiveInterval error with 2 dead defs
The associated patch caused compilation problems on Hexagon: https://bugs.llvm.org/show_bug.cgi?id=43302
The splitting of a live interval should not be done automatically upon creation. Calling LIS->getInterval(Reg) should not go around changing the code behind the scenes.
There is already a function “splitSeparateComponents” that does that. It should be added where it’s missing.
--
2017 Nov 25
2
mischeduler (pre-RA) experiments
>
> Of course, you want to duplicate as little of the generic scheduling logic
> as you can. So I think the challenge is how to expose the
> generic scheduler's functionality as a base class or composition of
> utilities so that defining your strategy doesn't require too much
> copy-paste.
Isn't GCNMaxOccupancySchedStrategy [1] already an example of
using
2018 May 09
0
[MachineScheduler] Question about IssueWidth / NumMicroOps
> On May 9, 2018, at 9:43 AM, Jonas Paulsson <paulsson at linux.vnet.ibm.com> wrote:
>
> Hi,
>
> I would like to ask what IssueWidth and NumMicroOps refer to in MachineScheduler, just to be 100% sure what the intent is.
> Are we modeling the decoder phase or the execution stage?
>
> Background:
>
> First of all, there seem to be different meanings of
2014 Nov 02
3
[LLVMdev] "Anti" scheduling with OoO cores?
Hi Andy, Dave,
I've been doing a bit of experimentation trying to understand the
schedmodel a bit better and improving modelling of FDIV (on Cortex-A57).
FDIV is not pipelined, and blocks other FDIV operations (FDIVDrr and
FDIVSrr). This seems to be already semi-modelled, with a
"ResourceCycles=[18]" line in the SchedWriteRes for this instruction. This
doesn't seem to work (a
2015 Mar 27
2
[LLVMdev] Question about load clustering in the machine scheduler
Hi,
I have a program with over 100 loads (each with a 10 cycle latency)
at the beginning of the program, and I can't figure out how to get
the machine scheduler to intermix ALU instructions with the loads to
effectively hide the latency.
It seems the issue is with load clustering. I restrict load clustering
to 4 at a time, but when I look at the debug output, the loads are
always being
2016 Oct 21
2
Prioritizing an SDNode for scheduling
Hello.
Is there a way to specify in the back end an (ISD::INLINEASM) SDNode to be scheduled
first under all circumstances? I need to specify something like node priority to schedule
the node before all other nodes in the SelectionDAG of the basic block.
(Using chain or glue edges in order to make a node first is not a good idea, since I
am doing this at instruction selection time, on
2013 Sep 25
2
[LLVMdev] how to detect data hazard in pre-RA-sched
Hi, Andrew,
Thank you for answering my question.
What's the status of misched? Is it experimental? I found it is disabled by
default for all architectures (3.4svn). I also don't understand the
algorithm. Could you point me to more papers or text materials about your
approach? It seems that you want to balance register pressure and ILP in
misched.
On Tue, Sep 24, 2013 at 4:07 PM,
2016 Apr 26
3
How to get started with instruction scheduling? Advice needed.
Hi Phil.
You more or less answered your own question, but let me give you some more info. Maybe it is of use.
From what I understand the SchedMachineModel is the future, although it is not as powerful as itineraries at present. The mi-scheduler is mostly developed around out-of-order cores, I believe (I'd love to hear arguments to the contrary). Some of the constraints that can be found in
2016 Oct 21
3
Prioritizing an SDNode for scheduling
I probably misunderstood the question. You probably want to do this in
SelectionDAG.
On Fri, Oct 21, 2016 at 10:29 AM, Ehsan Amiri <ehsanamiri at gmail.com> wrote:
> You can do this by changing instruction scheduling heuristics. I think the
> more important question is whether this is always correct for all platforms.
>
> I don't know which scheduler you use. We use
2018 May 14
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi Andrew,
Thank you very much for the most helpful explanations! Many things could
go in as comments, if you ask me - for example:
---
> The LLVM machine model is an abstract machine.
> The abstract pipeline is built around the notion of an "issue point". This is merely a reference point for counting machine cycles.
>
>
> IssueWidth is meant to be a hard in-order
2013 Sep 26
2
[LLVMdev] how to detect data hazard in pre-RA-sched
On Wed, Sep 25, 2013 at 1:15 PM, Andrew Trick <atrick at apple.com> wrote:
>
> On Sep 24, 2013, at 7:59 PM, Liu Xin <navy.xliu at gmail.com> wrote:
>
> Hi, Andrew,
>
> Thank you for answering my question.
>
> What's the status of misched? Is it experimental? I found it is disabled
> by default for all architectures (3.4svn). I also don't understand
2017 Apr 03
2
Scheduler: modelling long register reservations?
Hello,
My out-of-tree target features some high latency instructions (let's call them FXLV). When an FXLV issues, it reserves its destination register and execution continues; if a subsequent instruction attempts to read or write that register, the pipeline will stall until the FXLV completes. I have attempted to encode this constraint in the machine scheduler (excerpt at bottom of email).