Displaying 20 results from an estimated 400 matches similar to: "DFAPacketizer, Scheduling and LoadLatency"
2015 Nov 17
2
DFAPacketizer, Scheduling and LoadLatency
> In particular, the LoadLatency is used in defaultDefLatency:
>
> /// Return the default expected latency for a def based on it's opcode.
> unsigned TargetInstrInfo::defaultDefLatency(
> const MCSchedModel &SchedModel, const MachineInstr *DefMI) const {
> if (DefMI->isTransient())
> return 0;
> if (DefMI->mayLoad())
> return
2016 Jan 06
2
DFAPacketizer, Scheduling and LoadLatency
On Tue, Nov 17, 2015 at 11:15 AM, Krzysztof Parzyszek <
kparzysz at codeaurora.org> wrote:
> On 11/17/2015 12:26 PM, Rail Shafigulin wrote:
>
>>
>> I tried setting
>> let mayLoad = 1 {
>> class InstrLD .... {
>> }
>> }
>>
>> But that didn't seem to work. When I looked at the debug output the
>> latency for the load
2013 Apr 30
1
[LLVMdev] Instruction Scheduling - migration from v3.1 to v3.2
On Apr 26, 2013, at 3:53 AM, Martin J. O'Riordan <Martin.ORiordan at movidius.com> wrote:
> I am migrating the llvm/clang derived compiler for our processor from the
> v3.1 to v3.2 codebase. This has mostly gone well except that instruction
> latency scheduling is no longer happening.
>
> The people who implemented this previously sub-classed 'ScheduleDAGInstrs'
2015 Oct 15
3
what can cause a "CPU table is not sorted" assertion
I'm trying to create a simplified 2 slot VLIW from an OR1K. The codebase
I'm working with is here <https://github.com/openrisc/llvm-or1k>. I've
created an initial MyTargetSchedule.td
def MyTargetModel : SchedMachineModel {
// HW can decode 2 instructions per cycle.
let IssueWidth = 2;
let LoadLatency = 4;
let MispredictPenalty = 16;
// This flag is set to allow the
2013 Feb 11
2
[LLVMdev] DFAPacketizer
Jonas,
At this point, the DFA packetizer models a simple VLIW architecture and
does not accommodate multiple stages. That's the reason for the behavior
you're seeing.
-Anshu
---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
*From:*llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
*On Behalf Of *Jonas
2013 Feb 12
0
[LLVMdev] DFAPacketizer
Hi,
I looked a bit through the mail archives, and found this question answered in Oct 2011 (see below). It is interesting to find this in the ARM backend, considering your answer. Can you give more information about for example is this a temporary deficiency in the DFAPacketizer? What is the IIC_iMOVi itinerary doing below?
Thanks,
Jonas
Thu Oct 6 15:11:25 CDT 2011:
Hello Hal.
> Is there
2013 Feb 12
2
[LLVMdev] DFAPacketizer
Hi Jonas,
> It is interesting to find this in the ARM backend, considering your
answer.
The ARM backend doesn't use the DFA packetizer. It's only used by
Hexagon. At this point, there is no plan to address thisin the DFA
packetizer since none of the supported targets needthe functionality.
Thanks
-Anshu
---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
2013 Feb 18
0
[LLVMdev] DFAPacketizer
Hi Anshu,
Would there be any interest in extending this algorithm to handling more extensive models, such as VLIW scheduling based on FU's and bundle space... ie handle multiple stages ?
I might do it and commit, if there is acceptance and guidance...
Jonas
________________________________
From: Anshuman Dasgupta [mailto:adasgupt at codeaurora.org]
Sent: Tuesday, February 12, 2013 4:47 PM
2016 Apr 20
2
How to get started with instruction scheduling? Advice needed.
So if I use the SchedMachineModel method, can I just skip itineraries?
Phil
On Wed, Apr 20, 2016 at 12:29 PM, Sergei Larin <slarin at codeaurora.org>
wrote:
> Target does make a difference. VLIW needs more hand-holding. For what you
> are describing it should be fairly simple.
>
>
>
> Best strategy – see what other targets do. ARM might be a good start for
> generic
2016 Apr 26
3
How to get started with instruction scheduling? Advice needed.
Hi Phil.
You more or less answered your own question, but let me give you some more info. Maybe it is of use.
>From what I understand the SchedMachineModel is the future, although it is not as powerful as itineraries at present. The mi-scheduler is mostly developed around out-of-orders cores, I believe (I love to hear arguments on the contrary). Some of the constraints that can be found in
2012 Apr 19
0
[LLVMdev] Target Dependent Hexagon Packetizer patch
Sure I will split it and put it in two patches.
Give me few hours. I need to test those patches.
Sirish
On 4/19/2012 8:40 AM, Tom Stellard wrote:
> On Wed, Apr 18, 2012 at 11:18:05PM -0500, Sirish Pande wrote:
>> Hi,
>>
>> Here's a patch for Hexagon Packetizer for review. This patch does
>> not yield any warnings.
>>
> Would it be possible to split this
2013 Feb 11
0
[LLVMdev] DFAPacketizer
Hi,
I am having problems writing the ProcessorItineraries list. As instructions on my VLIW target have varying size I want to model both cpu units and bundle bits as FUs. The following does not work, to my surprise:
InstrItinData<ALU, [InstrStage<1, [BITS1,BITS2, BITS3, BITS4], 0>,
InstrStage<1, [ALU1, ALU2]>]>
I want to express that there are
2016 Nov 27
5
Extending Register Rematerialization
Hello LLVM Developers,
We are working on extending currently available register rematerialization
to include cases where sequence of multiple instructions is required to
rematerialize a value.
We had a discussion on this in community mailing list and link is here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777
>From the above discussion and studying the code we
2015 Jan 08
4
[LLVMdev] Machine LICM and cheap instructions?
Hi everyone,
The MachineLICM pass has a heuristic such that, even in low-register-pressure situations, it will refuse to hoist "cheap" instructions out of loops. By default, when an itinerary is available, this means that all of the defined operands are available in at most 1 cycle. ARM overrides this, and provides this more-customized definition:
bool ARMBaseInstrInfo::
2011 Oct 22
0
[LLVMdev] Instruction Scheduling Itineraries
On Oct 21, 2011, at 12:15 AM, James Molloy wrote:
> Hi Andy,
>
> Could you describe how this would be done? In the current ARM itineraries
> (say C-A9 for example), the superscalar issue stage is modelled as taking 1
> cycle. If it were to take 2 cycles instead, as far as I can tell the hazard
> analyser would stall because both FU's would be acquired.
>
> I would
2016 Jun 06
2
Instruction Itineraries: question about operand latencies
In our architecture loads from certain memory locations take a long time to
complete (on the order of 150 clock cycles). Since we don't have a way to
tell at compile time if the address being loaded from lies in slow or fast
memory, I've gone ahead and made all of the load numbers high, such as:
InstrItinData< II_LOAD1, [InstrStage<150, [AGU]>]>,
However, I see that
2019 Sep 10
2
MachineScheduler not scheduling for latency
Hi Andy,
Thanks for the explanations. Yes AMDGPU is in-order and has
MicroOpBufferSize = 1.
Re "issue limited" and instruction groups: could it make sense to
disable the generic scheduler's detection of issue limitation on
in-order CPUs, or on CPUs that don't define instruction groups, or
some similar condition? Something like:
--- a/lib/CodeGen/MachineScheduler.cpp
+++
2020 Jun 18
2
[ARM] Thumb code-gen for 8-bit imm arguments results in extra reg copies
On Tue, 16 Jun 2020 at 15:47, Tim Northover <t.p.northover at gmail.com> wrote:
>
> On Tue, 16 Jun 2020 at 10:23, Prathamesh Kulkarni via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > (b) Modifies RegisterCoalescer::reMaterializeTrivialDef and
> > TargetInstrInfo::isReallyTriviallyReMaterializableGeneric to check
> > for single live def, instead of
2017 Jun 29
2
Ok with mismatch between dead-markings in BUNDLE and bundled instructions?
> On Jun 28, 2017, at 5:10 PM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Oh wait, vreg1 is indeed used.
> Yeah, having a dead flag here sounds wrong.
I mean on the instruction itself.
On the bundle, that’s debatable. That would fit the semantic “if no side effect you can kill it” (here there is side effect, we define other vregs).
>
>> On
2016 Jun 08
2
Instruction Itineraries: question about operand latencies
I overrode getInstrLatency and did some printing to see what is available
there. It looks like the registers are still virtual at that point when
getInstrLatency is called - is that correct? (we needed to make some
decisions based on actual registers that have been assigned since some
registers are reserved as address space pointers and we could vary the
latency based on which address space