Displaying 20 results from an estimated 10000 matches similar to: "machine scheduler: pre-RA bidirectional scheduling"
2017 Nov 23
3
mischeduler (pre-RA) experiments
Hi,
I have been experimenting for a while with the tryCandidate() method of the
pre-RA mischeduler. I have by chance found some parameters that give
quite good results on benchmarks on SystemZ (on average 1% improvement,
some improvements of several percent and very few regressions).
Basically, I add a "latency heuristic boost" just above processor
resources checking:
2018 May 15
1
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi Andy,
>> Right now it seems that BeginGroup/EndGroup is only used by SystemZ,
>> or? I see they are used in checkHazard(), which I actually don't see
>> as helpful during pre-RA scheduling for SystemZ. Could this be made
>> optional, or perhaps only done post-RA if target does post-RA
>> scheduling? SystemZ does post-RA scheduling to manage decoder
2018 May 14
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi Andrew,
Thank you very much for the most helpful explanations! Many things could
go in as comments, if you ask me - for example:
---
> The LLVM machine model is an abstract machine.
> The abstract pipeline is built around the notion of an "issue point". This is merely a reference point for counting machine cycles.
>
>
> IssueWidth is meant to be a hard in-order
2019 Jun 19
2
live-in lists during register allocation
Hi,
I wonder if live-in lists can be trusted to be accurate during register
allocation / foldMemoryOperandImpl().
On SystemZ, a compare register-register which has one of the registers
spilled can fold that reload into a compare register-memory instruction.
In order to do this also with the first (LHS) register, the operands
must be swapped. This can only reasonably be done when all the CC
2012 May 11
2
[LLVMdev] overlaps generation, RA crasch
Hi,
Recently on trunk, the overlaps list for a register got a dual entry on my target, which caused the RA to crash.
Reg
  Subreg1
    Subreg b
  Subreg2
    Subreg b
I have a register with two subregs that have subreg b in common. This causes the SuperReg to appear twice in the overlaps list for Subreg b.
As this causes a register allocator to crash (it evicts a register, and then increments
2012 May 11
0
[LLVMdev] overlaps generation, RA crasch
On May 11, 2012, at 6:35 AM, Jonas Paulsson <jonas.paulsson at ericsson.com> wrote:
> Hi,
>
> Recently on trunk, the overlaps list for a register got a dual entry on my target, which caused the RA to crash.
>
> Reg
>   Subreg1
>     Subreg b
>   Subreg2
>     Subreg b
>
> I have a register with two subregs that have subreg b in common. This causes the
2013 Feb 21
2
[LLVMdev] hazard scheduling nodes
Hi,
I am trying to add Hazard scheduling nodes after buildSchedGraph(), with a scheduler derived from ScheduleDAGInstrs. I get weird errors, so I wonder what I am doing wrong?
What I am doing right now is:
I have created an MI with opcode HAZARD that does not have a parent, and I create a SUnit(HazardMI). I use this one HazardMI for all hazard nodes.
I remove all edges using removePred.
I insert
2018 Sep 13
4
Loop Distribution pass
Hi,
I found, with the help of the optimization remarks, a loop that could not
be vectorized unless loop distribution was enabled. With it enabled, the loop
was in fact vectorized, with a very significant benchmark improvement (~25%).
I tried (on SystemZ) to enable this pass, and found that it only
affected a handful of files on SPEC. This means I could enable this
without worrying about any regressions on
2017 Aug 30
2
Register pressure calculation in the machine scheduler and live-through registers
> On Aug 30, 2017, at 1:43 PM, Matthias Braun <matze at braunis.de> wrote:
>
> That means you cannot use the code from RegisterPressure.{cpp|h} to compute this. The other liveness analysis we have in llvm codegen is LiveIntervals (LiveIntervalAnalysis) which gives you a list of liveness segments of a given vreg (the same representation is used in most linear scan allocators even
2013 Mar 09
0
[LLVMdev] hazard scheduling nodes
On Feb 21, 2013, at 9:11 AM, Jonas Paulsson <jonas.paulsson at ericsson.com> wrote:
> Hi,
>
> I am trying to add Hazard scheduling nodes after buildSchedGraph(), with a scheduler derived from ScheduleDAGInstrs. I get weird errors, so I wonder what I am doing wrong?
>
> What I am doing right now is:
>
> I have created an MI with opcode HAZARD that does not have
2017 Aug 17
2
reg coalescing improvements
Hi,
I am seeing cases of poorly coalesced IV updates on SystemZ:
In the final IR, it is obvious that
%R4D<def> = LA %R2D<kill>, 4, %noreg // R4 = R2 + 4
%R2D<def> = LGR %R4D<kill> // R2 = R4
could be optimized to ->
%R2D<def> = LA %R2D<kill>, 4, %noreg // R2 = R2 + 4
The reason this wasn't coalesced, is
2013 Mar 12
1
[LLVMdev] hazard scheduling nodes
Hi Andy,
The thing is that I was trying to build a sched graph in places other than these two standard scheduling passes. For instance, in pre-emit. I would like to reschedule a basic block on my VLIW target just before assembly emission.
I tried to add SUnits for hazards in an experiment, but this gave very weird errors... even while allocating extra space in SUnits vector. For some function, I
2018 May 14
0
[MachineScheduler] Question about IssueWidth / NumMicroOps
> On May 14, 2018, at 11:10 AM, Jonas Paulsson <paulsson at linux.vnet.ibm.com> wrote:
>
> Hi Andrew,
>
> Thank you very much for the most helpful explanations! Many things could go in as comments, if you ask me - for example:
>
> ---
>> The LLVM machine model is an abstract machine.
>
>> The abstract pipeline is built around the notion of an
2016 Apr 27
2
phys reg liveness during foldMemoryOperandImpl()
I would expect that it shouldn't be too hard to pass around a reference to LiveIntervalAnalysis*. Patches welcome :)
- Matthias
> On Apr 27, 2016, at 11:38 AM, Jonas Paulsson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ping.
>
> Either this can be implemented easily, or the current SystemZ optimization LAY -> AGSI in foldMemoryOperandImpl() should be
2018 May 09
2
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi,
I would like to ask what IssueWidth and NumMicroOps refer to in
MachineScheduler, just to be 100% sure what the intent is.
Are we modeling the decoder phase or the execution stage?
Background:
First of all, there seem to be different meanings of "issue" depending
on which platform you're on:
2016 Apr 15
2
phys reg liveness during foldMemoryOperandImpl()
Hi,
I wonder if it would be possible to extend foldMemoryOperandImpl() so
that targets can check for liveness of a particular phys reg?
The case I am thinking of is when the new instruction clobbers the CC
reg, while the old one did not. In this case the new instruction can
only become a replacement if the CC reg is known to be dead.
The idea is that liveness of phys regs should be available
2016 Mar 28
2
LoopStrengthReduce.cpp
Hi,
I am looking for a way to rewrite induction variables to use an addition
of -1 whenever possible (and not otherwise unprofitable). This is needed
to utilize hardware loop instructions, which are present on SystemZ
(branch on count). Later in the backend, an 'add -1; compare w/ 0; jne
0'-sequence can be replaced with a brct instruction.
I could not find any way in the LSR pass to
2017 Aug 17
3
callee saved regs list
Hi,
It was recently discovered that the SystemZ backend needs to add
super-regs to the callee saved regs list, like:
def CSR_SystemZ : CalleeSavedRegs<(add (sequence "R%dD", 6, 15),
- (sequence "F%dD", 8, 15))>;
+ [R6Q, R8Q, R10Q, R12Q, R14Q],
+
2018 May 30
2
InstrEmitter::CreateVirtualRegisters handling of CopyToReg
Hi,
I wonder if anyone has any comment on a patch like:
diff --git a/lib/CodeGen/SelectionDAG/InstrEmitter.cpp b/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
index 65ee3816f84..4780f6f0e59 100644
--- a/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
+++ b/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
@@ -243,18 +243,21 @@ void InstrEmitter::CreateVirtualRegisters(SDNode *Node,
if (!VRBase &&
2017 Mar 24
2
SLP regression on SystemZ
Hi,
I have come across a major regression resulting from SLP vectorization
(+18% on SystemZ, just for enabling SLP). This all relates to one
particular very hot loop.
Scalar code:
%conv252 = zext i16 %110 to i64
%conv254 = zext i16 %111 to i64
%sub255 = sub nsw i64 %conv252, %conv254
... repeated
SLP output:
%101 = zext <16 x i16> %100 to <16 x i64>
%104 = zext