thr3ads.net - similar to: "Loop Distribution pass"

Displaying 20 results from an estimated 10000 matches similar to: "Loop Distribution pass"

2018 Sep 13

Loop Distribution pass

Jonas/Renato, >I think it's mostly about the success rate, given it's too conservative. But in the past 2 years, improvements in (and around) the LV have been slowed down a bit due to the move to VPlan. It wasn't our intention to slow down LV improvements, but if the project ended up causing other developers to take a wait-and-see stance, that's an inevitable side effect

2018 Sep 13

Loop Distribution pass

Sorry for jumping from http://lists.llvm.org/pipermail/llvm-dev/2018-September/125853.html but this is relevant. Sorry for not responding to that sooner. I was thinking about a longer reply, and time flew by too quickly. >But, as I said back then, before we do so, we need to understand exactly where to put it. That will depend on what other passes will actually use it and if we want it

2018 Sep 13

Loop Distribution pass

>I'm just curious as to which concrete passes would benefit sooner. This all depends on those who are working on other loop xforms, since we currently don't have the bandwidth to drive those kinds of changes into other loop xforms. That's why when this line of questions pops up, I offer to work together. Short of that, the best we can proactively do is to make vectorizer analyses

2017 Dec 06

[LV][VPlan] Status Update on VPlan ----- where we are currently, and what's ahead of us

Status Update on VPlan ---- where we are currently, and what's ahead of us
==========================================================
Goal:
-----
Extending Loop Vectorizer (LV) such that it can handle outer loops, via uplifting its infrastructure with VPlan. The goal of this status update is to summarize the progress and the future steps needed.
Background:
-----------
This is related to

2018 Jan 06

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

Amara, >I support this direction Thanks for the support. >but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early bailouts in some form. It's not like I have specific application code in

2018 Jan 05

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

All, I'm trying to refactor LoopVectorize such that it conforms better to the VPlan vision going forward (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now moved to VPlan.h, and I have a patch under review to move LoopVectorizationPlanner class out of LoopVectorize.cpp (https://reviews.llvm.org/D41420). The next thing I'm working on is

2019 Jun 19

live-in lists during register allocation

Hi, I wonder if live-in lists can be trusted to be accurate during register allocation / foldMemoryOperandImpl(). On SystemZ, a register-register compare which has one of the registers spilled can fold that reload into a register-memory compare instruction. In order to do this also with the first (LHS) register, the operands must be swapped. This can only reasonably be done when all the CC

2018 Jan 05

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

> On 5 Jan 2018, at 21:01, Saito, Hideki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
> All,
>
> I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward
> (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now
> moved to VPlan.h, and I have a patch under review

2018 Jan 07

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

On 01/05/2018 06:28 PM, Saito, Hideki wrote:
> Amara,
>
>> I support this direction
> Thanks for the support.
>
>> but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early bailouts in

2018 May 30

InstrEmitter::CreateVirtualRegisters handling of CopyToReg

Hi, I wonder if anyone has any comment on a patch like:
diff --git a/lib/CodeGen/SelectionDAG/InstrEmitter.cpp b/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
index 65ee3816f84..4780f6f0e59 100644
--- a/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
+++ b/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
@@ -243,18 +243,21 @@ void InstrEmitter::CreateVirtualRegisters(SDNode *Node,
 if (!VRBase &&

2018 Jan 09

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

Thanks, Hal. I plan to post a patch w/o HW Legality early bailout first. That should enable further discussion on where the initial very high cost for "illegal masked load/store/gather/scatter" should be coming from --- like should LoopVectorize provide it? Or should it be provided by TTI? I prefer the latter (TTI) but the first revision of the patch will intentionally do the former

2018 May 15

[MachineScheduler] Question about IssueWidth / NumMicroOps

Hi Andy,
>> Right now it seems that BeginGroup/EndGroup is only used by SystemZ,
>> or? I see they are used in checkHazard(), which I actually don't see
>> as helpful during pre-RA scheduling for SystemZ. Could this be made
>> optional, or perhaps only done post-RA if target does post-RA
>> scheduling?
SystemZ does post-RA scheduling to manage decoder

2016 Apr 27

phys reg liveness during foldMemoryOperandImpl()

I would expect that it shouldn't be too hard to pass around a reference to LiveIntervalAnalysis*. Patches welcome :)
- Matthias
> On Apr 27, 2016, at 11:38 AM, Jonas Paulsson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ping.
>
> Either this can be implemented easily, or the current SystemZ optimization LAY -> AGSI in foldMemoryOperandImpl() should be

2016 Apr 15

phys reg liveness during foldMemoryOperandImpl()

Hi, I wonder if it would be possible to extend foldMemoryOperandImpl() so that targets can check for liveness of a particular phys reg? The case I am thinking of is when the new instruction clobbers the CC reg, while the old one did not. In this case the new instruction can only become a replacement if the CC reg is known to be dead. The idea is that liveness of phys regs should be available

2016 Mar 28

LoopStrengthReduce.cpp

Hi, I am looking for a way to rewrite induction variables to use an addition of -1 whenever possible (and not otherwise unprofitable). This is needed to utilize hardware loop instructions, which are present on SystemZ (branch on count). Later in the backend, an 'add -1; compare w/ 0; jne 0'-sequence can be replaced with a brct instruction. I could not find any way in the LSR pass to
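As a rough illustration of the loop shape this post describes (not code from the thread), the sketch below contrasts an up-counting loop with one driven by a separate counter that is decremented by 1 and tested against zero. The function names are hypothetical, and whether a compiler actually emits a branch-on-count instruction for the second form depends on the target and the optimization pipeline.

```cpp
#include <cstddef>

// Up-counting form: the exit test compares the induction variable
// against the trip count n.
long sum_count_up(const int *a, std::size_t n) {
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}

// Down-counting form: a dedicated counter starts at n, is decremented
// by 1 each iteration (an "add -1"), and the loop exits when it reaches 0.
// This is the 'add -1; compare w/ 0; branch' shape the post refers to.
long sum_count_down(const int *a, std::size_t n) {
    long sum = 0;
    const int *p = a;
    for (std::size_t remaining = n; remaining != 0; --remaining)
        sum += *p++;
    return sum;
}
```

Both functions compute the same result; only the form of the loop-control arithmetic differs.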

2018 May 14

[MachineScheduler] Question about IssueWidth / NumMicroOps

Hi Andrew,
Thank you very much for the most helpful explanations! Many things could go in as comments, if you ask me - for example:
---
> The LLVM machine model is an abstract machine.
> The abstract pipeline is built around the notion of an "issue point". This is merely a reference point for counting machine cycles.
>
>
> IssueWidth is meant to be a hard in-order

2018 Sep 04

LoopVectorizer: shufflevectors

Hi, I have been discussing a bit with Sanjay on how to handle the poor sequences of shufflevector instructions produced by the loop vectorizer and he suggested we bring this up on llvm-dev. I have run into this in the past also and it surprised me to again see (on SystemZ) that the vectorized loop did many seemingly unnecessary shuffles. In this case (see

2017 Nov 23

mischeduler (pre-RA) experiments

Hi, I have been experimenting for a while with the tryCandidate() method of the pre-RA mischeduler. I have by chance found some parameters that give quite good results on benchmarks on SystemZ (on average 1% improvement, some improvements of several percent and very few regressions). Basically, I add a "latency heuristic boost" just above processor resources checking:

2017 Aug 17

callee saved regs list

Hi, It has been discovered recently that the SystemZ backend needs super-regs added to the callee saved regs list, like:
 def CSR_SystemZ : CalleeSavedRegs<(add (sequence "R%dD", 6, 15),
- (sequence "F%dD", 8, 15))>;
+ [R6Q, R8Q, R10Q, R12Q, R14Q],
+

2017 Mar 24

SLP regression on SystemZ

Hi, I have come across a major regression resulting from SLP vectorization (+18% on SystemZ, just for enabling SLP). This all relates to one particular very hot loop.
Scalar code:
%conv252 = zext i16 %110 to i64
%conv254 = zext i16 %111 to i64
%sub255 = sub nsw i64 %conv252, %conv254
... repeated
SLP output:
%101 = zext <16 x i16> %100 to <16 x i64>
%104 = zext
