search for: hardwareloops

Displaying 14 results from an estimated 14 matches for "hardwareloops".

Did you mean: hardwareloop
2020 May 01
3
LV: predication
Hello, We are working on predication for our vector extension (MVE). Since quite a few people are working on predication and different forms of it (e.g. SVE, RISC-V, NEC), I thought I would share what we would like to add to the loop vectoriser. Hopefully it's just a minor one and not intrusive, but could be interesting and useful for others, and feedback on this is welcome of course. TL;DR:
2020 Aug 07
2
Branches which return values in SelectionDAG
...temZ's 'BRCT', which takes a register, decrements it, and branches if the register is nonzero. I saw that the LLVM backend for SystemZ generates the instruction in a MachineFunctionPass as part of a pass intended to eliminate or combine compares. I then looked at ARM, where it uses the HardwareLoops pass first, and then a combine that occurs in the ARM ISel stage. It replaces branch instructions with special 'WLS' and 'LE' nodes that are custom selected into t2WhileLoopStart and t2LoopEnd pseudo instructions with isBranch and isTerminator set. These pseudo instructions are fina...
2020 May 01
5
LV: predication
...iddle of nowhere. I do see that point. But is that also not the beauty of it? It just sits in the preheader, if gets removed, then so be it. And if it not recognised, then also no harm done? > Probably the simplest path to get this working is to derive the number of elements in the backend (in HardwareLoops, or your tail predication pass). You should be able to figure it from the masks used in the llvm.masked.load/store instructions in the loop. This is what we are currently doing and works excellent for simpler cases. For the more complicated cases that we now what to handle as well, the pattern mat...
2020 May 20
2
LV: predication
...e of nowhere. I do see that point. But is that also not the beauty of it? It just sits in the preheader, if gets removed, then so be it. And if it not recognised, then also no harm done? > Probably the simplest path to get this working is to derive the number of elements in the backend (in HardwareLoops, or your tail predication pass). You should be able to figure it from the masks used in the llvm.masked.load/store instructions in the loop. This is what we are currently doing and works excellent for simpler cases. For the more complicated cases that we now what to handle as well, the pattern m...
2020 May 21
2
LV: predication
...e of nowhere. I do see that point. But is that also not the beauty of it? It just sits in the preheader, if gets removed, then so be it. And if it not recognised, then also no harm done? > Probably the simplest path to get this working is to derive the number of elements in the backend (in HardwareLoops, or your tail predication pass). You should be able to figure it from the masks used in the llvm.masked.load/store instructions in the loop. This is what we are currently doing and works excellent for simpler cases. For the more complicated cases that we now what to handle as well, the pattern m...
2020 Sep 09
2
[EXTERNAL] RE: Machinepipeliner interface. shouldIgnoreForPipelining, actually not ignoring.
...; https://reviews.llvm.org/D53005, but it seems to have gone stale. I’m > also curious about customizing the expander, since we too have some ways to > make the prolog and epilog more efficient. > > > > My current solution hides the PHI and IndVar update instructions added by > HardwareLoops inside the branch and rematerializes them after the stock > MachinePipeliner runs. Not having to do that would be great, but now that > it’s implemented, I think I can stop bothering you guys for historical > data! J > > > > Thanks again! > > > > JB > > > &...
2020 Sep 07
2
[EXTERNAL] RE: Machinepipeliner interface. shouldIgnoreForPipelining, actually not ignoring.
Hi James, Having not worked on this for circa one year I've gone and refreshed my memory. We have a pretty capable implementation of swing modulo scheduling downstream, distinct from the MachinePipeliner implementation. Historically, MachinePipeliner had very tight coupling between the finding of a suitable schedule and emitting the code that adheres to that schedule. I spent quite a bit of
2020 May 19
2
LV: predication
Invitation accepted, I am happy to help out with reviews, like I did with the previous VP patches. And of course agreed that things should be well defined, and that we shouldn't paint ourselves in a corner, but I don't think that this is the case. And it's not that I am in a rush, but I don't think this change needs to be predicated on a big change landing first like the LV
2019 Jul 11
4
llvm.set.loop.iterations
After playing a bit with the newly introduced hardware loop framework I realize that the llvm.set.loop.iterations intrinsic takes as argument the number of iterations the loop will execute. In fact it goes all the way to, on IR, insert an addition of constant 1 to the number of taken backedges returned by SCEV. If the machine instruction realizing the loop is interested in the number of
2020 May 18
2
LV: predication
> You have similar problems with https://reviews.llvm.org/D79100 The new revision D79100<https://reviews.llvm.org/D79100> solves your comment 1), and I don't think your comments2) and 3) apply as there are no vendor specific intrinsics involved at all here. Just to quickly discuss the optimisation pipeline, D79100<https://reviews.llvm.org/D79100> is a small extension for the
2020 May 04
3
LV: predication
...preheader, if gets removed, then so be it. And if it not recognised, then also no harm done? The harm comes if the intrinsic ends up with the wrong value, or attached to the wrong loop. > Probably the simplest path to get this working is to derive the number of elements in the backend (in HardwareLoops, or your tail predication pass). You should be able to figure it from the masks used in the llvm.masked.load/store instructions in the loop. This is what we are currently doing and works excellent for simpler cases. For the more complicated cases that we now what to handle as well, the pattern m...
2020 May 19
3
LV: predication
Hi Simon, Thanks for reposting the example, and looking at it more carefully, I think it is very similar to my first proposal. This was met with some resistance here because it dumps loop information in the vector preheader. Doing it this early, we want to emit this in the vectoriser, puts a restriction on (future) optimisations that transform vector loops to honour/update/support this intrinsic
2020 May 18
2
LV: predication
Hi, I abandoned that approach and followed Eli's suggestion, see somewhere earlier in this thread, and emit an intrinsic that represents/calculates the active mask. I've just uploaded a new revision for D79100 that implements this. Cheers. ________________________________ From: Simon Moll <Simon.Moll at EMEA.NEC.COM> Sent: 18 May 2020 13:32 To: Sjoerd Meijer <Sjoerd.Meijer at
2020 Sep 03
1
[EXTERNAL] RE: Machinepipeliner interface. shouldIgnoreForPipelining, actually not ignoring.
Hi James, Adding Hendrik, who has taken over ownership of the downstream code involved. I can also add background about the rationale, of that helps? It was added to ignore induction variable update code (scalar code) that is rewritten when we unroll / peel the prolog epilog anyway. Targets like Hexagon or PPC with dedicated loop control instructions for pipelined loops don't need this, but