Displaying 20 results from an estimated 6000 matches similar to: "[RFC] Intrinsics for Hardware Loops"
2019 Jul 15
2
Tail-Loop Folding/Predication
I am looking for feedback on adding support for a new loop pragma to Clang/LLVM.
With "#pragma tail_predicate" the idea would be to indicate that a loop
epilogue/tail can, or should, be folded into the main loop. I see two use
cases for this pragma.
First, this could be interesting for the vectorizer. It currently supports tail
folding by masking all loop instructions/blocks, but does this
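For illustration, a minimal sketch of what fold-tail-by-masking typically produces inside the vector loop body; the value names, the fixed vector width of 4, and the surrounding values %index, %btc.splat, %val and %dst are illustrative assumptions, not taken from the thread:
  ; vector induction variable for this iteration: <index, index+1, index+2, index+3>
  %iv.ins = insertelement <4 x i32> undef, i32 %index, i32 0
  %iv.splat = shufflevector <4 x i32> %iv.ins, <4 x i32> undef, <4 x i32> zeroinitializer
  %viv = add <4 x i32> %iv.splat, <i32 0, i32 1, i32 2, i32 3>
  ; lanes stay active while the element index does not exceed the backedge-taken count
  %mask = icmp ule <4 x i32> %viv, %btc.splat
  ; every load/store in the loop body is then predicated on %mask
  call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32> %val, <4 x i32>* %dst, i32 4, <4 x i1> %mask)

declare void @llvm.masked.store.v4i32.p0v4i32(<4 x i32>, <4 x i32>*, i32, <4 x i1>)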
2020 May 21
2
LV: predication
> The compare of interest is clear, I think. It compares a Vector Induction Variable with a broadcasted loop invariant value, aka the BTC. Obtaining the latter operand is the goal, clearly, but to do so, the former operand needs to be recognized as a VIV.
Yep, exactly that.
> What if this compare is not generated by LV’s fold-tail-by-masking transformation?
Not sure I completely follow
2020 May 20
2
LV: predication
Hi Ayal,
Let me start by commenting on this:
> A dedicated intrinsic that freezes the compare instruction, for no apparent reason, may potentially cripple subsequent passes from further optimizing the vectorized loop.
The point is we have a very good reason, which is that it passes the right information on to the backend, enabling optimisations rather than crippling them.
The compare
2020 Nov 06
4
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
On 11/6/20 8:49 AM, Roger Ferrer Ibáñez wrote:
Hi Sjoerd,
Trying to remember how everything fits together here, but could get.active.lane.mask not create the %mask of the VP intrinsics? Or in other words, in the vectoriser, who's producing the %mask and %evl that is consumed by the VP intrinsics?
I'm not sure what the best way would be here. Thinking about the Loop Vectorizer, I imagine
2020 Nov 06
2
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
On 11/6/20 12:39 PM, Sjoerd Meijer wrote:
Hello Simon,
Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear:
; Some examples:
; RISC-V V & VE(*):
; %mask = (splat i1 1)
; %evl = min(256, %n - %i)
; MVE/SVE :
; %mask = get.active.lane.mask(%i, %n)
; %evl = call @llvm.vscale()
; AVX:
; %mask = icmp (%i + (seq
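For illustration, a minimal sketch of how a %mask and an %evl could both be materialised and handed to a VP intrinsic; the fixed element count of 4 and the value names (%i, %n, %x, %y assumed defined elsewhere) are illustrative assumptions, not taken from the thread:
  ; mask derived from the loop bounds (the MVE/SVE-style hint)
  %mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %i, i64 %n)
  ; explicit vector length (the RISC-V V / VE-style operand): remaining elements, capped at 4
  %rem = sub i64 %n, %i
  %rem.min = call i64 @llvm.umin.i64(i64 %rem, i64 4)
  %evl = trunc i64 %rem.min to i32
  ; both are consumed by the VP intrinsic
  %sum = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %x, <4 x i32> %y, <4 x i1> %mask, i32 %evl)

declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64, i64)
declare i64 @llvm.umin.i64(i64, i64)
declare <4 x i32> @llvm.vp.add.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)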
2020 May 04
3
LV: predication
Hi Roger,
That's a good example that shows most of the moving parts involved here. In a nutshell, the difference, and what we would like to make explicit, is the vector trip count versus the scalar loop trip count. In your IR example, the loads/stores are predicated on a mask that is calculated from a splat induction variable, which is compared with the vector trip count. Illustrated with your
2020 May 18
2
LV: predication
> You have similar problems with https://reviews.llvm.org/D79100
The new revision D79100 <https://reviews.llvm.org/D79100> solves your comment 1), and I don't think your comments 2) and 3) apply, as there are no vendor-specific intrinsics involved at all here. Just to quickly discuss the optimisation pipeline, D79100 <https://reviews.llvm.org/D79100> is a small extension for the
2020 May 01
3
LV: predication
Hello,
We are working on predication for our vector extension (MVE). Since quite a few people are working on predication and different forms of it (e.g. SVE, RISC-V, NEC), I thought I would share what we would like to add to the loop vectoriser. Hopefully it's just a minor one and not intrusive, but could be interesting and useful for others, and feedback on this is welcome of course.
TL;DR:
2015 May 05
2
[LLVMdev] [AArch64] Should we restrict to the pointer type used in ldN/stN intrinsics?
Hi,
The ldN-like intrinsics (including all the ld1xN, ldN, ldNlane, ldNr, stN,
stNlane) can use any pointer type. The definitions (in IntrinsicsAArch64.td)
of such intrinsics use 'LLVMAnyPointerType', which means we can pass any
pointer type to such intrinsics.
E.g. I tried the following case, ld2.ll:
define { <4 x i32>, <4 x i32> } @test(float* %ptr) {
%vld2 = call {
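The call above is cut off; as a sketch, a complete version of such a test case could look like the following, where the intrinsic's mangling suffix (.v4i32.p0f32) is reconstructed from the overload naming convention rather than copied from the original mail:
define { <4 x i32>, <4 x i32> } @test(float* %ptr) {
  ; <4 x i32> data loaded through a float* - accepted because of LLVMAnyPointerType
  %vld2 = call { <4 x i32>, <4 x i32> } @llvm.aarch64.neon.ld2.v4i32.p0f32(float* %ptr)
  ret { <4 x i32>, <4 x i32> } %vld2
}

declare { <4 x i32>, <4 x i32> } @llvm.aarch64.neon.ld2.v4i32.p0f32(float*)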
2020 May 19
3
LV: predication
Hi Simon,
Thanks for reposting the example; looking at it more carefully, I think it is very similar to my first proposal. That proposal was met with some resistance here because it dumps loop information in the vector preheader. Doing it this early (we want to emit this in the vectoriser) puts a restriction on (future) optimisations that transform vector loops, which would have to honour/update/support this intrinsic
2020 May 18
2
LV: predication
Hi,
I abandoned that approach and followed Eli's suggestion, see somewhere earlier in this thread, and emit an intrinsic that represents/calculates the active mask. I've just uploaded a new revision for D79100 that implements this.
Cheers.
2019 Aug 26
2
SCEV related question
Here is the original C code:
void topup(int a[], unsigned long i) {
for (; i < 16; i++) {
a[i] = 1;
}
}
Here is the IR before the pass where I expect SCEV to return the trip-count value:
; Function Attrs: nofree norecurse nounwind uwtable writeonly
define dso_local void @topup(i32* nocapture %a, i64 %i) local_unnamed_addr
#0 {
entry:
%cmp3 = icmp ult i64 %i, 16
br i1
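The IR stops mid-branch; continuing from there, a sketch of how the rest of the function typically looks (block and value names are illustrative), for which one would expect SCEV to report a backedge-taken count of (15 - %i), i.e. a trip count of (16 - %i), guarded by the entry compare:
  br i1 %cmp3, label %for.body, label %for.cond.cleanup

for.body:
  %i.addr = phi i64 [ %inc, %for.body ], [ %i, %entry ]
  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.addr
  store i32 1, i32* %arrayidx, align 4
  %inc = add nuw nsw i64 %i.addr, 1
  %exitcond = icmp eq i64 %inc, 16
  br i1 %exitcond, label %for.cond.cleanup, label %for.body

for.cond.cleanup:
  ret void
}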
2020 May 04
3
LV: predication
> The harm comes if the intrinsic ends up with the wrong value, or attached to the wrong loop.
The intrinsic is marked as IntrNoDuplicate, so I wasn't worried about it ending up somewhere else. Also, it is a property of a specific loop, a tail-folded vector loop, that I think holds even after the loop is transformed. I.e., unrolling a vector loop is probably not what you want, but even if you do
2020 Nov 05
2
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
For RISC-V V and VE, being explicit about %evl is important for performance & correctness, and that is what VP does. The get.active.lane.mask intrinsic is used as a hint for the MVE and SVE backends to use hardware tail-predication (the backends reverse-engineer that hint by pattern matching for get.active.lane.mask in the mask parameter of "some" masked intrinsics). IMHO, it's more
2020 May 19
2
LV: predication
Invitation accepted; I am happy to help out with reviews, as I did with the previous VP patches.
And of course agreed that things should be well defined and that we shouldn't paint ourselves into a corner, but I don't think that this is the case. And it's not that I am in a rush, but I don't think this change needs to be predicated on a big change landing first like the LV
2020 May 01
5
LV: predication
Hi Eli,
> The problem with your proposal, as written, is that the vectorizer is producing the intrinsic. Because we don’t impose any ordering on optimizations before codegen, every optimization pass in LLVM would have to be taught to preserve any @llvm.set.loop.elements.i32 whenever it makes any change. This is completely impractical because the intrinsic isn’t related to anything
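For context, a rough sketch of where the proposal placed such a call; only the intrinsic name comes from the thread, its signature and operand here are illustrative guesses:
vector.ph:
  ; number of data elements processed by the tail-folded vector loop,
  ; emitted by the vectoriser in the preheader (hypothetical signature)
  call void @llvm.set.loop.elements.i32(i32 %n)
  br label %vector.body

declare void @llvm.set.loop.elements.i32(i32)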
2005 Mar 08
0
[LLVMdev] Recursive Types using the llvm support library
On Mon, 7 Mar 2005, John Carrino wrote:
> As far as I can tell, when you construct a type using the support
> library StructType::get, you have to pass in a list of types. How can
> you make a recursive type by passing in a pointer to the type you are
> constructing?
>
> An example where something really simple like the line below was output
> would be perfect.
>
>
2008 Sep 12
2
[LLVMdev] Order of fields and structure usage
I'd like to be able to make use of a structure type and its fields before
it is completely defined. To be specific, let me ask detailed questions
at various stages in the construction of a recursive type. I copy from
http://llvm.org/docs/ProgrammersManual.html#TypeResolve
// Create the initial outer struct
PATypeHolder StructTy = OpaqueType::get();
Is it possible to declare
2005 Mar 08
2
[LLVMdev] Recursive Types using the llvm support library
>> An example where something really simple like the line below was output
>> would be perfect.
>>
>> %struct.linked_list = type { %struct.linked_list*, %sbyte* }
>
> Use something like this:
>
> PATypeHolder StructTy = OpaqueType::get();
> std::vector<const Type*> Elts;
> Elts.push_back(PointerType::get(StructTy));
>
2015 Jan 20
2
[LLVMdev] Bug in InsertElement constant propagation?
Does anybody else have an opinion on this issue? I'm planning to submit a patch which would add a new get method for ConstantDataVector taking an ArrayRef<Constant*> and use that in the few places in constant propagation where convertToFloat is used.
Let me know if you think there is a more obvious way to do it.
Right now the only ways to create a ConstantDataVector are these methods: