search for: whilelo

Displaying 3 results from an estimated 3 matches for "whilelo".

2020 May 04
3
LV: predication
...u can then skip but then you lose out on it.... So, I really like this:
> If the problem is specifically figuring out the underlying element count given a predicate, maybe we could attack it from that angle? For example, introduce a special intrinsic for deriving the mask (sort of like the SVE whilelo).
That would be an excellent way of doing it and it would also map very well to MVE too, where we have a VCTP intrinsic/instruction that creates the mask/predicate (Vector Create Tail-Predicate). So I will go for this approach. Such an intrinsic was actually also proposed in Sam's original RFC...
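As a rough illustration of the idea in the snippet above (deriving the mask from an element count, and recovering the count from the mask), here is a minimal Python model. It is not LLVM or MVE code: `tail_mask`, `active_count`, and the 4-lane width are hypothetical names and values chosen for the sketch, and the "lane i is active iff i < remaining" rule is an assumed VCTP-like behaviour.

```python
# Toy model of a tail-predicate mask; all names here are illustrative only.

def tail_mask(remaining, lanes):
    """Return one bool per lane: lane i is active iff i < remaining.

    Assumed VCTP-like semantics: once `remaining` is at least `lanes`,
    every lane is active.
    """
    return [i < remaining for i in range(lanes)]


def active_count(mask):
    """Recover the element count covered by a mask (its number of set lanes)."""
    return sum(mask)


LANES = 4   # hypothetical vector width in elements
n = 10      # total number of elements to process

# One mask per vector iteration; only the last one is partial.
masks = [tail_mask(n - done, LANES) for done in range(0, n, LANES)]
```

Because each mask is a pure function of the remaining element count, the count can always be read back from the mask, which is the property the quoted suggestion relies on.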
2016 Nov 04
2
[RFC] Supporting ARM's SVE in LLVM
...The main vector body of the resulting code is one instruction longer than it would be for NEON, but no scalar tail is required and performance will scale with register length. The *seriesvector*, *shufflevector*(splat), *icmp*, *propff*, *test* sequence has been recognized and transformed into the `whilelo` instruction.
```nasm
SimpleReduction:
// BB#0:
        subs    w9, w1, #1
        b.lt    .LBB0_4
// BB#1:
        add     x9, x9, #1
        mov     x8, xzr
        whilelo p0.s, xzr, x9
        mov     z0.s, #0
.LBB0_2:
        ld1w    {z1.s}, p0/z,...
```
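The "no scalar tail" claim in the snippet above can be pictured with a small Python model of a `whilelo`-governed reduction. This is only a sketch under stated assumptions: `whilelo` and `predicated_sum` are hypothetical helper names, the predicate rule "lane i is active iff base + i < limit" is a simplified reading of the instruction that ignores unsigned wraparound, and the lane count stands in for the hardware register length.

```python
# Toy model of a predicated reduction loop; not generated from the RFC's code.

def whilelo(base, limit, lanes):
    """Model predicate: lane i is active iff base + i < limit.

    Simplification: Python integers do not wrap, so unsigned overflow
    behaviour is not modelled.
    """
    return [base + i < limit for i in range(lanes)]


def predicated_sum(data, lanes):
    """Sum `data` in chunks of `lanes` elements with no scalar remainder loop."""
    n = len(data)
    acc = [0] * lanes                 # per-lane accumulators, like a zeroed vector
    i = 0
    pred = whilelo(i, n, lanes)
    while any(pred):                  # keep going while at least one lane is active
        for lane, active in enumerate(pred):
            if active:                # inactive lanes neither load nor accumulate
                acc[lane] += data[i + lane]
        i += lanes
        pred = whilelo(i, n, lanes)   # final iteration gets a partial predicate
    return sum(acc)                   # horizontal reduction across lanes
```

The same function gives the same result for any `lanes` value, which mirrors the point that the code scales with register length: a wider vector just means fewer iterations, with the last partial chunk handled by the predicate rather than by a separate scalar loop.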
2020 May 01
5
LV: predication
Hi Eli,
> The problem with your proposal, as written, is that the vectorizer is producing the intrinsic. Because we don’t impose any ordering on optimizations before codegen, every optimization pass in LLVM would have to be taught to preserve any @llvm.set.loop.elements.i32 whenever it makes any change. This is completely impractical because the intrinsic isn’t related to anything