search for: vctp

Displaying 7 results from an estimated 7 matches for "vctp".

2020 May 04
3
LV: predication
...ing out the underlying element count given a predicate, maybe we could attack it from that angle? For example, introduce a special intrinsic for deriving the mask (sort of like the SVE whilelo). That would be an excellent way of doing it and it would also map very well to MVE too, where we have a VCTP intrinsic/instruction that creates the mask/predicate (Vector Create Tail-Predicate). So I will go for this approach. Such an intrinsic was actually also proposed in Sam's original RFC (see https://lists.llvm.org/pipermail/llvm-dev/2019-May/132512.html), but we hadn't implemented it yet. Th...
2019 Jul 15
2
Tail-Loop Folding/Predication
...orm without any predicated intrinsics: #pragma tail_predicate do { VLD(..); // some vector load intrinsic VST(..); // some vector store intrinsic .. } while (N); which can then be transformed and predication made explicit through data dependencies like so: do { mask = vctp(N); // intrinsic that generates the mask of active lanes VLD(.., mask); VST(.., mask); .. } while (N); A vector loop in this form can easily be picked up by the new hardware loop pass, and the corresponding tail-predicated hardware loop can be generated. This is only a small example,...
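The mask-generating step in the snippet above can be modelled in scalar C. This is only a sketch under stated assumptions: `model_vctp32`, `predicated_copy`, and the four-lane width are hypothetical names and choices made for illustration, not the MVE intrinsic or any code from the thread. It assumes a lane is active when its index is below the number of elements still to be processed, which is how the snippet describes the mask of active lanes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4u /* assumed vector width: four 32-bit lanes */

/* Hypothetical scalar model of a vctp-style mask.
 * Bit i of the result is set when lane i is active,
 * i.e. when i is below the number of remaining elements. */
static uint8_t model_vctp32(uint32_t remaining) {
    uint8_t mask = 0;
    for (uint32_t lane = 0; lane < LANES; ++lane) {
        if (lane < remaining) {
            mask |= (uint8_t)(1u << lane);
        }
    }
    return mask;
}

/* Copies n elements in chunks of LANES, standing in for the
 * VLD(.., mask) / VST(.., mask) pair. Inactive lanes are never
 * read or written, so no scalar tail loop is needed. */
static void predicated_copy(int32_t *dst, const int32_t *src, uint32_t n) {
    assert(n == 0 || (dst != NULL && src != NULL));
    uint32_t remaining = n;
    size_t base = 0;
    while (remaining > 0) {
        uint8_t mask = model_vctp32(remaining);
        for (uint32_t lane = 0; lane < LANES; ++lane) {
            if (mask & (1u << lane)) {
                dst[base + lane] = src[base + lane];
            }
        }
        base += LANES;
        remaining = remaining > LANES ? remaining - LANES : 0;
    }
}
```

With six elements, the first iteration runs with all four lanes active and the second with only the lowest two, so the elements past the end of the array are left untouched.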
2020 May 01
3
LV: predication
...4 x i32> @llvm.masked.load call <4 x i32> @llvm.masked.load call void @llvm.masked.store call i32 @llvm.loop.decrement.reg br i1 %12, label %.*, label %vector.body We then pick this up in our tail-predication pass, remove @llvm.set.loop.elements intrinsic, and add @vctp which is our intrinsic that generates the mask of active/inactive lanes: vector.ph: call void @llvm.set.loop.iterations.i32(i32 %5) br label %vector.body vector.body: call <4 x i1> @llvm.arm.mve.vctp32 call <4 x i32> @llvm.masked.load call <4 x i32...
2020 May 01
5
LV: predication
...2> @llvm.masked.load call <4 x i32> @llvm.masked.load call void @llvm.masked.store call i32 @llvm.loop.decrement.reg br i1 %12, label %.*, label %vector.body We then pick this up in our tail-predication pass, remove @llvm.set.loop.elements intrinsic, and add @vctp which is our intrinsic that generates the mask of active/inactive lanes: vector.ph: call void @llvm.set.loop.iterations.i32(i32 %5) br label %vector.body vector.body: call <4 x i1> @llvm.arm.mve.vctp32 call <4 x i32> @llvm.masked.load call <...
2020 May 04
3
LV: predication
...om> Cc: Eli Friedman <efriedma at quicinc.com>; llvm-dev <llvm-dev at lists.llvm.org>; Sam Parker <Sam.Parker at arm.com> Subject: Re: [llvm-dev] LV: predication Hi Sjoerd, That would be an excellent way of doing it and it would also map very well to MVE too, where we have a VCTP intrinsic/instruction that creates the mask/predicate (Vector Create Tail-Predicate). So I will go for this approach. Such an intrinsic was actually also proposed in Sam's original RFC (see https://lists.llvm.org/pipermail/llvm-dev/2019-May/132512.html), but we hadn't implemented it yet. Th...
2020 May 20
2
LV: predication
...2> @llvm.masked.load call <4 x i32> @llvm.masked.load call void @llvm.masked.store call i32 @llvm.loop.decrement.reg br i1 %12, label %.*, label %vector.body We then pick this up in our tail-predication pass, remove @llvm.set.loop.elements intrinsic, and add @vctp which is our intrinsic that generates the mask of active/inactive lanes: vector.ph: call void @llvm.set.loop.iterations.i32(i32 %5) br label %vector.body vector.body: call <4 x i1> @llvm.arm.mve.vctp32 call <4 x i32> @llvm.masked.load call <...
2020 May 21
2
LV: predication
...2> @llvm.masked.load call <4 x i32> @llvm.masked.load call void @llvm.masked.store call i32 @llvm.loop.decrement.reg br i1 %12, label %.*, label %vector.body We then pick this up in our tail-predication pass, remove @llvm.set.loop.elements intrinsic, and add @vctp which is our intrinsic that generates the mask of active/inactive lanes: vector.ph: call void @llvm.set.loop.iterations.i32(i32 %5) br label %vector.body vector.body: call <4 x i1> @llvm.arm.mve.vctp32 call <4 x i32> @llvm.masked.load call <...