Displaying 6 results from an estimated 6 matches for "vtrn".
2009 Aug 09
1
[LLVMdev] proposal to add MVT::vAny type
...uld make them easier for target-independent code
> to understand.
Yes, I have tried to do that as much as possible. There are still a
number of operations where we've ended up using intrinsics, for
varying reasons.
For example, I had been planning to have the front-end translate the
VTRN, VZIP, and VUZP builtins to vector shuffles, since that is
exactly what they are. But, after discussing it with Evan, I changed
these to intrinsics because we couldn't figure out a good way to
handle them as shuffles. They take two vector operands and shuffle
them in place, producing...
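
To illustrate why VTRN reads naturally as a shuffle, here is a minimal sketch in plain C that emulates the lane movement of a 32-bit VTRN on two arrays. It is not taken from the thread: the helper name vtrn_u32_emulated is hypothetical and is not part of any NEON header, and arrays stand in for vector registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a NEON API): emulates the lane movement of
 * VTRN on two vectors of n 32-bit lanes, updating both in place.
 * Each lane pair (2i, 2i+1) of a and b is treated as a 2x2 matrix and
 * transposed, which amounts to swapping a[2i+1] with b[2i]. */
static void vtrn_u32_emulated(uint32_t *a, uint32_t *b, size_t n)
{
    assert(n % 2 == 0); /* VTRN operates on pairs of lanes */
    for (size_t i = 0; i < n; i += 2) {
        uint32_t tmp = a[i + 1];
        a[i + 1] = b[i];
        b[i] = tmp;
    }
}
```

Both inputs are overwritten, so the operation yields two result vectors. That matches the snippet's remark that the operands are shuffled in place.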
2014 Dec 09
1
[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
...ance-critical code when
you say cycle I think machine cycle, and NEON definitely can't process
16 floats in one of those.
> + while (len >= 16) {
> + /* Accumulate results into single float */
> + tv.val[0] = vadd_f32(vget_low_f32(SUMM), vget_high_f32(SUMM));
> + tv = vtrn_f32(tv.val[0], ZERO);
> + tv.val[0] = vadd_f32(tv.val[0], tv.val[1]);
> +
> + vst1_lane_f32(&sumi, tv.val[0], 0);
Accessing tv.val[0] and tv.val[1] directly seems to send these values
through the stack, e.g.,
f4: f3ba7085 vtrn.32 d7, d5
f8: ed0b7b0f vstr...
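
For readers following the quoted reduction, the sketch below models it lane by lane in scalar C. It is an illustration only, not code from the patch: the function name hsum4_model is hypothetical, and it assumes that ZERO in the quoted code is an all-zero two-lane vector, which the snippet does not show.

```c
#include <stddef.h>

/* Hypothetical scalar model of the quoted NEON sequence.
 * summ holds the four float lanes of SUMM.
 * Assumption: ZERO is a two-lane vector of 0.0f. */
static float hsum4_model(const float summ[4])
{
    /* vadd_f32(vget_low_f32(SUMM), vget_high_f32(SUMM)) */
    float pair[2] = { summ[0] + summ[2], summ[1] + summ[3] };
    const float zero[2] = { 0.0f, 0.0f };

    /* vtrn_f32(pair, ZERO): val[0] = {pair[0], zero[0]},
     *                       val[1] = {pair[1], zero[1]} */
    float val0[2] = { pair[0], zero[0] };
    float val1[2] = { pair[1], zero[1] };

    /* vadd_f32(val[0], val[1]), then vst1_lane_f32(..., 0) stores lane 0 */
    return val0[0] + val1[0];
}
```

Under that assumption, lane 0 of the final vector holds the sum of all four lanes of SUMM, and the transpose serves only to bring the second partial sum into lane 0.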
2009 Aug 09
0
[LLVMdev] proposal to add MVT::vAny type
Hi Bob,
An alternative would be to model the operations as regular shuffle,
load, and store operators, combined to describe the actual
instructions. This would make them easier for target-independent code
to understand.
Dan
On Aug 8, 2009, at 11:47 PM, Bob Wilson <bob.wilson at apple.com> wrote:
> The ARM Neon load, store and shuffle operations that I've been
>
2009 Aug 09
2
[LLVMdev] proposal to add MVT::vAny type
On Aug 9, 2009, at 8:37 AM, Chris Lattner wrote:
> I really do think that bitcast is the right way to go here. I ran
> into a couple of similar problems when bringing up the altivec port.
> For example, at one time we'd get "all zero vectors" of different
> MVTs, which would not be CSEd.
>
> The fix for this was to be really disciplined about what types to make
2009 Aug 09
4
[LLVMdev] proposal to add MVT::vAny type
The ARM Neon load, store and shuffle operations that I've been
implementing recently with LLVM intrinsics do not care about the
distinction between vectors with i32 and f32 elements -- only the size
matters. But, because we have only MVT::fAny and MVT::iAny types,
I've been having to define separate intrinsics for the operations with
floating-point vector elements. It
2014 Dec 07
2
[RFC PATCH v2] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
Hi,
Optimizes celt_pitch_xcorr for floating point.
Changes from RFCv1:
- Rebased on top of commit
aad281878: Fix celt_pitch_xcorr_c signature.
which got rid of ugly code around CELT_PITCH_XCORR_IMPL
passing of "arch" parameter.
- Unified with --enable-intrinsics used by x86
- Modified algorithm to be more in line with the algorithm in
celt_pitch_xcorr_arm.s
Viswanath Puttagunta