search for: vtrn

Displaying 6 results from an estimated 6 matches for "vtrn".

2009 Aug 09
1
[LLVMdev] proposal to add MVT::vAny type
...uld make them easier for target-independent code > to understand. Yes, I have tried to do that as much as possible. There are still a number of operations where we've ended up using intrinsics, for varying reasons. For example, I had been planning to have the front-end translate the VTRN, VZIP, and VUZP builtins to vector shuffles, since that is exactly what they are. But, after discussing it with Evan, I changed these to intrinsics because we couldn't figure out a good way to handle them as shuffles. They take two vector operands and shuffle them in place, producing...
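To make the "that is exactly what they are" remark concrete, here is a minimal scalar sketch of VTRN, VZIP, and VUZP on 4-lane, 32-bit vectors, written as two-input shuffles with fixed index masks. It is a portable C model, not NEON code and not how LLVM represents these operations. The names `vec4x2`, `shuffle4`, `model_trn`, `model_zip`, and `model_uzp` are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

#define LANES 4

/* Pair of result vectors, loosely mirroring the .val[0]/.val[1]
 * layout of the NEON "x2" struct types (hypothetical model type). */
typedef struct { int32_t val[2][LANES]; } vec4x2;

/* Generic two-input shuffle: a mask index below LANES selects from a,
 * an index of LANES or above selects from b. */
static void shuffle4(const int32_t *a, const int32_t *b,
                     const int mask[LANES], int32_t *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = mask[i] < LANES ? a[mask[i]] : b[mask[i] - LANES];
}

/* Apply one mask per result vector. */
static vec4x2 shuffle_pair(const int32_t *a, const int32_t *b,
                           const int m0[LANES], const int m1[LANES])
{
    vec4x2 r;
    shuffle4(a, b, m0, r.val[0]);
    shuffle4(a, b, m1, r.val[1]);
    return r;
}

/* VTRN: transpose each 2x2 block formed by a and b. */
static vec4x2 model_trn(const int32_t *a, const int32_t *b)
{
    static const int m0[LANES] = {0, 4, 2, 6}, m1[LANES] = {1, 5, 3, 7};
    return shuffle_pair(a, b, m0, m1);
}

/* VZIP: interleave the lanes of a and b. */
static vec4x2 model_zip(const int32_t *a, const int32_t *b)
{
    static const int m0[LANES] = {0, 4, 1, 5}, m1[LANES] = {2, 6, 3, 7};
    return shuffle_pair(a, b, m0, m1);
}

/* VUZP: de-interleave into even lanes and odd lanes. */
static vec4x2 model_uzp(const int32_t *a, const int32_t *b)
{
    static const int m0[LANES] = {0, 2, 4, 6}, m1[LANES] = {1, 3, 5, 7};
    return shuffle_pair(a, b, m0, m1);
}
```

Each operation yields two result vectors from two inputs. A single shuffle yields only one, so in this model every operation is expressed as a pair of shuffles over the same operands.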
2014 Dec 09
1
[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
...ance-critical code when you say cycle I think machine cycle, and NEON definitely can't process 16 floats in one of those. > + while (len >= 16) { > + /* Accumulate results into single float */ > + tv.val[0] = vadd_f32(vget_low_f32(SUMM), vget_high_f32(SUMM)); > + tv = vtrn_f32(tv.val[0], ZERO); > + tv.val[0] = vadd_f32(tv.val[0], tv.val[1]); > + > + vst1_lane_f32(&sumi, tv.val[0], 0); Accessing tv.val[0] and tv.val[1] directly seems to send these values through the stack, e.g., f4: f3ba7085 vtrn.32 d7, d5 f8: ed0b7b0f vstr...
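The quoted reduction can be followed lane by lane with a scalar model. The sketch below is plain C, not NEON code, and assumes that `SUMM` holds four float partial sums and that `ZERO` is an all-zero two-lane vector (the snippet does not show either definition). The names `f32x2x2`, `model_vtrn_f32`, and `model_horizontal_sum` are hypothetical.

```c
/* Pair of 2-lane float vectors, loosely mirroring float32x2x2_t
 * (hypothetical model type). */
typedef struct { float val[2][2]; } f32x2x2;

/* Scalar model of a 2-lane transpose:
 * val[0] = {a0, b0}, val[1] = {a1, b1}. */
static f32x2x2 model_vtrn_f32(const float a[2], const float b[2])
{
    f32x2x2 r = {{{a[0], b[0]}, {a[1], b[1]}}};
    return r;
}

/* Scalar model of the quoted sequence: add low and high halves,
 * transpose against zero, add the two results, keep lane 0. */
static float model_horizontal_sum(const float summ[4])
{
    const float zero[2] = {0.0f, 0.0f};  /* assumed value of ZERO */

    /* vadd_f32(vget_low_f32(SUMM), vget_high_f32(SUMM)) */
    float pair[2] = {summ[0] + summ[2], summ[1] + summ[3]};

    /* tv = vtrn_f32(pair, ZERO) gives {pair0, 0} and {pair1, 0} */
    f32x2x2 tv = model_vtrn_f32(pair, zero);

    /* vadd_f32(tv.val[0], tv.val[1]) gives {pair0 + pair1, 0} */
    float total[2] = {tv.val[0][0] + tv.val[1][0],
                      tv.val[0][1] + tv.val[1][1]};

    /* vst1_lane_f32(&sumi, ..., 0) stores lane 0 */
    return total[0];
}
```

In this model the transpose only serves to move the second partial sum into lane 0 of a separate vector so that one more add finishes the reduction.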
2009 Aug 09
0
[LLVMdev] proposal to add MVT::vAny type
Hi Bob, An alternative would be to model the operations as regular shuffle, load, and store operators, combined to describe the actual instructions. This would make them easier for target-independent code to understand. Dan On Aug 8, 2009, at 11:47 PM, Bob Wilson <bob.wilson at apple.com> wrote: > The ARM Neon load, store and shuffle operations that I've been >
2009 Aug 09
2
[LLVMdev] proposal to add MVT::vAny type
On Aug 9, 2009, at 8:37 AM, Chris Lattner wrote: > I really do think that bitcast is the right way to go here. I ran > into a couple of similar problems when bringing up the altivec port. > For example, at one time we'd get "all zero vectors" of different > MVTs, which would not be CSEd. > > The fix for this was to be really disciplined about what types to make
2009 Aug 09
4
[LLVMdev] proposal to add MVT::vAny type
The ARM Neon load, store and shuffle operations that I've been implementing recently with LLVM intrinsics do not care about the distinction between vectors with i32 and f32 elements -- only the size matters. But, because we have only MVT::fAny and MVT::iAny types, I've been having to define separate intrinsics for the operations with floating-point vector elements. It
2014 Dec 07
2
[RFC PATCH v2] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
Hi,

Optimizes celt_pitch_xcorr for floating point.

Changes from RFCv1:
- Rebased on top of commit aad281878: Fix celt_pitch_xcorr_c signature. which got rid of ugly code around CELT_PITCH_XCORR_IMPL passing of "arch" parameter.
- Unified with --enable-intrinsics used by x86
- Modified algorithm to be more in-line with algorithm in celt_pitch_xcorr_arm.s

Viswanath Puttagunta