thr3ads.net - search: "vtrn"

Displaying 6 results from an estimated 6 matches for "vtrn".

[LLVMdev] proposal to add MVT::vAny type

2009 Aug 09

...uld make them easier for target-independent code
> to understand.

Yes, I have tried to do that as much as possible. There are still a number of operations where we've ended up using intrinsics, for varying reasons. For example, I had been planning to have the front-end translate the VTRN, VZIP, and VUZP builtins to vector shuffles, since that is exactly what they are. But, after discussing it with Evan, I changed these to intrinsics because we couldn't figure out a good way to handle them as shuffles. They take two vector operands and shuffle them in place, producing...
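As a rough illustration of the claim that VTRN, VZIP, and VUZP are "exactly" vector shuffles, the sketch below models each one as a pair of index masks over the concatenation of two 4-lane vectors. It is plain C rather than NEON or LLVM code; the names shuffle2, shuffle_pair, and the *_MASK tables are hypothetical, and the 4 x 32-bit lane layout is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Hypothetical helper (not an LLVM or NEON API): a generic two-input shuffle.
   Mask values 0..3 pick lanes of a, values 4..7 pick lanes of b. */
static void shuffle2(const int32_t a[LANES], const int32_t b[LANES],
                     const int mask[LANES], int32_t out[LANES])
{
    for (int i = 0; i < LANES; i++) {
        assert(mask[i] >= 0 && mask[i] < 2 * LANES);
        out[i] = (mask[i] < LANES) ? a[mask[i]] : b[mask[i] - LANES];
    }
}

/* Each instruction yields two result vectors, so each gets two masks. */
static const int VTRN_MASK[2][LANES] = { {0, 4, 2, 6}, {1, 5, 3, 7} };
static const int VZIP_MASK[2][LANES] = { {0, 4, 1, 5}, {2, 6, 3, 7} };
static const int VUZP_MASK[2][LANES] = { {0, 2, 4, 6}, {1, 3, 5, 7} };

/* Apply one pair of masks. The real instructions overwrite their operands
   in place; this model writes to separate outputs, which must not alias
   the inputs. */
void shuffle_pair(const int mask[2][LANES],
                  const int32_t a[LANES], const int32_t b[LANES],
                  int32_t r0[LANES], int32_t r1[LANES])
{
    shuffle2(a, b, mask[0], r0);
    shuffle2(a, b, mask[1], r1);
}
```

Calling shuffle_pair(VTRN_MASK, a, b, r0, r1) with a = {0, 1, 2, 3} and b = {10, 11, 12, 13} gives r0 = {0, 10, 2, 12} and r1 = {1, 11, 3, 13}. The two-result, in-place nature of the instructions is the part a single shuffle does not capture, which may be the difficulty the message alludes to.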

[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Dec 09

...ance-critical code when you say cycle I think machine cycle, and NEON definitely can't process 16 floats in one of those.

> + while (len >= 16) {
> + /* Accumulate results into single float */
> + tv.val[0] = vadd_f32(vget_low_f32(SUMM), vget_high_f32(SUMM));
> + tv = vtrn_f32(tv.val[0], ZERO);
> + tv.val[0] = vadd_f32(tv.val[0], tv.val[1]);
> +
> + vst1_lane_f32(&sumi, tv.val[0], 0);

Accessing tv.val[0] and tv.val[1] directly seems to send these values through the stack, e.g.,

f4: f3ba7085 vtrn.32 d7, d5
f8: ed0b7b0f vstr...
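To make the quoted accumulation step easier to follow, here is a scalar C model of what those four intrinsic calls compute, assuming SUMM holds four float partial sums and ZERO is an all-zero 2-lane vector. The function name horizontal_sum_model and the array-based layout are hypothetical; this is a sketch of the arithmetic, not the patch's code, and it says nothing about the register allocation issue raised in the reply.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of the quoted sequence (hypothetical name, plain C). */
float horizontal_sum_model(const float summ[4])
{
    const float zero[2] = {0.0f, 0.0f};   /* stands in for ZERO */
    float pair[2], t0[2], t1[2];

    assert(summ != NULL);

    /* vadd_f32(vget_low_f32(SUMM), vget_high_f32(SUMM)) */
    pair[0] = summ[0] + summ[2];
    pair[1] = summ[1] + summ[3];

    /* vtrn_f32(pair, ZERO): val[0] = {pair[0], zero[0]},
                             val[1] = {pair[1], zero[1]} */
    t0[0] = pair[0];  t0[1] = zero[0];
    t1[0] = pair[1];  t1[1] = zero[1];

    /* vadd_f32(tv.val[0], tv.val[1]) */
    t0[0] += t1[0];
    t0[1] += t1[1];

    /* vst1_lane_f32(&sumi, tv.val[0], 0): store lane 0 only */
    return t0[0];
}
```

In this model the transpose with a zero vector only serves to move the second partial sum into lane 0 of a separate vector, so that a final lane-wise add leaves the total in lane 0.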

[LLVMdev] proposal to add MVT::vAny type

2009 Aug 09

Hi Bob,

An alternative would be to model the operations as regular shuffle, load, and store operators, combined to describe the actual instructions. This would make them easier for target-independent code to understand.

Dan

On Aug 8, 2009, at 11:47 PM, Bob Wilson <bob.wilson at apple.com> wrote:
> The ARM Neon load, store and shuffle operations that I've been
>

[LLVMdev] proposal to add MVT::vAny type

2009 Aug 09

On Aug 9, 2009, at 8:37 AM, Chris Lattner wrote:
> I really do think that bitcast is the right way to go here. I ran
> into a couple of similar problems when bringing up the altivec port.
> For example, at one time we'd get "all zero vectors" of different
> MVTs, which would not be CSEd.
>
> The fix for this was to be really disciplined about what types to make

[LLVMdev] proposal to add MVT::vAny type

2009 Aug 09

The ARM Neon load, store and shuffle operations that I've been implementing recently with LLVM intrinsics do not care about the distinction between vectors with i32 and f32 elements -- only the size matters. But, because we have only MVT::fAny and MVT::iAny types, I've been having to define separate intrinsics for the operations with floating-point vector elements. It

[RFC PATCH v2] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Dec 07

Hi,

Optimizes celt_pitch_xcorr for floating point.

Changes from RFCv1:
- Rebased on top of commit aad281878: Fix celt_pitch_xcorr_c signature.
  which got rid of ugly code around CELT_PITCH_XCORR_IMPL passing of "arch" parameter.
- Unified with --enable-intrinsics used by x86
- Modified algorithm to be more in-line with algorithm in celt_pitch_xcorr_arm.s

Viswanath Puttagunta
