thr3ads.net - similar to: "Bug fix for configure output for Neon --disable-asm --enable-intrinsics"

Displaying 20 results from an estimated 3000 matches similar to: "Bug fix for configure output for Neon --disable-asm --enable-intrinsics"

Opus floating-point NEON jump table question

2017 Jun 01

Thanks, Jean-Marc and Jonathan! I tested the current Opus encoder in floating-point with complexity 8. Hacking it with the attached patch (which generates "#define OPUS_ARM_MAY_HAVE_NEON 1" in config.h) speeds it up by about 14.7% on my Chromebook. This is probably because many NEON intrinsics optimizations can benefit both the fixed-point and floating-point encoders. So if it's safe enough

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On 25 November 2014 at 10:11, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at vidyo.com> wrote: > > > > On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > >> > >> > Also, are there plans to make the NEON optimisations

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On 25 November 2014 at 10:18, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On Nov 25, 2014, at 11:13 AM, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > On 25 November 2014 at 10:11, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > > On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at

[PATCH 1/3] Add configure check for Aarch64-specific Neon intrinsics.

2015 Nov 19

---
 configure.ac | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
diff --git a/configure.ac b/configure.ac
index 90a06c8..adcb969 100644
--- a/configure.ac
+++ b/configure.ac
@@ -503,6 +503,26 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[
 [rtcd_support="$rtcd_support (NE10)"])
 ])
+ OPUS_CHECK_INTRINSICS(
+

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > > Also, are there plans to make the NEON optimisations on ARMv7 run-time > > detectable like they are in cairo/pixman? For generic distributions > > it would be nice to be able to enable them, as they offer > > decent performance improvements but have the code

[AArch64 neon intrinsics v4 0/5] Rework Neon intrinsic code for Aarch64 patchset

2015 Dec 23

Following Tim's comments, here are my reworked patches for the Neon intrinsic function patches of my Aarch64 patchset, i.e. replacing patches 5-8 of the v2 series. Patches 1-4 and 9-18 of the old series still apply unmodified. The one new (as opposed to changed) patch is the first one in this series, which adds named constants for the ARM architecture variants. There are also some minor code

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

Viswanath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I mentioned in the IRC room a problem I discovered in creating my patch, and then later improved the fix Tim had made for the problem. Where do you get conflicts merging it to tip? In terms of merging, you posted your patch before I posted mine, so probably I should be

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: Viswanath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I

[LLVMdev] ARM NEON intrinsics in clang

2013 Sep 26

On 26 September 2013 17:52, Stanislav Manilov <stanislav.manilov at gmail.com>wrote: > To answer your question I am testing on a pandaboard currently, which has > an arm cortex-a9 processor, which I think is 64-bit. > Cortex-A9 is still 32-bits, so you'll have all support you need. ;) however it doesn't if I remove the -ffreestanding flag. I need to figure > this out

silk_warped_autocorrelation_FIX() NEON optimization

2016 Jul 01

Hi all, I'm sending the patch "Optimize silk_warped_autocorrelation_FIX() for ARM NEON" in a separate email. It is based on Tim’s aarch64v8 branch https://git.xiph.org/?p=users/tterribe/opus.git;a=shortlog;h=refs/heads/aarch64v8 Thanks for your comments. Linfeng

[LLVMdev] ARM NEON intrinsics in clang

2013 Sep 26

Hello Renato, It turned out I just didn't do the cross-compilation correctly, and Tim Northover already pointed me to a guide you have written on it ( http://clang.llvm.org/docs/CrossCompilation.html), so I will read that before continuing with my efforts. To answer your question I am testing on a pandaboard currently, which has an arm cortex-a9 processor, which I think is 64-bit. I am much

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 19

> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: > > I haven't yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That's an obvious next step. This doesn't show any appreciable speed difference in my tests, but the code is obviously better by inspection (all three of these map directly to a single Aarch64

Opus floating-point NEON jump table question

2017 May 31

Hi, ./configure --build x86_64-unknown-linux-gnu --host arm-linux-gnueabihf --disable-assertions --disable-check-asm --enable-intrinsics CFLAGS=-O3 --disable-shared When configuring with floating-point and intrinsics enabled as above, the generated config.h only has OPUS_ARM_MAY_HAVE_NEON_INTR defined (to 1), with /* #undef OPUS_ARM_ASM */ /* #undef OPUS_ARM_INLINE_ASM */ /* #undef

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 07

Hello Jonathan, Just FYI, I have started reviewing your patch and will get back to you in a few days. After the review, I would like to rebase your patch (as necessary) myself, do some testing, and re-submit. Regards, Vish On 4 March 2015 at 09:00, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 24

On 24 November 2014 at 14:53, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 21 November 2014 at 18:06, Timothy B. Terriberry <tterribe at xiph.org> wrote: > > > > Viswanath Puttagunta wrote: > >> > >> a. Simplest use case to validate this optimization for correctness. > >> b. Simplest use case to validate this

Opus floating-point NEON jump table question

2017 Jun 01

Semantically, OPUS_ARM_MAY_HAVE_NEON is supposed to mean the compiler supports, and the CPU may support, Neon assembly code, which isn’t necessarily the same thing as the compiler supporting Neon intrinsics. (The Visual Studio ARM compiler, for instance, supports intrinsics but not assembly.) So I don’t think this patch is the right solution. Instead, I think the problem is actually that
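
The distinction drawn in this snippet can be illustrated with a minimal C sketch. It assumes the config.h state reported in the thread for a floating-point build with --enable-intrinsics (OPUS_ARM_MAY_HAVE_NEON_INTR defined to 1, OPUS_ARM_MAY_HAVE_NEON not defined). The helper neon_code_path() is hypothetical, exists only for illustration, and is not part of Opus.

```c
/* Assumption: in a real build these macros come from the generated config.h.
   Here we mimic an intrinsics-only configuration by hand. */
#define OPUS_ARM_MAY_HAVE_NEON_INTR 1
/* #undef OPUS_ARM_MAY_HAVE_NEON */

/* Hypothetical helper: names the kind of Neon code path that the
   preprocessor guards would allow for a given function. */
static const char *neon_code_path(void)
{
#if defined(OPUS_ARM_MAY_HAVE_NEON)
    /* Compiler supports Neon assembly; the CPU may support it. */
    return "asm";
#elif defined(OPUS_ARM_MAY_HAVE_NEON_INTR)
    /* Compiler supports Neon intrinsics only. */
    return "intrinsics";
#else
    /* Neither is available: plain C fallback. */
    return "c";
#endif
}
```

Under these assumptions, a guard that tests only OPUS_ARM_MAY_HAVE_NEON would skip an intrinsics-only build, which is consistent with the observation that forcing that macro on changes the encoder's speed.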

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On Nov 25, 2014, at 11:13 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 25 November 2014 at 10:11, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at vidyo.com

Opus floating-point NEON jump table question

2017 Jun 02

Thanks, Jonathan! I'll fix the MAY_HAVE_NEON() in silk/arm/arm_silk_map.c Linfeng On Thu, Jun 1, 2017 at 3:34 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: > Semantically, OPUS_ARM_MAY_HAVE_NEON is supposed to mean the compiler > supports, and the CPU may support, Neon assembly code, which isn’t > necessarily the same thing as the compiler supporting Neon intrinsics. >

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 26

Thanks Timothy and Jean-Marc, I will start NEON-optimizing the fixed-point FFT. Is int32 good enough? Benchmark data shows that FFT using int16 is much faster than FFT using int32. > -----Original Message----- > From: Timothy B. Terriberry [mailto:tterribe at xiph.org] > Sent: Friday, December 26, 2014 6:52 AM > To: Phil Wang; opus at xiph.org > Cc: Zhongwei Yao; Yang Zhang; Zhou

[Aarch64 v2 05/18] Add Neon intrinsics for Silk noise shape quantization.

2015 Dec 20

Jonathan Lennox wrote: > +opus_int32 silk_noise_shape_quantizer_short_prediction_neon(const opus_int32 *buf32, const opus_int32 *coef32) > +{ > + int32x4_t coef0 = vld1q_s32(coef32); > + int32x4_t coef1 = vld1q_s32(coef32 + 4); > + int32x4_t coef2 = vld1q_s32(coef32 + 8); > + int32x4_t coef3 = vld1q_s32(coef32 + 12); > + > + int32x4_t a0 = vld1q_s32(buf32 -
