thr3ads.net - similar to: "NEON optimization of speex"

Displaying 20 results from an estimated 11000 matches similar to: "NEON optimization of speex"

not building with --enable-arm-asm -enable-arm5e-asm

2011 Aug 09

Hi, I am getting the following dump while trying to build for ARM:

./configure --prefix=/root/dump --host=arm-linux --with-gnu-ld --disable-static --enable-fixed-point --enable-arm-asm -enable-arm5e-asm
configure: WARNING: unrecognized options: --enable-arm-asm
Type "make; make install" to compile and install Speex
root@rony-ubuntu:~/speex# make
make all-recursive
make[1]:

exiting with ogg.h missing

2011 Aug 10

On Wed, 2011-08-10 at 09:41 -0400, Rony Nandy wrote:
> Hi All,
> I have downloaded libogg-1.3.0 along with speex. But during the
> build, speex is exiting with ogg.h missing. Any suggestions will be highly
> appreciated.

IIRC, speexenc encodes your data into a Speex stream which is encapsulated in an Ogg container, so you need libogg to compile it. Though, it has been ages

exiting with ogg.h missing

2011 Aug 10

Hi All, I have downloaded libogg-1.3.0 along with speex. But during the build, speex is exiting with ogg.h missing. Any suggestions will be highly appreciated.

exiting with ogg.h missing

2011 Aug 10

I ran into a similar problem. The Ogg library and header files need to be in a standard location. To get around this problem, I passed the following options to the configure script:

--with-ogg-libraries=<libogg_root>/src/.libs
--with-ogg-includes=<libogg_root>/include

In this case I built libogg and left all of the files in the default place (so I ran make but not make install).

[RFC PATCH v1] arm: kf_bfly4: Introduce ARM neon intrinsics

2014 Nov 09

Hello, This patch introduces ARM NEON intrinsics to optimize the kf_bfly4 routine in the CELT part of libopus. Using the NEON-optimized kf_bfly4(_neon) routine improved the performance of the opus_fft_impl function by about 21.4%. The end use case, decoding a music Opus Ogg file, saw a performance improvement of about 4.47%. This patch has 2 components i. Actual neon code to improve

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote:
> > Also, are there plans to make the NEON optimisations on ARMv7 run-time
> > detectable like they have in cairo/pixman? For generic distributions
> > it would be nice to be able to enable them, as they offer
> > decent performance improvements, but have the code

[LLVMdev] Question about ARM/vfp/NEON code generation

2011 May 27

Thanks, that helps a lot.

> All chips (to date) with NEON have VFP3, so it's safe to assume that a -mfpu=neon will have VFP3, so all the decisions
> about code generated for VFP3 can safely be assumed by targets with NEON.

Just to confirm my understanding, can I correctly say in general that the llc code generator might blur distinctions between NEON and VFP3 when it can do so

[LLVMdev] Question about ARM/vfp/NEON code generation

2011 May 27

On 27 May 2011 02:04, David Dunkle <ddunkle at arxan.com> wrote:
> In all cases, I get code that looks pretty much the same; it's like what
> is below. However, I am expecting to see instruction level differences
> between the vfp3 and neon versions. When I do the same with gcc 4.2 I do
> see differences in the generated code.

Hi David, You could see different instructions (as

2 patches related to silk_biquad_alt() optimization

2017 May 15

Hi Linfeng,

Sorry for the delay -- I was actually trying to think of the best option here. For now, my preference would be to keep things bit-exact, but should there be more similar optimizations relying on 64-bit multiplication results, then we could consider having a special option to enable those (even in C).

Cheers,
Jean-Marc

On 08/05/17 12:12 PM, Linfeng Zhang wrote:
> Ping for

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

Hi, I was recently looking into the translation of LLVM-IR vector instructions to ARM NEON assembly. Specifically, when this is legal to do and when we need to be careful. I attached a very simple test case:

define <4 x float> @fooP(<4 x float> %A, <4 x float> %B) {
  %C = fmul <4 x float> %A, %B
  ret <4 x float> %C
}

If fooP is compiled with "llc -march=arm

[LLVMdev] Question about ARM/vfp/NEON code generation

2011 May 27

On May 27, 2011, at 10:49 AM, David Dunkle wrote:
> Thanks, that helps a lot.
>
>> All chips (to date) with NEON have VFP3, so it's safe to assume that a -mfpu=neon will have VFP3, so all the decisions
>> about code generated for VFP3 can safely be assumed by targets with NEON.
>
> Just to confirm my understanding, can I correctly say in general that

[LLVMdev] speed up memcpy intrinsic using ARM Neon registers

2009 Nov 11

On Nov 11, 2009, at 3:27 AM, Rodolph Perfetta wrote:
>
> If you know about the alignment, maybe use structured load/store
> (vst1.64/vld1.64 {dn-dm}). You may also want to work on whole cache lines
> (64 bytes on A8). You can find more in this discussion:
> http://groups.google.com/group/beagleboard/browse_thread/thread/12c7bd415fbc

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

> |I just looked again at the +neonfp flag. Compiling with and without
> |+neonfp flag seems to only affect scalar types in the attached test
> |case. If e.g. the LLVM vectorizer introduces vector instructions on
> |LLVM-IR level floating point vectors still yield NEON assembly even if
> |compiled with "-mattr=+neon,-neonfp". Is this expected?
>
> I'm virtually

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

On 06/06/2013 11:58 PM, Renato Golin wrote:
> On 7 June 2013 07:05, Owen Anderson <resistor at mac.com> wrote:

Hi Owen, hi Renato, thanks for your replies.

>> Darwin uses NEON for floating point, but does *not* (and should not)
>> globally enable fast math flags. Use of NEON for FP needs to remain
>> achievable without globally setting the fast math flags. Fast

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

>> > Also, are there plans to make the NEON optimisations on ARMv7 run-time
>> > detectable like they have in cairo/pixman? For generic distributions
>> > it would be nice to be able to enable them, as they offer
>> > decent performance improvements, but have the code fall back on devices
>> > that don't support NEON.
>> Yep, adding
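The run-time detection with fallback asked about in this thread can be sketched roughly as below. This is only an illustration under stated assumptions, not the approach taken by libopus, cairo, or pixman: the names dot_fn, dot_c, dot_neon_placeholder, cpu_has_neon, and select_dot are hypothetical, the "NEON" slot is a stand-in that simply calls the C version, and the probe only covers 32-bit ARM Linux via getauxval(AT_HWCAP); every other platform reports no NEON.

```c
#include <stddef.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON */
#endif

/* Hypothetical kernel signature: dot product of two 16-bit arrays. */
typedef long (*dot_fn)(const short *x, const short *y, size_t n);

/* Portable C fallback, usable on any CPU. */
static long dot_c(const short *x, const short *y, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (long)x[i] * y[i];
    return sum;
}

/* Stand-in for a real NEON build of the same kernel (assumption:
 * a real one would live in a file compiled with NEON enabled). */
static long dot_neon_placeholder(const short *x, const short *y, size_t n)
{
    return dot_c(x, y, n);
}

/* Returns 1 if the running CPU reports NEON, 0 otherwise.
 * Only 32-bit ARM Linux is probed in this sketch. */
static int cpu_has_neon(void)
{
#if defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0;
#endif
}

/* Pick the implementation once; callers keep the returned pointer. */
static dot_fn select_dot(int has_neon)
{
    return has_neon ? dot_neon_placeholder : dot_c;
}
```

A caller would run `dot_fn dot = select_dot(cpu_has_neon());` once at start-up and then call `dot(x, y, n)`, so a generic distribution build still works on devices without NEON.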

NEON FP flags

2016 Mar 29

On Fri, Mar 25, 2016 at 01:23:03PM +0000, Renato Golin via llvm-dev wrote:
> On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> > As I understand it, the fundamental property being addressed here is: Are
> > the semantics of scalar FP math the same as vector FP math? TTI seems like
> > a good place to expose that information. If the semantics are indeed

[LLVMdev] speed up memcpy intrinsic using ARM Neon registers

2009 Nov 10

On Nov 9, 2009, at 5:59 PM, David Conrad wrote:
> On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote:
>
>> I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the
>> memcpy intrinsic. I used the Neon load multiple instruction to move up
>> to 48 bytes at a time. Over 15 scalar instructions collapsed down
>> into these 2 Neon instructions.

Nice. Thanks
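To illustrate the idea of moving 48 bytes per step, here is a rough C sketch. It is not the LLVM change discussed in the thread: the function name copy_blocks48 is hypothetical, it uses NEON intrinsics (three 16-byte vld1q_u8/vst1q_u8 pairs) rather than the load-multiple instruction the post mentions, and which instructions the compiler actually emits is not guaranteed. On builds without NEON it falls back to plain memcpy. The buffers are assumed not to overlap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Copy n bytes from src to dst in 48-byte blocks, then the tail.
 * Assumes src and dst do not overlap. */
static void copy_blocks48(uint8_t *dst, const uint8_t *src, size_t n)
{
    while (n >= 48) {
#if defined(__ARM_NEON)
        /* Three 128-bit loads followed by three 128-bit stores. */
        uint8x16_t a = vld1q_u8(src);
        uint8x16_t b = vld1q_u8(src + 16);
        uint8x16_t c = vld1q_u8(src + 32);
        vst1q_u8(dst, a);
        vst1q_u8(dst + 16, b);
        vst1q_u8(dst + 32, c);
#else
        /* Portable fallback for builds without NEON. */
        memcpy(dst, src, 48);
#endif
        src += 48;
        dst += 48;
        n -= 48;
    }
    if (n > 0)
        memcpy(dst, src, n);  /* remaining 1..47 bytes */
}
```

Whether this beats the library memcpy depends on alignment and cache behaviour, so it would need measuring on the target.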

NEON FP flags

2016 Mar 25

On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> As I understand it, the fundamental property being addressed here is: Are the semantics of scalar FP math the same as vector FP math? TTI seems like a good place to expose that information. If the semantics are indeed different, then the vectorizer would require fast-math flags in order to vectorize FP operations
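One well-known way scalar and vector FP results can differ is denormal handling: ARMv7 NEON arithmetic flushes denormals to zero, while IEEE-754 scalar arithmetic keeps them. The sketch below only simulates that difference in portable C so it can be tried on any machine; the helper names flush_denormal, mul_ieee, and mul_ftz are hypothetical, and the snippet above does not say this is the specific case the thread had in mind.

```c
#include <float.h>

/* Simulated flush-to-zero: replace a denormal value by a signed zero. */
static float flush_denormal(float x)
{
    float mag = x < 0.0f ? -x : x;
    if (mag != 0.0f && mag < FLT_MIN)
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}

/* Multiplication with ordinary IEEE-754 semantics (denormals kept),
 * assuming the build does not enable fast-math or flush-to-zero. */
static float mul_ieee(float a, float b)
{
    return a * b;
}

/* Multiplication as a flush-to-zero unit would behave in this model:
 * denormal inputs and denormal results both become zero. */
static float mul_ftz(float a, float b)
{
    return flush_denormal(flush_denormal(a) * flush_denormal(b));
}
```

With inputs such as FLT_MIN and 0.5f the two functions disagree (a tiny nonzero value versus zero), which is the kind of mismatch that makes a vectorizer ask for fast-math flags before replacing scalar code with vector code.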

Opus floating-point NEON jump table question

2017 Jun 02

Thanks Jonathan! I'll fix the MAY_HAVE_NEON() in silk/arm/arm_silk_map.c

Linfeng

On Thu, Jun 1, 2017 at 3:34 PM, Jonathan Lennox <jonathan at vidyo.com> wrote:
> Semantically, OPUS_ARM_MAY_HAVE_NEON is supposed to mean the compiler
> supports, and the CPU may support, Neon assembly code, which isn’t
> necessarily the same thing as the compiler supporting Neon intrinsics.

[LLVMdev] NEON intrinsics

2010 Sep 21

On 21 September 2010 21:16, Bob Wilson <bob.wilson at apple.com> wrote:
> It's referring to the arm.neon.vabds intrinsic, which is different than the old vabal intrinsic.

Ok, sorry, those were the ones I was referring to: @llvm.arm.neon.* intrinsics. Is it polluting too much to add the last few (llvm.arm.neon.vadd, llvm.arm.neon.vsub)? It makes it a bit easier to generate neon
