2014 Nov 25
1
[Profiling][FFT][AArch64] FFT Profiling data on AArch64
Hi everyone,
I have profiled Opus on AArch64 by running opus_demo on some PCM files.
The following table shows the proportion of run time spent in the FFT at different bitrates.
Bitrate | Time spent in FFT/iFFT
24kb/s | 15%
48kb/s | 15%
96kb/s | 13%
Any comments? I would like data that is closer to a real application. Any suggestions?
Thanks,
Phil Wang
2014 Nov 09
0
[RFC PATCH v1] arm: kf_bfly4: Introduce ARM neon intrinsics
...uts.h b/celt/_kiss_fft_guts.h
index 5e3d58f..219b431 100644
--- a/celt/_kiss_fft_guts.h
+++ b/celt/_kiss_fft_guts.h
@@ -34,6 +34,19 @@
and defines
typedef struct { kiss_fft_scalar r; kiss_fft_scalar i; }kiss_fft_cpx; */
#include "kiss_fft.h"
+#include "arch.h"
+/*
+void kf_bfly4_c( kiss_fft_cpx * Fout, const size_t fstride,
+ const kiss_fft_state *st,
+ int m, int N, int mm);
+*/
+
+#if defined (ARMv7_NEON_INTRINSICS_FLOAT)
+#include "arm/kiss_fft_neon.h"
+#define kf_bfly4 kf_bfly4_neon
+#else
+#define kf_bfly4 kf_bfly4_c
+#endif
/*
Explanation of macros de...
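
The `#if`/`#define` block in the diff above selects the butterfly implementation at compile time by aliasing one name to either the C or the NEON routine. The following is a minimal sketch of that pattern only, not the patch's code: `demo_cpx`, `demo_kernel_c`, `demo_kernel_neon`, `DEMO_USE_NEON` and `DEMO_KERNEL_IS_NEON` are hypothetical names, and the kernel body is placeholder work rather than a real radix-4 butterfly.

```c
#include <stddef.h>

/* Hypothetical stand-in for kiss_fft_cpx, assuming float scalars. */
typedef struct { float r; float i; } demo_cpx;

/* Portable reference kernel. It only doubles each element (placeholder
   work, not a real butterfly) so that the dispatch is easy to observe. */
static void demo_kernel_c(demo_cpx *buf, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        buf[k].r *= 2.0f;
        buf[k].i *= 2.0f;
    }
}

/* DEMO_USE_NEON is a hypothetical build flag, playing the role that
   ARMv7_NEON_INTRINSICS_FLOAT plays in the diff. */
#if defined(DEMO_USE_NEON)
/* A real build would use NEON intrinsics here; this placeholder simply
   reuses the C path so the sketch compiles on any target. */
static void demo_kernel_neon(demo_cpx *buf, size_t n)
{
    demo_kernel_c(buf, n);
}
#define demo_kernel demo_kernel_neon
#define DEMO_KERNEL_IS_NEON 1
#else
#define demo_kernel demo_kernel_c
#define DEMO_KERNEL_IS_NEON 0
#endif
```

Callers write `demo_kernel(buf, n)` and never name a variant directly, so switching implementations needs only a change of compiler flags. The trade-off is that the choice is fixed per build; it cannot adapt to the CPU at run time.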
2014 Nov 09
3
[RFC PATCH v1] arm: kf_bfly4: Introduce ARM neon intrinsics
Hello,
This patch introduces ARM NEON intrinsics to optimize the
kf_bfly4 routine in the CELT part of libopus.
Using the NEON-optimized routine (kf_bfly4_neon) improved the
performance of the opus_fft_impl function by about 21.4%. The
end use case was decoding an Ogg Opus music file, which
saw an overall performance improvement of about 4.47%.
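
To relate the two figures above, here is a rough back-of-the-envelope sketch. It assumes both percentages are reductions in run time and that nothing outside opus_fft_impl changed; the message does not state either point. The function names are hypothetical.

```c
/* Overall run-time reduction when a part that accounts for `fraction`
   of total time has its own time reduced by `local_reduction`.
   All arguments and results are fractions in [0, 1]. */
static double overall_reduction(double fraction, double local_reduction)
{
    return fraction * local_reduction;
}

/* Inverse relation: the share of total time the optimized part must
   have had, given the local and overall reductions.
   `local_reduction` must be non-zero. */
static double implied_fraction(double overall, double local_reduction)
{
    return overall / local_reduction;
}
```

Under those assumptions, `implied_fraction(0.0447, 0.214)` is about 0.209, which would mean opus_fft_impl accounted for roughly a fifth of the decode time in that test. This is only an estimate: if the 21.4% and 4.47% figures are speedups rather than time reductions, the arithmetic differs slightly.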
This patch has two components:
i. Actual neon code to improve