similar to: Testing ARMv8 Ne10 and intrinsics branch

Displaying 20 results from an estimated 500 matches similar to: "Testing ARMv8 Ne10 and intrinsics branch"

2015 Apr 02
0
Testing ARMv8 Ne10 and intrinsics branch
Hello Thomas, I use the following configure command to link against Ne10, e.g.: configure --host=arm-linux-gnueabihf --enable-intrinsics --with-NE10-libraries=${BUILD_NE10_LIB} --with-NE10-includes=${BUILD_NE10_INC} So, in my normal testing, I explicitly specify where the NE10 header files are installed and where the NE10 libraries are installed. Looking back at configure.ac
2015 Oct 06
3
[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library
I'm trying to get these cleaned up and landed, but I'm running into some trouble with this patch. Using commit a08b29d88e3c (July 21) of Ne10, I'm seeing test failures for 60-point FFTs: nfft=60 inverse=0,snr = -3.312408 ** poor snr: -3.312408 ** nfft=60 inverse=1,snr = -16.079597 ** poor snr: -16.079597 ** All other sizes tested appear to work fine (84 to 140 dB of SNR). This
2015 Feb 26
3
[RFC PATCH v2] Encode optimize using libNe10
Viswanath Puttagunta wrote: > Can we please have review on RFCv2? We have quite a few optimizations > (Eg: ifft/mdct_backwards, fixed point fft/ifft mdct_forward/backward > etc) that are in my pipeline that depend on this patch series being > accepted. So, trying to make progress on this... On an armv7l board running Ubuntu, you've broken the build with just --enable-intrinsics
2015 Apr 30
3
[RFC PATCH v1 0/8] Ne10 fft fixed and previous
On 29 April 2015 at 17:22, Timothy B. Terriberry <tterribe at xiph.org> wrote: > > Viswanath Puttagunta wrote: >> >> This patch series is follow up on work I posted on [1]. >> In addition to what was posted on [1], this patch series mainly >> integrates Fixed point FFT implementations in NE10 library into opus. >> You can view my opus wip code at [2]. >
2015 Oct 16
1
[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library
Hi Timothy, Sorry for the late reply. I have upstreamed the patch to fix the regression here: https://github.com/projectNe10/Ne10/commit/ee5d856cd9cb8c4a15ace567df4239f4e788d043 I have tested it with Vish's branch: http://git.linaro.org/people/viswanath.puttagunta/opus.git/shortlog/refs/heads/rfcv3_fft_fixed Both unit test dft and unit test mdct passed on ARM v7/v8, floating point/fixed
2015 May 08
2
(no subject)
Hello Jean-Marc, Below are the results that show test_unit_dft passes, but test_unit_mdct fails (only for nfft=480, 960, 1920). Note: tested on BeagleBone Black (Cortex-A8), fixed point, on branch [1]. ./test_unit_dft nfft=32 inverse=0,snr = 88.394372 nfft=32 inverse=1,snr = 93.896470 nfft=128 inverse=0,snr = 89.185895 nfft=128 inverse=1,snr = 93.537021 nfft=256 inverse=0,snr = 88.353151 nfft=256
2015 Jan 30
1
[RFC PATCH v1 2/2] armv7(float): Optimize encode usecase using NE10 library
Viswanath Puttagunta wrote: > Is the peak stack usage a complete blocker in current form? Since this only affects people who enable NE10, I don't think this is a blocker.
2015 Feb 03
2
opus Digest, Vol 72, Issue 17
Hi all, I have already added support for scaled forward non-power-of-2 floating-point FFT: https://github.com/projectNe10/Ne10/commit/79c3d787302f8d74b9bcfe6545d487cdf1b101d9 Two flags are added to the cfg structure: is_forward_scaled and is_backward_scaled. By setting is_forward_scaled to any nonzero value, ne10_fft_c2c_1d_float32_neon will scale the output. So we can remove the need for one buffer on
2015 Mar 31
6
[RFC PATCH v1 0/5] aarch64: celt_pitch_xcorr: Fixed point series
Hi Timothy, As I mentioned earlier [1], I have now fixed the compile issues with fixed point and am resubmitting the patch. I also have a new patch that does intrinsics optimizations for celt_pitch_xcorr targeting aarch64. You can find my latest work-in-progress branch at [2]. For reference, you can use the Ne10 pre-built libraries at [3]. Note that I am working with Phil at ARM to get my patch at [4]
2015 Jan 29
2
[RFC PATCH v1 2/2] armv7(float): Optimize encode usecase using NE10 library
Viswanath Puttagunta wrote: > if OPUS_ARM_NEON_INTR > CELT_ARM_NEON_INTR_OBJ = $(CELT_SOURCES_ARM_NEON_INTR:.c=.lo) \ > - %test_unit_rotation.o %test_unit_mathops.o > -$(CELT_ARM_NEON_INTR_OBJ): CFLAGS += $(OPUS_ARM_NEON_INTR_CPPFLAGS) > + $(CELT_SOURCES_ARM_NE10:.c=.lo) \ > + %test_unit_rotation.o %test_unit_mathops.o \ > +
2015 May 08
1
(no subject)
Hello Jean-Marc, Yep, that was it. With your patch, test_unit_mdct passes for all nfft. So, what do you suggest as the next step here? Regards, Vish On 8 May 2015 at 12:30, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi, > > Can you apply this change to the MDCT test and run it again. See if more > (all) sizes pass. Given the results, I strongly suspect an
2015 Mar 04
2
Patch cleaning up Opus x86 intrinsics configury
On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: Viswanath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I
2015 Jan 20
6
[RFC PATCH v1 0/2] Encode optimize using libNE10
Hello opus-dev, I've been cooking up this patchset to integrate the NE10 library into opus. The current patchset focuses on the encode use case, mainly affecting the performance of clt_mdct_forward() and opus_fft() (for float only). Glad to report the following on the encode use case: (Measured on my Beaglebone Black Cortex-A8 board) - Performance improvement for encode use case ~= 12.34% (Based on time -p
2015 Mar 07
1
Patch cleaning up Opus x86 intrinsics configury
Hello Jonathan, Just FYI, I have started reviewing your patch and will get back to you in a few days. After the review, I would like to rebase your patch (as necessary) myself, do some testing, and resubmit. Regards, Vish On 4 March 2015 at 09:00, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at
2015 Feb 04
1
opus Digest, Vol 72, Issue 17
Viswanath Puttagunta wrote: > What should we do for power-of-2? I really want to avoid putting > runtime checks if nfft is power of 2 in opus_fft_float_neon. Given the tests that had to be disabled for NE10, I suspect we will not really be able to use it for CUSTOM_MODES, which should be the only time nfft is a power of 2. So I'd suggest just disabling the support when CUSTOM_MODES
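The trade-off discussed here is between a runtime power-of-2 test and a compile-time switch. A minimal sketch of both follows. `CUSTOM_MODES` is the build macro named in the message, while `is_power_of_two` and `USE_NE10_FFT` are hypothetical names introduced only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical runtime check: true when n has exactly one bit set. */
bool is_power_of_two(unsigned int n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Assumed compile-time alternative: if power-of-2 sizes only occur in
 * CUSTOM_MODES builds, the optimized path can be switched off there,
 * so no runtime check is needed in the FFT wrapper. */
#if defined(CUSTOM_MODES)
#define USE_NE10_FFT 0
#else
#define USE_NE10_FFT 1
#endif
```

With the compile-time form, sizes such as 60, 120, 240 and 480 take the optimized path unconditionally in standard builds, and the per-call branch disappears.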
2015 Oct 06
0
[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library
Hello Timothy, Great to hear from you! I fired up my hardware today, and this issue looks like a regression in the Ne10 library. The commit in Ne10 [1] that I tested successfully back in May is 5b63074db45000f9688460990ee3f5e147d93782, which is the patch Phil at ARM added to fix the overflow issue in the nfft=60 case. After git-bisect, it looks like the culprit patch in Ne10 [1] is
2015 Feb 04
4
[RFC PATCH v2] Encode optimize using libNe10
Changes from RFC PATCH v1: - passing arch parameter explicitly - reduced stack usage by ~3.5K by using scaled NE10 fft version - moved all optimization array functions to arm_celt_map.c - Other cleanups pointed out by Timothy Phil, As you mentioned earlier, could you please address all compile and linker errors/warnings coming out of Ne10 library? You can find my working Ne10 repo at [1] You
2014 Dec 11
2
[ARM][FFT][NEON] Integrate Ne10 into Opus?
Hi everyone, I am working on the Ne10 project. Ne10 provides NEON optimized FFT routines that are much faster (compared to those without NEON) on most ARMv7-A and all ARMv8-A devices. How about integrating it into Opus? I am not familiar with the configure script, but I found "Optional Packages" in it. If we provide a --with-ne10-fft option, the one extra thing that users need to do is to
2015 Mar 03
2
[RFC PATCHv3] Encode optimize using libNe10
Changes from RFC PATCH v2 - fixed compile issue when just compiling for --enable-intrinsics for ARMv7 without NE10 - Notes for NE10: - All compile/link warnings are now in upstream NE10 - Only patch pending upstream in NE10 is the one that needs to add -funsafe-math-optimizations for ARMv7 targets. - Phil Wang @ ARM is working on getting this fixed. - Note that even without
2015 Apr 28
10
[RFC PATCH v1 0/8] Ne10 fft fixed and previous
Hello Timothy / Jean-Marc / opus-dev, This patch series is a follow-up to the work I posted on [1]. In addition to what was posted on [1], this patch series mainly integrates the fixed-point FFT implementations in the NE10 library into opus. You can view my opus wip code at [2]. Note that while I found some issues both with the NE10 library (fixed fft) and with the Linaro toolchain (armv8 intrinsics), the work