Displaying 20 results from an estimated 100 matches similar to: "PATCH for lpc_intrin_sse41.c: faster shifts"

PATCH for lpc_intrin_sse41.c: faster shifts

2014 Jan 30

lvqcl wrote:
> It turns out that int64 shift is quite slow...
>
> This patch changes the code from:
> (FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
> into:
> _mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));
>
> Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.
>
> The new code works only if quantization <= 32,
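
The "<= 32" constraint quoted above can be illustrated without any SSE code. The following is a minimal portable sketch, not the patch itself: the helper names are made up for this example, and it assumes (as is true of common compilers, though strictly implementation-defined in C) that `>>` on a negative signed value is an arithmetic shift. A logical 64-bit shift, which is what _mm_srli_epi64 performs per lane, fills the top bits with zeros, so the low 32 bits that _mm_cvtsi128_si32 extracts match the arithmetic result only while the shift count stays at or below 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Low 32 bits of an arithmetic (sign-propagating) 64-bit right shift.
   Hypothetical helper; mirrors (FLAC__int32)(x >> lp_quantization). */
static uint32_t low32_arith_shift(int64_t x, unsigned q)
{
    assert(q < 64); /* shifting by >= 64 is undefined in C */
    return (uint32_t)(x >> q);
}

/* Low 32 bits of a logical (zero-filling) 64-bit right shift, the scalar
   analogue of _mm_srli_epi64 followed by _mm_cvtsi128_si32. */
static uint32_t low32_logical_shift(int64_t x, unsigned q)
{
    assert(q < 64);
    return (uint32_t)((uint64_t)x >> q);
}

/* Returns 1 if both variants agree on a few sample values for every
   shift count in [0, max_q], otherwise 0. */
static int shifts_agree_up_to(unsigned max_q)
{
    static const int64_t samples[] = {
        0, 1, -1, 123456789012345LL, -123456789012345LL, INT64_MIN, INT64_MAX
    };
    for (unsigned q = 0; q <= max_q; q++)
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
            if (low32_arith_shift(samples[i], q) != low32_logical_shift(samples[i], q))
                return 0;
    return 1;
}
```

Calling shifts_agree_up_to(32) reports agreement, while shifts_agree_up_to(33) does not: at a count of 33 the zero bits shifted in by the logical variant reach the low 32-bit word.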

PATCH for lpc_asm.nasm

2014 Jan 14

1) Two comments ";ASSERT(lp_quantization <= 31)" in the new functions ..._wide_asm_ia32() -- just to mention this constraint. (The max. possible value of lp_quantization is 15, so it's not a problem.)
2) "mov cl, ..." was replaced with "mov ecx, ..." (again Agner Fog, optimizing_assembly.pdf). Summary: a write to a partial register may result in false dependencies

const issue in FLAC__lpc_compute_residual_from_qlp_coefficients (libFLAC/lpc.c:233)

2004 Sep 10

Hello, I just tried to compile libFLAC (using Borland C++ Builder 6 on Windows). The compiler yells at me on line 233 of libFLAC/lpc.c:
    *(residual++) = *(data++) - (sum >> lp_quantization);
    --> data is const and cannot be modified
Funny thing is, if data is declared
    const FLAC__int32 *data
instead of
    const FLAC__int32 data[]
everything is OK. Is this a bug in my compiler, or
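
For reference, standard C adjusts a parameter declared as an array to a pointer, so the two declarations quoted above name the same parameter type and `data++` only moves the pointer, never the const elements. A minimal sketch follows; the typedef is a stand-in for libFLAC's own and the function names are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t FLAC__int32; /* stand-in for libFLAC's typedef (assumption) */

/* Array-style parameter: the compiler adjusts it to `const FLAC__int32 *data`. */
static FLAC__int32 sum_array_style(const FLAC__int32 data[], size_t n)
{
    FLAC__int32 sum = 0;
    while (n--)
        sum += *(data++); /* advances the pointer; the pointed-to values stay const */
    return sum;
}

/* Pointer-style parameter: the same type after adjustment. */
static FLAC__int32 sum_pointer_style(const FLAC__int32 *data, size_t n)
{
    FLAC__int32 sum = 0;
    while (n--)
        sum += *(data++);
    return sum;
}
```

A conforming compiler accepts both functions, and both return the same result for the same input.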

const issue in FLAC__lpc_compute_residual_from_qlp_coefficients (libFLAC/lpc.c:233)

2004 Sep 10

On Tue, Jan 13, 2004 at 02:04:48PM -0800, Josh Coalson wrote:
> --- Denis Chatelain <listes@octopodus.com> wrote:
> > Hello,
> >
> > I just tried to compile libFLAC (using Borland C++ Builder 6 on
> > Windows).
> >
> > The compilers yells at me on line 233 of libFLAC/lpc.c
> >
> > *(residual++) = *(data++) - (sum >>

[PATCH 4/4] lpc_intrin_sse41 routines

2014 Sep 20

This patch increases the speed of FLAC__lpc_restore_signal_wide_intrin_sse41 (decoding of 24-bit FLAC files on 32-bit platforms).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lpc_sse4.zip
Type: application/zip
Size: 3310 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/flac-dev/attachments/20140920/a3d8efb4/attachment.zip

two small-ish optimizations (death by a thousand cuts)

2005 Feb 02

This lpc_restore_order was partially inspired by Miroslav's affd, though my (not very great) ARM asm version resembled this, as well. The other two reduce CPU array indexing overhead in loops a little. Additionally, a request for help: My not very optimized lpc_restore_signal is at the below URL, I couldn't get the ldm* instructions to work as advertised, even though I've talked

[PATCH] fix compile errors with asm disabled

2004 Oct 01

The #endifs are mismatched, and my builds were failing because lpc_restore_signal* weren't getting declared. I've also commented the endifs to make them easier to match. Also, is there any reason the #ifdefs for FLAC__HAS_NASM and FLAC__CPU_IA32 are separate and nested the way they are, and not combined like this?
    #if defined(FLAC__CPU_IA32) && defined(FLAC__HAS_NASM)
I'm not
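
The two layouts asked about above can be compared in a small sketch. It is only an illustration: the macros are defined locally here (in libFLAC they would come from the build system), the function names and returned strings are invented, and the comment about why nesting might be preferred is a guess, not something the thread confirms.

```c
#include <string.h>

/* Defined locally for this sketch only; normally set by configure. */
#define FLAC__CPU_IA32
#define FLAC__HAS_NASM

/* Nested form: the outer block leaves room for IA32-specific code that
   does not depend on NASM (one plausible reason for nesting). */
static const char *select_nested(void)
{
#ifdef FLAC__CPU_IA32
#  ifdef FLAC__HAS_NASM
    return "ia32 asm";
#  else
    return "ia32 C";
#  endif /* FLAC__HAS_NASM */
#else
    return "generic C";
#endif /* FLAC__CPU_IA32 */
}

/* Combined form: a single test, but only one fallback branch. */
static const char *select_combined(void)
{
#if defined(FLAC__CPU_IA32) && defined(FLAC__HAS_NASM)
    return "ia32 asm";
#else
    return "C fallback";
#endif /* FLAC__CPU_IA32 && FLAC__HAS_NASM */
}

/* Returns 1 when both layouts pick the same routine. */
static int selections_agree(void)
{
    return strcmp(select_nested(), select_combined()) == 0;
}
```

With both macros defined, the two forms select the same path; they differ only in how many distinct fallbacks they can express. Commenting each #endif, as the patch does, keeps the nested form readable.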

flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)

2004 Oct 06

Sadly, the latest optimization completely broke everything. The asm code isn't gas compliant, the libFLAC linker script has a typo, and disabling the asm optimization and/or altivec won't give a correct build anyway. Instant fixes for the asm stuff:
    sed -i -e"s:;:\#:" on the lpc_asm.s
to load an address, instead of addis+ori you could use lis and la, and PLEASE use the @l(register)

1.2.0: Test suite failures on LP64 archs?

2007 Aug 31

Running the basic (--disable-thorough-tests) test suite, I get these failures:
    round-trip test (rt-1-24-111.raw) encode... Segmentation fault (core dumped)
    ERROR
    FAIL: ./test_flac.sh
    fsd24-01 (--channels=1 --bps=24 -0 -l 16 --lax -m -e -p): encode...ERROR during encode of fsd24-01
    FAIL: ./test_streams.sh
on alpha and amd64. By contrast, i386 is fine. (All OpenBSD/4.2.) Could be a generic LP64

altivec lpc_restore_signal

2004 Sep 10

I've had this a long time but haven't submitted it yet. I've tried to mirror the ia32 setup, so there should be a new subdirectory src/libFLAC/ppc . The first two attachments go there. The third is a context diff for src/libFLAC/Makefile.am . I have some more modified files, which I figured I'd submit after the above are checked in and working for somebody other than me. If you

Altivec, automake

2004 Sep 10

I think I've gotten FLAC__lpc_restore_signal() about as good as I'm going to get it. Here's what I have:
- a new file, lpc_asm.s, which has the assembly routines
- changes to cpu.h, cpu.c, and stream_decoder.c to enable them
- changes to configure.in to support the new cpu stuff
- a preliminary Makefile.am
- maybe something else I'm forgetting
Now automake complains that configure.in

Altivec, automake

2004 Sep 10

Here's what I listed in that email. Merging doesn't appear to be necessary. If you have any build problems, let me know. Note that my detection code is Darwin-specific. It's a BSD call (sysctl()), so a change to the platform-detection macros should enable it to work on other BSDs. However, I don't know what that would be, and I couldn't determine any safe way to do the check

A couple of points about flac 1.1.1 on ppc/linux/altivec

2005 Jan 29

On Thu, 27 Jan 2005, John Steele Scott wrote:
> That looks fine to me as well. However, the best solution is something which
> Luca suggested a few months ago, which is to use the functions defined in
> altivec.h. These are C functions which map directly to Altivec machine
> instructions. I am willing to help out, but I don't find the current lpc_asm.s
> very easy to follow, and

Re: 1.2.0: Test suite failures on LP64 archs?

2007 Sep 01

Christian Weisgerber <naddy@mips.inka.de> wrote:
> #0 0x0000000040d18810 in FLAC__lpc_compute_residual_from_qlp_coefficients_wide
> (data=0x49e4c014, data_len=110, qlp_coeff=0x7f7ffffece70, order=1,
> lp_quantization=14, residual=0x4fced000) at lpc.c:745
> 745 residual[i] =
> data[i] - (FLAC__int32)((qlp_coeff[0] *

Again about encoding speed of different compiles

2013 Oct 04

I downloaded the current version of the FLAC sources and compiled it with:
* GCC 4.8.1 (MSYS from http://xhmikosr.1f0.de/tools/)
* Intel C++ Composer XE 2013 update 5
* MSVS 2010 SP1
* MSVS 2012 update 3
(SSSE3 and SSE4.1 code was disabled for all compilers)
A stereo 24-bit WAV file was encoded with the -8 preset. Encoding time, in seconds:
GCC 32-bit: 209
ICC 32-bit: 130
VS10 32-bit: 116
VS12 32-bit: 114

libFLAC bitbuffer optimizations

2005 Jan 01

Josh Coalson <xflac@yahoo.com> wrote:
> thanks for the patch.
No prob :)
> also, if you have miroslav's patch again a more updated version
> of bitbuffer.c that would be great. I have been meaning to get
> around to applying it for a long time.
This is Miroslav's patch, from the mailing list post I dug up in the archives:
--- orig/src/libFLAC/bitbuffer.c
+++

Re: Reg. FLAC decoding

2005 Oct 25

Sorry for the delay in getting back to you. I was working on something else and just now got FLAC to work. OK, FLAC files are playing now :) Cheers. There is a slight noise happening in the background, which I'm figuring out. I hope that it'll be solved soon. However, I wanted to know if there are any ARM-specific optimizations that can be done. The processor is a 166MHz processor. Do

Re: Reg. FLAC decoding

2005 Oct 25

--- Joe Steeve <joesteeve@zodiactorp.com> wrote:
> Sorry for the delay in getting back to you., I was working on something
> else and just now got FLAC to work.
>
> Ok., FLAC files are playing now :) Cheers. There is a slight noise
> happening in the background., which i'm figuring out. I hope that it'll
> be solved soon. However, i wanted to know if

libFLAC bitbuffer optimizations

2004 Dec 28

Pulled from my Arch archive, the following patch seems to have made quite a difference in getting my ARM7TDMI chip to play FLAC (compression levels 0-2) on my iPod. I don't have benchmarks with hard numbers, but playing with skips vs playing without skips is a fairly noticeable difference. memcpy and memset are optimized in asm for the ARM7TDMI in uClibc. Other hardware/libc

[PATCH 14] preprocessor macros in lpc_intrin_sseN.c

2014 Jun 28

Currently both lpc_intrin_sse2.c and lpc_intrin_sse41.c define macros RESIDUAL_RESULT and DATA_RESULT. This patch changes their names so they become different. Reason: FLAC build systems don't apply specific options (such as -msse4.1) to specific files. So it makes little sense to have separate *_intrin_sseA.c and *_intrin_sseB.c files. IMHO it's not unreasonable to merge
