Displaying 20 results from an estimated 100 matches similar to: "PATCH for lpc_intrin_sse41.c: faster shifts"
2014 Jan 30
0
PATCH for lpc_intrin_sse41.c: faster shifts
lvqcl wrote:
> It turns out that int64 shift is quite slow...
>
> This patch changes the code from:
> (FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
> into:
> _mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));
>
> Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.
>
>
> The new code works only if quantization <= 32,
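Why the logical shift can stand in for the arithmetic one: a minimal sketch in plain C, without intrinsics. The function names shift_arith and shift_logical are hypothetical and not part of libFLAC. The truncated result keeps bits q..q+31 of the 64-bit sum, and for q <= 32 those bits all lie inside the original value, so shifting in zeros or sign bits makes no difference. The sketch assumes the usual arithmetic right shift for negative values and two's-complement truncation, both of which are implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: arithmetic 64-bit shift, then truncation to 32 bits.
   Mirrors (FLAC__int32)(x >> lp_quantization). */
int32_t shift_arith(int64_t sum, unsigned q)
{
    assert(q <= 32); /* same constraint as stated in the post */
    return (int32_t)(sum >> q);
}

/* Logical form: shift zeros in from the top, keep the low 32 bits.
   This is the scalar analogue of _mm_srli_epi64 followed by
   _mm_cvtsi128_si32. */
int32_t shift_logical(int64_t sum, unsigned q)
{
    assert(q <= 32); /* for q > 32 the zero fill would reach bit 31 */
    return (int32_t)(uint32_t)((uint64_t)sum >> q);
}
```

With q = 33 the two forms would differ for negative sums, because bit 31 of the result would be a sign bit in one case and a zero in the other.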
2014 Jan 14
1
PATCH for lpc_asm.nasm
1) Two comments ";ASSERT(lp_quantization <= 31)" were added to the new functions ..._wide_asm_ia32(),
just to mention this constraint.
(The max. possible value of lp_quantization is 15, so it's not a problem.)
2) "mov cl, ..." was replaced with "mov ecx, ..." (again Agner Fog, optimizing_assembly.pdf).
Summary: a write to a partial register may result in false dependencies.
2004 Sep 10
3
const issue in FLAC__lpc_compute_residual_from_qlp_coefficients (libFLAC/lpc.c:233)
Hello,
I just tried to compile libFLAC (using Borland C++ Builder 6 on Windows).
The compiler yells at me on line 233 of libFLAC/lpc.c
*(residual++) = *(data++) - (sum >> lp_quantization);
--> data is const and cannot be modified
Funny thing is, if data is declared:
const FLAC__int32 *data
instead of
const FLAC__int32 data[]
everything is ok.
Is this a bug in my compiler, or
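For context, standard C adjusts a parameter declared as an array to a pointer, so `const FLAC__int32 data[]` and `const FLAC__int32 *data` declare the same parameter type: the elements are const, the pointer itself is not. The sketch below is a simplified, hypothetical order-1 version of the residual loop (residual_order1 is not a libFLAC function, and int32_t stands in for FLAC__int32); it should compile on a conforming compiler even though it increments data.

```c
#include <stdint.h>

/* 'const int32_t data[]' in a parameter list is adjusted to
   'const int32_t *data'. Only the pointed-to samples are const,
   so data++ is legal. */
void residual_order1(const int32_t data[], unsigned data_len,
                     int32_t qlp_coeff0, int lp_quantization,
                     int32_t residual[])
{
    /* Assumption: data[-1] is a valid history sample, since an
       order-1 predictor needs one sample before the block. */
    while (data_len--) {
        int32_t sum = qlp_coeff0 * data[-1];
        *(residual++) = *(data++) - (sum >> lp_quantization);
    }
}
```

Calling it on a buffer {10, 12, 14, 16} starting at the second sample, with coefficient 1 and a shift of 0, yields residuals of 2 for each of the three samples.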
2004 Sep 10
2
const issue in FLAC__lpc_compute_residual_from_qlp_coefficients (libFLAC/lpc.c:233)
On Tue, Jan 13, 2004 at 02:04:48PM -0800, Josh Coalson wrote:
> --- Denis Chatelain <listes@octopodus.com> wrote:
> > Hello,
> >
> >
> > I just tried to compile libFLAC (using Borland C++ Builder 6 on
> > Windows).
> >
> > The compiler yells at me on line 233 of libFLAC/lpc.c
> >
> > *(residual++) = *(data++) - (sum >>
2014 Sep 20
2
[PATCH 4/4] lpc_intrin_sse41 routines
This patch increases the speed of FLAC__lpc_restore_signal_wide_intrin_sse41
(decoding of 24-bit FLAC files on 32-bit platforms).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lpc_sse4.zip
Type: application/zip
Size: 3310 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/flac-dev/attachments/20140920/a3d8efb4/attachment.zip
2005 Feb 02
0
two small-ish optimizations (death by a thousand cuts)
This lpc_restore_order was partially inspired by Miroslav's affd, though
my (not very great) ARM asm version resembled this, as well.
The other two reduce CPU array indexing overhead in loops a little.
Additionally, a request for help:
My not very optimized lpc_restore_signal is at the below URL. I
couldn't get the ldm* instructions to work as advertised, even though
I've talked
2004 Oct 01
1
[PATCH] fix compile errors with asm disabled
The #endifs are mismatched, and my builds were failing because
lpc_restore_signal* weren't getting declared.
I've also commented the endifs to make them easier to match.
Also, is there any reason #ifdefs for FLAC__HAS_NASM and FLAC__CPU_IA32 are
separate and nested the way they are and not combined like this?:
#if defined(FLAC__CPU_IA32) && defined(FLAC__HAS_NASM)
I'm not
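The two preprocessor layouts being compared can be sketched as follows. The USE_IA32_ASM_* macros and the two helper functions are hypothetical names used only for illustration; only FLAC__CPU_IA32 and FLAC__HAS_NASM come from the post. Both forms select the same code when nothing sits between the nesting levels; the nested form mainly leaves room for declarations that depend on just one of the two macros.

```c
/* Nested form: two separate checks, each with a commented #endif
   so the pairs are easy to match. */
#ifdef FLAC__CPU_IA32
#ifdef FLAC__HAS_NASM
#define USE_IA32_ASM_NESTED 1
#endif /* FLAC__HAS_NASM */
#endif /* FLAC__CPU_IA32 */

/* Combined form: one check, one #endif. */
#if defined(FLAC__CPU_IA32) && defined(FLAC__HAS_NASM)
#define USE_IA32_ASM_COMBINED 1
#endif /* FLAC__CPU_IA32 && FLAC__HAS_NASM */

/* Returns 1 if the nested checks enabled the asm path, else 0. */
int nested_selects_asm(void)
{
#ifdef USE_IA32_ASM_NESTED
    return 1;
#else
    return 0;
#endif /* USE_IA32_ASM_NESTED */
}

/* Returns 1 if the combined check enabled the asm path, else 0. */
int combined_selects_asm(void)
{
#ifdef USE_IA32_ASM_COMBINED
    return 1;
#else
    return 0;
#endif /* USE_IA32_ASM_COMBINED */
}
```

Whatever combination of the two macros is defined at build time, both helpers should report the same answer.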
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly, the latest optimization completely broke everything.
The asm code isn't gas-compliant, the libFLAC linker script has a typo, and
disabling the asm optimization and/or AltiVec won't give a correct build
either.
Instant fixes for the asm stuff:
sed -i -e"s:;:\#:" on the lpc_asm.s
To load an address, instead of addis+ori you could use
lis and la, and PLEASE use the @l(register)
2007 Aug 31
2
1.2.0: Test suite failures on LP64 archs?
Running the basic (--disable-thorough-tests) test suite, I get these
failures
round-trip test (rt-1-24-111.raw) encode... Segmentation fault (core
dumped) ERROR
FAIL: ./test_flac.sh
fsd24-01 (--channels=1 --bps=24 -0 -l 16 --lax -m -e -p): encode...ERROR during encode of fsd24-01
FAIL: ./test_streams.sh
on alpha and amd64. By contrast, i386 is fine. (All OpenBSD/4.2.)
Could be a generic LP64
2004 Sep 10
1
altivec lpc_restore_signal
I've had this a long time but haven't submitted it yet.
I've tried to mirror the ia32 setup, so there should be a new subdirectory
src/libFLAC/ppc . The first two attachments go there. The third is a context
diff for src/libFLAC/Makefile.am .
I have some more modified files, which I figured I'd submit after the above
are checked in and working for somebody other than me. If you
2004 Sep 10
3
Altivec, automake
I think I've gotten FLAC__lpc_restore_signal() about as good as I'm going to
get it.
Here's what I have:
-a new file, lpc_asm.s, which has the assembly routines
-changes to cpu.h, cpu.c, and stream_decoder.c to enable them
-changes to configure.in to support the new cpu stuff
-a preliminary Makefile.am
-maybe something else I'm forgetting
Now automake complains that configure.in
2004 Sep 10
2
Altivec, automake
Here's what I listed in that email. Merging doesn't appear to be necessary. If
you have any build problems, let me know.
Note that my detection code is Darwin-specific. It's a BSD call (sysctl()), so
a change to the platform-detection macros should enable it to work on other
BSDs. However, I don't know what that would be, and I couldn't determine any
safe way to do the check
2005 Jan 29
4
A couple of points about flac 1.1.1 on ppc/linux/altivec
On Thu, 27 Jan 2005, John Steele Scott wrote:
> That looks fine to me as well. However, the best solution is something which
> Luca suggested a few months ago, which is to use the functions defined in
> altivec.h. These are C functions which map directly to Altivec machine
> instructions. I am willing to help out, but I don't find the current lpc_asm.s
> very easy to follow, and
2007 Sep 01
2
Re: 1.2.0: Test suite failures on LP64 archs?
Christian Weisgerber <naddy@mips.inka.de> wrote:
> #0 0x0000000040d18810 in FLAC__lpc_compute_residual_from_qlp_coefficients_wide
> (data=0x49e4c014, data_len=110, qlp_coeff=0x7f7ffffece70, order=1,
> lp_quantization=14, residual=0x4fced000) at lpc.c:745
> 745 residual[i] =
> data[i] - (FLAC__int32)((qlp_coeff[0] *
2013 Oct 04
2
Again about encoding speed of different compiles
I downloaded the current version of the FLAC sources and compiled it with:
* GCC 4.8.1 (MSYS from http://xhmikosr.1f0.de/tools/)
* Intel C++ Composer XE 2013 update 5
* MSVS 2010 SP1
* MSVS 2012 update 3
(SSSE3 and SSE4.1 code was disabled for all compilers)
A stereo 24-bit WAV file was encoded with the -8 preset.
Encoding time, in seconds:
GCC 32-bit: 209
ICC 32-bit: 130
VS10 32-bit: 116
VS12 32-bit: 114
2005 Jan 01
2
libFLAC bitbuffer optimizations
Josh Coalson <xflac@yahoo.com> wrote:
> thanks for the patch.
No prob :)
> also, if you have miroslav's patch against a more updated version
> of bitbuffer.c that would be great. I have been meaning to get
> around to applying it for a long time.
This is Miroslav's patch, from the mailing list post I dug up in the archives:
--- orig/src/libFLAC/bitbuffer.c
+++
2005 Oct 25
2
Re: Reg. FLAC decoding
Sorry for the delay in getting back to you. I was working on something
else and just now got FLAC to work.
OK, FLAC files are playing now :) Cheers. There is a slight noise
in the background, which I'm figuring out. I hope that it'll
be solved soon. However, I wanted to know if there are any ARM-specific
optimizations that can be done. The processor is a 166MHz processor. Do
2005 Oct 25
0
Re: Reg. FLAC decoding
--- Joe Steeve <joesteeve@zodiactorp.com> wrote:
> Sorry for the delay in getting back to you. I was working on
> something
> else and just now got FLAC to work.
>
> OK, FLAC files are playing now :) Cheers. There is a slight noise
> in the background, which I'm figuring out. I hope that
> it'll
> be solved soon. However, I wanted to know if
2004 Dec 28
2
libFLAC bitbuffer optimizations
Pulled from my Arch archive, the following patch seems to have made
quite a difference in getting my ARM7TDMI chip to play FLAC (compression
levels 0-2) on my iPod. I don't have benchmarks with hard numbers, but
playing with skips vs playing without skips is a fairly noticeable
difference.
memcpy and memset are optimized in asm for the ARM7TDMI in
uClibc. Other hardware/libc
2014 Jun 28
0
[PATCH 14] preprocessor macros in lpc_intrin_sseN.c
Currently both lpc_intrin_sse2.c and lpc_intrin_sse41.c
define macros RESIDUAL_RESULT and DATA_RESULT.
This patch changes their names so they become different.
Reason: FLAC build systems don't apply specific options (such as
-msse4.1) to specific files. So it makes little sense to have
separate *_intrin_sseA.c and *_intrin_sseB.c files.
IMHO it's not unreasonable to merge