similar to: IA32 and NASM

Displaying 20 results from an estimated 500 matches similar to: "IA32 and NASM"

2013 Aug 22
2
New routine: FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_16
libFLAC has three SSE-accelerated functions FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_N (N = 4, 8, 12). They require lpc_order to be less than N. The best compression preset (flac -8) uses lpc_order up to 12, which means that during encoding FLAC also falls back to the unaccelerated C function. I'm not very familiar with asm, so I took FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_12, changed it and
2014 Aug 02
1
[PATCH] new SSE code to calculate autocorrelation
This patch accelerates FLAC__lpc_compute_autocorrelation_intrin_sse_lag_NN routines for AMD and newer Intel CPUs. But it's slower on older Intel CPUs. ('Newer Intel CPUs' means Core i aka Nehalem and newer) According to tests at HA: <http://www.hydrogenaud.io/forums/index.php?s=&showtopic=101082&view=findpost&p=870753> CPU flac -5 flac -8
2004 Sep 10
2
Altivec, automake
Here's what I listed in that email. Merging doesn't appear to be necessary. If you have any build problems, let me know. Note that my detection code is Darwin-specific. It's a BSD call (sysctl()), so a change to the platform-detection macros should enable it to work on other BSDs. However, I don't know what that would be, and I couldn't determine any safe way to do the check
2004 Sep 10
1
lpc slowdown
I have noticed an lpc slowdown in both encoding and decoding, not related to the new config.h stuff. It seems the fastest available version of the lpc function is being chosen incorrectly. Patch is attached. -- Miroslav Lichvar -------------- next part -------------- Index: src/libFLAC/stream_decoder.c =================================================================== RCS file:
2005 Jan 25
0
bitbuffer optimizations
On Mon, Jan 24, 2005 at 06:31:21PM -0800, Josh Coalson wrote: > yes, a mere 2 years later it is checked in! > > speed improvement for me is roughly 17% testing flac files on > linux-i386. Thanks! In case you would like to check another old patch, I have attached updated patch for seekable stream decoder, originally posted on 09/07/2003. -- Miroslav Lichvar -------------- next
2012 May 05
5
[PATCH] Optionally, allow distros to use openssl for MD5 verification
This has the advantage of being more efficient than the included routines and allows distros to centralize crypto maintenance on a few libraries. --- configure.ac | 4 +- m4/ax_check_openssl.m4 | 124 +++++++++++++++++++++++++++++++++++++ src/libFLAC/Makefile.am | 2 +- src/libFLAC/include/private/md5.h | 8 ++- src/libFLAC/md5.c
2005 Feb 02
0
two small-ish optimizations (death by a thousand cuts)
This lpc_restore_order was partially inspired by Miroslav's affd, though my (not very great) ARM asm version resembled this, as well. The other two reduce CPU array indexing overhead in loops a little. Additionally, a request for help: My not very optimized lpc_restore_signal is at the below URL, I couldn't get the ldm* instructions to work as advertised, even though I've talked
2004 Sep 10
2
better seeking
When I was trying to find yesterday's xmms-plugin bug, I noticed that seeking in a stream without a seek table isn't very good. With the attached patch it is much better. -- Miroslav Lichvar -------------- next part -------------- --- src/libFLAC/seekable_stream_decoder.c.orig 2003-02-26 19:41:51.000000000 +0100 +++ src/libFLAC/seekable_stream_decoder.c 2003-07-09 23:49:35.000000000 +0200
2006 Nov 03
2
better seeking
On Mon, Oct 30, 2006 at 11:13:25AM -0800, Josh Coalson wrote: > my apologies for not doing this before Miroslav... I will definitely > integrate it this time. Thanks. Sending latest version of the patch. Now it can seek in files that have large id3 tag (or any random data) at the end and it won't loop on streams with shuffled frames. -- Miroslav Lichvar -------------- next part
2012 Apr 05
2
[PATCH 2/2] V2: Use a single definition of MIN and MAX in sources
--- configure.ac | 7 +++++ src/libFLAC/bitreader.c | 12 ++------- src/libFLAC/bitwriter.c | 8 ++---- src/libFLAC/fixed.c | 18 +++++-------- src/libFLAC/format.c | 8 ++---- src/libFLAC/include/private/macros.h | 29 ++++++++++++++++++++ src/libFLAC/metadata_iterators.c | 17 +++---------
2006 Oct 28
3
better seeking
Ok, the patch from 2003 about improving seeking still didn't make it to CVS, so here is another try. I did some benchmarking with the test_seeking utility from the flac sources to show how bad the current seeking is, especially without a seektable. The track used for the experiment was about 50 minutes long. The following table shows the average number of seeks and number of decoded frames required for one
2014 Dec 15
1
[PATCH] src/libFLAC/stream_decoder.c : Rework fix for seeking bug.
To avoid crash caused by an unbound LPC decoding when predictor order is larger than blocksize, the sanity check needs to be moved to the subframe decoding functions. --- src/libFLAC/stream_decoder.c | 30 ++++++++++++------------------ 1 file changed, 12 insertions(+), 18 deletions(-) diff --git a/src/libFLAC/stream_decoder.c b/src/libFLAC/stream_decoder.c index d13b23b..211b4db 100644 ---
2015 Mar 09
2
[PATCH 1/1] ensure that stack is aligned for SSE functions if using mingw32
Unable to test on win32 at the moment, please give this a try. Feedback welcome. Avoids crashes due to unaligned ops when built with mingw. --- src/libFLAC/include/private/cpu.h | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/libFLAC/include/private/cpu.h b/src/libFLAC/include/private/cpu.h index 8927897..bd40012 100644 --- a/src/libFLAC/include/private/cpu.h +++
2014 Jun 19
0
[PATCH] stream_encoder : Improve selection of residual accumulator width
On Thu, Jun 19, 2014 at 03:30:06PM +0200, Miroslav Lichvar wrote: > But, as we have seen with unusual data the residual signal can be > wider than bps. The FLAC format specification doesn't seem to mention > this. Should it be treated as a valid FLAC stream? I think it would be interesting to know how common are such streams. I patched flac to print a warning on decoding or testing
2004 Sep 10
0
better seeking
And here is another one. It allows fast seeking in streams without total_samples information. There is a check for such streams in flac, so flac --skip doesn't work. If the check is removed, it will work with --force-raw-format only, there is an issue with wav and aiff header handling. -- Miroslav Lichvar -------------- next part -------------- ---
2006 Nov 07
0
better seeking
On Mon, Nov 06, 2006 at 08:50:44AM -0800, Josh Coalson wrote: > ok, tried it out... passes test/test_seeking.sh and my > "xmms twitch" test, checked in to CVS. thanks! Thanks! I see you have changed the channels and bps setting, this doesn't work when the decoder hasn't decoded a frame. There should be a fallback for this case. -- Miroslav Lichvar -------------- next
2014 Jun 19
1
[PATCH] stream_encoder : Improve selection of residual accumulator width
On Thu, Jun 19, 2014 at 06:25:57PM +0400, lvqcl wrote: > Now I wonder why evaluate_lpc_subframe_() function in stream_encoder.c contains > almost the same code, but without any comments that it's not pessimistic enough: > evaluate_lpc_subframe_(): > > if(subframe_bps + qlp_coeff_precision + FLAC__bitmath_ilog2(order) <= 32) > if(subframe_bps <= 16 &&
2007 Apr 08
0
FLAC 24 bit test results
On Thu, Apr 05, 2007 at 06:48:06PM +0200, Josh Green wrote: > It seems that generally Wavpack does a little better than FLAC at > compressing audio. But that is generally within a rather small margin. > 20% margin seems a little large to me though. There may indeed be no > problem with the FLAC reference implementation in regards to 24 bit and > it's just having trouble compressing
2011 Sep 26
1
mid-side coding and bits per sample
Dear list, I'm doing a bit of analysis on flac's source code and I've run into something I can't quite grasp. flac version 1.2.1 flaclib C stream_encoder.c function "process_subframes_" line 2999 ++++++++++++++++++++++ if(do_mid_side) { FLAC__ASSERT(encoder->protected_->channels == 2); for(channel = 0; channel < 2; channel++) {
2018 Jul 10
9
[PATCH 0/7] PowerPC64 performance improvements
The following series adds initial vector support for PowerPC64. On POWER9, flac --best is about 3.3x faster. Amitay Isaacs (2): Add m4 macro to check for C __attribute__ features Check if compiler supports target attribute on ppc64 Anton Blanchard (5): configure.ac: Remove SPE detection code configure.ac: Add VSX enable/disable configure.ac: Fix FLAC__CPU_PPC on little endian, and add