It turns out that int64 shift is quite slow...

This patch changes the code from:
    (FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
into:
    _mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));

Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.

The new code works only if quantization <= 32, but its max value is 15 so the code always works.
(max_shiftlimit == (1 << (FLAC__SUBFRAME_LPC_QLP_SHIFT_LEN-1)) - 1 == 15)
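The "<= 32" condition can be checked with a small sketch. An arithmetic and a logical right shift by q differ only in the top q bits of the 64-bit result, so for q <= 32 the low 32 bits (the part that is kept) are identical. The helper names below (low32_arith, low32_logical, shifts_agree, check_flac_range) are made up for this illustration and are not part of libFLAC. The scalar fallback is only a model of the per-lane behaviour of _mm_srli_epi64 for builds without SSE2. The sketch also assumes that the compiler implements >> on a negative int64_t as an arithmetic shift, which the old code relied on as well.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2_DEMO 1
#endif

/* Old code path: 64-bit arithmetic shift, then truncation to 32 bits.
 * Assumes >> on a negative value is an arithmetic shift. */
static uint32_t low32_arith(int64_t v, int q)
{
    return (uint32_t)(v >> q);
}

/* New code path: 64-bit logical shift, then take the low 32 bits. */
static uint32_t low32_logical(int64_t v, int q)
{
#ifdef HAVE_SSE2_DEMO
    __m128i x = _mm_loadl_epi64((const __m128i *)&v); /* v in the low lane */
    return (uint32_t)_mm_cvtsi128_si32(_mm_srli_epi64(x, q));
#else
    /* Scalar model of the logical shift when SSE2 is unavailable. */
    return (uint32_t)((uint64_t)v >> q);
#endif
}

/* Nonzero when both paths give the same low 32 bits. */
static int shifts_agree(int64_t v, int q)
{
    return low32_arith(v, q) == low32_logical(v, q);
}

/* Checks every shift the encoder can request (0..15) on a few values. */
static void check_flac_range(void)
{
    const int64_t samples[] = { 0, 1, -1, 123456789LL,
                                -123456789012345LL, INT64_MAX, INT64_MIN };
    for (int q = 0; q <= 15; q++)
        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
            assert(shifts_agree(samples[i], q));
}
```

Calling check_flac_range() from a test program should pass silently. For a negative value and q == 33 the two paths disagree, because the zero shifted in by the logical shift reaches bit 31.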
Erik de Castro Lopo
2014-Jan-30 11:20 UTC
[flac-dev] PATCH for lpc_intrin_sse41.c: faster shifts
lvqcl wrote:

> It turns out that int64 shift is quite slow...
>
> This patch changes the code from:
>     (FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
> into:
>     _mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));
>
> Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.
>
> The new code works only if quantization <= 32, but its max value is 15 so the code always works.
> (max_shiftlimit == (1 << (FLAC__SUBFRAME_LPC_QLP_SHIFT_LEN-1)) - 1 == 15)

I think you forgot to attach the patch for this one :-).

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
Erik de Castro Lopo wrote:

> I think you forgot to attach the patch for this one :-).

Oops.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: fast_shift.patch
Type: application/octet-stream
Size: 5826 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/flac-dev/attachments/20140130/889aec81/attachment.obj