lvqcl

2014-May-17 09:26 UTC

[flac-dev] PATCH for replaygain_synthesis

The file src/share/replaygain_synthesis/include/private/fast_float_math_hack.h
redefines 'tanh' as 'tanhf'. This file is intended for the Intel Compiler only,
but it includes the outdated mathf.h and doesn't work with current versions of
ICC.

The fixes are trivial though, and I compiled 2 versions of flac.exe: with this
'hack' turned off and on. The difference in decoding speed is comparable to
the measurement error: for the 32-bit build the decoding time decreases from
94.5s to 94.0s, for the 64-bit build it increases from 82.6s to 82.9s.
(the option for this test was: --apply-replaygain-which-is-not-lossless=Ln0)

So this hack is really useless today, and the first patch removes
fast_float_math_hack.h from the sources.




The MSVS profiler shows that the tanh calculation doesn't take much CPU time;
the real problem is an integer division (int_64/int_32) in this line:

     val64 = dither_output_(........) / conv_factor;

Since all possible values of conv_factor are powers of 2, it's possible to
replace division with shift. The second patch does this.

Decoding time decreases from 94.5s to 64.1s for 32-bit ICC compile, and
from 82.6s to 50.0s for 64-bit ICC compile.
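
To illustrate the idea behind the second patch, here is a minimal sketch of the two forms. It is not the code from the patch: the helper names (shift_for_factor, scale_by_division, scale_by_shift) are made up for this example, and it assumes the compiler implements >> on negative signed values as an arithmetic shift, which is implementation-defined in C but is what mainstream compilers do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: log2 of a power-of-two factor, or -1 if the
 * factor is zero or not a power of two. */
static int shift_for_factor(uint32_t conv_factor)
{
    int shift = 0;

    if (conv_factor == 0 || (conv_factor & (conv_factor - 1)) != 0)
        return -1;
    while ((conv_factor >> shift) != 1u)
        shift++;
    return shift;
}

/* Division form: one 64-bit by 32-bit signed division per sample. */
static int64_t scale_by_division(int64_t value, uint32_t conv_factor)
{
    return value / (int64_t)conv_factor;
}

/* Shift form: assumes >> on a negative signed value is an arithmetic
 * shift (implementation-defined in C). */
static int64_t scale_by_shift(int64_t value, int shift)
{
    assert(shift >= 0 && shift < 63);
    return value >> shift;
}
```

The shift amount only has to be computed once per conversion factor, so the per-sample work drops from a division to a single shift.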



*************************************************
P.S. Actually, shift ( x >> n ) and division ( x / (1<<n) ) can give
different results if x < 0. The difference is very small though: WAV files
differ by 1 LSB. And shift probably gives better results than division.

Let's compare shift by 2 and division by (1<<2) == 4:

*** shift ***
argument            result
....
12, 13, 14, 15  ->    3
 8,  9, 10, 11  ->    2
 4,  5,  6,  7  ->    1
 0,  1,  2,  3  ->    0
-4, -3, -2, -1  ->   -1
-8, -7, -6, -5  ->   -2
....

*** division ***
argument                       result
....
 12,  13, 14, 15            ->    3
  8,   9, 10, 11            ->    2
  4,   5,  6,  7            ->    1
-3, -2, -1,  0,  1,  2,  3  ->    0
 -7,  -6, -5, -4            ->   -1
-11, -10, -9, -8            ->   -2
....


So, shift results in a small DC offset (1/2 LSB), while division results in
a small 'nonlinearity' near 0.
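
The two tables can be reproduced with a short sketch like the one below. The function names are hypothetical and chosen only for this example. It assumes an arithmetic right shift for negative values (implementation-defined in C) and C99 or later semantics for division, which truncates toward zero.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift by n: rounds toward negative infinity, assuming the
 * compiler uses an arithmetic shift for negative values. */
static int64_t shift_down(int64_t x, int n)
{
    assert(n >= 0 && n < 63);
    return x >> n;
}

/* Division by 2^n: truncates toward zero (C99 and later). */
static int64_t divide_down(int64_t x, int n)
{
    assert(n >= 0 && n < 63);
    return x / ((int64_t)1 << n);
}

/* Counts the inputs in [lo, hi] for which the two methods disagree. */
static int count_mismatches(int64_t lo, int64_t hi, int n)
{
    int count = 0;
    int64_t x;

    for (x = lo; x <= hi; x++)
        if (shift_down(x, n) != divide_down(x, n))
            count++;
    return count;
}
```

For n = 2 the two functions agree on every non-negative input and on negative multiples of 4. They disagree only on the remaining negative inputs, where the shift result is exactly 1 lower than the division result.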
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1__remove_ffmhack.patch
Type: application/octet-stream
Size: 2594 bytes
Desc: not available
Url :
http://lists.xiph.org/pipermail/flac-dev/attachments/20140517/e2f5dcb0/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2__apply_gain.patch
Type: application/octet-stream
Size: 3073 bytes
Desc: not available
Url :
http://lists.xiph.org/pipermail/flac-dev/attachments/20140517/e2f5dcb0/attachment-0001.obj

Erik de Castro Lopo

2014-May-18 03:58 UTC

[flac-dev] PATCH for replaygain_synthesis

lvqcl wrote:
> The file src/share/replaygain_synthesis/include/private/fast_float_math_hack.h
> redefines 'tanh' as 'tanhf'. This file is intended for Intel Compiler only,
> but it includes outdated mathf.h and doesn't work with current versions of
> ICC.
>
> The fixes are trivial though, and I compiled 2 versions of flac.exe: with this
> 'hack' turned off and on. The difference in decoding speed is very close to
> measurement inaccuracy: for 32-bit encoder the decoding time decreases from
> 94.5s to 94.0s, for 64-bit it increases from 82.6s to 82.9s.
> (the option for this test was: --apply-replaygain-which-is-not-lossless=Ln0)
>
> So this hack is really useless today, and the first patch removes
> fast_float_math_hack.h from the sources.

Both patches applied. Thanks.

Cheers,
Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

John Edwards

2014-May-18 10:18 UTC

[flac-dev] PATCH for replaygain_synthesis

I've not benchmarked to know if there is any real benefit, but changing
the include in fast_float_math_hack.h to <mathimf.h> is all that is
required to use the latest ICC.

John

lvqcl

2014-May-18 10:34 UTC

[flac-dev] PATCH for replaygain_synthesis

John Edwards wrote:
> I've not benchmarked to know if there is any real benefit, but changing
> the include in fast_float_math_hack.h to <mathimf.h> is all that is
> required to use the latest ICC.
>
> John
Well, it was also necessary to change replaygain_synthesis.c:
the inclusion of private/fast_float_math_hack.h should come before
the inclusion of math.h. But yes, that's all that was necessary
to compile this project.
