I wrote a patch that enables FLAC__BYTES_PER_WORD==8 in
libFLAC/bitreader.c and libFLAC/bitwriter.c.
The tests were done on an Intel Nehalem CPU, and flac was compiled
with GCC 4.9.x.
Average speed change when FLAC__BYTES_PER_WORD goes from 4 to 8:
Decoding speed:
     ia32 architecture
         16-bit .flac: -15%
         24-bit .flac: -11%
     x86-64 architecture
         16-bit .flac: +3%
         24-bit .flac: -0.6%
Encoding speed (only the fastest presets, -0 to -5, were tested):
     ia32 architecture
         16-bit .wav: +0.6%
         24-bit .wav: +3%
     x86-64 architecture
         16-bit .wav: +6%
         24-bit .wav: +7%

On 29 December 2015 at 17:10, lvqcl <lvqcl.mail at gmail.com> wrote:
> I wrote a patch that enables FLAC__BYTES_PER_WORD==8 in
> libFLAC/bitreader.c and libFLAC/bitwriter.c.
> The tests were done on an Intel Nehalem CPU, and flac was compiled
> with GCC 4.9.x.

If you want to share the patch, I am happy to repeat some testing on
Sandy Bridge and Core2 with clang.

> Average speed increase for FLAC__BYTES_PER_WORD change from 4 to 8:
> [...]

The slower decoding speed for 24-bit content on x86_64 seems
surprising, but minor. However, losing 15% decoding speed on i386
would be very bad.

Riggs

Thomas Zander wrote:
> If you want to share the patch, I am happy to repeat some testing on
> Sandy Bridge and Core2 with clang.

The patch changes many files, libFLAC/bitwriter.c and
test_libFLAC/bitwriter.c among them. So for now I am waiting for the
decision on patches #3 and #4 that I posted yesterday.

> The slower decoding speed for 24 bit content on x86_64 seems
> surprising, but minor.
> However losing 15% decoding speed on i386 would be very bad.

So, does it make sense to #define FLAC__BYTES_PER_WORD (in bitreader.c)
as 4 for 32-bit and as 8 for 64-bit targets?
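
For illustration only, the per-target selection asked about above could look roughly like the sketch below. It is not taken from the patch. It assumes that the width of size_t is an acceptable stand-in for "32-bit vs 64-bit target" (which would misclassify ABIs such as x32), and the names example_word_t and example_words_in are hypothetical, not libFLAC identifiers.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* SIZE_MAX, uint32_t, uint64_t */

/* Assumption: size_t wider than 32 bits means a 64-bit target. */
#if SIZE_MAX > 0xFFFFFFFFu
#define FLAC__BYTES_PER_WORD 8
typedef uint64_t example_word_t;  /* hypothetical type name */
#else
#define FLAC__BYTES_PER_WORD 4
typedef uint32_t example_word_t;  /* hypothetical type name */
#endif

#define FLAC__BITS_PER_WORD (8 * FLAC__BYTES_PER_WORD)

/* Fail the build if the word type and the macro ever disagree. */
static_assert(sizeof(example_word_t) == FLAC__BYTES_PER_WORD,
              "word type must match FLAC__BYTES_PER_WORD");

/* Hypothetical helper: number of whole words in a buffer of nbytes bytes. */
static inline size_t example_words_in(size_t nbytes)
{
    return nbytes / FLAC__BYTES_PER_WORD;
}
```

With a selection like this, ia32 builds would keep the 4-byte word that decoded faster in the measurements above, while x86-64 builds would use the 8-byte word. Whether the bitreader and the bitwriter should switch together is a separate decision that the sketch does not address.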