thr3ads.net - flac dev - [flac-dev] Performance checks [Jun 2013]

Martijn van Beurden

2013-May-29 14:08 UTC

[flac-dev] Performance checks

On 28-05-13 20:09, Janne Hyvärinen wrote:
> On Windows the 32-bit NASM enabled compiles are always fastest. If you 
> can run 32-bit code on your Linux box you should compile with assembly 
> optimizations.
That depends on the way you define speed. For decoding this doesn't seem 
to be true. I reran my tests, it took a little longer because I couldn't 
believe the results I got. However, they are perfectly reproducible (on 
my system at least), so I guess I'll have to believe them.

The first linked PDF shows a test with the average of 5 CDs, the second 
shows the graph of only one of those 5. It is clearly visible that the 
'speed ranking' for each compression setting matches very closely between 
the two, so the accuracy is probably pretty high. I did this comparison on 
Kubuntu 12.10 64-bit.

http://www.icer.nl/misc_stuff/All tracks.pdf
http://www.icer.nl/misc_stuff/Coldplay - Parachutes.pdf

I was surprised to see that the Windows compile on wine actually 
outperformed the native Linux one. Probably GCC 4.6 optimized a little 
better or something very weird is going on in wine, I don't know. The 
assembly optimizations work very well on encoding, but actually slow 
things down when decoding. The difference is not very large however.

Anyway, I think I'm convinced now that my lossless codec comparison was 
valid and I can keep running codecs through wine. I should probably run 
all of them through wine just for the sake of clarity.

Miroslav Lichvar

2013-May-31 10:04 UTC

[flac-dev] Performance checks

On Wed, May 29, 2013 at 04:08:57PM +0200, Martijn van Beurden wrote:
> I was surprised to see that the Windows compile on wine actually 
> outperformed the native Linux one. Probably GCC 4.6 optimized a little 
> better or something very weird is going on in wine, I don't know. The 
> assembly optimizations work very well on encoding, but actually slow 
> things down when decoding. The difference is not very large however.
In a quick test with a pre 4.8 gcc on a Core 2 CPU I see a small
improvement in decoding speed with assembly optimizations turned on,
but I think the difference used to be larger. Perhaps the compilers
got better or MMX is slower relative to normal code on current CPUs.

Disabling the FLAC__bitreader_read_rice_signed_block_asm_ia32_bswap
function seems to help a bit. (there is an #if disabling the function
with comment "OPT: not clearly faster, needs more testing" in the
src/libFLAC/stream_decoder.c file)

Here is the relative decoding speed with -5 and -8:
			-5		-8
no asm			99.0%		97.0%
asm			100.0%		100.0%
asm (no ia32_bswap)	102.7%		102.7%
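
Since these percentages are relative speeds, two builds can be compared by dividing their figures. A minimal Python sketch using only the numbers from the table above; the dictionary layout and the `speedup` helper are made up for illustration and are not part of any FLAC tooling:

```python
# Relative decoding speeds from the table above (asm build = 100%).
relative_speed = {
    "no asm": {"-5": 99.0, "-8": 97.0},
    "asm": {"-5": 100.0, "-8": 100.0},
    "asm (no ia32_bswap)": {"-5": 102.7, "-8": 102.7},
}


def speedup(build, baseline, preset):
    """Return how many times faster `build` decodes than `baseline`."""
    return relative_speed[build][preset] / relative_speed[baseline][preset]


for preset in ("-5", "-8"):
    ratio = speedup("asm (no ia32_bswap)", "no asm", preset)
    print(f"{preset}: {ratio:.3f}x the speed of the no-asm build")
```

Dividing speeds rather than subtracting percentages keeps the comparison meaningful whichever build is taken as the 100% reference.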

I think we should drop that assembly function as the C
version seems to be faster now.

Can anyone confirm this?

Thanks,

-- 
Miroslav Lichvar

Janne Hyvärinen

2013-Jun-01 11:24 UTC

[flac-dev] Performance checks

On 31.5.2013 13:04, Miroslav Lichvar wrote:
> On Wed, May 29, 2013 at 04:08:57PM +0200, Martijn van Beurden wrote:
>> I was surprised to see that the Windows compile on wine actually
>> outperformed the native Linux one. Probably GCC 4.6 optimized a little
>> better or something very weird is going on in wine, I don't know. The
>> assembly optimizations work very well on encoding, but actually slow
>> things down when decoding. The difference is not very large however.
> In a quick test with a pre 4.8 gcc on a Core 2 CPU I see a small
> improvement in decoding speed with assembly optimizations turned on,
> but I think the difference used to be larger. Perhaps the compilers
> got better or MMX is slower relative to normal code on current CPUs.
>
> Disabling the FLAC__bitreader_read_rice_signed_block_asm_ia32_bswap
> function seems to help a bit. (there is an #if disabling the function
> with comment "OPT: not clearly faster, needs more testing" in the
> src/libFLAC/stream_decoder.c file)
>
> Here is the relative decoding speed with -5 and -8:
> 			-5		-8
> no asm			99.0%		97.0%
> asm			100.0%		100.0%
> asm (no ia32_bswap)	102.7%		102.7%
>
> I think we should drop that assembly function as the C
> version seems to be faster now.
>
> Can anyone confirm this?
>
> Thanks,
>
I can confirm. I see a 10% speed improvement with that change on a Core i7.
Decoding a 1h18min38.133s long test file encoded with FLAC -8 takes 
7.656s with normal asm optimizations (speed: 616.266x realtime) and 
6.937s with that tiny change (speed: 680.140x realtime).
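
The realtime figures are the playback duration divided by the decode time. A small Python sketch of that arithmetic, using only the timings quoted in this message; the function name `realtime_factor` is made up for illustration:

```python
def realtime_factor(audio_seconds, decode_seconds):
    """Return how many seconds of audio are decoded per second of wall time."""
    return audio_seconds / decode_seconds


# Test file duration: 1h 18min 38.133s
audio_seconds = 1 * 3600 + 18 * 60 + 38.133

with_asm = realtime_factor(audio_seconds, 7.656)       # normal asm optimizations
without_bswap = realtime_factor(audio_seconds, 6.937)  # ia32_bswap function disabled

print(f"asm:                 {with_asm:.3f}x realtime")
print(f"asm (no ia32_bswap): {without_bswap:.3f}x realtime")
print(f"improvement:         {(without_bswap / with_asm - 1) * 100:.1f}%")
```

The ratio of the two factors equals 7.656 / 6.937, which is where the roughly 10% figure comes from.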
