Currently cpu.h lacks the FLAC__SSE_TARGET and FLAC__SSEnn_SUPPORTED macros for clang. I added them, but I cannot test them properly because I can't build flac.exe with clang under Windows (I don't know how to set up clang under MSYS2).

If somebody has a working clang, please test this patch. Does it affect encoding/decoding speed? Or at least, does it affect the disassembly of functions such as FLAC__precompute_partition_info_sums_intrin_avx2()?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang_cpu_support.patch
Type: application/octet-stream
Size: 1809 bytes
Desc: not available
URL: <http://lists.xiph.org/pipermail/flac-dev/attachments/20170126/2b72f4e0/attachment.obj>
lvqcl wrote:

> Currently cpu.h lacks FLAC__SSE_TARGET and FLAC__SSEnn_SUPPORTED
> macros for clang. I added them, but I cannot properly test them
> as I can't get compiled flac.exe under Windows (don't know
> how to setup clang under MSYS2).

I can relatively easily install Clang on Linux.

> If somebody has working clang, please test this patch.
> Does it affect en/decoding speed?

How reliable a test is that? I do 99.9% of my dev work on a laptop, and whenever I need to benchmark something I have to do so on a desktop machine because the laptop doesn't give consistent results.

> Or at least, does it affect disassembly of functions
> such as FLAC__precompute_partition_info_sums_intrin_avx2()?

What am I looking for? Is posting the before and after versions sufficient?

Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
Erik de Castro Lopo wrote:

> What am I looking for? Is posting the before and after versions
> sufficient?

Disassembly of the object files (before and after) is here:

http://mega-nerd.com/tmp/stream_encoder_intrin_avx2-before.txt
http://mega-nerd.com/tmp/stream_encoder_intrin_avx2-after.txt

This is with clang version 3.8.1 from Debian testing.

Erik
Erik de Castro Lopo wrote:

> How reliable a test is that? I do 99.9% of my dev work on a laptop
> and whenever I need to benchmark something I need to do so on a
> desktop machine because the laptop doesn't give consistent results.

About 1.5 years ago I tested the AVX2 speed increase on Haswell (i7-4770) using the -8 encoding preset. The biggest difference between AVX2 enabled and disabled was 35% (64-bit flac.exe, 24-bit WAV input file). The smallest difference was 10% (32-bit flac.exe, 16-bit WAV input file).

> Disassembly of the object files (before and after) is here:
>
> http://mega-nerd.com/tmp/stream_encoder_intrin_avx2-before.txt
> http://mega-nerd.com/tmp/stream_encoder_intrin_avx2-after.txt
>
> This is with clang version 3.8.1 from Debian testing.

I forgot that all AVX2 functions are inside an "#ifdef FLAC__AVX2_SUPPORTED" conditional, so they simply don't exist if FLAC__AVX2_SUPPORTED is not set. Anyway, stream_encoder_intrin_avx2-after.txt shows that the code contains AVX2 instructions such as vpabsd/vpaddd/vphaddd, so this function was compiled properly.
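
For anyone following along, here is a rough sketch of the pattern being discussed. It is not the contents of clang_cpu_support.patch and not the real cpu.h: every DEMO_* name and every demo_* function is made up for illustration. It assumes a GCC-compatible compiler that provides __has_attribute, the target function attribute and __builtin_cpu_supports(). It shows a per-function target macro, a "supported" macro that decides whether the AVX2 function exists at all, and a runtime check before calling it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DEMO_* macros; they only mimic the idea behind
 * FLAC__SSE_TARGET and FLAC__AVX2_SUPPORTED. */
#if defined(__has_attribute)
#  define DEMO_HAS_ATTRIBUTE(x) __has_attribute(x)
#else
#  define DEMO_HAS_ATTRIBUTE(x) 0
#endif

#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__clang__) || defined(__GNUC__)) && \
    DEMO_HAS_ATTRIBUTE(__target__)
#  include <immintrin.h>
   /* Enable an instruction set for one function only, without -mavx2. */
#  define DEMO_SSE_TARGET(x) __attribute__((__target__(x)))
#  define DEMO_AVX2_SUPPORTED 1
#endif

/* Absolute value as unsigned, well defined for INT32_MIN too. */
static uint32_t demo_abs_u32(int32_t x)
{
    return (uint32_t)(x < 0 ? -(int64_t)x : (int64_t)x);
}

static uint32_t demo_abs_sum_scalar(const int32_t *data, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += demo_abs_u32(data[i]);
    return sum;
}

#ifdef DEMO_AVX2_SUPPORTED
/* This function does not exist at all when DEMO_AVX2_SUPPORTED is unset. */
DEMO_SSE_TARGET("avx2")
static uint32_t demo_abs_sum_avx2(const int32_t *data, size_t n)
{
    __m256i acc = _mm256_setzero_si256();
    uint32_t lanes[8];
    uint32_t sum = 0;
    size_t i = 0;

    /* Eight 32-bit values per iteration: absolute value, then add. */
    for (; i + 8 <= n; i += 8) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(data + i));
        acc = _mm256_add_epi32(acc, _mm256_abs_epi32(v));
    }
    _mm256_storeu_si256((__m256i *)lanes, acc);
    for (int k = 0; k < 8; k++)
        sum += lanes[k];
    for (; i < n; i++)               /* leftover elements */
        sum += demo_abs_u32(data[i]);
    return sum;
}
#endif

/* Use the AVX2 version only if it was compiled in and the CPU has AVX2. */
uint32_t demo_abs_sum(const int32_t *data, size_t n)
{
    assert(data != NULL || n == 0);
#ifdef DEMO_AVX2_SUPPORTED
    if (__builtin_cpu_supports("avx2"))
        return demo_abs_sum_avx2(data, n);
#endif
    return demo_abs_sum_scalar(data, n);
}
```

If the "supported" macro is never defined for a given compiler, the AVX2 function is silently left out and the scalar path is always taken, so a before/after disassembly would be expected to differ mainly in whether that function is present.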