Hi all, People watching the git commits might have noticed that I have been fixing a number of issues around undefined behaviour. Why you ask? * Some forms of undefined behaviour have potential for security exploits. * Compiler writers are free to replace anything which invokes UB with a NOP or even, nothing at all. * Having large numbers of UB warnings makes it difficult (or rather time consuming) to check them all for the possibility that they are a potentially exploitable. Most of these UB changes have little liklihood of performance regressions. However, this one: commit 1b8af6bb45a9ad74fa374fb6414974e63ffc793b Author: Erik de Castro Lopo <erikd at mega-nerd.com> Date: Sat Aug 29 05:21:43 2015 +1000 libFLAC/fixed.c: Fix undefined behaviour Left shift if a negative integer such that the sign bit is affected is (according to the C spec) undefined behaviour and the residual calculations using the shift operator were hitting this. Fortunately these same calculations using plain multiplication do not invoke UB and according to benchmarking (on x86_64 linux) have the same performance as the bit shift version. has no performance impact on x86_64 linux, *may* not be the same with other compilers. Since this code is in such an inner loop, any banchmark should show up regressions if they exist. Please test. Cheers, Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/
Erik de Castro Lopo wrote:> has no performance impact on x86_64 linux, *may* not be the same with > other compilers.The approach we used in Opus was to use unsigned shifts: https://git.xiph.org/?p=opus.git;a=commitdiff;h=1ee139bca076 That relies on implementation-defined behavior, but not undefined behavior.
Timothy B. Terriberry wrote:> The approach we used in Opus was to use unsigned shifts: > > https://git.xiph.org/?p=opus.git;a=commitdiff;h=1ee139bca076 > > That relies on implementation-defined behavior, but not undefined behavior.Yep, that ia an alternative if the benchmarking shows that multiplication is slower than shifting. Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/
Erik de Castro Lopo wrote:> has no performance impact on x86_64 linux, *may* not be the same with > other compilers. > > Since this code is in such an inner loop, any banchmark should show up > regressions if they exist. Please test.FLAC__fixed_restore_signal() is a part of flac *de*coder, so it makes sense to test it on different architectures (ARM, MIPS?). But I have no idea how to make it. P.S. MSVS 2013, Win32: the difference is within measurement error.
lvqcl wrote:> FLAC__fixed_restore_signal() is a part of flac *de*coder, so it makes > sense to test it on different architectures (ARM, MIPS?). But I have no > idea how to make it.Yes, but the code in FLAC__fixed_restore_signal() is (ignoring identifier name differences) identical to FLAC__fixed_compute_residual(), Maybe I should just publish my micro benchmarking code as part of the repo. Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/
Op 29-08-15 om 01:27 schreef Erik de Castro Lopo:> Most of these UB changes have little liklihood of performance regressions. > However, this one: > [...] > has no performance impact on x86_64 linux, *may* not be the same with > other compilers.I just checked on my Raspberry pi (armv6-hf, GCC 4.6) and it looks like decoding is actually faster with these changes. I benchmarked 1b8af6b against f7c52c8, the results are attached. -------------- next part -------------- A non-text attachment was scrubbed... Name: shorter set of samples-rbpi-checkundefined.pdf Type: application/pdf Size: 19491 bytes Desc: not available Url : http://lists.xiph.org/pipermail/flac-dev/attachments/20150830/7a8cd6dd/attachment.pdf
Martijn van Beurden wrote:> I just checked on my Raspberry pi (armv6-hf, GCC 4.6) and it > looks like decoding is actually faster with these changes. I > benchmarked 1b8af6b against f7c52c8, the results are attached.Interesting results, thanks. (OTOH, GCC 4.6 was released ~4.5 years ago, so it would be also interesting to test it on newer compilers - GCC 4.9.x or 5.x, or some new Clang...)