On 29 December 2015 at 13:33, Rafaël Carré <funman at videolan.org> wrote:
> On 12/28/2015 08:35 PM, lvqcl wrote:
>> In stream_encoder.c there's the following code:
>>
>> #if defined FLAC__CPU_X86_64 /* and other 64-bit arch, too */
>> if(mean <= 0x80000000/512) { /* 512: more or less optimal for both 16- and 24-bit input */
>> #else
>> if(mean <= 0x80000000/8) { /* 32-bit arch: use 32-bit math if possible */
>> #endif
>>
>> A) How to properly check for 64-bit architectures?
>> I can check for "defined FLAC__CPU_X86_64" or "defined _WIN64".
>> Is it possible to use SIZEOF_VOIDP? such as "#if SIZEOF_VOIDP == 8" ?
>
> That would need a special case for Linux x32 which is x86_64 with 32
> bits pointers...

and this probably won't be the last time we'd need to handle special cases.
Do we really need to handle this at all? Entangling CPU-arch-dependent
#ifdefs with input sample size (see "tuned for N-bit input" a few
lines below) seems weird.
IMHO finding the rice parameter should be independent of the cpu arch
unless there is a spectacular benefit by distinguishing.

Best regards
Riggs
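For illustration, the x32 pitfall mentioned above could be sketched roughly as follows. This is only a sketch, not anything from libFLAC: the EXAMPLE_* macro names and example_is_x32_like() are made up here, and the list of compiler-predefined macros is an assumption that covers only a handful of architectures. It shows that "has 64-bit registers" and "has 64-bit pointers" are separate questions, which is why a SIZEOF_VOIDP-style test alone is not enough.

```c
#include <stdint.h>

/* HYPOTHETICAL macro, not part of libFLAC: guesses whether the target has
 * native 64-bit integer registers. The list is an assumption and is far from
 * complete; __x86_64__ is also defined by GCC when targeting the x32 ABI. */
#if defined(__x86_64__) || defined(_M_X64) || defined(_WIN64) || \
    defined(__aarch64__) || defined(_M_ARM64) || defined(__powerpc64__)
#  define EXAMPLE_HAVE_64BIT_REGS 1
#else
#  define EXAMPLE_HAVE_64BIT_REGS 0
#endif

/* Pointer-width test, roughly what "#if SIZEOF_VOIDP == 8" would check. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#  define EXAMPLE_POINTERS_ARE_64BIT 1
#else
#  define EXAMPLE_POINTERS_ARE_64BIT 0
#endif

/* Nonzero on x32-like targets: 64-bit registers but 32-bit pointers.
 * A pointer-size check alone would misclassify these as 32-bit. */
static int example_is_x32_like(void)
{
    return EXAMPLE_HAVE_64BIT_REGS && !EXAMPLE_POINTERS_ARE_64BIT;
}
```

Every new architecture needs another entry in a list like this, which is the maintenance cost being questioned in this thread.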
Thomas Zander wrote:
> ... and this probably won't be the last time we'd need to handle special cases.
> Do we really need to handle this at all? Entangling CPU-arch-dependent
> #ifdefs with input sample size (see "tuned for N-bit input" a few
> lines below) seems weird.
> IMHO finding the rice parameter should be independent of the cpu arch
> unless there is a spectacular benefit by distinguishing.

I agree that it's a good idea to test the speed of encoding.
Maybe different code for 32-bit and 64-bit architectures
is over-optimization.
On 29 December 2015 at 17:16, lvqcl <lvqcl.mail at gmail.com> wrote:
> I agree that it's a good idea to test the speed of encoding.
> Maybe different code for 32-bit and 64-bit architectures
> is over-optimization.

I completely agree. With today's complicated CPUs it would be extremely
hard if not impossible to formulate something like

#if 64-bit-CPU
//C code that is guaranteed to run faster on any 64 bit CPU of any
//given arch than its 32 bit counterpart
//i.e. 64 bit x86 faster than 32 bit x86; 64 bit ARM faster than 32 bit ARM;
//ppc, sparc, mips, ... for every chip
#else
//version that is always faster on 32 bit
#endif

Maybe it's time for profile guided optimisation :-)

Seriously though, we should keep the number of compile-time-#define'able
code paths that are supposed to calculate the same result manageable.
The Rice parameter estimation / calculation in stream_encoder.c
contains surprisingly many #ifdefs.

Riggs
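As a rough illustration of what a single, architecture-independent code path might look like, here is a minimal sketch. It is not the code in stream_encoder.c: example_rice_parameter() and its parameters are hypothetical, and the "smallest k with n_samples << k >= abs_sum" rule is just a simple mean-based estimate assumed for the example. It always uses 64-bit math, so there is no #ifdef at all; whether that costs measurable speed on 32-bit targets would still have to be benchmarked.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL helper, not the libFLAC implementation.
 * Returns the smallest k such that (n_samples << k) >= abs_sum,
 * i.e. roughly ceil(log2(mean)), capped at max_param.
 * abs_sum is the sum of absolute residuals in the partition. */
static unsigned example_rice_parameter(uint64_t abs_sum, uint32_t n_samples,
                                       unsigned max_param)
{
    uint64_t scaled = n_samples; /* n_samples << k, kept in 64 bits */
    unsigned k = 0;

    assert(n_samples > 0);
    assert(max_param < 32); /* keeps scaled below 2^63, so no overflow */

    while (k < max_param && scaled < abs_sum) {
        scaled <<= 1;
        k++;
    }
    return k;
}
```

For example, a partition of 16 samples with an absolute residual sum of 33 gives k = 2 here, since 16 << 1 = 32 is still below 33 while 16 << 2 = 64 is not.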