Chris Moore
2014-Mar-20 17:56 UTC
[Vorbis-dev] BARK implementation (or specification) error
Hi,

In the course of some work which I describe below, I have found a very significant difference between the BARK function described in the Vorbis specification and its implementation in libvorbis.

In the specification http://xiph.org/vorbis/doc/Vorbis_I_spec.pdf
bark(x) = 13.1arctan(.00074x) + 2.24arctan(.0000000185x**2 + .0001x)

In the libvorbis code http://svn.xiph.org/trunk/vorbis/lib/scales.h
#define toBARK(n) (13.1f*atan(.00074f*(n))+2.24f*atan((n)*(n)*1.85e-8f)+1e-4f*(n))

You will note that the last term is inside the parentheses of the second arctan function in the specification but outside in the libvorbis implementation.
This results in extremely large differences at high frequencies.
Which is correct: the specification or the implementation?

What should I consider to be the reference Vorbis decoder?
Currently I am supposing it to be the vorbisfile_example code in Vorbis but I am rather surprised that while the heavy transcendental functions are in double, many other calculations are only in float.
I would have expected a reference decoder to use the greater precision of double throughout.

Otherwise, for information, I have been looking into the differences between the output of Tremor and the output of the Vorbis floating point decoder reported in the following post:
http://www.mail-archive.com/tremor at xiph.org/msg00072.html

Where floor1 is used the RMS differences are already small: about 0.7
However with a couple of trivial modifications I have been able to reduce the RMS differences to about 0.1

The big problem is that where floor0 is used the RMS differences are *enormous*.
I have investigated these and they are:
a) partly due to an insufficient number of fractional bits in certain places,
b) partly due to insufficient precision in the approximation of transcendental functions (cos, atan, exp, sqrt).
I have been able to significantly improve the results but I hope to do better.
The main difficulties are in the approximation of these transcendental functions.

How should I submit my modifications?
They are basically of three types:
a) trivial improvements having no impact on the results (e.g. constifying, more efficient calculations, saving table space).
b) trivial modifications improving the accuracy of floor1 and the MDCT,
c) substantial modifications to floor0 (these are currently unfinished).

TIA.

Cheers,
Chris
Tor-Einar Jarnbjo
2014-Mar-21 09:32 UTC
[Vorbis-dev] BARK implementation (or specification) error
Hi Chris,

On 20.03.2014 18:56, Chris Moore wrote:
> In the course of some work which I describe below, I have found a very significant difference between the BARK function described in the Vorbis specification and its implementation in libvorbis.

I believe I implemented one of the first Vorbis decoders based on the specification, around 2002, and found several relevant differences between the specification and the reference decoder. Most of the issues I found were corrected in the specification, but I am sure that I didn't find all of the deviations.

> In the specification http://xiph.org/vorbis/doc/Vorbis_I_spec.pdf
> bark(x) = 13.1arctan(.00074x) + 2.24arctan(.0000000185x**2 + .0001x)
>
> In the libvorbis code http://svn.xiph.org/trunk/vorbis/lib/scales.h
> #define toBARK(n) (13.1f*atan(.00074f*(n))+2.24f*atan((n)*(n)*1.85e-8f)+1e-4f*(n))
>
> You will note that the last term is inside the parentheses of the second arctan function in the specification but outside in the libvorbis implementation.
> This results in extremely large differences at high frequencies.
> Which is correct: the specification or the implementation?

Hard to say, since you may not even find an encoder making use of that feature. The function is only relevant when decoding floor type 0 and even the specification states: "Floor 0 is not to be considered deprecated, but it is of limited modern use. No known Vorbis encoder past Xiph.Org's own beta 4 makes use of floor 0."

I remember having problems interpreting the floor 0 specification and decided not to implement it at all. I am not sure exactly how much my Java library (J-Ogg) has been used, but I have never heard a complaint from anyone who ran into problems because floor 0 is unsupported.

> What should I consider to be the reference Vorbis decoder?
> Currently I am supposing it to be the vorbisfile_example code in Vorbis but I am rather surprised that while the heavy transcendental functions are in double, many other calculations are only in float.
> I would have expected a reference decoder to use the greater precision of double throughout.

I don't have time to check the specification right now, but I am pretty sure that it does not require the decoder to use a given precision. In a few places, the specification points out that a fixed-point implementation should use a higher precision than what may seem obvious, but there are several "shoulds" in these statements and I don't think many "musts". The result is of course that two decoders may produce slightly different PCM output and both be considered correct according to the specification.

Regards,
Tor
xiphmont at xiph.org
2015-Feb-27 21:47 UTC
[Vorbis-dev] BARK implementation (or specification) error
Digging this up from a year ago, because I'd missed it at the time (and thank you for pointing it out again in the other errata thread)...

> In the course of some work which I describe below, I have found a very significant difference between the BARK function described in the Vorbis specification and its implementation in libvorbis.
>
> In the specification http://xiph.org/vorbis/doc/Vorbis_I_spec.pdf
> bark(x) = 13.1arctan(.00074x) + 2.24arctan(.0000000185x**2 + .0001x)

The last paren in the above equation is misplaced. The original HTML version of the spec is correct (yay commit history), but this typo crept in a long time ago when the HTML spec was re-typeset to XML and TeX. And, unfortunately, at some point someone noticed they didn't match, and updated the HTML to match the typo (thus once again reinforcing that all you have to do is typeset something in TeX and it will automatically be treated as authoritative :-)

The original and correct equation is:

bark(x) = 13.1arctan(.00074x) + 2.24arctan(.0000000185x**2) + .0001x

> Otherwise, for information, I have been looking into the differences between the output of Tremor and the output of the Vorbis floating point decoder reported in the following post:
> http://www.mail-archive.com/tremor at xiph.org/msg00072.html
>
> Where floor1 is used the RMS differences are already small: about 0.7
> However with a couple of trivial modifications I have been able to reduce the RMS differences to about 0.1
>
> The big problem is that where floor0 is used the RMS differences are *enormous*.
> I have investigated these and they are:
> a) partly due to an insufficient number of fractional bits in certain places,
> b) partly due to insufficient precision in the approximation of transcendental functions (cos, atan, exp, sqrt).
> I have been able to significantly improve the results but I hope to do better.
> The main difficulties are in the approximation of these transcendental functions.

There are also #defines to alter the precision. However, the broadband RMS differences are less important than spectral energy and peak differences. Do they also differ by large amounts?

> How should I submit my modifications?
> They are basically of three types:
> a) trivial improvements having no impact on the results (e.g. constifying, more efficient calculations, saving table space).
> b) trivial modifications improving the accuracy of floor1 and the MDCT,
> c) substantial modifications to floor0 (these are currently unfinished).

Submitting patches here on the list is appropriate, as is opening a Trac bug and attaching the patches there. However, I generally prefer discussion to happen on the list; more people see it with less effort.

Monty