thr3ads.net - Vorbis dev - [Vorbis-dev] intro + specification remarks + some questions [Dec 2006]

If this information is useful, please help other people find it:
Share via:

Bram de Jong

2006-Dec-05 13:52 UTC

[Vorbis-dev] intro + specification remarks + some questions

hi everyone,


since about 10 days I've been working on an implementation of ogg vorbis 
*decoder* from scratch, and I've got a few constructive remarks about 
the specs and some questions.

--------------------------
First a question:

 > 9.2.4. low_neighbor
 >
 > "low_neighbor(v,x)" finds the position n in vector [v] of the
greatest
 > value scalar element for which n is less than [x] and vector [v]
 > element n is less than vector [v] element [x].

I'm thoroughly confused by this definition... Does this mean:

A)
   for (n=x-1; n>=0; n--) if (v[n]<v[x]) return n;

B)
low_neighbor(v, x)
{
     max_pos = 0;

     for (n=0...x-1)
         if (v[n] < v[x])
             if (v[n] >= v[max_pos])
                 max_pos = n;

     return max_pos;
}

In case of A), perhaps the specs should be rewritten for clarity, and 
some pseudocode could be added... In case of B), doesn't this mean that, 
depending on the loop and last if( ) one can get different results as in:

http://bram.smartelectronix.com/tmp/low_neighbor.png

( line = vector values, red cross = x, green dots = possible values of n )


--------------------------
Secondly, some remarks on the clarity of the specifications.

* some additional pseudocode overview of the whole process would be 
highly appreciated by newbies like me, even something as simple as:

   - read one packet and decode as header information
   - read one packet and decode as comments
   - read one packet and decode as setup:
       - decode codebooks
       - decode floors
       - decode residues
       - decode mappings
       - decode modes
   - read (packets) as audio packet:
       - header
       - 'packet decode' floor curves
       - 'packet decode' residues
       - synthesise

I.e. a bit more highlevel structure to the docs would be cool and 
provide a lot of clarity. The lowlevel pseudocode is great and highly 
readable, but I did notice that somewhere halfway (after decoding the 
headers) the style changed quite a bit leading me to suspect two authors 
at work.

* The second part (packet decode / synthesis) is the probably the 
hardest to read/understand, and some of the pseudocode is on the edge of 
difficult to grasp.

* 7.2.2. header decode / 7.2.2.1. packet decode
it is quite confusing for a newbie that 7.2.2.1 is a subsection of 7.2.2 
as it is *not* the logical "next step" when decoding the header.
Packet
decode should really have a line saying something like: "when decoding 
an audio packet, the floor is decoded as follows". This goes for both 
floors and residues.

* it would be interesting to note the size of vectors in the pseudocode. 
Some parts of the pseudocode become unusable without this information:
a particular example from "8.6.2. packet decode": "allocate and
zero all
vectors that will be returned." This is a problem throughout the 
specifications.



I hope you guys like constructive criticism!


kindest regards and thanks for any help offered,


  - bram de jong

-- 
http://www.splicemusic.com
http://freesound.iua.upf.edu
http://www.smartelectronix.com
http://www.musicdsp.org

Bram de Jong

2006-Dec-05 13:53 UTC

head link

[Vorbis-dev] intro + specification remarks + some questions

hi everyone,


since about 10 days I've been working on an implementation of ogg vorbis 
*decoder* from scratch, and I've got a few constructive remarks about 
the specs and some questions.

--------------------------
First a question:

 > 9.2.4. low_neighbor
 >
 > "low_neighbor(v,x)" finds the position n in vector [v] of the
greatest
 > value scalar element for which n is less than [x] and vector [v]
 > element n is less than vector [v] element [x].

I'm thoroughly confused by this definition... Does this mean:

A)
   for (n=x-1; n>=0; n--) if (v[n]<v[x]) return n;

B)
low_neighbor(v, x)
{
     max_pos = 0;

     for (n=0...x-1)
         if (v[n] < v[x])
             if (v[n] >= v[max_pos])
                 max_pos = n;

     return max_pos;
}

In case of A), perhaps the specs should be rewritten for clarity, and 
some pseudocode could be added... In case of B), doesn't this mean that, 
depending on the loop and last if( ) one can get different results as in:

http://bram.smartelectronix.com/tmp/low_neighbor.png

( line = vector values, red cross = x, green dots = possible values of n )


--------------------------
Secondly, some remarks on the clarity of the specifications.

* some additional pseudocode overview of the whole process would be 
highly appreciated by newbies like me, even something as simple as:

   - read one packet and decode as header information
   - read one packet and decode as comments
   - read one packet and decode as setup:
       - decode codebooks
       - decode floors
       - decode residues
       - decode mappings
       - decode modes
   - read (packets) as audio packet:
       - header
       - 'packet decode' floor curves
       - 'packet decode' residues
       - synthesise

I.e. a bit more highlevel structure to the docs would be cool and 
provide a lot of clarity. The lowlevel pseudocode is great and highly 
readable, but I did notice that somewhere halfway (after decoding the 
headers) the style changed quite a bit leading me to suspect two authors 
at work.

* The second part (packet decode / synthesis) is the probably the 
hardest to read/understand, and some of the pseudocode is on the edge of 
difficult to grasp.

* 7.2.2. header decode / 7.2.2.1. packet decode
it is quite confusing for a newbie that 7.2.2.1 is a subsection of 7.2.2 
as it is *not* the logical "next step" when decoding the header.
Packet
decode should really have a line saying something like: "when decoding 
an audio packet, the floor is decoded as follows". This goes for both 
floors and residues.

* it would be interesting to note the size of vectors in the pseudocode. 
Some parts of the pseudocode become unusable without this information:
a particular example from "8.6.2. packet decode": "allocate and
zero all
vectors that will be returned." This is a problem throughout the 
specifications.



I hope you guys like constructive criticism!


kindest regards and thanks for any help offered,


  - bram de jong

-- 
http://www.splicemusic.com
http://freesound.iua.upf.edu
http://www.smartelectronix.com
http://www.musicdsp.org

xiphmont@xiph.org

2006-Dec-05 20:15 UTC

head link

[Vorbis-dev] intro + specification remarks + some questions

On 12/5/06, Bram de Jong <bram@splicemusic.com> wrote:
>  > "low_neighbor(v,x)" finds the position n in vector [v] of
the greatest
>  > value scalar element for which n is less than [x] and vector [v]
>  > element n is less than vector [v] element [x].
Yes, this wording is... tough... but so is the encoding optimization
it's trying to express.  Vector [v] is essentially a map and the
values in vector [v] are integer positions; they're offsets into a
another vector.  Vector [v]'s values are these offsets listed in
encoded order.  It's a progressive 'tree' encoding where positions
are
listed in multiple passes of increasing resolution.  Given a value
from the mapping, the low neighbor is the closest value preceeding
that in the encoding up to that point (there may be a closer value if
the vector is scanned in its entirety, but we're only interested in
values that have already been decoded).

So, low_neigbor finds the position in [v] of the mapping value
immediately preceeding vector [v] element [x], but only up to the
current point of decoding, that is [x].  This is why the return value
must be A) less than [x] (only up to the current decode point), B)
less than vector [v] element [x] (must be a lower value) and the value
at the returned position must be the largest of the values that
otherwise satisfy A) and B) (that is, it not only preceeds vector [v]
element [x] in the mapping, but is in fact the mapping value
immediately preceeding it, if vector [v] was sorted)
> I'm thoroughly confused by this definition... Does this mean:
>
> A)
>    for (n=x-1; n>=0; n--) if (v[n]<v[x]) return n;
no.
> B)
> low_neighbor(v, x)
> {
>      max_pos = 0;
>
>      for (n=0...x-1)
>          if (v[n] < v[x])
>              if (v[n] >= v[max_pos])
>                  max_pos = n;
>
>      return max_pos;
> }
yes.
> In case of B), doesn't this mean that,
> depending on the loop and last if( ) one can get different results as in:
>
> http://bram.smartelectronix.com/tmp/low_neighbor.png
>
> ( line = vector values, red cross = x, green dots = possible values of n )
low_neighbor is not well defined if each value in vector[v] is not
unique, however a [v] with non-unique values is not allowed.
> I.e. a bit more highlevel structure to the docs would be cool and
> provide a lot of clarity. The lowlevel pseudocode is great and highly
> readable, but I did notice that somewhere halfway (after decoding the
> headers) the style changed quite a bit leading me to suspect two authors
> at work.
The original spec was written entirely by me.  I do recall needing to
shift style slightly to capture a concept that I couldn't encode int
he original style, but I thought I'd gone back and updated the
previous code to all reflect the 'newer' style.

[I've not commented on several other points that are good suggestions]

Monty

Reasonably Related Threads

Search for more seemingly similar threads

Vorbis dev - Dec 2006 - intro + specification remarks + some questions

[Vorbis-dev] intro + specification remarks + some questions

[Vorbis-dev] intro + specification remarks + some questions

[Vorbis-dev] intro + specification remarks + some questions

Reasonably Related Threads