thr3ads.net - opus - [CELT-dev] On guessing theta [Mar 2009]

Benjamin M. Schwartz

2009-Mar-20 05:01 UTC

[CELT-dev] On guessing theta

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

So you're the encoder. You get two vectors in (for some band), L and R.

One thing you could do with this is compute M = L+R and S = L-R.  (Yes, I
know, this is not how the encoder actually works. Bear with me.) Then let
m = normalize(M) and s = normalize(S).

You transmit m, s, |L|, and |R|.

The decoder needs to find unknown positive constants a and b to compute

L = a*m + b*s
R = a*m - b*s

To find a and b, we use two constraints
|L|^2 = |a*m + b*s|^2
|R|^2 = |a*m - b*s|^2

That proceeds as follows:
|a*m + b*s|^2 = a^2*|m|^2 + b^2*|s|^2 + 2*a*b*dot(m,s)
            = a^2 + b^2 + 2*a*b*dot(m,s) = |L|^2

|a*m - b*s|^2 = a^2*|m|^2 + b^2*|s|^2 - 2*a*b*dot(m,s)
           = a^2 + b^2 - 2*a*b*dot(m,s) = |R|^2

We now compute the sum and difference:
sum:
2*(a^2 + b^2) = |L|^2 + |R|^2
a^2 + b^2 = (|L|^2 + |R|^2)/2

difference:
4*a*b*dot(m,s) = |L|^2 - |R|^2
a*b = (|L|^2 - |R|^2)/(4*dot(m,s))

Combining these equations again in two ways:
a^2 + b^2 + 2*a*b = (|L|^2 + |R|^2)/2 + (|L|^2 - |R|^2)/(2*dot(m,s))
a + b = sqrt((|L|^2 + |R|^2)/2 + (|L|^2 - |R|^2)/(2*dot(m,s)))

a^2 + b^2 - 2*a*b = (|L|^2 + |R|^2)/2 - (|L|^2 - |R|^2)/(2*dot(m,s))
a - b = sqrt((|L|^2 + |R|^2)/2 - (|L|^2 - |R|^2)/(2*dot(m,s)))

The remainder of the solution is left as an exercise.
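
One way to finish the exercise, as a rough Python sketch rather than anything taken from the CELT source: add and subtract the two square roots. The helper names (dot, normalize, solve_gains) and the mid_dominant flag are invented for illustration. The flag exists because both constraints are symmetric in a and b, so they only fix a - b up to sign; the sketch assumes a >= b unless told otherwise.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [x / n for x in u]

def solve_gains(m, s, norm_l, norm_r, mid_dominant=True):
    """Return (a, b) such that L = a*m + b*s and R = a*m - b*s.

    mid_dominant is a hypothetical flag: the constraints cannot tell
    (a, b) from (b, a), so the caller has to pick a >= b or a <= b.
    """
    d = dot(m, s)
    if d == 0.0:
        raise ValueError("dot(m, s) is zero, so a*b cannot be recovered")
    sum_sq = (norm_l ** 2 + norm_r ** 2) / 2.0         # a^2 + b^2
    two_ab = (norm_l ** 2 - norm_r ** 2) / (2.0 * d)   # 2*a*b
    if sum_sq - abs(two_ab) < 0.0:
        raise ValueError("inconsistent inputs: a square would be negative")
    a_plus_b = math.sqrt(sum_sq + two_ab)
    a_minus_b = math.sqrt(sum_sq - two_ab)
    if not mid_dominant:
        a_minus_b = -a_minus_b
    return (a_plus_b + a_minus_b) / 2.0, (a_plus_b - a_minus_b) / 2.0

# Example with unquantized m and s, so the reconstruction should be exact
# up to rounding.
L = [3.0, 1.0]
R = [1.0, 2.0]
m = normalize([x + y for x, y in zip(L, R)])
s = normalize([x - y for x, y in zip(L, R)])
a, b = solve_gains(m, s, math.sqrt(dot(L, L)), math.sqrt(dot(R, R)))
L_rec = [a * x + b * y for x, y in zip(m, s)]
R_rec = [a * x - b * y for x, y in zip(m, s)]
print(a, b, L_rec, R_rec)
```

Here a comes out as |M|/2 and b as |S|/2, which is what M = L+R and S = L-R imply.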

Anyway, the point is: in principle, _if_ M = L+R, then you don't need to
transmit theta.  The solution, ultimately, is equivalent to

theta = (1/2)arcsin(((|L|^2-|R|^2)/(|L|^2+|R|^2))*(1/dot(m,s)))
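
The message does not say how theta relates to a and b. The sketch below assumes theta = atan2(b, a), i.e. a = r*cos(theta) and b = r*sin(theta). Under that assumption the sum and difference equations give sin(2*theta) = 2*a*b/(a^2 + b^2) = (|L|^2 - |R|^2)/((|L|^2 + |R|^2)*dot(m,s)), with the norms entering squared. The function name guess_theta is made up; the code only checks this relation numerically on one example.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def guess_theta(norm_l, norm_r, d):
    """theta from |L|, |R| and d = dot(m, s), assuming theta = atan2(b, a)."""
    el, er = norm_l ** 2, norm_r ** 2
    return 0.5 * math.asin((el - er) / ((el + er) * d))

L = [3.0, 1.0]
R = [1.0, 2.0]
M = [x + y for x, y in zip(L, R)]
S = [x - y for x, y in zip(L, R)]
d = dot(M, S) / (norm(M) * norm(S))            # dot(m, s) without quantization

theta_decoder = guess_theta(norm(L), norm(R), d)
theta_encoder = math.atan2(norm(S), norm(M))   # a = |M|/2, b = |S|/2
print(theta_decoder, theta_encoder)
```

Because arcsin only returns values in [-pi/2, pi/2], this form can only produce |theta| <= pi/4, which corresponds to a >= |b|.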

Apart from the interesting debate over whether to use M = L+R or M =
normalize(L) + normalize(R), there's one other obvious issue.  This
calculation relies on computing dot(m,s).  Since m and s are coded with
error, the calculation of theta will also have error.

Some quantization error in theta doesn't seem intrinsically unreasonable,
but if m and s have enough error, the above procedure can derive a
contradiction.  For PVQ with a small number of pulses, it seems likely
that dot(m,s) could be zero, even though |L| != |R|.  The decoder then
finds itself in a very awkward situation.  I believe this can be remedied
with a bit of edge-case handling, though.
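
What such edge-case handling might look like is sketched below. This is only a guess at one possible policy, not something proposed in the thread: the function name, the eps threshold, and the choice to return None (meaning "theta cannot be inferred and would have to be sent or defaulted") are all assumptions.

```python
import math

def guess_theta_or_none(norm_l, norm_r, d, eps=1e-6):
    """Infer theta from |L|, |R| and d = dot(m, s), or return None.

    None covers the awkward cases: a silent band, dot(m, s) too close to
    zero, or quantization error that pushes the arcsin argument past 1.
    """
    el, er = norm_l ** 2, norm_r ** 2
    if el + er == 0.0:
        return None                      # silent band, nothing to infer
    if abs(d) < eps:
        return None                      # a*b is undetermined
    x = (el - er) / ((el + er) * d)
    if abs(x) > 1.0:
        return None                      # contradiction: no real theta fits
    return 0.5 * math.asin(x)

print(guess_theta_or_none(3.0, 1.0, 0.0))    # dot(m,s) = 0 but |L| != |R|
print(guess_theta_or_none(3.0, 1.0, 0.1))    # dot(m,s) too small to be consistent
print(guess_theta_or_none(math.sqrt(10.0), math.sqrt(5.0), 1.0 / math.sqrt(5.0)))
```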

- --Ben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)

iEYEARECAAYFAknDIyUACgkQUJT6e6HFtqRPowCghKBuJcPotFdd1lPKoK0ngfSw
rHUAn3NcTW/qP3ckJzwd5qoPG//gdBD6
=xyUt
-----END PGP SIGNATURE-----

Benjamin M. Schwartz

2009-Mar-20 05:59 UTC

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Currently, the encoder does l = normalize(L), r = normalize(R), M = l+r, m
= normalize(M), etc.  The trick will not work in this case.  Here's why:

dot(m,s) = c*dot(M,S) for some constant c.
dot(M,S) = dot(l+r,l-r) = dot(l,l) + dot(r,l) - dot(l,r) - dot(r,r) = 1 +
dot(r,l) - dot(r,l) - 1 = 0

so dot(m,s) is always zero.  The trick described previously does not work
when dot(m,s) = 0, so theta must be transferred explicitly.
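
A quick numerical check of this, as a small Python sketch with arbitrary example vectors and made-up helper names: with normalized inputs the mid and side vectors come out orthogonal, whereas with the raw inputs dot(M,S) equals |L|^2 - |R|^2.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [x / n for x in u]

def mid_side(x, y):
    """Return (x + y, x - y) element-wise."""
    return [p + q for p, q in zip(x, y)], [p - q for p, q in zip(x, y)]

L = [3.0, 1.0, -2.0]
R = [1.0, 2.0, 0.5]

# Current encoder behaviour as described above: normalize first.
M_n, S_n = mid_side(normalize(L), normalize(R))
print(dot(M_n, S_n))        # zero up to rounding error

# The M = L+R variant from the previous message.
M_raw, S_raw = mid_side(L, R)
print(dot(M_raw, S_raw), dot(L, L) - dot(R, R))
```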

However, this orthogonality is still useful.  If m is encoded in N
dimensions, then from this orthogonality s can be encoded in N-1
dimensions.  That means that transmitting theta no longer represents an
extra degree of freedom.

Maybe you're already doing this. I have no idea.

- --Ben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)

iEYEARECAAYFAknDMNEACgkQUJT6e6HFtqT24ACcDc7UwMfp0J3kaf0vdE0zpbjA
g0IAoJoYrGCT/IbOtGdn9JUQNj4/JN2+
=CDtl
-----END PGP SIGNATURE-----

Timothy B. Terriberry

2009-Mar-20 12:00 UTC

Benjamin M. Schwartz wrote:
> However, this orthogonality is still useful.  If m is encoded in N
> dimensions, then from this orthogonality s can be encoded in N-1
> dimensions.  That means that transmitting theta no longer represents an
> extra degree of freedom.
This was one of the first things I proposed to Jean-Marc. Well, more
accurately, I proposed to still encode s in N dimensions, but to use the
orthogonality constraint to take the place of the spectral folding used
to add a noise floor for mono. The reason is that it is very
computationally expensive to determine a basis for s that is orthogonal
to an arbitrary m (O(N^3)). We could have special cased N=3, but...

This approach turns out to harm quality. The quantization in m is too
large, especially for HF bands, for this constraint to actually be
useful, and it was doing more harm than good.

Gregory Maxwell

2009-Mar-20 12:02 UTC

On Fri, Mar 20, 2009 at 1:01 AM, Benjamin M. Schwartz
<bmschwar at fas.harvard.edu> wrote:
[snip]
> Apart from the interesting debate over whether to use M = L+R or M =
> normalize(L) + normalize(R), there's one other obvious issue.  This
One of the most important concepts in the CELT design is that the
correct energy in each band must be preserved for perceptual reasons.
It is fairly simple to demonstrate for yourself that the relevant hearing
machinery driving this operates independently in each ear: Generate
two test signals, either two tones or a noise and a tone, such that
one masks the other.  Play them together and you can't hear the
masked tone; send the two signals (via headphones) to separate ears
and you can hear the previously masked tone.

If the listener were using loudspeakers rather than headphones this
example wouldn't work due to cross-talk.

As such, those kinds of low level psychoacoustic effects must be
evaluated on an ear by ear basis since the listener may be using
headphones.

If you compute M = L+R; then signal energy(M); quantize(normalize(M));
S=...; the resulting L and R that the decoder recovers are unlikely to
have well-preserved energy.
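
A toy illustration of this concern, under loudly stated assumptions: the crude_quantize_direction function below is a made-up stand-in for a shape quantizer (it is not CELT's PVQ), and S is assumed to be handled the same way as M. The energies of M and S are passed through exactly, yet the recovered L and R end up with the wrong norms.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def crude_quantize_direction(u, k=2):
    """Hypothetical quantizer: snap the unit vector to a k-step grid and
    renormalize. It would fail if every component rounded to zero, which
    does not happen for the example vectors below."""
    n = norm(u)
    snapped = [round(k * x / n) for x in u]
    sn = norm(snapped)
    return [x / sn for x in snapped]

L = [3.0, 1.0]
R = [1.0, 2.0]
M = [x + y for x, y in zip(L, R)]
S = [x - y for x, y in zip(L, R)]

# Decoder side: exact energies of M and S, quantized directions.
M_hat = [norm(M) * x for x in crude_quantize_direction(M)]
S_hat = [norm(S) * x for x in crude_quantize_direction(S)]
L_hat = [(x + y) / 2.0 for x, y in zip(M_hat, S_hat)]
R_hat = [(x - y) / 2.0 for x, y in zip(M_hat, S_hat)]

print(norm(L), norm(L_hat))   # per-channel norms no longer match
print(norm(R), norm(R_hat))
```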

Of course, extra data could be sent to make sure that energy was
preserved in this case, but that has the problems of sending extra data.
(Nor would sending the L+R energy in addition to the M+S energy
give us a way to select the bitrate for M/S.)
