thr3ads.net - Vorbis dev - [vorbis-dev] Low bitrate high-band coding... [Dec 2000]

Jean-Marc Valin

2000-Dec-03 19:28 UTC

[vorbis-dev] Low bitrate high-band coding...

Hi,

I'd like to contribute to Vorbis and I think this may be of some interest for
low bitrate coding. I have been experimenting with low bit-rate coding for the
high band (11 kHz to 22 kHz) and, though I haven't yet started quantizing my
coefficients (a gain and an LPC filter), I expect to be able to approximate the
whole 11-22 kHz band with around 1000 bits/s per channel (maybe even 500 bps).
Now, I don't know what bit rate is normally allocated to this band, but I
expect it is greater than that. Am I right? (Can anyone give me numbers for
this?)

The technique I use to do this is inspired by an article I published recently
(http://panoramix.dyndns.org/jm/scw2000.pdf) and is based on the fact that at
these frequencies, the ear is totally insensitive to the spectral fine
structure. The processing also has relatively low complexity (most of it is two
LPC analyses in the encoder and one in the decoder).

I have tested it with some files (including harpsichord, which is supposed to be
hard to code) and the difference with the original (CD rip) is hard to hear. You
can find demo files of this at:
ftp://freespeech.sourceforge.net/pub/freespeech/

There are 6 files:
bach10-ref.sw        : Original file (right channel from Bach's Chromatic
                       Fantasia)
bach10-ext.sw        : Resulting file from my experiment.
bach10-lp.sw         : Low-passed at 11 kHz
bach10-lame-ref.sw   : Encoded with lame (128 kbps), but original in the low
                       band
bach10-ogg-ref.sw    : Encoded with vorbis (160 kbps), but original in the low
                       band
bach10-ogg128-ref.sw : Encoded with vorbis (128 kbps), but original in the low
                       band

For the last 3 files, I put back the original low band (0-11 kHz) so that only
the high band differences are present. The files are PCM 16 bits/sample, little
endian, 44.1 kHz.

Does anyone think this could be useful? Is there any interesting audio file
you'd like me to process?

        Jean-Marc


-- 
Jean-Marc Valin
Universite de Sherbrooke - Genie Electrique
valj01@gel.usherb.ca

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'vorbis-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Monty

2000-Dec-04 11:10 UTC

[vorbis-dev] Low bitrate high-band coding...

> I'd like to contribute to Vorbis and I think this may be of some interest
> for low bitrate coding. I have been experimenting with low bit-rate coding
> for the high-band (11 kHz to 22 kHz) and, though I haven't yet started
> quantizing my coefficients (a gain and an LPC filter), I expect to be able
> to approximate the whole 11-22 kHz band with around 1000 bits/s per channel
> (maybe even 500 bps). Now, I don't know what is the normal bit-rate
> allocated for this band, but I expect it is greater than that. Am I right?
> (can anyone give me numbers for this?)
Depends.  It varies from zero to a few kilobits depending on what the
psychoacoustics model says.
> The technique I use to do this is inspired from an article I published
> recently (http://panoramix.dyndns.org/jm/scw2000.pdf) and is based on the
> fact that at these frequencies, the ear is totally insensitive to the
> spectral fine structure.
Correct, however, the ear is extremely sensitive to preecho and
time-localization of high frequency energy.  You don't hear the pitch
in the high frequencies, you hear the fact that a sharp edge was
smeared (what aggressive quantization in the high end will cause).
> I have tested it with some files (including harpsichord, which is supposed
> to be hard to code) and the difference with the original (CD rip) is hard
> to hear. You can find demo files of this at:
> ftp://freespeech.sourceforge.net/pub/freespeech/
Harpsichord (like voice) is well suited to this technique because of
regular harmonics.  Try it on violin, cymbals, and nonmusical sources.

I hear a brief, glassy preecho ... what block size were you using for
your experiment?  I'm guessing very short.... The results might be
more if not used in situations where ogg/lame would be using short
blocks and used overlapped 2048-sample blocks like ogg.

Monty


Jean-Marc Valin

2000-Dec-04 14:43 UTC

[vorbis-dev] Low bitrate high-band coding...

>> Now, I don't know what is the normal bit-rate allocated for this band, but
>> I expect it is greater than that. Am I right? (can anyone give me numbers
>> for this?)
>
> Depends. It varies from zero to a few kilobits depending on what the
> psychoacoustics model says.
A few kilobits, meaning what? In my example, can you say how many bits Vorbis
puts in the 11-22 kHz band?
>> The technique I use to do this is inspired from an article I published
>> recently (http://panoramix.dyndns.org/jm/scw2000.pdf) and is based on the
>> fact that at these frequencies, the ear is totally insensitive to the
>> spectral fine structure.
> 
> Correct, however, the ear is extremely sensitive to preecho and
> time-localization of high frequency energy. You don't hear the pitch
> in the high frequencies, you hear the fact that a sharp edge was
> smeared (what aggressive quantization in the high end will cause).
The process I used is not subject to pre-echo. I extend the residue by simply
upsampling the LP residue, which causes spectral folding (unlike in my article,
where I use a non-linear function). The time localization is thus preserved.
For voice, I have even obtained very good results when starting the extension
at 3.5 kHz.
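
As a rough illustration of the folding described above (a sketch, not the code used in the experiment): inserting a zero after every sample doubles the sampling rate and leaves a mirror image of each spectral component on the other side of the old Nyquist frequency, while an event keeps its position in time. The function names, the 32-sample frame and the naive DFT are assumptions made for this example only.

```python
import cmath
import math


def upsample_by_two(x):
    """Insert a zero after every sample (no interpolation filter)."""
    y = []
    for s in x:
        y.extend((s, 0.0))
    return y


def dft_magnitude(x):
    """Naive O(N^2) DFT magnitude, adequate for a short illustrative frame."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


# A tone sitting exactly on bin 3 of a 32-sample frame (hypothetical values).
N, k0 = 32, 3
tone = [math.cos(2 * math.pi * k0 * t / N) for t in range(N)]

# After zero insertion the frame has 64 samples; bin N/2 = 16 is the old Nyquist.
mags = dft_magnitude(upsample_by_two(tone))
peaks = [k for k in range(N + 1) if mags[k] > 1.0]  # bins 0..32 only

# A click keeps its place in time: sample 5 at the old rate is sample 10 at the new one.
click = [0.0] * N
click[5] = 1.0
click_pos = upsample_by_two(click).index(1.0)

print(peaks, click_pos)
```

The tone shows up at bin `k0` and again at its mirror image `N - k0`, which is the folded copy that fills the upper band; in a full system that copy would still have to be shaped by the transmitted gain and LPC envelope.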
>> I have tested it with some files (including harpsichord, which is supposed
>> to be hard to code) and the difference with the original (CD rip) is hard
>> to hear. You can find demo files of this at:
>> ftp://freespeech.sourceforge.net/pub/freespeech/
>
> Harpsichord (like voice) is well suited to this technique because of
> regular harmonics. Try it on violin, cymbals, and nonmusical sources.
I have added a violin file in the same directory
(ftp://freespeech.sourceforge.net/pub/freespeech/) with the "vi4-" prefix. I
think it works a bit better than the harpsichord. I don't have files with
cymbals, but if you have some, please send them to me. As I said earlier, the
ear is totally insensitive to the spectral fine structure at these frequencies.
It cannot even tell noise from harmonics. The only reason I didn't just put in
noise is that upsampling preserves the time localization within a frame.
> I hear a brief, glassy preecho ... what block size were you using for
> your experiment? I'm guessing very short.... The results might be
> more if not used in situations where ogg/lame would be using short
> blocks and used overlapped 2048-sample blocks like ogg.
I'm using 1024-sample frames and my LPC filter is calculated on a 2048-sample
window. Anyway, the whole point of this was for very low bitrate modes where
you cannot afford many bits for the high band, in which case you could still
afford 500 bps. I think I could go as low as that using vector quantization and
prediction.

Right now the system is not optimal; I still need to tune the window size and
the LPC regularization parameters (noise floor, pre-emphasis, bandwidth
expansion/lag windowing). What I'd like to know is whether you think this
could potentially be interesting.
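
For readers unfamiliar with the regularization steps named above, here is a minimal sketch of an LPC analysis with a noise floor (white-noise correction), a Gaussian lag window and bandwidth expansion. It is not the code from the experiment: the function names, the default parameter values and the decaying test signal are all assumptions chosen for illustration, and pre-emphasis is left out.

```python
import math


def autocorrelation(x, order):
    """Autocorrelation r[0..order] of a frame (no analysis window applied here)."""
    return [sum(x[t] * x[t - lag] for t in range(lag, len(x)))
            for lag in range(order + 1)]


def regularize(r, noise_floor=1e-4, lag_window_width=0.01):
    """Gaussian lag window on every lag, then a small boost of r[0] (noise floor)."""
    out = [r[k] * math.exp(-0.5 * (2 * math.pi * lag_window_width * k) ** 2)
           for k in range(len(r))]
    out[0] *= 1.0 + noise_floor
    return out


def levinson_durbin(r):
    """Return (a, err) with a[0] = 1, so that x[n] is predicted by -sum(a[k] * x[n-k])."""
    order = len(r) - 1
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining prediction error energy
    return a, err


def bandwidth_expand(a, gamma=0.99):
    """Scale a[k] by gamma**k, which pulls the filter poles toward the origin."""
    return [c * gamma ** k for k, c in enumerate(a)]


# Hypothetical test frame: the impulse response of 1 / (1 - 0.9 z^-1).
decay = [0.9 ** t for t in range(200)]
a_plain, err_plain = levinson_durbin(autocorrelation(decay, 2))
a_reg, err_reg = levinson_durbin(regularize(autocorrelation(decay, 2)))
a_bwe = bandwidth_expand(a_reg)
```

On this frame the unregularized analysis recovers a first coefficient close to -0.9, while the regularized one leaves a slightly larger residual energy: a small loss of prediction gain in exchange for a smoother, better-conditioned filter.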

        Jean-Marc

P.S. Please also reply directly to me, as my subscription to vorbis-dev doesn't
seem to work.


-- 
Jean-Marc Valin
Universite de Sherbrooke - Genie Electrique
valj01@gel.usherb.ca


Jean-Marc Valin

2000-Dec-04 21:26 UTC

[vorbis-dev] Low bitrate high-band coding...

OK, I have just finished quantizing my coefficients and the result is better
than I had expected... 345 bps for the whole 11-22 kHz band. The audio files are
still at ftp://freespeech.sourceforge.net/pub/freespeech/

bach10-diffquant.sw  : The high band is quantized with 8 bits/frame, thus
                       345 bps
bach10-diffquant2.sw : I used shorter frames to get better results, 690 bps
In order to get these bitrates, I used differential vector quantization in the
cepstral domain. There are two things left to try:

1) Use intra-frame LPC interpolation (in the LSF domain)
2) Predict the high-band envelope from the LSP masking curve, and further reduce
the bit-rate
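
To make "differential vector quantization" concrete, here is a toy sketch. The message only states that differential VQ in the cepstral domain was used, so everything else is an assumption: the predictor is simply the previously reconstructed frame, and the 4-entry, 2-dimensional codebook is made up for the example (an 8 bits/frame scheme would use a 256-entry codebook).

```python
def nearest(codebook, v):
    """Index of the codebook entry closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], v)))


def encode(frames, codebook):
    """Quantize the difference between each frame and the previous reconstruction."""
    prev = [0.0] * len(codebook[0])
    indices = []
    for frame in frames:
        diff = [x - p for x, p in zip(frame, prev)]
        idx = nearest(codebook, diff)
        indices.append(idx)
        # Track what the decoder will reconstruct, so errors do not accumulate.
        prev = [p + c for p, c in zip(prev, codebook[idx])]
    return indices


def decode(indices, codebook):
    """Rebuild the frames by accumulating the decoded differences."""
    prev = [0.0] * len(codebook[0])
    out = []
    for idx in indices:
        prev = [p + c for p, c in zip(prev, codebook[idx])]
        out.append(prev)
    return out


# Hypothetical codebook and "cepstral" frames, chosen only for illustration.
CODEBOOK = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5)]
frames = [(0.5, 0.0), (0.5, 0.5), (0.1, 0.5)]
indices = encode(frames, CODEBOOK)
decoded = decode(indices, CODEBOOK)
```

Because the encoder predicts from its own reconstruction rather than from the original frame, encoder and decoder stay in step; only one codebook index per frame has to be transmitted.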

By the way, what I'm proposing here is probably not something that would go in
the 64 kbits/channel modes, but in very low bitrate modes, where there are
(almost) no bits left for the high band.

        Jean-Marc


-- 
Jean-Marc Valin
Universite de Sherbrooke - Genie Electrique
valj01@gel.usherb.ca


MC Spanky

2000-Dec-08 07:28 UTC

[vorbis-dev] When can we expect low bitrate encoding?

Just curious, our project was hoping to use 56k coding for a mono sound,
and maybe even 36 or 28k if they sound good enough.  Right now, the
smallest rate for a mono source is 64k.

Thanks,
Spanky

