thr3ads.net - Vorbis - [vorbis] Ogg Voxpop [Feb 2001]

Keith Wright

2001-Feb-12 16:50 UTC

[vorbis] Ogg Voxpop

I have been thinking about what is needed to make language
teaching/learning tools.  (Like talking flash cards.)
The main thing needed is a low bit-rate encoding of
human voice.  At first I thought I could take one of
the government standard vocoders and embed it as an
Ogg stream.

But:
 (1) there is not a standard vocoder, there are half a dozen, at least.
 (2) they are fixed bit rate, we really don't want to waste bits
     while the teacher waits for the student to respond
 (3) they include error correction (not needed here because
     we assume the underlying storage and transport mechanism
     takes care of that)
 (4) one might need several different quality/bit rate options
     For example, both hours of Spanish radio broadcast packed
     into as small a file as possible, and a high audio quality
     demonstration of the difference in pronunciation of a "D"
     in English, Spanish, Chinese, and German, using as much space
     as needed to make the difference sound clear.

My current thought is to filter the input down to a bandwidth
of 7KHz or 4KHz (traditional values for high and low quality
speech), decimate the samples so that the sound is sampled
at, say, 44/3=14.6KHz or 44/5=8.8KHz, then run it through
the standard Vorbis encoder.  Vorbis then sees an ordinary
20KHz bandwidth stream that sounds like a tape recording
running at 3 to 5 times normal speed and encodes it as
usual.
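
The filter-then-decimate step described above can be sketched in a few lines. This is only a minimal illustration, assuming a Hamming-windowed sinc low-pass filter and an integer decimation factor; the function names, the tap count, and the cutoff are assumptions made for the example, not anything taken from Vorbis or its tools. With a 44100 Hz source, a factor of 5 gives 8820 Hz and a factor of 3 gives 14700 Hz.

```python
import math

def lowpass_taps(cutoff_hz, rate_hz, num_taps=101):
    """Hamming-windowed sinc FIR low-pass, normalised to unity gain at DC."""
    fc = cutoff_hz / rate_hz              # cutoff as a fraction of the sample rate
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]

def decimate(samples, factor, rate_hz, cutoff_hz):
    """Low-pass filter the input, keeping only every `factor`-th output sample."""
    taps = lowpass_taps(cutoff_hz, rate_hz)
    half = len(taps) // 2
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half              # filter is centred, so no net delay
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out, rate_hz / factor

def tone(freq_hz, rate_hz, seconds):
    """Unit-amplitude sine wave, used here only as test input."""
    n = int(rate_hz * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

if __name__ == "__main__":
    low, new_rate = decimate(tone(1000, 44100, 0.1), 5, 44100, 4000)
    high, _ = decimate(tone(10000, 44100, 0.1), 5, 44100, 4000)
    print(new_rate, rms(low[100:-100]), rms(high[100:-100]))
```

A 1 kHz tone should pass nearly unchanged while a 10 kHz tone is strongly attenuated before the samples are dropped. The result is then handed to the encoder together with its real sampling rate.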

I checked the mailing list archives, and found an old thread
about low bit-rate encoding that quickly degenerated into
a highly bogus discussion of the proper way to decimate
the sample sequence.  We don't need to do that again, so
assume the filtering and decimation is done properly,
is there any reason this scheme could not work?  Are
there hooks in the Vorbis stream format to tell the decoder
that this has been done so that it will know to play back
slower than normal?  Do I have to write all this myself,
or is it already in there if I just know the parameter to set?


-- 
     -- Keith Wright  <kwright@free-comp-shop.com>

Programmer in Chief, Free Computer Shop <http://www.free-comp-shop.com>
         ---  Food, Shelter, Source code.  ---

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'vorbis-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Dan Hollis

2001-Feb-12 17:10 UTC

[vorbis] Ogg Voxpop

On Mon, 12 Feb 2001, Keith Wright wrote:
>  (1) there is not a standard vocoder, there are half a dozen, at least.
GSM is pretty standard for government use. I think LPC gets some limited
use too.

http://kbs.cs.tu-berlin.de/~jutta/toast.html
>  (2) they are fixed bit rate, we really don't want to waste bits
>      while the teacher waits for the student to respond
So do silence detection.
>  (3) they include error correction (not needed here because
>      we assume the underlying storage and transport mechanism
>      takes care of that)
GSM does not.
>  (4) one might need several different quality/bit rate options
>      For example, both hours of Spanish radio broadcast packed
>      into as small a file as possible, and a high audio quality
>      demonstration of the difference in pronunciation of a "D"
>      in English, Spanish, Chinese, and German, using as much space
>      as needed to make the difference sound clear.
You can overclock GSM if you want.
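
One simple form of the silence detection mentioned above is an energy threshold per frame. The sketch below is only an illustration under stated assumptions: the 160-sample frame (20 ms at 8 kHz), the threshold value, and the function names are all chosen for the example and do not come from GSM or Vorbis code.

```python
import math

def frame_rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def voiced_frames(samples, frame_len=160, threshold=0.01):
    """Return one flag per full frame: True if the frame is loud enough to keep.

    frame_len and threshold are illustrative assumptions; a real detector
    would tune them and usually add some hangover to avoid clipping words.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_rms(frame) >= threshold)
    return flags

if __name__ == "__main__":
    silence = [0.0] * 320
    speechlike = [0.5 * math.sin(2 * math.pi * 440 * i / 8000) for i in range(320)]
    print(voiced_frames(silence + speechlike))
```

Frames flagged False could be skipped or replaced by a short marker, so no bits are spent while the teacher waits.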

-Dan

Gregory Maxwell

2001-Feb-12 19:13 UTC

[vorbis] Ogg Voxpop

On Mon, Feb 12, 2001 at 07:50:25PM -0500, Keith Wright wrote:
[snip]
> My current thought is to filter the input down to a bandwidth
> of 7KHz or 4KHz (traditional values for high and low quality
> speech), decimate the samples so that the sound is sampled
> at, say, 44/3=14.6KHz or 44/5=8.8KHz, then run it through
> the standard Vorbis encoder.  Vorbis then sees an ordinary
> 20KHz bandwidth stream that sounds like a tape recording
> running at 3 to 5 times normal speed and encodes it as
> usual.
[snip]
> the sample sequence.  We don't need to do that again, so
> assume the filtering and decimation is done properly,
> is there any reason this scheme could not work?  Are
> there hooks in the Vorbis stream format to tell the decoder
> that this has been done so that it will know to play back
> slower than normal?  Do I have to write all this myself,
> or is it already in there if I just know the parameter to set?
Vorbis can happily take in an 8.8KHz (or just about any other) sampling rate
file and act accordingly. If you lie to vorbis (make it think it's
chipmunks) you will get HORRIBLE results because the psychoacoustic masking is
highly frequency dependent and vorbis will get the masking all wrong.
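
To put numbers on the chipmunk effect: if samples taken at one rate are declared to be at another, every frequency is scaled by the ratio of the two rates, so the encoder would apply masking for the wrong part of the spectrum. The helper below is a hypothetical name used only for this arithmetic; it does not model the masking itself.

```python
def apparent_frequency(true_hz, true_rate_hz, declared_rate_hz):
    """Frequency a tone appears to have when its sampling rate is mislabeled."""
    return true_hz * declared_rate_hz / true_rate_hz

if __name__ == "__main__":
    # Speech decimated to 8820 Hz but presented as 44100 Hz audio.
    for hz in (300, 1000, 3000):
        print(hz, "Hz would be treated as", apparent_frequency(hz, 8820, 44100), "Hz")
```

A 1 kHz speech component would be treated as if it sat at 5 kHz, and a 3 kHz component as if it sat at 15 kHz.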

Michael Smith

2001-Feb-12 19:16 UTC

[vorbis] Ogg Voxpop

> My current thought is to filter the input down to a bandwidth
> of 7KHz or 4KHz (traditional values for high and low quality
> speech), decimate the samples so that the sound is sampled
> at, say, 44/3=14.6KHz or 44/5=8.8KHz, then run it through
> the standard Vorbis encoder.  Vorbis then sees an ordinary
> 20KHz bandwidth stream that sounds like a tape recording
> running at 3 to 5 times normal speed and encodes it as
> usual.
No, vorbis would see a normal stream of data, at a sampling rate you
specify. Vorbis does NOT require that input be 44.1kHz, though that is
what it has been tuned for. Notably, pre-echo will be particularly bad
with lower sampling rates.
> 
> I checked the mailing list archives, and found an old thread
> about low bit-rate encoding that quickly degenerated into
> a highly bogus discussion of the proper way to decimate
> the sample sequence.  We don't need to do that again, so
> assume the filtering and decimation is done properly,
> is there any reason this scheme could not work?  Are
> there hooks in the Vorbis stream format to tell the decoder
> that this has been done so that it will know to play back
> slower than normal?  Do I have to write all this myself,
> or is it already in there if I just know the parameter to set?
There is no 'slower than normal'. The stream format specifies the
sampling rate, vorbis will encode and play back at that rate (well,
playback is really for the frontend to care about - on the decode side I
don't think vorbis itself needs to worry about the sampling rate at
all).
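
As an illustration of where the rate lives in the stream, here is a small sketch that packs and parses an identification header using the field layout given in the Vorbis I specification (packet type 1, the string "vorbis", version, channel count, sample rate, three bitrate hints, two blocksize exponents, framing bit). The function names and the blocksize values are assumptions made for the example; this is not libvorbis code, and it skips the Ogg page framing entirely.

```python
import struct

# Fields after the 7-byte "\x01vorbis" prefix, all little-endian:
# version (u32), channels (u8), sample rate (u32),
# bitrate max / nominal / min (i32 each), blocksizes (u8), framing (u8).
_IDENT_FORMAT = "<IBIiiiBB"

def build_ident_header(channels, sample_rate, nominal_bitrate=0):
    """Pack a 30-byte identification header with placeholder blocksizes."""
    blocksizes = (11 << 4) | 8            # blocksize_1 = 2**11, blocksize_0 = 2**8
    return b"\x01vorbis" + struct.pack(
        _IDENT_FORMAT, 0, channels, sample_rate, 0, nominal_bitrate, 0, blocksizes, 1)

def parse_ident_header(packet):
    """Return the main fields of an identification header as a dict."""
    if len(packet) != 30 or packet[:7] != b"\x01vorbis":
        raise ValueError("not a Vorbis identification header")
    version, channels, rate, _max, nominal, _min, sizes, framing = struct.unpack(
        _IDENT_FORMAT, packet[7:])
    if version != 0 or channels == 0 or rate == 0 or not framing & 1:
        raise ValueError("invalid identification header fields")
    return {
        "channels": channels,
        "sample_rate": rate,
        "nominal_bitrate": nominal,
        "blocksize_0": 1 << (sizes & 0x0F),
        "blocksize_1": 1 << (sizes >> 4),
    }

if __name__ == "__main__":
    header = build_ident_header(channels=1, sample_rate=8820)
    print(parse_ident_header(header))
```

The sample rate is an ordinary 32-bit field, so 8820 Hz is declared the same way as 44100 Hz and the player simply opens the audio device at that rate.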

Michael
