After playing with the vorbis code for a while and doing tons of hacks and
analysis on it, I've found it to perform very poorly with impulse signals.

The MDCT seems to cause lots of spreading, and it seems to result in much
worse impulse performance than mp3.

What is the current plan on handling this? Will a smart quantizer be able
to avoid it?

I've been looking at various ways of taking care of this, and before I
bother implementing something I'd like to make sure that no one has gone
down this path before.

Roughly, vorbis currently does:

input wave -> MDCT -> LPC -> LSP -> quant -> ------------------>output
                  \->delpc->error->quant -^

What do you think of this:

input wav -> DWT -> sum non-impulse factors -> iDWT -> MDCT ... (like above)
          \
           -> sum impulse factors -> iDWT -> LPC -> LSP -> quant

i.e. use a wavelet transform to separate out impulse-like signals and
compress them in the time domain.

The decoder complexity really isn't increased much (just one more
dequant/LPC and a sum). I think there are optimized versions of the Haar
DWT that go really fast too.
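For concreteness, here is a minimal sketch in C of the kind of split stage
being proposed. This is not Vorbis code: the one-level Haar transform, the
fixed buffer sizes, and the simple magnitude threshold used to classify
detail coefficients as impulse energy are all illustrative assumptions.

/* Illustrative sketch only, not Vorbis code: one-level Haar DWT, split the
 * detail coefficients into "impulse" (large magnitude) and "smooth" sets,
 * then invert each set separately.  The threshold and the fixed buffers
 * (frame size n <= 2048, n even) are assumptions for the example. */
#include <math.h>
#include <string.h>

/* forward one-level Haar: x[0..n-1] -> approx a[0..n/2-1], detail d[0..n/2-1] */
static void haar_forward(const float *x, float *a, float *d, int n)
{
    int i;
    for (i = 0; i < n / 2; i++) {
        a[i] = (x[2*i] + x[2*i+1]) * 0.5f;
        d[i] = (x[2*i] - x[2*i+1]) * 0.5f;
    }
}

/* inverse one-level Haar */
static void haar_inverse(const float *a, const float *d, float *x, int n)
{
    int i;
    for (i = 0; i < n / 2; i++) {
        x[2*i]     = a[i] + d[i];
        x[2*i + 1] = a[i] - d[i];
    }
}

/* Split a frame into a smooth branch (fed to the MDCT path) and an impulse
 * branch (fed to the time-domain path); the two reconstructions sum back
 * to the original frame. */
static void impulse_split(const float *x, float *smooth, float *impulse,
                          int n, float thresh)
{
    float a[1024], d[1024], az[1024], dz[1024];
    int i;

    haar_forward(x, a, d, n);
    memset(az, 0, sizeof(float) * (n / 2));
    for (i = 0; i < n / 2; i++) {
        if (fabsf(d[i]) > thresh) { dz[i] = d[i]; d[i] = 0.f; }
        else                        dz[i] = 0.f;
    }
    haar_inverse(a,  d,  smooth,  n);   /* smooth branch  -> MDCT ...        */
    haar_inverse(az, dz, impulse, n);   /* impulse branch -> time-domain LPC */
}

Because the inverse transform is linear, the smooth and impulse
reconstructions sum back to the original frame exactly, so the split itself
costs nothing in fidelity before quantization.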
> After playing with the vorbis code for a while and doing tons of hacks and
> analysis on it, I've found it to perform very poorly with impulse signals.

At the immediate moment, a commit error on my part is making it worse :-)
I currently have the psychoacoustics set to *only* use a 2048 sample
window... I was doing distortion testing on that specific window size and
forgot to put it back to normal (where it will use a 256 or 512 sample
window for impulses). (I just turned it back on in CVS.)

*HOWEVER*, that just puts us back in the mp3 range of dealing with the
problem. I've been interested for a long time in doing something similar
to what you propose, but I'd backed out what I was doing in order to push
ahead with more fundamental parts of the first cut release (too many
details for just me to handle at the moment :-) There are actually
bitflags in the stream right now waiting for exactly this sort of thing to
drop in. (This also means that 'research' on the subject can proceed at a
prudent pace without holding anything up.)

> I've been looking at various ways of taking care of this, and before I
> bother implementing something I'd like to make sure that no one has gone
> down this path before.

I've gone part way down the path, so I have some additional clues to
offer. This basic tack has my approval.

> Roughly, vorbis currently does:
>
> input wave -> MDCT -> LPC -> LSP -> quant -> ------------------>output
>                   \->delpc->error->quant -^
>
> What do you think of this:
>
> input wav -> DWT -> sum non-impulse factors -> iDWT -> MDCT ... (like above)
>           \
>            -> sum impulse factors -> iDWT -> LPC -> LSP -> quant
>
> i.e. use a wavelet transform to separate out impulse-like signals and
> compress them in the time domain.

Yes, this is exactly the way I wanted to proceed (only I wasn't using
wavelets; wavelets are indeed worth pursuing). The encoder/decoder were
structured exactly for the above flow (the convenience of the layout isn't
accidental). However, we need to find a better way to encode the impulses.
(More on this later; I wanted to respond quickly to say that you're on the
path that I started, but hadn't continued.)

> The decoder complexity really isn't increased much (just one more
> dequant/LPC and a sum). I think there are optimized versions of the Haar
> DWT that go really fast too.

Yes. In addition, wavelets are (IIRC) linear-time (O(n)) transforms, so
the time taken by the Haar transform itself would practically be lost in
the noise :-)

I'll have much more to talk about after some sleep.

Monty
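To make the linear-time point concrete: a full Haar pyramid touches
n + n/2 + n/4 + ... < 2n samples in total. A sketch follows; it is
illustrative only (not anything in CVS) and assumes n is a power of two
and a caller-supplied scratch buffer of at least n floats.

/* Illustrative only (not anything in CVS): in-place multi-level Haar
 * decomposition.  Each pass works on half the span of the previous one,
 * so the total work is n + n/2 + n/4 + ... < 2n operations, i.e. O(n). */
#include <string.h>

static void haar_full(float *x, float *scratch, int n)
{
    int span;
    for (span = n; span > 1; span >>= 1) {
        int i, half = span >> 1;
        for (i = 0; i < half; i++) {
            scratch[i]        = (x[2*i] + x[2*i+1]) * 0.5f; /* approximation */
            scratch[half + i] = (x[2*i] - x[2*i+1]) * 0.5f; /* detail        */
        }
        memcpy(x, scratch, sizeof(float) * span);
    }
}

Next to an O(n log n) MDCT, that transform cost really would be lost in
the noise.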
> Date: Fri, 19 Nov 1999 08:50:53 -0500 (EST)
> From: Gregory Maxwell <greg@linuxpower.cx>
>
> After playing with the vorbis code for a while and doing tons of hacks and
> analysis on it, I've found it to perform very poorly with impulse signals.
>
> The MDCT seems to cause lots of spreading, and it seems to result in much
> worse impulse performance than mp3.
>
> What do you think of this:
>
> input wav -> DWT -> sum non-impulse factors -> iDWT -> MDCT ... (like above)
>           \
>            -> sum impulse factors -> iDWT -> LPC -> LSP -> quant

Do you guys really think window switching is so bad? It clearly works very
well and is not just 'mp3' quality, since it is used in AAC, which is
pretty much the best encoder out there.

The only problem I can see is that the encoding is not as efficient: you
always need to allocate extra bits for short MDCT windows. But except for
extreme cases like castanets.wav, the proportion of attacks/pulses is
usually less than 5%. Assuming 50% more bits for the lossless encoding, a
more sophisticated technique would save at most 2.5% (5% of frames times
50% extra bits per frame).

Also, I believe Vorbis is using a 2048 sample MDCT window? (like AAC, but
almost twice that of mp3). Such a large window results in more spreading,
making short windows even more important?

Monty: you've mentioned comparisons between Vorbis and AAC in the past.
Which AAC encoder/decoder were you using? If you get a chance, could you
post the output from a decoded AAC encoding of castanets.wav?

Mark
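For reference, here is a toy sketch of the kind of attack detection that
drives window switching. Nothing in it comes from Vorbis or any AAC
encoder; the sub-block energy comparison and the factor-of-8 threshold are
purely illustrative assumptions.

/* Toy transient detector of the sort that drives window switching.
 * The sub-block energy comparison and the factor-of-8 threshold are
 * illustrative assumptions, not taken from any real encoder. */

/* Return nonzero if the frame looks like an attack: some short sub-block
 * carries much more energy than the average of the sub-blocks before it. */
static int looks_like_attack(const float *frame, int n, int subblocks)
{
    int   sb, i, len = n / subblocks;
    float avg = 0.f;

    for (sb = 0; sb < subblocks; sb++) {
        float e = 0.f;
        for (i = 0; i < len; i++) {
            float v = frame[sb * len + i];
            e += v * v;
        }
        if (sb > 0 && e > 8.f * avg)
            return 1;                  /* sudden energy jump: use short windows */
        avg += (e - avg) / (sb + 1);   /* running mean of sub-block energies */
    }
    return 0;
}

A real encoder would of course combine something like this with its
psychoacoustic criteria rather than rely on a bare energy ratio.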