thr3ads.net - Vorbis dev - [Vorbis-dev] Getting masked FFT data out of libvorbisenc [Apr 2007]

Steinar H. Gunderson

2007-Apr-23 03:41 UTC

[Vorbis-dev] Getting masked FFT data out of libvorbisenc

[Apologies if this gets through twice. I sent it first without subscribing,
 but it seems like it got stuck in the moderation queue, so I subscribed and
 re-sent it.]

I'm doing some work on audio fingerprinting for a school project (more
precisely, my master's thesis). I got a hint on #vorbis that I might want to
look into the internal floor representations in libvorbisenc to get out audio
data after the psychoacoustic masking, but I'm having problems actually
getting out the right data.

Basically, I'm looking in mapping0.c, dumping out the debugging information
that's already there. One of the most promising places seemed to be just
before floor1_fit, but it seems a bit odd:

  http://home.samfundet.no/~sesse/vorbis_floor3.png

In particular, there's a _lot_ of energy in the treble, where I'd expect
there to be almost none. I don't know very much about the internals of
Vorbis (nor psychoacoustics in general, I'm afraid), but it seems as if the
floor is a rough copy of the FFT _plus_ the tone masking stuff, whereas I'd
probably want it to be a rough copy of the FFT _minus_ the tone masking
stuff.
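
To make the "FFT minus masking" idea concrete, here is a minimal sketch of one possible reading: compare each bin's level against a per-bin threshold and keep only the part that rises above it. The function name `audible_excess_db` and all the numbers are hypothetical, chosen only for illustration; nothing here is libvorbis code or real encoder output.

```python
def audible_excess_db(spectrum_db, threshold_db):
    """Return, per bin, how far the spectrum rises above the threshold in dB.

    Bins at or below the threshold are treated as masked and yield 0.0.
    Both inputs are assumed to be per-bin levels in dB on the same scale.
    """
    if len(spectrum_db) != len(threshold_db):
        raise ValueError("spectrum and threshold must have the same length")
    return [max(s - t, 0.0) for s, t in zip(spectrum_db, threshold_db)]


# Hypothetical values for four bins, not real libvorbis output.
spectrum_db = [-20.0, -35.0, -60.0, -80.0]
threshold_db = [-40.0, -30.0, -70.0, -50.0]

excess = audible_excess_db(spectrum_db, threshold_db)
print(excess)
```

Whether a plain per-bin difference like this is a good fingerprinting feature is an open question; it only illustrates the distinction between the threshold curve itself and the signal relative to that curve.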

Is there any way I can actually get out this kind of information, short of
encoding the entire signal and decoding it again (which will obviously also
leave me with all the quantization noise and other artifacts that I don't
want)?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

xiphmont@xiph.org

2007-Apr-23 14:38 UTC

[Vorbis-dev] Getting masked FFT data out of libvorbisenc

On 4/23/07, Steinar H. Gunderson <sgunderson@bigfoot.com> wrote:
>
> Basically, I'm looking in mapping0.c, dumping out the debugging information
> that's already there. One of the most promising places seemed to be just
> before floor1_fit, but it seems a bit odd:
>
>   http://home.samfundet.no/~sesse/vorbis_floor3.png
>
> In particular, there's a _lot_ of energy in the treble, where I'd expect
> there to be almost none.
That's not energy, that's the approximate discrimination threshold.  You
end up with so much treble 'masking' because the ear's tonal HF
discrimination hardware is not very sensitive; look at a normal
Absolute Threshold of Hearing curve and you'll see where the sharp
upward slope is coming from.  The only reason it moves around in
Vorbis, rather than being a fixed ATH curve, is that 0dB is not a fixed
point in absolute terms, and the Vorbis code is calculating the most
pessimistic possible masking curve such that, regardless of the actual
final playback level of the audio, the masking curve is always either
correct or too low (but never too high).
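
As a rough illustration of that upward slope, the sketch below evaluates a widely used closed-form approximation of the Absolute Threshold of Hearing (usually attributed to Terhardt). This is an assumption made for illustration only: it is not the curve libvorbis uses internally, and the function name `ath_db_spl` is made up here.

```python
import math


def ath_db_spl(freq_hz):
    """Approximate threshold of hearing in quiet, in dB SPL.

    Closed-form approximation commonly attributed to Terhardt; it is an
    illustrative stand-in and NOT the threshold data used by libvorbis.
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


# Print the threshold at a few frequencies to show the steep rise in the treble.
for f in (100, 1000, 4000, 10000, 16000, 20000):
    print(f"{f:>6} Hz  {ath_db_spl(f):7.1f} dB SPL")
```

The threshold is lowest around 3 to 4 kHz and climbs by tens of dB toward 16 kHz and above, which matches the shape of the treble rise in the plotted floor.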

Monty


> I don't know very much about the internals of Vorbis (nor
> psychoacoustics in general, I'm afraid), but it seems to be as if the
> floor is a rough copy of the FFT _plus_ the tone masking stuff, whereas I'd
> probably want it to be a rough copy of the FFT _minus_ the tone masking
> stuff.
>
> Is there any way I can actually get out this kind of information, short of
> encoding the entire signal and decoding it again (which will obviously also
> leave me with all the quantization noise and other artifacts that I don't
> want)?
>
> /* Steinar */
> --
> Homepage: http://www.sesse.net/
