thr3ads.net - flac dev - [flac-dev] FLAC specification clarification [Jun 2020]

If this information is useful, please help other people find it:
Share via:

Martijn van Beurden

2020-Jun-22 06:32 UTC

[flac-dev] FLAC specification clarification

Yes, this is such a case. However, implementing this in a future
encoder/decoder would break compatibility with most (likely all) existing
decoders, and only in some very, very rare cases where the material is such
that the encoder chooses to use negative shifts, which makes it even harder
to troubleshoot. Furthermore, as this can only be used in very rare cases,
there is no benefit from allowing this.

Op vr 19 jun. 2020 om 18:03 schreef Stephen F. Booth <me at sbooth.org>:
> Is this a case where something allowed by the specification isn't
> implemented by the reference encoder/decoder (such as 25-32 bits per
> sample) but could be in a different implementation? If so, I am not sure
> whether it makes sense to change the specification based on the reference
> implementation.
>
> Stephen
>
> On Wed, Jun 17, 2020 at 3:22 PM Martijn van Beurden <mvanb1 at
gmail.com>
> wrote:
>
>> Hi all,
>>
>> When trying to better understand the way LPC exactly works, I stumbled
>> upon something which, after some digging, was already reported and
(partly)
>> fixed: https://sourceforge.net/p/flac/bugs/424/
>>
>> Apparently, the FLAC specification has a LPC shift that can be both
>> positive and negative, but the encoder specifically makes sure that
only
>> positive shifts are encoded and the decoder only accepts positive
shifts.
>> The ffmpeg FLAC encoder and decoder show the same behaviour.
>>
>> Now, the documentation in the source code is fixed, the documentation
on
>> the website (which I was looking at) isn't yet. The website
format.html
>> states: "Quantized linear predictor coefficient shift needed in
bits (NOTE:
>> this number is signed two's-complement)." The source code
format.html says
>> "Quantized linear predictor coefficient shift needed in bits
(NOTE: this
>> number is signed two's-complement; but, due to implementation
details, must
>> be non-negative)."
>>
>> I was thinking of submitting a patch to the FLAC website git to get the
>> website format.html up-to-date (there have been more changes than just
this
>> one), but I feel the line above isn't clear enough. Maybe change it
to
>> something like this, to make the wording more similar to the rest of
the
>> specification
>>
>> Quantized linear predictor coefficient shift needed in bits (NOTE:
these
>> bits must be 00000-01111. Originally this was a signed integer, but
>> negative shifts were never implemented).
>>
>> Or perhaps:
>>
>> Quantized linear predictor coefficient shift needed in bits (NOTE:
First
>> bit must be zero. Originally this was a signed integer, but negative
shifts
>> were never implemented).
>>
>> Any thoughts?
>>
>> Kind regards, Martijn van Beurden
>> _______________________________________________
>> flac-dev mailing list
>> flac-dev at xiph.org
>> http://lists.xiph.org/mailman/listinfo/flac-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.xiph.org/pipermail/flac-dev/attachments/20200622/f9a8e1c9/attachment.html>

Stephen F. Booth

2020-Jun-25 12:09 UTC

head link

[flac-dev] FLAC specification clarification

To me the real question is not whether that portion of the spec has been
implemented by any existing encoders/decoders but whether the spec is
broken (i.e. cannot be implemented as written). I don't know the rationale
for making the LPC shift explicitly signed. In C a negative shift is
undefined and it does seem in FLAC__lpc_restore_signal() for example that
the LPC shift is used as the argument to a right shift operation. It's
possible (generally/conceptually, not necessarily here) a negative shift
value could be used to represent a left shift. However, I know very little
about linear prediction and how coefficients are chosen and whether that
makes sense. If it really is a design flaw in the spec then it makes sense
to change it or document that negative shifts are not supported by any
known implementation as you suggest.

Philosophically my objection to changing the spec based on lack of known
implementations is that it could be artificially limiting, for example the
same argument about breaking existing decoders could be made for 32-bit
samples sizes.


On Mon, Jun 22, 2020 at 1:33 AM Martijn van Beurden <mvanb1 at gmail.com>
wrote:
> Yes, this is such a case. However, implementing this in a future
> encoder/decoder would break compatibility with most (likely all) existing
> decoders, and only in some very, very rare cases where the material is such
> that the encoder chooses to use negative shifts, which makes it even harder
> to troubleshoot. Furthermore, as this can only be used in very rare cases,
> there is no benefit from allowing this.
>
> Op vr 19 jun. 2020 om 18:03 schreef Stephen F. Booth <me at
sbooth.org>:
>
>> Is this a case where something allowed by the specification isn't
>> implemented by the reference encoder/decoder (such as 25-32 bits per
>> sample) but could be in a different implementation? If so, I am not
sure
>> whether it makes sense to change the specification based on the
reference
>> implementation.
>>
>> Stephen
>>
>> On Wed, Jun 17, 2020 at 3:22 PM Martijn van Beurden <mvanb1 at
gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> When trying to better understand the way LPC exactly works, I
stumbled
>>> upon something which, after some digging, was already reported and
(partly)
>>> fixed: https://sourceforge.net/p/flac/bugs/424/
>>>
>>> Apparently, the FLAC specification has a LPC shift that can be both
>>> positive and negative, but the encoder specifically makes sure that
only
>>> positive shifts are encoded and the decoder only accepts positive
shifts.
>>> The ffmpeg FLAC encoder and decoder show the same behaviour.
>>>
>>> Now, the documentation in the source code is fixed, the
documentation on
>>> the website (which I was looking at) isn't yet. The website
format.html
>>> states: "Quantized linear predictor coefficient shift needed
in bits (NOTE:
>>> this number is signed two's-complement)." The source code
format.html says
>>> "Quantized linear predictor coefficient shift needed in bits
(NOTE: this
>>> number is signed two's-complement; but, due to implementation
details, must
>>> be non-negative)."
>>>
>>> I was thinking of submitting a patch to the FLAC website git to get
the
>>> website format.html up-to-date (there have been more changes than
just this
>>> one), but I feel the line above isn't clear enough. Maybe
change it to
>>> something like this, to make the wording more similar to the rest
of the
>>> specification
>>>
>>> Quantized linear predictor coefficient shift needed in bits (NOTE:
these
>>> bits must be 00000-01111. Originally this was a signed integer, but
>>> negative shifts were never implemented).
>>>
>>> Or perhaps:
>>>
>>> Quantized linear predictor coefficient shift needed in bits (NOTE:
First
>>> bit must be zero. Originally this was a signed integer, but
negative shifts
>>> were never implemented).
>>>
>>> Any thoughts?
>>>
>>> Kind regards, Martijn van Beurden
>>> _______________________________________________
>>> flac-dev mailing list
>>> flac-dev at xiph.org
>>> http://lists.xiph.org/mailman/listinfo/flac-dev
>>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.xiph.org/pipermail/flac-dev/attachments/20200625/e2e732a5/attachment.html>

Martijn van Beurden

2020-Jun-25 14:03 UTC

head link

[flac-dev] FLAC specification clarification

Op do 25 jun. 2020 om 14:09 schreef Stephen F. Booth <me at sbooth.org>:
> To me the real question is not whether that portion of the spec has been
> implemented by any existing encoders/decoders but whether the spec is
> broken (i.e. cannot be implemented as written).
>
We will never know for sure whether any existing encoder/decoder works this
way, but I can tell that two very influential ones, namely the reference
encoder and decoder, libFLAC and the ffmpeg encoder (previously known as
Flake) and decoder, do not implement negative shifts. As the licenses for
both are very open (libFLAC being BSD) I can imagine most proprietary
implementations are just straight copies.

I think the problem is not that there might be decoders that accept this or
encoders that (rarely) output this. I cannot say this for certain, but with
libFLAC and ffmpeg decoders not accepting this, I would say that the vast
majority of existing FLAC decoders does not accept this, and therefore
encoders should never output files with negative shifts, as most decoders
won't play such files.

> It's possible (generally/conceptually, not necessarily here) a negative
> shift value could be used to represent a left shift.
>
Yes, I think that was what was originally intended.

> However, I know very little about linear prediction and how coefficients
> are chosen and whether that makes sense
>
I will explain why using negative shifts has probably never any benefit.
Decoding LPC is rather simple to understand: to predict a sample, take the
first coefficient and multiply it by the previous (already decoded) sample,
add to that the second coefficient multiplied with the sample before that,
the third coefficient with the sample before etc. To predict sample 25 of a
block, the decoder has to sum this: LPC_1 * sample_24 + LPC_2 * sample_23 +
LPC_3 * sample_22 + LPC_4 * sample_21 etc. To finish the decoding of the
sample, the residual has to be added to the prediction. This residual is
stored and encoded separately.

These LPC coefficients are floating point numbers. Very often, when you sum
the coefficients (without multiplying them with samples) the results are
close to one, which means that the samples form a nicely correlated signal.
However, the FLAC format doesn't store floating point numbers, so it
quantizes them into integers to make sure no rounding errors can make the
result not-lossless.

How does this work? Assume we have a signal that can be predicted nicely (with
efficiently encodable residual) with LPC coefficients 0.75; -0.375;
0.125; 0.5. To store these as integers, we multiply them by 8, and we get
7, -3, 1, 4. We also have to store a shift of +3 (2^3 = 8) so we get our
original LPC coefficients back.

For a "negative shift" to have a place, we would need the sum of the
LPC
coefficients to between -0.5 and 0.5, which means it is a very quick
fade-out (which can only last a few samples). Probably one can synthesize
such signals, but looking at actual audio material, this does rarely
happen, especially with the larger blocksizes where non-fixed LPC
prediction shows its strengths.

Kind regards, Martijn van Beurden
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.xiph.org/pipermail/flac-dev/attachments/20200625/ab79daa3/attachment.html>

Brian Willoughby

2020-Jun-26 03:05 UTC

head link

[flac-dev] FLAC specification clarification

I am also philosophically opposed to changing the specification.

That said, there's nothing wrong with adding a note to the specification
about the common implementations, particularly the reference library. Then,
future developers will know both the precise specification and still have the
warning that they risk incompatibility by deviating from the reference
implementation.

I own devices with FLAC implemented in firmware that is quite different from the
reference implementation. I wouldn't want to edit the specification to
narrow it to fit those devices, either.

Brian Willoughby

p.s. I seem to recall using a variable to shift in C, rather than a constant,
and negative values were accepted. Then again, C started out a little weak with
regard to certain operations, such as whether shift was signed or unsigned,
etcetera, and thus my memory might be of a really old, esoteric implementation
of C.


On Jun 25, 2020, at 5:09 AM, Stephen F. Booth <me at sbooth.org>
wrote:> To me the real question is not whether that portion of the spec has been
implemented by any existing encoders/decoders but whether the spec is broken
(i.e. cannot be implemented as written). I don't know the rationale for
making the LPC shift explicitly signed. In C a negative shift is undefined and
it does seem in FLAC__lpc_restore_signal() for example that the LPC shift is
used as the argument to a right shift operation. It's possible
(generally/conceptually, not necessarily here) a negative shift value could be
used to represent a left shift. However, I know very little about linear
prediction and how coefficients are chosen and whether that makes sense. If it
really is a design flaw in the spec then it makes sense to change it or document
that negative shifts are not supported by any known implementation as you
suggest.
> 
> Philosophically my objection to changing the spec based on lack of known
implementations is that it could be artificially limiting, for example the same
argument about breaking existing decoders could be made for 32-bit samples
sizes.
> 
> 
> On Mon, Jun 22, 2020 at 1:33 AM Martijn van Beurden <mvanb1 at
gmail.com> wrote:
>> Yes, this is such a case. However, implementing this in a future
encoder/decoder would break compatibility with most (likely all) existing
decoders, and only in some very, very rare cases where the material is such that
the encoder chooses to use negative shifts, which makes it even harder to
troubleshoot. Furthermore, as this can only be used in very rare cases, there is
no benefit from allowing this.
>> 
>> Op vr 19 jun. 2020 om 18:03 schreef Stephen F. Booth <me at
sbooth.org>:
>>> Is this a case where something allowed by the specification
isn't implemented by the reference encoder/decoder (such as 25-32 bits per
sample) but could be in a different implementation? If so, I am not sure whether
it makes sense to change the specification based on the reference
implementation.
>>> 
>>> Stephen
>>> 
>>> On Wed, Jun 17, 2020 at 3:22 PM Martijn van Beurden <mvanb1 at
gmail.com> wrote:
>>>> Hi all,
>>>> 
>>>> When trying to better understand the way LPC exactly works, I
stumbled upon something which, after some digging, was already reported and
(partly) fixed: https://sourceforge.net/p/flac/bugs/424/
>>>> 
>>>> Apparently, the FLAC specification has a LPC shift that can be
both positive and negative, but the encoder specifically makes sure that only
positive shifts are encoded and the decoder only accepts positive shifts. The
ffmpeg FLAC encoder and decoder show the same behaviour.
>>>> 
>>>> Now, the documentation in the source code is fixed, the
documentation on the website (which I was looking at) isn't yet. The website
format.html states: "Quantized linear predictor coefficient shift needed in
bits (NOTE: this number is signed two's-complement)." The source code
format.html says "Quantized linear predictor coefficient shift needed in
bits (NOTE: this number is signed two's-complement; but, due to
implementation details, must be non-negative)."
>>>> 
>>>> I was thinking of submitting a patch to the FLAC website git to
get the website format.html up-to-date (there have been more changes than just
this one), but I feel the line above isn't clear enough. Maybe change it to
something like this, to make the wording more similar to the rest of the
specification
>>>> 
>>>> Quantized linear predictor coefficient shift needed in bits
(NOTE: these bits must be 00000-01111. Originally this was a signed integer, but
negative shifts were never implemented).
>>>> 
>>>> Or perhaps:
>>>> 
>>>> Quantized linear predictor coefficient shift needed in bits
(NOTE: First bit must be zero. Originally this was a signed integer, but
negative shifts were never implemented).
>>>> 
>>>> Any thoughts?
>>>> 
>>>> Kind regards, Martijn van Beurden

Maybe Matching Threads

Search for more seemingly similar threads

flac dev - Jun 2020 - FLAC specification clarification

[flac-dev] FLAC specification clarification

[flac-dev] FLAC specification clarification

[flac-dev] FLAC specification clarification

[flac-dev] FLAC specification clarification

Maybe Matching Threads