thr3ads.net - opus - [opus] Alleged bug in Silk codec [Jun 2014]

If this information is useful, please help other people find it:
Share via:

Marcello Caramma (mcaramma)

2014-Jun-13 16:28 UTC

[opus] Alleged bug in Silk codec

Hi Jean Marc,

please find attached the audio file (mono 16khz). I shortened it to about
10 seconds. I also add 2 patches that worked for me. Further info that
might help:

- The problem seems to be related to silk_burg_modified not reaching the
maximum gain, so the actual filter order is 16 rather than 2 (which is
what would be expected with a sine wave).
- The problem seems to happen when rshifts >= 3
- when pre-scaling the signal to be < 16384 the problem goes away (patch
scale_burg_in.diff)
- When calculating C0 and rshifts based on a 64 bits correlation instead
of using silk_sum_sqr_shift the problem also goes away (patch
new_C0_calc.diff)

I suspect that for very high prediction gain the fixed point
implementation becomes very sensitive to numerical error, but I am not too
sure why the new versions work better.
I favour the version with the new C0 calculation, as it avoids rescaling
the input.
I also played around with a version (not attached) that prescales the
input by rshifts/2 - this might be considering as it simplifies the code.

Regards,

Marcello

PS: I am using 1.1 but the same issue is present with the latest code well.


On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at jmvalin.ca>
wrote:
>Hi Marcello,
>
>Thanks for the report. It's hard to debug this without the actual file.
>Can you please post the sweep_in.raw file you used?
>
>Cheers,
>
>	Jean-Marc
>
>On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote:
>> Hi,
>> 
>> Apologies if this is a known issues, but I have found what I believe is
>> a bug in the fixed point implementation of the Silk codec and could not
>> find any mention on this in the archives.
>> 
>> The bug can be easily reproduced with the fixed point demo program
>> (./configure ?enable-fixed-point ?disable-float-api && make)
using the
>> following command:
>> 
>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw
>> 
>> Where sweep_in.raw is a 30 seconds full scale frequency sweep from 0 to
>> 8kHz sampled at 16kHz.
>> 
>> The first 6 seconds of audio after transcoding sound Ok. After that
>> artefacts are introduced all the way to the end of the file.
>> The floating point version does not have the issue (even though the
>> quality is subjectively worse roughly from the same point).
>> 
>> I believe I narrowed down the problem to the file burg_modified_FIX.c ?
>> if I make sure the input signal is scaled down to 14 bits before
>> processing the coefficients of the predictor are calculated correctly
>> and no artefact is introduced.
>> 
>> Is anyone experiencing the same problem or has a proper fix for this?
(I
>> can work around the bug with input scaling for now).
>> 
>> Thanks and best regards,
>> 
>> Marcello Caramma
>> 
>> 
>> _______________________________________________
>> opus mailing list
>> opus at xiph.org
>> http://lists.xiph.org/mailman/listinfo/opus
>> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scale_burg_in.diff
Type: application/octet-stream
Size: 2181 bytes
Desc: scale_burg_in.diff
Url :
http://lists.xiph.org/pipermail/opus/attachments/20140613/d45e2ca2/attachment-0003.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: new_C0_calc.diff
Type: application/octet-stream
Size: 2368 bytes
Desc: new_C0_calc.diff
Url :
http://lists.xiph.org/pipermail/opus/attachments/20140613/d45e2ca2/attachment-0004.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sweep_short_m16k.raw
Type: application/octet-stream
Size: 319224 bytes
Desc: sweep_short_m16k.raw
Url :
http://lists.xiph.org/pipermail/opus/attachments/20140613/d45e2ca2/attachment-0005.obj

Jean-Marc Valin

2014-Jun-16 21:23 UTC

head link

[opus] Alleged bug in Silk codec

Hi Marcello,

Thanks for the info and the proposed fixes. I'm currently investigating
what's going on here before deciding on the best way to fix the problem.
Have you been able to figure out why it doesn't work for rshifts >= 3?

Cheers,

	Jean-Marc

On 13/06/14 12:28 PM, Marcello Caramma (mcaramma) wrote:> Hi Jean Marc,
> 
> please find attached the audio file (mono 16khz). I shortened it to about
> 10 seconds. I also add 2 patches that worked for me. Further info that
> might help:
> 
> - The problem seems to be related to silk_burg_modified not reaching the
> maximum gain, so the actual filter order is 16 rather than 2 (which is
> what would be expected with a sine wave).
> - The problem seems to happen when rshifts >= 3
> - when pre-scaling the signal to be < 16384 the problem goes away (patch
> scale_burg_in.diff)
> - When calculating C0 and rshifts based on a 64 bits correlation instead
> of using silk_sum_sqr_shift the problem also goes away (patch
> new_C0_calc.diff)
> 
> I suspect that for very high prediction gain the fixed point
> implementation becomes very sensitive to numerical error, but I am not too
> sure why the new versions work better.
> I favour the version with the new C0 calculation, as it avoids rescaling
> the input.
> I also played around with a version (not attached) that prescales the
> input by rshifts/2 - this might be considering as it simplifies the code.
> 
> Regards,
> 
> Marcello
> 
> PS: I am using 1.1 but the same issue is present with the latest code well.
> 
> 
> On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at
jmvalin.ca> wrote:
> 
>> Hi Marcello,
>>
>> Thanks for the report. It's hard to debug this without the actual
file.
>> Can you please post the sweep_in.raw file you used?
>>
>> Cheers,
>>
>> 	Jean-Marc
>>
>> On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote:
>>> Hi,
>>>
>>> Apologies if this is a known issues, but I have found what I
believe is
>>> a bug in the fixed point implementation of the Silk codec and could
not
>>> find any mention on this in the archives.
>>>
>>> The bug can be easily reproduced with the fixed point demo program
>>> (./configure ?enable-fixed-point ?disable-float-api &&
make) using the
>>> following command:
>>>
>>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw
>>>
>>> Where sweep_in.raw is a 30 seconds full scale frequency sweep from
0 to
>>> 8kHz sampled at 16kHz.
>>>
>>> The first 6 seconds of audio after transcoding sound Ok. After that
>>> artefacts are introduced all the way to the end of the file.
>>> The floating point version does not have the issue (even though the
>>> quality is subjectively worse roughly from the same point).
>>>
>>> I believe I narrowed down the problem to the file
burg_modified_FIX.c ?
>>> if I make sure the input signal is scaled down to 14 bits before
>>> processing the coefficients of the predictor are calculated
correctly
>>> and no artefact is introduced.
>>>
>>> Is anyone experiencing the same problem or has a proper fix for
this? (I
>>> can work around the bug with input scaling for now).
>>>
>>> Thanks and best regards,
>>>
>>> Marcello Caramma
>>>
>>>
>>> _______________________________________________
>>> opus mailing list
>>> opus at xiph.org
>>> http://lists.xiph.org/mailman/listinfo/opus
>>>
>

Jean-Marc Valin

2014-Jun-18 07:09 UTC

head link

[opus] Alleged bug in Silk codec

Hi Marcello,

It turns out that the problem has a much simpler explanation. As far as
I can tell, it's actually a bug in silk_sum_sqr_shift() and this trivial
patch appears to fix the problem:
http://jmvalin.ca/misc_stuff/sum_sqr_shift_fix.patch

It would still require some testing to check that the fix doesn't have
any bad side effect. Let me know how well the fix works for you. Again,
thanks for investigating this.

Cheers,

	Jean-Marc

On 13/06/14 12:28 PM, Marcello Caramma (mcaramma) wrote:> Hi Jean Marc,
> 
> please find attached the audio file (mono 16khz). I shortened it to about
> 10 seconds. I also add 2 patches that worked for me. Further info that
> might help:
> 
> - The problem seems to be related to silk_burg_modified not reaching the
> maximum gain, so the actual filter order is 16 rather than 2 (which is
> what would be expected with a sine wave).
> - The problem seems to happen when rshifts >= 3
> - when pre-scaling the signal to be < 16384 the problem goes away (patch
> scale_burg_in.diff)
> - When calculating C0 and rshifts based on a 64 bits correlation instead
> of using silk_sum_sqr_shift the problem also goes away (patch
> new_C0_calc.diff)
> 
> I suspect that for very high prediction gain the fixed point
> implementation becomes very sensitive to numerical error, but I am not too
> sure why the new versions work better.
> I favour the version with the new C0 calculation, as it avoids rescaling
> the input.
> I also played around with a version (not attached) that prescales the
> input by rshifts/2 - this might be considering as it simplifies the code.
> 
> Regards,
> 
> Marcello
> 
> PS: I am using 1.1 but the same issue is present with the latest code well.
> 
> 
> On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at
jmvalin.ca> wrote:
> 
>> Hi Marcello,
>>
>> Thanks for the report. It's hard to debug this without the actual
file.
>> Can you please post the sweep_in.raw file you used?
>>
>> Cheers,
>>
>> 	Jean-Marc
>>
>> On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote:
>>> Hi,
>>>
>>> Apologies if this is a known issues, but I have found what I
believe is
>>> a bug in the fixed point implementation of the Silk codec and could
not
>>> find any mention on this in the archives.
>>>
>>> The bug can be easily reproduced with the fixed point demo program
>>> (./configure ?enable-fixed-point ?disable-float-api &&
make) using the
>>> following command:
>>>
>>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw
>>>
>>> Where sweep_in.raw is a 30 seconds full scale frequency sweep from
0 to
>>> 8kHz sampled at 16kHz.
>>>
>>> The first 6 seconds of audio after transcoding sound Ok. After that
>>> artefacts are introduced all the way to the end of the file.
>>> The floating point version does not have the issue (even though the
>>> quality is subjectively worse roughly from the same point).
>>>
>>> I believe I narrowed down the problem to the file
burg_modified_FIX.c ?
>>> if I make sure the input signal is scaled down to 14 bits before
>>> processing the coefficients of the predictor are calculated
correctly
>>> and no artefact is introduced.
>>>
>>> Is anyone experiencing the same problem or has a proper fix for
this? (I
>>> can work around the bug with input scaling for now).
>>>
>>> Thanks and best regards,
>>>
>>> Marcello Caramma
>>>
>>>
>>> _______________________________________________
>>> opus mailing list
>>> opus at xiph.org
>>> http://lists.xiph.org/mailman/listinfo/opus
>>>
>

Marcello Caramma (mcaramma)

2014-Jun-20 10:10 UTC

head link

[opus] Alleged bug in Silk codec

Hi Jean-Marc,

well spotted! The patch provided fixes the issue for me.

Nevertheless, in my code (and I would suggest doing the same in libopus) I
am going to replace the function with one that accumulates on 64 bits and
then calculates the shift, for at least 4 reasons:
 - It is less and simpler code
 - The result is likely to be slightly more accurate in case big numbers
come early in the array.
 - I don?t like to have branching inside loops

 - Even though it seems to work on most compilers, assuming that overflow
in the accumulator will lead to a negative result calls for undefined
behaviour (NOT implementation defined): if left as is the code should at
least use an unsigned accumulator and replace the test (nrg < 0) with
(nrg> MAX_INT32).
Can you see any issue in doing so?
If you also think it might be a good idea to do the same in the libopus
repository, I can send you the suggested patch.

Cheers,

Marcello


On 18/06/2014 08:09, "Jean-Marc Valin" <jmvalin at jmvalin.ca>
wrote:
>Hi Marcello,
>
>It turns out that the problem has a much simpler explanation. As far as
>I can tell, it's actually a bug in silk_sum_sqr_shift() and this trivial
>patch appears to fix the problem:
>http://jmvalin.ca/misc_stuff/sum_sqr_shift_fix.patch
>
>It would still require some testing to check that the fix doesn't have
>any bad side effect. Let me know how well the fix works for you. Again,
>thanks for investigating this.
>
>Cheers,
>
>	Jean-Marc
>
>On 13/06/14 12:28 PM, Marcello Caramma (mcaramma) wrote:
>> Hi Jean Marc,
>> 
>> please find attached the audio file (mono 16khz). I shortened it to
>>about
>> 10 seconds. I also add 2 patches that worked for me. Further info that
>> might help:
>> 
>> - The problem seems to be related to silk_burg_modified not reaching
the
>> maximum gain, so the actual filter order is 16 rather than 2 (which is
>> what would be expected with a sine wave).
>> - The problem seems to happen when rshifts >= 3
>> - when pre-scaling the signal to be < 16384 the problem goes away
(patch
>> scale_burg_in.diff)
>> - When calculating C0 and rshifts based on a 64 bits correlation
instead
>> of using silk_sum_sqr_shift the problem also goes away (patch
>> new_C0_calc.diff)
>> 
>> I suspect that for very high prediction gain the fixed point
>> implementation becomes very sensitive to numerical error, but I am not
>>too
>> sure why the new versions work better.
>> I favour the version with the new C0 calculation, as it avoids
rescaling
>> the input.
>> I also played around with a version (not attached) that prescales the
>> input by rshifts/2 - this might be considering as it simplifies the
>>code.
>> 
>> Regards,
>> 
>> Marcello
>> 
>> PS: I am using 1.1 but the same issue is present with the latest code
>>well.
>> 
>> 
>> On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at
jmvalin.ca> wrote:
>> 
>>> Hi Marcello,
>>>
>>> Thanks for the report. It's hard to debug this without the
actual file.
>>> Can you please post the sweep_in.raw file you used?
>>>
>>> Cheers,
>>>
>>> 	Jean-Marc
>>>
>>> On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote:
>>>> Hi,
>>>>
>>>> Apologies if this is a known issues, but I have found what I
believe
>>>>is
>>>> a bug in the fixed point implementation of the Silk codec and
could
>>>>not
>>>> find any mention on this in the archives.
>>>>
>>>> The bug can be easily reproduced with the fixed point demo
program
>>>> (./configure ?enable-fixed-point ?disable-float-api &&
make) using the
>>>> following command:
>>>>
>>>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw
>>>>
>>>> Where sweep_in.raw is a 30 seconds full scale frequency sweep
from 0
>>>>to
>>>> 8kHz sampled at 16kHz.
>>>>
>>>> The first 6 seconds of audio after transcoding sound Ok. After
that
>>>> artefacts are introduced all the way to the end of the file.
>>>> The floating point version does not have the issue (even though
the
>>>> quality is subjectively worse roughly from the same point).
>>>>
>>>> I believe I narrowed down the problem to the file
burg_modified_FIX.c
>>>>?
>>>> if I make sure the input signal is scaled down to 14 bits
before
>>>> processing the coefficients of the predictor are calculated
correctly
>>>> and no artefact is introduced.
>>>>
>>>> Is anyone experiencing the same problem or has a proper fix for
this?
>>>>(I
>>>> can work around the bug with input scaling for now).
>>>>
>>>> Thanks and best regards,
>>>>
>>>> Marcello Caramma
>>>>
>>>>
>>>> _______________________________________________
>>>> opus mailing list
>>>> opus at xiph.org
>>>> http://lists.xiph.org/mailman/listinfo/opus
>>>>
>>

Maybe Matching Threads

Search for more maybe matching threads

opus - Jun 2014 - Alleged bug in Silk codec

[opus] Alleged bug in Silk codec

[opus] Alleged bug in Silk codec

[opus] Alleged bug in Silk codec

[opus] Alleged bug in Silk codec

Maybe Matching Threads