Hi Jean Marc, please find attached the audio file (mono 16khz). I shortened it to about 10 seconds. I also add 2 patches that worked for me. Further info that might help: - The problem seems to be related to silk_burg_modified not reaching the maximum gain, so the actual filter order is 16 rather than 2 (which is what would be expected with a sine wave). - The problem seems to happen when rshifts >= 3 - when pre-scaling the signal to be < 16384 the problem goes away (patch scale_burg_in.diff) - When calculating C0 and rshifts based on a 64 bits correlation instead of using silk_sum_sqr_shift the problem also goes away (patch new_C0_calc.diff) I suspect that for very high prediction gain the fixed point implementation becomes very sensitive to numerical error, but I am not too sure why the new versions work better. I favour the version with the new C0 calculation, as it avoids rescaling the input. I also played around with a version (not attached) that prescales the input by rshifts/2 - this might be considering as it simplifies the code. Regards, Marcello PS: I am using 1.1 but the same issue is present with the latest code well. On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at jmvalin.ca> wrote:>Hi Marcello, > >Thanks for the report. It's hard to debug this without the actual file. >Can you please post the sweep_in.raw file you used? > >Cheers, > > Jean-Marc > >On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote: >> Hi, >> >> Apologies if this is a known issues, but I have found what I believe is >> a bug in the fixed point implementation of the Silk codec and could not >> find any mention on this in the archives. >> >> The bug can be easily reproduced with the fixed point demo program >> (./configure ?enable-fixed-point ?disable-float-api && make) using the >> following command: >> >> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw >> >> Where sweep_in.raw is a 30 seconds full scale frequency sweep from 0 to >> 8kHz sampled at 16kHz. >> >> The first 6 seconds of audio after transcoding sound Ok. After that >> artefacts are introduced all the way to the end of the file. >> The floating point version does not have the issue (even though the >> quality is subjectively worse roughly from the same point). >> >> I believe I narrowed down the problem to the file burg_modified_FIX.c ? >> if I make sure the input signal is scaled down to 14 bits before >> processing the coefficients of the predictor are calculated correctly >> and no artefact is introduced. >> >> Is anyone experiencing the same problem or has a proper fix for this? (I >> can work around the bug with input scaling for now). >> >> Thanks and best regards, >> >> Marcello Caramma >> >> >> _______________________________________________ >> opus mailing list >> opus at xiph.org >> http://lists.xiph.org/mailman/listinfo/opus >>-------------- next part -------------- A non-text attachment was scrubbed... Name: scale_burg_in.diff Type: application/octet-stream Size: 2181 bytes Desc: scale_burg_in.diff Url : http://lists.xiph.org/pipermail/opus/attachments/20140613/d45e2ca2/attachment-0003.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: new_C0_calc.diff Type: application/octet-stream Size: 2368 bytes Desc: new_C0_calc.diff Url : http://lists.xiph.org/pipermail/opus/attachments/20140613/d45e2ca2/attachment-0004.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: sweep_short_m16k.raw Type: application/octet-stream Size: 319224 bytes Desc: sweep_short_m16k.raw Url : http://lists.xiph.org/pipermail/opus/attachments/20140613/d45e2ca2/attachment-0005.obj
Hi Marcello, Thanks for the info and the proposed fixes. I'm currently investigating what's going on here before deciding on the best way to fix the problem. Have you been able to figure out why it doesn't work for rshifts >= 3? Cheers, Jean-Marc On 13/06/14 12:28 PM, Marcello Caramma (mcaramma) wrote:> Hi Jean Marc, > > please find attached the audio file (mono 16khz). I shortened it to about > 10 seconds. I also add 2 patches that worked for me. Further info that > might help: > > - The problem seems to be related to silk_burg_modified not reaching the > maximum gain, so the actual filter order is 16 rather than 2 (which is > what would be expected with a sine wave). > - The problem seems to happen when rshifts >= 3 > - when pre-scaling the signal to be < 16384 the problem goes away (patch > scale_burg_in.diff) > - When calculating C0 and rshifts based on a 64 bits correlation instead > of using silk_sum_sqr_shift the problem also goes away (patch > new_C0_calc.diff) > > I suspect that for very high prediction gain the fixed point > implementation becomes very sensitive to numerical error, but I am not too > sure why the new versions work better. > I favour the version with the new C0 calculation, as it avoids rescaling > the input. > I also played around with a version (not attached) that prescales the > input by rshifts/2 - this might be considering as it simplifies the code. > > Regards, > > Marcello > > PS: I am using 1.1 but the same issue is present with the latest code well. > > > On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at jmvalin.ca> wrote: > >> Hi Marcello, >> >> Thanks for the report. It's hard to debug this without the actual file. >> Can you please post the sweep_in.raw file you used? >> >> Cheers, >> >> Jean-Marc >> >> On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote: >>> Hi, >>> >>> Apologies if this is a known issues, but I have found what I believe is >>> a bug in the fixed point implementation of the Silk codec and could not >>> find any mention on this in the archives. >>> >>> The bug can be easily reproduced with the fixed point demo program >>> (./configure ?enable-fixed-point ?disable-float-api && make) using the >>> following command: >>> >>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw >>> >>> Where sweep_in.raw is a 30 seconds full scale frequency sweep from 0 to >>> 8kHz sampled at 16kHz. >>> >>> The first 6 seconds of audio after transcoding sound Ok. After that >>> artefacts are introduced all the way to the end of the file. >>> The floating point version does not have the issue (even though the >>> quality is subjectively worse roughly from the same point). >>> >>> I believe I narrowed down the problem to the file burg_modified_FIX.c ? >>> if I make sure the input signal is scaled down to 14 bits before >>> processing the coefficients of the predictor are calculated correctly >>> and no artefact is introduced. >>> >>> Is anyone experiencing the same problem or has a proper fix for this? (I >>> can work around the bug with input scaling for now). >>> >>> Thanks and best regards, >>> >>> Marcello Caramma >>> >>> >>> _______________________________________________ >>> opus mailing list >>> opus at xiph.org >>> http://lists.xiph.org/mailman/listinfo/opus >>> >
Hi Marcello, It turns out that the problem has a much simpler explanation. As far as I can tell, it's actually a bug in silk_sum_sqr_shift() and this trivial patch appears to fix the problem: http://jmvalin.ca/misc_stuff/sum_sqr_shift_fix.patch It would still require some testing to check that the fix doesn't have any bad side effect. Let me know how well the fix works for you. Again, thanks for investigating this. Cheers, Jean-Marc On 13/06/14 12:28 PM, Marcello Caramma (mcaramma) wrote:> Hi Jean Marc, > > please find attached the audio file (mono 16khz). I shortened it to about > 10 seconds. I also add 2 patches that worked for me. Further info that > might help: > > - The problem seems to be related to silk_burg_modified not reaching the > maximum gain, so the actual filter order is 16 rather than 2 (which is > what would be expected with a sine wave). > - The problem seems to happen when rshifts >= 3 > - when pre-scaling the signal to be < 16384 the problem goes away (patch > scale_burg_in.diff) > - When calculating C0 and rshifts based on a 64 bits correlation instead > of using silk_sum_sqr_shift the problem also goes away (patch > new_C0_calc.diff) > > I suspect that for very high prediction gain the fixed point > implementation becomes very sensitive to numerical error, but I am not too > sure why the new versions work better. > I favour the version with the new C0 calculation, as it avoids rescaling > the input. > I also played around with a version (not attached) that prescales the > input by rshifts/2 - this might be considering as it simplifies the code. > > Regards, > > Marcello > > PS: I am using 1.1 but the same issue is present with the latest code well. > > > On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at jmvalin.ca> wrote: > >> Hi Marcello, >> >> Thanks for the report. It's hard to debug this without the actual file. >> Can you please post the sweep_in.raw file you used? >> >> Cheers, >> >> Jean-Marc >> >> On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote: >>> Hi, >>> >>> Apologies if this is a known issues, but I have found what I believe is >>> a bug in the fixed point implementation of the Silk codec and could not >>> find any mention on this in the archives. >>> >>> The bug can be easily reproduced with the fixed point demo program >>> (./configure ?enable-fixed-point ?disable-float-api && make) using the >>> following command: >>> >>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw >>> >>> Where sweep_in.raw is a 30 seconds full scale frequency sweep from 0 to >>> 8kHz sampled at 16kHz. >>> >>> The first 6 seconds of audio after transcoding sound Ok. After that >>> artefacts are introduced all the way to the end of the file. >>> The floating point version does not have the issue (even though the >>> quality is subjectively worse roughly from the same point). >>> >>> I believe I narrowed down the problem to the file burg_modified_FIX.c ? >>> if I make sure the input signal is scaled down to 14 bits before >>> processing the coefficients of the predictor are calculated correctly >>> and no artefact is introduced. >>> >>> Is anyone experiencing the same problem or has a proper fix for this? (I >>> can work around the bug with input scaling for now). >>> >>> Thanks and best regards, >>> >>> Marcello Caramma >>> >>> >>> _______________________________________________ >>> opus mailing list >>> opus at xiph.org >>> http://lists.xiph.org/mailman/listinfo/opus >>> >
Hi Jean-Marc, well spotted! The patch provided fixes the issue for me. Nevertheless, in my code (and I would suggest doing the same in libopus) I am going to replace the function with one that accumulates on 64 bits and then calculates the shift, for at least 4 reasons: - It is less and simpler code - The result is likely to be slightly more accurate in case big numbers come early in the array. - I don?t like to have branching inside loops - Even though it seems to work on most compilers, assuming that overflow in the accumulator will lead to a negative result calls for undefined behaviour (NOT implementation defined): if left as is the code should at least use an unsigned accumulator and replace the test (nrg < 0) with (nrg> MAX_INT32).Can you see any issue in doing so? If you also think it might be a good idea to do the same in the libopus repository, I can send you the suggested patch. Cheers, Marcello On 18/06/2014 08:09, "Jean-Marc Valin" <jmvalin at jmvalin.ca> wrote:>Hi Marcello, > >It turns out that the problem has a much simpler explanation. As far as >I can tell, it's actually a bug in silk_sum_sqr_shift() and this trivial >patch appears to fix the problem: >http://jmvalin.ca/misc_stuff/sum_sqr_shift_fix.patch > >It would still require some testing to check that the fix doesn't have >any bad side effect. Let me know how well the fix works for you. Again, >thanks for investigating this. > >Cheers, > > Jean-Marc > >On 13/06/14 12:28 PM, Marcello Caramma (mcaramma) wrote: >> Hi Jean Marc, >> >> please find attached the audio file (mono 16khz). I shortened it to >>about >> 10 seconds. I also add 2 patches that worked for me. Further info that >> might help: >> >> - The problem seems to be related to silk_burg_modified not reaching the >> maximum gain, so the actual filter order is 16 rather than 2 (which is >> what would be expected with a sine wave). >> - The problem seems to happen when rshifts >= 3 >> - when pre-scaling the signal to be < 16384 the problem goes away (patch >> scale_burg_in.diff) >> - When calculating C0 and rshifts based on a 64 bits correlation instead >> of using silk_sum_sqr_shift the problem also goes away (patch >> new_C0_calc.diff) >> >> I suspect that for very high prediction gain the fixed point >> implementation becomes very sensitive to numerical error, but I am not >>too >> sure why the new versions work better. >> I favour the version with the new C0 calculation, as it avoids rescaling >> the input. >> I also played around with a version (not attached) that prescales the >> input by rshifts/2 - this might be considering as it simplifies the >>code. >> >> Regards, >> >> Marcello >> >> PS: I am using 1.1 but the same issue is present with the latest code >>well. >> >> >> On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at jmvalin.ca> wrote: >> >>> Hi Marcello, >>> >>> Thanks for the report. It's hard to debug this without the actual file. >>> Can you please post the sweep_in.raw file you used? >>> >>> Cheers, >>> >>> Jean-Marc >>> >>> On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote: >>>> Hi, >>>> >>>> Apologies if this is a known issues, but I have found what I believe >>>>is >>>> a bug in the fixed point implementation of the Silk codec and could >>>>not >>>> find any mention on this in the archives. >>>> >>>> The bug can be easily reproduced with the fixed point demo program >>>> (./configure ?enable-fixed-point ?disable-float-api && make) using the >>>> following command: >>>> >>>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw >>>> >>>> Where sweep_in.raw is a 30 seconds full scale frequency sweep from 0 >>>>to >>>> 8kHz sampled at 16kHz. >>>> >>>> The first 6 seconds of audio after transcoding sound Ok. After that >>>> artefacts are introduced all the way to the end of the file. >>>> The floating point version does not have the issue (even though the >>>> quality is subjectively worse roughly from the same point). >>>> >>>> I believe I narrowed down the problem to the file burg_modified_FIX.c >>>>? >>>> if I make sure the input signal is scaled down to 14 bits before >>>> processing the coefficients of the predictor are calculated correctly >>>> and no artefact is introduced. >>>> >>>> Is anyone experiencing the same problem or has a proper fix for this? >>>>(I >>>> can work around the bug with input scaling for now). >>>> >>>> Thanks and best regards, >>>> >>>> Marcello Caramma >>>> >>>> >>>> _______________________________________________ >>>> opus mailing list >>>> opus at xiph.org >>>> http://lists.xiph.org/mailman/listinfo/opus >>>> >>