Hi all, I just filed an issue on github: https://github.com/xiph/flac/issues/306 Long story short: libFLAC currently uses 32-bit signed integers for residuals, but using certain 24-bit (crafted) material can overflow this. For examples and an explanation see the github issue. I'm sending this e-mail because I'd like to attract some extra attention towards it. I'm working on patching this, but the residual is actually exported through the API (as part of the subframe) so I'd like to have some discussion on what would be the best way to fix this. I see the following options: 1) Change all residual handling from 32-bit int to 64-bit int. This might incur a performance penalty and it might invalidate certain approaches now used with intrinsics. It also presents an API change 2) Change the current 32-bit signed integer arrays which are used for residual handling to a union within a struct. This union would hold both a 32-bit int and a 64-bit int variant of the residual, and the struct would include a bool to clarify which of the two is being used. For each function manipulating residuals (bitreader, bitwriter, lpc restore etc.) a new _ultrawide variant is written and at several places in the code a decision has to be added whether to use the _ultrawide variant or the existing ones. This changes the API a little more, but makes it able to keep the current intrinsics accelerated functions 3) Add a note to the FLAC spec that residuals should not exceed 32-bit, and add a mechanism to the encoder to ascertain this. In case the encoder finds a residual exceeding the 32-bit range, it defaults to using a verbatim subframe. Any thoughts?
Hi Martijn, I just want to thank you for your thoughtful attention to these details. Much appreciated! Jim ------------------------ On Thu, March 24, 2022 9:05 am, Martijn van Beurden wrote:> Hi all, > > > I just filed an issue on github: > https://github.com/xiph/flac/issues/306 Long story short: libFLAC > currently uses 32-bit signed integers for residuals, but using certain > 24-bit (crafted) material can overflow this. For examples and an > explanation see the github issue. > > I'm sending this e-mail because I'd like to attract some extra > attention towards it. I'm working on patching this, but the residual is > actually exported through the API (as part of the subframe) so I'd like to > have some discussion on what would be the best way to fix this. I see the > following options: > > 1) Change all residual handling from 32-bit int to 64-bit int. This > might incur a performance penalty and it might invalidate certain > approaches now used with intrinsics. It also presents an API change 2) > Change the current 32-bit signed integer arrays which are used for > residual handling to a union within a struct. This union would hold both a > 32-bit int and a 64-bit int variant of the residual, and the > struct would include a bool to clarify which of the two is being used. For > each function manipulating residuals (bitreader, bitwriter, lpc restore > etc.) a new _ultrawide variant is written and at several places in the > code a decision has to be added whether to use the _ultrawide variant or > the existing ones. This changes the API a little more, but makes it able > to keep the current intrinsics accelerated functions 3) Add a note to the > FLAC spec that residuals should not exceed > 32-bit, and add a mechanism to the encoder to ascertain this. In case > the encoder finds a residual exceeding the 32-bit range, it defaults to > using a verbatim subframe. > > Any thoughts? > _______________________________________________ > flac-dev mailing list flac-dev at xiph.org > http://lists.xiph.org/mailman/listinfo/flac-dev
Can you briefly explain the consequences of this unexpected residual? Will a 24-bit crafted file be compressed in a lossy way? Are there tests in the FLAC suite that would reveal this? Can we add such a crafted 24-bit file to the test suite? When would the new _ultrawide variants of existing functions be used? ... I ask because my use of FLAC is entirely 24-bit, so I appreciate the discovery of this potential issue. Brian Willoughby On Mar 24, 2022, at 09:05, Martijn van Beurden <mvanb1 at gmail.com> wrote:> Hi all, > > I just filed an issue on github: > https://github.com/xiph/flac/issues/306 Long story short: libFLAC > currently uses 32-bit signed integers for residuals, but using certain > 24-bit (crafted) material can overflow this. For examples and an > explanation see the github issue. > > I'm sending this e-mail because I'd like to attract some extra > attention towards it. I'm working on patching this, but the residual > is actually exported through the API (as part of the subframe) so I'd > like to have some discussion on what would be the best way to fix > this. I see the following options: > > 1) Change all residual handling from 32-bit int to 64-bit int. This > might incur a performance penalty and it might invalidate certain > approaches now used with intrinsics. It also presents an API change > 2) Change the current 32-bit signed integer arrays which are used for > residual handling to a union within a struct. This union would hold > both a 32-bit int and a 64-bit int variant of the residual, and the > struct would include a bool to clarify which of the two is being used. > For each function manipulating residuals (bitreader, bitwriter, lpc > restore etc.) a new _ultrawide variant is written and at several > places in the code a decision has to be added whether to use the > _ultrawide variant or the existing ones. This changes the API a little > more, but makes it able to keep the current intrinsics accelerated > functions > 3) Add a note to the FLAC spec that residuals should not exceed > 32-bit, and add a mechanism to the encoder to ascertain this. In case > the encoder finds a residual exceeding the 32-bit range, it defaults > to using a verbatim subframe. > > Any thoughts?
On Thu, Mar 24, 2022 at 05:05:02PM +0100, Martijn van Beurden wrote:> 3) Add a note to the FLAC spec that residuals should not exceed > 32-bit, and add a mechanism to the encoder to ascertain this. In case > the encoder finds a residual exceeding the 32-bit range, it defaults > to using a verbatim subframe. > > Any thoughts?The third option makes most sense to me. I don't think we should complicate decoders by requiring them to support 64-bit residuals only because it's technically possible to encode such a stream. Decoders can produce wrong output or reject the stream as invalid, but they should not crash. -- Miroslav Lichvar