In stream_decoder.c function find_metadata_() checks whether a file is valid or not. There are 4 cases it recognizes: 1) file begins with 'fLaC' 2) file begins with ID3 (skipped), followed by 'fLaC' 3) file may begin with 11111111 111110?? sync code (or 11111111111110, depends on endianess i suppose). That is - a raw file with FLAC frames, without header (right?). 4) file begins with ID3 (again - skipped), followed by the same sync code as in case 3 find_metadata_() accepts all 4 types of files. I don't know what was the intention behind accepting case 3, but case 4 ALSO matches mp3 files - they have ID3 at the beginning and FLAC-like two-byte sync code 11111111 11111011 afterwards. Sequence is slightly different, but when it is read as two separate bytes and the second byte is SHR'ed twice to form 00111110, it matches. Because of this FLAC decoder initially accepts an mp3 file as a valid FLAC file. Of course, once decoding starts, decoder throws an error and discards the file (at least i think it does, i never reached that far myself). Unfortunately, some applications using libFLAC may assume that lack of errors after FLAC__stream_decoder_process_until_end_of_metadata() indicates that header is read and stream information (channel number, for example) is available. Would you kindly check the find_metadata_() function to make sure it only accepts real FLAC sync codes, not the MP3 ones? Or at least mention in the documentation that FLAC__STREAM_DECODER_READ_FRAME status doesn't means that stream info is available.
On 06.11.2008 22:16, LRN wrote:> In stream_decoder.c function find_metadata_() checks whether a file is > valid or not. There are 4 cases it recognizes: > 1) file begins with 'fLaC' > 2) file begins with ID3 (skipped), followed by 'fLaC' > 3) file may begin with 11111111 111110?? sync code (or 11111111111110, > depends on endianess i suppose). That is - a raw file with FLAC > frames, without header (right?). > 4) file begins with ID3 (again - skipped), followed by the same sync > code as in case 3 > > find_metadata_() accepts all 4 types of files. > > I don't know what was the intention behind accepting case 3, but case > 4 ALSO matches mp3 files - they have ID3 at the beginning and > FLAC-like two-byte sync code 11111111 11111011 afterwards. Sequence is > slightly different, but when it is read as two separate bytes and the > second byte is SHR'ed twice to form 00111110, it matches. > Because of this FLAC decoder initially accepts an mp3 file as a valid > FLAC file. Of course, once decoding starts, decoder throws an error > and discards the file (at least i think it does, i never reached that > far myself). > Unfortunately, some applications using libFLAC may assume that lack of > errors after FLAC__stream_decoder_process_until_end_of_metadata() > indicates that header is read and stream information (channel number, > for example) is available. > > Would you kindly check the find_metadata_() function to make sure it > only accepts real FLAC sync codes, not the MP3 ones? Or at least > mention in the documentation that FLAC__STREAM_DECODER_READ_FRAME > status doesn't means that stream info is available.Further information: FLAC header: 11111111 111110AB and so on A 0 - mandatory 1 - reserved B 0 - fixed-blocksize 1 - variable-blocksize CCCC 0000 - reserved 0001 - 192 samples 0010-0101 - 576 * (2^(n-2)) samples 0110 - get 8 bit (blocksize-1) from end of header 0111 - get 16 bit (blocksize-1) from end of header 1000-1111 - 256 * (2^(n-8)) samples MP3: 11111111 111BBCCD and so on BB 00 - MPEG 2.5 01 - reserved 10 - MPEG 2 11 - MPEG 1 CC 00 - reserved 01 - layer 3 10 - layer 2 11 - layer 1 D 0 - with CRC 1 - without CRC The best way to distinguish FLAC header from MP3 header is to look at "0A" (FLAC) or "CC" (MP3) bit pair. In FLAC it must be 00 (first 0 is from magic number second 0 is from reserved bit that should always be 0) In MP3 it must not be 00 (00 is reserved). Suggested code change is from: else if(x >> 2 == 0x3e) { /* MAGIC NUMBER for the last 6 sync bits */ decoder->private_->header_warmup[1] = (FLAC__byte)x; decoder->protected_->state = FLAC__STREAM_DECODER_READ_FRAME; return true; } to else if(x >> 1 == 0x7c) { /* MAGIC NUMBER for the last 6 sync bits and reserved 7th bit*/ decoder->private_->header_warmup[1] = (FLAC__byte)x; decoder->protected_->state = FLAC__STREAM_DECODER_READ_FRAME; return true; }
--- LRN <lrn1986 at gmail.com> wrote:> On 06.11.2008 22:16, LRN wrote: > > In stream_decoder.c function find_metadata_() checks whether a file > is > > valid or not. There are 4 cases it recognizes: > > 1) file begins with 'fLaC' > > 2) file begins with ID3 (skipped), followed by 'fLaC' > > 3) file may begin with 11111111 111110?? sync code (or > 11111111111110, > > depends on endianess i suppose). That is - a raw file with FLAC > > frames, without header (right?). > > 4) file begins with ID3 (again - skipped), followed by the same > sync > > code as in case 3 > > > > find_metadata_() accepts all 4 types of files. > > > > I don't know what was the intention behind accepting case 3, but > case > > 4 ALSO matches mp3 files - they have ID3 at the beginning and > > FLAC-like two-byte sync code 11111111 11111011 afterwards. Sequence > is > > slightly different, but when it is read as two separate bytes and > the > > second byte is SHR'ed twice to form 00111110, it matches. > > Because of this FLAC decoder initially accepts an mp3 file as a > valid > > FLAC file. Of course, once decoding starts, decoder throws an error > > > and discards the file (at least i think it does, i never reached > that > > far myself). > > Unfortunately, some applications using libFLAC may assume that lack > of > > errors after FLAC__stream_decoder_process_until_end_of_metadata() > > indicates that header is read and stream information (channel > number, > > for example) is available. > > > > Would you kindly check the find_metadata_() function to make sure > it > > only accepts real FLAC sync codes, not the MP3 ones? Or at least > > mention in the documentation that FLAC__STREAM_DECODER_READ_FRAME > > status doesn't means that stream info is available. > Further information: > > FLAC header: > 11111111 111110AB and so on > > A > 0 - mandatory > 1 - reserved > B > 0 - fixed-blocksize > 1 - variable-blocksize > CCCC > 0000 - reserved > 0001 - 192 samples > 0010-0101 - 576 * (2^(n-2)) samples > 0110 - get 8 bit (blocksize-1) from end of header > 0111 - get 16 bit (blocksize-1) from end of header > 1000-1111 - 256 * (2^(n-8)) samples > > MP3: > 11111111 111BBCCD and so on > > BB > 00 - MPEG 2.5 > 01 - reserved > 10 - MPEG 2 > 11 - MPEG 1 > CC > 00 - reserved > 01 - layer 3 > 10 - layer 2 > 11 - layer 1 > D > 0 - with CRC > 1 - without CRC > > The best way to distinguish FLAC header from MP3 header is to look at > > "0A" (FLAC) or "CC" (MP3) bit pair. > In FLAC it must be 00 (first 0 is from magic number second 0 is from > reserved bit that should always be 0) > In MP3 it must not be 00 (00 is reserved). > > Suggested code change is > from: > else if(x >> 2 == 0x3e) { /* MAGIC NUMBER for the last 6 > > sync bits */ > decoder->private_->header_warmup[1] = (FLAC__byte)x; > decoder->protected_->state = > FLAC__STREAM_DECODER_READ_FRAME; > return true; > } > to > else if(x >> 1 == 0x7c) { /* MAGIC NUMBER for the last 6 > > sync bits and reserved 7th bit*/ > decoder->private_->header_warmup[1] = (FLAC__byte)x; > decoder->protected_->state = > FLAC__STREAM_DECODER_READ_FRAME; > return true; > }I like this. it's kind of a lucky coincidence that the CC bit patterns are exclusive between the 2 formats. I'll make the fix and also add a note about the reserved bit to the format docs.