Pete Batard
2016-Feb-25  12:41 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Ady, On 2016.02.25 02:08, Ady via Syslinux wrote:> There is an "extra" sector, in comparison to... what exactly?Sorry if I wasn't clear. I think I implied that the Large FAT32 fat size had an extra sector compared to minfatsize, when of course I meant the opposite (the Large FAT32 has one less sector than the minfatsize computed by the unpatched code, hence the check fails). The additional sector I was talking about is from the (unpatched) minfatsize variable computed in libfat_open() when it is compared to the fatsize variable. If you use a Large FAT32 formatted drive, you will see that fatsize is always less than (unpatched) minfatsize by a difference of 1, and my understanding is that these values are expressed in number of FAT sectors. So I guess it'd probably be more accurate to say that what I found is that Large FAT32 formatted drives have one less sectors of FAT than SysLinux anticipates right now. To make this a bit more concrete, here is the debug output of a 100GB drive being formatted to Large FAT32 and then processed with the _PATCHED_ version of libfat_open(), with display of the relevant variables: ----------------------------------------------------------------------- Formatting (Large FAT32)... Opened drive \\?\Volume{00466cbf-0000-0000-0000-100000000000} for write access Size : 93.2 GB 195369519 sectors Cluster size 32768 bytes, 512 Bytes Per Sector Volume ID is 1311:f60 32 Reserved Sectors, 23843 Sectors per FAT, 2 FATs 3051903 Total clusters 3051902 Free Clusters Clearing out 47782 sectors for reserved sectors, FATs and root cluster... Initializing reserved sectors and FATs... FAT #0 sector at address: 32 FAT #1 sector at address: 23875 Writing partition boot record... Using Standard FAT32 partition boot record Confirmed new volume has a primary FAT32 boot sector Setting primary FAT32 boot sector for boot... Confirmed new volume has a secondary FAT32 boot sector Setting secondary FAT32 boot sector for boot... Setting Label (This may take while)... Format completed. Writing master boot record... (...) Installing Syslinux 6.03... Opened drive \\?\Volume{00466cbf-0000-0000-0000-100000000000} for write access nclusters = 3051903 fat_type = FAT28, LIBFAT_SECTOR_SHIFT = 9 minfatsize = 23843 (patched, after shift) fatsize = 23843 Successfully wrote Syslinux boot record ----------------------------------------------------------------------- If minfatsize wasn't patched (original Syslinux code) it would be 23844 due and the (minfatsize <= fatsize) check would fail.> For instance... Are there other formatting tools (in whichever OS) that > are capable of similar "large" FAT32 volumes (i.e. volume size >32GB, > "big" cluster size) that do not use this "extra" sector?I don't think it matters. Ridgecrop's Large FAT32 formatting algorithm is fairly popular, so I don't think we can go around telling people "Sorry, you'll have to use something else because we don't want to change what is essentially a sanity check in Syslinux..."> Is this "extra" sector compatible with any OS / diverse scenarios?Well, it certainly is compatible with Windows, DOS/FreeDOS and Linux installers, as this is what I use in Rufus for large drives, and so far I have not had a single report from anybody having trouble reading/writing to a Large FAT32 drive. As a side note, Rufus gets downloaded close to 2 million times a month, and I expect a significant percentage of these users to both have a drive large enough (>32GB) and a target that requires the use of FAT32 (Linux ISOs, DOS). Therefore, if the use of Ridegcrop's Large FAT32 was the source of any issues, I'd have had plenty of reports by now...> I guess that the most important matter would be to be sure that there > are no regressions introduced (i.e. that other FAT12/16/32 volumes > would work in Syslinux just as before this patch). Since I am not a > developer, I am not getting into such issues now.My understanding is that the minfatsize check is just a sanity check (i.e. we could drop it and it wouldn't hurt Syslinux that much - users might just get an error further down the line), and what we are doing by increasing minfatsize is simply making this check slightly more permissive. So there is literally no risk for a regression being introduced, as whatever passed that test before will still pass the test. The worst that could happen is that we may allow stuff that wouldn't have passed before. But, as the processing of Large FAT32 volumes demonstrates, that may not be a bad idea, as this check seems to be based on an understanding of FAT limitations that doesn't seem grounded in physical reality: it is most certainly possible to create valid FAT volumes, as far both Windows and Linux are concerned, for which the check doesn't hold. So I do believe that we should relax the check slightly, as the patch proposes. Regards, /Pete
Pete Batard
2016-Feb-25  12:46 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
On 2016.02.25 12:41, Pete Batard wrote:> and what we are doing by increasing minfatsizeI meant "decreasing" here of course... /Pete
> Hi Ady, > > On 2016.02.25 02:08, Ady via Syslinux wrote: > > There is an "extra" sector, in comparison to... what exactly? > > Sorry if I wasn't clear. I think I implied that the Large FAT32 fat size > had an extra sector compared to minfatsize, when of course I meant the > opposite (the Large FAT32 has one less sector than the minfatsize > computed by the unpatched code, hence the check fails). The additional > sector I was talking about is from the (unpatched) minfatsize variable > computed in libfat_open() when it is compared to the fatsize variable. > > If you use a Large FAT32 formatted drive, you will see that fatsize is > always less than (unpatched) minfatsize by a difference of 1, and my > understanding is that these values are expressed in number of FAT > sectors. So I guess it'd probably be more accurate to say that what I > found is that Large FAT32 formatted drives have one less sectors of FAT > than SysLinux anticipates right now. > > To make this a bit more concrete, here is the debug output of a 100GB > drive being formatted to Large FAT32 and then processed with the > _PATCHED_ version of libfat_open(), with display of the relevant variables: > > ----------------------------------------------------------------------- > Formatting (Large FAT32)... > Opened drive \\?\Volume{00466cbf-0000-0000-0000-100000000000} for write > access > Size : 93.2 GB 195369519 sectors > Cluster size 32768 bytes, 512 Bytes Per Sector > Volume ID is 1311:f60 > 32 Reserved Sectors, 23843 Sectors per FAT, 2 FATs > 3051903 Total clusters > 3051902 Free Clusters > Clearing out 47782 sectors for reserved sectors, FATs and root cluster... > Initializing reserved sectors and FATs... > FAT #0 sector at address: 32 > FAT #1 sector at address: 23875 > Writing partition boot record... > Using Standard FAT32 partition boot record > Confirmed new volume has a primary FAT32 boot sector > Setting primary FAT32 boot sector for boot... > Confirmed new volume has a secondary FAT32 boot sector > Setting secondary FAT32 boot sector for boot... > Setting Label (This may take while)... > Format completed. > Writing master boot record... > (...) > Installing Syslinux 6.03... > Opened drive \\?\Volume{00466cbf-0000-0000-0000-100000000000} for write > access > nclusters = 3051903 > fat_type = FAT28, LIBFAT_SECTOR_SHIFT = 9 > minfatsize = 23843 (patched, after shift) > fatsize = 23843 > Successfully wrote Syslinux boot record > ----------------------------------------------------------------------- > > If minfatsize wasn't patched (original Syslinux code) it would be 23844 > due and the (minfatsize <= fatsize) check would fail. > > > For instance... Are there other formatting tools (in whichever OS) that > > are capable of similar "large" FAT32 volumes (i.e. volume size >32GB, > > "big" cluster size) that do not use this "extra" sector? > > I don't think it matters. Ridgecrop's Large FAT32 formatting algorithm > is fairly popular, so I don't think we can go around telling people > "Sorry, you'll have to use something else because we don't want to > change what is essentially a sanity check in Syslinux..." > > > Is this "extra" sector compatible with any OS / diverse scenarios? > > Well, it certainly is compatible with Windows, DOS/FreeDOS and Linux > installers, as this is what I use in Rufus for large drives, and so far > I have not had a single report from anybody having trouble > reading/writing to a Large FAT32 drive. As a side note, Rufus gets > downloaded close to 2 million times a month, and I expect a significant > percentage of these users to both have a drive large enough (>32GB) and > a target that requires the use of FAT32 (Linux ISOs, DOS). Therefore, if > the use of Ridegcrop's Large FAT32 was the source of any issues, I'd > have had plenty of reports by now... > > > I guess that the most important matter would be to be sure that there > > are no regressions introduced (i.e. that other FAT12/16/32 volumes > > would work in Syslinux just as before this patch). Since I am not a > > developer, I am not getting into such issues now. > > My understanding is that the minfatsize check is just a sanity check > (i.e. we could drop it and it wouldn't hurt Syslinux that much - users > might just get an error further down the line), and what we are doing by > increasing minfatsize is simply making this check slightly more > permissive. So there is literally no risk for a regression being > introduced, as whatever passed that test before will still pass the test. > > The worst that could happen is that we may allow stuff that wouldn't > have passed before. But, as the processing of Large FAT32 volumes > demonstrates, that may not be a bad idea, as this check seems to be > based on an understanding of FAT limitations that doesn't seem grounded > in physical reality: it is most certainly possible to create valid FAT > volumes, as far both Windows and Linux are concerned, for which the > check doesn't hold. So I do believe that we should relax the check > slightly, as the patch proposes. > > Regards, > > /Pete >Please allow me to "report" the same example you have already provided, in a different way. It _might_ help us (or it might not). The following text is a kind of description of my calculations; I am hoping it will be understood. Please, do not accept the following statements and calculations as 100% valid; otherwise we have accomplished nothing. The whole point of this text is to evaluate the "real" minimal FAT size value. Whether Syslinux should accept smaller values than the "real" one, that's a matter for additional discussion. Values that we know: _ FAT type: 32 (i.e. Cluster allocation of 28 bits_per_FAT_entry, plus 4 unused bits_per_FAT_entry) _ Generic amount of Cluster limits for FAT32: 65'525 to 268'435'444 clusters Please note that these 2 limits are not 100% accurate for every scenario / implementation / OS, but they are the extreme ones that might be found in any case. This caveat does not affect our specific example, but it might be relevant for other volume's sizes. _ Bytes per Sector: 512 _ FAT Entries per Sector: 128 _ Reserved Sectors: 32 _ Volume's Total Sectors: 195'369'519 _ Sectors per Cluster: 64 ---> This is the reason for the creation of Ridgecrop's Large FAT32 formatting tool. _ Cluster size: 32 KiB per Cluster _ Amount of FATs: 2 _ Root Directory Sectors: 0 (please keep reading) When formatting FAT32, the Root Directory will be created, so the actual amount of Root Directory Sectors is not zero, but at_least_one. The reason I am writing here a value of zero is because, for FAT32, the Root Directory Sectors are not part of the "head" of the filesystem structure (as opposed to what happens in FAT12/16, both including the Root Directory Sectors as being "outside" or "before" the Data Area). So, the Root Directory is as "simple" as any other directory (or file) in FAT32 (i.e. there is nothing special about it), other than the fact that it is created when formatting the FAT32 volume. Since the minimal space that can be allocated in FAT is one entire Cluster, then the Root Directory being created will occupy one Cluster: 64 Sectors in our example. _ The first Cluster of the Data Area corresponds to FAT_entry #2 (as FAT_entries #0 and #1 are reserved). The above values are the ones we are taking as "known". Now, for simplicity, I am going to take the value of Sectors_per_FAT that Ridgecrop's Large FAT32 formatting tool used for our example: Sectors per FAT: 23843 With this Sectors_per_FAT value, the corresponding Maximum FAT entries: 23843 * 128 = 3'051'904 Since the first 2 FAT entries are reserved, then the corresponding Maximum Amount of Clusters: 3'051'904 - 2 = 3'051'902 The amount of Sectors in the Data Area corresponding to such amount of Clusters is: Maximum Amount of "Allocatable Sectors" (please allow me to use such uncommon expression, for brevity): 3'051'902 * 64 = 195'321'728 So we have, for 23843 Sectors_per_FAT in our example: 32 + 23843 * 2 + 195'321'728 = 195'369'446 Sectors When comparing this value with the 195'369'519 Volume's Total Sectors: 195'369'519 - 195'369'446 = 73 Sectors This means that with 23843 Sectors_per_FAT in our FAT32 volume, we would have 73 unused / unusable sectors.>From the point of view of FAT32, this situation (i.e. unusable sectors)is valid, as long as the values being used in the relevant fields of the boot sector(s) are "correct" (i.e. do not allow allocation of data on the 73 unusable sectors of the volume). If, for some reason, data would end up being saved to the unusable area, the data would be lost (from the filesystem point of view) and the filesystem might need some repairing action. Next, let's take one additional Sector_per_FAT for our example, and repeat the calculations with it. Sectors per FAT: 23844 ( = 23843 + 1 ) With this Sectors_per_FAT value, the corresponding Maximum FAT entries: 23844 * 128 = 3'052'032 Maximum Amount of Clusters: 3'052'032 - 2 = 3'052'030 Maximum Amount of "Allocatable Sectors": 3'052'030 * 64 = 195'329'920 So we have, for 23844 Sectors_per_FAT in our example: 32 + 23844 * 2 + 195'329'920 = 195'377'640 Sectors When comparing this value with the 195'369'519 Volume's Total Sectors: 195'369'519 - 195'377'640 = - 8'121 Sectors This means that with 23844 Sectors_per_FAT in our FAT32 volume, we would be able to allocate the whole Data Area. Also, with 23844 Sectors_per_FAT, we could potentially have a bigger FAT32 volume (if the device's size permits). This situation is valid from the point of view of the FAT32 filesystem, and in fact it is a common case for FAT volumes. In other words, the normal situation would be for a FAT32 volume to be formatted with a value of Sectors_per_FAT that would allow the allocation of _at least_ the _whole_ Data Area. The situation of having a value of Sectors_per_FAT that would not be enough to cover the whole Data Area is uncommon but not invalid, as long as the relevant fields in the boot sector(s) would not allow allocation of data in the unusable (unallocatable) sectors. Whether every-and-any implementation of FAT (or, every-and-any OS in which Syslinux might be relevant) correctly deals with this less-common situation, I do not know. Now, to our main question regarding this patch. Should Syslinux care about the validity of the Sectors_per_FAT value? Should Syslinux allow this less-common situation? Should SYSLINUX's boot code allow it / refuse it / care about? And, are there any edge cases to be carefully evaluated (e.g. close to the lower and upper limits of the Sectors_per_FAT values for FAT12/16/32). Regards, Ady.> _______________________________________________ > Syslinux mailing list > Submissions to Syslinux at zytor.com > Unsubscribe or set options at: > http://www.zytor.com/mailman/listinfo/syslinux >
Pete Batard
2016-Feb-26  00:59 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Ady,
Your insightful post prompted me to to a little bit more digging as to 
how the Ridgecrop algorithm computed its FAT size, with the result of my 
investigations presented below.
NB: For those who don't want to go through this whole part, there's a 
TL;DR near the end.
For reference, the computation of the FAT size all done in the 
GetFATSizeSectors(), the code of which is at [1] (Rufus' version hasn't 
been altered from Ridegcrop's). When this code is ran against the same 
disk and same parameters as we've been using for our example, and with 
some debugging of the variables enabled, you'll see the following output:
-----------------------------------------------------------------------
DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs=2, 
BytesPerSect = 512
Numerator = 4 * (195369519 - 32) = 781477948
Denominator = (64 * 512) + (4 * 2) = 32776
FatSz = (781477948 / 32776) + 1 = 23843
-----------------------------------------------------------------------
Now, the algorithm mentions that this computation is based on a formula 
by Rune Moeller Barnkob from [2], but that page no longer exists. 
Digging around seems to provides a copy of the same content at [3] 
however (where we find that the original page dealt with FAT16), with 
the part of interest to us being the one with the computation, starting at:
-----------------------------------------------------------------------
Assume:
To is the total amount of sectors,
Fo is the amount of free sectors for data
Fs is the size of one FAT in sectors
Cs is the cluster size
Ss is the sector size
Rs is the reserved sectors before the FAT's
Re is the entries in the root-directory
Ds is the size of a entry (=32 bytes)
(...)
-----------------------------------------------------------------------
Now, even though this is a FAT16 computation, I think I was able to work 
out how Tom Thornhill of Ridgecrop got his FAT32 equivalent, which was 
probably as follows:
-----------------------------------------------------------------------
Assume:
To is the total amount of sectors,
Fo is the amount of free sectors for data
Fs is the size of one FAT in sectors
Cs is the cluster size
Ss is the sector size
Rs is the reserved sectors before the FAT's
Re is the entries in the root-directory
Fe is the FAT element size
Nf is the number of FATs
The size of the FAT must equal the free amount of sectors divided by the 
cluster size in sectors multiplied by the FAT element size divided by 
the sector size (the rounding up can be done later).
          Fo * Fe
     Fs = -------
          Cs * Ss
The free amount of sectors must be the total amount minus the FAT's and 
the Reserved Sectors. We'll assume the Root Directory takes no space, 
which is something that Ady mentioned and is safe to do anyway, as this 
will maximize the amount of free sectors we consider in our computation.
     Fo = To - (Nf * Fs) - Rs
If we solve that:
          Fe * ( To - (Nf * Fs) - Rs )
     Fs = ----------------------------
                  Cs * Ss
     Fs * (Cs * Ss)  =  (Fe * To) - (Fe * Nf * Fs) - (Fe * Rs)
     (Fs * Cs * Ss) + (Fs * Fe * Nf)  =  (Fe * To) - (Fe * Rs)
     Fs * ( (Cs * Ss) + (Fe * Nf))  =  Fe * (To - Rs)
              Fe * (To - Rs)
     Fs = ---------------------
          (Cs * Ss) + (Fe * Nf)
We end up with the Numerator and Denominator used by GetFATSizeSectors():
Numerator = FatElementSize * (DskSize - ReservedSecCnt)
Denominator = (SecPerClus * BytesPerSect) + (FatElementSize * NumFATs)
Then +1 is added to the FAT Size for rounding.
-----------------------------------------------------------------------
However, as Ady demonstrated, this computation doesn't actually work 
because it leaves unaddressable sectors.
So let us instead try to follow Ady's post, to derive a proper 
algorithm. First, let me quote the relevant part:
On 2016.02.25 20:49, Ady via Syslinux wrote:> _ Bytes per Sector: 512
> _ FAT Entries per Sector: 128
> _ Reserved Sectors: 32
> _ Volume's Total Sectors: 195'369'519
> _ Sectors per Cluster: 64
> _ Amount of FATs: 2
> _ Root Directory Sectors: 0 (please keep reading)
> Sectors per FAT: 23843
>
> With this Sectors_per_FAT value, the corresponding
>
> Maximum FAT entries:
>   23843 * 128 = 3'051'904
>
> Since the first 2 FAT entries are reserved, then the corresponding
>
> Maximum Amount of Clusters:
>   3'051'904 - 2 = 3'051'902
>
> The amount of Sectors in the Data Area corresponding to such amount of
> Clusters is:
>
> Maximum Amount of "Allocatable Sectors" (please allow me to use
such
> uncommon expression, for brevity):
>   3'051'902 * 64 = 195'321'728
>
> So we have, for 23843 Sectors_per_FAT in our example:
>   32 + 23843 * 2 + 195'321'728 = 195'369'446 Sectors
>
> When comparing this value with the 195'369'519 Volume's Total
Sectors:
>   195'369'519 - 195'369'446 = 73 Sectors
>
> This means that with 23843 Sectors_per_FAT in our FAT32 volume, we
> would have 73 unused / unusable sectors.
I'm going to use the same variable names as previously, with just an 
additional intermediate one added:
------------------------------------------------------------------------
Assume:
To is the total amount of sectors,
Fo is the amount of free sectors for data
Fs is the size of one FAT in sectors
Cs is the cluster size
Ss is the sector size
Rs is the reserved sectors before the FAT's
Re is the entries in the root-directory
Fe is the FAT element size
Nf is the number of FATs
MaxFE is the maximum number of FAT entries
As with the post above, we'll start with that last variable, since it is 
the one that is crucial to getting our computation right:
    MaxFatEn = Fs * Ss / Fe
Now, if we follow Ady's post to compute the total number of sectors 
addressable, we want to have that number greater than the number of 
sectors reported for the volume, hence:
    (MaxFatEn - Nf) * Cs + Nf * Fs + Rs >= To
Let's replace MaxFatEn:
    ((Fs * Ss / Fe) - Nf) * Cs + Nf * Fs + Rs >= To
Now of course, we want to isolate the FAT Size (Fs) since that's what 
we're after:
    (Fs * Ss * Cs / Fe) - (Nf * Cs) + (Fs * Nf) + Rs >= To
    Fs * (Ss * Cs / Fe + Nf) >= To - Rs + (Nf * Cs)
    Fs >= (To - Rs + Nf * Cs) / (Ss * Cs / Fe + Nf)
Thus we can finally get a formula for Fs that satisfies the above:
    Fs = (To - Rs + Nf * Cs)  / ((Ss * Cs / Fe) + Nf) + 1
------------------------------------------------------------------------
That's quite different from the earlier formula.
However, it *does* yield the expected result of 23844, instead of 23843:
------------------------------------------------------------------------
DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs=2, 
BytesPerSect = 512
Numerator = 195369519 * 32 + 2 * 64 = 195369615
Denominator = 64 * 512 / 4 * 2 = 8194
FatSz = (195369615 / 8194) + 1 = 23844
------------------------------------------------------------------------
*TL;DR*: The Ridgecrop Fat Size computation algorithm is wrong, and, 
whether justified or not, the existing Syslinux check does catch FATs 
that are missing addressable sectors.
I have now tested the new computation against a 320GB and 1TB drive, and 
found that the original minfatsize check of Syslinux is no longer an issue.
This being said, and to address Ady's subsequent point:
While I can now address the issue in Rufus (and will contact Tom 
Thornhill of Ridgecrop to let him know about both the issue & fix), I 
suspect there are users out there who are using and will continue to use 
fat32format.exe with the bad computation algorithm, as well as other 
developers who might lift the existing Large FAT32 format code without 
realizing that doing so will break Syslinux installation. So it may 
still be worth relaxing the check especially if, as Ady pointed out, not 
having all sectors addressable doesn't make a disk any less valid.
Regards,
/Pete
[1] 
https://github.com/pbatard/rufus/blob/ade5639c0047ee813f71a8bfef8b1cc7be551009/src/format.c#L349-L377
[2] http://hjem.get2net.dk/rune_moeller_barnkob/filesystems/fat.html
[3] http://pierrelib.pagesperso-orange.fr/filesystems/fat16.html