Pete Batard
2016-Feb-26 00:59 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Ady,

Your insightful post prompted me to do a little bit more digging into how the Ridgecrop algorithm computes its FAT size, with the result of my investigations presented below.

NB: For those who don't want to go through this whole part, there's a TL;DR near the end.

For reference, the computation of the FAT size is all done in GetFATSizeSectors(), the code of which is at [1] (Rufus' version hasn't been altered from Ridgecrop's). When this code is run against the same disk and same parameters as we've been using for our example, and with some debugging of the variables enabled, you'll see the following output:

-----------------------------------------------------------------------
DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
BytesPerSect = 512

Numerator   = 4 * (195369519 - 32) = 781477948
Denominator = (64 * 512) + (4 * 2) = 32776
FatSz       = (781477948 / 32776) + 1 = 23843
-----------------------------------------------------------------------

Now, the algorithm mentions that this computation is based on a formula by Rune Moeller Barnkob from [2], but that page no longer exists. Digging around seems to provide a copy of the same content at [3] however (where we find that the original page dealt with FAT16), with the part of interest to us being the one with the computation, starting at:

-----------------------------------------------------------------------
Assume:
  To is the total amount of sectors
  Fo is the amount of free sectors for data
  Fs is the size of one FAT in sectors
  Cs is the cluster size
  Ss is the sector size
  Rs is the reserved sectors before the FATs
  Re is the entries in the root directory
  Ds is the size of an entry (= 32 bytes)
(...)
-----------------------------------------------------------------------

Now, even though this is a FAT16 computation, I think I was able to work out how Tom Thornhill of Ridgecrop got his FAT32 equivalent, which was probably as follows:

-----------------------------------------------------------------------
Assume:
  To is the total amount of sectors
  Fo is the amount of free sectors for data
  Fs is the size of one FAT in sectors
  Cs is the cluster size
  Ss is the sector size
  Rs is the reserved sectors before the FATs
  Re is the entries in the root directory
  Fe is the FAT element size
  Nf is the number of FATs

The size of the FAT must equal the free amount of sectors, divided by the cluster size in sectors, multiplied by the FAT element size, divided by the sector size (the rounding up can be done later):

         Fo * Fe
    Fs = -------
         Cs * Ss

The free amount of sectors must be the total amount minus the FATs and the Reserved Sectors. We'll assume the Root Directory takes no space, which is something that Ady mentioned and is safe to do anyway, as this will maximize the amount of free sectors we consider in our computation:

    Fo = To - (Nf * Fs) - Rs

If we solve that:

         Fe * (To - (Nf * Fs) - Rs)
    Fs = ---------------------------
                  Cs * Ss

    Fs * (Cs * Ss) = (Fe * To) - (Fe * Nf * Fs) - (Fe * Rs)

    (Fs * Cs * Ss) + (Fs * Fe * Nf) = (Fe * To) - (Fe * Rs)

    Fs * ((Cs * Ss) + (Fe * Nf)) = Fe * (To - Rs)

              Fe * (To - Rs)
    Fs = ---------------------
         (Cs * Ss) + (Fe * Nf)

We end up with the Numerator and Denominator used by GetFATSizeSectors():

    Numerator   = FatElementSize * (DskSize - ReservedSecCnt)
    Denominator = (SecPerClus * BytesPerSect) + (FatElementSize * NumFATs)

Then +1 is added to the FAT Size for rounding.
-----------------------------------------------------------------------
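To make the above concrete, here is a minimal C sketch of the computation just described (an illustration of the formula only, using the debug variable names; it is not the actual GetFATSizeSectors() code):

-----------------------------------------------------------------------
/* Sketch of the Ridgecrop-style FAT size computation described above.
 * Variable names follow the debug output; this is an illustration of
 * the formula, not the actual GetFATSizeSectors() implementation. */
#include <stdint.h>
#include <stdio.h>

static uint32_t ridgecrop_fat_size(uint64_t DskSize, uint32_t ReservedSecCnt,
                                   uint32_t SecPerClus, uint32_t NumFATs,
                                   uint32_t BytesPerSect)
{
    const uint32_t FatElementSize = 4;  /* a FAT32 entry is 4 bytes */
    uint64_t Numerator   = (uint64_t)FatElementSize * (DskSize - ReservedSecCnt);
    uint64_t Denominator = (uint64_t)SecPerClus * BytesPerSect +
                           (uint64_t)FatElementSize * NumFATs;
    return (uint32_t)(Numerator / Denominator) + 1;  /* +1 for rounding */
}

int main(void)
{
    /* The example discussed in this thread: prints 23843 */
    printf("%u\n", ridgecrop_fat_size(195369519, 32, 64, 2, 512));
    return 0;
}
-----------------------------------------------------------------------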
However, as Ady demonstrated, this computation doesn't actually work, because it leaves unaddressable sectors.

So let us instead try to follow Ady's post, to derive a proper algorithm. First, let me quote the relevant part:

On 2016.02.25 20:49, Ady via Syslinux wrote:
> _ Bytes per Sector: 512
> _ FAT Entries per Sector: 128
> _ Reserved Sectors: 32
> _ Volume's Total Sectors: 195'369'519
> _ Sectors per Cluster: 64
> _ Amount of FATs: 2
> _ Root Directory Sectors: 0 (please keep reading)
>
> Sectors per FAT: 23843
>
> With this Sectors_per_FAT value, the corresponding
>
> Maximum FAT entries:
> 23843 * 128 = 3'051'904
>
> Since the first 2 FAT entries are reserved, then the corresponding
>
> Maximum Amount of Clusters:
> 3'051'904 - 2 = 3'051'902
>
> The amount of Sectors in the Data Area corresponding to such amount of Clusters is:
>
> Maximum Amount of "Allocatable Sectors" (please allow me to use such uncommon expression, for brevity):
> 3'051'902 * 64 = 195'321'728
>
> So we have, for 23843 Sectors_per_FAT in our example:
> 32 + 23843 * 2 + 195'321'728 = 195'369'446 Sectors
>
> When comparing this value with the 195'369'519 Volume's Total Sectors:
> 195'369'519 - 195'369'446 = 73 Sectors
>
> This means that with 23843 Sectors_per_FAT in our FAT32 volume, we would have 73 unused / unusable sectors.

I'm going to use the same variable names as previously, with just an additional intermediate one added:

------------------------------------------------------------------------
Assume:
  To is the total amount of sectors
  Fo is the amount of free sectors for data
  Fs is the size of one FAT in sectors
  Cs is the cluster size
  Ss is the sector size
  Rs is the reserved sectors before the FATs
  Re is the entries in the root directory
  Fe is the FAT element size
  Nf is the number of FATs
  MaxFatEn is the maximum number of FAT entries

As with the post above, we'll start with that last variable, since it is the one that is crucial to getting our computation right:

    MaxFatEn = Fs * Ss / Fe

Now, if we follow Ady's post to compute the total number of sectors addressable, we want to have that number greater than the number of sectors reported for the volume, hence:

    (MaxFatEn - Nf) * Cs + Nf * Fs + Rs >= To

Let's replace MaxFatEn:

    ((Fs * Ss / Fe) - Nf) * Cs + Nf * Fs + Rs >= To

Now of course, we want to isolate the FAT Size (Fs) since that's what we're after:

    (Fs * Ss * Cs / Fe) - (Nf * Cs) + (Fs * Nf) + Rs >= To

    Fs * (Ss * Cs / Fe + Nf) >= To - Rs + (Nf * Cs)

    Fs >= (To - Rs + Nf * Cs) / (Ss * Cs / Fe + Nf)

Thus we can finally get a formula for Fs that satisfies the above:

    Fs = (To - Rs + Nf * Cs) / ((Ss * Cs / Fe) + Nf) + 1
------------------------------------------------------------------------

That's quite different from the earlier formula. However, it *does* yield the expected result of 23844, instead of 23843:

------------------------------------------------------------------------
DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
BytesPerSect = 512

Numerator   = 195369519 - 32 + 2 * 64 = 195369615
Denominator = 64 * 512 / 4 + 2 = 8194
FatSz       = (195369615 / 8194) + 1 = 23844
------------------------------------------------------------------------

*TL;DR*: The Ridgecrop FAT size computation algorithm is wrong, and, whether justified or not, the existing Syslinux check does catch FATs that are missing addressable sectors.
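For comparison with the earlier sketch, here is the same kind of minimal C illustration of the revised formula (again a sketch rather than the actual Rufus patch; as discussed further down the thread, the Nf term in the numerator is later refined to the 2 reserved FAT entries, which yields the same result when Nf = 2):

------------------------------------------------------------------------
/* Sketch of the revised FAT size computation derived above. This is an
 * illustration only, not the actual Rufus patch. The 'NumFATs * SecPerClus'
 * term in the numerator is refined later in the thread to '2 * SecPerClus'
 * (the two reserved FAT entries); with NumFATs = 2 the result is identical. */
#include <stdint.h>
#include <stdio.h>

static uint32_t revised_fat_size(uint64_t DskSize, uint32_t ReservedSecCnt,
                                 uint32_t SecPerClus, uint32_t NumFATs,
                                 uint32_t BytesPerSect)
{
    const uint32_t FatElementSize = 4;  /* a FAT32 entry is 4 bytes */
    uint64_t Numerator   = DskSize - ReservedSecCnt +
                           (uint64_t)NumFATs * SecPerClus;
    uint64_t Denominator = (uint64_t)SecPerClus * BytesPerSect / FatElementSize +
                           NumFATs;
    return (uint32_t)(Numerator / Denominator) + 1;  /* +1 for rounding */
}

int main(void)
{
    /* The example discussed in this thread: prints 23844 */
    printf("%u\n", revised_fat_size(195369519, 32, 64, 2, 512));
    return 0;
}
------------------------------------------------------------------------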
I have now tested the new computation against a 320GB and a 1TB drive, and found that the original minfatsize check of Syslinux is no longer an issue.

This being said, and to address Ady's subsequent point:

While I can now address the issue in Rufus (and will contact Tom Thornhill of Ridgecrop to let him know about both the issue & fix), I suspect there are users out there who are using and will continue to use fat32format.exe with the bad computation algorithm, as well as other developers who might lift the existing Large FAT32 format code without realizing that doing so will break Syslinux installation. So it may still be worth relaxing the check, especially if, as Ady pointed out, not having all sectors addressable doesn't make a disk any less valid.

Regards,

/Pete

[1] https://github.com/pbatard/rufus/blob/ade5639c0047ee813f71a8bfef8b1cc7be551009/src/format.c#L349-L377
[2] http://hjem.get2net.dk/rune_moeller_barnkob/filesystems/fat.html
[3] http://pierrelib.pagesperso-orange.fr/filesystems/fat16.html
Ady
2016-Feb-26 08:05 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
In the following text, I am about to use terms such as "inaccurate". I don't mean to question what some code does, but rather to compare the expressions against what I think is a more accurate one, in theory. I mean no disrespect, and I am not saying that developers are doing the wrong thing. In addition, of course I could be wrong (or type it incorrectly, or some formatting issue could appear in some email client, or...).

Although Pete arrived at _similar_ conclusions to mine, and he indeed wrote his conclusions at the end of his email, I am about to purposely comment throughout the original logic. By doing it this way I am hoping to communicate my thoughts in a clearer manner (fingers crossed :).

> Hi Ady,
>
> Your insightful post prompted me to do a little bit more digging into how the Ridgecrop algorithm computes its FAT size, with the result of my investigations presented below.
>
> NB: For those who don't want to go through this whole part, there's a TL;DR near the end.
>
> For reference, the computation of the FAT size is all done in GetFATSizeSectors(), the code of which is at [1] (Rufus' version hasn't been altered from Ridgecrop's). When this code is run against the same disk and same parameters as we've been using for our example, and with some debugging of the variables enabled, you'll see the following output:
>
> -----------------------------------------------------------------------
> DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
> BytesPerSect = 512
>
> Numerator   = 4 * (195369519 - 32) = 781477948
> Denominator = (64 * 512) + (4 * 2) = 32776
> FatSz       = (781477948 / 32776) + 1 = 23843
> -----------------------------------------------------------------------
>
> Now, the algorithm mentions that this computation is based on a formula by Rune Moeller Barnkob from [2], but that page no longer exists. Digging around seems to provide a copy of the same content at [3] however (where we find that the original page dealt with FAT16), with the part of interest to us being the one with the computation, starting at:
>
> -----------------------------------------------------------------------
> Assume:
>   To is the total amount of sectors
>   Fo is the amount of free sectors for data
>   Fs is the size of one FAT in sectors
>   Cs is the cluster size
>   Ss is the sector size
>   Rs is the reserved sectors before the FATs
>   Re is the entries in the root directory
>   Ds is the size of an entry (= 32 bytes)
> (...)
> -----------------------------------------------------------------------
>
> Now, even though this is a FAT16 computation, I think I was able to work out how Tom Thornhill of Ridgecrop got his FAT32 equivalent, which was probably as follows:
>
> -----------------------------------------------------------------------
> Assume:
>   To is the total amount of sectors
>   Fo is the amount of free sectors for data
>   Fs is the size of one FAT in sectors
>   Cs is the cluster size
>   Ss is the sector size
>   Rs is the reserved sectors before the FATs
>   Re is the entries in the root directory
>   Fe is the FAT element size
>   Nf is the number of FATs
>
> The size of the FAT must equal the free amount of sectors, divided by the cluster size in sectors, multiplied by the FAT element size, divided by the sector size (the rounding up can be done later):
>
>          Fo * Fe
>     Fs = -------
>          Cs * Ss
>

This is an inaccurate calculation.
First, the formula for FAT32 should be:

    Fs >= ( Fo / Cs + 2 ) * Fe / Ss

or:

    Fs * Ss / Fe >= Fo / Cs + 2

Just in case of any doubt, I'll repeat it while abusing parentheses (unneeded in math, but perhaps needed in some code):

    Fs * Ss / Fe >= ( Fo / Cs ) + 2

Note the ">=". In fact ">" is the common case and "=" would be a rare case (yet, desirable). The uncommon case would be "<", resulting in unallocatable (unaddressable) sectors (which is the issue we are discussing).

> The free amount of sectors must be the total amount minus the FATs and the Reserved Sectors. We'll assume the Root Directory takes no space,

Again, inaccurate. First, the ">=" I mentioned before; so the "must" in the above sentence would be "too strong". Second, the sentence should clarify that it refers to FAT32 only, since FAT12/16 would be different. This difference is part of the problem, particularly around the edges of:

_ accepted valid values for the total amount of Clusters;
_ accepted valid values for Sectors_per_FAT;
_ accepted valid values for the type of FAT (12/16/32); and,
_ the relation between all of the above.

So, for FAT32, the potential maximum amount of Sectors for Data would be:

    Fo = To - ( Nf * Fs ) - Rs

(This formula is the same as Pete wrote it; I commented only about the sentence explaining the corresponding formula.)

or:

    Fo = To - Rs - ( Nf * Fs )

As I said, that's the _potential maximum addressable_ value for FAT32 (only).

Solving the two formulas for Fs in FAT32:

    { Fo = To - Rs - ( Nf * Fs ) }
    { Fs * Ss / Fe >= ( Fo / Cs ) + 2 }

we have the resulting:

          ( To - Rs ) + ( 2 * Cs )
    Fs >= ________________________
           ( Ss * Cs / Fe ) + Nf

Please note that Fs must be a positive integer and that there are additional restrictions on its resulting value. For instance, if its value is "too low", then the filesystem should not be FAT32 but rather FAT16 or FAT12, considering a similar restriction on the amount of Clusters that actually defines the type of FAT (12/16/32). Please do not forget that the calculations we are presenting here are for FAT32 only.

Since we are discussing the minimal values for FAT32, we should also consider that an amount of Clusters that is around the lower limit of FAT32 might trigger the need to change to FAT16 (instead of FAT32), and then the calculations (and the restrictions on the amount of Clusters) are different (because the Root Directory in FAT12/16 is not just a "simple" Directory in the Data Area).

> which is something that Ady mentioned and is safe to do anyway, as this will maximize the amount of free sectors we consider in our computation:
>
>     Fo = To - (Nf * Fs) - Rs
>
> If we solve that:
>
>          Fe * (To - (Nf * Fs) - Rs)
>     Fs = ---------------------------
>                   Cs * Ss
>
>     Fs * (Cs * Ss) = (Fe * To) - (Fe * Nf * Fs) - (Fe * Rs)
>
>     (Fs * Cs * Ss) + (Fs * Fe * Nf) = (Fe * To) - (Fe * Rs)
>
>     Fs * ((Cs * Ss) + (Fe * Nf)) = Fe * (To - Rs)
>
>               Fe * (To - Rs)
>     Fs = ---------------------
>          (Cs * Ss) + (Fe * Nf)
>
> We end up with the Numerator and Denominator used by GetFATSizeSectors():
>
>     Numerator   = FatElementSize * (DskSize - ReservedSecCnt)
>     Denominator = (SecPerClus * BytesPerSect) + (FatElementSize * NumFATs)
>
> Then +1 is added to the FAT Size for rounding.
> -----------------------------------------------------------------------
>
> However, as Ady demonstrated, this computation doesn't actually work, because it leaves unaddressable sectors.
>
> So let us instead try to follow Ady's post, to derive a proper algorithm.
> First, let me quote the relevant part:
>
> On 2016.02.25 20:49, Ady via Syslinux wrote:
> > _ Bytes per Sector: 512
> > _ FAT Entries per Sector: 128
> > _ Reserved Sectors: 32
> > _ Volume's Total Sectors: 195'369'519
> > _ Sectors per Cluster: 64
> > _ Amount of FATs: 2
> > _ Root Directory Sectors: 0 (please keep reading)
> >
> > Sectors per FAT: 23843
> >
> > With this Sectors_per_FAT value, the corresponding
> >
> > Maximum FAT entries:
> > 23843 * 128 = 3'051'904
> >
> > Since the first 2 FAT entries are reserved, then the corresponding
> >
> > Maximum Amount of Clusters:
> > 3'051'904 - 2 = 3'051'902
> >
> > The amount of Sectors in the Data Area corresponding to such amount of Clusters is:
> >
> > Maximum Amount of "Allocatable Sectors" (please allow me to use such uncommon expression, for brevity):
> > 3'051'902 * 64 = 195'321'728
> >
> > So we have, for 23843 Sectors_per_FAT in our example:
> > 32 + 23843 * 2 + 195'321'728 = 195'369'446 Sectors
> >
> > When comparing this value with the 195'369'519 Volume's Total Sectors:
> > 195'369'519 - 195'369'446 = 73 Sectors
> >
> > This means that with 23843 Sectors_per_FAT in our FAT32 volume, we would have 73 unused / unusable sectors.
>
> I'm going to use the same variable names as previously, with just an additional intermediate one added:
>
> ------------------------------------------------------------------------
> Assume:
>   To is the total amount of sectors
>   Fo is the amount of free sectors for data
>   Fs is the size of one FAT in sectors
>   Cs is the cluster size
>   Ss is the sector size
>   Rs is the reserved sectors before the FATs
>   Re is the entries in the root directory
>   Fe is the FAT element size
>   Nf is the number of FATs
>   MaxFatEn is the maximum number of FAT entries
>
> As with the post above, we'll start with that last variable, since it is the one that is crucial to getting our computation right:
>
>     MaxFatEn = Fs * Ss / Fe
>
> Now, if we follow Ady's post to compute the total number of sectors addressable, we want to have that number greater than the number of sectors reported for the volume, hence:
>
>     (MaxFatEn - Nf) * Cs + Nf * Fs + Rs >= To
>
> Let's replace MaxFatEn:
>
>     ((Fs * Ss / Fe) - Nf) * Cs + Nf * Fs + Rs >= To
>
> Now of course, we want to isolate the FAT Size (Fs) since that's what we're after:
>
>     (Fs * Ss * Cs / Fe) - (Nf * Cs) + (Fs * Nf) + Rs >= To
>
>     Fs * (Ss * Cs / Fe + Nf) >= To - Rs + (Nf * Cs)
>
>     Fs >= (To - Rs + Nf * Cs) / (Ss * Cs / Fe + Nf)
>
> Thus we can finally get a formula for Fs that satisfies the above:
>
>     Fs = (To - Rs + Nf * Cs) / ((Ss * Cs / Fe) + Nf) + 1

I believe such a formula is slightly inaccurate too.

My resulting inequation is

          ( To - Rs ) + ( 2 * Cs )
    Fs >= ________________________
           ( Ss * Cs / Fe ) + Nf

Note: the difference starts at the equation of Pete's "MaxFatEn", which is inaccurate.

My inequation can adequately be transformed into an equation by means of a "roundup" or "ceiling" (or the "modulo" operation if you want) - let's not get into the negative numbers matters and strict functions' definitions here - instead of _always_ adding "+1" (which would be incorrect and inefficient from the point of view of the resulting allocatable size).

In my inequation, when the resulting calculation of (the potential) Fs is (already) an integer, then we should not need to add "+1"; it would not be completely invalid (as per the inequation), but it would be unnecessary.
(Although, please note that when we try to obtain some kind of sector alignment, we could potentially increase the Fs value as one of the possible techniques, generally speaking. There are other, possibly more efficient techniques, but increasing the Fs value is indeed permitted.)

In my inequation, if the resulting calculation of (the potential) Fs happens to not be an integer - I don't really recall whether such cases exist for FAT32 and I have not attempted to find out at this time - then the next integer up would be the minimal Fs value that could cover the whole potential Data Area. Again, the "+1" would not be accurate for such a case.

As per actual code, "roundup", "ceiling", "modulo", "integer", "quotient", "remainder"... are some of the operations or functions that come to mind (not "+1"). Sure, potential "division by zero" and "undefined operation" (and the treatment of negative values, and...) should be considered in the code, but that's out of my league as I am not a developer.

Once the Fs value gets defined, the actual resulting values can be (re)calculated:

_ Max potential Data Area;
_ Actual addressable Data Area;
_ Potential unaddressable sectors for the defined Fs value.

We could even calculate a "minimal" potential Data Area: with a lower amount of sectors than this minimum, we could potentially use a lower Fs value to address the whole "new" Data Area.

> ------------------------------------------------------------------------
>
> That's quite different from the earlier formula.
>
> However, it *does* yield the expected result of 23844, instead of 23843:
>
> ------------------------------------------------------------------------
> DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
> BytesPerSect = 512
>
> Numerator   = 195369519 - 32 + 2 * 64 = 195369615
> Denominator = 64 * 512 / 4 + 2 = 8194
> FatSz       = (195369615 / 8194) + 1 = 23844
> ------------------------------------------------------------------------
>
> *TL;DR*: The Ridgecrop FAT size computation algorithm is wrong, and, whether justified or not, the existing Syslinux check does catch FATs that are missing addressable sectors.
>
> I have now tested the new computation against a 320GB and a 1TB drive, and found that the original minfatsize check of Syslinux is no longer an issue.
>
> This being said, and to address Ady's subsequent point:
>
> While I can now address the issue in Rufus (and will contact Tom Thornhill of Ridgecrop to let him know about both the issue & fix), I suspect there are users out there who are using and will continue to use fat32format.exe with the bad computation algorithm, as well as other developers who might lift the existing Large FAT32 format code without realizing that doing so will break Syslinux installation. So it may still be worth relaxing the check, especially if, as Ady pointed out, not having all sectors addressable doesn't make a disk any less valid.
>
> Regards,
>
> /Pete
>
> [1] https://github.com/pbatard/rufus/blob/ade5639c0047ee813f71a8bfef8b1cc7be551009/src/format.c#L349-L377
> [2] http://hjem.get2net.dk/rune_moeller_barnkob/filesystems/fat.html
> [3] http://pierrelib.pagesperso-orange.fr/filesystems/fat16.html

I would like to point out that being able to allocate / address at least the whole Data Area is indeed the "common" way of calculating the Sectors_per_FAT, but I am _not_ saying that having unaddressable sectors is completely invalid.
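To put numbers on the recalculation described above, here is a small C sketch (variable names are illustrative and not taken from any existing tool; it reproduces the 73 leftover sectors of the worked example):

------------------------------------------------------------------------
/* Sketch: given a chosen Sectors_per_FAT (Fs), recompute the values listed
 * above. Names are illustrative, not from Syslinux, Rufus or fat32format. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t To = 195369519;  /* volume's total sectors      */
    uint32_t Rs = 32;         /* reserved sectors            */
    uint32_t Nf = 2;          /* number of FATs              */
    uint32_t Cs = 64;         /* sectors per cluster         */
    uint32_t Ss = 512;        /* bytes per sector            */
    uint32_t Fe = 4;          /* FAT32 element size in bytes */
    uint32_t Fs = 23843;      /* sectors per FAT under test  */

    uint64_t max_entries  = (uint64_t)Fs * Ss / Fe;      /* 3'051'904   */
    uint64_t max_clusters = max_entries - 2;             /* 3'051'902   */
    uint64_t data_sectors = max_clusters * Cs;           /* 195'321'728 */
    uint64_t addressable  = Rs + (uint64_t)Nf * Fs + data_sectors;
    int64_t  leftover     = (int64_t)(To - addressable); /* 73 for Fs = 23843 */

    printf("addressable = %llu, leftover = %lld\n",
           (unsigned long long)addressable, (long long)leftover);
    return 0;
}
------------------------------------------------------------------------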
Perhaps there are cases in which the resulting Data Area is effectively bigger when using a lower (i.e. a difference of minus one) value of Sectors_per_FAT; that is, leaving a few unaddressable sectors with a lower Fs value, instead of having a bigger Fs value ("eating" part of the potential Data Area) that would be capable of addressing the "normal" Data Area. Such an exercise might involve some recursive calculations.

I am more worried about the edge cases, moving from a valid "Total addressable amount of Clusters" for FAT32 to a range in which the FAT should have been formatted as FAT16.

An additional concern is about the boot code in SYSLINUX in such edge cases, considering that, with the current (unpatched) Syslinux code, we are sure that the FAT32 volume that the syslinux command is about to write to is indeed a valid FAT32 filesystem (and the code being written is adequate for FAT32). But, if the volume was formatted as FAT32 when it should have been FAT16 (or FAT12), I fear that the syslinux command might install code (e.g. a wrong address for ldlinux.sys to be found) for the wrong FAT type.

As I said before, I am not a developer, so these matters are out of my league.

Apologies for the long emails. Hopefully, they are still worth it.

Regards,
Ady.
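Regarding the FAT12/FAT16/FAT32 boundary mentioned above: per Microsoft's FAT specification, the FAT type is determined solely by the count of clusters. A minimal sketch of that check (the thresholds are the spec's; the function name is illustrative and not taken from Syslinux or Rufus):

------------------------------------------------------------------------
/* Sketch of the cluster-count test that decides the FAT type, per the
 * Microsoft FAT specification ("FAT: General Overview of On-Disk Format").
 * The function name is illustrative; it is not taken from any existing tool. */
#include <stdint.h>
#include <stdio.h>

typedef enum { FAT12, FAT16, FAT32 } fat_type;

static fat_type fat_type_from_cluster_count(uint32_t count_of_clusters)
{
    if (count_of_clusters < 4085)
        return FAT12;           /* fewer than 4085 clusters  -> FAT12 */
    else if (count_of_clusters < 65525)
        return FAT16;           /* fewer than 65525 clusters -> FAT16 */
    else
        return FAT32;           /* otherwise                 -> FAT32 */
}

int main(void)
{
    /* The example volume from this thread: 3'051'902 clusters -> FAT32 */
    printf("%d\n", fat_type_from_cluster_count(3051902));
    return 0;
}
------------------------------------------------------------------------

So a volume carrying FAT32 structures but fewer than 65525 clusters would fall, by the specification's own rule, into FAT16 territory, which is exactly the edge case described above.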
Gene Cumm
2016-Feb-26 11:49 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
On Thu, Feb 25, 2016 at 7:59 PM, Pete Batard via Syslinux <syslinux at zytor.com> wrote:

> Hi Ady,
>
> Your insightful post prompted me to do a little bit more digging into how the Ridgecrop algorithm computes its FAT size, with the result of my investigations presented below.
>
> NB: For those who don't want to go through this whole part, there's a TL;DR near the end.

> *TL;DR*: The Ridgecrop FAT size computation algorithm is wrong, and, whether justified or not, the existing Syslinux check does catch FATs that are missing addressable sectors.
>
> I have now tested the new computation against a 320GB and a 1TB drive, and found that the original minfatsize check of Syslinux is no longer an issue.
>
> This being said, and to address Ady's subsequent point:
>
> While I can now address the issue in Rufus (and will contact Tom Thornhill of Ridgecrop to let him know about both the issue & fix), I suspect there are users out there who are using and will continue to use fat32format.exe with the bad computation algorithm, as well as other developers who might lift the existing Large FAT32 format code without realizing that doing so will break Syslinux installation. So it may still be worth relaxing the check, especially if, as Ady pointed out, not having all sectors addressable doesn't make a disk any less valid.

I think there may be another answer to this:

1) A tool to fix the broken FSs by "wasting" the high clusters of the file system, a non-destructive correction. As it stands, they're effectively wasted already, and this might risk a user thinking the file system isn't full when in fact the FAT itself is.

2) The recommendation to use the Syslinux installer's "-f" flag until the individual file system is corrected with the above tool, assuming "-f" circumvents this check. I've been tossing around the idea that a single "-f" might not be the best answer and that a longer option specifying a list of checks to skip might be more balanced, e.g. "-F minfatsize,othercheck".

3) Ensure that the Syslinux installers state which check failed, to assist users in correcting their file system, regardless of whether "-f" is specified. If "-f" is not specified, installation should fail completely (hopefully with 0 alterations). If "-f" is specified but not necessary (indicating a user/tool that isn't being cautious), consider throwing a warning while still allowing the install and returning a success to the caller of the installer.

> [1] https://github.com/pbatard/rufus/blob/ade5639c0047ee813f71a8bfef8b1cc7be551009/src/format.c#L349-L377
> [2] http://hjem.get2net.dk/rune_moeller_barnkob/filesystems/fat.html
> [3] http://pierrelib.pagesperso-orange.fr/filesystems/fat16.html

--
-Gene
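As an illustration of the kind of option suggested in point 2) above (a sketch of the idea only; neither this flag behavior nor these check names exist in the current Syslinux installers):

------------------------------------------------------------------------
/* Sketch only: parsing a hypothetical "-F check1,check2" skip list, as
 * suggested above. Neither the option nor the check names are real
 * Syslinux installer features; this just illustrates the proposal. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool skip_minfatsize = false;

static int parse_skip_checks(char *list)
{
    for (char *tok = strtok(list, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (strcmp(tok, "minfatsize") == 0) {
            skip_minfatsize = true;
        } else {
            fprintf(stderr, "Unknown check name: %s\n", tok);
            return -1;
        }
    }
    return 0;
}

int main(void)
{
    char arg[] = "minfatsize";   /* e.g. the argument of "-F minfatsize" */
    if (parse_skip_checks(arg) == 0)
        printf("skip_minfatsize = %d\n", skip_minfatsize);
    return 0;
}
------------------------------------------------------------------------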
Pete Batard
2016-Feb-26 12:51 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Ady,

I won't comment on the reasons why the original computation was wrong, but thanks for the detailed analysis.

On 2016.02.26 08:05, Ady via Syslinux wrote:
>> Thus we can finally get a formula for Fs that satisfies the above:
>>
>>     Fs = (To - Rs + Nf * Cs) / ((Ss * Cs / Fe) + Nf) + 1
>
> I believe such a formula is slightly inaccurate too.
>
> My resulting inequation is
>
>           ( To - Rs ) + ( 2 * Cs )
>     Fs >= ________________________
>            ( Ss * Cs / Fe ) + Nf

I believe you're right. I assumed that the 2 that eventually ends up in the numerator was the Number of FATs, but looking further, and especially at Wikipedia [1], I see that it is the number of special clusters, which is fixed to 2 regardless of the number of FATs:

"The first two entries in a FAT store special values: The first entry (cluster 0 in the FAT) holds the FAT ID (...) The second entry (cluster 1 in the FAT) nominally stores the end-of-cluster-chain marker (...)"

So, as you point out, this should be hardcoded to 2. Thanks for picking this up. It wouldn't have been a dramatic mistake, since I don't see any possibility of using a number of FATs that's anything but 2 in the context of Rufus, but we should indeed get our variables right.

> instead of _always_ adding "+1" (which would be incorrect and inefficient from the point of view of the resulting allocatable size).

I carefully considered this, and I dispute the fact that this is incorrect.

The problem is we're dealing with a fraction, which will be limited by the number of bits the computer uses to store the data *and might be rounded down behind the scenes*. E.g. unless you use a crazy number of bits to store your numbers, something like (10^100 + 1) / (10^100) will produce 1 and the modulo (10^100 + 1) % (10^100) will produce 0. So if you use the modulo to figure out whether you need to round up, you're going to miss some cases.

And while I'd like to believe that all of "roundup", "ceiling" and friends are smart enough to know the precision they're dealing with, and compensate accordingly, or, more realistically here, that the numbers we're dealing with in this case will never be large enough to test the limits of our precision (especially as our numerator and denominator are set to 64 bit, and our disks will never be larger than 2TB anyway, because FAT32), I'd rather not take any risks here, even more so as I am fixing code that was missing addressable sectors, and the last thing I'd want is to find out that, because of a dodgy rounding or a wrong assumption, we might still end up missing some sectors after all.

And yeah, we could add the actual roundup in the equations themselves (I actually did that when I was trying to figure out whether the original wrong computation was due to the roundup), but since the "only going to be *very slightly* wasteful in an exceedingly limited set of cases" +1 is a safe and sure way to address the rounding "issue", I have to admit that I'm not that interested in trying to come up with a more mathematically satisfying solution. Working and relatively efficient code is what I am after, and as far as I'm concerned, using +1 is both a "correct" and an "efficient" way of achieving that.

Still, I won't prevent you (or anybody else interested) from providing a proper formula if you want. ;)

Regards,

/Pete

[1] https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#Special_entries
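For reference, an exact ceiling division on the 64-bit integers involved here can be written without any precision concern, since integer division and modulo are exact for these types. A minimal sketch (not a proposed patch; the unconditional +1 simply trades at most one extra FAT sector for simplicity):

------------------------------------------------------------------------
/* Sketch of an exact ceiling division for the 64-bit quantities involved
 * here (not a proposed patch). Integer division and modulo are exact, so
 * no precision is lost; the '(n + d - 1) / d' variant is avoided because
 * it could overflow for numerators close to UINT64_MAX. */
#include <stdint.h>
#include <stdio.h>

static uint64_t ceil_div_u64(uint64_t n, uint64_t d)
{
    return n / d + (n % d != 0);
}

int main(void)
{
    /* Thread example: 195369615 / 8194 = 23843 remainder 73, ceiling = 23844.
     * The unconditional +1 gives the same value here; the two only differ
     * when the division is exact. */
    printf("%llu\n", (unsigned long long)ceil_div_u64(195369615, 8194));
    return 0;
}
------------------------------------------------------------------------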
Pete Batard
2016-Feb-26 13:10 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Gene,

On 2016.02.26 11:49, Gene Cumm wrote:
> I think there may be another answer to this:
>
> 1) A tool to fix the broken FSs by "wasting" the high clusters of the file system, a non-destructive correction. As it stands, they're effectively wasted already, and this might risk a user thinking the file system isn't full when in fact the FAT itself is.

I'm not exactly sure how that would work (how would you mark those clusters as wasted, when my understanding is that the FATs can't provide any knowledge about them in the first place?) and, unless it is automatically integrated and run during the Syslinux installation, it sounds quite inconvenient for users.

> 2) The recommendation to use the Syslinux installer's "-f" flag until the individual file system is corrected with the above tool, assuming "-f" circumvents this check. I've been tossing around the idea that a single "-f" might not be the best answer and that a longer option specifying a list of checks to skip might be more balanced, e.g. "-F minfatsize,othercheck".
>
> 3) Ensure that the Syslinux installers state which check failed, to assist users in correcting their file system, regardless of whether "-f" is specified. If "-f" is not specified, installation should fail completely (hopefully with 0 alterations). If "-f" is specified but not necessary (indicating a user/tool that isn't being cautious), consider throwing a warning while still allowing the install and returning a success to the caller of the installer.

Building on what you advocate above, I think I'd indeed prefer an installer that fails when -f isn't used, with an explicit message stating why (maybe with a computation of the number of sectors that cannot be addressed, so that users have a good idea of the wastage), along with an indication that they can use '-f' to bypass that check, should they want to.

Since I brought this whole thing up, and provided this is the course of action everybody agrees on, I *may* try to work on a patch that does this... bearing in mind that, as you know, the last time I said I'd work out some patches for Syslinux (this series), it took about 3 months before I posted anything. ;)

Regards,

/Pete