Pete Batard
2016-Feb-26 00:59 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Ady,

Your insightful post prompted me to do a little bit more digging into how the Ridgecrop algorithm computes its FAT size, with the result of my investigations presented below.

NB: For those who don't want to go through this whole part, there's a TL;DR near the end.

For reference, the computation of the FAT size is all done in GetFATSizeSectors(), the code of which is at [1] (Rufus' version hasn't been altered from Ridgecrop's). When this code is run against the same disk and same parameters as we've been using for our example, and with some debugging of the variables enabled, you'll see the following output:

-----------------------------------------------------------------------
DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
BytesPerSect = 512

Numerator   = 4 * (195369519 - 32) = 781477948
Denominator = (64 * 512) + (4 * 2) = 32776
FatSz       = (781477948 / 32776) + 1 = 23843
-----------------------------------------------------------------------

Now, the algorithm mentions that this computation is based on a formula by Rune Moeller Barnkob from [2], but that page no longer exists. Digging around seems to provide a copy of the same content at [3] however (where we find that the original page dealt with FAT16), with the part of interest to us being the one with the computation, starting at:

-----------------------------------------------------------------------
Assume:
  To is the total amount of sectors
  Fo is the amount of free sectors for data
  Fs is the size of one FAT in sectors
  Cs is the cluster size
  Ss is the sector size
  Rs is the reserved sectors before the FATs
  Re is the entries in the root directory
  Ds is the size of an entry (= 32 bytes)
(...)
-----------------------------------------------------------------------

Now, even though this is a FAT16 computation, I think I was able to work out how Tom Thornhill of Ridgecrop got his FAT32 equivalent, which was probably as follows:

-----------------------------------------------------------------------
Assume:
  To is the total amount of sectors
  Fo is the amount of free sectors for data
  Fs is the size of one FAT in sectors
  Cs is the cluster size
  Ss is the sector size
  Rs is the reserved sectors before the FATs
  Re is the entries in the root directory
  Fe is the FAT element size
  Nf is the number of FATs

The size of the FAT must equal the free amount of sectors, divided by the cluster size in sectors, multiplied by the FAT element size, divided by the sector size (the rounding up can be done later):

         Fo * Fe
    Fs = -------
         Cs * Ss

The free amount of sectors must be the total amount minus the FATs and the Reserved Sectors. We'll assume the Root Directory takes no space, which is something that Ady mentioned and is safe to do anyway, as this will maximize the amount of free sectors we consider in our computation:

    Fo = To - (Nf * Fs) - Rs

If we solve that:

         Fe * (To - (Nf * Fs) - Rs)
    Fs = ---------------------------
                  Cs * Ss

    Fs * (Cs * Ss) = (Fe * To) - (Fe * Nf * Fs) - (Fe * Rs)

    (Fs * Cs * Ss) + (Fs * Fe * Nf) = (Fe * To) - (Fe * Rs)

    Fs * ((Cs * Ss) + (Fe * Nf)) = Fe * (To - Rs)

              Fe * (To - Rs)
    Fs = ---------------------
         (Cs * Ss) + (Fe * Nf)

We end up with the Numerator and Denominator used by GetFATSizeSectors():

    Numerator   = FatElementSize * (DskSize - ReservedSecCnt)
    Denominator = (SecPerClus * BytesPerSect) + (FatElementSize * NumFATs)

Then +1 is added to the FAT Size for rounding.
-----------------------------------------------------------------------
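To make the above concrete, here is a minimal C sketch of the computation just described (an illustration of the formula only, using the debug variable names; it is not the actual GetFATSizeSectors() code):

-----------------------------------------------------------------------
/* Sketch of the Ridgecrop-style FAT size computation described above.
 * Variable names follow the debug output; this is an illustration of
 * the formula, not the actual GetFATSizeSectors() implementation. */
#include <stdint.h>
#include <stdio.h>

static uint32_t ridgecrop_fat_size(uint64_t DskSize, uint32_t ReservedSecCnt,
                                   uint32_t SecPerClus, uint32_t NumFATs,
                                   uint32_t BytesPerSect)
{
    const uint32_t FatElementSize = 4;  /* a FAT32 entry is 4 bytes */
    uint64_t Numerator   = (uint64_t)FatElementSize * (DskSize - ReservedSecCnt);
    uint64_t Denominator = (uint64_t)SecPerClus * BytesPerSect +
                           (uint64_t)FatElementSize * NumFATs;
    return (uint32_t)(Numerator / Denominator) + 1;  /* +1 for rounding */
}

int main(void)
{
    /* The example discussed in this thread: prints 23843 */
    printf("%u\n", ridgecrop_fat_size(195369519, 32, 64, 2, 512));
    return 0;
}
-----------------------------------------------------------------------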
However, as Ady demonstrated, this computation doesn't actually work, because it leaves unaddressable sectors.

So let us instead try to follow Ady's post, to derive a proper algorithm. First, let me quote the relevant part:

On 2016.02.25 20:49, Ady via Syslinux wrote:
> _ Bytes per Sector: 512
> _ FAT Entries per Sector: 128
> _ Reserved Sectors: 32
> _ Volume's Total Sectors: 195'369'519
> _ Sectors per Cluster: 64
> _ Amount of FATs: 2
> _ Root Directory Sectors: 0 (please keep reading)
>
> Sectors per FAT: 23843
>
> With this Sectors_per_FAT value, the corresponding
>
> Maximum FAT entries:
> 23843 * 128 = 3'051'904
>
> Since the first 2 FAT entries are reserved, then the corresponding
>
> Maximum Amount of Clusters:
> 3'051'904 - 2 = 3'051'902
>
> The amount of Sectors in the Data Area corresponding to such amount of Clusters is:
>
> Maximum Amount of "Allocatable Sectors" (please allow me to use such uncommon expression, for brevity):
> 3'051'902 * 64 = 195'321'728
>
> So we have, for 23843 Sectors_per_FAT in our example:
> 32 + 23843 * 2 + 195'321'728 = 195'369'446 Sectors
>
> When comparing this value with the 195'369'519 Volume's Total Sectors:
> 195'369'519 - 195'369'446 = 73 Sectors
>
> This means that with 23843 Sectors_per_FAT in our FAT32 volume, we would have 73 unused / unusable sectors.

I'm going to use the same variable names as previously, with just an additional intermediate one added:

------------------------------------------------------------------------
Assume:
  To is the total amount of sectors
  Fo is the amount of free sectors for data
  Fs is the size of one FAT in sectors
  Cs is the cluster size
  Ss is the sector size
  Rs is the reserved sectors before the FATs
  Re is the entries in the root directory
  Fe is the FAT element size
  Nf is the number of FATs
  MaxFatEn is the maximum number of FAT entries

As with the post above, we'll start with that last variable, since it is the one that is crucial to getting our computation right:

    MaxFatEn = Fs * Ss / Fe

Now, if we follow Ady's post to compute the total number of sectors addressable, we want to have that number greater than the number of sectors reported for the volume, hence:

    (MaxFatEn - Nf) * Cs + Nf * Fs + Rs >= To

Let's replace MaxFatEn:

    ((Fs * Ss / Fe) - Nf) * Cs + Nf * Fs + Rs >= To

Now of course, we want to isolate the FAT Size (Fs) since that's what we're after:

    (Fs * Ss * Cs / Fe) - (Nf * Cs) + (Fs * Nf) + Rs >= To

    Fs * (Ss * Cs / Fe + Nf) >= To - Rs + (Nf * Cs)

    Fs >= (To - Rs + Nf * Cs) / (Ss * Cs / Fe + Nf)

Thus we can finally get a formula for Fs that satisfies the above:

    Fs = (To - Rs + Nf * Cs) / ((Ss * Cs / Fe) + Nf) + 1
------------------------------------------------------------------------

That's quite different from the earlier formula. However, it *does* yield the expected result of 23844, instead of 23843:

------------------------------------------------------------------------
DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
BytesPerSect = 512

Numerator   = 195369519 - 32 + 2 * 64 = 195369615
Denominator = 64 * 512 / 4 + 2 = 8194
FatSz       = (195369615 / 8194) + 1 = 23844
------------------------------------------------------------------------

*TL;DR*: The Ridgecrop FAT size computation algorithm is wrong, and, whether justified or not, the existing Syslinux check does catch FATs that are missing addressable sectors.
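For comparison with the earlier sketch, here is the same kind of minimal C illustration of the revised formula (again a sketch rather than the actual Rufus patch; as discussed further down the thread, the Nf term in the numerator is later refined to the 2 reserved FAT entries, which yields the same result when Nf = 2):

------------------------------------------------------------------------
/* Sketch of the revised FAT size computation derived above. This is an
 * illustration only, not the actual Rufus patch. The 'NumFATs * SecPerClus'
 * term in the numerator is refined later in the thread to '2 * SecPerClus'
 * (the two reserved FAT entries); with NumFATs = 2 the result is identical. */
#include <stdint.h>
#include <stdio.h>

static uint32_t revised_fat_size(uint64_t DskSize, uint32_t ReservedSecCnt,
                                 uint32_t SecPerClus, uint32_t NumFATs,
                                 uint32_t BytesPerSect)
{
    const uint32_t FatElementSize = 4;  /* a FAT32 entry is 4 bytes */
    uint64_t Numerator   = DskSize - ReservedSecCnt +
                           (uint64_t)NumFATs * SecPerClus;
    uint64_t Denominator = (uint64_t)SecPerClus * BytesPerSect / FatElementSize +
                           NumFATs;
    return (uint32_t)(Numerator / Denominator) + 1;  /* +1 for rounding */
}

int main(void)
{
    /* The example discussed in this thread: prints 23844 */
    printf("%u\n", revised_fat_size(195369519, 32, 64, 2, 512));
    return 0;
}
------------------------------------------------------------------------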
I have now tested the new computation against a 320GB and a 1TB drive, and found that the original minfatsize check of Syslinux is no longer an issue.

This being said, and to address Ady's subsequent point:

While I can now address the issue in Rufus (and will contact Tom Thornhill of Ridgecrop to let him know about both the issue & fix), I suspect there are users out there who are using and will continue to use fat32format.exe with the bad computation algorithm, as well as other developers who might lift the existing Large FAT32 format code without realizing that doing so will break Syslinux installation. So it may still be worth relaxing the check, especially if, as Ady pointed out, not having all sectors addressable doesn't make a disk any less valid.

Regards,

/Pete

[1] https://github.com/pbatard/rufus/blob/ade5639c0047ee813f71a8bfef8b1cc7be551009/src/format.c#L349-L377
[2] http://hjem.get2net.dk/rune_moeller_barnkob/filesystems/fat.html
[3] http://pierrelib.pagesperso-orange.fr/filesystems/fat16.html
Ady
2016-Feb-26 08:05 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
In the following text, I am about to use terms such as "inaccurate". I don't mean to question what some code does, but rather to compare the expressions against what I think is a more accurate one, in theory. I mean no disrespect, and I am not saying that developers are doing the wrong thing. In addition, of course I could be wrong (or type it incorrectly, or some formatting issue could appear in some email client, or...).

Although Pete arrived at _similar_ conclusions to mine, and he indeed wrote his conclusions at the end of his email, I am about to purposely comment throughout the original logic. By doing it this way I am hoping to communicate my thoughts in a clearer manner (fingers crossed :).

> Hi Ady,
>
> Your insightful post prompted me to do a little bit more digging into how the Ridgecrop algorithm computes its FAT size, with the result of my investigations presented below.
>
> NB: For those who don't want to go through this whole part, there's a TL;DR near the end.
>
> For reference, the computation of the FAT size is all done in GetFATSizeSectors(), the code of which is at [1] (Rufus' version hasn't been altered from Ridgecrop's). When this code is run against the same disk and same parameters as we've been using for our example, and with some debugging of the variables enabled, you'll see the following output:
>
> -----------------------------------------------------------------------
> DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
> BytesPerSect = 512
>
> Numerator   = 4 * (195369519 - 32) = 781477948
> Denominator = (64 * 512) + (4 * 2) = 32776
> FatSz       = (781477948 / 32776) + 1 = 23843
> -----------------------------------------------------------------------
>
> Now, the algorithm mentions that this computation is based on a formula by Rune Moeller Barnkob from [2], but that page no longer exists. Digging around seems to provide a copy of the same content at [3] however (where we find that the original page dealt with FAT16), with the part of interest to us being the one with the computation, starting at:
>
> -----------------------------------------------------------------------
> Assume:
>   To is the total amount of sectors
>   Fo is the amount of free sectors for data
>   Fs is the size of one FAT in sectors
>   Cs is the cluster size
>   Ss is the sector size
>   Rs is the reserved sectors before the FATs
>   Re is the entries in the root directory
>   Ds is the size of an entry (= 32 bytes)
> (...)
> -----------------------------------------------------------------------
>
> Now, even though this is a FAT16 computation, I think I was able to work out how Tom Thornhill of Ridgecrop got his FAT32 equivalent, which was probably as follows:
>
> -----------------------------------------------------------------------
> Assume:
>   To is the total amount of sectors
>   Fo is the amount of free sectors for data
>   Fs is the size of one FAT in sectors
>   Cs is the cluster size
>   Ss is the sector size
>   Rs is the reserved sectors before the FATs
>   Re is the entries in the root directory
>   Fe is the FAT element size
>   Nf is the number of FATs
>
> The size of the FAT must equal the free amount of sectors, divided by the cluster size in sectors, multiplied by the FAT element size, divided by the sector size (the rounding up can be done later):
>
>          Fo * Fe
>     Fs = -------
>          Cs * Ss
>

This is an inaccurate calculation.
First, the formula for FAT32 should be:

    Fs >= ( Fo / Cs + 2 ) * Fe / Ss

or:

    Fs * Ss / Fe >= Fo / Cs + 2

Just in case of any doubt, I'll repeat it while abusing parentheses (unneeded in math, but perhaps needed in some code):

    Fs * Ss / Fe >= ( Fo / Cs ) + 2

Note the ">=". In fact ">" is the common case and "=" would be a rare case (yet, desirable). The uncommon case would be "<", resulting in unallocatable (unaddressable) sectors (which is the issue we are discussing).

> The free amount of sectors must be the total amount minus the FATs and the Reserved Sectors. We'll assume the Root Directory takes no space,

Again, inaccurate. First, the ">=" I mentioned before; so the "must" in the above sentence would be "too strong". Second, the sentence should clarify that it refers to FAT32 only, since FAT12/16 would be different. This difference is part of the problem, particularly around the edges of:

_ accepted valid values for the total amount of Clusters;
_ accepted valid values for Sectors_per_FAT;
_ accepted valid values for the type of FAT (12/16/32); and,
_ the relation between all of the above.

So, for FAT32, the potential maximum amount of Sectors for Data would be:

    Fo = To - ( Nf * Fs ) - Rs

(This formula is the same as Pete wrote it; I commented only about the sentence explaining the corresponding formula.)

or:

    Fo = To - Rs - ( Nf * Fs )

As I said, that's the _potential maximum addressable_ value for FAT32 (only).

Solving the two formulas for Fs in FAT32:

    { Fo = To - Rs - ( Nf * Fs ) }
    { Fs * Ss / Fe >= ( Fo / Cs ) + 2 }

we have the resulting:

          ( To - Rs ) + ( 2 * Cs )
    Fs >= ________________________
           ( Ss * Cs / Fe ) + Nf

Please note that Fs must be a positive integer and that there are additional restrictions on its resulting value. For instance, if its value is "too low", then the filesystem should not be FAT32 but rather FAT16 or FAT12, considering a similar restriction on the amount of Clusters that actually defines the type of FAT (12/16/32). Please do not forget that the calculations we are presenting here are for FAT32 only.

Since we are discussing the minimal values for FAT32, we should also consider that an amount of Clusters that is around the lower limit of FAT32 might trigger the need to change to FAT16 (instead of FAT32), and then the calculations (and the restrictions on the amount of Clusters) are different (because the Root Directory in FAT12/16 is not just a "simple" Directory in the Data Area).

> which is something that Ady mentioned and is safe to do anyway, as this will maximize the amount of free sectors we consider in our computation:
>
>     Fo = To - (Nf * Fs) - Rs
>
> If we solve that:
>
>          Fe * (To - (Nf * Fs) - Rs)
>     Fs = ---------------------------
>                   Cs * Ss
>
>     Fs * (Cs * Ss) = (Fe * To) - (Fe * Nf * Fs) - (Fe * Rs)
>
>     (Fs * Cs * Ss) + (Fs * Fe * Nf) = (Fe * To) - (Fe * Rs)
>
>     Fs * ((Cs * Ss) + (Fe * Nf)) = Fe * (To - Rs)
>
>               Fe * (To - Rs)
>     Fs = ---------------------
>          (Cs * Ss) + (Fe * Nf)
>
> We end up with the Numerator and Denominator used by GetFATSizeSectors():
>
>     Numerator   = FatElementSize * (DskSize - ReservedSecCnt)
>     Denominator = (SecPerClus * BytesPerSect) + (FatElementSize * NumFATs)
>
> Then +1 is added to the FAT Size for rounding.
> -----------------------------------------------------------------------
>
> However, as Ady demonstrated, this computation doesn't actually work, because it leaves unaddressable sectors.
>
> So let us instead try to follow Ady's post, to derive a proper algorithm.
> First, let me quote the relevant part:
>
> On 2016.02.25 20:49, Ady via Syslinux wrote:
> > _ Bytes per Sector: 512
> > _ FAT Entries per Sector: 128
> > _ Reserved Sectors: 32
> > _ Volume's Total Sectors: 195'369'519
> > _ Sectors per Cluster: 64
> > _ Amount of FATs: 2
> > _ Root Directory Sectors: 0 (please keep reading)
> >
> > Sectors per FAT: 23843
> >
> > With this Sectors_per_FAT value, the corresponding
> >
> > Maximum FAT entries:
> > 23843 * 128 = 3'051'904
> >
> > Since the first 2 FAT entries are reserved, then the corresponding
> >
> > Maximum Amount of Clusters:
> > 3'051'904 - 2 = 3'051'902
> >
> > The amount of Sectors in the Data Area corresponding to such amount of Clusters is:
> >
> > Maximum Amount of "Allocatable Sectors" (please allow me to use such uncommon expression, for brevity):
> > 3'051'902 * 64 = 195'321'728
> >
> > So we have, for 23843 Sectors_per_FAT in our example:
> > 32 + 23843 * 2 + 195'321'728 = 195'369'446 Sectors
> >
> > When comparing this value with the 195'369'519 Volume's Total Sectors:
> > 195'369'519 - 195'369'446 = 73 Sectors
> >
> > This means that with 23843 Sectors_per_FAT in our FAT32 volume, we would have 73 unused / unusable sectors.
>
> I'm going to use the same variable names as previously, with just an additional intermediate one added:
>
> ------------------------------------------------------------------------
> Assume:
>   To is the total amount of sectors
>   Fo is the amount of free sectors for data
>   Fs is the size of one FAT in sectors
>   Cs is the cluster size
>   Ss is the sector size
>   Rs is the reserved sectors before the FATs
>   Re is the entries in the root directory
>   Fe is the FAT element size
>   Nf is the number of FATs
>   MaxFatEn is the maximum number of FAT entries
>
> As with the post above, we'll start with that last variable, since it is the one that is crucial to getting our computation right:
>
>     MaxFatEn = Fs * Ss / Fe
>
> Now, if we follow Ady's post to compute the total number of sectors addressable, we want to have that number greater than the number of sectors reported for the volume, hence:
>
>     (MaxFatEn - Nf) * Cs + Nf * Fs + Rs >= To
>
> Let's replace MaxFatEn:
>
>     ((Fs * Ss / Fe) - Nf) * Cs + Nf * Fs + Rs >= To
>
> Now of course, we want to isolate the FAT Size (Fs) since that's what we're after:
>
>     (Fs * Ss * Cs / Fe) - (Nf * Cs) + (Fs * Nf) + Rs >= To
>
>     Fs * (Ss * Cs / Fe + Nf) >= To - Rs + (Nf * Cs)
>
>     Fs >= (To - Rs + Nf * Cs) / (Ss * Cs / Fe + Nf)
>
> Thus we can finally get a formula for Fs that satisfies the above:
>
>     Fs = (To - Rs + Nf * Cs) / ((Ss * Cs / Fe) + Nf) + 1

I believe such a formula is slightly inaccurate too.

My resulting inequation is

          ( To - Rs ) + ( 2 * Cs )
    Fs >= ________________________
           ( Ss * Cs / Fe ) + Nf

Note: the difference starts at the equation of Pete's "MaxFatEn", which is inaccurate.

My inequation can adequately be transformed into an equation by means of a "roundup" or "ceiling" (or the "modulo" operation if you want) - let's not get into the negative numbers matters and strict functions' definitions here - instead of _always_ adding "+1" (which would be incorrect and inefficient from the point of view of the resulting allocatable size).

In my inequation, when the resulting calculation of (the potential) Fs is (already) an integer, then we should not need to add "+1"; it would not be completely invalid (as per the inequation), but it would be unnecessary.
(Although, please note that when we try to obtain some kind of sector alignment, we could potentially increase the Fs value as one of the possible techniques, generally speaking. There are other, possibly more efficient techniques, but increasing the Fs value is indeed permitted.)

In my inequation, if the resulting calculation of (the potential) Fs happens to not be an integer - I don't really recall whether such cases exist for FAT32 and I have not attempted to find out at this time - then the next integer up would be the minimal Fs value that could cover the whole potential Data Area. Again, the "+1" would not be accurate for such a case.

As per actual code, "roundup", "ceiling", "modulo", "integer", "quotient", "remainder"... are some of the operations or functions that come to mind (not "+1"). Sure, potential "division by zero" and "undefined operation" (and the treatment of negative values, and...) should be considered in the code, but that's out of my league as I am not a developer.

Once the Fs value gets defined, the actual resulting values can be (re)calculated:

_ Max potential Data Area;
_ Actual addressable Data Area;
_ Potential unaddressable sectors for the defined Fs value.

We could even calculate a "minimal" potential Data Area: with a lower amount of sectors than this minimum, we could potentially use a lower Fs value to address the whole "new" Data Area.

> ------------------------------------------------------------------------
>
> That's quite different from the earlier formula.
>
> However, it *does* yield the expected result of 23844, instead of 23843:
>
> ------------------------------------------------------------------------
> DskSize = 195369519, ReservedSecCnt = 32, SecPerClus = 64, NumFATs = 2,
> BytesPerSect = 512
>
> Numerator   = 195369519 - 32 + 2 * 64 = 195369615
> Denominator = 64 * 512 / 4 + 2 = 8194
> FatSz       = (195369615 / 8194) + 1 = 23844
> ------------------------------------------------------------------------
>
> *TL;DR*: The Ridgecrop FAT size computation algorithm is wrong, and, whether justified or not, the existing Syslinux check does catch FATs that are missing addressable sectors.
>
> I have now tested the new computation against a 320GB and a 1TB drive, and found that the original minfatsize check of Syslinux is no longer an issue.
>
> This being said, and to address Ady's subsequent point:
>
> While I can now address the issue in Rufus (and will contact Tom Thornhill of Ridgecrop to let him know about both the issue & fix), I suspect there are users out there who are using and will continue to use fat32format.exe with the bad computation algorithm, as well as other developers who might lift the existing Large FAT32 format code without realizing that doing so will break Syslinux installation. So it may still be worth relaxing the check, especially if, as Ady pointed out, not having all sectors addressable doesn't make a disk any less valid.
>
> Regards,
>
> /Pete
>
> [1] https://github.com/pbatard/rufus/blob/ade5639c0047ee813f71a8bfef8b1cc7be551009/src/format.c#L349-L377
> [2] http://hjem.get2net.dk/rune_moeller_barnkob/filesystems/fat.html
> [3] http://pierrelib.pagesperso-orange.fr/filesystems/fat16.html

I would like to point out that being able to allocate / address at least the whole Data Area is indeed the "common" way of calculating the Sectors_per_FAT, but I am _not_ saying that having unaddressable sectors is completely invalid.
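To put numbers on the recalculation described above, here is a small C sketch (variable names are illustrative and not taken from any existing tool; it reproduces the 73 leftover sectors of the worked example):

------------------------------------------------------------------------
/* Sketch: given a chosen Sectors_per_FAT (Fs), recompute the values listed
 * above. Names are illustrative, not from Syslinux, Rufus or fat32format. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t To = 195369519;  /* volume's total sectors      */
    uint32_t Rs = 32;         /* reserved sectors            */
    uint32_t Nf = 2;          /* number of FATs              */
    uint32_t Cs = 64;         /* sectors per cluster         */
    uint32_t Ss = 512;        /* bytes per sector            */
    uint32_t Fe = 4;          /* FAT32 element size in bytes */
    uint32_t Fs = 23843;      /* sectors per FAT under test  */

    uint64_t max_entries  = (uint64_t)Fs * Ss / Fe;      /* 3'051'904   */
    uint64_t max_clusters = max_entries - 2;             /* 3'051'902   */
    uint64_t data_sectors = max_clusters * Cs;           /* 195'321'728 */
    uint64_t addressable  = Rs + (uint64_t)Nf * Fs + data_sectors;
    int64_t  leftover     = (int64_t)(To - addressable); /* 73 for Fs = 23843 */

    printf("addressable = %llu, leftover = %lld\n",
           (unsigned long long)addressable, (long long)leftover);
    return 0;
}
------------------------------------------------------------------------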
Perhaps there are cases in which the resulting Data Area is effectively bigger when using a lower (i.e. a difference of minus one) value of Sectors_per_FAT; that is, leaving a few unaddressable sectors with a lower Fs value, instead of having a bigger Fs value ("eating" part of the potential Data Area) that would be capable of addressing the "normal" Data Area. Such an exercise might involve some recursive calculations.

I am more worried about the edge cases, moving from a valid "Total addressable amount of Clusters" for FAT32 to a range in which the FAT should have been formatted as FAT16.

An additional concern is about the boot code in SYSLINUX in such edge cases, considering that, with the current (unpatched) Syslinux code, we are sure that the FAT32 volume that the syslinux command is about to write to is indeed a valid FAT32 filesystem (and the code being written is adequate for FAT32). But, if the volume was formatted as FAT32 when it should have been FAT16 (or FAT12), I fear that the syslinux command might install code (e.g. a wrong address for ldlinux.sys to be found) for the wrong FAT type.

As I said before, I am not a developer, so these matters are out of my league.

Apologies for the long emails. Hopefully, they are still worth it.

Regards,
Ady.
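Regarding the FAT12/FAT16/FAT32 boundary mentioned above: per Microsoft's FAT specification, the FAT type is determined solely by the count of clusters. A minimal sketch of that check (the thresholds are the spec's; the function name is illustrative and not taken from Syslinux or Rufus):

------------------------------------------------------------------------
/* Sketch of the cluster-count test that decides the FAT type, per the
 * Microsoft FAT specification ("FAT: General Overview of On-Disk Format").
 * The function name is illustrative; it is not taken from any existing tool. */
#include <stdint.h>
#include <stdio.h>

typedef enum { FAT12, FAT16, FAT32 } fat_type;

static fat_type fat_type_from_cluster_count(uint32_t count_of_clusters)
{
    if (count_of_clusters < 4085)
        return FAT12;           /* fewer than 4085 clusters  -> FAT12 */
    else if (count_of_clusters < 65525)
        return FAT16;           /* fewer than 65525 clusters -> FAT16 */
    else
        return FAT32;           /* otherwise                 -> FAT32 */
}

int main(void)
{
    /* The example volume from this thread: 3'051'902 clusters -> FAT32 */
    printf("%d\n", fat_type_from_cluster_count(3051902));
    return 0;
}
------------------------------------------------------------------------

So a volume carrying FAT32 structures but fewer than 65525 clusters would fall, by the specification's own rule, into FAT16 territory, which is exactly the edge case described above.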
Gene Cumm
2016-Feb-26 11:49 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
On Thu, Feb 25, 2016 at 7:59 PM, Pete Batard via Syslinux <syslinux at zytor.com> wrote:

> Hi Ady,
>
> Your insightful post prompted me to do a little bit more digging into how the Ridgecrop algorithm computes its FAT size, with the result of my investigations presented below.
>
> NB: For those who don't want to go through this whole part, there's a TL;DR near the end.

> *TL;DR*: The Ridgecrop FAT size computation algorithm is wrong, and, whether justified or not, the existing Syslinux check does catch FATs that are missing addressable sectors.
>
> I have now tested the new computation against a 320GB and a 1TB drive, and found that the original minfatsize check of Syslinux is no longer an issue.
>
> This being said, and to address Ady's subsequent point:
>
> While I can now address the issue in Rufus (and will contact Tom Thornhill of Ridgecrop to let him know about both the issue & fix), I suspect there are users out there who are using and will continue to use fat32format.exe with the bad computation algorithm, as well as other developers who might lift the existing Large FAT32 format code without realizing that doing so will break Syslinux installation. So it may still be worth relaxing the check, especially if, as Ady pointed out, not having all sectors addressable doesn't make a disk any less valid.

I think there may be another answer to this:

1) A tool to fix the broken FSs by "wasting" the high clusters of the file system, a non-destructive correction. As it stands, they're effectively wasted already, and this might risk a user thinking the file system isn't full when in fact the FAT itself is.

2) The recommendation to use the Syslinux installer's "-f" flag until the individual file system is corrected with the above tool, assuming "-f" circumvents this check. I've been tossing around the idea that a single "-f" might not be the best answer and that a longer option specifying a list of checks to skip might be more balanced, e.g. "-F minfatsize,othercheck".

3) Ensure that the Syslinux installers state which check failed, to assist users in correcting their file system, regardless of whether "-f" is specified. If "-f" is not specified, installation should fail completely (hopefully with 0 alterations). If "-f" is specified but not necessary (indicating a user/tool that isn't being cautious), consider throwing a warning while still allowing the install and returning a success to the caller of the installer.

> [1] https://github.com/pbatard/rufus/blob/ade5639c0047ee813f71a8bfef8b1cc7be551009/src/format.c#L349-L377
> [2] http://hjem.get2net.dk/rune_moeller_barnkob/filesystems/fat.html
> [3] http://pierrelib.pagesperso-orange.fr/filesystems/fat16.html

--
-Gene
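As an illustration of the kind of option suggested in point 2) above (a sketch of the idea only; neither this flag behavior nor these check names exist in the current Syslinux installers):

------------------------------------------------------------------------
/* Sketch only: parsing a hypothetical "-F check1,check2" skip list, as
 * suggested above. Neither the option nor the check names are real
 * Syslinux installer features; this just illustrates the proposal. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool skip_minfatsize = false;

static int parse_skip_checks(char *list)
{
    for (char *tok = strtok(list, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (strcmp(tok, "minfatsize") == 0) {
            skip_minfatsize = true;
        } else {
            fprintf(stderr, "Unknown check name: %s\n", tok);
            return -1;
        }
    }
    return 0;
}

int main(void)
{
    char arg[] = "minfatsize";   /* e.g. the argument of "-F minfatsize" */
    if (parse_skip_checks(arg) == 0)
        printf("skip_minfatsize = %d\n", skip_minfatsize);
    return 0;
}
------------------------------------------------------------------------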
Pete Batard
2016-Feb-26 12:51 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Ady,

I won't comment on the reasons why the original computation was wrong, but thanks for the detailed analysis.

On 2016.02.26 08:05, Ady via Syslinux wrote:
>> Thus we can finally get a formula for Fs that satisfies the above:
>>
>>     Fs = (To - Rs + Nf * Cs) / ((Ss * Cs / Fe) + Nf) + 1
>
> I believe such a formula is slightly inaccurate too.
>
> My resulting inequation is
>
>           ( To - Rs ) + ( 2 * Cs )
>     Fs >= ________________________
>            ( Ss * Cs / Fe ) + Nf

I believe you're right. I assumed that the 2 that eventually ends up in the numerator was the Number of FATs, but looking further, and especially at Wikipedia [1], I see that it is the number of special clusters, which is fixed to 2 regardless of the number of FATs:

"The first two entries in a FAT store special values: The first entry (cluster 0 in the FAT) holds the FAT ID (...) The second entry (cluster 1 in the FAT) nominally stores the end-of-cluster-chain marker (...)"

So, as you point out, this should be hardcoded to 2. Thanks for picking this up. It wouldn't have been a dramatic mistake, since I don't see any possibility of using a number of FATs that's anything but 2 in the context of Rufus, but we should indeed get our variables right.

> instead of _always_ adding "+1" (which would be incorrect and inefficient from the point of view of the resulting allocatable size).

I carefully considered this, and I dispute the fact that this is incorrect.

The problem is we're dealing with a fraction, which will be limited by the number of bits the computer uses to store the data *and might be rounded down behind the scenes*. E.g. unless you use a crazy number of bits to store your numbers, something like (10^100 + 1) / (10^100) will produce 1 and the modulo (10^100 + 1) % (10^100) will produce 0. So if you use the modulo to figure out whether you need to round up, you're going to miss some cases.

And while I'd like to believe that all of "roundup", "ceiling" and friends are smart enough to know the precision they're dealing with, and compensate accordingly, or, more realistically here, that the numbers we're dealing with in this case will never be large enough to test the limits of our precision (especially as our numerator and denominator are set to 64 bit, and our disks will never be larger than 2TB anyway, because FAT32), I'd rather not take any risks here, even more so as I am fixing code that was missing addressable sectors, and the last thing I'd want is to find out that, because of a dodgy rounding or a wrong assumption, we might still end up missing some sectors after all.

And yeah, we could add the actual roundup in the equations themselves (I actually did that when I was trying to figure out whether the original wrong computation was due to the roundup), but since the "only going to be *very slightly* wasteful in an exceedingly limited set of cases" +1 is a safe and sure way to address the rounding "issue", I have to admit that I'm not that interested in trying to come up with a more mathematically satisfying solution. Working and relatively efficient code is what I am after, and as far as I'm concerned, using +1 is both a "correct" and an "efficient" way of achieving that.

Still, I won't prevent you (or anybody else interested) from providing a proper formula if you want. ;)

Regards,

/Pete

[1] https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#Special_entries
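For reference, an exact ceiling division on the 64-bit integers involved here can be written without any precision concern, since integer division and modulo are exact for these types. A minimal sketch (not a proposed patch; the unconditional +1 simply trades at most one extra FAT sector for simplicity):

------------------------------------------------------------------------
/* Sketch of an exact ceiling division for the 64-bit quantities involved
 * here (not a proposed patch). Integer division and modulo are exact, so
 * no precision is lost; the '(n + d - 1) / d' variant is avoided because
 * it could overflow for numerators close to UINT64_MAX. */
#include <stdint.h>
#include <stdio.h>

static uint64_t ceil_div_u64(uint64_t n, uint64_t d)
{
    return n / d + (n % d != 0);
}

int main(void)
{
    /* Thread example: 195369615 / 8194 = 23843 remainder 73, ceiling = 23844.
     * The unconditional +1 gives the same value here; the two only differ
     * when the division is exact. */
    printf("%llu\n", (unsigned long long)ceil_div_u64(195369615, 8194));
    return 0;
}
------------------------------------------------------------------------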
Pete Batard
2016-Feb-26 13:10 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Gene,

On 2016.02.26 11:49, Gene Cumm wrote:
> I think there may be another answer to this:
>
> 1) A tool to fix the broken FSs by "wasting" the high clusters of the file system, a non-destructive correction. As it stands, they're effectively wasted already, and this might risk a user thinking the file system isn't full when in fact the FAT itself is.

I'm not exactly sure how that would work (how would you mark those clusters as wasted, when my understanding is that the FATs can't provide any knowledge about them in the first place?) and, unless it is automatically integrated and run during the Syslinux installation, it sounds quite inconvenient for users.

> 2) The recommendation to use the Syslinux installer's "-f" flag until the individual file system is corrected with the above tool, assuming "-f" circumvents this check. I've been tossing around the idea that a single "-f" might not be the best answer and that a longer option specifying a list of checks to skip might be more balanced, e.g. "-F minfatsize,othercheck".
>
> 3) Ensure that the Syslinux installers state which check failed, to assist users in correcting their file system, regardless of whether "-f" is specified. If "-f" is not specified, installation should fail completely (hopefully with 0 alterations). If "-f" is specified but not necessary (indicating a user/tool that isn't being cautious), consider throwing a warning while still allowing the install and returning a success to the caller of the installer.

Building on what you advocate above, I think I'd indeed prefer an installer that fails when -f isn't used, with an explicit message stating why (maybe with a computation of the number of sectors that cannot be addressed, so that users have a good idea of the wastage), along with an indication that they can use '-f' to bypass that check, should they want to.

Since I brought this whole thing up, and provided this is the course of action everybody agrees on, I *may* try to work on a patch that does this... bearing in mind that, as you know, the last time I said I'd work out some patches for Syslinux (this series), it took about 3 months before I posted anything. ;)

Regards,

/Pete