Pete Batard
2016-Feb-26 12:51 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
Hi Ady,

I won't comment on the reasons why the original computation was wrong, but thanks for the detailed analysis.

On 2016.02.26 08:05, Ady via Syslinux wrote:
>> Thus we can finally get a formula for Fs that satisfies the above:
>>
>> Fs = (To - Rs + Nf * Cs) / ((Ss * Cs / Fe) + Nf) + 1
>
> I believe such formula is slightly inaccurate too.
>
> My resulting inequation is
>
>        ( To - Rs ) + ( 2 * Cs )
> Fs >= __________________________
>        ( Ss * Cs / Fe ) + Nf

I believe you're right. I assumed that the 2 that eventually ends up in the numerator was the number of FATs, but looking further, and especially at Wikipedia [1], I see that it is the number of special clusters, which is fixed to 2 regardless of the number of FATs:

"The first two entries in a FAT store special values: The first entry (cluster 0 in the FAT) holds the FAT ID (...) The second entry (cluster 1 in the FAT) nominally stores the end-of-cluster-chain marker (...)"

So, as you point out, this should be hardcoded to 2. Thanks for picking this up. It wouldn't have been a dramatic mistake, since I don't see any possibility of using a number of FATs other than 2 in the context of Rufus, but we should indeed get our variables right.

> instead of _always_ adding "+1" (which would be
> incorrect and inefficient from the point of view of the resulting
> allocatable size).

I carefully considered this, and I dispute the claim that it is incorrect. The problem is that we're dealing with a fraction, which will be limited by the number of bits the computer uses to store the data *and might be rounded down behind the scenes*. E.g. unless you use a crazy number of bits to store your numbers, something like (10^100 + 1) / (10^100) will produce 1, and the modulo (10^100 + 1) % (10^100) will produce 0. So if you use the modulo to figure out whether you need to round up, you're going to miss some cases.

And while I'd like to believe that all of "roundup", "ceiling" and friends are smart enough to know the precision they're dealing with and compensate accordingly, or, more realistically here, that the numbers we're dealing with in this case will never be large enough to test the limits of our precision (especially since our numerator and denominator are set to 64 bit, and our disks will never be larger than 2TB anyway, because FAT32), I'd rather not take any risks here. Even more so as I am fixing code that was missing addressable sectors, and the last thing I'd want is to find out that, because of a dodgy rounding or a wrong assumption, we might still end up missing some sectors after all.

And yeah, we could add the actual roundup in the equations themselves (I actually did that when I was trying to figure out whether the original wrong computation wasn't due to the roundup), but since the "only going to be *very slightly* wasteful in an exceedingly limited set of cases" +1 is a safe and sure way to address the rounding "issue", I have to admit that I'm not that interested in trying to come up with a more mathematically satisfying solution. Working and relatively efficient code is what I am after, and as far as I'm concerned, using +1 is both a "correct" and an "efficient" way of achieving that.

Still, I won't prevent you (or anybody else interested) from providing a proper formula if you want. ;)

Regards,

/Pete

[1] https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#Special_entries
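For illustration, the computation discussed above might look like the following C sketch. This is hypothetical code, not the actual Rufus implementation (see the format.c link later in the thread for that); the function name min_fat_size is made up, and 64-bit unsigned operands are assumed, as per Pete's remark about the numerator and denominator:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch of the minimum FAT size computation discussed
 * above, using the thread's variable names: To = total sectors,
 * Rs = reserved sectors, Cs = sectors per cluster, Ss = bytes per
 * sector, Nf = number of FATs, Fe = bytes per FAT entry (4 for FAT32).
 * The unconditional "+ 1" rounds the integer division up, at the cost
 * of occasionally allocating one extra FAT sector. */
static uint64_t min_fat_size(uint64_t To, uint64_t Rs, uint64_t Cs,
                             uint64_t Ss, uint64_t Nf)
{
    const uint64_t Fe = 4;  /* FAT32 entry size, in bytes */
    return ((To - Rs) + 2 * Cs) / ((Ss * Cs / Fe) + Nf) + 1;
}

int main(void)
{
    /* Example: a 2TB volume (0xFFFFFFFF sectors of 512 bytes),
     * with 32 reserved sectors, 64 sectors per cluster, 2 FATs. */
    printf("%llu\n", (unsigned long long)min_fat_size(0xFFFFFFFFULL, 32, 64, 512, 2));
    return 0;
}

Note that with pure 64-bit unsigned integers, the classic ceiling idiom q = a / b + (a % b != 0) would also be exact; the unconditional "+ 1" simply trades an occasional extra FAT sector for the robustness against rounding surprises that Pete argues for.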
Ady
2016-Feb-26 15:32 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
> > instead of _always_ adding "+1" (which would be
> > incorrect and inefficient from the point of view of the resulting
> > allocatable size).
>
> I carefully considered this, and I dispute the claim that it is incorrect.

(snip)

> Still, I won't prevent you (or anybody else interested) from providing
> a proper formula if you want. ;)
>
> Regards,
>
> /Pete

You know I am not a developer, so I cannot comment on the difficulties of achieving a more accurate calculation using computer code. As I mentioned in my previous email, I am not saying that some / any code is "wrong"; I am only presenting the math as accurately as I can.

Regarding the "+1", when talking about the math (not the computer code to achieve a result as accurate as it can be), I already expressed the difference between using math operators such as "integer", "quotient", "modulo", "remainder"... as opposed to just using "+1". Clearly, if the result is already an integer (in the math, not in whichever computer code), then adding "+1" to that value is incorrect. In fact, the same goes for a fraction: adding "+1" to such a value will also result in a non-integer value. Once again (sorry), I am talking math, not computer code.

As for the "big" numbers and accuracy, again, I cannot comment on computer code. I do want to remind you that, besides the "generic" determination of an "adequate" minfatsize in Syslinux's code, we should at least test the values that are around or close to the limits between FAT12/16/32. For simplicity, I'll give basic examples (which I have not truly analyzed; developers could / should test the relevant code).

Example A:
Volume_Total_Sectors: 66'463
FAT type: FAT32
Bytes_per_sector: 512
Sectors_per_Cluster: 1
Sectors_per_FAT: 511 ---> this value should _not_ be accepted as a valid minfatsize; 512 should be OK.

Example B:
Volume_Total_Sectors: 4'187'104
FAT type: FAT32
Bytes_per_sector: 512
Sectors_per_Cluster: 64
Sectors_per_FAT: 511 ---> this value should _not_ be accepted as a valid minfatsize; 512 should be OK.

I have not actually searched for potentially conflicting values; the above 2 examples are just "around" the lower size limit of FAT32.

Let me put the example(s) in other words. Whichever calculation of minfatsize you use, values such as a Volume_Total_Sectors around:

65'518 ... 66'463

should be tested (among others; I could provide more ranges for testing). Such values are close to the lower limit of FAT32, and, depending on additional fields, they can be problematic (and at least some of them should be rejected). I am not saying that these are typical values for FAT32 volumes that users would see / use in real life (but they might); this range is just a source for testing either the acceptance or rejection of a FAT32 volume (in Syslinux's code). BTW, please note that these values are not really "big" (although, for computer code, there might still be an "accuracy" problem; I wouldn't know).

I have not provided more details (nor performed actual analysis) regarding these values (or some others), because at this time I fear I could just be wasting my time on math (since I am not a developer who can actually translate it into useful code). If there were to be any practical development in the code, I would be willing to invest my time too (but I would like to know it before I decide to use my time in such an endeavour :).

At any rate, I certainly appreciate the people that have already dedicated their time, knowledge, resources, skills... to this email thread and to this set of patches.

Regards,
Ady.
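For what it's worth, Ady's two examples can be checked with a small test harness like the following sketch. This is hypothetical code, not Syslinux's; since the examples do not specify them, Rs = 32 reserved sectors and Nf = 2 FATs are assumed here:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical check of whether a candidate Sectors_per_FAT value (Fs)
 * is large enough, following the sizing logic discussed in this thread:
 * one FAT must hold one entry per data cluster, plus the 2 special
 * entries. To = total sectors, Rs = reserved sectors, Cs = sectors per
 * cluster, Ss = bytes per sector, Nf = number of FATs. */
static int fat_size_is_sufficient(uint64_t To, uint64_t Rs, uint64_t Cs,
                                  uint64_t Ss, uint64_t Nf, uint64_t Fs)
{
    const uint64_t Fe = 4;                         /* FAT32 entry size */
    uint64_t clusters = (To - Rs - Nf * Fs) / Cs;  /* data clusters */
    uint64_t entries  = Fs * Ss / Fe;              /* entries one FAT holds */
    return entries >= clusters + 2;                /* +2 special entries */
}

int main(void)
{
    /* Example A: To = 66463, Cs = 1 */
    printf("A, Fs=511: %d\n", fat_size_is_sufficient(66463, 32, 1, 512, 2, 511));    /* 0 */
    printf("A, Fs=512: %d\n", fat_size_is_sufficient(66463, 32, 1, 512, 2, 512));    /* 1 */
    /* Example B: To = 4187104, Cs = 64 */
    printf("B, Fs=511: %d\n", fat_size_is_sufficient(4187104, 32, 64, 512, 2, 511)); /* 0 */
    printf("B, Fs=512: %d\n", fat_size_is_sufficient(4187104, 32, 64, 512, 2, 512)); /* 1 */
    return 0;
}

Under those assumptions, both examples come out as Ady predicts: 511 sectors per FAT leaves the FAT short of entries, while 512 is sufficient.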
Pete Batard
2016-Feb-26 16:37 UTC
[syslinux] [PATCH 1/5] fat: fix minfatsize for large FAT32
On 2016.02.26 15:32, Ady via Syslinux wrote:
> Regarding the "+1", when talking about the math (not the computer code
> to achieve a result as accurate as it can be)

Well, sorry, but I will not dissociate the context of application from the formula itself. I thought this was implied in what I wrote, which was in the context of fixing a computation algorithm bug. The sole interest I (and, I posit, others should) have in the formula is how it applies to computer code, as we are not trying to write a mathematical proof in the absolute here, but instead code that meets the requirement of using all potentially addressable sectors. *This* is the only context of correctness that should matter. Anything else, as interesting as it may be, is sidetracking and not something I personally want to participate in.

> we should
> at least test the values that are around or close to the limits between
> FAT12/16/32.

There I'm going to reduce the context even further: in the context where this algorithm is applied in Rufus, we don't care about non-FAT32 computations, or about ones performed for disks that are less than 32 GB in size (for which the sectors per cluster will never be 1, even with 4K sectors, as our minimal cluster size is 8K as soon as we reach 31GB).

That's not to say that I wouldn't pick up improvements to the new computation algorithm [1] when low values are used, if such improvements are actually needed, and if *somebody else* kindly wants to figure them out, as there is always the possibility that other developers might re-use the code from Rufus and try to apply it to <31GB drives. But, in the context I am concerned with, I don't see any potential issue on the horizon from not performing that homework.

So, there again, I will simply say thanks for bringing attention to these matters, while at the same time mentioning that I'm simply gonna pass on those, since they appear irrelevant with regard to the context that matters in my application...

Regards,

/Pete

[1] https://github.com/pbatard/rufus/blob/b9caf8b6058de12bf028f907471561a6aa50f7e9/src/format.c#L349-L366
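For illustration, the scope restriction Pete describes might be expressed as a guard like the following sketch. The function name and the exact form of the check are made up for this example; Rufus' actual cluster-size selection lives in the format.c link above:

#include <stdint.h>

/* Illustrative sketch only: in the context Pete describes, the FAT32
 * sizing code is only reached for drives of 31GB or more, where the
 * minimal cluster size is 8K. So even with 4K sectors, sectors per
 * cluster is at least 2, never 1. */
static int fat32_sizing_applies(uint64_t drive_bytes, uint64_t cluster_bytes)
{
    const uint64_t GB = 1000ULL * 1000ULL * 1000ULL;
    return drive_bytes >= 31 * GB && cluster_bytes >= 8192;
}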