Caution, tutorial about ufs/ffs fragmentation, space
and time optimization ahead ... :-)
Oleg Gritsak wrote:
> I'm just curious about some possible mismatch in between
> documentation and reallife OS behaviour... Noticed this thing for
> more than two years ago in 4.X and now seing this in 6.2...
I agree that the description in the manual pages is
oversimplifying and slightly inaccurate.
> It is said in "man newfs" and "man tunefs" that threshhold for online
> optimization (space or time) is 8 percent.
It's more complex than that. There is no simple
threshold, but a hysteresis which is a function of
the "minfree" value (the -m option to newfs(8) and
tunefs(8)) and the current fragmentation of the
file system.
If the fragmentation grows beyond minfree-2 percent
(i.e. beyond 6% for the default minfree value of 8),
the file system switches to space optimization in
order to reduce fragmentation, or at least avoid
further fragmentation.
If the fragmentation drops below half of the minfree
value (i.e. 4% for the default case), it switches
back to time optimization.
Within the hysteresis interval (i.e. 4% to 6% in the
default case), you can change the optimization with
tunefs -o. Outside that interval, the file system
selects the optimization automatically whenever it
needs to allocate a new block during a write
operation, overriding the tunefs setting.
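
To make the hysteresis a bit more tangible, here is a minimal
sketch in Python. It only models the rules as described above and
is not the kernel code; the function name and the mode strings are
made up for illustration.

```python
def next_optimization(current, frag_pct, minfree_pct=8):
    """Return the optimization mode after one allocation check.

    Sketch of the hysteresis described in the text (not kernel code):
    - fragmentation beyond minfree - 2 percent: switch to "space"
    - fragmentation below minfree / 2 percent:  switch to "time"
    - anywhere in between: keep the current setting
    """
    upper = minfree_pct - 2   # 6 for the default minfree of 8
    lower = minfree_pct / 2   # 4 for the default minfree of 8
    if frag_pct > upper:
        return "space"
    if frag_pct < lower:
        return "time"
    return current


# Walk a file system that starts in "time" mode through rising
# and then falling fragmentation.
mode = "time"
for frag in (0.8, 5.0, 7.5, 5.0, 3.0):
    mode = next_optimization(mode, frag)
    print(f"{frag:4.1f}% -> {mode}")
```

Note how 5.0% yields "time" on the way up but "space" on the way
down: inside the interval the previous setting sticks.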
> But actually, FreeBSD switches to
> SPACE far more earlier (or at least reports to system message buffer).
Yes, it depends on the fragmentation, as explained above.
> Does it have any sense? As also noted in "man newfs", the performance
> while optimizing for space fragmentation is reduced. So, why FreeBSD does
> this when file system is for example 50% empty and has 4-5GBs of free
> space?
That can happen if the file system is heavily
fragmented. If you need to avoid it, there are
several possibilities.
First, during newfs, you could set fsize == bsize
(e.g. both 16K). If a fragment is the same size
as a whole block, fragmentation is always 0%.
However, you will possibly waste some space because
a fragment is the smallest allocation unit. But
disks are cheap nowadays ...
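
To get a feeling for the space cost, here is a rough sketch of the
arithmetic. It assumes a simplified model in which a file simply
occupies a whole number of fragments; indirect blocks and other
metadata are ignored, and the 3000-byte file is a made-up example.

```python
def allocated_bytes(file_size, fsize):
    """Bytes occupied by a file when space is handed out in units
    of fsize (simplified: no indirect blocks, no metadata)."""
    fragments = -(-file_size // fsize)   # integer ceiling division
    return fragments * fsize


file_size = 3000   # hypothetical small file
for fsize in (2048, 16384):
    used = allocated_bytes(file_size, fsize)
    print(f"fsize={fsize}: {used} bytes allocated, "
          f"{used - file_size} wasted")
```

With 2K fragments the file takes 4096 bytes, with fsize == bsize
== 16K it takes a full 16384 bytes. The waste is per file, so it
matters mostly if you have many small files.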
Second, you could increase the minfree value with
tunefs -m. For example, set it to 25%, so the
hysteresis grows to cover your current fragmentation.
Then use tunefs -o to manually set the optimization
back to time. The obvious disadvantage is that a
larger part of the file system (25%) is reserved
and cannot be used by non-root users, i.e. some
space might be wasted. But, as above, disks are
cheap nowadays ...
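
If you want to know how far you would have to raise minfree, the
following sketch does the arithmetic. It assumes the upper
threshold is minfree - 2 percent, as described earlier, and picks
the smallest integer value that lies strictly above your current
fragmentation; the function name is hypothetical.

```python
import math


def smallest_minfree_for(frag_pct):
    """Smallest integer minfree (in percent) whose upper threshold,
    minfree - 2, lies strictly above the given fragmentation."""
    return math.floor(frag_pct + 2) + 1


for frag in (5.5, 20.0):
    minfree = smallest_minfree_for(frag)
    print(f"fragmentation {frag}% -> minfree of at least {minfree}%")
```

In practice you would probably add some headroom on top of that
value, because the fragmentation may keep growing.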
However, note that a heavily fragmented file system
can theoretically run out of allocatable free space,
even if it has plenty of free space -- if that "free
space" consists only of unused parts of fragmented
blocks. It can happen in exceptional circumstances.
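
A toy model may help to see how that can happen. The sketch below
represents a file system as a list of per-block counts of free
fragments (8 fragments per block, as with 16K blocks and 2K
fragments). The numbers are invented and the model is far simpler
than the real allocator.

```python
FRAGS_PER_BLOCK = 8   # e.g. bsize 16384 / fsize 2048


def free_space_summary(blocks):
    """blocks: list with the number of free fragments (0..8) in
    each block. Returns (total free fragments, whole free blocks)."""
    total_free = sum(blocks)
    whole_blocks = sum(1 for free in blocks if free == FRAGS_PER_BLOCK)
    return total_free, whole_blocks


# Hypothetical worst case: 1000 blocks, each with 3 of 8 fragments free.
blocks = [3] * 1000
total_free, whole_blocks = free_space_summary(blocks)
percent_free = total_free * 100 / (len(blocks) * FRAGS_PER_BLOCK)
print(f"{percent_free}% of the space is free")
print(f"{whole_blocks} whole blocks can be allocated")
```

Here 37.5% of the space is free, yet not a single full block can
be allocated.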
The purpose of switching to space optimization is to
avoid such a situation. Therefore, to answer your
question "Does it have any sense?": Yes, it does.
By the way, the current fragmentation is reported by
fsck during boot ("dmesg -a | grep fragm" if it is
still in your kernel message buffer). Otherwise,
type "dumpfs <your-file-system> | head" and look
for the "blocks" and "nffree" values. The current
fragmentation is nffree as a percentage of the
total blocks, i.e. nffree * 100 / blocks. For
example, this is the output from one of my file
systems:
$ dumpfs /dev/ad0s1f | head
magic 19540119 (UFS2) time Thu Apr 26 09:40:19 2007
superblock location 65536 id [ 42d80392 3470461f ]
ncg 398 size 37389708 blocks 36211584
bsize 16384 shift 14 mask 0xffffc000
fsize 2048 shift 11 mask 0xfffff800
frag 8 shift 3 fsbtodb 2
minfree 8% optim time symlinklen 120
maxbsize 16384 maxbpg 2048 maxcontig 8 contigsumsize 8
nbfree 973428 ndir 48445 nifree 8879640 nffree 290762
bpg 11761 fpg 94088 ipg 23552
You see that blocks is 36211584 and nffree is 290762,
so the current fragmentation is 0.80%. Also, the
current optimization is reported in the "optim"
field on the "minfree" line ("time" in this case).
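
If you don't want to do the division by hand, a small script can
pull the two values out of the dumpfs output. This is only a
sketch: it assumes the "name value" layout shown above and takes
the first occurrence of each field; the helper names are made up.

```python
import re


def first_field(name, text):
    """Return the first integer that follows the given field name."""
    match = re.search(rf"\b{name}\s+(\d+)", text)
    if match is None:
        raise ValueError(f"field {name!r} not found")
    return int(match.group(1))


def fragmentation_percent(dumpfs_text):
    """Compute nffree * 100 / blocks from dumpfs superblock output."""
    nffree = first_field("nffree", dumpfs_text)
    blocks = first_field("blocks", dumpfs_text)
    return nffree * 100 / blocks


# Two lines copied from the dumpfs output above.
sample = """\
ncg 398 size 37389708 blocks 36211584
nbfree 973428 ndir 48445 nifree 8879640 nffree 290762
"""
print(f"{fragmentation_percent(sample):.2f}%")   # prints 0.80%
```

On a real system you could feed it the text produced by
"dumpfs <your-file-system> | head".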
Best regards
Oliver
--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsführung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd
"What is this talk of 'release'? We do not make software
'releases'. Our software 'escapes', leaving a bloody trail of designers
and quality assurance people in its wake."