Question: Does ext2/3 (or any other filesystem you know of) support
a system call turning blocks within a file back into "sparse zeros",
i.e. giving the blocks back to the filesystem?
Background:
I am working on a slotted fileformat where internal fragmentation
occurs. One such occurrence is growth of the data in a given slot,
which currently requires me to handle the fragmentation explicitly.
For example:
...==|== slot 1 ==|=== slot 2 ===|==...
Now assume that contents of slot 1 is replaced with a larger chunk of
data. I must either append additional data e.g. at the end of the file
...==|== slot 1a =|=== slot2 ===|==...==|=== slot1b ==|
(and add my own data structures and code infrastructure to read
fragmented slots) or leave the old (defunct) slot 1 data in place
and garbage collect it later:
...==|= deadbeef =|=== slot2 ===|==...==|====== slot1' =====|
It's my impression that mechanisms for handling similar types of
fragmentation is already implemented quite well in most modern
filesystems, and hence I was wondering:
Question: Does ext2/3 (or any other filesystem you know of) support
turning blocks within a file back into "sparse zeros", i.e. giving
the blocks back to the filesystem?
If that was the case I could simply free the disk blocks belonging
entirely to slot1 (as in turning it into zeros in a sparse file) and
append the new data at the end:
...==|XXXXXXXXXXXX|=== slot2 ===|==...==|===== slot1' =====|
and in effect having the file system do my garbage collection for me.
I would (probably very naively) think that this should be possible and
cheap since it only involves by manipulating trees/freelists/whatever
and perhaps "massaging" a small number of actual data blocks.
(I should mention that my slots are typically much larger than a
disk block 100kB-1MB, say)
Simple example: On a file system supporting sparse files, the following
fd = open("sparse1",w");
write(fd, buf, 10);
lseek(fd,1000000,SEEK_CUR);
write(fd, buf, 10);
will create a file occupying a small number of
disk blocks. Ideally I would like to be able to do
the following
fd = open("sparse2",w");
write(fd, buf, 1000020);
lseek(fd,10,SEEK_SET);
giveback(fd, 1000000, SEEK_CUR);
lseek(fd,1000010,SEEK_SET);
write(fd, buf, 10);
and end up with a sparse2 not much larger than
sparse1. "giveback" is my imaginary system call
that tells the file system that n bytes starting
at a given offset should no longer be considered part
of the file and the associated blocks given back to the
file system's freelist. I realize that this might not be
possible currently if the chunk you wish to free is
not aligned with the start of a block etc. etc.
A "cruder" interface, just giving whole disk blocks back
would be acceptable, though.
Your comments would be much appreciated,
Rasmus Munk Larsen, Stanford University.