On Jan 09, 2006 18:45 -0700, Kumaran Rajaram wrote:
> On an MDS node, what's the definition of the following /proc fs variables:
>
> i) filestotal
> ii) filesfree
> iii) kbytesavail
> iv) kbytesfree
> v) kbytestotal
They are basically what you would assume, with some caveats below.
> 1). I assumed filestotal is the grand total number of inodes that can be
> created and filesfree is the remaining/available inodes (such that filesfree
> < filestotal)
>
> On an Empty Lustre FS:
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filestotal
> 214586
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filesfree
> 214552
>
> When mounting the MDS via ext3, I see 47 files while according to procfs
> there are a total of 34 files (214586-214552)
>
> n1:/mnt/scratch # ls -1aR /mnt/scratch/ | wc -l
> 47
The difference between filestotal and filesfree should in fact reflect the
number of files currently in use in the filesystem. One complicating factor
is that ext3 reserves some inodes for internal use, and Lustre also uses
some inodes internally.
> I created 100 files, and surprisingly the filestotal seems to increase and
> filesfree remains constant
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filestotal
> 214686
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filesfree
> 214552
The reason that filestotal is increasing while filesfree remains constant
is that filesfree is the "pessimistic" (minimum) number of files that can
be created in the MDS filesystem. The pessimistic estimate assumes that
each file needs to store an EA in an external block, so it is limited by
the amount of free space in the filesystem: kbytesfree / 4 (the blocksize
is 4096 bytes), so in your case 858208 / 4 = 214552.
Because the number of inodes is actually slightly more than this (one
inode per 4096 bytes of kbytestotal, so 874872 / 4 = 218718), there is a
period when the filesystem is first being used during which there are
actually more free inodes than free blocks. Once you are past this
threshold, filestotal and filesfree will be the "real" numbers.
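The arithmetic above can be sketched in a few lines of Python. This is only
an illustration, not the actual MDS code: the function names are made up, and
taking the minimum of the inode limit and the block limit is an assumption
based on the description above.

```python
BLOCK_KB = 4  # 4096-byte blocks, as stated above


def block_limited_files(kbytesfree, block_kb=BLOCK_KB):
    """Files that could be created if each one needed an external EA block."""
    return kbytesfree // block_kb


def pessimistic_filesfree(free_inodes, kbytesfree):
    # Assumption: the reported value is the smaller of the two limits.
    return min(free_inodes, block_limited_files(kbytesfree))


kbytestotal = 874872  # values from the procfs output quoted in this thread
kbytesfree = 858208

block_limit = block_limited_files(kbytesfree)  # 858208 / 4
inode_count = kbytestotal // BLOCK_KB          # 874872 / 4

print(block_limit, inode_count)  # 214552 218718

# Using the inode count as an upper bound on free inodes, the block limit wins.
print(pessimistic_filesfree(inode_count, kbytesfree))  # 214552

# In-use count implied by procfs before and after creating 100 files.
print(214586 - 214552, 214686 - 214552)  # 34 134
```

The last line shows why filestotal appears to grow: while filesfree is pinned
at the block limit, the difference between the two still tracks the number of
files in use.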
> 2). With respect to kbytesfree and kbytesavail, I saw previous discussions.
> Guess it's related to Lustre as a whole
>
> https://lists.clusterfs.com/pipermail/lustre-discuss/2005-July/000818.html
> How about the same parameters with respect to the MDS? The variables seem to
> have the same values before and after the creation of 100 files.
> ----------
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/kbytestotal
> 874872
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/kbytesfree
> 858208
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/kbytesavail
> 808208
Because ext3 is formatted so that "default" striped files (as defined by
--stripe_cnt at Lustre format time) should fit into the "fast" EA space of
a large inode, they do not normally take up any filesystem space. What DOES
take up space is wide-striped files and directories, other EAs (in the
upcoming 1.4.6 it is possible to store arbitrary EAs on inodes), and Lustre
internal files, logs, etc.
> n1:~ # dumpe2fs /datadir/metadata_scratch | grep ^Inode | grep size
> dumpe2fs 1.34 (25-Jul-2003)
> Inode size: 512
If you had done "dumpe2fs /datadir/metadata_scratch | grep -i inode" you
would have gotten the "real" numbers for filestotal and filesfree
("Inode count:" and "Free inodes:").
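If you want those two fields programmatically, something like the following
Python sketch would pull them out of saved dumpe2fs output. The function name
and the sample text are hypothetical: the "Free inodes" value below is made
up for illustration, and only the field labels and the 512-byte inode size
come from this thread.

```python
import re


def parse_inode_counts(dumpe2fs_text):
    """Return the 'Inode count' and 'Free inodes' header fields as ints."""
    counts = {}
    for line in dumpe2fs_text.splitlines():
        # Anchored at line start so per-group "free inodes" lines are skipped.
        m = re.match(r"^(Inode count|Free inodes):\s+(\d+)\s*$", line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


# Illustrative sample only; not real output from the filesystem discussed here.
sample = """\
Inode count:              218718
Free inodes:              218584
Inode size:               512
"""

counts = parse_inode_counts(sample)
print(counts["Inode count"] - counts["Free inodes"])  # 134 inodes in use
```

In practice you would feed it the text captured from running dumpe2fs on the
MDS device rather than a hard-coded string.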
> I assumed since the default inode size is 512 bytes, creating 100
> files would decrease the kbytesavail and kbytesfree on the MDS (since
> 512 bytes is consumed per inode with default striping).
Ext3 preallocates all of the inodes at mkfs time. This is one reason
that ext3 has very good fsck support, because the inodes (and most other
fs metadata, except directories) are in a fixed location and the kernel
knows it should never overwrite the metadata, and e2fsck knows where to
find it regardless of other filesystem corruption.
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.