Hi all,

I'm still in trouble with numbers: the available, used and necessary space on my MDT:
According to "lfs df", I have now filled my file system with 115.3 TB. All of these files are sized 5 MB. That should be roughly 24 million files.
For the MDT, "lfs df" reports 28.2 GB used.

Now I believed that creating a file on Lustre means using one inode on the MDT. Since all of my Lustre partitions were formatted with the default options (all of this is running Lustre v. 1.6.4.3, btw), an inode should eat up 4 kB on the MDT partition. Of course, 24 million files times 4 kB gives you 91 GB rather than 28 GB.
Obviously, there is something I missed completely. Perhaps somebody could illuminate me here?

This issue could also be phrased as "How large should my MDT be to accommodate n TB storage space?" The manual's answer boils down to "= number of files * 4 kB" (*2 per recommendation). That's how I calculated above - maybe my test system is broken? I can't check on the content of these files, it's just 5 MB test files created with the 'stress' utility.

Thanks and regards,
Thomas
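For readers who want to retrace the estimate in the message above, here is a minimal Python sketch of the same arithmetic. It is not part of the original post. It assumes that "TB", "MB" and "kB" are binary units (powers of 1024), and all variable names are illustrative.

```python
# Retrace the MDT size estimate from the message above.
# Assumption: all units are binary (1 kB = 1024 bytes), as "lfs df" reports 1K-blocks.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

data_bytes = 115.3 * TIB          # space used on the OSTs, per "lfs df"
file_size = 5 * MIB               # size of each test file
guideline_per_file = 4 * KIB      # the manual's 4 kB-per-file rule of thumb

# Number of files implied by the used space.
num_files = data_bytes / file_size

# MDT space the 4 kB guideline would predict, in GiB.
estimated_gib = num_files * guideline_per_file / GIB

print(f"files:            {num_files / 1e6:.1f} million")
print(f"guideline (4 kB): {estimated_gib:.1f} GiB")
print(f"reported used:    28.2 GiB")
```

The gap between the guideline figure and the reported 28.2 GB is the discrepancy the question is about.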
Thomas Roth wrote:
> Now I believed that creating a file on Lustre means using one inode on
> the MDT. Since all of my Lustre partitions were formatted with the
> default options (all of this is running Lustre v. 1.6.4.3, btw), an
> inode should eat up 4kB on the MDT partition. Of course, 24 million
> files times 4 kB gives you 91 GB rather than 28GB.
> [...]
> This issue could also be phrased as "How large should my MDT be to
> accommodate n TB storage space?" The manual's answer boils down to "=
> number of files * 4 kB " (*2 per recommendation).

The size of the MDS inode depends on the number of stripes in the file: 4.5k is the maximum, 512 bytes the minimum. So we advise using 4k as an estimate, as that will cover the vast majority of cases, but actual use in almost all situations will be smaller than 4k.

cliffw

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
On May 13, 2008 19:41 +0200, Thomas Roth wrote:
> For the MDT, "lfs df" reports 28.2 GB used.
> [...]
> Of course, 24 million files times 4 kB gives you 91 GB rather than 28GB.
> Obviously, there is something I missed completely. Perhaps somebody
> could illuminate me here?

Please provide output of "lfs df" and "lfs df -i" so we can see the actual numbers.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Andreas Dilger wrote:
> Please provide output of "lfs df" and "lfs df -i" so we can see the
> actual numbers.

Ok, I'm attaching these two files with the output. You'll notice some OSTs with a lower fill level: these OSS had some hardware problems during the production of my (now counted) 22.7 million files.

Greetings,
Thomas

-------------- next part --------------
Name: output_lfs_df
Url: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20080514/0b675bb0/attachment-0002.ksh
-------------- next part --------------
Name: output_lfs_df-i
Url: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20080514/0b675bb0/attachment-0003.ksh
On May 14, 2008 09:41 +0200, Thomas Roth wrote:
> Ok, I'm attaching these two files with the output. You'll notice some OSTs
> with a lower fill level: these OSS had some hardware problems during the
> production of my (now counted) 22.7 million files.
>
> Greetings,
> Thomas
>
> # lfs df
> UUID                      1K-blocks         Used    Available  Use% Mounted on
> gsilust1-MDT0000_UUID     495497804     29575416    465922388    5% /lustre[MDT:0]
> filesystem summary:    137454163496 123851912300  13602251196   90% /lustre
>
> # lfs df -i
> UUID                         Inodes        IUsed        IFree IUse% Mounted on
> gsilust1-MDT0000_UUID     141590528     22816486    118774042   16% /lustre[MDT:0]
> filesystem summary:       141590528     22816486    118774042   16% /lustre

I finally understand your question now. To clarify, with ext3 (ldiskfs) the inodes are preallocated when the filesystem is formatted, and are counted as part of the filesystem "overhead", not in the "used" space. Creating and deleting files in ext3 doesn't "consume" any space for the inode, only for the directory entries. The 4kB/inode guideline is to ensure that there is enough space in the MDS for the "overhead" parts of the filesystem.

The default is 512-byte inodes on the MDT, so this gives an overhead of 141590528 * 512 = 67GB, which is not in the "Used" space.

There is also directory overhead of somewhere around (12 + filename length) * num_files * 2, where the factor of 2 covers directory block overhead. I'll guess 16-byte filenames, so this gives 22816486 * (12 + 16) * 2 = about 1.2GB at the minimum for the files you have now, or 141590528 * (12 + 16) * 2 = about 7.5GB if every inode were in use.

There is additionally MDS log file overhead in order to keep the distributed filesystem sane in the face of a crash, and the ext3 journal (400MB).

If you have longer filenames, or small directories, or are striping files over many OSTs, then using 29GB on the MDT doesn't seem outrageous. The MDT is using 5% of the space and 16% of the inodes, so there isn't much to worry about as far as space consumption goes. In your case, you are using 1327 bytes per inode in use on average.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
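The figures in this reply can be checked against the quoted "lfs df" output with a short Python sketch. This is only an illustration, not something posted in the thread. The 512-byte inode size, the 12-byte entry overhead, the guessed 16-byte filenames and the factor of 2 are taken from the reply; the variable names are made up here, and "GB" is assumed to mean 1024^3 bytes.

```python
# Check the MDT overhead figures against the quoted "lfs df" / "lfs df -i" output.
# Assumption: "GB" in the reply means GiB (1024**3 bytes).
GIB = 1024 ** 3

mdt_used_kb = 29575416         # "Used" column for the MDT, in 1K-blocks
mdt_inodes_total = 141590528   # "Inodes" column for the MDT
mdt_inodes_used = 22816486     # "IUsed" column for the MDT

INODE_SIZE = 512               # bytes, default MDT inode size per the reply
DIRENT_FIXED = 12              # bytes of fixed per-entry overhead per the reply
FILENAME_LEN = 16              # guessed filename length per the reply

# Preallocated inode tables: filesystem overhead, not counted as "Used".
inode_table_bytes = mdt_inodes_total * INODE_SIZE

# Directory entries, doubled for directory block overhead.
dir_bytes_now = mdt_inodes_used * (DIRENT_FIXED + FILENAME_LEN) * 2
dir_bytes_full = mdt_inodes_total * (DIRENT_FIXED + FILENAME_LEN) * 2

# Average "Used" space per inode that is actually in use.
bytes_per_used_inode = mdt_used_kb * 1024 / mdt_inodes_used

print(f"inode tables:            {inode_table_bytes / GIB:.1f} GiB")
print(f"directories (now):       {dir_bytes_now / GIB:.1f} GiB")
print(f"directories (all used):  {dir_bytes_full / GIB:.1f} GiB")
print(f"used bytes per inode:    {bytes_per_used_inode:.0f}")
```

Swapping in the inode counts and "Used" value from another MDT gives the same breakdown for that filesystem, under the same assumptions about inode size and filename length.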