Sebastian Gutierrez
2008-Dec-10 20:18 UTC
[Lustre-discuss] Performance issues in directories after large number of I/O operations on small files.
Hello,

I have a user who keeps running into problems after performing a large number of I/O operations in a directory. If he moves, deletes, and creates files in a directory under his Lustre working directory, the directory eventually becomes unmanageably slow. He is working with a large number of small files.

Here is an example: a copy of the directory holds exactly the same data, but its directory metadata is much smaller.

drwxr-sr-x 3 user group     4096 Dec 3 17:26 StatisticsQualScore
drwxr-sr-x 3 user group 12623872 Dec 3 15:59 StatisticsQualScore_old

After the directory is copied, operations in it speed up again.

Is this happening because of an LRU sizing issue, or could something else be causing it? We currently have a workaround, but I would like to understand the problem better and learn of any solution or option I can implement.

Thank you,
Sebastian
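The size column in the listing above is the size of the directory itself, not of the files in it. A minimal sketch for comparing that size with the number of entries the directory currently holds; the function name `directory_footprint` is made up for illustration and is not part of any Lustre tool:

```python
import os


def directory_footprint(path):
    """Return (dir_size_bytes, entry_count, bytes_per_entry) for one directory."""
    # st_size of a directory is the space used by the directory itself,
    # i.e. the number that "ls -ld" prints in the size column.
    dir_size = os.stat(path).st_size

    # Count the entries currently present ("." and ".." are not included).
    with os.scandir(path) as entries:
        entry_count = sum(1 for _ in entries)

    # A very large value here means the directory is much bigger than
    # its current contents need.
    per_entry = dir_size / entry_count if entry_count else float("inf")
    return dir_size, entry_count, per_entry


if __name__ == "__main__":
    import sys

    for target in sys.argv[1:]:
        size, count, ratio = directory_footprint(target)
        print(f"{target}: {size} bytes, {count} entries, {ratio:.1f} bytes/entry")
```

Running it against both the old and the copied directory should show whether the old one is far larger than its current entry count justifies.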
Brian J. Murrell
2008-Dec-11 13:19 UTC
[Lustre-discuss] Performance issues in directories after large number of I/O operations on small files.
On Wed, 2008-12-10 at 12:18 -0800, Sebastian Gutierrez wrote:
> Hello
> I have a user that keeps having issues after using a directory and
> performing quite a number of I/O operations on the folder.
>
> If he moves, deletes, and creates files in a directory on his Lustre
> working directory the directory eventually gets unmanageably slow. He is
> working with a large number of small files.
> Here is an example of a copied directory that shows the directory
> metadata much smaller but has the exact same data in it.
>
> drwxr-sr-x 3 user group 4096 Dec 3 17:26 StatisticsQualScore
> drwxr-sr-x 3 user group 12623872 Dec 3 15:59 StatisticsQualScore_old
>
> After a copy of the directory the directory speeds improve.
>
> Is this happening because of a LRU sizing issue or could there be
> something else that could be causing this?

Yeah, this sounds like the LRU dynamic resizing bug, 17282.

> We currently have a work
> around but I would like to have a better understanding of this and
> possible solution or option I can implement.

The workaround is to set a static LRU size. Details are in the bug.

b.
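For readers who want to try the static LRU size workaround, here is a rough sketch of applying one value to every lock namespace on a client. It assumes the tunables are exposed as `/proc/fs/lustre/ldlm/namespaces/*/lru_size`, which should be checked against the manual for the release in use. The function name and the example value are illustrative only, and the thread does not say what value is appropriate; see the bug for that.

```python
import glob

# Assumed location of the lock LRU tunables on a Lustre client;
# verify this path for your release before relying on it.
DEFAULT_PATTERN = "/proc/fs/lustre/ldlm/namespaces/*/lru_size"


def set_static_lru_size(size, pattern=DEFAULT_PATTERN):
    """Write a fixed LRU size to every file matching `pattern`.

    Returns the list of files that were written. Needs root on a real client.
    """
    # Assumption: 0 selects dynamic sizing, so only positive values make
    # the LRU size static.
    if size <= 0:
        raise ValueError("size must be a positive integer")

    changed = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as tunable:
            tunable.write(f"{size}\n")
        changed.append(path)
    return changed


if __name__ == "__main__":
    # 400 is a placeholder, not a recommendation.
    for path in set_static_lru_size(400):
        print("set", path)
```

A value written this way would not be expected to survive a remount, so the call would need to be repeated after the client mounts the filesystem.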
Andreas Dilger
2008-Dec-15 18:45 UTC
[Lustre-discuss] Performance issues in directories after large number of I/O operations on small files.
On Dec 10, 2008 12:18 -0800, Sebastian Gutierrez wrote:
> If he moves, deletes, and creates files in a directory on his Lustre
> working directory the directory eventually gets unmanageably slow.
> He is working with a large number of small files.
> Here is an example of a copied directory that shows the directory
> metadata much smaller but has the exact same data in it.
>
> drwxr-sr-x 3 user group 4096 Dec 3 17:26 StatisticsQualScore
> drwxr-sr-x 3 user group 12623872 Dec 3 15:59 StatisticsQualScore_old

A 12MB directory would hold in the neighbourhood of 250k files. It should take about 150MB to cache all of the files in this directory, which isn't that much memory these days. The "new" directory appears to be virtually or completely empty, as a 4096-byte directory is the minimum possible size, holding only about 100 files.

Can you be more specific about what "unmanageably slow" means? Is that the create/delete speed, or the "ls -l" speed, or something else? Is it "more slow" after a number of create/delete cycles, or only slower than after the directory was first created? Is it "more slow" after the files have been deleted than when the files are still in the directory?

> After a copy of the directory the directory speeds improve.
>
> Is this happening because of a LRU sizing issue or could there be
> something else that could be causing this? We currently have a work
> around but I would like to have a better understanding of this and
> possible solution or option I can implement.

I don't think this is related to the lock LRU size. Rather, the directory itself grows as many files are created in it, but deleting files from the directory does not shrink it again. This is how ext3/4/ldiskfs works, on the assumption that the directory will grow to this size again, so freeing the blocks would be unnecessary work.

The problem is that after deleting 99% of the files in the directory, the remaining entries are spread around the disk, so instead of all of them being read or written with only a few blocks of data, many more blocks have to be read or written. If the directory will grow to contain 250k files again then there isn't much that can be done, because it will simply grow back to the large size.

The ability to shrink ext2/3/ldiskfs directories has been discussed in the past, but hasn't really gotten much attention because workloads that do this are rare and the workaround is straightforward.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
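The workaround discussed in this thread is to recreate the directory so that it is only as large as its current contents need. A minimal sketch of one way to do that follows. Note that it moves the entries with rename instead of copying them as the original poster did, and that the function name and the `.rebuild` / `.old` suffixes are hypothetical. It assumes nothing else is using the directory while it runs, and it does not preserve ownership of the directory itself.

```python
import os
import stat


def rebuild_directory(path):
    """Move every entry of `path` into a fresh directory, then swap names.

    Returns the number of entries moved.
    """
    path = os.path.abspath(path)
    fresh = path + ".rebuild"  # hypothetical temporary name
    old = path + ".old"        # hypothetical temporary name

    # Create the replacement next to the original so that every rename
    # stays on the same filesystem, and copy the permission bits across.
    os.mkdir(fresh)
    os.chmod(fresh, stat.S_IMODE(os.stat(path).st_mode))

    # Snapshot the names first so we do not modify the directory while
    # iterating over it.
    with os.scandir(path) as entries:
        names = [entry.name for entry in entries]

    for name in names:
        os.rename(os.path.join(path, name), os.path.join(fresh, name))

    # Swap the fresh directory into place and drop the now-empty old one.
    os.rename(path, old)
    os.rename(fresh, path)
    os.rmdir(old)
    return len(names)
```

Because only directory entries are moved, file data is not rewritten. If the directory is later refilled to its former entry count, it will grow back to its former size, as noted above.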