I am running into performance issues with ext3. Historically we had our image files (pictures of cars, currently 5.3 million) subdivided into a directory structure [0-9]/[0-9]/[0-9]/[0-9], where we would take the first 4 characters of the file name and use them to place the file in this structure. Letters [a-cA-C] would become a 0, [d-fD-F] a 1, etc. As the file names used to be based on the VINs of vehicles, that wasn't a problem. But then our developers changed the image file names to use a vehicle ID from the database. As we rolled over 1,000,000 in vehicle IDs, large numbers of files ended up in the same directories, and the files were no longer well distributed.

So we changed the method to [0-9a-f]/[0-9a-f]/[0-9a-f], taking the md5 of the file name and using its first 3 hex characters to file it away. In initial testing this worked well: files were distributed evenly across the directories, so we could split them across separate file systems or disks.

When we actually got to do this, a decision was made to use hard links from the old structure to the new structure for backward compatibility. And this turned into a disaster. Rsync or find on the new structure takes dramatically longer: about 5 minutes for a find on the old structure versus hours on the new structure. Using strace I tracked it down to lstat64. On the old structure lstat64 takes on average 37 usecs/call, while on the new structure it takes over 2,400 usecs/call.

EL4 does not seem to have this problem; unfortunately I can't just upgrade, for other reasons. Does anyone have an idea why lstat64 would be so much slower on the new structure? Any help, hints, or suggestions would be great.

Regards, Ulf.

---------------------------------------------------------------------
Autotradecenter.com Inc, T: 650-532-6382, F: 650-532-6441
4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025
---------------------------------------------------------------------
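For readers following along, the md5-based layout described in the message above can be sketched roughly as follows. This is only an illustration under assumptions: the function name `shard_path`, the `images` root, and the UTF-8 encoding of the file name are hypothetical and not taken from the original setup.

```python
import hashlib
import os


def shard_path(filename, root="images"):
    """Place a file under root/h0/h1/h2/, where h0..h2 are the first three
    hex characters of the md5 of the file name.

    `root` and the UTF-8 encoding are assumptions made for this sketch.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return os.path.join(root, digest[0], digest[1], digest[2], filename)


if __name__ == "__main__":
    # Sequential vehicle IDs share a long common prefix, but their md5
    # digests do not, so they spread across the 16**3 = 4096 directories.
    for vehicle_id in range(1000000, 1000005):
        print(shard_path(f"{vehicle_id}.jpg"))
```

With three hex levels there are 16^3 = 4096 leaf directories, so 5.3 million evenly spread files come to roughly 1,300 files per directory.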
On Jul 19, 2006 17:00 -0700, Ulf Zimmermann wrote:
> When we actually got to do this, a decision was made to use hard links
> from the old structure to the new structure for backward capability. And
> this turned into a disaster. Rsync or find on the new structure takes
> dramatic longer, talking about 5 minutes for a find on the old structure
> and hours on the new structure. Using strace I tracked it down to
> lstat64. On the old structure lstat64 takes on average 37 usecs/call
> while on the new structure it is over 2,400 usecs/call.
>
> EL4 does not seem to have this problem, unfortunately I can't just
> upgrade, out of other reasons. So anyone have ideas why lstat64 would be
> so much slower on the new structure? Any help, hints, suggestions would
> be great.

Do you have directories with more than, say, 10-15,000 entries? Do you have the dir_index (directory indexing) feature enabled on your filesystem? This is done with "tune2fs -O dir_index" (even while mounted) but only affects new directories. I believe the RHEL3 code has this functionality, but it isn't enabled by default like I suspect it is on FC4.

Once you have enabled this, then an OFFLINE run of "e2fsck -fD {dev}" will rebuild the directory indexes for existing directories.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
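One way to answer the question about directory sizes is to walk the tree and report any directory above a chosen entry count. The sketch below is a hypothetical helper (the name `dirs_over_threshold` and the default threshold of 10,000 are assumptions picked to match the range mentioned above), not a tool from e2fsprogs.

```python
import os


def dirs_over_threshold(root, threshold=10000):
    """Return (path, entry_count) pairs for every directory under `root`
    that holds more than `threshold` entries, largest first."""
    big = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Count both subdirectories and files as directory entries.
        count = len(dirnames) + len(filenames)
        if count > threshold:
            big.append((dirpath, count))
    return sorted(big, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python count_entries.py /path/to/image/tree
    for path, count in dirs_over_threshold(sys.argv[1]):
        print(f"{count:>10}  {path}")
```

Note that walking a tree of several million files is itself slow on a filesystem showing this problem, so it may be worth running this against one subtree first.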
> -----Original Message-----
> From: Andreas Dilger [mailto:adilger at clusterfs.com]
> Sent: 07/20/2006 12:18 AM
> To: Ulf Zimmermann
> Cc: ext3-users at redhat.com
> Subject: Re: Problems under Redhat EL3 and ext3
>
> On Jul 19, 2006 17:00 -0700, Ulf Zimmermann wrote:
> > Rsync or find on the new structure takes dramatic longer, talking
> > about 5 minutes for a find on the old structure and hours on the new
> > structure. Using strace I tracked it down to lstat64. On the old
> > structure lstat64 takes on average 37 usecs/call while on the new
> > structure it is over 2,400 usecs/call.
> >
> > EL4 does not seem to have this problem, unfortunately I can't just
> > upgrade, out of other reasons. So anyone have ideas why lstat64 would
> > be so much slower on the new structure? Any help, hints, suggestions
> > would be great.
>
> Do you have directories with more than, say, 10-15,000 entries?
> Do you have dir_index (directory indexing) feature enabled on your
> filesystem? This is done with "tune2fs -O dir_index" (even while
> mounted) but only affects new directories. I believe the RHEL3 code
> has this functionality, but it isn't enabled by default like I
> suspect it is on FC4.

The filesystem was created under EL3. I am currently copying everything in the new structure into a new directory, and that seems to be fast. My plan at this point is to rename the hard-linked new structure at the end and use the copy instead. I did run e2fsck -D on one of the nodes, but that did not help.

Hmmm, I just ran "tune2fs -O dir_index" on one node, and tune2fs -l now shows dir_index enabled. But I am not sure that will help: in a strace -c, getdents64 wasn't showing much difference between the two structures, whereas lstat64 was.

> Once you have enabled this, then an OFFLINE run of "e2fsck -fD {dev}"
> will rebuild the directory indexes for existing directories.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
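As a rough cross-check of the per-call figures from strace -c, the average lstat cost can also be timed from user space. The sketch below is an approximation only: the helper name `mean_lstat_usecs` and the `limit` parameter are hypothetical, the numbers include Python overhead, and they depend heavily on whether inodes are already cached, so they will not match strace exactly.

```python
import os
import time


def mean_lstat_usecs(root, limit=100000):
    """Average wall-clock time of os.lstat() in microseconds over at most
    `limit` files found under `root`. Returns 0.0 if no files are found."""
    total = 0.0
    calls = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            start = time.perf_counter()
            os.lstat(path)
            total += time.perf_counter() - start
            calls += 1
            if calls >= limit:
                return total / calls * 1e6
    return total / calls * 1e6 if calls else 0.0


if __name__ == "__main__":
    import sys

    # Usage: python lstat_timing.py /old/structure /new/structure
    for tree in sys.argv[1:]:
        print(f"{tree}: {mean_lstat_usecs(tree):.1f} usecs/call")
```

Running it once against the old structure and once against the new one, ideally with cold caches in both cases, gives a comparison of the same kind as the 37 versus 2,400 usecs/call figures.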
> -----Original Message-----
> From: Theodore Tso [mailto:tytso at mit.edu]
> Sent: 07/20/2006 11:25 AM
> To: Ulf Zimmermann
> Cc: Andreas Dilger; ext3-users at redhat.com
> Subject: Re: Problems under Redhat EL3 and ext3
>
> On Thu, Jul 20, 2006 at 12:24:41AM -0700, Ulf Zimmermann wrote:
> > The filesystem was created under EL3. I am currently copying everything
> > in the new structure into a new directory and it seems to be fast. My
> > plan at this point is to rename the hard linked new structure at the
> > end, and use that copy. I did run on one of the nodes e2fsck -D but that
> > did not help.
>
> e2fsck -D, or e2fsck -fD? You need the -f option in order to force
> e2fsck to scan the whole filesystem and optimize all directories.
>
> - Ted

On the one node where I ran it, it was -D. That run did do a forced check, not because I specified -f, but because the file system hadn't been checked in more than 192 days.

Ulf.
> -----Original Message-----
> From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com] On Behalf Of Ulf Zimmermann
> Sent: 07/20/2006 12:07 PM
> To: Theodore Tso
> Cc: Andreas Dilger; ext3-users at redhat.com
> Subject: RE: Problems under Redhat EL3 and ext3
>
> On the one node I did, it was -D, which did do a force checked, but not
> because I specified -f, but because the file system hadn't been checked
> in > 192 days.
>
> Ulf.

The one other thing I hadn't answered before: each directory has on average 1,293 files, with a deviation of less than 100 in each direction. In the old structure some directories had over 50,000 files, and that didn't seem to slow it down. dir_index was not enabled on the systems, so I enabled it on one node, and I am waiting for something to finish before I can unmount it and run e2fsck -fD on it.

Ulf.