Hello all,

I have just configured a 64-bit CentOS 5.5 machine to support an XFS filesystem as specified in the subject line. The filesystem will be used to store an extremely large number of files (in the tens of millions). Given its extremely large size, are there any non-standard XFS build/configuration options I should consider?

Thanks.

Boris.
hi Boris,

On 09/29/2010 02:00 PM, Boris Epstein wrote:
> I have just configured a 64-bit CentOS 5.5 machine to support an XFS

I don't have any specific hints for you - but when you are done, a page in the CentOS wiki would be nice to have, with the challenges and options you had to work through, along with any recommendations!

Thanks in advance. :)

- KB
On Wednesday 29 September 2010, Boris Epstein wrote:
> Hello all,
>
> I have just configured a 64-bit CentOS 5.5 machine to support an XFS
> filesystem as specified in the subject line. The filesystem will be used to
> store an extremely large number of files (in the tens of millions). Due to
> its extremely large size, would there be any non-standard XFS
> build/configuration options I should consider?

I have created and tested filesystems larger than 25T using XFS on CentOS-5 (64-bit). I did not use any non-standard options. Do not attempt this on a 32-bit box.

However, given the size of the device I assume that this is a RAID of some sort. You'll want to make sure to run mkfs.xfs with the proper stripe parameters to get the alignment right. Also, you may want to make sure your LVM or partition table is properly aligned.

Even with the above done right, you may get worse performance than expected, since "lots of small files" typically reads like "terrible performance".

Finally, I'd suggest you fill the filesystem and read it back (verifying what you wrote). This is, imho, a reasonable level of paranoia.

/Peter

> Thanks.
>
> Boris.
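To make the stripe-alignment point above concrete, here is a minimal sketch. The geometry is invented for illustration (a 64 KiB chunk size on a RAID6 with 6 data disks) and /dev/sdX is a placeholder; substitute your array's actual chunk size and data-disk count. The command is printed rather than run, since mkfs.xfs would destroy the target device.

```shell
# Hypothetical RAID geometry: 64 KiB chunk, RAID6 with 6 data disks.
CHUNK_KB=64
DATA_DISKS=6
# su = stripe unit (one chunk), sw = stripe width in number of data disks.
# Echoed, not executed; /dev/sdX is a placeholder for the real device.
echo "mkfs.xfs -d su=${CHUNK_KB}k,sw=${DATA_DISKS} /dev/sdX"
```

With these example values XFS tries to write in full 384 KiB stripes (64 KiB x 6 data disks), which avoids read-modify-write cycles on the array.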
----- Original Message -----
| On Wednesday 29 September 2010, Boris Epstein wrote:
| > Hello all,
| >
| > I have just configured a 64-bit CentOS 5.5 machine to support an XFS
| > filesystem as specified in the subject line. The filesystem will be
| > used to store an extremely large number of files (in the tens of
| > millions). Due to its extremely large size, would there be any
| > non-standard XFS build/configuration options I should consider?
|
| I have created and tested filesystems larger than 25T using xfs on
| CentOS-5 (64-bit). I did not use any non-standard options. Do not
| attempt this on a 32-bit box.
|
| However, given the size of the device I assume that this is a raid of
| some sort. You'll want to make sure to run mkfs.xfs with the proper
| stripe parameters to get the alignment right. Also, you may want to
| make sure your LVM or partition table is properly aligned.
|
| Even with the above done right you may get worse performance than
| expected since "lots of small files" typically reads like "terrible
| performance".
|
| Finally I'd suggest you fill the filesystem and read it back
| (verifying what you wrote). This is, imho, a reasonable level of
| paranoia.
|
| /Peter
|
| > Thanks.
| > 
| > Boris.
|
| _______________________________________________
| CentOS mailing list
| CentOS at centos.org
| http://lists.centos.org/mailman/listinfo/centos

On my 30+TB file systems all I've done is mkfs.xfs with stripe and width parameters, and they are very speedy. I've not done anything on the LVM side and see no performance issues, but perhaps I need to investigate that some more. :\

-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
MSN     : subatomic_spam at hotmail.com

Does your OS have a man 8 lart?
http://www.xinu.nl/unix/humour/asr-manpages/lart.html
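The "fill the filesystem and read it back" check quoted above can be sketched roughly as follows. This is scaled way down (three 1 MiB files) so it is safe to run anywhere; on a real system you would point TESTDIR at the XFS mount point, write enough data to actually fill it, and drop caches before the verify pass. md5sum is used because it is in coreutils on CentOS 5.

```shell
# Scaled-down sketch of "fill the filesystem and read it back".
# A real run would target the actual mount point with far more data.
TESTDIR=$(mktemp -d)
for i in 1 2 3; do
    dd if=/dev/zero of="$TESTDIR/file$i" bs=1M count=1 2>/dev/null
done
# Record a checksum of everything written...
( cd "$TESTDIR" && md5sum file1 file2 file3 > manifest.md5 )
# ...then re-read it all and verify (prints "file1: OK" etc.).
( cd "$TESTDIR" && md5sum -c manifest.md5 )
rm -rf "$TESTDIR"
```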
On Wednesday, September 29, 2010 01:25:11 pm Peter Kjellstrom wrote:
> You are a bit mistaken. The raid controller does not "copy data around as it
> sees fit". It stores data on each disk in chunk-sized pieces. It then
> stripes this across all drives, giving you a stripe-sized piece of chunk
> size times the number of data drives.

[Snip math]

> Then again, for other workloads the effect could be insignificant. YMMV.

For a simple RAID controller I can see some benefit. However, in my case the 'RAID controller' is a SAN consisting of three EMC Clariion arrays: a CX3-10c, a CX3-80, and a CX700. The EMC Navisphere/Unisphere tools allow LUN migration across RAID groups; I could very well take a LUN from a RAID1/0 with 16 drives to a RAID5 with 9 drives to a RAID6 with 10 drives to a RAID6 with 16 drives, each with a different stripe size.

Further, since this is all being accessed through VMware ESX, I'm limited to 2TB LUNs anyway, even using raw device mappings (which I do, but for a different reason); LVM to the rescue to get this:

[root at backup-rdc ~]# df -h
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00     37G   18G   18G  50% /
/dev/sda1                           99M   26M   69M  28% /boot
/dev/mapper/dasch--backup-volume1   21T   19T  2.6T  88% /opt/backups
tmpfs                             1006M     0 1006M   0% /dev/shm
/dev/mapper/dasch--rdc-cx3--80      23T   19T  4.2T  82% /opt/dasch-rdc
[root at backup-rdc ~]#

Yeah, the output of pvscan is pretty long (it has been longer, and seeing things like /dev/sdak1 is strange...). Using XFS at the moment.

The two volume groups are on two different arrays; one is on the CX700 and the other on the CX3-80, and they're physically separated at two locations on campus, with single-mode 4Gb/s FC ISLs between switches. They're soon to be connected to different VMware ESX hosts; the dual fibre-channel connect was so the initial sync time would be reasonable.
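The LVM arrangement described above - several sub-2TB LUNs concatenated into one large logical volume carrying a single XFS filesystem - would look roughly like the following sketch. The device names and the VG/LV names are invented for illustration, and the commands are printed rather than executed because they would destroy data on real devices.

```shell
# Sketch only: commands are echoed, not run, since they are destructive.
# Device names and VG/LV names are invented for illustration.
cat <<'EOF'
pvcreate /dev/sdb /dev/sdc /dev/sdd            # each LUN <= 2TB (the ESX limit)
vgcreate dasch-backup /dev/sdb /dev/sdc /dev/sdd
lvcreate -l 100%FREE -n volume1 dasch-backup   # one LV spanning all PVs
mkfs.xfs /dev/dasch-backup/volume1             # one XFS across the whole set
EOF
```

The resulting /dev/dasch-backup/volume1 can be far larger than any single LUN the hypervisor will hand out, which is exactly the trick the df output above relies on.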
I looked through all the performance optimization howtos for XFS that I could find, but then realized how futile that would be with these 'RAID controllers' and their massive caches. Our CX3-80 SPs have 8GB of RAM each, with a shared write cache and a variable-sized read cache, which I have set to a rather large size: 3GB on each SP for read, and 2GB for write. The CX700 has 4GB (actually 3968MB), split 1GB read / 2GB write. The benchmarks that I did (which I can't release due to both EMC's and VMware's EULA prohibitions) showed that the performance differences with alignment versus without were insignificant with these 'RAID controllers'.

But for something inside the server, like a 3ware 9500 or similar, it might be worthwhile to align to stripe size, since that is a fixed constant for the logical drives that controller exports.

And Peter is very right: YMMV depending upon workload. Our load for this system is, as can be inferred from the name of the machine, backups of a raw data set that are processed once and then archived. I/Os per second aren't even on the radar for this workload; throughput, on the other hand, is. And man, these Clariions are fast.
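Since throughput rather than IOPS is what matters for this workload, a quick dd pass is a common rough sanity check. This is a deliberately tiny sketch (8 MiB); a meaningful run against a cached array like these Clariions would need a file far larger than the SP caches, so the disks rather than the 8GB of cache get measured. conv=fsync makes dd flush before reporting its rate.

```shell
# Tiny sequential write-throughput probe (illustrative size only; use a
# file far larger than the array cache for a meaningful number).
TESTFILE=$(mktemp)
dd if=/dev/zero of="$TESTFILE" bs=1M count=8 conv=fsync 2>&1 | tail -n 1
rm -f "$TESTFILE"
```

The last line of dd's output reports bytes copied, elapsed time, and MB/s; repeating with O_DIRECT (iflag=direct / oflag=direct) is another common way to sidestep the page cache on the host side.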