Rafał Radecki
2012-Sep-27 08:10 UTC
[CentOS] 11TB ext4 filesystem - filesystem alternatives?
Hi All.

I have a CentOS server:

CentOS 5.6 x86_64
2.6.18-238.12.1.el5.centos.plus
e4fsprogs-1.41.12-2.el5.x86_64

which has an 11TB ext4 filesystem. I have problems running fsck on it and would like to change the filesystem, because I do not like the possibility of a long fsck run; it's a production machine. I also have problems with fsck itself (not enough RAM, problems with the scratch_files option), so if the filesystem ever needs intervention I will be in a difficult situation.

Which other mature and stable filesystem can you recommend for such large storage?

Best regards,
Rafal Radecki.
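For the not-enough-RAM problem specifically, e2fsck can be told to keep some of its in-memory tables in on-disk scratch files via /etc/e2fsck.conf (see e2fsck.conf(5)); a minimal sketch, where the directory path is an assumption and must already exist and be writable by root:

```
[scratch_files]
directory = /var/cache/e2fsck
```

With this set, e2fsck trades speed for memory by spilling its directory-info and inode-count tables to files under that directory, which is often the difference between fsck finishing and fsck dying on a multi-TB ext4 volume.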
On 27.09.2012 09:10, Rafał Radecki wrote:
> Which other mature and stable filesystem can you recommend for such large
> storage?

Never had to deal with such a large filesystem yet, but I'd try XFS on it. Alternatively you can look at less supported filesystems such as BTRFS, or even http://zfsonlinux.org/.

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro
Leon Fauster
2012-Sep-27 09:36 UTC
[CentOS] 11TB ext4 filesystem - filesystem alternatives?
On 27.09.2012 at 10:10, Rafał Radecki wrote:
> Which other mature and stable filesystem can you recommend for such large
> storage?

What about:

$ man tune2fs

The "maximum mount count" and "check interval" triggers can be changed. And to boot "faster", just do:

$ touch /fastboot
$ reboot

--
LF
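Leon's tune2fs suggestion can be tried safely on a throwaway image file before touching the production volume; a sketch, assuming e2fsprogs is installed (the /tmp paths are illustrative):

```shell
# Build a small ext4 image to experiment on (no root needed for a plain file).
dd if=/dev/zero of=/tmp/demo-ext4.img bs=1M count=64 2>/dev/null
mkfs.ext4 -q -F /tmp/demo-ext4.img

# Disable both the mount-count and the time-based triggers for automatic fsck.
tune2fs -c -1 -i 0 /tmp/demo-ext4.img

# Confirm: both triggers are now off.
tune2fs -l /tmp/demo-ext4.img | grep -E 'Maximum mount count|Check interval'
```

Note this only stops the *scheduled* checks at boot; it does nothing about how long a forced fsck takes when the filesystem is actually dirty.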
On Thu, Sep 27, 2012 at 10:10 AM, Rafał Radecki <radecki.rafal at gmail.com> wrote:
> Which other mature and stable filesystem can you recommend for such large
> storage?

I recommend XFS.

BR
Bent
James A. Peltier
2012-Sep-28 23:22 UTC
[CentOS] 11TB ext4 filesystem - filesystem alternatives?
----- Original Message -----
| Which other mature and stable filesystem can you recommend for such
| large storage?

As someone who is working with 15-30TB volumes: use XFS, but be sure you have a lot of memory. 48GB at least, and more if you have directories with tens of thousands of files in them.

--
James A. Peltier
Manager, IT Services - Research Computing Group
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.sfu.ca/itservices
          http://blogs.sfu.ca/people/jpeltier

Success is to be measured not so much by the position that one has reached in life but as by the obstacles they have overcome. - Booker T. Washington
On Friday, September 28, 2012 04:29:55 PM Keith Keller wrote:
> No filesystem can fully protect against power failures--that's not its
> job. That's why higher-end RAID controllers have battery backups, and
> why important servers should be on a UPS. If you are really paranoid,
> you can probably tweak the kernel (e.g., using sysctl) to flush disk
> writes more frequently, but then you might drag down performance with
> it.

As far as UPSes are concerned, even those won't protect you from a BRS event. BRS = Big Red Switch, aka EPO, or Emergency Power Off. NEC Article 645 (IIRC) mandates this for Information Technology rooms that use the relaxed rules of that article (and virtually all IT rooms do, in my experience). The EPO is supposed to take *everything* down hard (including the DC to the UPSes, if the UPS is in the room, and shunt-trip the breakers feeding the room so that the room is completely dead), and the fire suppression system is supposed to be tied into it. The EPO has to be push-to-activate, it has to be accessible, and people have hit the switch before.

Caching controllers are only part of the equation; in a BRS event, the battery is likely to have let go of the cache contents by the time things are back up, depending upon what caused the BRS event. This is a case where you should test with a server and see just how long the battery will actually hold the cache.

In the case of EMC Clariions, the write cache (there is only one, mirrored between the storage processors) is flushed to the 'vault' disks in an EPO event; a small UPS built into the rack keeps the vault disks up long enough to do this, and the SPs can then do an orderly shutdown. It takes about 90 seconds with a medium-sized write cache and fast vault drives. Then, when the system boots back up, the vault contents are flushed out to the LUNs.
Now, to make this reliable, EMC loads custom firmware on their drives that disables write caching on the drive itself; that is part of the design of their systems. Drive enclosures (DAEs, in EMC's terminology) other than the DAE holding the OS and vault disks can go down hard and the array won't lose data, thanks to the vault and the EMC software. The EMC software also periodically tests the battery backup units, and will disable the write cache (and flush it to disk) if a battery faults during the test.

It is amazing how much performance is due to good (and large) write caches; modern SATA drives owe much of their performance to their write caches.

Now, if the sprinkler system is what caused the EPO, well, it may not matter how good the write cache vault is, depending on how wet things get... but that's part of the DR plan, or should be....
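The sysctl tweak Keith mentions maps to the kernel's writeback knobs under vm.*; a read-only look at the defaults (Linux-specific; the tuned values in the comments are illustrative, not recommendations):

```shell
# How old (in centiseconds) dirty pages may get before they must be flushed.
cat /proc/sys/vm/dirty_expire_centisecs

# How often (in centiseconds) the kernel flusher threads wake up.
cat /proc/sys/vm/dirty_writeback_centisecs

# To flush more aggressively, shrink both (requires root; example values only):
#   sysctl -w vm.dirty_expire_centisecs=1000
#   sysctl -w vm.dirty_writeback_centisecs=250
```

Smaller values narrow the window of unwritten data lost in a hard power cut, at the cost of more frequent, smaller writes, which is exactly the performance trade-off Keith warns about.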
On Saturday, September 29, 2012 11:56:04 AM John R Pierce wrote:
> On 09/29/12 5:19 AM, Ilyas -- wrote:
> > Backend storage is 2 SATA directly attached disks. No caches on the
> > SATA controller. Both disks run in an mdraid mirror.
> >
> > The zeroed files had been written many days before the power failure
> > (some files were written and closed 2 weeks before).
>
> How do 2 sata disks in a mirror make 11TB?!?

They don't, John. Ilyas is not the OP. The point was showing XFS corruption with a fairly simple setup, I think. But Ilyas is welcome to post if I'm wrong...