I have trouble finding definitive information about this. I am considering the use of SME 7.5.1 (CentOS based) for my server needs, but I do want to use ZFS, and thus far I have only found information about the ZFS-FUSE implementation and unclear hints that there is another way. Phoronix reported that http://kqinfotech.com/ would release some form of ZFS for the kernel, but I have found nothing.

Can someone tell me if ZFS-FUSE is more trouble than it's worth?

Thanks in advance,
Dawid
> Can someone tell me if ZFS-FUSE is more trouble than it's worth?

I've tried both ZFS-FUSE and ZFS installed from the RPMs on zfsonlinux.org, both on CentOS 5.5. ZFS-FUSE is more polished, but it cut write speeds in half on my RAID 5. I ended up going with ext4. SME Server is great, by the way - I've been using it for years.
On 04/02/11 1:54 PM, Dawid Horacio Golebiewski wrote:
> I have trouble finding definitive information about this. I am considering
> the use of SME 7.5.1 (CentOS based) for my server needs, but I do want to
> use ZFS, and thus far I have only found information about the ZFS-FUSE
> implementation and unclear hints that there is another way. Phoronix
> reported that http://kqinfotech.com/ would release some form of ZFS for the
> kernel but I have found nothing.
>
> Can someone tell me if ZFS-FUSE is more trouble than it's worth?

ZFS's license (CDDL) isn't GPL-compatible, so it can't be integrated into the kernel, where a filesystem belongs. It is therefore pretty much relegated to user space (FUSE), and it's just not very well supported on Linux. If you really want to use ZFS, I'd suggest using Solaris or one of its derivatives (OpenIndiana, etc.), where it's native.
On 4/2/2011 2:54 PM, Dawid Horacio Golebiewski wrote:
> I do want to use ZFS, and thus far I have only found information about
> the ZFS-FUSE implementation and unclear hints that there is another way.

Here are some benchmark numbers I came up with just a week or two ago. (View with a fixed-width font.)

    Test                              ZFS raidz1        Hardware RAID-6
    -------------------------------   ---------------   ----------------------
    Sequential write, per character   11.5 (15% CPU)    71.1 MByte/s (97% CPU)
    Sequential write, block           12.3 (1%)         297.9 MB/s (50%)
    Sequential write, rewrite         11.8 (2%)         137.4 MB/s (27%)
    Sequential read, per character    48.8 (63%)        72.5 MB/s (95%)
    Sequential read, block            148.3 (5%)        344.3 MB/s (31%)
    Random seeks                      103.0/s           279.6/s

The fact that the write speeds on the ZFS-FUSE test seem capped at ~12 MB/s strikes me as odd. It doesn't seem to be a FUSE bottleneck, since the read speeds are so much faster, but I can't think where else the problem could be, since the hardware was identical for both tests. Nevertheless, it means ZFS-FUSE performed about as well as a Best Buy bus-powered USB drive on this hardware. On only one test did it even exceed the performance of a single one of the drives in the array, and then not by very much. Pitiful.

I did this test with Bonnie++ on a 3ware/LSI 9750-8i controller, with eight WD 3 TB disks attached. Both tests were done with XFS on CentOS 5.5, 32-bit. (Yes, 32-bit. Hard requirement for this application.) The base machine was a low-end server with a Core 2 Duo E7500 in it. I interpret several of the results above as suggesting that the 3ware numbers could have been higher if the array were in a faster box.

For the ZFS configuration, I exported each disk from the 3ware BIOS as a separate single-disk volume, then collected them together into a single ~19 TB raidz1 pool. (This controller doesn't have a JBOD mode.) I broke that up into three ~7 TB slices, each formatted with XFS. I did the test on only one of the slices, figuring that they'd all perform about equally.

For the RAID-6 configuration, I used the 3ware card's hardware RAID, creating a single ~16 TB volume, formatted with XFS.

You might be asking why I didn't make a ~19 TB RAID-5 volume for the native 3ware RAID test, to minimize the number of unnecessary differences. I did that because, after testing the ZFS-based system for about a week, we decided we'd rather have the extra redundancy than the capacity. Dropping to 16.37 TB on the RAID configuration by switching to RAID-6 let us put almost the entire array under a single 16 TB XFS filesystem. Realize that this switch from single redundancy to dual is a handicap for the native RAID test, yet it performs better across the board. In-kernel ZFS might have beaten the hardware RAID on at least a few of the tests, given that handicap.

(Please don't ask me to test one of the in-kernel ZFS patches for Linux. We can't delay putting this box into production any longer, and in any case, we're building this server for another organization, so we couldn't send the patched box out without violating the GPL.)

Oh, and in case anyone is thinking I somehow threw the test, realize that I was rooting for ZFS from the start. I only did the benchmark when it so completely failed to perform under load. ZFS is beautiful tech. Too bad it doesn't play well with others.

> Phoronix reported that http://kqinfotech.com/ would release some form
> of ZFS for the kernel but I have found nothing.

What a total cock-up that was.
Here we had a random company no one had ever heard of before, putting out a press release saying they *would be releasing* something in a few months. Maybe it's easy to say this six months later, while we all sit here listening to the crickets, but I called it at the time: Phoronix should have tossed that press release in the trash, or at least held off on saying anything until something actually shipped. Reporting a clearly BS press release, seriously? Are they *trying* to destroy their credibility?
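For reference, the benchmark figures earlier in this post came from Bonnie++. Here is a minimal sketch of a comparable invocation; the target directory, test size, and user are illustrative assumptions, not Warren's actual command line:

    # Write/read/seek benchmark on the filesystem under test.
    #   -d    directory to test in (on the array being measured)
    #   -s    total test file size in MiB; use at least 2x RAM so the page
    #         cache cannot hide the disks (16384 MiB = 16 GiB here)
    #   -n 0  skip the small-file creation phase
    #   -u    unprivileged user to run as (Bonnie++ refuses to run as root otherwise)
    bonnie++ -d /mnt/array/bench -s 16384 -n 0 -u nobody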
Lamar Owen
2011-Apr-05 17:21 UTC
[CentOS] 32-bit kernel+XFS+16.xTB filesystem = potential disaster (was: Re: ZFS @ centOS)
On Monday, April 04, 2011 11:09:29 PM Warren Young wrote:
> I did this test with Bonnie++ on a 3ware/LSI 9750-8i controller, with
> eight WD 3 TB disks attached. Both tests were done with XFS on CentOS
> 5.5, 32-bit. (Yes, 32-bit. Hard requirement for this application.)

[snip]

> For the RAID-6 configuration, I used the 3ware card's hardware RAID,
> creating a single ~16 TB volume, formatted with XFS.

[snip]

> Dropping to 16.37 TB on the RAID configuration by switching
> to RAID-6 let us put almost the entire array under a single 16 TB XFS
> filesystem.

You really, really, really don't want to do this. Not on 32-bit. When you roll one byte over 16 TB, you will lose access to your filesystem, silently, and it will not remount on a 32-bit kernel.

XFS works best on a 64-bit kernel for a number of reasons; the one you're likely to hit first is the 16 TB hard limit on *occupied* file space. You can mkfs an XFS filesystem on a 17 TB or even larger partition or volume, but the moment the occupied data rolls over the 16 TB boundary you will be in disaster-recovery mode, and a 64-bit kernel will be required for the rescue.

The reason I know this? I had it happen, on a CentOS 32-bit backup server with a 17 TB LVM logical volume on EMC storage. It worked great until it rolled over 16 TB. Then it quit working altogether. /var/log/messages told me that the filesystem was too large to be mounted. I had to re-image the VM as 64-bit CentOS and re-attach the RDMs to the LUNs holding the PVs for the LV; it then mounted instantly, and we kept on trucking.

There's a reason upstream doesn't do XFS on 32-bit.
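For anyone wanting to check whether an existing box is exposed to this limit, a minimal sketch of the checks involved (the mount point /srv/backup is an assumption for illustration):

    # A 32-bit kernel reports i686 or similar; x86_64 means 64-bit.
    uname -m

    # XFS geometry: data blocks x block size gives the filesystem size.
    xfs_info /srv/backup

    # Watch how close occupied space is getting to the 16 TB boundary.
    df -h /srv/backup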
On 04/05/2011 09:00 AM, rainer at ultra-secure.de wrote:
> That is really a no-brainer.
> In the time it takes to rebuild such a "RAID", another disk might just
> fail, and the "R" in "RAID" goes down the toilet. Your 19-disk RAID 5
> just got turned into 25 kg of scrap metal.
>
> As for ZFS - we're using it with FreeBSD, with mixed results.
> The truth is, you've got to follow the development very closely and work
> with the developers (via mailing lists), potentially testing
> patches/backports from current - or tracking current from the start.
> It works much better with Solaris.
> Frankly, I don't know why people want to do this ZFS-on-Linux thing.
> It works perfectly well with Solaris, which runs most stuff that runs on
> Linux just as well.
> I wouldn't try to run Linux binaries on Solaris with lxrun, either.

During my current work building a RAID-6 VM host system (currently testing with SL 6, but later CentOS 6), I had a question rolling around in the back of my mind: should I build the host with OpenSolaris (or the OpenIndiana fork) and ZFS RAID-Z2, which I had heard performs somewhat better on Solaris? I'd then run CentOS guest OS instances with VirtualBox.

But... I've been reading about some of the issues with ZFS performance and have discovered that it needs a *lot* of RAM to support decent caching. The recommendation is for a gigabyte of RAM per terabyte of storage just for the metadata, which can add up. Maybe cache-memory starvation is one reason why so many disappointing test results are showing up.

Chuck
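Putting the rule of thumb Chuck cites into concrete numbers for the ~19 TB pool discussed earlier in the thread (a rough guideline only, not a hard requirement):

    # ~1 GB of RAM per TB of pool capacity, just for ZFS metadata caching:
    #   19 TB x 1 GB/TB  =  ~19 GB of RAM for metadata alone
    # Compare against what the box actually has installed:
    free -g    # total RAM in GiB; a low-end server with 4-8 GB falls far short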
On 4/2/11 10:54 PM, Dawid Horacio Golebiewski wrote:
> I have trouble finding definitive information about this. I am considering
> the use of SME 7.5.1 (CentOS based) for my server needs, but I do want to
> use ZFS, and thus far I have only found information about the ZFS-FUSE
> implementation and unclear hints that there is another way. Phoronix
> reported that http://kqinfotech.com/ would release some form of ZFS for the
> kernel but I have found nothing.
>
> Can someone tell me if ZFS-FUSE is more trouble than it's worth?

If you need ZFS, I suggest you try out FreeBSD, where ZFS has native support. FreeBSD is also an excellent OS, even better than Linux.
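For a sense of what native support looks like there, a minimal sketch of creating a raidz pool on FreeBSD (the pool name "tank" and disk names da0-da3 are assumptions for illustration):

    # Build a single-parity raidz pool from four whole disks; it mounts at /tank.
    zpool create tank raidz da0 da1 da2 da3

    # Create a child filesystem with compression enabled, then check pool health.
    zfs create -o compression=on tank/data
    zpool status tank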
Lamar Owen
2011-Apr-06 19:16 UTC
[CentOS] 32-bit kernel+XFS+16.xTB filesystem = potential disaster (was: Re: ZFS @ centOS)
On Wednesday, April 06, 2011 01:16:19 PM Warren Young wrote:
> I expect they added some checks for this since you last tried XFS on 32-bit.
>
> Perhaps it wasn't clear from what I wrote, but the big partition on this
> system is actually 15.9mumble TB, just to be sure we don't even get 1
> byte over the limit. The remaining 1/3 TB is currently unused.

I didn't get there in one step. Perhaps that's the difference. What you describe in the last paragraph will prevent the effect I saw. Just hope you never need to do an xfs_repair.

No, it wasn't completely clear from what you wrote that you were keeping below 16 TB, at least not to me.

Now, I didn't do mkfs on a 16.x TB disk initially; I got there in steps with LVM, lvextend, and xfs_growfs. The starting size of the filesystem was ~4 TB in two ~2 TB LUNs/PVs. VMware is limited to 2 TB LUNs, so I added storage as needed in ~2 TB chunks (actually 2,000 GB chunks; pvscan reports these as 1.95 TB, with some at 1.92 TB for RAID-group setup reasons). The 1.32 TB and 1.37 TB LUNs are there due to the way the RAID groups are laid out on the Clariion CX3-10c this sits on.

So after a while of doing this, I had a hair over 14 TB; xfs_growfs going from 14 TB to a hair over 16 TB didn't complain. But when the data hit 16 TB, it quit mounting. So I migrated to a C5 x86_64 VM, and things started working again. I've added one more 1.95 TB PV to the VG since then. Current setup:

    PV /dev/sdd1    VG pachy-mirror    lvm2 [1.92 TB / 0 free]
    PV /dev/sdg1    VG pachy-mirror    lvm2 [1.92 TB / 0 free]
    PV /dev/sde1    VG pachy-mirror    lvm2 [1.95 TB / 0 free]
    PV /dev/sdu1    VG pachy-mirror    lvm2 [1.95 TB / 0 free]
    PV /dev/sdl1    VG pachy-mirror    lvm2 [1.37 TB / 0 free]
    PV /dev/sdm1    VG pachy-mirror    lvm2 [1.32 TB / 0 free]
    PV /dev/sdx1    VG pachy-mirror    lvm2 [1.95 TB / 0 free]
    PV /dev/sdz1    VG pachy-mirror    lvm2 [1.95 TB / 0 free]
    PV /dev/sdab1   VG pachy-mirror    lvm2 [1.95 TB / 0 free]
    PV /dev/sdt1    VG pachy-mirror    lvm2 [1.95 TB / 0 free]

    ACTIVE '/dev/pachy-mirror/home' [18.24 TB] inherit

The growth was over a period of two years, incidentally.

There are other issues with XFS and 32-bit; see:

http://bugs.centos.org/view.php?id=3364

and

http://www.mail-archive.com/scientific-linux-users at listserv.fnal.gov/msg05347.html

and google for 'XFS 32-bit 4K stacks' for more of the gory details.
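For readers unfamiliar with the grow-in-steps procedure Lamar describes, one growth iteration looks roughly like this minimal sketch (the new device name /dev/sdac1 and the mount point /home are assumptions for illustration, not taken from his setup):

    # Turn the newly presented ~2 TB LUN into a PV and add it to the volume group.
    pvcreate /dev/sdac1
    vgextend pachy-mirror /dev/sdac1

    # Grow the logical volume into the new free space...
    lvextend -l +100%FREE /dev/pachy-mirror/home

    # ...then grow the mounted XFS filesystem to match (online; no unmount needed).
    xfs_growfs /home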
Lamar Owen
2011-Apr-06 20:05 UTC
[CentOS] 32-bit kernel+XFS+16.xTB filesystem = potential disaster
On Wednesday, April 06, 2011 01:27:10 PM Warren Young wrote:
> Legacy is hard. Next time someone tells you they can't use the latest
> and greatest for some reason, you might take them at their word.

Yes, it is. To give another example: we do work with some interfaces for telescopes (optical and radio) where, for insurance and assurance reasons, and because a PE's seal is on the prints, a tested, tried, and proven ISA interface card is a requirement. Ever tried to find a Pentium 4 or better motherboard with ISA slots? It's easy to say "just go PCI", but that's not as easy as it sounds.

In fact, for the power supplies in the PCs this ISA card goes into (it's a custom A/D, D/A, encoder input, counter, and DIO card rolled into one), I have to be careful to pick ones with enough -5V current capability. Yes, negative 5 volts. That's only used by ISA cards, right? Well... do you know how many ATX12V power supplies don't even *have* a -5V output? Most of them, it seems.