Dominic Kay
2008-Apr-28 20:20 UTC
[zfs-discuss] ZFS - Implementation Successes and Failures
Hi

Firstly apologies for the spam if you got this email via multiple aliases.

I'm trying to document a number of common scenarios where ZFS is used as part of the solution such as email server, $homeserver, RDBMS and so forth, but taken from real implementations where things worked and equally importantly threw up things that needed to be avoided (even if that was the whole of ZFS!).

I'm not looking to replace the Best Practices or Evil Tuning guides but to take a slightly different slant. If you have been involved in a ZFS implementation small or large and would like to discuss it either in confidence or as a referenceable case study that can be written up, I'd be grateful if you'd make contact.

--
Dominic Kay
http://blogs.sun.com/dom
Vincent Fox
2008-Apr-28 22:55 UTC
[zfs-discuss] ZFS - Implementation Successes and Failures
Cyrus mail-stores for UC Davis are in ZFS. It began as a failure and ended as a success. We hit the FSYNC performance issue and our systems collapsed under user load. We could not track it down, and neither could the Sun reps we contacted. Eventually I found a reference to the FSYNC bug and we tried out what was then an IDR patch for it. Now I'd say it's a success: performance is good and very stable.
Bob Friesenhahn
2008-Apr-29 00:32 UTC
[zfs-discuss] ZFS - Implementation Successes and Failures
On Mon, 28 Apr 2008, Dominic Kay wrote:
>
> I'm not looking to replace the Best Practices or Evil Tuning guides but to
> take a slightly different slant. If you have been involved in a ZFS
> implementation small or large and would like to discuss it either in
> confidence or as a referenceable case study that can be written up, I'd be
> grateful if you'd make contact.

Back in February I set up ZFS on a 12-disk StorageTek 2540 array and documented my experience (at that time) in the white paper available at "http://www.simplesystems.org/users/bfriesen/zfs-discuss/2540-zfs-performance.pdf". Since then I am still quite satisfied. ZFS has yet to report a bad block or cause me any trouble at all. The only complaint I would have is that 'cp -r' performance is less than would be expected given the raw bandwidth capacity.

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
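[For anyone wanting to quantify that sort of gap, here is a rough sketch of comparing raw sequential throughput with 'cp -r' on a pool. The pool name, paths and file size are placeholders, not Bob's actual setup; his white paper above has the measured numbers.

    # Sequential write and read through the filesystem (8 GB test file)
    dd if=/dev/zero of=/tank/bigfile bs=1024k count=8192
    dd if=/tank/bigfile of=/dev/null bs=1024k    # may be partly served from the ARC

    # Time a recursive copy of a real directory tree for comparison
    ptime cp -r /tank/srcdir /tank/dstdir

    # Watch per-vdev throughput while either test runs
    zpool iostat -v tank 5
]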
Simon Breden
2008-Apr-29 07:01 UTC
[zfs-discuss] ZFS - Implementation Successes and Failures
Hi Dominic,

I've built a home fileserver using ZFS and I'd be happy to help. I've written up my experiences, from the search for suitable devices through researching compatible hardware, and finally configuring it to share files.

I also built a second box for backups, again using ZFS, and used iSCSI to add a bit of fun. For more fun, I chose to aggregate gigabit ethernet ports into a speedy link between the ZFS fileserver and a Mac Pro. With limited testing, it appears to be transferring data at around 80+ MBytes/sec sustained over a CIFS share, and this transfer speed appears to be limited by the speed of the Mac's single disk, so I expect it can be pushed to go even faster.

It has been a great experience using ZFS in this way. You can find my write-up here:

http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/

If it sounds of interest, feel free to contact me.

Simon
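[For readers who have not tried this, a minimal sketch of how the two pieces Simon mentions, link aggregation and an iSCSI-backed backup pool, might be wired up on builds of that era. The interface, pool, dataset and address names are placeholders rather than his actual configuration, and the exact dladm syntax differs between builds.

    # Aggregate two gigabit ports into a single link
    dladm create-aggr -l e1000g0 -l e1000g1 aggr0

    # On the backup box: carve a zvol out of the pool and export it as an iSCSI target
    zfs create -V 500g tank/backupvol
    zfs set shareiscsi=on tank/backupvol

    # On the fileserver: discover the target and build a backup pool on the new LUN
    iscsiadm add discovery-address 192.168.1.20
    iscsiadm modify discovery --sendtargets enable
    zpool create backup c2t1d0        # device name is a placeholder; check 'format'

    # Push snapshots of the shared data over to the backup pool
    zfs snapshot tank/data@2008-04-29
    zfs send tank/data@2008-04-29 | zfs receive backup/data

The CIFS share to the Mac could come from Samba or the in-kernel SMB server; the specifics of Simon's setup are in his write-up linked above.]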
Our ZFS implementation is on hold for 12 months while we wait for a few features to mature. We're still very interested in ZFS, and have even put off the purchase of a SAN since the technology looks so promising; I just hope it grows to live up to expectations. The main problems we hit were:

ZFS seems to be a huge step forward in emulating Windows permissions, but unfortunately it's not quite there yet. Neither Samba nor the Solaris CIFS server worked for us. They're very nearly there though, so we're watching developments here closely.

Also, ZFS's snapshots are great, but we don't consider them ready for production use when we have to stop snapshots in order for a resilver to finish. If we felt this would be fixed soon we might have lived with it. However, as far as I can tell the bug report for this has been open for several years, so we're waiting for this to be fixed before we even consider the rollout.

I'm also concerned about the way the admin and management tools hang if ZFS is waiting for a reply from a device. If I have a fully raided or mirrored zpool, a slow reply or failure of one device should not affect the performance of the volume, and should definitely not affect the performance of the fault reporting tools. ZFS is a long way behind established RAID controllers and NAS or SAN devices here.

Some of the ZFS features could do with more documentation. I've seen several posts from people here struggling to work out how to use the ZFS tools in a data recovery situation. The manuals would benefit from a few disaster recovery examples, in my opinion. In our case we generated a few disaster scenarios during testing and documented recovery procedures ourselves, but it would have been a lot easier if this information had been provided.

However, one success story is that Solaris and ZFS are very easy to learn and use. I'm a Windows admin with no experience of Unix prior to this. In the space of three months I've quite happily installed Solaris, OpenSolaris, ZFS, iSCSI, NFS, Samba, the Solaris CIFS server, and a three-node OpenSolaris cluster running ZFS, NFS and Samba. None of it presented any major difficulty, and I was very impressed with how easy it was to set up a clustered NFS provider as storage for VMware ESX server. Remote admin of Solaris from a Windows box via Cygwin & SSH or X-windows is superb, and an absolute joy to work with.

We now have a small ZFS box I'm using at home for some off-site backups as a long-term test, with Samba file sharing for my home network. Overall the experience has definitely made me a fan of Solaris & ZFS, and I'm actively looking forward to the day we are able to roll it out on our network.
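[For context, the in-kernel Solaris CIFS server setup the poster refers to looks roughly like the following on OpenSolaris builds of that era. The pool, dataset, share name, domain and user are placeholders, not the poster's actual configuration.

    # Enable the in-kernel SMB service and share a dataset over CIFS
    svcadm enable -r smb/server
    zfs create -o casesensitivity=mixed tank/office
    zfs set sharesmb=name=office tank/office

    # Join an AD domain so Windows identities map onto ZFS/NFSv4 ACLs
    smbadm join -u administrator example.local

    # Grant a user full control, inherited by new files and directories
    chmod A+user:alice:full_set:file_inherit/dir_inherit:allow /tank/office

It is this mapping of NFSv4-style ACLs onto Windows permissions, through either Samba or the CIFS server, that the poster found not quite ready for production.]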
Jonathan Loran
2008-Apr-30 00:01 UTC
[zfs-discuss] ZFS - Implementation Successes and Failures
Dominic Kay wrote:
> Hi
>
> Firstly apologies for the spam if you got this email via multiple aliases.
>
> I'm trying to document a number of common scenarios where ZFS is used
> as part of the solution such as email server, $homeserver, RDBMS and
> so forth but taken from real implementations where things worked and
> equally importantly threw up things that needed to be avoided (even if
> that was the whole of ZFS!).
>
> I'm not looking to replace the Best Practices or Evil Tuning guides
> but to take a slightly different slant. If you have been involved in
> a ZFS implementation small or large and would like to discuss it
> either in confidence or as a referenceable case study that can be
> written up, I'd be grateful if you'd make contact.
>
> --
> Dominic Kay
> http://blogs.sun.com/dom

For all the storage under my management, we are deploying ZFS going forward. There have been issues, to be sure, though none of them were showstoppers. I agree with other posters that the way the z* commands lock up on a failed device is really not good, and it would be nice to be able to remove devices from a zpool. There have been other performance issues that are more the fault of our SAN nodes than ZFS. But the ease of management, the unlimited nature (volume size to number of file systems) of everything ZFS, built-in snapshots, and the confidence we get in our data make ZFS a winner.

The way we've deployed ZFS has been to map iSCSI devices from our SAN. I know this isn't an ideal way to deploy ZFS, but SANs do offer flexibility that direct-attached drives do not. Performance is now sufficient for our needs, but it wasn't at first. We do everything here on the cheap; we have to. After all, this is University research ;) Anyway, we buy commodity x86 servers and use software iSCSI. Most of our iSCSI nodes run Open-E iSCSI-R3. The latest version is actually quite quick, which wasn't always the case. I am experimenting with using ZFS on the iSCSI target, but haven't finished validating that yet.

I've also rebuilt an older 24-disk SATA chassis with the following parts:

Motherboard: Supermicro PDSME+
Processor: Intel Xeon X3210 Kentsfield 2.13GHz, 2 x 4MB L2 Cache, LGA 775, Quad-Core
Disk controllers x3: Supermicro AOC-SAT2-MV8 8-Port SATA
Hard disks x24: WD 1TB RE2-GP
RAM: Crucial, 4 x 2GB unbuffered ECC PC2-5300 (8GB total)
New power supplies...

The PDSME+ MB was on the Solaris HCL, and it has four PCI-X slots, so using three of the Supermicro MV8s is no problem. This is obviously a standalone system, but it will be for nearline backup data and doesn't have the same expansion requirements as our other servers. The thing about this guy is how smoking fast it is. I've set it up on snv b86, with 4 x 6-drive raidz2 stripes, and I'm seeing up to 450MB/sec write and 900MB/sec read speeds. We can't get data into it anywhere near that quick, but the potential is awesome. And it was really cheap for this amount of storage.

Our total storage on ZFS is now at 103TB: some user home directories, some software distribution, and a whole lot of scientific data. I compress almost everything, since our bandwidth tends to be pinched at the SAN, not at the head nodes, so we can afford it. I sleep at night, and the users don't see problems. I'm a happy camper.

Cheers,

Jon
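[As a point of reference, a minimal sketch of the pool layout Jon describes, four 6-disk raidz2 vdevs striped together with compression on, using placeholder device names rather than his actual ones:

    # Four 6-disk raidz2 vdevs in one pool; controller/target names are placeholders
    zpool create tank \
        raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
        raidz2 c1t6d0 c1t7d0 c2t0d0 c2t1d0 c2t2d0 c2t3d0 \
        raidz2 c2t4d0 c2t5d0 c2t6d0 c2t7d0 c3t0d0 c3t1d0 \
        raidz2 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0

    # Compress everything by default; child filesystems inherit the setting
    zfs set compression=on tank

    # Sanity checks on layout, and on throughput while loading data
    zpool status tank
    zpool iostat -v tank 5
]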