All,
I'm currently working out details on an upgrade from UFS/SDS on DAS to ZFS on a SAN fabric. I'm interested in hearing how ZFS has behaved in more traditional SAN environments using gear that scales vertically, like EMC Clariion/HDS AMS/3PAR etc. Do you experience issues with zpool integrity because of MPxIO events? Has the zpool been reliable over your fabric? Has performance been where you would have expected it to be?

Thanks much,
-Aaron
> Do you experience issues with zpool integrity because of MPxIO events?

No.

> Has the zpool been reliable over your fabric?

Yes.

> Has performance been where you would have expected it to be?

Yes.

Search for "ZFS SAN"; I think we've had a bunch of threads on this.

I run 2-node clusters of T2000 units attached to pairs of either 3510FC (older) or 2540 (newer) arrays over dual SAN switches. Multipath has worked fine; we had an entire array go offline during an attempt to replace a bad controller, but since the ZFS pool was mirrored onto another array there was no disruption of service.

We use a hybrid setup where I build 5-disk RAID-5 LUNs on each array, and each LUN is then mirrored in ZFS with a LUN from a different array. Best of both worlds, IMO. Each cluster supports about 10K users as a Cyrus mail store; since the 10u4 fsync performance patch there have been no performance issues. I wish zpool scrub ran a bit quicker, but I think 10u6 will address that.

I'm not sure I see the point of EMC, but like me you probably already have some equipment and just want to use it or add to it.
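For anyone considering a similar layout, here is a minimal sketch of the hybrid approach described above. The pool and device names are hypothetical (c2t0d0/c2t1d0 standing in for RAID-5 LUNs from array A, c3t0d0/c3t1d0 for LUNs from array B):

  # each ZFS mirror pairs a RAID-5 LUN from array A with one from array B
  zpool create mailpool mirror c2t0d0 c3t0d0 mirror c2t1d0 c3t1d0

  # later, grow the pool by adding another cross-array mirror pair
  zpool add mailpool mirror c2t2d0 c3t2d0

The array handles drive-level rebuilds; the ZFS mirror keeps the pool online and self-healing if an entire array or path drops out.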
Hello Aaron,

Wednesday, August 20, 2008, 7:11:01 PM, you wrote:

> I'm currently working out details on an upgrade from UFS/SDS on DAS to ZFS
> on a SAN fabric. I'm interested in hearing how ZFS has behaved in more
> traditional SAN environments using gear that scales vertically, like EMC
> Clariion/HDS AMS/3PAR etc. Do you experience issues with zpool integrity
> because of MPxIO events? Has the zpool been reliable over your fabric?
> Has performance been where you would have expected it to be?

Yes, it works fine. The only issue, with some disk arrays, is cache flushing: you can disable the flush either on the disk array or in ZFS.

If you want to leverage ZFS's self-healing properties, make sure you have some kind of redundancy at the ZFS level regardless of the redundancy on the array.

-- 
Best regards,
Robert Milkowski                       mailto:milek@task.gda.pl
                                       http://milek.blogspot.com
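For reference, the ZFS-side knob for the cache-flush issue was, at the time, typically the system-wide zfs_nocacheflush tunable. A sketch, on the assumption that every pool on the host sits behind battery-backed array cache (the array-side alternative is to configure the array itself to ignore SYNCHRONIZE CACHE requests):

  # in /etc/system (takes effect after a reboot); stops ZFS from issuing
  # cache-flush requests to the underlying LUNs
  set zfs:zfs_nocacheflush = 1

This is a host-wide setting, so it is only safe when no pool on the machine uses plain disks with volatile write caches.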
----- Original Message -----
From: Robert Milkowski <milek at task.gda.pl>
Date: Thursday, August 21, 2008 5:47 am
Subject: Re: [zfs-discuss] ZFS with Traditional SAN
To: Aaron Blew <aaronblew at gmail.com>
Cc: zfs-discuss at opensolaris.org

> Yes, it works fine. The only issue, with some disk arrays, is cache
> flushing: you can disable the flush either on the disk array or in ZFS.
>
> If you want to leverage ZFS's self-healing properties, make sure you
> have some kind of redundancy at the ZFS level regardless of the
> redundancy on the array.

That's the one that's been an issue for me and my customers: they get billed back for GB allocated to their servers by the back-end arrays.

To be more explicit about the 'self-healing properties': to deal with any filesystem corruption that would traditionally require an fsck on UFS (SAN switch crash, multipathing issues, cables going flaky or getting pulled, a server crash that corrupts filesystems), ZFS needs some disk redundancy in place so it has parity and can recover (raidz, ZFS mirror, etc.). Which means that to use ZFS a customer has to pay more to get the back-end storage redundancy they need to recover from anything that would cause an fsck on UFS. I'm not saying it's a bad implementation or that the gains aren't worth it, just that cost-wise, ZFS is more expensive in this particular bill-back model.

cheers,
Brian
Brian Wilson wrote:
> That's the one that's been an issue for me and my customers: they get
> billed back for GB allocated to their servers by the back-end arrays.
> [...]
> To deal with any filesystem corruption that would traditionally require
> an fsck on UFS (SAN switch crash, multipathing issues, cables going
> flaky or getting pulled, a server crash that corrupts filesystems), ZFS
> needs some disk redundancy in place so it has parity and can recover
> (raidz, ZFS mirror, etc.). Which means that to use ZFS a customer has
> to pay more to get the back-end storage redundancy they need to recover
> from anything that would cause an fsck on UFS.

Your understanding of UFS fsck is incorrect: it does not repair data, only metadata. ZFS has redundant metadata by default.

You can also set the data redundancy on a per-file-system or per-volume basis with ZFS. For example, you might want some data to be redundant, but not the whole pool; in such cases you can set the copies=2 property on the file systems or volumes that are more important. This is better described in pictures:
http://blogs.sun.com/relling/entry/zfs_copies_and_data_protection

With ZFS you can also enable compression on a per-file-system or per-volume basis. Depending on your data, you may use less space with a mirrored (fully redundant) ZFS pool than with a UFS file system.
 -- richard
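A quick sketch of the per-dataset settings Richard describes, using a hypothetical pool and dataset names:

  # keep two copies of the data blocks (not just metadata) for the important dataset
  zfs set copies=2 tank/important

  # enable compression on another dataset; only newly written blocks are compressed
  zfs set compression=on tank/logs

  # verify the properties
  zfs get copies,compression tank/important tank/logs

Note that copies=2 protects against bad blocks on a single LUN but not against losing the whole LUN, which is why pool-level redundancy is still the stronger option.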
On Thu, Aug 21, 2008 at 11:46:47AM +0100, Robert Milkowski wrote:
> > Do you experience issues with zpool integrity because of MPxIO events?
> > Has the zpool been reliable over your fabric? Has performance been
> > where you would have expected it to be?
>
> Yes, it works fine.

We have a 2-TB ZFS pool on a T2000 server with storage on our iSCSI SAN. The disk devices are four LUNs from our NetApp file server, with multiple IP paths between the two. We've tested this by disconnecting and reconnecting ethernet cables in turn. Failover and failback worked as expected, with no interruption to data flow. It looks like this:

$ zpool status
  pool: space
 state: ONLINE
 scrub: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        space                                      ONLINE       0     0     0
          c4t60A98000433469764E4A2D456A644A74d0    ONLINE       0     0     0
          c4t60A98000433469764E4A2D456A696579d0    ONLINE       0     0     0
          c4t60A98000433469764E4A476D2F6B385Ad0    ONLINE       0     0     0
          c4t60A98000433469764E4A476D2F664E4Fd0    ONLINE       0     0     0

errors: No known data errors

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
> That's the one that's been an issue for me and my customers: they get
> billed back for GB allocated to their servers by the back-end arrays.
> [...]
> Which means that to use ZFS a customer has to pay more to get the
> back-end storage redundancy they need to recover from anything that
> would cause an fsck on UFS.

Why would the customer need to use raidz or ZFS mirroring if the array is doing it for them? As someone else posted, metadata is already redundant by default and doesn't consume a ton of space.

Some people may disagree, but the first thing I like about ZFS is the ease of pool management, and the second is the checksumming. When a customer had issues with Solaris 10 x86, VxFS, and EMC PowerPath, I took them down the road of using PowerPath and ZFS. We made some tweaks so we didn't tell the array to flush to rust, and they're happy as clams.
> Why would the customer need to use raidz or ZFS mirroring if the array
> is doing it for them? As someone else posted, metadata is already
> redundant by default and doesn't consume a ton of space.

Because arrays and drives can suffer silent errors in the data that are not found until it's too late. My zpool scrubs occasionally find and FIX errors that none of the array or RAID-5 machinery caught. This problem only gets more likely as our pools grow in size.

If the user is CHEAP and doesn't care about their data that much, then sure, run without ZFS redundancy. The metadata redundancy should at least ensure your pool is online immediately, and that bad files are at least flagged during a scrub so you know which ones need to be regenerated or retrieved from tape.
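The scrub workflow Vincent refers to is just the standard commands, shown here against a hypothetical pool name:

  # walk every block in the pool and verify checksums; repairs require
  # pool-level redundancy (mirror, raidz, or copies>1)
  zpool scrub tank

  # check scrub progress, per-device CKSUM counts, and any files flagged as damaged
  zpool status -v tank

Without ZFS-level redundancy a scrub can still detect bad blocks, it just cannot repair them.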
>>>>> "vf" == Vincent Fox <vincent_b_fox at yahoo.com> writes:vf> Because arrays & drives can suffer silent errors in the data vf> that are not found until too late. My zpool scrubs vf> occasionally find & FIX errors that none of the array or vf> RAID-5 stuff caught. well, just to make it clear again: * some people on the list believe an incrementing count in the CKSUM column means ZFS is protecting you from other parts of the storage stack mysteriously failing. * others (me) believe CKSUM counts are often but not always the latent symptom of corruption bugs in ZFS. They make guesses about what other parts of the stack might fail, sometimes desperate ones like ``failure on the bus between the ECC ram controller and the CPU,'''' and I make guesses about corruption bugs in ZFS. I call implausible, and they call ``i don''t believe it happened unless it happened in a way that''s convenient to debug.'''' anyway, for example that it does happen, I can make CKSUM errors by saying ''iscsiadm remove discovery-address 1.2.3.4'' to take down the target on one half of a mirror vdev. When the target comes back, it onlines itself, I scrub the pool, and that target accumulates CKSUM errors. But what happened isn''t ``silent corruption''''. It''s plain old resilvering. And ZFS resilvers without requiring a manual scrub and without counting latent CKSUM errors if I take down half the mirror in some other way, such as ''zpool offline''. There are probably other scenarios that make latent CKSUM errors---ex., almost the same thing, fault a device, shutdown, fix the device, boot, scrub in bug 6675685---but my intuition is that a whole class of ZFS bugs will manifest themselves with this symptom. At least the one I just described should be in Sol10u5 if you want to test it. Maybe this is too much detail for Bill and his snarky ``buffer overflow reading your message'''' comments, and too much speculation for some others, but the point is: ZFS indicating an error doesn''t automatically mean there''s no problem with ZFS. -and- You should use zpool-level redundancy, as in different LUN''s not just copies=2, with ZFS on SAN, because experience here shows that you''re less likely to lose an entire pool to metadata corruption if you have this kind of redundancy. There''s some dispute about the ``why'''', but if you don''t do it (and also if you do but definitely if you don''t), be sure to have some kind of real backup not just snapshots and mirrors, and not ''zfs send'' blobs either. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080821/c877d185/attachment.bin>
bfwilson at doit.wisc.edu said:
> That's the one that's been an issue for me and my customers: they get
> billed back for GB allocated to their servers by the back-end arrays.
> [...] Which means that to use ZFS a customer has to pay more to get the
> back-end storage redundancy they need to recover from anything that
> would cause an fsck on UFS. I'm not saying it's a bad implementation or
> that the gains aren't worth it, just that cost-wise, ZFS is more
> expensive in this particular bill-back model.

If your back-end array implements RAID-0, you need not suffer the extra expense. Allocate one RAID-0 LUN per physical drive, then use ZFS to make raidz or mirrored pools as appropriate.

To add to the other anecdotes on this thread: we have non-redundant ZFS pools on SAN storage, in production use for about a year, replacing some SAM-QFS filesystems which were formerly on the same arrays. We have had the "normal" ZFS panics occur in the presence of I/O errors (SAN zoning mistakes, cable issues, switch bugs), and had no ZFS corruption or data loss as a result. We run S10U4 and S10U5, both SPARC and x86. MPxIO works fine, once you have the OS and arrays configured properly.

Note that I'd by far prefer to have ZFS-level redundancy, but our equipment doesn't support a useful RAID-0, and our customers want cheap storage. But we also charge them for tape backups....

Regards,
Marion
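A minimal sketch of Marion's suggestion, assuming the array exports five single-drive RAID-0 LUNs that appear as hypothetical devices c4t0d0 through c4t4d0:

  # ZFS provides the parity, so the pool survives the loss of any one LUN/drive
  zpool create tank raidz c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0

  # or, a mirrored layout across LUNs from two such sets on different controllers
  # zpool create tank mirror c4t0d0 c5t0d0 mirror c4t1d0 c5t1d0

Since each LUN maps to one physical drive, no array-level RAID capacity is consumed, which keeps the bill-back cost roughly in line with the UFS-on-RAID setup it replaces.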