All,
I'm currently working out details on an upgrade from UFS/SDS on DAS to ZFS on a SAN fabric. I'm interested in hearing how ZFS has behaved in more traditional SAN environments using gear that scales vertically, like EMC Clariion/HDS AMS/3PAR etc. Do you experience issues with zpool integrity because of MPxIO events? Has the zpool been reliable over your fabric? Has performance been where you would have expected it to be?

Thanks much,
-Aaron
> Do you experience issues with zpool integrity because of MPxIO events?

No.

> Has the zpool been reliable over your fabric?

Yes.

> Has performance been where you would have expected it to be?

Yes.

Search for "ZFS SAN"; I think we've had a bunch of threads on this.

I run 2-node clusters of T2000 units attached to pairs of either 3510FC (older) or 2540 (newer) arrays over dual SAN switches. Multipath has worked fine; we had an entire array go offline during an attempt to replace a bad controller, but since the ZFS pool was mirrored onto another array there was no disruption of service.

We use a hybrid setup where I build 5-disk RAID-5 LUNs on each array, and each LUN is then mirrored in ZFS with a LUN from a different array. Best of both worlds, IMO. Each cluster supports about 10K users as a Cyrus mail store; since the 10u4 fsync performance patch there have been no performance issues. I wish zpool scrub ran a bit quicker, but I think 10u6 will address that.

I'm not sure I see the point of EMC, but like me you probably already have some equipment and just want to use it or add to it.
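For anyone considering a similar layout, here is a minimal sketch of the hybrid approach described above. The pool and device names are hypothetical (c2t0d0/c2t1d0 standing in for RAID-5 LUNs from array A, c3t0d0/c3t1d0 for LUNs from array B):

  # each ZFS mirror pairs a RAID-5 LUN from array A with one from array B
  zpool create mailpool mirror c2t0d0 c3t0d0 mirror c2t1d0 c3t1d0

  # later, grow the pool by adding another cross-array mirror pair
  zpool add mailpool mirror c2t2d0 c3t2d0

The array handles drive-level rebuilds; the ZFS mirror keeps the pool online and self-healing if an entire array or path drops out.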
Hello Aaron,

Wednesday, August 20, 2008, 7:11:01 PM, you wrote:

> I'm currently working out details on an upgrade from UFS/SDS on DAS to ZFS
> on a SAN fabric. I'm interested in hearing how ZFS has behaved in more
> traditional SAN environments using gear that scales vertically, like EMC
> Clariion/HDS AMS/3PAR etc. Do you experience issues with zpool integrity
> because of MPxIO events? Has the zpool been reliable over your fabric?
> Has performance been where you would have expected it to be?

Yes, it works fine. The only issue, with some disk arrays, is cache flushing: you can disable the flush either on the disk array or in ZFS.

If you want to leverage ZFS's self-healing properties, make sure you have some kind of redundancy at the ZFS level regardless of the redundancy on the array.

-- 
Best regards,
Robert Milkowski                       mailto:milek@task.gda.pl
                                       http://milek.blogspot.com
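For reference, the ZFS-side knob for the cache-flush issue was, at the time, typically the system-wide zfs_nocacheflush tunable. A sketch, on the assumption that every pool on the host sits behind battery-backed array cache (the array-side alternative is to configure the array itself to ignore SYNCHRONIZE CACHE requests):

  # in /etc/system (takes effect after a reboot); stops ZFS from issuing
  # cache-flush requests to the underlying LUNs
  set zfs:zfs_nocacheflush = 1

This is a host-wide setting, so it is only safe when no pool on the machine uses plain disks with volatile write caches.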
----- Original Message -----
From: Robert Milkowski <milek at task.gda.pl>
Date: Thursday, August 21, 2008 5:47 am
Subject: Re: [zfs-discuss] ZFS with Traditional SAN
To: Aaron Blew <aaronblew at gmail.com>
Cc: zfs-discuss at opensolaris.org

> Yes, it works fine. The only issue, with some disk arrays, is cache
> flushing: you can disable the flush either on the disk array or in ZFS.
>
> If you want to leverage ZFS's self-healing properties, make sure you
> have some kind of redundancy at the ZFS level regardless of the
> redundancy on the array.

That's the one that's been an issue for me and my customers: they get billed back for GB allocated to their servers by the back-end arrays.

To be more explicit about the 'self-healing properties': to deal with any filesystem corruption that would traditionally require an fsck on UFS (SAN switch crash, multipathing issues, cables going flaky or getting pulled, a server crash that corrupts filesystems), ZFS needs some disk redundancy in place so it has parity and can recover (raidz, ZFS mirror, etc.). Which means that to use ZFS a customer has to pay more to get the back-end storage redundancy they need to recover from anything that would cause an fsck on UFS. I'm not saying it's a bad implementation or that the gains aren't worth it, just that cost-wise, ZFS is more expensive in this particular bill-back model.

cheers,
Brian
Brian Wilson wrote:
> That's the one that's been an issue for me and my customers: they get
> billed back for GB allocated to their servers by the back-end arrays.
> [...]
> To deal with any filesystem corruption that would traditionally require
> an fsck on UFS (SAN switch crash, multipathing issues, cables going
> flaky or getting pulled, a server crash that corrupts filesystems), ZFS
> needs some disk redundancy in place so it has parity and can recover
> (raidz, ZFS mirror, etc.). Which means that to use ZFS a customer has
> to pay more to get the back-end storage redundancy they need to recover
> from anything that would cause an fsck on UFS.

Your understanding of UFS fsck is incorrect: it does not repair data, only metadata. ZFS has redundant metadata by default.

You can also set the data redundancy on a per-file-system or per-volume basis with ZFS. For example, you might want some data to be redundant, but not the whole pool; in such cases you can set the copies=2 property on the file systems or volumes that are more important. This is better described in pictures:
http://blogs.sun.com/relling/entry/zfs_copies_and_data_protection

With ZFS you can also enable compression on a per-file-system or per-volume basis. Depending on your data, you may use less space with a mirrored (fully redundant) ZFS pool than with a UFS file system.
 -- richard
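A quick sketch of the per-dataset settings Richard describes, using a hypothetical pool and dataset names:

  # keep two copies of the data blocks (not just metadata) for the important dataset
  zfs set copies=2 tank/important

  # enable compression on another dataset; only newly written blocks are compressed
  zfs set compression=on tank/logs

  # verify the properties
  zfs get copies,compression tank/important tank/logs

Note that copies=2 protects against bad blocks on a single LUN but not against losing the whole LUN, which is why pool-level redundancy is still the stronger option.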
On Thu, Aug 21, 2008 at 11:46:47AM +0100, Robert Milkowski wrote:
> > Do you experience issues with zpool integrity because of MPxIO events?
> > Has the zpool been reliable over your fabric? Has performance been
> > where you would have expected it to be?
>
> Yes, it works fine.

We have a 2-TB ZFS pool on a T2000 server with storage on our iSCSI SAN. The disk devices are four LUNs from our NetApp file server, with multiple IP paths between the two. We've tested this by disconnecting and reconnecting ethernet cables in turn. Failover and failback worked as expected, with no interruption to data flow. It looks like this:

$ zpool status
  pool: space
 state: ONLINE
 scrub: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        space                                      ONLINE       0     0     0
          c4t60A98000433469764E4A2D456A644A74d0    ONLINE       0     0     0
          c4t60A98000433469764E4A2D456A696579d0    ONLINE       0     0     0
          c4t60A98000433469764E4A476D2F6B385Ad0    ONLINE       0     0     0
          c4t60A98000433469764E4A476D2F664E4Fd0    ONLINE       0     0     0

errors: No known data errors

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
> That's the one that's been an issue for me and my customers: they get
> billed back for GB allocated to their servers by the back-end arrays.
> [...]
> Which means that to use ZFS a customer has to pay more to get the
> back-end storage redundancy they need to recover from anything that
> would cause an fsck on UFS.

Why would the customer need to use raidz or ZFS mirroring if the array is doing it for them? As someone else posted, metadata is already redundant by default and doesn't consume a ton of space.

Some people may disagree, but the first thing I like about ZFS is the ease of pool management, and the second is the checksumming. When a customer had issues with Solaris 10 x86, VxFS, and EMC PowerPath, I took them down the road of using PowerPath and ZFS. We made some tweaks so we didn't tell the array to flush to rust, and they're happy as clams.
> Why would the customer need to use raidz or ZFS mirroring if the array
> is doing it for them? As someone else posted, metadata is already
> redundant by default and doesn't consume a ton of space.

Because arrays and drives can suffer silent errors in the data that are not found until it's too late. My zpool scrubs occasionally find and FIX errors that none of the array or RAID-5 machinery caught. This problem only gets more likely as our pools grow in size.

If the user is CHEAP and doesn't care about their data that much, then sure, run without ZFS redundancy. The metadata redundancy should at least ensure your pool is online immediately, and that bad files are at least flagged during a scrub so you know which ones need to be regenerated or retrieved from tape.
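The scrub workflow Vincent refers to is just the standard commands, shown here against a hypothetical pool name:

  # walk every block in the pool and verify checksums; repairs require
  # pool-level redundancy (mirror, raidz, or copies>1)
  zpool scrub tank

  # check scrub progress, per-device CKSUM counts, and any files flagged as damaged
  zpool status -v tank

Without ZFS-level redundancy a scrub can still detect bad blocks, it just cannot repair them.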
>>>>> "vf" == Vincent Fox <vincent_b_fox at yahoo.com> writes:vf> Because arrays & drives can suffer silent errors in the data vf> that are not found until too late. My zpool scrubs vf> occasionally find & FIX errors that none of the array or vf> RAID-5 stuff caught. well, just to make it clear again: * some people on the list believe an incrementing count in the CKSUM column means ZFS is protecting you from other parts of the storage stack mysteriously failing. * others (me) believe CKSUM counts are often but not always the latent symptom of corruption bugs in ZFS. They make guesses about what other parts of the stack might fail, sometimes desperate ones like ``failure on the bus between the ECC ram controller and the CPU,'''' and I make guesses about corruption bugs in ZFS. I call implausible, and they call ``i don''t believe it happened unless it happened in a way that''s convenient to debug.'''' anyway, for example that it does happen, I can make CKSUM errors by saying ''iscsiadm remove discovery-address 1.2.3.4'' to take down the target on one half of a mirror vdev. When the target comes back, it onlines itself, I scrub the pool, and that target accumulates CKSUM errors. But what happened isn''t ``silent corruption''''. It''s plain old resilvering. And ZFS resilvers without requiring a manual scrub and without counting latent CKSUM errors if I take down half the mirror in some other way, such as ''zpool offline''. There are probably other scenarios that make latent CKSUM errors---ex., almost the same thing, fault a device, shutdown, fix the device, boot, scrub in bug 6675685---but my intuition is that a whole class of ZFS bugs will manifest themselves with this symptom. At least the one I just described should be in Sol10u5 if you want to test it. Maybe this is too much detail for Bill and his snarky ``buffer overflow reading your message'''' comments, and too much speculation for some others, but the point is: ZFS indicating an error doesn''t automatically mean there''s no problem with ZFS. -and- You should use zpool-level redundancy, as in different LUN''s not just copies=2, with ZFS on SAN, because experience here shows that you''re less likely to lose an entire pool to metadata corruption if you have this kind of redundancy. There''s some dispute about the ``why'''', but if you don''t do it (and also if you do but definitely if you don''t), be sure to have some kind of real backup not just snapshots and mirrors, and not ''zfs send'' blobs either. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080821/c877d185/attachment.bin>
bfwilson at doit.wisc.edu said:
> That's the one that's been an issue for me and my customers: they get
> billed back for GB allocated to their servers by the back-end arrays.
> [...] Which means that to use ZFS a customer has to pay more to get the
> back-end storage redundancy they need to recover from anything that
> would cause an fsck on UFS. I'm not saying it's a bad implementation or
> that the gains aren't worth it, just that cost-wise, ZFS is more
> expensive in this particular bill-back model.

If your back-end array implements RAID-0, you need not suffer the extra expense. Allocate one RAID-0 LUN per physical drive, then use ZFS to make raidz or mirrored pools as appropriate.

To add to the other anecdotes on this thread: we have non-redundant ZFS pools on SAN storage, in production use for about a year, replacing some SAM-QFS filesystems which were formerly on the same arrays. We have had the "normal" ZFS panics occur in the presence of I/O errors (SAN zoning mistakes, cable issues, switch bugs), and had no ZFS corruption or data loss as a result. We run S10U4 and S10U5, both SPARC and x86. MPxIO works fine, once you have the OS and arrays configured properly.

Note that I'd by far prefer to have ZFS-level redundancy, but our equipment doesn't support a useful RAID-0, and our customers want cheap storage. But we also charge them for tape backups....

Regards,
Marion
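A minimal sketch of Marion's suggestion, assuming the array exports five single-drive RAID-0 LUNs that appear as hypothetical devices c4t0d0 through c4t4d0:

  # ZFS provides the parity, so the pool survives the loss of any one LUN/drive
  zpool create tank raidz c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0

  # or, a mirrored layout across LUNs from two such sets on different controllers
  # zpool create tank mirror c4t0d0 c5t0d0 mirror c4t1d0 c5t1d0

Since each LUN maps to one physical drive, no array-level RAID capacity is consumed, which keeps the bill-back cost roughly in line with the UFS-on-RAID setup it replaces.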