Philip Brown
2011-Jan-18 19:46 UTC
[zfs-discuss] How well does zfs mirror handle temporary disk offlines?
Sorry if this is well known. I tried a bunch of Google searches but didn't get anywhere useful. The closest I came was http://mail.opensolaris.org/pipermail/zfs-discuss/2009-April/028090.html but that doesn't answer my question, below, regarding ZFS mirror recovery.

Details of our needs follow.

We normally are very into redundancy. Pretty much all our SAN storage is dual ported, along with all our production hosts. Two completely redundant paths to storage. Two independent SANs.

However, we are now encountering a need for "tier 3" storage, aka "not that important, we're going to go cheap on it" ;-) That being said, we'd still like to make it as reliable and robust as possible. So I was wondering just how robust it would be to do ZFS mirroring across 2 SANs.

My specific question is, how easily does ZFS handle *temporary* SAN disconnects to one side of the mirror?
What if the outage is only 60 seconds?
3 minutes?
10 minutes?
An hour?

If we have 2x1TB drives in a simple ZFS mirror, and one side goes temporarily offline, will ZFS attempt to resync **1 TB** when it comes back? Or does it have enough intelligence to say, "oh hey, I know this disk, and I know [these bits] are still good, so I just need to resync [that bit]"?

--
This message posted from opensolaris.org
Torrey McMahon
2011-Jan-18 19:51 UTC
[zfs-discuss] How well does zfs mirror handle temporary disk offlines?
On 1/18/2011 2:46 PM, Philip Brown wrote:
> My specific question is, how easily does ZFS handle *temporary* SAN disconnects, to one side of the mirror?
> What if the outage is only 60 seconds?
> 3 minutes?
> 10 minutes?
> an hour?

Depends on the multipath drivers and the failure mode. For example, if the link drops completely at the host HBA connection, some failover drivers will mark the path down immediately, which will propagate up the stack faster than an intermittent connection or something farther downstream failing.

> If we have 2x1TB drives, in a simple zfs mirror.... if one side goes temporarily off line, will zfs attempt to resync **1 TB** when it comes back? Or does it have enough intelligence to say, "oh hey I know this disk..and I know [these bits] are still good, so I just need to resync [that bit]" ?

My understanding is yes, it only resyncs what changed, though I can't find the reference for this. (I'm sure someone else will find it in short order.)
Erik Trimble
2011-Jan-18 19:57 UTC
[zfs-discuss] How well does zfs mirror handle temporary disk offlines?
On Tue, 2011-01-18 at 14:51 -0500, Torrey McMahon wrote:
> On 1/18/2011 2:46 PM, Philip Brown wrote:
> > My specific question is, how easily does ZFS handle *temporary* SAN disconnects, to one side of the mirror?
> > What if the outage is only 60 seconds?
> > 3 minutes?
> > 10 minutes?
> > an hour?
>
> Depends on the multipath drivers and the failure mode. For example, if
> the link drops completely at the host hba connection some failover
> drivers will mark the path down immediately which will propagate up the
> stack faster than an intermittent connection or something farther
> downstream failing.
>
> > If we have 2x1TB drives, in a simple zfs mirror.... if one side goes temporarily off line, will zfs attempt to resync **1 TB** when it comes back? Or does it have enough intelligence to say, "oh hey I know this disk..and I know [these bits] are still good, so I just need to resync [that bit]" ?
>
> My understanding is yes though I can't find the reference for this. (I'm
> sure someone else will find it in short order.)

ZFS's ability to handle "short-term" interruptions depends heavily on the underlying device driver.

If the device driver reports the device as "dead/missing/etc" at any point, then ZFS is going to require a "zpool replace" action before it re-accepts the device. If the underlying driver simply stalls, then it's more graceful (and no user interaction is required).

As far as what the resync does: ZFS does "smart" resilvering, in that it compares what the "good" side of the mirror has against what the "bad" side has, and only copies the differences over to sync them up. This is one of ZFS's great strengths, in that most other RAID systems can't do this.

--
Erik Trimble
Java System Support
Mailstop: usca22-317
Phone: x67195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
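The "smart" resilvering described above can be pictured with a toy model. This is only a sketch, not ZFS code: it assumes each block remembers the transaction number in which it was last written and that the pool remembers the window during which one side of the mirror was missing. The thread itself only says that the differences are copied; the function name, the `txg` terminology as used here, and the sample data are all hypothetical.

```python
# Toy model only: this is NOT ZFS code. All names and data are hypothetical.

def blocks_to_resilver(block_birth_txg, offline_txg, return_txg):
    """Return the ids of blocks written while one mirror side was offline.

    block_birth_txg maps a block id to the transaction number of its
    last write. Blocks last written outside the outage window are
    assumed to be identical on both sides and are skipped.
    """
    return sorted(
        block
        for block, birth in block_birth_txg.items()
        if offline_txg <= birth <= return_txg
    )


# Ten blocks, last written in transactions 1..10.
births = {f"blk{i}": i for i in range(1, 11)}

# One side of the mirror was missing during transactions 7..9.
stale = blocks_to_resilver(births, offline_txg=7, return_txg=9)
print(stale)  # only the blocks written during the outage
```

In this model the amount of data copied depends on how much was written during the outage, not on the size of the disk.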
Chris Banal
2011-Jan-18 21:28 UTC
[zfs-discuss] How well does zfs mirror handle temporary disk offlines?
Erik Trimble wrote:
> On Tue, 2011-01-18 at 14:51 -0500, Torrey McMahon wrote:
>> On 1/18/2011 2:46 PM, Philip Brown wrote:
>>> My specific question is, how easily does ZFS handle *temporary* SAN disconnects, to one side of the mirror?
>>> What if the outage is only 60 seconds?
>>> 3 minutes?
>>> 10 minutes?
>>> an hour?

No idea how well it will reconnect the device, but we had an X4500 that would randomly boot up with one or two disks missing. Reboot again and one or two other disks would be missing. While we were troubleshooting this problem, this happened dozens and dozens of times, and ZFS had no trouble with it as far as I could tell. It would only resilver the data that was changed while that drive was offline. We had no data loss.

Thank you,
Chris Banal
Philip Brown
2011-Jan-18 21:34 UTC
[zfs-discuss] How well does zfs mirror handle temporary disk offlines?
> On Tue, 2011-01-18 at 14:51 -0500, Torrey McMahon wrote:
>
> ZFS's ability to handle "short-term" interruptions depend heavily on the
> underlying device driver.
>
> If the device driver reports the device as "dead/missing/etc" at any
> point, then ZFS is going to require a "zpool replace" action before it
> re-accepts the device. If the underlying driver simply stalls, then
> it's more graceful (and no user interaction is required).
>
> As far as what the resync does: ZFS does "smart" resilvering, in that
> it compares what the "good" side of the mirror has against what the
> "bad" side has, and only copies the differences over to sync them up.

Hmm. Well, we're talking fibre, so we're very concerned with the recovery mode when the fibre drivers have marked it as "failed". (Except it hasn't "really" failed; we've just had a switch drop out.)

I THINK what you are saying is that we could, in this situation, do:

zpool replace (old drive) (new drive)

and then your "smart" recovery should do the limited resilvering only. Even for potentially long outages.

Is that what you are saying?
Erik Trimble
2011-Jan-18 22:35 UTC
[zfs-discuss] How well does zfs mirror handle temporary disk offlines?
On Tue, 2011-01-18 at 13:34 -0800, Philip Brown wrote:
> > On Tue, 2011-01-18 at 14:51 -0500, Torrey McMahon wrote:
> >
> > ZFS's ability to handle "short-term" interruptions depend heavily on the
> > underlying device driver.
> >
> > If the device driver reports the device as "dead/missing/etc" at any
> > point, then ZFS is going to require a "zpool replace" action before it
> > re-accepts the device. If the underlying driver simply stalls, then
> > it's more graceful (and no user interaction is required).
> >
> > As far as what the resync does: ZFS does "smart" resilvering, in that
> > it compares what the "good" side of the mirror has against what the
> > "bad" side has, and only copies the differences over to sync them up.
>
> Hmm. Well, we're talking fibre, so we're very concerned with the recovery mode when the fibre drivers have marked it as "failed". (except it hasnt "really" failed, we've just had a switch drop out)
>
> I THINK what you are saying, is that we could, in this situation, do:
>
> zpool replace (old drive) (new drive)
>
> and then your "smart" recovery, should do the limited resilvering only. Even for potentially long outages.
>
> Is that what you are saying?

Yes. It will always look at the "replaced" drive to see if it was a prior member of the mirror, and do smart resilvering if possible.

If the device path stays the same (which, hopefully, it should), you can even do:

zpool replace (old device) (old device)
Edward Ned Harvey
2011-Jan-19 02:32 UTC
[zfs-discuss] How well does zfs mirror handle temporary disk offlines?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Erik Trimble
>
> As far as what the resync does: ZFS does "smart" resilvering, in that
> it compares what the "good" side of the mirror has against what the
> "bad" side has, and only copies the differences over to sync them up.
> This is one of ZFS's great strengths, in that most other RAID systems
> can't do this.

It's also one of ZFS's great weaknesses. It's a strength as long as not much data has changed, or the changes were highly sequential in nature, or the drives in the pool have extremely high IOPS (SSDs etc.), because then resilvering just the changed parts can be done very quickly. Much quicker than resilvering the whole drive sequentially as a typical hardware RAID would do.

However, as is often the case, a large percentage of the drive may have changed, in essentially random order. There are many situations where something like 3% of the drive has changed, yet the resilver takes 100% as long as rewriting the entire drive sequentially would have taken. If 10% of the drive changed, a ZFS resilver might be 4x slower than sequentially overwriting the entire disk as a hardware RAID would have done.

Ultimately, your performance depends entirely on your usage patterns, your pool configuration, and your type of hardware.

To the OP: If you've got one device on one SAN, mirrored to another device on another SAN, you're probably only expecting very brief outages on either SAN. As such, you probably won't see any large percentage of the online SAN change, and when the temporarily failed SAN comes back online, you can probably expect a very fast resilver.
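The trade-off above can be made concrete with rough arithmetic. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a full rebuild runs at a fixed sequential throughput, while a resilver of randomly scattered changes costs one I/O per changed block. The throughput, IOPS, and block sizes are illustrative assumptions, not figures from this thread.

```python
# Back-of-the-envelope model; every number below is an assumption.

def sequential_rebuild_seconds(disk_bytes, seq_bytes_per_sec):
    """Time to rewrite the whole disk front to back."""
    return disk_bytes / seq_bytes_per_sec


def random_resilver_seconds(disk_bytes, changed_fraction, block_bytes, iops):
    """Time to copy only the changed blocks, one random I/O per block."""
    changed_blocks = disk_bytes * changed_fraction / block_bytes
    return changed_blocks / iops


DISK = 10**12            # 1 TB disk, as in the original question
SEQ_RATE = 100 * 10**6   # assumed 100 MB/s sequential throughput
IOPS = 100               # assumed random I/Os per second for one disk

full = sequential_rebuild_seconds(DISK, SEQ_RATE)
large = random_resilver_seconds(DISK, 0.03, 128 * 1024, IOPS)
small = random_resilver_seconds(DISK, 0.03, 8 * 1024, IOPS)

print(f"full sequential rewrite:    {full:.0f} s")
print(f"3% changed, 128 KiB blocks: {large:.0f} s")
print(f"3% changed, 8 KiB blocks:   {small:.0f} s")
```

Under these assumed numbers, the same 3% of changed data resilvers faster than a full rewrite when it sits in large blocks and slower when it is spread over small ones, which is the sense in which the result depends on usage patterns and hardware.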