Paul B. Henson
2007-Oct-12 20:24 UTC
[zfs-discuss] practicality of zfs send/receive for failover
We've been evaluating ZFS as a possible enterprise file system for our campus. Initially, we were considering one large cluster, but it doesn't look like that will scale to meet our needs. So now we are thinking about breaking our storage across multiple servers, probably three.

However, I don't necessarily want to incur the expense and hassle of maintaining three clusters, so I'm thinking I might have three standalone servers instead. If one of them happens to break, we're only down 1/3 of our files, not all of them. Given our budget, that's probably an acceptable compromise. On the other hand, it would be nice to have some level of redundancy, so I'm toying with the idea of having each server be primary for some amount of storage, and secondary for a different set of storage. Each server would use zfs send to replicate snapshots to its backup server.

I've read a number of threads and blog posts discussing zfs send/receive and its applicability in such an implementation, but I'm curious if anyone has actually done something like that in practice, and if so how well it worked.

What authentication/authorization was used to transfer the zfs snapshots between servers? I'm thinking about using ssh with public-key authentication over an internal private network, with the servers connected via different ethernet interfaces than the ones facing the world and actually serving files. Does zfs send/receive have to be done with root privileges, or can RBAC or some other mechanism be used so a lower privileged account could be used?

In the various threads I read about this type of failover, there was some issue about marking the filesystems readonly on the slave, or else changes would cause snapshots to fail? Supposedly there was some feature added to zfs receive to rectify this problem; did that make it into S10U4, or is that still only in the development version?

Did you have automatic or manual failover? I'm thinking about having a manual failover process: if the process were automatic, then given that the replication is only one way, a failover would mean the secondary server started providing service, and updates would happen there that would not be on the primary server if it suddenly came back to life and took over again.

How did you implement the failover at the network level? DNS change? Virtual IP address switched from one server to the other?

Thanks much for any feedback...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
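For illustration, a minimal sketch of the replication step Paul describes, assuming example dataset names and a hypothetical private-network hostname (backup1-priv) with ssh public-key authentication already set up:

    # take today's snapshot on the primary, then send only the changes
    # since yesterday's snapshot to the backup server over the private
    # replication interface (all names here are placeholders)
    zfs snapshot tank/home@2007-10-12
    zfs send -i tank/home@2007-10-11 tank/home@2007-10-12 | \
        ssh backup1-priv zfs recv tank/home

Whether this can run as a non-root account, and how the receiving side copes with local changes, are exactly the questions the replies below address.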
Matthew Ahrens
2007-Oct-12 20:44 UTC
[zfs-discuss] practicality of zfs send/receive for failover
Paul B. Henson wrote:
> Does zfs send/receive have to be done with root privileges, or can RBAC
> or some other mechanism be used so a lower privileged account could be
> used?

You can use delegated administration ("zfs allow someone send pool/fs"). This is in snv_69. RBAC is much more coarse-grained, but you could use it too.

> In the various threads I read about this type of failover, there was some
> issue about marking the filesystems readonly on the slave, or else changes
> would cause snapshots to fail? Supposedly there was some feature added to
> zfs receive to rectify this problem; did that make it into S10U4, or is
> that still only in the development version?

You can do "zfs recv -F", which will discard any changes made since the most recent snapshot, in order to perform the receive. This is in snv_48 and s10u4.

--matt
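As a concrete sketch of the delegation Matt describes (the account name "replicator" and the dataset are illustrative):

    # on the sending host: let an unprivileged account snapshot and send
    zfs allow replicator snapshot,send tank/home

    # on the receiving host: receiving also requires create and mount
    zfs allow replicator create,mount,receive tank/home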
Vincent Fox
2007-Oct-13 01:04 UTC
[zfs-discuss] practicality of zfs send/receive for failover
So the problem in the zfs send/receive thing is: what if your network glitches out during the transfers? We have these once a day due to some as-yet-undiagnosed switch problem, a chop-out of 50 seconds or so, which is enough to trip all our IPMP setups and enough to abort SSH transfers in progress. Just saying you need error-checking to account for these.

The transfers in my testing seemed fairly slow. I was doing a full send and receive, not incremental, for some 400 gigs, and it took over 24 hours, at which time I lost the connection and gave up on the idea. Once you were just down to incrementals it probably wouldn't be so bad.
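A hedged sketch of the kind of error checking Vincent suggests — retrying the pipeline when the link drops rather than assuming it succeeded (hostnames, datasets, and the retry policy are illustrative only):

    #!/bin/sh
    # retry a send/recv a few times if the ssh transfer aborts mid-stream;
    # zfs recv discards an incomplete stream, so a clean retry is safe
    SNAP=tank/home@weekly
    for attempt in 1 2 3; do
        if zfs send "$SNAP" | ssh backup-priv zfs recv -F tank/home; then
            exit 0
        fi
        echo "transfer failed (attempt $attempt), retrying" >&2
        sleep 300
    done
    echo "giving up after 3 attempts" >&2
    exit 1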
Richard Elling
2007-Oct-13 16:37 UTC
[zfs-discuss] practicality of zfs send/receive for failover
Vincent Fox wrote:
> So the problem in the zfs send/receive thing is: what if your network
> glitches out during the transfers?

zfs doesn't know. It depends on how the pipe tolerates breakage.

> We have these once a day due to some as-yet-undiagnosed switch problem,
> a chop-out of 50 seconds or so, which is enough to trip all our IPMP
> setups and enough to abort SSH transfers in progress.

See in.mpathd(1m) for details on the algorithm and how to tune the FAILURE_DETECTION_TIME.

> Just saying you need error-checking to account for these. The transfers
> in my testing seemed fairly slow. I was doing a full send and receive,
> not incremental, for some 400 gigs, and it took over 24 hours, at which
> time I lost the connection and gave up on the idea. Once you were just
> down to incrementals it probably wouldn't be so bad.

The next time this happens, could you collect iostat(1m) data from the sending host? I prefer something like "iostat -xnzT d 10". I'd like to see if we are actually read bound on the sending host, which should be the normal case.

-- richard
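The tunable Richard refers to lives in /etc/default/mpathd on Solaris 10; a sketch of raising it above the observed ~50-second outage (the value shown is illustrative, not a recommendation):

    # /etc/default/mpathd (excerpt); the default is 10000 ms
    FAILURE_DETECTION_TIME=60000

    # signal in.mpathd to re-read its configuration
    pkill -HUP in.mpathd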
Robert Milkowski
2007-Oct-15 10:53 UTC
[zfs-discuss] practicality of zfs send/receive for failover
Hello Paul,

If you don't need support then Sun Cluster 3.2 is free, and it works with ZFS. What you could do is set up a 3-node cluster with 3 resource groups, each assigned a different primary node and with failback set to true. Of course in that config the storage requirements will be different.

Still, using zfs send|recv as a backup to different storage would be a good idea.

--
Best regards,
Robert Milkowski mailto:rmilkowski at task.gda.pl http://milek.blogspot.com
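A rough sketch of the layout Robert suggests, assuming Sun Cluster 3.2's clresourcegroup command and placeholder node and group names:

    # three resource groups, each preferring a different primary node,
    # with Failback=true so service returns when the primary recovers
    clresourcegroup create -n node1,node2 -p Failback=true storage-rg1
    clresourcegroup create -n node2,node3 -p Failback=true storage-rg2
    clresourcegroup create -n node3,node1 -p Failback=true storage-rg3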
Paul B. Henson
2007-Oct-17 00:17 UTC
[zfs-discuss] practicality of zfs send/receive for failover
On Fri, 12 Oct 2007, Matthew Ahrens wrote:

> You can use delegated administration ("zfs allow someone send pool/fs").
> This is in snv_69. RBAC is much more coarse-grained, but you could use
> it too.

Out of curiosity, what kind of things are going to be added via patches to S10u4 vs. things that are going to need to wait for u5? I keep finding cool stuff I want that's not in u4 yet, and I'm not really very patient ;).

> You can do "zfs recv -F", which will discard any changes made since the
> most recent snapshot, in order to perform the receive. This is in snv_48
> and s10u4.

Excellent, at least that one is in the production release already.

Thanks...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
Paul B. Henson
2007-Oct-17 00:19 UTC
[zfs-discuss] practicality of zfs send/receive for failover
On Fri, 12 Oct 2007, Paul B. Henson wrote:

> I've read a number of threads and blog posts discussing zfs send/receive
> and its applicability in such an implementation, but I'm curious if
> anyone has actually done something like that in practice, and if so how
> well it worked.

So I didn't hear from anyone on this thread actually running such an implementation in production? Could someone maybe comment on a theoretical level :) whether this would be realistic for multiple terabytes, or if I should just give up on it?

Thanks...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
Richard Elling
2007-Oct-17 00:36 UTC
[zfs-discuss] practicality of zfs send/receive for failover
Paul B. Henson wrote:
> On Fri, 12 Oct 2007, Paul B. Henson wrote:
>
>> I've read a number of threads and blog posts discussing zfs send/receive
>> and its applicability in such an implementation, but I'm curious if
>> anyone has actually done something like that in practice, and if so how
>> well it worked.
>
> So I didn't hear from anyone on this thread actually running such an
> implementation in production? Could someone maybe comment on a theoretical
> level :) whether this would be realistic for multiple terabytes, or if I
> should just give up on it?

It should be more reasonable to use ZFS send/recv than a dumb volume block copy. It should be on the same order of goodness as rsync-style copying. I use send/recv quite often, but my wife doesn't have a TByte of pictures (yet :-)

-- richard
Matthew Ahrens
2007-Oct-17 00:46 UTC
[zfs-discuss] practicality of zfs send/receive for failover
Richard Elling wrote:
> Paul B. Henson wrote:
>> On Fri, 12 Oct 2007, Paul B. Henson wrote:
>>
>>> I've read a number of threads and blog posts discussing zfs send/receive
>>> and its applicability in such an implementation, but I'm curious if
>>> anyone has actually done something like that in practice, and if so how
>>> well it worked.
>>
>> So I didn't hear from anyone on this thread actually running such an
>> implementation in production? Could someone maybe comment on a theoretical
>> level :) whether this would be realistic for multiple terabytes, or if I
>> should just give up on it?
>
> It should be more reasonable to use ZFS send/recv than a dumb volume
> block copy. It should be on the same order of goodness as rsync-style
> copying. I use send/recv quite often, but my wife doesn't have a TByte
> of pictures (yet :-)

Incremental zfs send/recv is actually orders of magnitude "more goodness" than rsync (due to much faster finding of changed files).

I know of customers who are using send|ssh|recv to replicate entire thumpers across the country, in production. I'm sure they'll speak up here if/when they find this thread...

--matt
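A sketch of the incremental replication Matt is comparing to rsync (host and dataset names are placeholders):

    # only the blocks that changed between the two snapshots cross the
    # wire; there is no per-file tree walk as with rsync
    zfs snapshot tank/home@today
    zfs send -i tank/home@yesterday tank/home@today | \
        ssh thumper2 zfs recv -F tank/home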
Robert Milkowski
2007-Oct-17 09:37 UTC
[zfs-discuss] practicality of zfs send/receive for failover
Hello Matthew,

Wednesday, October 17, 2007, 1:46:02 AM, you wrote:

MA> Incremental zfs send/recv is actually orders of magnitude "more goodness"
MA> than rsync (due to much faster finding of changed files).

MA> I know of customers who are using send|ssh|recv to replicate entire
MA> thumpers across the country, in production. I'm sure they'll speak up
MA> here if/when they find this thread...

I know such an environment too, however just across a server room :)

Is it perfect? No... but still, compared to "legacy" backup it's much better in terms of performance and much worse in terms of manageability.

--
Best regards,
Robert Milkowski mailto:rmilkowski at task.gda.pl http://milek.blogspot.com
Paul B. Henson
2007-Oct-17 18:57 UTC
[zfs-discuss] practicality of zfs send/receive for failover
On Tue, 16 Oct 2007, Matthew Ahrens wrote:

> I know of customers who are using send|ssh|recv to replicate entire
> thumpers across the country, in production. I'm sure they'll speak up
> here if/when they find this thread...

Ah, that's who I'd like to hear from :)... Thanks for the secondhand information though...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768