Matthieu Fecteau
2010-Oct-26 14:21 UTC
[zfs-discuss] Newbie question : snapshots, replication and recovering failure of Site B
Hi,

I'm planning to use the replication scripts on this page: http://www.infrageeks.com/groups/infrageeks/wiki/8fb35/zfs_autoreplicate_script.html

It uses Time Slider (other methods are possible) to take snapshots, zfs send/receive to replicate, and another script to clean up old snapshots.

My question: if there is no longer a common snapshot between Site A and Site B, how can we replicate again? (Example: Site B has a power failure, and Site A cleans up its snapshots before Site B is brought back, so the two sites no longer share any snapshot.)

I'm thinking of using OpenSolaris for my 30TB storage (replicated to another 30TB). If a situation like this happens, will I need to erase everything on Site B and start all over again? Or is there a more efficient (faster) way? If so, how?

Thank you
erik.ableson
2010-Oct-26 16:32 UTC
[zfs-discuss] Newbie question : snapshots, replication and recovering failure of Site B
On 26 Oct 2010, at 16:21, Matthieu Fecteau wrote:

> Hi,
>
> I'm planning to use the replication scripts on this page:
> http://www.infrageeks.com/groups/infrageeks/wiki/8fb35/zfs_autoreplicate_script.html
>
> It uses Time Slider (other methods are possible) to take snapshots, zfs send/receive to replicate, and another script to clean up old snapshots.
>
> My question: if there is no longer a common snapshot between Site A and Site B, how can we replicate again? (Example: Site B has a power failure, and Site A cleans up its snapshots before Site B is brought back, so the two sites no longer share any snapshot.)
>
> I'm thinking of using OpenSolaris for my 30TB storage (replicated to another 30TB). If a situation like this happens, will I need to erase everything on Site B and start all over again? Or is there a more efficient (faster) way? If so, how?

That's the risk of letting Time Slider manage your snapshot deletion... But there are a few ways around this, the first being to avoid using volatile snapshots for replication (filter on daily, weekly or monthly).

The smart people developing ZFS also noted this as an issue, and in the newer builds (I don't remember in which one this showed up) you can put a hold on a snapshot, so that you must explicitly remove the hold before the snapshot can be deleted. More details on p203 of the ZFS Admin Guide 2010.01 (http://dlc.sun.com/pdf/817-2271/817-2271.pdf).

Or you can roll your own snapshot schedule based on your specific requirements. There are a couple of other scripts on the page that you can use in your own scripts to handle creation and cleanup of snapshots. I use an hourly schedule on some systems, daily on others and weekly for specific off-site backup replication. It all depends on the environment.

In all cases, if you run into a serious issue where one site is going to be offline for an extended period, you'll want to stop your snapshot cleanup routine. The hiccup is that with Time Slider the same process manages both creation and deletion, so disabling Time Slider also stops new snapshots from being taken. Given the amount of data you're looking at, I would seriously consider writing your own snapshot taking/deleting scripts so that you have a little more control over them.

That said, I seem to recall reading that Time Slider was going to build in the send/recv functions as an option, but I never looked into that any further.

Cheers,

Erik
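
For illustration, the two suggestions above (replicate only the non-volatile snapshots, and put a hold on the one the other site depends on) could be sketched roughly as follows in Python. This is only a sketch under assumptions: the snapshot naming pattern (`zfs-auto-snap:daily-...`), the hold tag `replication`, and the helper names are hypothetical and not taken from the linked scripts. The functions only build `zfs hold` / `zfs release` command lines; running them is left to the caller.

```python
import re

# Assumption: Time Slider style names such as
# "tank/data@zfs-auto-snap:daily-2010-10-26-00:00". Adjust to your own naming.
STABLE_SCHEDULES = ("daily", "weekly", "monthly")
HOLD_TAG = "replication"  # hypothetical tag name, pick your own


def is_stable(snapshot):
    """Return True if the snapshot belongs to a non-volatile schedule."""
    _, _, snap = snapshot.partition("@")
    match = re.match(r"zfs-auto-snap[:_]([a-z]+)-", snap)
    return bool(match) and match.group(1) in STABLE_SCHEDULES


def hold_commands(new_snapshot, previous_snapshot=None, tag=HOLD_TAG):
    """Build the commands that move the hold to the newest replicated snapshot.

    The hold is placed on the new snapshot first, and only then released
    from the previous one, so a held common snapshot always exists.
    """
    commands = [["zfs", "hold", tag, new_snapshot]]
    if previous_snapshot:
        commands.append(["zfs", "release", tag, previous_snapshot])
    return commands


if __name__ == "__main__":
    snaps = [
        "tank/data@zfs-auto-snap:frequent-2010-10-26-14:15",
        "tank/data@zfs-auto-snap:daily-2010-10-26-00:00",
    ]
    candidates = [s for s in snaps if is_stable(s)]
    for command in hold_commands(candidates[-1]):
        print(" ".join(command))
```

Each command list could be passed to `subprocess.run(command, check=True)` once a replication run has succeeded. While the hold is in place, a cleanup job that tries to destroy that snapshot should fail rather than silently remove the last common snapshot.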
Tuomas Leikola
2010-Oct-27 11:21 UTC
[zfs-discuss] Newbie question : snapshots, replication and recovering failure of Site B
On Tue, Oct 26, 2010 at 5:21 PM, Matthieu Fecteau <matthieufecteau at gmail.com> wrote:

> My question: if there is no longer a common snapshot between Site A and Site B, how can we replicate again? (Example: Site B has a power failure, and Site A cleans up its snapshots before Site B is brought back, so the two sites no longer share any snapshot.)

In that case you cannot send incrementals and need to transfer everything again. It is advisable not to delete snapshots during a backup outage.

-- 
Tuomas
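
As a rough illustration of that rule, a replication script could check for a common snapshot before deciding between an incremental and a full send. The Python sketch below assumes each site's snapshot names are available oldest first (for example from `zfs list -H -t snapshot -o name -s creation`) and that matching by name is good enough. The function names and the dataset `tank/data` are hypothetical.

```python
def snapshot_names(listing):
    """Parse `zfs list -H -t snapshot -o name` output into the parts after '@'."""
    return [line.split("@", 1)[1] for line in listing.splitlines() if "@" in line]


def latest_common(source, target):
    """Return the newest snapshot name present on both sites, or None."""
    on_target = set(target)
    for name in reversed(source):  # source is ordered oldest first
        if name in on_target:
            return name
    return None


def send_command(dataset, source, target):
    """Build the `zfs send` command line for the next replication run.

    Returns None when there is nothing to send.
    """
    if not source:
        return None
    newest = f"{dataset}@{source[-1]}"
    base = latest_common(source, target)
    if base is None:
        # No common snapshot: only a full stream is possible.
        return ["zfs", "send", newest]
    if base == source[-1]:
        return None  # target already has the newest snapshot
    return ["zfs", "send", "-i", f"{dataset}@{base}", newest]


if __name__ == "__main__":
    site_a = ["daily-1", "daily-2", "daily-3"]
    site_b = ["daily-1", "daily-2"]
    print(send_command("tank/data", site_a, site_b))
```

When `latest_common` returns None, the script falls back to a full stream, which for a 30TB pool is exactly the situation worth avoiding by keeping a common snapshot on both sides.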