Ralf Ramge
2007-Jul-12  05:41 UTC
[zfs-discuss] [AVS] Question concerning reverse synchronization of a zpool
Hi,

I've been struggling for several weeks now to get a stable ZFS replication using Solaris 10 11/06 (current patches) and AVS 4.0. We tried it on VMware first and ended up in kernel panics en masse (yes, we read Jim Dunham's blog articles :-). Now we are trying on the real thing, two X4500 servers. Well, I have no trouble reproducing our kernel panics there, too ... but I think I learned some important things as well. One problem still remains, though.

I have a zpool on host A. Replication to host B works fine.

* "zpool export tank" on the primary - works.
* "sndradm -d" on both servers - works (paranoia mode)
* "zpool import <id>" on the secondary - works.

So far, so good. I change the contents of the file system, add some files, delete some others ... no problems. The secondary is in production use now, everything is fine.

Okay, let's imagine I switched to the secondary host because I had a problem with the primary. Now it's repaired, and I want my redundancy back.

* "sndradm -E -f ...." on both hosts - works.
* "sndradm -u -r" on the primary for refreshing the primary - works. `nicstat` shows me a bit of traffic.

Good, let's switch back to the primary. Current status: the zpool is imported on the secondary and NOT imported on the primary.

* "zpool export tank" on the secondary - *kernel panic*

Sadly, the machine dies fast, so I don't see the kernel panic with `dmesg`. And disabling the replication again later and mounting the zpool on the primary again shows me that the update sync didn't take place; the file system changes I made on the secondary weren't replicated. Exporting the zpool on the secondary works *after* the system has rebooted.

I use slices for the zpool, not LUNs, because I think many of my problems were caused by exclusive locking, but it doesn't help with this one.

Questions:

a) I don't understand why the kernel panics at that moment. The zpool isn't mounted on both systems, the zpool itself seems to be fine after a reboot ... and switching the primary and secondary hosts just for resyncing seems to force a full sync, which isn't an option.

b) I'll try a "sndradm -m -r" the next time ... but I'm not sure if I like that thought. I would accept this if I replaced the primary host with another server, but having to do a 24 TB full sync just because the replication itself had been disabled for a few minutes would be hard to swallow. Or did I do something wrong?

c) What performance can I expect from an X4500 with a 40-disk zpool when using slices, compared to LUNs? Any experiences?

And another thing: I did some experiments with zvols, because I wanted to make disasters and the AVS configuration itself easier to handle. There won't be a full sync after replacing a disk, because AVS doesn't "see" that a hot spare is being used, and hot spares won't be replicated to the secondary host either, even though the original drive on the secondary never failed. I used the zvol with UFS, and this kind of "hardware RAID controller emulation by ZFS" works pretty well; only the performance went off a cliff. Sunsolve told me that this is a flushing problem and that there's a workaround in Nevada build 53 and higher. Has somebody done a comparison? Can you share some experiences? I only have a few days left and I don't want to waste time on installing Nevada for nothing ...

Thanks,
Ralf

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963
ralf.ramge at webde.de - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484

Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren
Ralf Ramge
2007-Jul-12  12:54 UTC
[zfs-discuss] [AVS] Question concerning reverse synchronization of a zpool
Ralf Ramge wrote:
> Questions:
>
> a) I don't understand why the kernel panics at the moment. the zpool
> isn't mounted on both systems, the zpool itself seems to be fine after a
> reboot ... and switching the primary and secondary hosts just for
> resyncing seems to force a full sync, which isn't an option.
>
> b) I'll try a "sndradm -m -r" the next time ... but I'm not sure if I
> like that thought. I would accept this if I replaced the primary host
> with another server, but having to do a 24 TB full sync just because the
> replication itself had been disabled for a few minutes would be hard to
> swallow. Or did I do something wrong?

I've answered these questions myself in the meantime (with a nice employee of Sun Hamburg giving me the hint). For Google: during a reverse sync, neither side of the replication is allowed to have the zpool imported, because after the reverse sync finishes, SNDR enters replication mode. This renders reverse syncs useless for HA scenarios; switch primary & secondary instead.

> c) What performance can I expect from a X4500, 40 disks zpool, when
> using slices, compared to LUNs? Any experiences?

Any input on this question will still be appreciated :-)

--
Ralf Ramge
Jim Dunham
2007-Jul-12  22:01 UTC
[zfs-discuss] [AVS] Question concerning reverse synchronization of a zpool
Ralf,

> Ralf Ramge wrote:
>> Questions:
>>
>> a) I don't understand why the kernel panics at the moment. the zpool
>> isn't mounted on both systems, the zpool itself seems to be fine
>> after a reboot ... and switching the primary and secondary hosts
>> just for resyncing seems to force a full sync, which isn't an option.
>>
>> b) I'll try a "sndradm -m -r" the next time ... but I'm not sure if I
>> like that thought. I would accept this if I replaced the primary host
>> with another server, but having to do a 24 TB full sync just
>> because the replication itself had been disabled for a few minutes
>> would be hard to swallow. Or did I do something wrong?
>
> I've answered these questions myself in the meantime (with a nice
> employee of Sun Hamburg giving me the hint). For Google: during a
> reverse sync, neither side of the replication is allowed to have the
> zpool imported, because after the reverse sync finishes, SNDR enters
> replication mode. This renders reverse syncs useless for HA scenarios;
> switch primary & secondary instead.

This is close, but not the actual scenario, and the actual answer is much better than one would expect.

Just prior to issuing a reverse sync, neither side of the replication is allowed to have the zpool imported. This step is VERY IMPORTANT: ZFS will detect the SNDR replicated writes, and since these writes were not issued by the local ZFS, the checksums won't match, so ZFS will assume the writes are some form of data corruption and panic the system.

Immediately after issuing a reverse sync, the zpool(s) on the SNDR primary node can be imported, without waiting. Although there may be minutes, hours or days of change that need to be replicated from the SNDR secondary volumes to the SNDR primary volumes, SNDR supports on-demand pull of unreplicated changes. This improves one's MTTR (Mean Time To Recover), at the cost of performance for the duration of the reverse sync, where the duration is a function of the amount of change that happened while running on the secondary node's volumes.

Also, any changes made to the primary volumes will be replicated to the secondary volumes during this period, so that the very instant the reverse synchronization operation is complete, both sides of the replica will be identical.

Jim Dunham
Solaris, Storage Software Group
Sun Microsystems, Inc.
1617 Southwood Drive
Nashua, NH 03063
Phone x24042 / 781-442-4042
Email: James.Dunham at Sun.COM
http://blogs.sun.com/avs
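
The ordering Jim describes for switching back to the primary can be sketched as a dry-run shell function. This is only an illustration under assumptions: it prints the steps and does not execute them, the pool name "tank" is taken from the first post, the function name failback_plan and the POOL variable are made up for this sketch, and "sndradm -u -r" is shown exactly as written in the thread, without any set arguments a real configuration may need.

```shell
#!/bin/sh
# Dry-run sketch: prints the failback steps in order; nothing is executed.
# POOL is a hypothetical variable; "tank" is the pool name used in the thread.
POOL="${POOL:-tank}"

failback_plan() {
    # 1. Neither node may have the pool imported when the reverse sync is issued,
    #    so export it on the secondary first.
    echo "secondary# zpool export ${POOL}"
    # 2. Issue the reverse update sync on the primary (flags as used in the thread).
    echo "primary#   sndradm -u -r"
    # 3. Per Jim's reply, the primary can import right away; SNDR pulls
    #    not-yet-replicated changes on demand while the sync is running.
    echo "primary#   zpool import ${POOL}"
}

failback_plan
```

The point of the sketch is the order: the export on the secondary comes before the reverse sync is issued, and the import on the primary comes after it. In the sequence from the first post, the reverse sync was issued while the pool was still imported on the secondary.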