Hi all, I'm trying to accomplish server to server storage replication in synchronous mode where each server is a Solaris/OpenSolaris machine with its own local storage.

For Linux, I've been able to achieve what I want with DRBD but I'm hoping I can find a similar solution on Solaris so that I can leverage ZFS. It seems that solution is Sun Availability Suite (AVS)?

One of the major concerns I have is what happens when the primary storage server fails. Will the secondary take over automatically (using some sort of heartbeat mechanism)? Once the secondary node takes over, can it fail-back to the primary node once the primary node is back?

My concern is that AVS is not able to repair the primary node after it has failed, as per the conversation in this forum:

http://discuss.joyent.com/viewtopic.php?id=19096

"AVS is essentially one-way replication. If your primary fails, your secondary can take over as the primary but the disks remain in the secondary state. There is no way to reverse the replication while the secondary is acting as the primary."

Is AVS even the right solution here, or should I be looking at some other technology?

Thanks.

-Moazam
On Jun 8, 2010, at 20:17, Moazam Raja wrote:

> One of the major concerns I have is what happens when the primary storage server fails. Will the secondary take over automatically (using some sort of heartbeat mechanism)? Once the secondary node takes over, can it fail-back to the primary node once the primary node is back?
>
> My concern is that AVS is not able to repair the primary node after it has failed, as per the conversation in this forum:

Either the primary node OR the secondary node can have active writes to a volume, but NOT BOTH at the same time. Once the secondary becomes active, and has made changes, you have to replicate the changes back to the primary. Here's a good (though dated) demo of the basic functionality:

http://hub.opensolaris.org/bin/view/Project+avs/Demos

The reverse replication is in Part 2, but I recommend watching them in order for proper context. For making the secondary send data to the primary:

> -r
>
> Reverses the direction of the synchronization so the primary volume is synchronized from the secondary volume. [...]

http://docs.sun.com/app/docs/doc/819-2240/sndradm-1m

For detecting a node failure, and automatic fail over, you could use Solaris Cluster:

http://en.wikipedia.org/wiki/Solaris_Cluster
http://hub.opensolaris.org/bin/view/Community+Group+ha-clusters/
http://mail.opensolaris.org/pipermail/ha-clusters-discuss/

If you have a SAN (or iSCSI?), you can have two machines have read-write access to the same LUN using something like QFS:

http://en.wikipedia.org/wiki/QFS
Maurice Volaski
2010-Jun-09 11:40 UTC
[zfs-discuss] ZFS host to host replication with AVS?
>For Linux, I've been able to achieve what I want with DRBD but I'm hoping I can find a similar solution on Solaris so that I can leverage ZFS. It seems that solution is Sun Availability Suite (AVS)?

AVS is like DRBD, but only to a point. If the drives on your primary fail, the primary will start reading from the drives on the secondary.

However, a critical difference is that after the primary fails and the secondary takes over, you won't have a mirror until you bring the primary completely back online as the primary. You can't make it the secondary temporarily. DRBD can trivially reverse the roles on the fly, so you can run the secondary as a primary and primary as the secondary and the mirroring works in reverse automatically.

>One of the major concerns I have is what happens when the primary storage server fails. Will the secondary take over automatically (using some sort of heartbeat mechanism)? Once the secondary node takes over, can it fail-back to the primary node once the primary node is back?

When the server fails, your users would lose access because AVS deals only with storage. It has no "heartbeat" functionality for the server itself. This is similar to DRBD. Ordinarily, you run DRBD and Linux-HA (Heartbeat). Unfortunately, there are no simple, easy to implement heartbeat mechanisms for Solaris.

>My concern is that AVS is not able to repair the primary node after it has failed, as per the conversation in this forum:

The secondary can repair the primary, but the primary must be running actively as the primary. That is, unlike DRBD where the roles of primary and secondary are reversed, they are not under AVS.

>Is AVS even the right solution here, or should I be looking at some other technology?

The simplest way to achieve DRBD-like functionality is to use iSCSI. You create a zvol and iSCSI target on each server and then set up multipathed initiators. Then create a mirrored pool in ZFS out of the zvols. With some kind of heartbeat mechanism in place, you can move the (degraded) mirrored pool to the secondary.

--
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
On Wed, Jun 9, 2010 at 7:40 AM, Maurice Volaski <maurice.volaski at einstein.yu.edu> wrote:

>> For Linux, I've been able to achieve what I want with DRBD but I'm hoping I can find a similar solution on Solaris so that I can leverage ZFS. It seems that solution is Sun Availability Suite (AVS)?
>
> AVS is like DRBD, but only to a point. If the drives on your primary fail, the primary will start reading from the drives on the secondary.
>
> However, a critical difference is that after the primary fails and the secondary takes over, you won't have a mirror until you bring the primary completely back online as the primary. You can't make it the secondary temporarily. DRBD can trivially reverse the roles on the fly, so you can run the secondary as a primary and primary as the secondary and the mirroring works in reverse automatically.

Are you sure of that? This directly contradicts what David Magda said yesterday.

>> One of the major concerns I have is what happens when the primary storage server fails. Will the secondary take over automatically (using some sort of heartbeat mechanism)? Once the secondary node takes over, can it fail-back to the primary node once the primary node is back?
>
> When the server fails, your users would lose access because AVS deals only with storage. It has no "heartbeat" functionality for the server itself. This is similar to DRBD. Ordinarily, you run DRBD and Linux-HA (Heartbeat). Unfortunately, there are no simple, easy to implement heartbeat mechanisms for Solaris.

Not so. Sun/Solaris Cluster is (fairly) simple and (relatively) easy to implement and it will handle all of the requirements that Moazam has laid out so far: intelligent failover and failback of common storage between two redundant nodes.

In addition, with HA-NFS, you shouldn't have any problems with the clients - in the event of a node failure, they will pause for the few seconds it takes for HA-NFS to failover.

fpsm
On Jun 8, 2010, at 5:17 PM, Moazam Raja wrote:

> Hi all, I'm trying to accomplish server to server storage replication in synchronous mode where each server is a Solaris/OpenSolaris machine with its own local storage.
>
> For Linux, I've been able to achieve what I want with DRBD but I'm hoping I can find a similar solution on Solaris so that I can leverage ZFS. It seems that solution is Sun Availability Suite (AVS)?

Yes.

> One of the major concerns I have is what happens when the primary storage server fails. Will the secondary take over automatically (using some sort of heartbeat mechanism)? Once the secondary node takes over, can it fail-back to the primary node once the primary node is back?

This functionality is not built into AVS itself, it is usually done by a high availability manager, such as NexentaStor's Simple-HA with Auto-CDP, or a disaster recovery manager such as Oracle's Solaris Cluster Geographic Edition.

> My concern is that AVS is not able to repair the primary node after it has failed, as per the conversation in this forum:
>
> http://discuss.joyent.com/viewtopic.php?id=19096
>
> "AVS is essentially one-way replication. If your primary fails, your secondary can take over as the primary but the disks remain in the secondary state. There is no way to reverse the replication while the secondary is acting as the primary."

I'm not sure where the poster got this information, or how it seems to be at odds with the design goals of AVS. Perhaps they only looked at one piece of the puzzle and then got lost?

> Is AVS even the right solution here, or should I be looking at some other technology?

Out of the box, AVS is just one part of a total solution. What we've done at Nexenta is to automate and greatly simplify AVS administration with the Auto-CDP plugin. From an administrator's point of view, you point Auto-CDP at the local ZFS volume, point each disk in the volume at a disk on the remote machine, and push the go button. Changing the flow is done with a simple CLI command.

On top of Auto-CDP, the Simple-HA plugin manages the start of services and IP address migration between the two systems. Once again, the plumbing is hidden behind the walls and the administration interface is short and sweet.

For more information, see the docs at:
http://www.nexenta.com/corp/documentation

 -- richard

--
Richard Elling
richard at nexenta.com +1-760-896-4422
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
Maurice Volaski
2010-Jun-09 21:06 UTC
[zfs-discuss] ZFS host to host replication with AVS?
>Are you sure of that? This directly contradicts what David Magda said yesterday.

Yes. Just how is what he said contradictory?

>> Unfortunately, there are no simple, easy to implement heartbeat mechanisms
>> for Solaris.
>
>Not so. Sun/Solaris Cluster is (fairly) simple and (relatively) easy to implement and it will handle all of the requirements that Moazam

To be fair, I didn't actually try it, but, for one thing, though I might be wrong, it must be compiled manually to work with developer builds. Rather, ironically, perhaps, I cobbled together some bash scripts that perform basic heartbeat functionality.

--
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
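
For readers wondering what "basic heartbeat functionality" amounts to, here is a minimal sketch in Python. It is not the bash scripts mentioned above, and it is not part of AVS, Solaris Cluster, or Linux-HA. The class name `HeartbeatMonitor`, the `max_missed` threshold of 3, and the idea of feeding it one boolean per probe are all assumptions made for illustration; how the probe itself is performed (ping, TCP connect, and so on) is left out.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Counts consecutive missed heartbeats from a peer (hypothetical helper)."""

    max_missed: int = 3      # assumed threshold before the peer is declared dead
    missed: int = 0          # consecutive probes without a reply
    peer_alive: bool = True  # current belief about the peer

    def record(self, reply_received: bool) -> bool:
        """Record one probe result; return True only when failover should start."""
        if reply_received:
            # Any reply resets the counter and marks the peer as alive again.
            self.missed = 0
            self.peer_alive = True
            return False
        self.missed += 1
        if self.peer_alive and self.missed >= self.max_missed:
            # Threshold crossed: signal failover once, not on every later miss.
            self.peer_alive = False
            return True
        return False


if __name__ == "__main__":
    monitor = HeartbeatMonitor(max_missed=3)
    # Simulated probe results; a real script would probe the peer on a timer.
    for i, ok in enumerate([True, False, False, True, False, False, False, False]):
        if monitor.record(ok):
            print(f"probe {i}: peer declared dead, start failover")
```

Requiring several consecutive misses before acting is a common way to avoid failing over on a single dropped packet. A sketch like this does nothing about split-brain, which is one reason a full cluster framework is usually preferred.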
Maurice Volaski
2010-Jun-10 03:03 UTC
[zfs-discuss] ZFS host to host replication with AVS?
>I'm not sure where the poster got this information, or how it seems to be at odds with the design goals of AVS. Perhaps they only looked at one piece of the puzzle and then got lost?

I wrote it :-) It's right there in the manual, in fact:
http://docs.sun.com/source/819-6148-10/chap4.html#pgfId-1009132

>Changing the flow is done with a simple CLI command.

The point I was trying to get across is that the primary and secondary roles under AVS are fixed. Under DRBD, if a primary has failed and the secondary has taken over, the roles reverse (under the direction of Linux HA). The secondary becomes the acting primary. When the real primary is restarted, it will be the acting secondary and the mirroring changes direction. This happens on the fly.

AVS cannot do that at all! The primary must fully be brought back online as the real, active primary for the synchronization to be able to reverse direction. That means until the primary can be switched to that state, which may not be immediately practical if users need to be disconnected, there won't be any replication. (ZFS mirroring of iSCSI volumes works like DRBD in this regard.)

Unless, of course, you're implying Nexenta has done something to AVS' code to change this behavior.

--
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
On Wed, Jun 9, 2010 at 5:06 PM, Maurice Volaski <maurice.volaski at einstein.yu.edu> wrote:

>> Are you sure of that? This directly contradicts what David Magda said yesterday.
>
> Yes. Just how is what he said contradictory?

To quote from his message:

> Either the primary node OR the secondary node can have active writes to a volume, but NOT BOTH at the same time. Once the secondary becomes active, and has made changes, you have to replicate the changes back to the primary. Here's a good (though dated) demo of the basic functionality:
>
> http://hub.opensolaris.org/bin/view/Project+avs/Demos
>
> The reverse replication is in Part 2, but I recommend watching them in order for proper context. For making the secondary send data to the primary:
>
>> -r
>>
>> Reverses the direction of the synchronization so the primary volume is synchronized from the secondary volume. [...]
>>
>> http://docs.sun.com/app/docs/doc/819-2240/sndradm-1m

From your message:

> However, a critical difference is that after the primary fails and the secondary takes over, you won't have a mirror until you bring the primary completely back online as the primary. You can't make it the secondary temporarily. DRBD can trivially reverse the roles on the fly, so you can run the secondary as a primary and primary as the secondary and the mirroring works in reverse automatically.

Maybe there is another way to read those, but it looks to me like David says you can trivially swap the roles of the nodes using the '-r' switch (and he provides a link to the documentation), and you say that you can't trivially swap the roles of the nodes.

[...]

>>> Unfortunately, there are no simple, easy to implement heartbeat mechanisms
>>> for Solaris.
>>
>> Not so. Sun/Solaris Cluster is (fairly) simple and (relatively) easy to implement and it will handle all of the requirements that Moazam
>
> To be fair, I didn't actually try it, but, for one thing, though I might be wrong, it must be compiled manually to work with developer builds. Rather, ironically, perhaps, I cobbled together some bash scripts that perform basic heartbeat functionality.

I pretty much stick with the production release Solaris, not developer builds. No compilation is necessary for Cluster on Solaris. It's a fairly straightforward pkg install and a few configuration commands.
On Jun 10, 2010, at 03:50, Fredrich Maney wrote:

> David Magda wrote:
>
>> Either the primary node OR the secondary node can have active writes to a volume, but NOT BOTH at the same time. Once the secondary becomes active, and has made changes, you have to replicate the changes back to the primary. Here's a good (though dated) demo of the basic functionality:
>>
>> http://hub.opensolaris.org/bin/view/Project+avs/Demos

Maurice, watch the two parts of the demo. It will show how things work (at least with ZFS).
Maurice Volaski
2010-Jun-10 16:57 UTC
[zfs-discuss] ZFS host to host replication with AVS?
>Maybe there is another way to read those, but it looks to me like David says you can trivially swap the roles of the nodes using the '-r' switch (and he provides a link to the documentation), and you say that you can't trivially swap the roles of the nodes.

The -r switch temporarily reverses the direction of data flow from the secondary to the primary to sync up an outdated primary. After that, the data flow reverts back to primary to secondary. Reversing the roles requires many more steps (and time)...

http://docs.sun.com/source/819-6148-10/chap4.html#pgfId-1009132

--
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
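
The distinction being argued in this thread (a temporary reversal of sync direction versus a swap of roles) can be shown with a small toy model in Python. This models only the behaviour as described in the messages above; it is not AVS or DRBD code and calls no real API. The class names `AvsLikePair` and `DrbdLikePair`, their methods, and the host names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AvsLikePair:
    """Toy model: roles are fixed; only the sync direction flips temporarily."""

    primary: str
    secondary: str
    direction: str = "primary->secondary"

    def reverse_sync(self) -> None:
        # Models the -r behaviour described above: data flows from the
        # secondary to the primary while an outdated primary catches up.
        self.direction = "secondary->primary"

    def sync_complete(self) -> None:
        # Once the primary is up to date, replication reverts to normal.
        self.direction = "primary->secondary"


@dataclass
class DrbdLikePair:
    """Toy model: the nodes swap roles, so mirroring direction follows them."""

    primary: str
    secondary: str

    def swap_roles(self) -> None:
        self.primary, self.secondary = self.secondary, self.primary


if __name__ == "__main__":
    avs = AvsLikePair("hostA", "hostB")
    avs.reverse_sync()
    print(avs.primary, avs.direction)   # hostA is still the primary
    avs.sync_complete()
    print(avs.primary, avs.direction)

    drbd = DrbdLikePair("hostA", "hostB")
    drbd.swap_roles()
    print(drbd.primary, drbd.secondary)  # roles have actually changed
```

In the first model the primary never changes identity, which is the point made about AVS; in the second, the surviving node simply becomes the primary and the returning node rejoins as the secondary.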