Just make SURE the other host is actually, truly DEAD! If for some reason it's simply wedged, or you have lost console access but hostA is still "live", then you can end up with two systems having access to the same ZFS pool. I have done this in testing, with two hosts accessing the same pool, and the result is catastrophic pool corruption. I use a simple method: if I think hostA is dead, I call the operators and get them to pull the power cords out of it just to be certain. Then I force the import on hostB with certainty.
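For illustration, a minimal sketch of that forced import, assuming a pool named "tank" (the name is hypothetical) and assuming hostA has already been confirmed powered off:

  hostB# zpool import          # list pools visible on the shared storage
  hostB# zpool import -f tank  # -f overrides the "pool was last accessed by another system" safety check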
Vincent Fox wrote:
> Just make SURE the other host is actually, truly DEAD!
>
> If for some reason it's simply wedged, or you have lost console access but hostA is still "live", then you can end up with two systems having access to the same ZFS pool.
>
> I have done this in testing, with two hosts accessing the same pool, and the result is catastrophic pool corruption.
>
> I use a simple method: if I think hostA is dead, I call the operators and get them to pull the power cords out of it just to be certain. Then I force the import on hostB with certainty.

This is a common cluster scenario: you need to make sure the other node is dead, so you force that result. In Lustre setups they recommend a STONITH (Shoot The Other Node In The Head) approach. They use a combination of a heartbeat setup like the one described here:

http://www.linux-ha.org/Heartbeat

and then something like the powerman framework to 'kill' the offline node.

Perhaps those things could be made to run on Solaris if they don't already.

-tim
Tim Haley wrote:
> Vincent Fox wrote:
>> Just make SURE the other host is actually, truly DEAD!
>> [...]
>
> This is a common cluster scenario: you need to make sure the other node is dead, so you force that result. In Lustre setups they recommend a STONITH (Shoot The Other Node In The Head) approach. They use a combination of a heartbeat setup like the one described here:
>
> http://www.linux-ha.org/Heartbeat
>
> and then something like the powerman framework to 'kill' the offline node.
>
> Perhaps those things could be made to run on Solaris if they don't already.

Of course, Solaris Cluster (and the corresponding open source effort, Open HA Cluster) manages cluster membership and data access. We also use SCSI reservations, so that a rogue node cannot even see the data. IMHO, if you do this without reservations, then you are dancing with the devil in the details.
 -- richard
Thanks to all for your comments and for sharing your experiences. In my setup the pools are split and then NFS-mounted to other nodes, mostly Oracle DB boxes. These mounts will provide areas for RMAN flash backups to be written. If I lose connectivity to any host, I will swing the LUNs over to the alternate host and the NFS mount will be repointed on the Oracle node, so hopefully we should be safe with regard to pool corruption. Thanks again.

Max
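A rough sketch of that failover sequence, with every name hypothetical (pool "rman_pool", hosts hostA/hostB, Oracle node oranode); if hostA is unreachable, the export step is skipped and the import must be forced:

  hostA# zpool export rman_pool                        # clean handoff, only if hostA is still reachable
  hostB# zpool import -f rman_pool                     # -f is needed when hostA never exported the pool
  hostB# zfs set sharenfs=on rman_pool/rman            # re-share the dataset from its new owner
  oranode# umount /rman
  oranode# mount -F nfs hostB:/rman_pool/rman /rman    # repoint the Oracle node at hostB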
Just curiosity, why don't you use SC?

Leal.
Good question. Well, the hosts are NetBackup media servers. The idea behind the design is that we stream the RMAN data to disk via NFS mounts, and then write it to tape during the day. With the SAN-attached disks sitting on these hosts, and with disk storage units configured for NBU, the data stream only hits the network once, at a quiet time. In this instance the need for high availability is not really there. The real driver behind this, of course, is probably the same as for most companies: cost.

BTW, another question if I may. I noticed that when mounting the ZFS datasets onto individual nodes I have to change the permissions to 777 to allow the oracle user to write to them. It was my understanding that sharenfs=on allows rw by default. Am I doing something wrong here?

Again, all help much appreciated.

Max
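On that last question: sharenfs=on governs only the NFS share options (it exports the filesystem read/write to all clients by default); the ordinary ownership and mode bits on the dataset still apply on top of that, which is why the oracle user cannot write until the permissions are opened up. A narrower alternative to 777, assuming the dataset is mounted at /rman_pool/rman and the oracle user belongs to group dba (both names hypothetical, and the uid/gid must match between server and clients):

  hostB# chown oracle:dba /rman_pool/rman   # give ownership to the oracle user instead of opening to everyone
  hostB# chmod 750 /rman_pool/rman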
Richard Elling wrote:
> Tim Haley wrote:
>> [...]
>
> Of course, Solaris Cluster (and the corresponding open source effort, Open HA Cluster) manages cluster membership and data access. We also use SCSI reservations, so that a rogue node cannot even see the data. IMHO, if you do this without reservations, then you are dancing with the devil in the details.

No sooner had I mentioned this than the optional fencing project was integrated into Open HA Cluster. So you will be able to dance with the devil, even with Solaris Cluster, if you want.

http://blogs.sun.com/sc/
 -- richard