Hi,

my two nodes are running fine with 8.2-stable and the LSI 9200-8e, and now I want to build a failover for the zpool (and later the iSCSI target).

Both nodes are connected to the same disks (JBOD), so I need a way to get the zpool(s) running on the node that holds the CARP public IP.

I found some material about CARP, hooks, devd and so on. My first thought was:

carp -> hook -> zpool import -> devd -> iSCSI target up

Problem on HostA: no network:

carp -> hook -> zpool export -> devd -> iSCSI target down -> OK

Switching back to the master shouldn't be a problem, I think ...

Problem on HostA: power off:

carp -> hook -> zpool import -> devd -> iSCSI target up -> OK

But a bad case could be:

Problem on HostA: SAS controller/cable problem:

carp is up and running ..., no reason to fail over to the slave ... -> bad

So I think I have to use devd to notice that the /dev/da* disks are gone or inaccessible, and then either inform CARP or shut down the network interface to force CARP to switch over to the slave.

Any hints are welcome :-)

If it works, I will write a howto for it :-)

cu denny
On Tue, 29 Mar 2011 13:17:01 +0200 Denny Schierz wrote:

DS> hi,
DS> my two nodes are running fine with 8.2-stable and the LSI 9200-8e and
DS> now, I want to build a failover for the Zpool (and later ISCSI target)
DS> Both nodes are connected to the same disks (jbod) and now I need a way,
DS> to get the zpool(s) running on the node with the CARP public IP.

You don't need HAST, but you might want to try net-mgmt/hastmon :-) I wrote it because I didn't much like doing failover with CARP.

For hastmon you need at least 3 hosts: 2 cluster nodes (primary/secondary) and a watchdog. The watchdog polls the states of the cluster nodes. The secondary decides to fail over when:

1) There is no connection with the primary.
2) There are complaints from the watchdog.

The configuration is simple and would look like below (on all 3 hosts):

resource iscsi {
        exec /etc/iscsi.sh

        on hostA {
                remote hostB
                priority 0
        }
        on hostB {
                remote hostA
                priority 1
        }
        on hostW {
                remote hostA hostB
        }
}

The /etc/iscsi.sh script should support at least 3 arguments:

start -- switch node to primary (iscsi up, IP up, etc);
stop -- switch node to secondary;
status -- return current status (0 - UP, 1 - DOWN, 2 - UNKNOWN).

You can find more information in the README:

http://code.google.com/p/hastmon/wiki/README

--
Mikolaj Golub
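For illustration, here is a minimal sketch of what an /etc/iscsi.sh handler with the three arguments described above might look like. It is not taken from hastmon or from anyone's working setup. The pool name "tank" and the target_up/target_down helpers are hypothetical placeholders, and the sketch assumes the status values (0/1/2) are passed back as the exit code. It also does a plain "zpool import"; a pool that was not cleanly exported by the other node may refuse to import that way.

```shell
#!/bin/sh
# Sketch of a start/stop/status handler. POOL and the target_* helpers
# are hypothetical placeholders, not part of hastmon.
POOL="${POOL:-tank}"

target_up()   { :; }   # site-specific: start the iSCSI target, add the service IP
target_down() { :; }   # site-specific: stop the iSCSI target, remove the service IP

pool_imported() {
    # "zpool list" exits non-zero when the pool is not imported on this host
    zpool list -H -o name "$POOL" >/dev/null 2>&1
}

do_start() {
    pool_imported || zpool import "$POOL" || return 1
    target_up
}

do_stop() {
    target_down
    if pool_imported; then
        zpool export "$POOL" || return 1
    fi
    return 0
}

do_status() {
    # 0 = UP, 1 = DOWN, 2 = UNKNOWN (assumed to be the exit code)
    health=$(zpool list -H -o health "$POOL" 2>/dev/null) || return 1
    case "$health" in
        ONLINE|DEGRADED) return 0 ;;
        *)               return 2 ;;
    esac
}

main() {
    case "$1" in
        start)  do_start ;;
        stop)   do_stop ;;
        status) do_status ;;
        *)      echo "usage: $0 {start|stop|status}" >&2; return 2 ;;
    esac
}

# Dispatch only when called with an argument, so the file can also be sourced.
[ "$#" -eq 0 ] || main "$@"
```

Treating a pool that is imported but neither ONLINE nor DEGRADED as UNKNOWN is a design choice of this sketch; a stricter handler could report DOWN instead.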
Hi Denny,

Although I haven't fully implemented this yet, I was thinking of a failover system based on carp + ifstated (/usr/ports/net/ifstated). ifstated lets you execute commands on events such as the carp interface becoming master or backup.

As a side note, I'd suggest that you consider using something like IPMI to shut down power on the remote node when you take over and import the zpool. This was my biggest fear: getting into a split-brain situation with both nodes using the pool. I was even thinking of telnetting to the APC PDUs and shutting down ports ;)

Anyway, I hope this is a useful idea. I'm very interested in what others have done to implement a redundant/failover ZFS solution.

Cheers,
Rumen Telbizov

--
Rumen Telbizov
http://telbizov.com
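The "power off the peer before importing" idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the BMC address, IPMI user, password file and pool name are hypothetical, and it assumes the peer's BMC is reachable over the lanplus interface with sysutils/ipmitool installed. The pool is imported with -f only after the peer reports that its power is off.

```shell
#!/bin/sh
# Fence the peer via IPMI, then import the pool. All values below are
# hypothetical placeholders for illustration.
POOL="${POOL:-tank}"
PEER_BMC="${PEER_BMC:-10.0.0.2}"
IPMI_USER="${IPMI_USER:-admin}"
IPMI_PWFILE="${IPMI_PWFILE:-/root/.ipmipass}"

peer_ipmi() {
    ipmitool -I lanplus -H "$PEER_BMC" -U "$IPMI_USER" -f "$IPMI_PWFILE" "$@"
}

fence_peer() {
    peer_ipmi chassis power off || return 1
    # Wait a few seconds for the BMC to confirm the power state.
    tries=5
    while [ "$tries" -gt 0 ]; do
        peer_ipmi chassis power status | grep -q 'is off' && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

takeover() {
    if ! fence_peer; then
        echo "fencing failed, refusing to import $POOL" >&2
        return 1
    fi
    # -f is needed because the fenced peer never exported the pool.
    zpool import -f "$POOL"
}

# Run only when asked to, so the file can also be sourced.
if [ "${1:-}" = "takeover" ]; then
    takeover
fi
```

Refusing to import when fencing cannot be confirmed trades availability for safety: the pool stays offline rather than risking both nodes writing to it.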
Hi,

On Tuesday, 29.03.2011, at 23:36 +0300, Mikolaj Golub wrote:
> 2) There are complaints from watchdog.

What happens if the watchdog isn't available and one or both nodes are rebooting, or something similar?

The other thing that could happen: the connection between the host and the SAS switch is dead. CARP, ifstated and hastmon check whether the IP is reachable, but not whether the local storage is available.

So I will take a closer look at devd and ZFS and, in case of problems, shut down the CARP interface or the whole machine to force a switch.

cu denny
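A rough sketch of the storage check described above, which could be run periodically or from a devd action. The disk list and interface name are hypothetical, and the MASTER test assumes ifconfig prints a line containing "carp: MASTER" for the carp interface. It only detects device nodes that have disappeared; disks that are still present but no longer answer I/O are not covered here.

```shell
#!/bin/sh
# If this node is CARP master but its disks are gone, take the carp
# interface down so the peer takes over. DISKS, CARP_IF and DEV_DIR are
# hypothetical placeholders.
DISKS="${DISKS:-da0 da1}"
CARP_IF="${CARP_IF:-carp0}"
DEV_DIR="${DEV_DIR:-/dev}"

disks_present() {
    for d in $DISKS; do
        [ -c "$DEV_DIR/$d" ] || return 1
    done
    return 0
}

is_master() {
    # Assumes ifconfig output contains "carp: MASTER" on the master node.
    ifconfig "$CARP_IF" 2>/dev/null | grep -q 'carp: MASTER'
}

check_storage() {
    is_master || return 0        # a backup without disks is not our problem
    disks_present && return 0    # everything is fine
    logger -t storage-check "disks missing, taking $CARP_IF down"
    ifconfig "$CARP_IF" down
}

# Run only when asked to, so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    check_storage
fi
```

Taking the interface down is the bluntest option; it relies on the other node's own takeover hook to import the pool afterwards.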