Just make SURE the other host is actually, truly DEAD! If for some reason it's simply wedged, or you have lost console access but hostA is still "live", then you can end up with two systems having access to the same ZFS pool. I have done this in testing, with two hosts accessing the same pool, and the result is catastrophic pool corruption. I use a simple method: if I think hostA is dead, I call the operators and get them to pull the power cords out of it just to be certain. Then I force the import on hostB with certainty.
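For illustration, a minimal sketch of that forced import, assuming a pool named "tank" (the name is hypothetical) and assuming hostA has already been confirmed powered off:

  hostB# zpool import          # list pools visible on the shared storage
  hostB# zpool import -f tank  # -f overrides the "pool was last accessed by another system" safety check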
Vincent Fox wrote:
> Just make SURE the other host is actually, truly DEAD!
>
> If for some reason it's simply wedged, or you have lost console access but hostA is still "live", then you can end up with two systems having access to the same ZFS pool.
>
> I have done this in testing, with two hosts accessing the same pool, and the result is catastrophic pool corruption.
>
> I use a simple method: if I think hostA is dead, I call the operators and get them to pull the power cords out of it just to be certain. Then I force the import on hostB with certainty.

This is a common cluster scenario: you need to make sure the other node is dead, so you force that result. In Lustre setups they recommend a STONITH (Shoot The Other Node In The Head) approach. They use a combination of a heartbeat setup like the one described here:

http://www.linux-ha.org/Heartbeat

and then something like the powerman framework to 'kill' the offline node.

Perhaps those things could be made to run on Solaris if they don't already.

-tim
Tim Haley wrote:
> Vincent Fox wrote:
>> Just make SURE the other host is actually, truly DEAD!
>> [...]
>
> This is a common cluster scenario: you need to make sure the other node is dead, so you force that result. In Lustre setups they recommend a STONITH (Shoot The Other Node In The Head) approach. They use a combination of a heartbeat setup like the one described here:
>
> http://www.linux-ha.org/Heartbeat
>
> and then something like the powerman framework to 'kill' the offline node.
>
> Perhaps those things could be made to run on Solaris if they don't already.

Of course, Solaris Cluster (and the corresponding open source effort, Open HA Cluster) manages cluster membership and data access. We also use SCSI reservations, so that a rogue node cannot even see the data. IMHO, if you do this without reservations, then you are dancing with the devil in the details.
 -- richard
Thanks to all for your comments and for sharing your experiences. In my setup the pools are split and then NFS-mounted to other nodes, mostly Oracle DB boxes. These mounts will provide areas for RMAN flash backups to be written. If I lose connectivity to any host, I will swing the LUNs over to the alternate host and the NFS mount will be repointed on the Oracle node, so hopefully we should be safe with regard to pool corruption. Thanks again.

Max
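A rough sketch of that failover sequence, with every name hypothetical (pool "rman_pool", hosts hostA/hostB, Oracle node oranode); if hostA is unreachable, the export step is skipped and the import must be forced:

  hostA# zpool export rman_pool                        # clean handoff, only if hostA is still reachable
  hostB# zpool import -f rman_pool                     # -f is needed when hostA never exported the pool
  hostB# zfs set sharenfs=on rman_pool/rman            # re-share the dataset from its new owner
  oranode# umount /rman
  oranode# mount -F nfs hostB:/rman_pool/rman /rman    # repoint the Oracle node at hostB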
Just curiosity, why don't you use SC?

Leal.
Good question. Well, the hosts are NetBackup media servers. The idea behind the design is that we stream the RMAN data to disk via NFS mounts, and then write it to tape during the day. With the SAN-attached disks sitting on these hosts, and with disk storage units configured for NBU, the data stream only hits the network once, at a quiet time. In this instance the need for high availability is not really there. The real driver behind this, of course, is probably the same as for most companies: cost.

BTW, another question if I may. I noticed that when mounting the ZFS datasets onto individual nodes I have to change the permissions to 777 to allow the oracle user to write to them. It was my understanding that sharenfs=on allows rw by default. Am I doing something wrong here?

Again, all help much appreciated.

Max
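On that last question: sharenfs=on governs only the NFS share options (it exports the filesystem read/write to all clients by default); the ordinary ownership and mode bits on the dataset still apply on top of that, which is why the oracle user cannot write until the permissions are opened up. A narrower alternative to 777, assuming the dataset is mounted at /rman_pool/rman and the oracle user belongs to group dba (both names hypothetical, and the uid/gid must match between server and clients):

  hostB# chown oracle:dba /rman_pool/rman   # give ownership to the oracle user instead of opening to everyone
  hostB# chmod 750 /rman_pool/rman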
Richard Elling wrote:
> Tim Haley wrote:
>> [...]
>
> Of course, Solaris Cluster (and the corresponding open source effort, Open HA Cluster) manages cluster membership and data access. We also use SCSI reservations, so that a rogue node cannot even see the data. IMHO, if you do this without reservations, then you are dancing with the devil in the details.

No sooner had I mentioned this than the optional fencing project was integrated into Open HA Cluster. So you will be able to dance with the devil, even with Solaris Cluster, if you want.

http://blogs.sun.com/sc/
 -- richard