Armin Ollig
2008-Oct-27 15:23 UTC
[zfs-discuss] [ha-clusters-discuss] (ZFS) file corruption with HAStoragePlus
Hi Venku and all others,

thanks for your suggestions. I wrote a script to do some I/O from both hosts (in non-cluster mode) to the FC LUNs in question and checked the md5sums of all files afterwards. As expected, there was no corruption.

After recreating the cluster resource and a few failovers I found the HASP resource in this state, with the vb1 zpool concurrently mounted on *both* nodes:

# clresource status vb1-storage

=== Cluster Resources ===

Resource Name    Node Name    State      Status Message
-------------    ---------    -----      --------------
vb1-storage      siegfried    Offline    Offline
                 voelsung     Starting   Unknown - Starting

siegfried# zpool status
  pool: vb1
 state: ONLINE
 scrub: none requested
config:
        NAME                                       STATE     READ WRITE CKSUM
        vb1                                        ONLINE       0     0     0
          c4t600D0230000000000088824BC4228807d0s0  ONLINE       0     0     0

errors: No known data errors

voelsung# zpool status
  pool: vb1
 state: ONLINE
 scrub: none requested
config:
        NAME                                       STATE     READ WRITE CKSUM
        vb1                                        ONLINE       0     0     0
          c4t600D0230000000000088824BC4228807d0s0  ONLINE       0     0     0

In this state filesystem corruption can occur easily.

The zpool was created using the cluster-wide did device:
zpool create vb1 /dev/did/dsk/d12s0

There was no FC path failure to the LUNs, and both interconnects are normal. After some minutes in this state a kernel panic is triggered and both nodes reboot.

Oct 27 16:09:10 voelsung Cluster.RGM.fed: [ID 922870 daemon.error] tag vb1.vb1-storage.10: unable to kill process with SIGKILL
Oct 27 16:09:10 voelsung Cluster.RGM.rgmd: [ID 904914 daemon.error] fatal: Aborting this node because method <hastorageplus_prenet_start> on resource <vb1-storage> for node <voelsung> is unkillable

Best wishes,
Armin
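For readers who want to run the same kind of check, a minimal sketch of such a cross-host write/verify script is below. This is not Armin's actual script; the mount point /vb1/iotest, the file count and the file size are assumptions, and digest(1) is used in place of md5sum since it ships with Solaris.

#!/bin/ksh
# Sketch: run "write" on host A, move the pool to host B (export/import or
# failover), then run "verify" there and compare checksums.
DIR=/vb1/iotest          # assumed mount point inside the vb1 pool
SUMS=$DIR/MD5SUMS

case "$1" in
write)
        mkdir -p $DIR
        i=1
        while [ $i -le 100 ]; do
                # 10 MB of pseudo-random data per file
                dd if=/dev/urandom of=$DIR/file$i bs=1024k count=10 2>/dev/null
                i=`expr $i + 1`
        done
        # record reference checksums alongside the data
        digest -v -a md5 $DIR/file* > $SUMS
        ;;
verify)
        # recompute on the other host and compare against the recorded checksums
        digest -v -a md5 $DIR/file* > /tmp/check.$$
        if diff $SUMS /tmp/check.$$ >/dev/null; then
                echo "no corruption detected"
        else
                echo "CHECKSUM MISMATCH"
        fi
        rm -f /tmp/check.$$
        ;;
*)
        echo "usage: $0 write|verify"
        ;;
esac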
Victor Latushkin
2008-Oct-27 15:45 UTC
[zfs-discuss] [ha-clusters-discuss] (ZFS) file corruption with HAStoragePlus
Armin Ollig wrote:
> Hi Venku and all others,
>
> thanks for your suggestions. I wrote a script to do some I/O from both
> hosts (in non-cluster mode) to the FC LUNs in question and checked the
> md5sums of all files afterwards. As expected, there was no corruption.
>
> After recreating the cluster resource and a few failovers I found the
> HASP resource in this state, with the vb1 zpool concurrently mounted on
> *both* nodes:

Have you imported your pool manually on one of the hosts? Do you have the file /etc/zfs/zpool.cache on these boxes? If yes, could you please provide it?

> # clresource status vb1-storage
>
> === Cluster Resources ===
>
> Resource Name    Node Name    State      Status Message
> -------------    ---------    -----      --------------
> vb1-storage      siegfried    Offline    Offline
>                  voelsung     Starting   Unknown - Starting
>
> siegfried# zpool status
>   pool: vb1
>  state: ONLINE
>  scrub: none requested
> config:
>         NAME                                       STATE     READ WRITE CKSUM
>         vb1                                        ONLINE       0     0     0
>           c4t600D0230000000000088824BC4228807d0s0  ONLINE       0     0     0
>
> errors: No known data errors
>
> voelsung# zpool status
>   pool: vb1
>  state: ONLINE
>  scrub: none requested
> config:
>         NAME                                       STATE     READ WRITE CKSUM
>         vb1                                        ONLINE       0     0     0
>           c4t600D0230000000000088824BC4228807d0s0  ONLINE       0     0     0
>
> In this state filesystem corruption can occur easily.

Yes, this is bad and can lead to corruption.

> The zpool was created using the cluster-wide did device:
> zpool create vb1 /dev/did/dsk/d12s0

But this differs from the status reported by 'zpool status' above - it shows the pool is made from device c4t600D0230000000000088824BC4228807d0s0, while you say it was created from /dev/did/dsk/d12s0.

victor

> There was no FC path failure to the LUNs, and both interconnects are normal.
> After some minutes in this state a kernel panic is triggered and both nodes reboot.
>
> Oct 27 16:09:10 voelsung Cluster.RGM.fed: [ID 922870 daemon.error] tag vb1.vb1-storage.10: unable to kill process with SIGKILL
> Oct 27 16:09:10 voelsung Cluster.RGM.rgmd: [ID 904914 daemon.error] fatal: Aborting this node because method <hastorageplus_prenet_start> on resource <vb1-storage> for node <voelsung> is unkillable
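For anyone following along, two quick ways to answer Victor's questions on a node are to look at the pool's command history and at the cached configuration. This is only a sketch; exact output varies by build, and the pool name vb1 is taken from the thread.

siegfried# zpool history vb1           # shows every create/import/export ever run against the pool
siegfried# ls -l /etc/zfs/zpool.cache  # does a cache file exist on this node?
siegfried# zdb -C vb1                  # dump the configuration ZFS has cached for vb1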
Armin Ollig
2008-Oct-27 16:32 UTC
[zfs-discuss] [ha-clusters-discuss] (ZFS) file corruption with HAStoragePlus
Hi Victor,

it was initially created from c4t600D0230000000000088824BC4228807d0s0, then destroyed and recreated from /dev/did/dsk/d12s0. You are right: it still shows the old device.

There also seems to be a problem with device reservation and the cluster framework, since I get a kernel panic once in a while when a node leaves the cluster:

Oct 27 17:14:12 siegfried genunix: NOTICE: CMM: Quorum device /dev/did/rdsk/d11s2: owner set to node 1.
Oct 27 17:14:12 siegfried genunix: NOTICE: CMM: Node voelsung (nodeid = 2) is down.
Oct 27 17:14:12 siegfried genunix: NOTICE: CMM: Cluster members: siegfried.
Oct 27 17:14:12 siegfried genunixN: NOTICE: CMM:otnode reconfigurtiion #5 completfying cluster that this node is panicking

panic[cpu1]/thread=ffffff000f5a3c80: Reservation Conflict
Disk: /scsi_vhci/disk@g600d02300000000000888275cd1cc500

ffffff000f5a3a00 sd:sd_panic_for_res_conflict+4f ()
ffffff000f5a3a40 sd:sd_pkt_status_reservation_conflict+a8 ()
ffffff000f5a3a90 sd:sdintr+44e ()
ffffff000f5a3b30 scsi_vhci:vhci_intr+6ac ()
ffffff000f5a3b50 fcp:fcp_post_callback+1e ()
ffffff000f5a3b90 fcp:fcp_cmd_callback+4b ()
ffffff000f5a3bd0 emlxs:emlxs_iodone+b1 ()
ffffff000f5a3c20 emlxs:emlxs_iodone_server+15d ()
ffffff000f5a3c60 emlxs:emlxs_thread+15e ()
ffffff000f5a3c70 unix:thread_start+8 ()

syncing file systems... 5 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 done (not all i/o completed)
dumping to /dev/dsk/c4t600D0230000000000088824BC4228803d0s1, offset 1719074816, content: kernel
100% done: 192956 pages dumped, compression ratio 4.32, dump succeeded
rebooting...

I will update to nv99 and try to reproduce the issue.

Best wishes,
Armin

[Attachments scrubbed by the list archive: voelsung-zpool.cache (1072 bytes), siegfried-zpool.cache (1076 bytes)
 <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20081027/dff42919/attachment.obj>
 <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20081027/dff42919/attachment-0001.obj>]
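A sketch for cross-checking which physical LUN a DID device maps to and which path the pool actually records, assuming a Sun Cluster 3.2 setup like the one in the thread; d12 and vb1 are the names used above.

node# scdidadm -L | grep d12       # map DID device d12 to its c#t#d# paths on each node
node# zpool status vb1             # device path the imported pool is currently using
node# zdb -l /dev/did/rdsk/d12s0   # read the ZFS labels directly off the DID device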
Victor Latushkin
2008-Oct-27 16:49 UTC
[zfs-discuss] [ha-clusters-discuss] (ZFS) file corruption with HAStoragePlus
Armin Ollig wrote:
> Hi Victor,
>
> it was initially created from
> c4t600D0230000000000088824BC4228807d0s0, then destroyed and recreated
> from /dev/did/dsk/d12s0. You are right: it still shows the old
> device.

Your pool is cached in the zpool.cache of both hosts (with the old device path):

bash-3.2# usr/sbin/i86/zdb -U /export/home/vl146290/siegfried-zpool.cache vb1
    version=12
    name='vb1'
    state=0
    txg=192
    pool_guid=880122360910369369
    hostid=872416547
    hostname='siegfried'
    vdev_tree
        type='root'
        id=0
        guid=880122360910369369
        children[0]
            type='disk'
            id=0
            guid=4904058874028023190
            path='/dev/dsk/c4t600D0230000000000088824BC4228807d0s0'
            devid='id1,sd@n600d0230000000000088824bc4228807/a'
            phys_path='/scsi_vhci/disk@g600d0230000000000088824bc4228807:a'
            whole_disk=0
            metaslab_array=23
            metaslab_shift=29
            ashift=9
            asize=107369463808
            is_log=0
bash-3.2# usr/sbin/i86/zdb -U /export/home/vl146290/voelsung-zpool.cache vb1
    version=12
    name='vb1'
    state=0
    txg=185
    pool_guid=880122360910369369
    hostid=872416547
    hostname='voelsung'
    vdev_tree
        type='root'
        id=0
        guid=880122360910369369
        children[0]
            type='disk'
            id=0
            guid=4904058874028023190
            path='/dev/dsk/c4t600D0230000000000088824BC4228807d0s0'
            devid='id1,sd@n600d0230000000000088824bc4228807/a'
            phys_path='/scsi_vhci/disk@g600d0230000000000088824bc4228807:a'
            whole_disk=0
            metaslab_array=23
            metaslab_shift=29
            ashift=9
            asize=107369463808
            is_log=0
bash-3.2#

This is what causes the simultaneous pool import. It is probably a result of manual pool imports in the past. Try to get rid of zpool.cache on both hosts and see if the problem still reproduces.

> There seems to be a problem with device reservation and the cluster
> framework, since I get a kernel panic once in a while when a node
> leaves the cluster:

Well, then it may be better to start from a known clean state first.

victor

> Oct 27 17:14:12 siegfried genunix: NOTICE: CMM: Quorum device /dev/did/rdsk/d11s2: owner set to node 1.
> Oct 27 17:14:12 siegfried genunix: NOTICE: CMM: Node voelsung (nodeid = 2) is down.
> Oct 27 17:14:12 siegfried genunix: NOTICE: CMM: Cluster members: siegfried.
> Oct 27 17:14:12 siegfried genunixN: NOTICE: CMM:otnode reconfigurtiion #5 completfying cluster that this node is panicking
>
> panic[cpu1]/thread=ffffff000f5a3c80: Reservation Conflict
> Disk: /scsi_vhci/disk@g600d02300000000000888275cd1cc500
>
> ffffff000f5a3a00 sd:sd_panic_for_res_conflict+4f ()
> ffffff000f5a3a40 sd:sd_pkt_status_reservation_conflict+a8 ()
> ffffff000f5a3a90 sd:sdintr+44e ()
> ffffff000f5a3b30 scsi_vhci:vhci_intr+6ac ()
> ffffff000f5a3b50 fcp:fcp_post_callback+1e ()
> ffffff000f5a3b90 fcp:fcp_cmd_callback+4b ()
> ffffff000f5a3bd0 emlxs:emlxs_iodone+b1 ()
> ffffff000f5a3c20 emlxs:emlxs_iodone_server+15d ()
> ffffff000f5a3c60 emlxs:emlxs_thread+15e ()
> ffffff000f5a3c70 unix:thread_start+8 ()
>
> syncing file systems... 5 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 done (not all i/o completed)
> dumping to /dev/dsk/c4t600D0230000000000088824BC4228803d0s1, offset 1719074816, content: kernel
> 100% done: 192956 pages dumped, compression ratio 4.32, dump succeeded
> rebooting...
>
> I will update to nv99 and try to reproduce the issue.
>
> Best wishes,
> Armin
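A sketch of the cleanup Victor suggests, under the assumption that vb1 is the only pool listed in the cache on each node and that the HASP resource is disabled while the cache is cleared. Exporting a pool normally removes its entry from zpool.cache; deleting the file is the blunt fallback for stale entries.

# on each node in turn:
node# clresource disable vb1-storage      # keep HASP from touching the pool meanwhile
node# zpool list                          # check what is currently imported on this node
node# zpool export vb1                    # if vb1 is imported here; also drops it from zpool.cache
node# zdb -U /etc/zfs/zpool.cache vb1     # should no longer show a cached config for vb1
node# rm /etc/zfs/zpool.cache             # only if vb1 was the sole pool cached on this node

# then let HAStoragePlus do the import itself:
node# clresource enable vb1-storage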