I ran into an odd problem importing a zpool while testing AVS. I was
trying to simulate a drive failure, break SNDR replication, and then
import the pool on the secondary. To simulate the drive failure, I just
offlined one of the disks in the RAIDZ set.
------------------------------------------------------
pr1# zpool status
pool: rpool
state: ONLINE
scrub: none requested
config:
NAME        STATE     READ WRITE CKSUM
rpool       ONLINE       0     0     0
  c3t0d0s0  ONLINE       0     0     0
errors: No known data errors
pool: tank
state: ONLINE
scrub: none requested
config:
NAME          STATE     READ WRITE CKSUM
tank          ONLINE       0     0     0
  raidz1      ONLINE       0     0     0
    c5t0d0s0  ONLINE       0     0     0
    c5t1d0s0  ONLINE       0     0     0
    c5t2d0s0  ONLINE       0     0     0
    c5t3d0s0  ONLINE       0     0     0
errors: No known data errors
pr1# zpool offline
missing pool name
usage:
offline [-t] <pool> <device> ...
pr1# zpool offline tank c5t0d0s0
pr1# zpool status
pool: rpool
state: ONLINE
scrub: none requested
config:
NAME        STATE     READ WRITE CKSUM
rpool       ONLINE       0     0     0
  c3t0d0s0  ONLINE       0     0     0
errors: No known data errors
pool: tank
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
scrub: none requested
config:
NAME          STATE     READ WRITE CKSUM
tank          DEGRADED     0     0     0
  raidz1      DEGRADED     0     0     0
    c5t0d0s0  OFFLINE      0     0     0
    c5t1d0s0  ONLINE       0     0     0
    c5t2d0s0  ONLINE       0     0     0
    c5t3d0s0  ONLINE       0     0     0
errors: No known data errors
pr1# zpool export tank
-------------------------------------------------------
I then disabled SNDR replication.
--------------------------------------------------------
pr1# sndradm -g zfs-tank -d
Disable Remote Mirror? (Y/N) [N]: Y
---------------------------------------------------------
Then I tried to import the zpool on the secondary.
------------------------------------------------------
pr2# zpool import
pool: tank
id: 9795707198744908806
state: DEGRADED
status: One or more devices are offlined.
action: The pool can be imported despite missing or damaged devices. The
        fault tolerance of the pool may be compromised if imported.
config:
tank          DEGRADED
  raidz1      DEGRADED
    c5t0d0s0  OFFLINE
    c5t1d0s0  ONLINE
    c5t2d0s0  ONLINE
    c5t3d0s0  ONLINE
pr2# zpool import tank
cannot import 'tank': one or more devices is currently unavailable
pr2# zpool import -f tank
cannot import 'tank': one or more devices is currently unavailable
pr2#
-------------------------------------------------------
Importing on the primary gave the same error.
Anyone have any ideas?
Thanks
Corey
Corey,

> I ran into an odd problem importing a zpool while testing AVS. I was
> trying to simulate a drive failure, break SNDR replication, and then
> import the pool on the secondary. To simulate the drive failure, I just
> offlined one of the disks in the RAIDZ set.

Are all constituent volumes of a single ZFS storage pool in the same
SNDR I/O consistency group?

Jim Dunham
Engineering Manager
Storage Platform Software Group
Sun Microsystems, Inc.
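(For reference, one way to verify this on the primary is to list the
configured SNDR sets and check that each of the four RAIDZ devices reports
the same group. This check is not part of the original thread, and the
exact output of these commands varies by Availability Suite release.)

------------------------------------------------------
# Hypothetical verification step, assuming the group is named zfs-tank as
# in the transcripts above. "sndradm -P" prints the configured replication
# sets in detail; each of c5t0d0s0 through c5t3d0s0 should show the same
# group name. "dsstat -m sndr" reports per-set replication status.
pr1# sndradm -P
pr1# dsstat -m sndr
------------------------------------------------------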
> Are all constituent volumes of a single ZFS storage pool in the same
> SNDR I/O consistency group?

Yes they were.

One thing to note that shows some of my unfamiliarity with SNDR is that I
actually deleted the replication set config before trying to mount on the
secondary ("sndradm -g group -nd").

Since then I have been throwing the sets into logging mode ("sndradm -g
group -nl") and haven't had a problem in similar tests (i.e. offlining a
drive before making the secondary active).

I don't believe that deleting the SNDR replication configuration should
have made the zpool invalid though, so there may still be a bug somewhere.

Corey
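(A minimal sketch of that logging-mode test, pieced together from the
commands shown earlier in the thread; the group name zfs-tank and the
device names are taken from the transcripts above.)

------------------------------------------------------
# On the primary: simulate the failure, quiesce the pool, then stop
# replication by putting the group into logging mode rather than deleting
# the set configuration.
pr1# zpool offline tank c5t0d0s0
pr1# zpool export tank
pr1# sndradm -g zfs-tank -nl

# On the secondary: the replicated devices should now import cleanly.
pr2# zpool import tank
------------------------------------------------------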
> I don't believe that deleting the SNDR replication configuration should
> have made the zpool invalid though, so there may still be a bug
> somewhere.

If, at the time the SNDR replica is deleted, the set was actively
replicating while ZFS was actively writing to the storage pool, I/O
consistency will be lost, leaving the ZFS storage pool in an indeterminate
state on the remote node.

To address this, the replica should be placed into logging mode before it
is deleted. ZFS will then be left I/O consistent after the disable is
done.

Jim Dunham
Engineering Manager
Storage Platform Software Group
Sun Microsystems, Inc.
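(Put another way, if the set configuration really does have to be deleted,
the order sketched below follows Jim's advice; it reuses the group name
and flags already shown in this thread and is not a verbatim quote of any
documented procedure.)

------------------------------------------------------
# Quiesce ZFS, drop the group into logging mode so the secondary is left
# I/O consistent, and only then disable (delete) the replication sets.
pr1# zpool export tank
pr1# sndradm -g zfs-tank -nl
pr1# sndradm -g zfs-tank -nd
------------------------------------------------------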