I ran into an odd problem importing a zpool while testing AVS. I was trying to simulate a drive failure, break SNDR replication, and then import the pool on the secondary. To simulate the drive failure I just offlined one of the disks in the RAIDZ set.

------------------------------------------------------
pr1# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c3t0d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c5t0d0s0  ONLINE       0     0     0
            c5t1d0s0  ONLINE       0     0     0
            c5t2d0s0  ONLINE       0     0     0
            c5t3d0s0  ONLINE       0     0     0

errors: No known data errors
pr1# zpool offline
missing pool name
usage:
        offline [-t] <pool> <device> ...
pr1# zpool offline tank c5t0d0s0
pr1# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c3t0d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        tank          DEGRADED     0     0     0
          raidz1      DEGRADED     0     0     0
            c5t0d0s0  OFFLINE      0     0     0
            c5t1d0s0  ONLINE       0     0     0
            c5t2d0s0  ONLINE       0     0     0
            c5t3d0s0  ONLINE       0     0     0

errors: No known data errors
pr1# zpool export tank
-------------------------------------------------------

I then disabled SNDR replication.

--------------------------------------------------------
pr1# sndradm -g zfs-tank -d
Disable Remote Mirror? (Y/N) [N]: Y
---------------------------------------------------------

Then I tried to import the zpool on the secondary.

------------------------------------------------------
pr2# zpool import
  pool: tank
    id: 9795707198744908806
 state: DEGRADED
status: One or more devices are offlined.
action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
config:

        tank          DEGRADED
          raidz1      DEGRADED
            c5t0d0s0  OFFLINE
            c5t1d0s0  ONLINE
            c5t2d0s0  ONLINE
            c5t3d0s0  ONLINE
pr2# zpool import tank
cannot import 'tank': one or more devices is currently unavailable
pr2# zpool import -f tank
cannot import 'tank': one or more devices is currently unavailable
pr2#
-------------------------------------------------------

Importing on the primary gives the same error.

Anyone have any ideas?

Thanks

Corey
Corey,

> I ran into an odd problem importing a zpool while testing AVS. I was
> trying to simulate a drive failure, break SNDR replication, and then
> import the pool on the secondary. To simulate the drive failure I just
> offlined one of the disks in the RAIDZ set.

Are all constituent volumes of a single ZFS storage pool in the same SNDR I/O consistency group?

Jim Dunham
Engineering Manager
Storage Platform Software Group
Sun Microsystems, Inc.
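One quick way to check that, assuming the group name used later in this thread (zfs-tank), is to list the configured SNDR sets and confirm that all four c5t*d0s0 volumes report the same group. This is a sketch only; the exact output format depends on the AVS release.

------------------------------------------------------
pr1# sndradm -P                 # print every configured SNDR set; each entry shows the group it belongs to
pr1# sndradm -g zfs-tank -P     # or restrict the listing to the zfs-tank consistency group
------------------------------------------------------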
> -----Original Message-----
> From: James.Dunham at Sun.COM [mailto:James.Dunham at Sun.COM]
> Sent: Friday, September 12, 2008 4:34 PM
> To: Leopold, Corey
> Cc: zfs-discuss at opensolaris.org; storage-discuss at opensolaris.org
> Subject: Re: [zfs-discuss] ZPOOL Import Problem
>
> Are all constituent volumes of a single ZFS storage pool in the same
> SNDR I/O consistency group?

Yes they were.

One thing to note that shows some of my unfamiliarity with SNDR is that I actually deleted the replication set config before trying to mount on the secondary ("sndradm -g group -nd").

Since then I have been throwing them into logging mode ("sndradm -g group -nl") and haven't had a problem in similar tests (i.e. offlining a drive before making the secondary active).

I don't believe that deleting the SNDR replication configuration should have made the zpool invalid though, so there may still be a bug somewhere.

Corey
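Spelled out, the sequence that works in these later tests looks roughly like the following. It is a sketch using the host, pool, and group names from this thread rather than captured output, and it assumes the pool is exported on the primary before the group is dropped into logging mode.

------------------------------------------------------
pr1# zpool export tank          # quiesce the pool on the primary
pr1# sndradm -g zfs-tank -nl    # put the whole consistency group into logging mode
pr2# zpool import tank          # import the now I/O-consistent replica on the secondary
                                # to fail back, export on pr2 and resync (update or reverse
                                # sync, depending on where writes happened) before importing
                                # the pool on pr1 again
------------------------------------------------------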
Corey,

> One thing to note that shows some of my unfamiliarity with SNDR is that
> I actually deleted the replication set config before trying to mount on
> the secondary ("sndradm -g group -nd").
>
> I don't believe that deleting the SNDR replication configuration should
> have made the zpool invalid though, so there may still be a bug
> somewhere.

If the SNDR replica is deleted while the set is actively replicating and ZFS is actively writing to the storage pool, I/O consistency is lost, leaving the ZFS storage pool in an indeterminate state on the remote node. To avoid this, place the replica into logging mode before deleting it; the ZFS storage pool will then be left I/O-consistent after the disable is done.

Jim Dunham
Engineering Manager
Storage Platform Software Group
Sun Microsystems, Inc.
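A minimal sketch of the ordering described above, reusing the pool and group names from this thread (not captured output):

------------------------------------------------------
pr1# zpool export tank           # stop ZFS writes to the replicated volumes
pr1# sndradm -g zfs-tank -nl     # logging mode first, so replication stops at an I/O-consistent point
pr1# sndradm -g zfs-tank -nd     # only then delete the replication sets
pr2# zpool import tank           # the secondary copy should now import cleanly
------------------------------------------------------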