Jan Hellevik
2010-Oct-27 12:20 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
Ok, so I did it again... I moved my disks around without doing export
first. I promise - after this I will always export before messing with
the disks. :-)

Anyway - the problem. I decided to rearrange the disks due to cable
lengths and case layout. I disconnected the disks and moved them around.
When I reconnected the cables and powered on, I got the situation below.

My pools are all mirrored pools. Due to different brands and sizes I
tried to match up the mismatched mirrors, but no go. Got the same error
as below.

What should I do now? I was thinking rescue-CD and import/export/boot
from rpool, but I am afraid I will break something...

Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010

jan@opensolaris:~# zpool status
  pool: master
 state: UNAVAIL
status: One or more devices could not be used because the label is
        missing or invalid. There are insufficient replicas for the pool
        to continue functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        master      UNAVAIL      0     0     0  insufficient replicas
          mirror-0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c9d1    ONLINE       0     0     0
          mirror-1  DEGRADED     0     0     0
            c6t2d0  ONLINE       0     0     0
            c7d1    UNAVAIL      0     0     0  corrupted data
          mirror-2  UNAVAIL      0     0     0  insufficient replicas
            c9d0    UNAVAIL      0     0     0  corrupted data
            c6t0d0  UNAVAIL      0     0     0  corrupted data
          mirror-3  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c8d0    ONLINE       0     0     0
          mirror-4  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c5d0s0  ONLINE       0     0     0
            c5d1s0  ONLINE       0     0     0

errors: No known data errors

jan@opensolaris:~# zpool export master
internal error: Invalid argument
Abort (core dumped)

jan@opensolaris:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c5d0 <DEFAULT cyl 14590 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@0,0
       1. c5d1 <DEFAULT cyl 14590 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@1,0
       2. c6t0d0 <ATA-WDC WD10EARS-00Y-0A80-931.51GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@0,0
       3. c6t1d0 <ATA-SAMSUNG HD154UI-1118-1.36TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@1,0
       4. c6t2d0 <ATA-WDC WD15EARS-00Z-0A80-1.36TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@2,0
       5. c6t3d0 <ATA-WDC WD20EARS-00S-0A80-1.82TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@3,0
       6. c6t6d0 <ATA-WDC WD20EARS-00M-AB51-1.82TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@6,0
       7. c6t7d0 <ATA-WDC WD20EARS-00M-AB51-1.82TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@7,0
       8. c7d1 <WDC WD10- WD-WCAV5651037-0001-931.51GB>
          /pci@0,0/pci-ide@11/ide@0/cmdk@1,0
       9. c8d0 <WDC WD15- WD-WMAVU150617-0001-1.36TB>
          /pci@0,0/pci-ide@11/ide@1/cmdk@0,0
      10. c9d0 <SAMSUNG-S1XWJ1BZ50634-0001-1.36TB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@0,0
      11. c9d1 <WDC WD20- WD-WCAVY527380-0001-1.82TB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@1,0
Specify disk (enter its number): ^C
jan@opensolaris:~#
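(For reference, the "export first" routine being promised above is just a
pair of commands wrapped around the recabling. A minimal sketch, assuming a
pool named master; it is not a fix for the state shown above, only what
would normally happen before moving the disks:

    zpool export master    # flush outstanding writes and mark the pool exported
    # ...power off, move/recable the disks, power on...
    zpool import master    # rescan devices and reassemble the mirrors from the on-disk labels

After a clean export, the import does not depend on the disks coming back
on the same controllers or ports.)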
Roy Sigurd Karlsbakk
2010-Oct-27 19:07 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
----- Original Message -----
> Ok, so I did it again... I moved my disks around without doing export
> first. I promise - after this I will always export before messing with
> the disks. :-)
>
> Anyway - the problem. I decided to rearrange the disks due to cable
> lengths and case layout. I disconnected the disks and moved them
> around. When I reconnected the cables and powered on, I got the
> situation below.
>
> My pools are all mirrored pools. Due to different brands and sizes I
> tried to match up the mismatched mirrors, but no go. Got the same
> error as below.

This is rather alarming. I was under the impression that ZFS tagged each
disk and used those tags for internal rearrangement. I have tested this
with VirtualBox, and there it worked well, but perhaps I have missed
something?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases adequate and
relevant synonyms exist in Norwegian.
David Magda
2010-Oct-27 19:55 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
On Wed, October 27, 2010 15:07, Roy Sigurd Karlsbakk wrote:
> ----- Original Message -----
>> Ok, so I did it again... I moved my disks around without doing export
>> first. I promise - after this I will always export before messing with
>> the disks. :-)
>>
>> Anyway - the problem. I decided to rearrange the disks due to cable
>> lengths and case layout. I disconnected the disks and moved them
>> around. When I reconnected the cables and powered on, I got the
>> situation below.
>>
>> My pools are all mirrored pools. Due to different brands and sizes I
>> tried to match up the mismatched mirrors, but no go. Got the same
>> error as below.
>
> This is rather alarming. I was under the impression that ZFS tagged each
> disk and used those tags for internal rearrangement. I have tested this
> with VirtualBox, and there it worked well, but perhaps I have missed
> something?

The errors aren't "cannot find" but appear to be something about corruption.

ZFS caches the /dev/dsk entries to speed up importing (since it doesn't
have to 'taste' all the LUNs). If you've moved things behind ZFS' back
(i.e., not exporting/importing), then the cache file may be out of date.
One option is to delete "/etc/zfs/zpool.cache".

If there are issues with "corruption" because you pulled disks behind ZFS'
back, and thus there were issues with things not being flushed to disk,
then it may be more serious. If the OP is using >ZFSv19 (?), they may try a
"zpool import -F" to recover the pool to a known good state (the last few
TXGs are thrown away):

     -F   Recovery mode for a non-importable pool. Attempt to return the
          pool to an importable state by discarding the last few
          transactions. Not all damaged pools can be recovered by using
          this option. If successful, the data from the discarded
          transactions is irretrievably lost. This option is ignored if
          the pool is importable or already imported.

http://docs.sun.com/app/docs/doc/819-2240/zpool-1m
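A minimal sketch of the sequence suggested above, assuming the pool is
named master and the cache file lives at the usual /etc/zfs/zpool.cache;
renaming rather than deleting keeps a way back:

    mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak   # set the stale device cache aside
    zpool import master                                # full device scan, reassemble vdevs by on-disk label
    # only if the plain import still fails with corruption:
    zpool import -F master                             # discard the last few TXGs (their data is lost)

The -F fallback is deliberately last: it throws data away, so it is only
worth trying if the ordinary import cannot find a consistent state.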
Jan Hellevik
2010-Oct-28 08:44 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
I think the 'corruption' is caused by the shuffling and mismatch of the
disks. One 1.5TB disk is now believed to be part of a mirror with a 2TB,
a 1TB part of a mirror with a 1.5TB, and so on. It would be better if ZFS
tried to find the second disk of each mirror instead of relying on which
controller/channel/port it was previously connected to.

So, my best action would be to delete the zpool.cache and then do a zpool
import?

Should I try to match disks with cables as they were previously connected
before I do the import? Will that make any difference?

BTW, ZFS version is 22.

Thanks, Jan
David Magda
2010-Oct-28 11:31 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
On Oct 28, 2010, at 04:44, Jan Hellevik wrote:

> So, my best action would be to delete the zpool.cache and then do a
> zpool import?
>
> Should I try to match disks with cables as they were previously
> connected before I do the import? Will that make any difference?
>
> BTW, ZFS version is 22.

I'd say export, rename zpool.cache, and then try importing it. ZFS should
scan all the devices and figure out what's there. If that still doesn't
work, try the "-F" option to go back a few transactions to a known-good
state.

Most file systems don't take well to having disks pulled on them, and ZFS
is no different there. It's just that with ZFS it can tell when there are
(potentially) corrupted blocks because of the checksumming.
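A hedged aside on the "scan all the devices" part: running zpool import
with no pool name only lists what the scan finds, without importing
anything, and zdb -l dumps the vdev labels on a given device so you can
see which pool and mirror it believes it belongs to. The device/slice
below (c6t1d0s0) is just an example taken from the earlier format listing;
the correct slice may differ on your system:

    zpool import                 # list importable pools found by scanning /dev/dsk
    zdb -l /dev/dsk/c6t1d0s0     # print the vdev labels (pool name, GUIDs, mirror membership)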
Jan Hellevik
2010-Oct-28 14:37 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
Thanks! I will try later today and report back the result.
Jan Hellevik
2010-Oct-29 07:48 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
Export did not go very well:

jan@opensolaris:~# zpool export master
internal error: Invalid argument
Abort (core dumped)

So I deleted (renamed) the zpool.cache and rebooted. After reboot I
imported the pool and it seems to have gone well. It is now scrubbing.
Thanks a lot for the help!

jan@opensolaris:~# zpool import master
jan@opensolaris:~# zpool status
  pool: master
 state: ONLINE
 scrub: scrub in progress for 0h1m, 0.08% done, 38h52m to go
config:

        NAME        STATE     READ WRITE CKSUM
        master      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c8d0    ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c9d0    ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
          mirror-2  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c9d1    ONLINE       0     0     0
          mirror-3  ONLINE       0     0     0
            c7d1    ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
          mirror-4  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0

errors: No known data errors
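(Once the scrub completes, a quick check along these lines confirms the
pool really came through clean; these are standard zpool status options,
with nothing assumed beyond the pool name master from above:

    zpool status -v master   # per-vdev state plus any files with unrecoverable errors
    zpool status -x          # prints "all pools are healthy" if nothing is wrong
)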