Jan Hellevik
2010-Oct-27 12:20 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
Ok, so I did it again... I moved my disks around without doing export
first. I promise - after this I will always export before messing with
the disks. :-)

Anyway - the problem. I decided to rearrange the disks due to cable
lengths and case layout. I disconnected the disks and moved them around.
When I reconnected the cables and powered on, I got the situation below.

My pools are all mirrored pools. Due to different brands and sizes I
tried to match up the mismatched mirrors, but no go. Got the same error
as below.

What should I do now? I was thinking rescue-CD and import/export/boot
from rpool, but I am afraid I will break something...

Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010

jan@opensolaris:~# zpool status
  pool: master
 state: UNAVAIL
status: One or more devices could not be used because the label is
        missing or invalid. There are insufficient replicas for the pool
        to continue functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        master      UNAVAIL      0     0     0  insufficient replicas
          mirror-0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c9d1    ONLINE       0     0     0
          mirror-1  DEGRADED     0     0     0
            c6t2d0  ONLINE       0     0     0
            c7d1    UNAVAIL      0     0     0  corrupted data
          mirror-2  UNAVAIL      0     0     0  insufficient replicas
            c9d0    UNAVAIL      0     0     0  corrupted data
            c6t0d0  UNAVAIL      0     0     0  corrupted data
          mirror-3  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c8d0    ONLINE       0     0     0
          mirror-4  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c5d0s0  ONLINE       0     0     0
            c5d1s0  ONLINE       0     0     0

errors: No known data errors

jan@opensolaris:~# zpool export master
internal error: Invalid argument
Abort (core dumped)

jan@opensolaris:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c5d0 <DEFAULT cyl 14590 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@0,0
       1. c5d1 <DEFAULT cyl 14590 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@1,0
       2. c6t0d0 <ATA-WDC WD10EARS-00Y-0A80-931.51GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@0,0
       3. c6t1d0 <ATA-SAMSUNG HD154UI-1118-1.36TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@1,0
       4. c6t2d0 <ATA-WDC WD15EARS-00Z-0A80-1.36TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@2,0
       5. c6t3d0 <ATA-WDC WD20EARS-00S-0A80-1.82TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@3,0
       6. c6t6d0 <ATA-WDC WD20EARS-00M-AB51-1.82TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@6,0
       7. c6t7d0 <ATA-WDC WD20EARS-00M-AB51-1.82TB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@7,0
       8. c7d1 <WDC WD10- WD-WCAV5651037-0001-931.51GB>
          /pci@0,0/pci-ide@11/ide@0/cmdk@1,0
       9. c8d0 <WDC WD15- WD-WMAVU150617-0001-1.36TB>
          /pci@0,0/pci-ide@11/ide@1/cmdk@0,0
      10. c9d0 <SAMSUNG-S1XWJ1BZ50634-0001-1.36TB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@0,0
      11. c9d1 <WDC WD20- WD-WCAVY527380-0001-1.82TB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@1,0
Specify disk (enter its number): ^C
jan@opensolaris:~#
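(For reference, the "export first" routine being promised above is just a
pair of commands wrapped around the recabling. A minimal sketch, assuming a
pool named master; it is not a fix for the state shown above, only what
would normally happen before moving the disks:

    zpool export master    # flush outstanding writes and mark the pool exported
    # ...power off, move/recable the disks, power on...
    zpool import master    # rescan devices and reassemble the mirrors from the on-disk labels

After a clean export, the import does not depend on the disks coming back
on the same controllers or ports.)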
Roy Sigurd Karlsbakk
2010-Oct-27 19:07 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
----- Original Message -----
> Ok, so I did it again... I moved my disks around without doing export
> first. I promise - after this I will always export before messing with
> the disks. :-)
>
> Anyway - the problem. I decided to rearrange the disks due to cable
> lengths and case layout. I disconnected the disks and moved them
> around. When I reconnected the cables and powered on, I got the
> situation below.
>
> My pools are all mirrored pools. Due to different brands and sizes I
> tried to match up the mismatched mirrors, but no go. Got the same
> error as below.

This is rather alarming. I was under the impression that ZFS tagged each
disk and used those tags for internal rearrangement. I have tested this
with VirtualBox, and there it worked well, but perhaps I have missed
something?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases adequate and
relevant synonyms exist in Norwegian.
David Magda
2010-Oct-27 19:55 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
On Wed, October 27, 2010 15:07, Roy Sigurd Karlsbakk wrote:
> ----- Original Message -----
>> Ok, so I did it again... I moved my disks around without doing export
>> first. I promise - after this I will always export before messing with
>> the disks. :-)
>>
>> Anyway - the problem. I decided to rearrange the disks due to cable
>> lengths and case layout. I disconnected the disks and moved them
>> around. When I reconnected the cables and powered on, I got the
>> situation below.
>>
>> My pools are all mirrored pools. Due to different brands and sizes I
>> tried to match up the mismatched mirrors, but no go. Got the same
>> error as below.
>
> This is rather alarming. I was under the impression that ZFS tagged each
> disk and used those tags for internal rearrangement. I have tested this
> with VirtualBox, and there it worked well, but perhaps I have missed
> something?

The errors aren't "cannot find" but appear to be something about corruption.

ZFS caches the /dev/dsk entries to speed up importing (since it doesn't
have to 'taste' all the LUNs). If you've moved things behind ZFS' back
(i.e., not exporting/importing), then the cache file may be out of date.
One option is to delete "/etc/zfs/zpool.cache".

If there are issues with "corruption" because you pulled disks behind ZFS'
back, and thus there were issues with things not being flushed to disk,
then it may be more serious. If the OP is using >ZFSv19 (?), they may try a
"zpool import -F" to recover the pool to a known good state (the last few
TXGs are thrown away):

     -F   Recovery mode for a non-importable pool. Attempt to return the
          pool to an importable state by discarding the last few
          transactions. Not all damaged pools can be recovered by using
          this option. If successful, the data from the discarded
          transactions is irretrievably lost. This option is ignored if
          the pool is importable or already imported.

http://docs.sun.com/app/docs/doc/819-2240/zpool-1m
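A minimal sketch of the sequence suggested above, assuming the pool is
named master and the cache file lives at the usual /etc/zfs/zpool.cache;
renaming rather than deleting keeps a way back:

    mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak   # set the stale device cache aside
    zpool import master                                # full device scan, reassemble vdevs by on-disk label
    # only if the plain import still fails with corruption:
    zpool import -F master                             # discard the last few TXGs (their data is lost)

The -F fallback is deliberately last: it throws data away, so it is only
worth trying if the ordinary import cannot find a consistent state.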
Jan Hellevik
2010-Oct-28 08:44 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
I think the 'corruption' is caused by the shuffling and mismatch of the
disks. One 1.5TB disk is now believed to be part of a mirror with a 2TB,
a 1TB part of a mirror with a 1.5TB, and so on. It would be better if ZFS
tried to find the second disk of each mirror instead of relying on which
controller/channel/port it was previously connected to.

So, my best action would be to delete the zpool.cache and then do a zpool
import?

Should I try to match disks with cables as they were previously connected
before I do the import? Will that make any difference?

BTW, ZFS version is 22.

Thanks, Jan
David Magda
2010-Oct-28 11:31 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
On Oct 28, 2010, at 04:44, Jan Hellevik wrote:

> So, my best action would be to delete the zpool.cache and then do a
> zpool import?
>
> Should I try to match disks with cables as they were previously
> connected before I do the import? Will that make any difference?
>
> BTW, ZFS version is 22.

I'd say export, rename zpool.cache, and then try importing it. ZFS should
scan all the devices and figure out what's there. If that still doesn't
work, try the "-F" option to go back a few transactions to a known-good
state.

Most file systems don't take well to having disks pulled on them, and ZFS
is no different there. It's just that with ZFS it can tell when there are
(potentially) corrupted blocks because of the checksumming.
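A hedged aside on the "scan all the devices" part: running zpool import
with no pool name only lists what the scan finds, without importing
anything, and zdb -l dumps the vdev labels on a given device so you can
see which pool and mirror it believes it belongs to. The device/slice
below (c6t1d0s0) is just an example taken from the earlier format listing;
the correct slice may differ on your system:

    zpool import                 # list importable pools found by scanning /dev/dsk
    zdb -l /dev/dsk/c6t1d0s0     # print the vdev labels (pool name, GUIDs, mirror membership)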
Jan Hellevik
2010-Oct-28 14:37 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
Thanks! I will try later today and report back the result.
Jan Hellevik
2010-Oct-29 07:48 UTC
[zfs-discuss] Ooops - did it again... Moved disks without export first.
Export did not go very well:

jan@opensolaris:~# zpool export master
internal error: Invalid argument
Abort (core dumped)

So I deleted (renamed) the zpool.cache and rebooted. After reboot I
imported the pool and it seems to have gone well. It is now scrubbing.
Thanks a lot for the help!

jan@opensolaris:~# zpool import master
jan@opensolaris:~# zpool status
  pool: master
 state: ONLINE
 scrub: scrub in progress for 0h1m, 0.08% done, 38h52m to go
config:

        NAME        STATE     READ WRITE CKSUM
        master      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c8d0    ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c9d0    ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
          mirror-2  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c9d1    ONLINE       0     0     0
          mirror-3  ONLINE       0     0     0
            c7d1    ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
          mirror-4  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0

errors: No known data errors
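(Once the scrub completes, a quick check along these lines confirms the
pool really came through clean; these are standard zpool status options,
with nothing assumed beyond the pool name master from above:

    zpool status -v master   # per-vdev state plus any files with unrecoverable errors
    zpool status -x          # prints "all pools are healthy" if nothing is wrong
)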