Jan Hellevik
2010-May-13 09:46 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Hi! I feel panic is close...

Short version: I moved the disks of a pool to a new controller without exporting it first. Then I moved them back to the original controller, but I still cannot import the pool.

I am new to OpenSolaris and ZFS - I have set up a box to keep my images and videos. ASUS motherboard, AMD Phenom II CPU, 4GB RAM, SASUC8I 8-port disk controller. 4x500GB disks in a raidz1 setup in external eSATA disk cabinets. After some time I decided to do mirrors, so I put in another 4x500GB in two mirrors. These I put in Chieftec backplanes. I started copying the files from the raidz pool to the mirrored pool.

Yesterday I decided to move the first 4 disks (with the raidz pool) from the external enclosures to the backplanes. Being tired after work and late at night, I forgot to export the pool before moving the disks. The disks were attached to the onboard SATA controller before I moved them. After the move I attached them to the SASUC8I controller. When I got problems I tried to attach them to the onboard controller again, but I still have problems.

jh@opensolaris:~$ zpool status
  pool: vault
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        vault         UNAVAIL      0     0     0  insufficient replicas
          raidz1-0    UNAVAIL      0     0     0  insufficient replicas
            c12d1     UNAVAIL      0     0     0  cannot open
            c12d0     UNAVAIL      0     0     0  cannot open
            c10d1     UNAVAIL      0     0     0  cannot open
            c11d0     UNAVAIL      0     0     0  cannot open
        logs
          c10d0p1     ONLINE       0     0     0

jh@opensolaris:~$ pfexec zpool import vault
cannot import 'vault': a pool with that name is already created/imported,
and no additional pools with that name were found

jh@opensolaris:~$ pfexec zpool export vault
cannot open 'vault': I/O error

jh@opensolaris:~$ pfexec format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c8d0 <DEFAULT cyl 6394 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@0,0
       1. c10d0 <DEFAULT cyl 465 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@11/ide@0/cmdk@0,0
       2. c13t0d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@0,0
       3. c13t1d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@1,0
       4. c13t2d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@2,0
       5. c13t3d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@3,0
       6. c13t4d0 <ATA-SAMSUNG HD501LJ-0-12-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@4,0
       7. c13t5d0 <ATA-SAMSUNG HD501LJ-0-13-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@5,0
       8. c13t6d0 <ATA-SAMSUNG HD501LJ-0-13-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@6,0
       9. c13t7d0 <ATA-SAMSUNG HD501LJ-0-12-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@7,0
Specify disk (enter its number): ^C

(0 is the boot drive, 1 is an OCZ SSD, 2-5 are the mirrored pool, 6-9 are the problem pool)

jh@opensolaris:~$ cfgadm
Ap_Id                          Type         Receptacle   Occupant     Condition
c13                            scsi-sas     connected    configured   unknown
usb5/1                         unknown      empty        unconfigured ok
...

jh@opensolaris:~$ zpool status
.... cannot see the pool

jh@opensolaris:~$ pfexec zpool import vault
cannot import 'vault': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.
jh@opensolaris:~$ pfexec poweroff

.... moved the disks back to the original controller

jh@opensolaris:~$ pfexec zpool import vault
cannot import 'vault': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

jh@opensolaris:~$ pfexec format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c8d0 <DEFAULT cyl 6394 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@0,0
       1. c10d0 <DEFAULT cyl 465 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@11/ide@0/cmdk@0,0
       2. c10d1 <SAMSUNG-S0MUJFWQ38208-0001-465.76GB>
          /pci@0,0/pci-ide@11/ide@0/cmdk@1,0
       3. c11d0 <SAMSUNG-S0MUJFWQ38207-0001-465.76GB>
          /pci@0,0/pci-ide@11/ide@1/cmdk@0,0
       4. c12d0 <SAMSUNG-S0MUJ1DPC0399-0001-465.76GB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@0,0
       5. c12d1 <SAMSUNG-S0MUJ1EPB1834-0001-465.76GB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@1,0
       6. c13t0d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@0,0
       7. c13t1d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@1,0
       8. c13t2d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@2,0
       9. c13t3d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@3,0
Specify disk (enter its number): ^C

jh@opensolaris:~$ pfexec cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c13                            scsi-sas     connected    configured   unknown
c13::dsk/c13t0d0               disk         connected    configured   unknown
c13::dsk/c13t1d0               disk         connected    configured   unknown
c13::dsk/c13t2d0               disk         connected    configured   unknown
c13::dsk/c13t3d0               disk         connected    configured   unknown

jh@opensolaris:~$ uname -a
SunOS opensolaris 5.11 snv_133 i86pc i386 i86pc Solaris

jh@opensolaris:~$ pfexec zpool history vault
cannot open 'vault': no such pool

jh@opensolaris:~$ fmdump -ev
TIME                 CLASS                               ENA
May 12 19:39:59.3691 ereport.fs.zfs.vdev.open_failed     0x01a202f6e1300401
May 12 19:39:59.3691 ereport.fs.zfs.vdev.open_failed     0x01a202f6e1300401
May 12 19:39:59.3691 ereport.fs.zfs.vdev.open_failed     0x01a202f6e1300401
May 12 19:39:59.3691 ereport.fs.zfs.vdev.open_failed     0x01a202f6e1300401
May 12 19:39:59.3691 ereport.fs.zfs.vdev.no_replicas     0x01a202f6e1300401
May 12 19:39:59.3691 ereport.fs.zfs.zpool                0x01a202f6e1300401
May 12 20:00:27.5687 ereport.fs.zfs.zpool                0x1381371d32000001
May 12 20:00:27.8720 ereport.fs.zfs.zpool                0x13825859e3200001
May 12 20:09:50.1421 ereport.fs.zfs.zpool                0x02934f4a6bd00401
May 12 20:09:50.1628 ereport.fs.zfs.zpool                0x0293630935800401
May 13 10:49:24.5203 ereport.fs.zfs.zpool                0x02952f070ac00001
May 13 10:49:24.5402 ereport.fs.zfs.zpool                0x0295420d9f800001

jh@opensolaris:~$ zpool status -x
all pools are healthy

jh@opensolaris:~$ pfexec fmstat
module              ev_recv ev_acpt wait     svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire             0       0  0.0      66.6   0   0     0     0      0      0
disk-transport            0       0  1.0  997458.1  98   0     0     0    32b      0
eft                       0       0  0.0      22.9   0   0     0     0   1.2M      0
ext-event-transport       0       0  0.0      66.6   0   0     0     0      0      0
fabric-xlate              0       0  0.0      66.6   0   0     0     0      0      0
fmd-self-diagnosis       40       0  0.0      41.9   0   0     0     0      0      0
io-retire                 0       0  0.0      13.1   0   0     0     0      0      0
sensor-transport          0       0  0.0       2.7   0   0     0     0    32b      0
snmp-trapgen              0       0  0.0      83.8   0   0     0     0      0      0
sysevent-transport        0       0  0.0     407.1   0   0     0     0      0      0
syslog-msgs               0       0  0.0      66.6   0   0     0     0      0      0
zfs-diagnosis            34       0  0.0       1.5   0   0     0     0      0      0
zfs-retire               30       0  0.0      11.6   0   0     0     0   2.4K      0

jh@opensolaris:~$ pfexec fmadm config
MODULE                   VERSION STATUS  DESCRIPTION
cpumem-retire            1.1     active  CPU/Memory Retire Agent
disk-transport           1.0     active  Disk Transport Agent
eft                      1.16    active  eft diagnosis engine
ext-event-transport      0.1     active  External FM event transport
fabric-xlate             1.0     active  Fabric Ereport Translater
fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis
io-retire                2.0     active  I/O Retire Agent
sensor-transport         1.1     active  Sensor Transport Agent
snmp-trapgen             1.0     active  SNMP Trap Generation Agent
sysevent-transport       1.0     active  SysEvent Transport Agent
syslog-msgs              1.0     active  Syslog Messaging Agent
zfs-diagnosis            1.0     active  ZFS Diagnosis Engine
zfs-retire               1.0     active  ZFS Retire Agent

jh@opensolaris:~$ zpool get version rpool
NAME   PROPERTY  VALUE    SOURCE
rpool  version   22       default

... and this is where I am now.

The zpool contains my digital images and videos and I would be really unhappy to lose them. What can I do to get the pool back? Is there hope?

Sorry for the long post - I tried to assemble as much relevant information as I could.

Best regards,
Jan Hellevik
Jim Horng
2010-May-13 17:21 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
When I boot up without the disks in the slots, I manually bring the pool online with

zpool clear <poolname>

I believe that is what was missing from your commands. However, I did not try changing controllers.

Hopefully you only unplugged the disks while the system was turned off. If that's the case, the pool should still be in a consistent state. Otherwise, you may want to consider leaving the first disk you removed unplugged until you bring the pool online, then re-add the device (first out, last in).

Good luck.
Ron Mexico
2010-May-14 00:17 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
I have moved drives between controllers, rearranged drives in other slots, and moved disk sets between different machines, and I've never had an issue with a zpool not importing. Are you sure you didn't remove the drives while the system was powered up?

Try this:

zpool import -D

If zpool lists the pool as destroyed, you can re-import it by doing this:

zpool import -D vault

I know this is a shot in the dark - sorry for not having a better idea.
Jan Hellevik
2010-May-14 07:25 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
jh@opensolaris:~$ pfexec zpool import -D
no pools available to import

Any other ideas?
Jan Hellevik
2010-May-14 07:26 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
jh@opensolaris:~$ zpool clear vault
cannot open 'vault': no such pool
Jan Hellevik
2010-May-14 07:40 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Yes, I turned the system off before I connected the disks to the other controller. And I turned the system off before moving them back to the original controller.

Now it seems like the system does not see the pool at all. The disks are there, and they have not been used, so I do not understand why I cannot see the pool anymore.

Short version of what I did (actual output is in the original post):

zpool status  -> pool is there but unavailable
zpool import  -> pool already created
zpool export  -> I/O error
format
cfgadm
zpool status  -> pool is gone......

It seems like the pool vanished after cfgadm? Any pointers? I am really getting worried now that the pool is gone for good.

What I do not understand is why it is gone - the disks are still there, so it should be possible to import the pool? What am I missing here? Any ideas?
Jan Hellevik
2010-May-14 08:50 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Hey! It is there! :-) Cannot believe I did not try the import command again. :-)

But I still have problems - I had added a slice of an SSD as log and another slice as cache to the pool. The SSD is there - c10d1 but ... Ideas? The log part showed under the pool when I initially tried the import, but now it is gone. I am afraid of doing something stupid at this point in time. Any help is really appreciated!

jh@opensolaris:~$ pfexec zpool import
  pool: vault
    id: 8738898173956136656
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        vault         UNAVAIL  missing device
          raidz1-0    ONLINE
            c11d0     ONLINE
            c12d0     ONLINE
            c12d1     ONLINE
            c10d1     ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.

jh@opensolaris:~$ pfexec format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c8d0 <DEFAULT cyl 6394 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@0,0
       1. c10d0 <DEFAULT cyl 465 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@11/ide@0/cmdk@0,0
       2. c10d1 <SAMSUNG-S0MUJFWQ38208-0001-465.76GB>
          /pci@0,0/pci-ide@11/ide@0/cmdk@1,0
       3. c11d0 <SAMSUNG-S0MUJFWQ38207-0001-465.76GB>
          /pci@0,0/pci-ide@11/ide@1/cmdk@0,0
       4. c12d0 <SAMSUNG-S0MUJ1DPC0399-0001-465.76GB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@0,0
       5. c12d1 <SAMSUNG-S0MUJ1EPB1834-0001-465.76GB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@1,0
       6. c13t0d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@0,0
       7. c13t1d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@1,0
       8. c13t2d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@2,0
       9. c13t3d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@3,0
Specify disk (enter its number): ^C
jh@opensolaris:~$

On Thu, May 13, 2010 at 7:15 PM, Richard Elling <richard.elling@gmail.com> wrote:
> now try "zpool import" to see what it thinks the drives are
>  -- richard
>
> On May 13, 2010, at 2:46 AM, Jan Hellevik wrote:
> > Short version: I moved the disks of a pool to a new controller without
> > exporting it first. Then I moved them back to the original controller,
> > but I still cannot import the pool.
> > [...]

--
Jan Hellevik
Tel: +47-41004070
Haudy Kazemi
2010-May-14 22:34 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Now that you've re-imported, it seems like zpool clear may be the command you need, based on discussion in these links about missing and broken ZFS logs:

http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg37554.html
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg30469.html
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6707530
http://www.sun.com/msg/ZFS-8000-6X

Jan Hellevik wrote:
> Hey! It is there! :-) Cannot believe I did not try the import command again. :-)
>
> But I still have problems - I had added a slice of an SSD as log and
> another slice as cache to the pool. The SSD is there - c10d1 but ...
> Ideas? The log part showed under the pool when I initially tried the
> import, but now it is gone. I am afraid of doing something stupid at
> this point in time. Any help is really appreciated!
> [...]
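For what it's worth, a minimal sketch of what clearing a faulted log device would look like once the pool can actually be imported - this assumes the pool imports and that the log still shows up as c10d0p1; it is not something that was run in this thread:

zpool import vault            # only works once all devices are visible
zpool clear vault c10d0p1     # clear error state on the log vdev
zpool status vault            # confirm the pool and log are ONLINE again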
Haudy Kazemi
2010-May-14 22:37 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Is there any chance that the second controller wrote something onto the disks when it saw them attached, thus corrupting the ZFS drive signatures or more? I've heard that some controllers require drives to be initialized by them and/or signatures written to drives by them. Maybe your second controller wrote to the drives without you knowing about it.

If you have a pair of (small) spare drives, make a ZFS mirror out of them and try to recreate the problem by repeating your steps on them. If you can recreate the problem, try to narrow it down to whether the problem is caused by the second controller changing things, or if the skipped zfs export is playing a role. I think the skipped zfs export might have led to zfs import needing to be forced (-f), but as long as you weren't trying to access the disks from two systems at the same time it shouldn't have been catastrophic. Forcing shouldn't be necessary if things are being handled cleanly and correctly. My hunch is the second controller did something when it saw the drives connected to it, particularly if the second controller was configured in RAID mode rather than JBOD or passthrough. Or maybe you changed some settings in the second controller's BIOS that caused it to write to the drives while you were trying to get things to work?

I've seen something similar from the BIOS on a Gigabyte X38 chipset motherboard that has "Quad BIOS". This is partly documented by Gigabyte at
http://www.gigabyte.com.tw/FileList/NewTech/2006_motherboard_newtech/how_does_quad_bios_work_dq6.htm

From my testing, the BIOS on this board writes a copy of itself using an HPA (Host Protected Area) to a hard drive for BIOS recovery purposes in case of a bad flashing/BIOS upgrade. There is no prompting for the writing; it appears to simply happen to whichever drive was the first one connected to the PC, which is usually the current boot drive. On a new clean disk this would be harmless, but it risks data loss when reusing drives or transferring drives between systems. This behavior is able to cause data loss and has affected people using Windows Dynamic Disks and UnRaid, as can be seen by searching Google for "Gigabyte HPA".

More details: As long as that drive is connected to the PC, the BIOS recognizes it as being the 'recovery' drive and doesn't write to another drive. If that drive is removed, then another drive will have an HPA created on it. The easiest way to control this is to initially have just one drive connected - the one you don't mind the HPA being placed on. Then you can add the other drives without them being modified. The HPA is created on 2113 sectors at the end of the drive. HDAT (a low-level drive diag/repair/config utility) cannot remove this HPA while the drive is still the first drive (the BIOS must be enforcing protection of that area). Making this drive a secondary drive by forcing the BIOS to create another HPA on another drive allows HDAT to remove the HPA. Manually examining the last 2114 (one more for good measure) sectors will now show that they contain a BIOS backup image.

Other observations: Device order in Linux (e.g. /dev/sda /dev/sdb) made no difference to where the HPA ended up.

Jan Hellevik wrote:
> Yes, I turned the system off before I connected the disks to the other controller. And I turned the system off before moving them back to the original controller.
>
> Now it seems like the system does not see the pool at all.
> [...]
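An illustrative way to check for such an HPA, not taken from this thread: it assumes the suspect drive can be temporarily attached to a Linux live environment where hdparm and blockdev are available, and /dev/sdX is a placeholder for the drive:

# If the "max sectors" values differ, an HPA is hiding sectors at the end of the drive.
hdparm -N /dev/sdX

# The kernel's view of the device size, in 512-byte sectors, for comparison.
blockdev --getsz /dev/sdX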
Jim Horng
2010-May-15 01:06 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
You may or may not need to add the log device back. zpool clear should bring the pool online. Either way, it shouldn't affect the data.
Jan Hellevik
2010-May-15 09:03 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Thanks for the help, but I cannot get it to work.

jh@opensolaris:~# zpool import
  pool: vault
    id: 8738898173956136656
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        vault         UNAVAIL  missing device
          raidz1-0    ONLINE
            c11d0     ONLINE
            c12d0     ONLINE
            c12d1     ONLINE
            c10d1     ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.

jh@opensolaris:~# zpool clear vault
cannot open 'vault': no such pool

jh@opensolaris:~# zpool add vault log /dev/dsk/c10d0p1
cannot open 'vault': no such pool

jh@opensolaris:~# zpool import vault
cannot import 'vault': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

It seems to me that the 'missing' log device is the problem. The first time I did a 'zpool import' it was aware of the log device, but the disks were missing (because they were moved to a different controller). Now that the disks are back, the log is not showing up anymore.

I read http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6707530 and if I understand correctly this is related to my problem, but it should also have been fixed in build 96? I am on build 133....
Jan Hellevik
2010-May-15 09:51 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
I cannot import - that is the problem. :-(

I have read the discussions you referred to (and quite a few more), and also about the logfix program. I also found a discussion where 'zpool import -FX' solved a similar problem, so I tried that, but no luck. Now I have read so many discussions and blog posts I am getting dizzy. :-)

To summarize:

I moved the disks without exporting first
I tried to import
I moved them back
I cannot import the pool because of the missing log
The log is there, the disks are back, but I still cannot import

According to bug 6707530 this should have been fixed in b96? Since I am on b133 it shouldn't affect me, or am I wrong?
Jan Hellevik
2010-May-15 09:53 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
I don't think that is the problem (but I am not sure). It seems like the problem is that the ZIL is missing. It is there, but not recognized.

I used fdisk to create a 4GB partition on an SSD, and then added it to the pool with the command 'zpool add vault log /dev/dsk/c10d0p1'.

When I try to import the pool, it says the log is missing. When I try to add the log to the pool, it says there is no such pool (since it isn't imported yet). Catch-22? :-)
Roy Sigurd Karlsbakk
2010-May-15 10:25 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
----- "Jan Hellevik" <opensolaris at janhellevik.com> skrev:> I don''t think that is the problem (but I am not sure). It seems like > te problem is that the ZIL is missing. It is there, but not > recognized. > > I used fdisk to create a 4GB partition of a SSD, and then added it to > the pool with the command ''zpool add vault log /dev/dsk/c10d0p1''. > > When I try to import the pool is says the log is missing. When I try > to add the log to the pool it says there is no such pool (since it > isn''t imported yet). Catch22? :-)Which version of opensolaris/zpool is this? There was a problem with earlier osol (up to snv_129 or so, don''t remember) that they failed to import a pool if the zil was missing - you effectually lost the whole pool. This was fixed in later (development) versions of opensolaris. I have still seen some reports addressing problems with this in later versions, but I don''t have the links handy - google for it :) Best regards roy -- Roy Sigurd Karlsbakk (+47) 97542685 roy at karlsbakk.net http://blogg.karlsbakk.net/ -- I all pedagogikk er det essensielt at pensum presenteres intelligibelt. Det er et element?rt imperativ for alle pedagoger ? unng? eksessiv anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller eksisterer adekvate og relevante synonymer p? norsk.
Jan Hellevik
2010-May-15 13:23 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
snv_133 and zpool version 22. At least my rpool is version 22.
Richard Elling
2010-May-15 14:48 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
On May 15, 2010, at 2:53 AM, Jan Hellevik wrote:
> I don't think that is the problem (but I am not sure). It seems like the problem is that the ZIL is missing. It is there, but not recognized.
>
> I used fdisk to create a 4GB partition on an SSD, and then added it to the pool with the command 'zpool add vault log /dev/dsk/c10d0p1'.

Ah, this is critical information! By default, ZFS import does not look for fdisk partitions. Hence, your log device is not found. Since the pool is exported, there is no entry in /etc/zfs/zpool.cache to give ZFS a hint to look at the fdisk partition.

First, you need to find the partition, because it might have moved to a new controller. For this example, let's assume the new disk pathname is c33d0.

1. Verify that you can read the ZFS label on the partition:
       zdb -l /dev/dsk/c33d0p1
   You should see 4 labels.

2. Create a symlink ending with "s0" to the partition:
       ln -s /dev/dsk/c33d0p1 /dev/dsk/c33d0p1s0

3. See if ZFS can find the log device:
       zpool import

4. If that doesn't work, let us know and we can do the same trick using another name (than c33d0p1s0), or another way, using the -d option to zpool import.

> When I try to import the pool, it says the log is missing. When I try to add the log to the pool, it says there is no such pool (since it isn't imported yet). Catch-22? :-)

By default, Solaris only looks at fdisk partitions with a Solaris2 ID, and only one fdisk partition per disk.
 -- richard

--
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
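A minimal sketch of the -d variant mentioned in step 4, under the same assumption that the log partition shows up as /dev/dsk/c33d0p1 (an example name, not the real device):

# Put an s0-style symlink in a scratch directory and have zpool import search
# that directory in addition to the normal /dev/dsk namespace.
mkdir /tmp/logdev
ln -s /dev/dsk/c33d0p1 /tmp/logdev/c33d0p1s0
zpool import -d /tmp/logdev -d /dev/dsk vault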
Jan Hellevik
2010-May-15 15:31 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Thanks! Not home right now, but I will try that as soon as I get home.
Roy Sigurd Karlsbakk
2010-May-15 16:11 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
----- "Richard Elling" <richard.elling at gmail.com> skrev:> On May 15, 2010, at 2:53 AM, Jan Hellevik wrote: > > > I don''t think that is the problem (but I am not sure). It seems like > te problem is that the ZIL is missing. It is there, but not > recognized. > > > > I used fdisk to create a 4GB partition of a SSD, and then added it > to the pool with the command ''zpool add vault log /dev/dsk/c10d0p1''. > > Ah, this is critical information! By default, ZFS import does not > look for fdisk partitions. Hence, your log device is not found. > Since the pool is exported, there is no entry in /etc/zfs/zpool.cache > to give ZFS a hint to look at the fdisk partition.Will ZFS look for all these devices in case of failure? roy at urd:~$ zpool status pool: dpool state: ONLINE scrub: scrub completed after 63h23m with 0 errors on Tue May 4 08:23:58 2010 config: NAME STATE READ WRITE CKSUM dpool ONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 c7t2d0 ONLINE 0 0 0 c7t3d0 ONLINE 0 0 0 c7t4d0 ONLINE 0 0 0 c7t5d0 ONLINE 0 0 0 c7t6d0 ONLINE 0 0 0 c7t7d0 ONLINE 0 0 0 c8t0d0 ONLINE 0 0 0 raidz2-1 ONLINE 0 0 0 c8t1d0 ONLINE 0 0 0 c8t2d0 ONLINE 0 0 0 c8t3d0 ONLINE 0 0 0 c8t4d0 ONLINE 0 0 0 c8t5d0 ONLINE 0 0 0 c8t6d0 ONLINE 0 0 0 c8t7d0 ONLINE 0 0 0 raidz2-2 ONLINE 0 0 0 c9t0d0 ONLINE 0 0 0 c9t1d0 ONLINE 0 0 0 c9t2d0 ONLINE 0 0 0 c9t3d0 ONLINE 0 0 0 c9t4d0 ONLINE 0 0 0 52K repaired c9t5d0 ONLINE 0 0 0 c9t6d0 ONLINE 0 0 0 logs mirror-3 ONLINE 0 0 0 c10d1s0 ONLINE 0 0 0 c11d0s0 ONLINE 0 0 0 cache c10d1s1 ONLINE 0 0 0 c11d0s1 ONLINE 0 0 0 spares c9t7d0 AVAIL -- Vennlige hilsener roy -- Roy Sigurd Karlsbakk (+47) 97542685 roy at karlsbakk.net http://blogg.karlsbakk.net/ -- I all pedagogikk er det essensielt at pensum presenteres intelligibelt. Det er et element?rt imperativ for alle pedagoger ? unng? eksessiv anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller eksisterer adekvate og relevante synonymer p? norsk.
Jan Hellevik
2010-May-15 17:48 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
It did not work. I did not find labels on p1, but on p0.

jh@opensolaris:~# zdb -l /dev/dsk/c10d0p1
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3

jh@opensolaris:~# zdb -l /dev/dsk/c10d0p0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 22
    state: 4
    guid: 9172477941882675499
--------------------------------------------
LABEL 1
--------------------------------------------
    version: 22
    state: 4
    guid: 9172477941882675499
--------------------------------------------
LABEL 2
--------------------------------------------
    version: 22
    state: 4
    guid: 9172477941882675499
--------------------------------------------
LABEL 3
--------------------------------------------
    version: 22
    state: 4
    guid: 9172477941882675499

jh@opensolaris:~# ln -s /dev/dsk/c10d0p0 /dev/dsk/c10d0p0s0
jh@opensolaris:~# zpool import
  pool: vault
    id: 8738898173956136656
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        vault         UNAVAIL  missing device
          raidz1-0    ONLINE
            c11d0     ONLINE
            c12d0     ONLINE
            c12d1     ONLINE
            c10d1     ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.
jh@opensolaris:~#
Haudy Kazemi
2010-May-15 18:14 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Can you recreate the problem with a second pool on a second set of drives, like I described in my earlier post?

Right now it seems like your problem is mostly due to the missing log device. I'm wondering if that missing log device is what messed up the initial move to the other controller, or if the other controller did something to the disks when it saw them.

Jan Hellevik wrote:
> I don't think that is the problem (but I am not sure). It seems like the problem is that the ZIL is missing. It is there, but not recognized.
>
> I used fdisk to create a 4GB partition on an SSD, and then added it to the pool with the command 'zpool add vault log /dev/dsk/c10d0p1'.
>
> When I try to import the pool, it says the log is missing. When I try to add the log to the pool, it says there is no such pool (since it isn't imported yet). Catch-22? :-)
Jan Hellevik
2010-May-15 19:14 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Yes, I can try to do that. I do not have any more of this brand of disk, but I guess that does not matter. It will have to wait until tomorrow (I have an appointment in a few minutes, and it is getting late here in Norway), but I will try first thing tomorrow. I guess a pool on a single drive will do the trick? I can create the log as a partition on yet another drive just as I did with the SSD (do not want to mess with it just yet). Thanks for helping!
Haudy Kazemi
2010-May-15 21:04 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Jan Hellevik wrote:
> Yes, I can try to do that. I do not have any more of this brand of disk, but I guess that does not matter. It will have to wait until tomorrow (I have an appointment in a few minutes, and it is getting late here in Norway), but I will try first thing tomorrow. I guess a pool on a single drive will do the trick? I can create the log as a partition on yet another drive just as I did with the SSD (do not want to mess with it just yet). Thanks for helping!

In this case the specific brand and model of drive probably does not matter. The most accurate test will be to set up a test pool as similar as possible to the damaged pool, i.e. a 4-disk RAIDZ1 with a log on a partition of a 5th disk.

A single-drive pool might do the trick for testing, but it has no redundancy. The smallest pool with redundancy is a mirror, thus the suggestion to use a mirror. If you have enough spare/small/old drives that are compatible with the second controller, use them to model your damaged pool. For this test it doesn't really matter if these are 4 GB or 40 GB or 400 GB drives.

Try the following things in order. Keep a copy of the terminal commands you use and the command responses you get.

1.) Wipe (e.g. dban/dd/zero wipe) the disks that will make up the test pool, and create the test pool. Copy some test data to the pool, like an OpenSolaris ISO file. Try migrating the disks to the second controller the same way you did with your damaged pool. Use the exact same steps in the same order. Check your notes/earlier posts while doing this to make sure you remember them exactly. If that works (a forced import will likely be needed), then you might have had a one-time error, or a hard-to-reproduce error, or maybe you did a minor step slightly differently from how you remembered doing it with the damaged pool. If that fails, then you may have a repeatable test case.

2.) Wipe (e.g. dban/dd/zero wipe) the disks that made up the test pool, and recreate the test pool. Copy some test data to the pool, like an OpenSolaris ISO file. Try migrating the disks the recommended way, using export, powering everything off, and then import. If that works (without needing a forced import), then skipping the export was likely a trigger. If that fails, it seems like the second controller is doing something to the disks. Look at the controller BIOS settings for something relevant and see if there are any firmware updates available.

3.) If you have a third (different model) controller (or another computer running the same Solaris version with a different controller), repeat step 2 with it. If step 2 failed but this works, that's more evidence the second controller is up to something.
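A minimal sketch of the wipe-and-recreate setup in step 1, assuming two spare disks show up as c14t0d0 and c14t1d0 and a free fdisk partition c10d0p2 is available for the log (all three names are placeholders, not devices from this thread):

# Zero both spare disks end to end so no old labels survive (p0 = whole disk; slow).
dd if=/dev/zero of=/dev/rdsk/c14t0d0p0 bs=1024k
dd if=/dev/zero of=/dev/rdsk/c14t1d0p0 bs=1024k

# Build a redundant test pool, add a log on the spare fdisk partition,
# and copy in some throwaway test data before moving the disks.
zpool create testpool mirror c14t0d0 c14t1d0
zpool add testpool log /dev/dsk/c10d0p2
cp /path/to/osol.iso /testpool/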
Jan Hellevik
2010-May-16 14:52 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
I am making a second backup of my other pool - then I'll use those disks and recreate the problem pool. The only difference will be the SSD - only have one of those. I'll use a disk in the same slot, so it will be close. Backup will be finished in 2 hours time....
Jan Hellevik
2010-May-16 19:42 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Ok - this is really strange. I did a test. Wiped my second pool (4 disks like the other pool), and used them to create a pool similar to the one I have problems with.

Then I powered off, moved the disks and powered on. Same error message as before. Moved the disks back to the original controller. Pool is ok. Moved the disks to the new controller. At first it is exactly like my original problem, but when I did a second zpool import, the pool is imported ok.

Zpool status reports the same as before. I ran the same commands as I did the first time:

zpool status
zpool import
zpool export
format
cfgadm
zpool status
zpool import   ---> now it imports the pool!

How can this be? The only difference (as far as I can tell) is that the cache/log is on a 2.5" Samsung disk instead of a 2.5" OCZ SSD.

Details follow (it is long - sorry):

Also note below - I did a zpool destroy mpool before poweroff. When I powered on and did a zpool status it showed the pool as UNAVAIL. It should not be there at all, if I understand correctly?

----- create the partitions for log and cache

             Total disk size is 30401 cylinders
             Cylinder size is 16065 (512 byte) blocks

                                        Cylinders
      Partition   Status    Type       Start   End    Length    %
      =========   ======    ========   =====   ====   ======   ===
          1                 Solaris2       1    608      608     2
          2                 Solaris2     609   3040     2432     8

format> quit
jh@opensolaris:~# zpool destroy mpool
jh@opensolaris:~# poweroff

Last login: Sun May 16 17:07:15 2010 from macpro.janhelle
Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010
jh@opensolaris:~$ pfexec bash
jh@opensolaris:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c8d0 <DEFAULT cyl 6394 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@0,0
       1. c10d0 <DEFAULT cyl 606 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@11/ide@0/cmdk@0,0
       2. c10d1 <SAMSUNG-S0MUJ1KP98569-0001-465.76GB>
          /pci@0,0/pci-ide@11/ide@0/cmdk@1,0
       3. c11d0 <SAMSUNG-S0MUJ1MP91161-0001-465.76GB>
          /pci@0,0/pci-ide@11/ide@1/cmdk@0,0
       4. c12d0 <SAMSUNG-S0MUJ1MP91161-0001-465.76GB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@0,0
       5. c12d1 <SAMSUNG-S0MUJ1KP98569-0001-465.76GB>
          /pci@0,0/pci-ide@14,1/ide@1/cmdk@1,0
Specify disk (enter its number): ^C
jh@opensolaris:~# zpool create vault2 raidz c10d1 c11d0 c12d0 c12d1
jh@opensolaris:~# zpool status

------ this pool is the one I destroyed - why is it here now?

  pool: mpool
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        mpool        UNAVAIL      0     0     0  insufficient replicas
          mirror-0   UNAVAIL      0     0     0  insufficient replicas
            c13t2d0  UNAVAIL      0     0     0  cannot open
            c13t0d0  UNAVAIL      0     0     0  cannot open
          mirror-1   UNAVAIL      0     0     0  insufficient replicas
            c13t3d0  UNAVAIL      0     0     0  cannot open
            c13t1d0  UNAVAIL      0     0     0  cannot open

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0

errors: No known data errors

jh@opensolaris:~# zpool destroy mpool
cannot open 'mpool': I/O error
jh@opensolaris:~# zpool status -x
all pools are healthy
jh@opensolaris:~# zpool status

------ and now the pool has vanished

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0

errors: No known data errors
jh@opensolaris:~#

dmesg - 4 times these messages:

May 16 20:36:19 opensolaris fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
May 16 20:36:19 opensolaris EVENT-TIME: Sun May 16 20:36:18 CEST 2010
May 16 20:36:19 opensolaris PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: opensolaris
May 16 20:36:19 opensolaris SOURCE: zfs-diagnosis, REV: 1.0
May 16 20:36:19 opensolaris EVENT-ID: f5f9feb6-34e9-6465-a15e-a3f4724c6f25
May 16 20:36:19 opensolaris DESC: A ZFS device failed.  Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
May 16 20:36:19 opensolaris AUTO-RESPONSE: No automated response will occur.
May 16 20:36:19 opensolaris IMPACT: Fault tolerance of the pool may be compromised.
May 16 20:36:19 opensolaris REC-ACTION: Run 'zpool status -x' and replace the bad device.

and then:

May 16 20:36:19 opensolaris fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-CS, TYPE: Fault, VER: 1, SEVERITY: Major
May 16 20:36:19 opensolaris EVENT-TIME: Sun May 16 20:36:19 CEST 2010
May 16 20:36:19 opensolaris PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: opensolaris
May 16 20:36:19 opensolaris SOURCE: zfs-diagnosis, REV: 1.0
May 16 20:36:19 opensolaris EVENT-ID: 57db7aa6-658a-ef83-875b-b2af77e4493a
May 16 20:36:19 opensolaris DESC: A ZFS pool failed to open.  Refer to http://sun.com/msg/ZFS-8000-CS for more information.
May 16 20:36:19 opensolaris AUTO-RESPONSE: No automated response will occur.
May 16 20:36:19 opensolaris IMPACT: The pool data is unavailable
May 16 20:36:19 opensolaris REC-ACTION: Run 'zpool status -x' and attach any missing devices, follow
May 16 20:36:19 opensolaris any provided recovery instructions or restore from backup.

May 16 20:48:48 opensolaris zfs: [ID 249136 kern.info] created version 22 pool vault2 using 22

------ these are the same commands I used before

jh@opensolaris:~# zpool add vault2 log /dev/dsk/c10d0p1
jh@opensolaris:~# zpool add vault2 cache /dev/dsk/c10d0p0
jh@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0
        logs
          c10d0p1   ONLINE       0     0     0
        cache
          c10d0p0   ONLINE       0     0     0

errors: No known data errors
jh@opensolaris:~# poweroff

----- moved the 4 disks to the other controller, powered on

Last login: Sun May 16 20:37:29 2010 from macpro.janhelle
Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010
jh@opensolaris:~$ zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      UNAVAIL      0     0     0  insufficient replicas
          raidz1-0  UNAVAIL      0     0     0  insufficient replicas
            c10d1   UNAVAIL      0     0     0  cannot open
            c11d0   UNAVAIL      0     0     0  cannot open
            c12d0   UNAVAIL      0     0     0  cannot open
            c12d1   UNAVAIL      0     0     0  cannot open
        logs
          c10d0p1   ONLINE       0     0     0
jh@opensolaris:~$ pfexec poweroff

---- moved the disks back to the original controller to see if it is ok to just move them

jh@opensolaris:~$ zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0
        logs
          c10d0p1   ONLINE       0     0     0
        cache
          c10d0p0   ONLINE       0     0     0

errors: No known data errors
jh@opensolaris:~$ pfexec poweroff

----- moved the disks to the new controller again

Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010
jh@opensolaris:~$ pfexec bash
jh@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      UNAVAIL      0     0     0  insufficient replicas
          raidz1-0  UNAVAIL      0     0     0  insufficient replicas
            c10d1   UNAVAIL      0     0     0  cannot open
            c11d0   UNAVAIL      0     0     0  cannot open
            c12d0   UNAVAIL      0     0     0  cannot open
            c12d1   UNAVAIL      0     0     0  cannot open
        logs
          c10d0p1   ONLINE       0     0     0
jh@opensolaris:~# zpool import vault2
cannot import 'vault2': a pool with that name is already created/imported,
and no additional pools with that name were found
jh@opensolaris:~# zpool export vault2
cannot open 'vault2': I/O error
jh@opensolaris:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c8d0 <DEFAULT cyl 6394 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@14,1/ide@0/cmdk@0,0
       1. c10d0 <DEFAULT cyl 606 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@11/ide@0/cmdk@0,0
       2. c13t0d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@0,0
       3. c13t1d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@1,0
       4. c13t2d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@2,0
       5. c13t3d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@3,0
Specify disk (enter its number): ^C
jh@opensolaris:~# cfgadm
Ap_Id                          Type         Receptacle   Occupant     Condition
c13                            scsi-sas     connected    configured   unknown
usb5/1                         unknown      empty        unconfigured ok
...
jh@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors
jh@opensolaris:~# zpool import vault2
jh@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        vault2       ONLINE       0     0     0
          raidz1-0   ONLINE       0     0     0
            c13t2d0  ONLINE       0     0     0
            c13t1d0  ONLINE       0     0     0
            c13t3d0  ONLINE       0     0     0
            c13t0d0  ONLINE       0     0     0
        logs
          c10d0p1    ONLINE       0     0     0
        cache
          c10d0p0    ONLINE       0     0     0

errors: No known data errors
jh@opensolaris:~#
Haudy Kazemi
2010-May-17 04:51 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
I don't really have an explanation. Perhaps flaky second controller hardware that only works sometimes and can corrupt pools? Have you seen any other strangeness/instability on this computer?

Did you use zpool export before moving the disks the first time to the second controller, or did you just move them without exporting?

If you dd zero wipe the disks that made up this test pool, and then recreate the test pool, does it behave the same way the second time?

Jan Hellevik wrote:
> Ok - this is really strange. I did a test. Wiped my second pool (4 disks like the other pool), and used them to create a pool similar to the one I have problems with.
>
> Then I powered off, moved the disks and powered on. Same error message as before. Moved the disks back to the original controller. Pool is ok. Moved the disks to the new controller. At first it is exactly like my original problem, but when I did a second zpool import, the pool is imported ok.
> [...]
Jan Hellevik
2010-Jun-12 10:45 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Hi! Sorry for the late reply - I have been busy at work and this had to wait. The system has been powered off since my last post.

The computer is new - built it to use as file server at home. I have not seen any strange behaviour (other than this). All parts are brand new (except for the disks).

I did not do an export when I moved the test pool - I did everything exactly as I did when I had the incident. The only difference is that I used a HDD instead of a SSD as I didn't have an available SSD to use for the log/cache.

I am not sure how to dd zero wipe the disks, but I can give it a try. I'll google for the syntax.

Is there anything else I can do to get my pool back? It seems strange to me that merely moving the disks will render it useless. I have not written anything to the disks, so all the data is there - is there no way to retrieve the files?
-- 
This message posted from opensolaris.org
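In case it helps while searching: a minimal sketch of the dd zero wipe asked about above, assuming c10d1 is one of the test-pool disks (this destroys everything on that disk, so double-check the device name with format first):

   # wipes the entire disk, including old fdisk tables and ZFS labels - destructive!
   pfexec dd if=/dev/zero of=/dev/rdsk/c10d1p0 bs=1024k

Zeroing a whole 500GB drive takes a while; the point is only that no stale labels survive before the test pool is recreated.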
Richard Elling
2010-Jun-12 17:42 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
On Jun 12, 2010, at 3:45 AM, Jan Hellevik wrote:
> Hi! Sorry for the late reply - I have been busy at work and this had to wait. The system has been powered off since my last post.
>
> The computer is new - built it to use as file server at home. I have not seen any strange behaviour (other than this). All parts are brand new (except for the disks).
>
> I did not do an export when I moved the test pool - I did everything exactly as I did when I had the incident. The only difference is that I used a HDD instead of a SSD as I didn't have an available SSD to use for the log/cache.
>
> I am not sure how to dd zero wipe the disks, but I can give it a try. I'll google for the syntax.
>
> Is there anything else I can do to get my pool back? It seems strange to me that merely moving the disks will render it useless. I have not written anything to the disks, so all the data is there - is there no way to retrieve the files?

You used the fdisk partitions instead of a slice. If you export the pool, then the hint to look for fdisk partitions instead of a slice is lost (cleared from the zpool.cache file). If you search the forum archives, you'll find similar situations such as:
http://opensolaris.org/jive/thread.jspa?messageID=461199

Cindy, let's write something up for the ZFS Troubleshooting Guide :-)
 -- richard

-- 
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
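Before trying anything destructive, it may also be worth confirming that the on-disk ZFS labels are still readable. A sketch, using the fdisk partition that held the log device in this thread as the example device (if the labels are intact, zdb -l prints them, including pool name, GUID and vdev layout):

   # read the ZFS labels straight off the fdisk partition
   pfexec zdb -l /dev/rdsk/c10d0p1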
Jan Hellevik
2010-Jun-12 19:43 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
Thanks for the reply. The thread on FreeBSD mentions creating symlinks for the fdisk partitions. So did you earlier in this thread. I tried that but it did not help - you can see the result in my earlier reply to your previous message in this thread.

Is this the way to go? Should I try again with symlinks? The server has been powered off since I started this thread, so it should be intact (or at least not worse than before)...

Is there hope for my pool or is it lost is really my question. Your latest post wasn't really clear on that point. :-) I would really like to get this pool online again.

Thanks for helping.
-- 
This message posted from opensolaris.org
Richard Elling
2010-Jun-12 21:10 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving back
On Jun 12, 2010, at 12:43 PM, Jan Hellevik wrote:
> Thanks for the reply. The thread on FreeBSD mentions creating symlinks for the fdisk partitions. So did you earlier in this thread. I tried that but it did not help - you can see the result in my earlier reply to your previous message in this thread.
>
> Is this the way to go? Should I try again with symlinks? The server has been powered off since I started this thread, so it should be intact (or at least not worse than before)...
>
> Is there hope for my pool or is it lost is really my question. Your latest post wasn't really clear on that point. :-) I would really like to get this pool online again.

Hopefully some facts will make it clear:
1. ZFS import looks in the zpool.cache file before looking in /dev/dsk
2. To search in a directory other than /dev/dsk, use the -d option
3. ZFS import will look for devices named c*t*d*s*
4. ZFS import will never look for devices named c*t*d*p*
5. Every file in /dev is a symlink, managed by devfsadm(1m)
 -- richard

-- 
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
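Put together, those facts suggest the symlink workaround mentioned earlier in the thread: expose the fdisk partition under a slice-style name in a private directory and point zpool import at it. A rough sketch with hypothetical paths and link names, not a tested recipe:

   pfexec mkdir /var/tmp/zdev
   # hypothetical name - link the p* device under a c*t*d*s*-style name so import will consider it
   pfexec ln -s /dev/dsk/c10d0p1 /var/tmp/zdev/c10d0s0
   # search both the normal device directory and the directory holding the symlink
   pfexec zpool import -d /dev/dsk -d /var/tmp/zdev vault

Since -d replaces the default search path, it is given twice here so the raidz data disks under /dev/dsk are still scanned alongside the symlinked log partition.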
Jan Hellevik
2010-Jun-13 15:09 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving ba
I found a thread that mentions an undocumented parameter -V (http://opensolaris.org/jive/thread.jspa?messageID=444810) and that did the trick!

The pool is now online and seems to be working well.

Thanks to everyone who helped!
-- 
This message posted from opensolaris.org
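For anyone landing here later: going by the linked thread, the flag is simply added to the normal import command. A sketch assuming the pool name from this thread (the option is undocumented, so treat it as a last resort and expect the details to vary between builds):

   pfexec zpool import -V vault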
Richard Elling
2010-Jun-13 16:37 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving ba
On Jun 13, 2010, at 8:09 AM, Jan Hellevik wrote:
> I found a thread that mentions an undocumented parameter -V (http://opensolaris.org/jive/thread.jspa?messageID=444810) and that did the trick!
>
> The pool is now online and seems to be working well.

-V is a crutch, not a cure.
 -- richard

-- 
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
Jan Hellevik
2010-Jun-13 18:14 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving ba
Well, for me it was a cure. Nothing else I tried got the pool back. As far as I can tell, the way to get it back should be to use symlinks to the fdisk partitions on my SSD, but that did not work for me. Using -V got the pool back. What is wrong with that?

If you have a better suggestion as to how I should have recovered my pool I am certainly interested in hearing it.
-- 
This message posted from opensolaris.org
Erik Trimble
2010-Jun-13 19:38 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving ba
On 6/13/2010 11:14 AM, Jan Hellevik wrote:
> Well, for me it was a cure. Nothing else I tried got the pool back. As far as I can tell, the way to get it back should be to use symlinks to the fdisk partitions on my SSD, but that did not work for me. Using -V got the pool back. What is wrong with that?
>
> If you have a better suggestion as to how I should have recovered my pool I am certainly interested in hearing it.
>
I think Richard meant that -V isn't the real solution to your problem, which is to fix the underlying issue with fdisk partition recognition. -V is undocumented for a reason, and likely to go "poof" and disappear at any time, so we shouldn't rely on it to solve this issue.

For you, though, it obviously worked in this case. The message being, don't count on this being a general solution if you get in this situation again. But at least you're back in business. :-)

-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Richard Elling
2010-Jun-13 23:08 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving ba
On Jun 13, 2010, at 12:38 PM, Erik Trimble wrote:
> On 6/13/2010 11:14 AM, Jan Hellevik wrote:
>> Well, for me it was a cure. Nothing else I tried got the pool back. As far as I can tell, the way to get it back should be to use symlinks to the fdisk partitions on my SSD, but that did not work for me. Using -V got the pool back. What is wrong with that?
>>
>> If you have a better suggestion as to how I should have recovered my pool I am certainly interested in hearing it.
>>
> I think Richard meant that -V isn't the real solution to your problem, which is to fix the underlying issue with fdisk partition recognition.

yes.

> -V is undocumented for a reason, and likely to go "poof" and disappear at any time, so we shouldn't rely on it to solve this issue.

I wouldn't worry about it going away, but the option is only useful when a vdev is missing. It does not cause a missing vdev to reappear, so your problem is not solved.
 -- richard

-- 
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
Ross Walker
2010-Jun-14 13:19 UTC
[zfs-discuss] Moved disks to new controller - cannot import pool even after moving ba
On Jun 13, 2010, at 2:14 PM, Jan Hellevik <opensolaris at janhellevik.com> wrote:
> Well, for me it was a cure. Nothing else I tried got the pool back. As far as I can tell, the way to get it back should be to use symlinks to the fdisk partitions on my SSD, but that did not work for me. Using -V got the pool back. What is wrong with that?
>
> If you have a better suggestion as to how I should have recovered my pool I am certainly interested in hearing it.

I would take this time to offline one disk at a time, wipe all its tables/labels and re-attach it as an EFI whole disk to avoid hitting this same problem again in the future.

-Ross
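A sketch of that one-disk-at-a-time cycle, assuming the pool is imported under the name vault and using a placeholder device name from this thread; wait for each resilver to complete before touching the next disk:

   pfexec zpool offline vault c13t0d0
   # clear the start of the disk (old fdisk table and front ZFS labels) - double-check the device!
   pfexec dd if=/dev/zero of=/dev/rdsk/c13t0d0p0 bs=1024k count=100
   # hand the whole disk back to ZFS; it writes a fresh EFI label and resilvers onto it
   pfexec zpool replace vault c13t0d0
   pfexec zpool status vault      # watch the resilver finish before the next disk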