As a result of a power spike during a thunderstorm I lost a SATA controller card. This card supported my ZFS pool called newsan, which is a 4 x Samsung 1TB SATA2 disk raid-z. I replaced the card and the devices have the same controller/disk numbers, but I now have the following issue.

-bash-3.2$ zpool status
  pool: newsan
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-72
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        newsan      FAULTED       1     0     0  corrupted data
          raidz1    ONLINE        6     0     0
            c10d1   ONLINE       17     0     0
            c10d0   ONLINE       17     0     0
            c9d1    ONLINE       24     0     0
            c9d0    ONLINE       24     0     0

Something majorly weird is going on, as when I run format I see this:

-bash-3.2$ pfexec format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c3d0 <DEFAULT cyl 19454 alt 2 hd 255 sec 63>
          /pci@0,0/pci8086,2948@1c,4/pci-ide@0/ide@0/cmdk@0,0
       1. c3d1 <DEFAULT cyl 19454 alt 2 hd 255 sec 63>
          /pci@0,0/pci8086,2948@1c,4/pci-ide@0/ide@0/cmdk@1,0
       2. c9d0 <SAMSUNG-S13PJ1BQ60312-0001-31.50MB>
          /pci@0,0/pci8086,244e@1e/pci-ide@1/ide@0/cmdk@0,0
       3. c9d1 <SAMSUNG-S13PJ1BQ60311-0001-31.50MB>
          /pci@0,0/pci8086,244e@1e/pci-ide@1/ide@0/cmdk@1,0
       4. c10d0 <SAMSUNG-S13PJ1BQ60311-0001-31.50MB>
          /pci@0,0/pci8086,244e@1e/pci-ide@1/ide@1/cmdk@0,0
       5. c10d1 <SAMSUNG-S13PJ1BQ60312-0001-31.50MB>
          /pci@0,0/pci8086,244e@1e/pci-ide@1/ide@1/cmdk@1,0

??? 31.50 MB ??? They all used to show as 1TB, I believe (or 931GB or whatever).

Specify disk (enter its number): 2
selecting c9d0
NO Alt slice
No defect list found
[disk formatted, no defect list found]
/dev/dsk/c9d0s0 is part of active ZFS pool newsan. Please see zpool(1M).
format> p
partition> p
Current partition table (original):
Total disk sectors available: 1953503710 + 16384 (reserved sectors)

Part      Tag    Flag     First Sector         Size         Last Sector
  0        usr    wm               256      931.50GB          1953503710
  1 unassigned    wm                 0           0                    0
  2 unassigned    wm                 0           0                    0
  3 unassigned    wm                 0           0                    0
  4 unassigned    wm                 0           0                    0
  5 unassigned    wm                 0           0                    0
  6 unassigned    wm                 0           0                    0
  8   reserved    wm        1953503711        8.00MB          1953520094

So the partition table is looking correct. I don't believe all 4 disks died concurrently.

Any thoughts on how to recover? I don't particularly want to restore the couple of terabytes of data if I don't have to.

analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptable with CTRL-C. Continue? y
Current Defect List must be initialized to do automatic repair.

Oh, and what's this defect list thing? I haven't seen that before.

defect> print
No working list defined.
defect> create
Controller does not support creating manufacturer's defect list.
defect> extract
Ready to extract working list. This cannot be interrupted
and may take a long while. Continue? y
NO Alt slice
NO Alt slice
Extracting defect list...No defect list found
Extraction failed.
defect> commit
Ready to update Current Defect List, continue? y
Current Defect List updated, total of 0 defects.
Disk must be reformatted for changes to take effect.
analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptable with CTRL-C. Continue? y

        pass 0
   64386

        pass 1
   64386

Total of 0 defective blocks repaired.

So the read test seemed to work fine.

Any suggestions on how to proceed? Thoughts on why the disks are showing weirdly in format? Any way to recover/rebuild the zpool metadata?
Any help would be appreciated.

Regards,
Rep
--
This message posted from opensolaris.org
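For anyone comparing notes on the same symptom, a hedged sketch of one way to cross-check what the OS thinks each member disk's size is, independent of format's banner line; prtvtoc simply prints the sector counts recorded in the label (device names taken from the format output above):

# on a healthy member this should report roughly the 1953503710 + 16384
# sectors shown in the partition table above; a tiny figure here would
# confirm the device itself is being presented truncated
for d in c9d0 c9d1 c10d0 c10d1; do
        echo "=== $d ==="
        pfexec prtvtoc /dev/rdsk/${d}s0
done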
Eric Schrock
2008-Dec-04 23:17 UTC
[zfs-discuss] help please - The pool metadata is corrupted
Can you send the output of the attached D script when running 'zpool status'?

- Eric

On Thu, Dec 04, 2008 at 02:58:54PM -0800, Brett wrote:
> [...]

--
Eric Schrock, Fishworks                        http://blogs.sun.com/eschrock

-------------- next part --------------
#!/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
    printf("run 'zpool import' to generate trace\n\n");
}

vdev_raidz_open:entry
{
    printf("%d BEGIN RAIDZ OPEN\n", timestamp);
    printf("%d config asize = %d\n", timestamp, args[0]->vdev_asize);
    printf("%d config ashift = %d\n", timestamp,
        args[0]->vdev_top->vdev_ashift);
    self->child = 1;
    self->asize = args[1];
    self->ashift = args[2];
}

vdev_disk_open:entry
/self->child/
{
    self->disk_asize = args[1];
    self->disk_ashift = args[2];
}

vdev_disk_open:return
/self->child/
{
    printf("%d child[%d]: asize = %d, ashift = %d\n", timestamp,
        self->child - 1, *self->disk_asize, *self->disk_ashift);
    self->disk_asize = 0;
    self->disk_ashift = 0;
    self->child++;
}

vdev_raidz_open:return
{
    printf("%d asize = %d\n", timestamp, *self->asize);
    printf("%d ashift = %d\n", timestamp, *self->ashift);
    printf("%d END RAIDZ OPEN\n", timestamp);
    self->child = 0;
    self->asize = 0;
    self->ashift = 0;
}
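In case anyone wants to replay this, a minimal usage sketch (the filename raidz_open2.d is simply what the follow-up below calls the saved attachment): the probes fire on vdev_raidz_open/vdev_disk_open and print the asize/ashift the pool config expects alongside what each child disk actually reports.

# save the attachment, make it executable, and leave it running in one terminal
chmod +x raidz_open2.d
pfexec ./raidz_open2.d

# then, in a second terminal, poke the pool so the vdev open path runs
pfexec zpool status newsan     # or 'zpool import', as the script's banner says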
Here is the requested output of raidz_open2.d upon running a zpool status:

root@san:/export/home/brett# ./raidz_open2.d
run 'zpool import' to generate trace

60027449049959 BEGIN RAIDZ OPEN
60027449049959 config asize = 4000755744768
60027449049959 config ashift = 9
60027507681841 child[3]: asize = 1000193768960, ashift = 9
60027508294854 asize = 4000755744768
60027508294854 ashift = 9
60027508294854 END RAIDZ OPEN
60027472787344 child[0]: asize = 1000193768960, ashift = 9
60027498558501 child[1]: asize = 1000193768960, ashift = 9
60027505063285 child[2]: asize = 1000193768960, ashift = 9

I hope that helps; it means little to me.

One thought I had was that maybe I somehow messed up the cables and the devices are not in their original sequence. Would this make any difference? I have seen examples for raid-z suggesting that the import of a raid-z should figure out the devices regardless of the order of devices or of new device numbers, so I was hoping it didn't matter.

Thanks,
Rep
--
This message posted from opensolaris.org
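On the cable-order worry: import matches pool members by the GUIDs written in each device's on-disk label rather than by c#d# names, so a quick, hedged sketch of one way to see which label is sitting on which device would be something like:

# dump the ZFS label from each raid-z member and pull out the identity fields;
# the guid values, not the controller/disk numbers, are what import matches on
for d in c9d0 c9d1 c10d0 c10d1; do
        echo "=== $d ==="
        pfexec zdb -l /dev/dsk/${d}s0 | egrep 'name|guid|path'
done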
Eric Schrock
2008-Dec-08 17:40 UTC
[zfs-discuss] help please - The pool metadata is corrupted
Well, it shows that you're not suffering from a known bug. The symptoms you were describing were the same as those seen when a device spontaneously shrinks within a raid-z vdev. But it looks like the sizes are the same ("config asize" = "asize"), so I'm at a loss.

- Eric

On Sun, Dec 07, 2008 at 05:52:10PM -0800, Brett wrote:
> [...]

--
Eric Schrock, Fishworks                        http://blogs.sun.com/eschrock
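A small sanity check on those figures, just arithmetic on the numbers already quoted above:

# each child reports asize 1000193768960 bytes; expressed in GiB:
echo 'scale=2; 1000193768960 / (1024^3)' | bc
# prints 931.50, which lines up with the 931.50GB slice 0 that format showed
# earlier in the thread; that is the point about the sizes agreeing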
Well, after a couple of weeks of beating my head, I finally got my data back, so I thought I would post the process that recovered it.

I ran the Samsung ESTOOL utility, ran auto-scan, and for each disk that was showing the wrong physical size I:

 - chose "set max address"
 - chose "recover native size"

After that, when I booted back into Solaris, format showed the disks being the correct size again and I was able to zpool import:

AVAILABLE DISK SELECTIONS:
       0. c3d0 <DEFAULT cyl 19454 alt 2 hd 255 sec 63>
          /pci@0,0/pci8086,2948@1c,4/pci-ide@0/ide@0/cmdk@0,0
       1. c3d1 <DEFAULT cyl 19454 alt 2 hd 255 sec 63>
          /pci@0,0/pci8086,2948@1c,4/pci-ide@0/ide@0/cmdk@1,0
       2. c4d1 <DEFAULT cyl 60798 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,2/ide@0/cmdk@1,0
       3. c5d0 <DEFAULT cyl 60798 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0
       4. c5d1 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,2/ide@1/cmdk@1,0
       5. c6d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,5/ide@0/cmdk@0,0
       6. c7d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,5/ide@1/cmdk@0,0

I will just say, though, that there is something in ZFS which caused this in the first place, as when I first replaced the faulty SATA controller only 1 of the 4 disks showed the incorrect size in format, but then as I messed around trying to zpool export/import I eventually wound up in the state that all 4 disks showed the wrong size.

Anyhow, I'm happy I got it all back working again, and hope this solution assists others.

Regards,
Rep
--
This message posted from opensolaris.org
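For readers without the vendor utility: what "set max address" / "recover native size" appears to be undoing is a clipped ATA max address (an HPA-style capacity limit), which would be consistent with format suddenly reporting a tiny capacity. A hedged sketch of the equivalent check from a Linux live environment, assuming hdparm is on hand (the /dev/sdb name is only illustrative):

# show current vs. native max sector count; a smaller first number together
# with "HPA is enabled" means the drive is advertising less than its native size
hdparm -N /dev/sdb

# if it is clipped, restoring the native value makes the full capacity visible
# again (the 'p' prefix makes the change persistent); <native_sectors> is a
# placeholder for the native figure hdparm reports above
# hdparm -N p<native_sectors> /dev/sdb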
Bob Friesenhahn
2008-Dec-13 16:55 UTC
[zfs-discuss] help please - The pool metadata is corrupted
On Sat, 13 Dec 2008, Brett wrote:
> I will just say, though, that there is something in ZFS which caused
> this in the first place, as when I first replaced the faulty SATA
> controller only 1 of the 4 disks showed the incorrect size in format,
> but then as I messed around trying to zpool export/import I eventually
> wound up in the state that all 4 disks showed the wrong size.

ZFS has absolutely nothing to do with the disk sizes reported by 'format'. The problem is elsewhere. Perhaps it is a firmware or driver issue.

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
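One hedged way to see exactly what the driver layer believes about each drive, with neither ZFS nor format in the picture, is the per-device summary from iostat:

# -E prints per-device identity and error counters; -n uses descriptive names;
# the "Size:" field comes straight from the driver, so a 1TB member showing only
# tens of megabytes here points at firmware or the driver, not ZFS
iostat -En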