Jim Leonard
2009-Jul-04 05:06 UTC
[zfs-code] zpool import hangs the entire server (please help; data included)
As the subject says, I can't import a seemingly okay raidz pool, and I really need to: it holds data newer than the last backup cycle :-( I'm really in a bind; I hope someone can help.

Background: a drive in a four-slice pool failed (I have to use slices due to a motherboard BIOS limitation; EFI labels cause POST to choke). I exported the pool, powered down, replaced the drive, and now the entire server locks up when I attempt an import. At first I suspected the .cache was fubar'd, so I did an export to clear it:

--begin--
root@fortknox:~# zpool export vault
cannot open 'vault': no such pool
===end==

Okay, so the .cache is good, and the system can still see that there is a pool there. But when I try to import it:

--begin--
root@fortknox:~# zpool import
  pool: vault
    id: 12084546386451079719
 state: DEGRADED
status: One or more devices are offlined.
action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
config:

        vault         DEGRADED
          raidz1      DEGRADED
            c8t0d0s6  ONLINE
            c8t1d0s6  ONLINE
            c9t0d0s6  OFFLINE
            c9t1d0s6  ONLINE
root@fortknox:~# zpool import vault
===end==

...it hangs indefinitely and the entire server locks up (although it still responds to pings). But I know the info is there, because zdb -l /dev/dsk/c8t0d0s6 shows:

--begin--
root@fortknox:~# zdb -l /dev/dsk/c8t0d0s6
--------------------------------------------
LABEL 0
--------------------------------------------
    version=14
    name='vault'
    state=1
    txg=689703
    pool_guid=12084546386451079719
    hostid=4288054
    hostname='fortknox'
    top_guid=18316851491481709534
    guid=9202175319063431582
    vdev_tree
        type='raidz'
        id=0
        guid=18316851491481709534
        nparity=1
        metaslab_array=23
        metaslab_shift=35
        ashift=9
        asize=6000749838336
        is_log=0
        children[0]
                type='disk'
                id=0
                guid=9202175319063431582
                path='/dev/dsk/c8t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS0N9KW/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@0,0:g'
                whole_disk=0
        children[1]
                type='disk'
                id=1
                guid=14662350669876577780
                path='/dev/dsk/c8t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS20P51/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@1,0:g'
                whole_disk=0
        children[2]
                type='disk'
                id=2
                guid=12094645433779503688
                path='/dev/dsk/c9t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS1L8VY/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@0,0:g'
                whole_disk=0
                DTL=179
                offline=1
                faulted=1
        children[3]
                type='disk'
                id=3
                guid=15554931888608113584
                path='/dev/dsk/c9t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS232H8/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@1,0:g'
                whole_disk=0
--------------------------------------------
LABEL 1
--------------------------------------------
    version=14
    name='vault'
    state=1
    txg=689703
    pool_guid=12084546386451079719
    hostid=4288054
    hostname='fortknox'
    top_guid=18316851491481709534
    guid=9202175319063431582
    vdev_tree
        type='raidz'
        id=0
        guid=18316851491481709534
        nparity=1
        metaslab_array=23
        metaslab_shift=35
        ashift=9
        asize=6000749838336
        is_log=0
        children[0]
                type='disk'
                id=0
                guid=9202175319063431582
                path='/dev/dsk/c8t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS0N9KW/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@0,0:g'
                whole_disk=0
        children[1]
                type='disk'
                id=1
                guid=14662350669876577780
                path='/dev/dsk/c8t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS20P51/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@1,0:g'
                whole_disk=0
        children[2]
                type='disk'
                id=2
                guid=12094645433779503688
                path='/dev/dsk/c9t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS1L8VY/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@0,0:g'
                whole_disk=0
                DTL=179
                offline=1
                faulted=1
        children[3]
                type='disk'
                id=3
                guid=15554931888608113584
                path='/dev/dsk/c9t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS232H8/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@1,0:g'
                whole_disk=0
--------------------------------------------
LABEL 2
--------------------------------------------
    version=14
    name='vault'
    state=1
    txg=689703
    pool_guid=12084546386451079719
    hostid=4288054
    hostname='fortknox'
    top_guid=18316851491481709534
    guid=9202175319063431582
    vdev_tree
        type='raidz'
        id=0
        guid=18316851491481709534
        nparity=1
        metaslab_array=23
        metaslab_shift=35
        ashift=9
        asize=6000749838336
        is_log=0
        children[0]
                type='disk'
                id=0
                guid=9202175319063431582
                path='/dev/dsk/c8t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS0N9KW/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@0,0:g'
                whole_disk=0
        children[1]
                type='disk'
                id=1
                guid=14662350669876577780
                path='/dev/dsk/c8t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS20P51/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@1,0:g'
                whole_disk=0
        children[2]
                type='disk'
                id=2
                guid=12094645433779503688
                path='/dev/dsk/c9t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS1L8VY/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@0,0:g'
                whole_disk=0
                DTL=179
                offline=1
                faulted=1
        children[3]
                type='disk'
                id=3
                guid=15554931888608113584
                path='/dev/dsk/c9t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS232H8/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@1,0:g'
                whole_disk=0
--------------------------------------------
LABEL 3
--------------------------------------------
    version=14
    name='vault'
    state=1
    txg=689703
    pool_guid=12084546386451079719
    hostid=4288054
    hostname='fortknox'
    top_guid=18316851491481709534
    guid=9202175319063431582
    vdev_tree
        type='raidz'
        id=0
        guid=18316851491481709534
        nparity=1
        metaslab_array=23
        metaslab_shift=35
        ashift=9
        asize=6000749838336
        is_log=0
        children[0]
                type='disk'
                id=0
                guid=9202175319063431582
                path='/dev/dsk/c8t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS0N9KW/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@0,0:g'
                whole_disk=0
        children[1]
                type='disk'
                id=1
                guid=14662350669876577780
                path='/dev/dsk/c8t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS20P51/g'
                phys_path='/pci@0,0/pci1462,7125@7/disk@1,0:g'
                whole_disk=0
        children[2]
                type='disk'
                id=2
                guid=12094645433779503688
                path='/dev/dsk/c9t0d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS1L8VY/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@0,0:g'
                whole_disk=0
                DTL=179
                offline=1
                faulted=1
        children[3]
                type='disk'
                id=3
                guid=15554931888608113584
                path='/dev/dsk/c9t1d0s6'
                devid='id1,sd@SATA_____ST31500341AS________________9VS232H8/g'
                phys_path='/pci@0,0/pci1462,7125@8/disk@1,0:g'
                whole_disk=0
===end==
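(The paste above is just c8t0d0s6. If seeing the labels from the other slices would help, this is roughly the loop I would run to dump them, just the same zdb call repeated per slice; the device names are the ones from the config above:)

--begin--
# rough sketch: dump the label contents of each remaining online slice
for d in c8t0d0s6 c8t1d0s6 c9t1d0s6; do
    echo "=== /dev/dsk/$d ==="
    zdb -l /dev/dsk/$d
done
===end==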
I ran a dtrace script ("raidz_open2.d") in another window when I attempted the pool *listing* and got this trace:

--begin--
root@fortknox:/var/tmp# /var/tmp/raidz_open2.d
run 'zpool import' to generate trace
518760101261 BEGIN RAIDZ OPEN
518760101261 config asize = 6000749838336
518760101261 config ashift = 9
518761394885 child[0]: asize = 1500192368640, ashift = 9
518763420554 child[1]: asize = 1500192368640, ashift = 9
518806056951 child[2]: asize = 1500192368640, ashift = 9
518830934206 asize = 6000749838336
518830934206 ashift = 9
518830934206 END RAIDZ OPEN
===end==

But when I attempt the actual IMPORT ("zpool import vault") I get this trace:

--begin--
root@fortknox:/var/tmp# /var/tmp/raidz_open2.d
run 'zpool import' to generate trace
614138497547 BEGIN RAIDZ OPEN
614138497547 config asize = 6000749838336
614138497547 config ashift = 9
614139732826 child[0]: asize = 1500192368640, ashift = 9
614141684598 child[1]: asize = 1500192368640, ashift = 9
614143705126 child[2]: asize = 1500192368640, ashift = 9
614144601973 asize = 6000749838336
614144601973 ashift = 9
614144601973 END RAIDZ OPEN
614213906915 BEGIN RAIDZ OPEN
614213906915 config asize = 6000749838336
614213906915 config ashift = 9
614214947122 child[0]: asize = 1500192368640, ashift = 9
614218605590 child[1]: asize = 1500192368640, ashift = 9
614222474452 child[2]: asize = 1500192368640, ashift = 9
614225021617 asize = 6000749838336
614225021617 ashift = 9
614225021617 END RAIDZ OPEN
===end==

I did a truss on the zpool import and these are the last lines before the server locks up:

--begin--
733: openat64(6, "c9t1d0s9", O_RDONLY)  Err#5 EIO
733: getdents64(6, 0xFEE14000, 8192)  = 0
733: close(6)  = 0
733: open("/dev/dsk/c8t0d0s6", O_RDONLY)  = 6
733: stat64("/lib/libdevid.so.1", 0x08042A68)  = 0
733: resolvepath("/lib/libdevid.so.1", "/lib/libdevid.so.1", 1023)  = 18
733: open("/lib/libdevid.so.1", O_RDONLY)  = 7
733: mmapobj(7, 0x00020000, 0xFECB0DD8, 0x08042AD4, 0x00000000)  = 0
733: close(7)  = 0
733: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0)  = 0xFEC60000
733: memcntl(0xFEC70000, 4048, MC_ADVISE, MADV_WILLNEED, 0, 0)  = 0
733: fxstat(2, 6, 0x080433D0)  = 0
733: modctl(MODSIZEOF_DEVID, 0x035C0006, 0x080433CC, 0xFEC71239, 0xFE8E92C0)  = 0
733: modctl(MODGETDEVID, 0x035C0006, 0x00000038, 0x080D1D08, 0xFE8E92C0)  = 0
733: fxstat(2, 6, 0x080433D0)  = 0
733: modctl(MODSIZEOF_MINORNAME, 0x035C0006, 0x00006000, 0x080433CC, 0xFE8E92C0)  = 0
733: modctl(MODGETMINORNAME, 0x035C0006, 0x00006000, 0x00000002, 0x0808DFC8)  = 0
733: close(6)  = 0
733: open("/dev/dsk/c8t1d0s6", O_RDONLY)  = 6
733: fxstat(2, 6, 0x080433D0)  = 0
733: modctl(MODSIZEOF_DEVID, 0x035C0046, 0x080433CC, 0xFEC71239, 0xFE8E92C0)  = 0
733: modctl(MODGETDEVID, 0x035C0046, 0x00000038, 0x080D1CC8, 0xFE8E92C0)  = 0
733: fxstat(2, 6, 0x080433D0)  = 0
733: modctl(MODSIZEOF_MINORNAME, 0x035C0046, 0x00006000, 0x080433CC, 0xFE8E92C0)  = 0
733: modctl(MODGETMINORNAME, 0x035C0046, 0x00006000, 0x00000002, 0x0808DFC8)  = 0
733: close(6)  = 0
733: open("/dev/dsk/c9t1d0s6", O_RDONLY)  = 6
733: fxstat(2, 6, 0x080433D0)  = 0
733: modctl(MODSIZEOF_DEVID, 0x035C00C6, 0x080433CC, 0xFEC71239, 0xFE8E92C0)  = 0
733: modctl(MODGETDEVID, 0x035C00C6, 0x00000038, 0x080D1C48, 0xFE8E92C0)  = 0
733: fxstat(2, 6, 0x080433D0)  = 0
733: modctl(MODSIZEOF_MINORNAME, 0x035C00C6, 0x00006000, 0x080433CC, 0xFE8E92C0)  = 0
733: modctl(MODGETMINORNAME, 0x035C00C6, 0x00006000, 0x00000002, 0x0808DFC8)  = 0
733: close(6)  = 0
733: ioctl(3, ZFS_IOC_POOL_STATS, 0x08042530)  Err#2 ENOENT
733: ioctl(3, ZFS_IOC_POOL_TRYIMPORT, 0x080425A0)  = 0
733: open("/usr/lib/locale/en_US.UTF-8/LC_MESSAGES/SUNW_OST_OSLIB.mo", O_RDONLY)  Err#2 ENOENT
===end==
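If it would help to see where the import is wedged in the kernel, I can keep a second root shell open during the next attempt and try to pull the kernel stack of the zpool process with mdb, something along these lines (the pid below is just the one from the truss run; I'd substitute whatever pid the hung zpool has, assuming the box stays responsive long enough to run it):

--begin--
# sketch: print the kernel stack(s) of the hung zpool process via mdb -k
echo "0t733::pid2proc | ::walk thread | ::findstack -v" | mdb -k
===end==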
What's going on? How can I get this pool imported so that I can at least get the data off of it?

--
This message posted from opensolaris.org