Hi.
Fortunately I haven't.
I reinstalled the system with snv_55b (fresh install) on the c0d1 disk, preserving the
c0d1s0 slice, as that's where my home pool lives. Before the re-install the home pool
was actually a mirror of c0d0s0 and c0d1s0. Unfortunately I didn't check
the option to preserve data on c0d0, so its VTOC was wiped out. OK, my fault, but I still
had a copy, so it shouldn't have been a problem. After the system was
installed I created a VTOC on c0d0 and created a c0d0s0 slice slightly
larger than c0d1s0 (different disk geometries). After that I imported my
home pool and started to see CKSUM errors on c0d0s0 - which is probably OK, as
I'm not sure the VTOC was re-created identically. I could have just scrubbed, but
instead I detached c0d0s0 from the home pool. Then I ran format on c0d0 and the system
panicked.
> ::status
debugging crash dump vmcore.0 (64-bit) from milek
operating system: 5.11 snv_55b (i86pc)
panic message:
BAD TRAP: type=e (#pf Page fault) rp=fffffe800039b5c0 addr=58 occurred in module
"dadk" due to a NULL pointer dereference
dump content: kernel pages only
> ::stack
dadk_pktprep+0x2a(0, 0, ffffffff89c29100, fffffffffbbfb960, 0, 0)
dadk_dk+0x3e(0, ffffffff87627708, ffffffff89c29100)
dadk_dk_strategy+0x1c(ffffffff89c29100)
default_physio+0x390(fffffffffbbfbe60, ffffffff89c29100, 6600000002, 100,
fffffffffbbfbe40, fffffe800039b8e0)
physio+0x25(fffffffffbbfbe60, ffffffff89c29100, 6600000002, 100,
fffffffffbbfbe40, fffffe800039b8e0)
dadk_dk_buf_setup+0xc8(ffffffff81c0b3f0, ffffffff87627708, 6600000002, 0, 100)
dadk_ioctl+0x6c(ffffffff81c0b3f0, 6600000002, 5, ffffffff87627708, 100007,
ffffffff81177798)
cmdkioctl+0xe3(6600000002, 5, 8047a74, 100007, ffffffff81177798,
fffffe800039be9c)
cdev_ioctl+0x48(6600000002, 5, 8047a74, 100007, ffffffff81177798,
fffffe800039be9c)
spec_ioctl+0x86(ffffffff86d991c0, 5, 8047a74, 100007, ffffffff81177798,
fffffe800039be9c)
fop_ioctl+0x37(ffffffff86d991c0, 5, 8047a74, 100007, ffffffff81177798,
fffffe800039be9c)
ioctl+0x16b(3, 5, 8047a74)
sys_syscall32+0x101()
> $<msgbuf
[...]
ata_disk_iosetup: byte count zero
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: abort request, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: abort device, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: reset target, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: reset bus, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: early timeout, target=1 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0/cmdk@0,0 (Disk0):
Error for command 'read defect list' Error Level: Informational
Requested Block 52228, Error Block: -82580
Sense Key: aborted command
Vendor 'Gen-ATA ' error code: 0x3
WARNING: /pci@0,0/pci-ide@7,1/ide@0/cmdk@1,0 (Disk1):
Error for command 'write sector' Error Level: Informational
Sense Key: aborted command
Vendor 'Gen-ATA ' error code: 0x3
WARNING: md: d1: write error on /dev/dsk/c0d1s1
WARNING: md: d1: /dev/dsk/c0d1s1 needs maintenance
WARNING: md: d1: /dev/dsk/c0d1s1 last erred
ata_disk_iosetup: byte count zero
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: abort request, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: abort device, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: reset target, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: reset bus, target=0 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0 (ata0):
timeout: early timeout, target=1 lun=0
WARNING: /pci@0,0/pci-ide@7,1/ide@0/cmdk@0,0 (Disk0):
Error for command 'read defect list' Error Level: Informational
Requested Block 52228, Error Block: -82580
Sense Key: aborted command
Vendor 'Gen-ATA ' error code: 0x3
WARNING: /pci@0,0/pci-ide@7,1/ide@0/cmdk@1,0 (Disk1):
Error for command 'write sector' Error Level: Informational
Sense Key: aborted command
Vendor 'Gen-ATA ' error code: 0x3
panic[cpu0]/thread=ffffffff85eb96c0:
BAD TRAP: type=e (#pf Page fault) rp=fffffe800039b5c0 addr=58 occurred in module
"dadk" due to a NULL pointer dereference
format:
#pf Page fault
Bad kernel fault at addr=0x58
pid=4978, pc=0xfffffffffbbfb2ba, sp=0xfffffe800039b6b0, eflags=0x10286
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
cr2: 58 cr3: b7dd000 cr8: c
rdi: 0 rsi: 0 rdx: ffffffff89c29100
rcx: fffffffffbbfb960 r8: 0 r9: 0
rax: 0 rbx: 100 rbp: fffffe800039b6f0
r10: 1ffffffff0c90fc0 r11: fffffffffbcdb740 r12: 0
r13: fffffffffbbfb960 r14: ffffffff87627708 r15: 80000
fsb: ffffffff80000000 gsb: fffffffffbc27730 ds: 43
es: 43 fs: 0 gs: 1c3
trp: e err: 0 rip: fffffffffbbfb2ba
cs: 28 rfl: 10286 rsp: fffffe800039b6b0
ss: 30
fffffe800039b4a0 unix:die+c8 ()
fffffe800039b5b0 unix:trap+12ec ()
fffffe800039b5c0 unix:cmntrap+140 ()
fffffe800039b6f0 dadk:dadk_pktprep+2a ()
fffffe800039b740 dadk:dadk_dk+3e ()
fffffe800039b760 dadk:dadk_dk_strategy+1c ()
fffffe800039b860 genunix:default_physio+390 ()
fffffe800039b8a0 genunix:physio+25 ()
fffffe800039b950 dadk:dadk_dk_buf_setup+c8 ()
fffffe800039b9f0 dadk:dadk_ioctl+6c ()
fffffe800039bce0 cmdk:cmdkioctl+e3 ()
fffffe800039bd20 genunix:cdev_ioctl+48 ()
fffffe800039bd60 specfs:spec_ioctl+86 ()
fffffe800039bdc0 genunix:fop_ioctl+37 ()
fffffe800039bec0 genunix:ioctl+16b ()
fffffe800039bf10 unix:brand_sys_syscall32+1a3 ()
syncing file systems...
panic[cpu0]/thread=ffffffff85eb96c0:
md: writer lock is held
dumping to /dev/dsk/c0d1s3, offset 65536, content: kernel
And although I had detached the disk, when looking into the crash dump I see:
> ::spa -v
ADDR STATE NAME
ffffffff81cd0980 ACTIVE home
ADDR STATE AUX DESCRIPTION
ffffffff81e94540 HEALTHY - root
ffffffff81e94000 HEALTHY - mirror
ffffffff81e8f040 HEALTHY - /dev/dsk/c0d1s0
ffffffff81e8f580 HEALTHY - /dev/dsk/c0d0s0
Which is strange.
OK, it looks like I lost both disks for a moment.
After the reboot I ended up in a very unpleasant condition with ZFS.
bash-3.00# zpool status
pool: home
state: FAULTED
status: One or more devices could not be used because the label is missing
or invalid. There are insufficient replicas for the pool to continue
functioning.
action: Destroy and re-create the pool from a backup source.
see: http://www.sun.com/msg/ZFS-8000-5E
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
home FAULTED 0 0 0 corrupted data
mirror DEGRADED 0 0 0
c0d1s0 FAULTED 0 0 0 corrupted data
c0d0s0 ONLINE 0 0 0
bash-3.00# zpool history home
So despite the fact that I detached c0d0s0, it's still there, and this time c0d1s0 is
the one with a problem.
Also, there is no pool history.
bash-3.00# zpool export home
bash-3.00# zpool status
no pools available
bash-3.00# zpool import
pool: home
id: 14229052357982344127
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
The pool may be active on another system, but can be imported using
the '-f' flag.
config:
home ONLINE
c0d1s0 ONLINE
bash-3.00#
bash-3.00#
bash-3.00# zpool import home
cannot import 'home': pool may be in use from other system
use '-f' to import anyway
bash-3.00# zpool import -f home
Really strange that I had to use the -f option - the pool was just exported!
bash-3.00# zpool history home
History for 'home':
[only today]
2007-01-22.16:02:48 zpool import -f home
2007-01-22.16:16:41 zfs snapshot home/milek@2007-01-22
2007-01-22.16:16:49 zfs snapshot home/milek/mail@2007-01-22
2007-01-22.16:16:59 zfs snapshot home/var.mail@2007-01-22
2007-01-22.16:38:40 zpool detach home c0d0s0
2007-01-22.17:11:17 zpool import -f home
OK, you can see I did do the detach.
Something has gone wrong here.
Any ideas?
This message posted from opensolaris.org