Matthew Ellison
2010-Aug-18 07:15 UTC
[zfs-discuss] Kernel panic on import / interrupted zfs destroy
I have a box running snv_134 that had a little boo-boo.

The problem first started a couple of weeks ago with some corruption on two filesystems in an 11-disk 10 TB raidz2 set. A couple of scrubs revealed a handful of corrupt files on my two deduplicated ZFS filesystems. No biggie.

I thought my problems had something to do with deduplication in 134, so I went about creating new filesystems and copying the "good" files over to another box. Every time I touched the "bad" files I got a filesystem error 5 (EIO). Trying to delete them manually caused kernel panics, which eventually turned into reboot loops.

I installed Nexenta on another disk to see if that would get me past the reboot loop, which it did. I finished moving the "good" files over (using rsync, which skipped the error 5 files, unlike cp or mv) and destroyed one of the two filesystems. Unfortunately, this caused a kernel panic in the middle of the destroy operation, which then became another panic / reboot loop.

I was able to get in with milestone=none and delete the ZFS cache (zpool.cache), but now I have a new problem: any attempt to import the pool results in a panic. I have tried from my snv_134 install, from the live CD, and from Nexenta. I have tried various zdb incantations (with aok=1 and zfs:zfs_recover=1) to no avail; these error out after a few minutes. I have even tried another controller.

I now have zdb -e -bcsvL running from 134 (without aok=1), and it has been going for several hours. Can zdb recover from this kind of situation (a half-destroyed filesystem that panics the kernel on import)? What is the impact of the above zdb operation without aok=1? Is there any likelihood of recovering the unaffected filesystems?

Any suggestions?

Regards,

Matthew Ellison
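P.S. In case the exact steps matter, this is roughly what I did, reconstructed from memory (flags may be slightly off, and "tank" below is just a stand-in for the real pool name):

  1. Booted with "-m milestone=none" appended to the GRUB kernel line, then
     removed the cached pool config so nothing tries to import it at boot:

       # "tank" is a placeholder; this just prevents the auto-import
       rm /etc/zfs/zpool.cache

  2. For the earlier import and zdb attempts, added the recovery tunables to
     /etc/system and rebooted:

       set aok=1
       set zfs:zfs_recover=1

  3. The walk that is running now, against the exported (not imported) pool,
     this time without aok=1:

       zdb -e -bcsvL tank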
Matthew Ellison
2010-Aug-18 20:29 UTC
[zfs-discuss] Fwd: Kernel panic on import / interrupted zfs destroy
Hmm, still running zdb since last night. Anyone have any suggestions or advice on how to proceed with this issue?

Thanks,

Matthew Ellison

Begin forwarded message:

> From: Matthew Ellison <matt at mattellison.com>
> Date: August 18, 2010 3:15:39 AM EDT
> To: zfs-discuss at opensolaris.org
> Subject: Kernel panic on import / interrupted zfs destroy
>
> [...]
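P.S. For reference, the invocation that has been going since last night is still the one from the original message, run against the exported pool (pool name replaced with "tank" here):

    zdb -e -bcsvL tank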