Platform:

- old Dell workstation with an Andataco GigaRAID enclosure plugged into
  an Adaptec 39160
- Nevada b51

Current zpool config:

- one two-disk mirror with two hot spares

In my ferocious pounding of ZFS I've managed to corrupt my data pool.
This is what I've been doing to test it:

- set zil_disable to 1 in /etc/system
- continually untar a couple of files into the filesystem
- manually spin down a drive in the mirror by holding down the button on
  the enclosure
- for any system hangs, reboot with a nasty "reboot -dnq"

I've gotten different results after the spindown:

- works properly: short or no hang, hot spare successfully added to the
  mirror
- system hangs, and after a reboot the spare is not added
- tar hangs, but after running "zpool status" the hot spare is added
  properly and tar continues
- tar continues, but hangs on "zpool status"

The last is what happened just prior to the corruption. Here's the
output of zpool status:

    nextest-01# zpool status -v
      pool: zmir
     state: DEGRADED
    status: One or more devices has experienced an error resulting in data
            corruption. Applications may be affected.
    action: Restore the file in question if possible. Otherwise restore the
            entire pool from backup.
       see: http://www.sun.com/msg/ZFS-8000-8A
     scrub: resilver completed with 1 errors on Thu Nov 30 11:37:21 2006
    config:

            NAME        STATE     READ WRITE CKSUM
            zmir        DEGRADED     8     0     4
              mirror    DEGRADED     8     0     4
                c3t3d0  ONLINE       0     0    24
                c3t4d0  UNAVAIL      0     0     0  cannot open
            spares
              c0t0d0    AVAIL
              c3t1d0    AVAIL

    errors: The following persistent errors have been detected:

              DATASET  OBJECT  RANGE
              15       0       lvl=4294967295 blkid=0

So the questions are:

- is this fixable? I don't see an inum I could run find on to remove,
  and I can't even do a zfs volinit anyway:

      nextest-01# zfs volinit
      cannot iterate filesystems: I/O error

- would not enabling zil_disable have prevented this?
- Should I have been doing a 3-way mirror?
- Is there a more optimal configuration to help prevent this kind of
  corruption?
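The test workload above can be sketched roughly as follows. This is a minimal stand-in, not the poster's actual script: the TARGET path, the sample archive, and the fixed iteration count are all hypothetical. To reproduce the real test, point TARGET at a directory on the zmir pool and loop indefinitely while spinning down a mirror member by hand.

```shell
#!/bin/sh
# Rough sketch of the stress workload described above (hypothetical,
# not the poster's exact script). TARGET would be a directory on the
# ZFS pool, e.g. /zmir/test; it defaults to /tmp here so the sketch
# runs anywhere.
TARGET=${TARGET:-/tmp/zfs-stress.$$}
mkdir -p "$TARGET" || exit 1

# Build a small sample archive to churn, standing in for the
# "couple of files" untarred in the original test.
echo "sample data" > "$TARGET/file.txt"
( cd "$TARGET" && tar cf sample.tar file.txt )

# The original test loops indefinitely while a drive is spun down;
# a fixed count keeps this sketch finite.
i=0
while [ "$i" -lt 5 ]; do
    ( cd "$TARGET" && tar xf sample.tar )
    i=$((i + 1))
done
echo "completed $i untar passes into $TARGET"
```

In the real test, /etc/system also carried the "set zil_disable = 1" line mentioned above, which takes effect after a reboot.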
Ultimately, I want to build a ZFS server with performance and
reliability comparable to, say, a NetApp, but the fact that I appear to
have been able to nuke my pool by simulating a hardware error gives me
pause. I'd love to know if I'm off base in my worries.

Jim

This message posted from opensolaris.org
> So the questions are:
>
> - is this fixable? I don't see an inum I could run find on to remove,
>   and I can't even do a zfs volinit anyway:
>
>       nextest-01# zfs volinit
>       cannot iterate filesystems: I/O error
>
> - would not enabling zil_disable have prevented this?
> - Should I have been doing a 3-way mirror?
> - Is there a more optimal configuration to help prevent this kind of
>   corruption?

Anyone have any thoughts on this? I'd really like to be able to build a
nice ZFS box for file service, but if a hardware failure can corrupt a
disk pool I'll have to try to find another solution, I'm afraid.
> Anyone have any thoughts on this? I'd really like to be able to build
> a nice ZFS box for file service, but if a hardware failure can corrupt
> a disk pool I'll have to try to find another solution, I'm afraid.

Sorry, I worded this poorly -- if the loss of a disk in a mirror can
corrupt the pool, it's going to give me pause in implementing a ZFS
solution.

Jim
Jim,

I'm not at all sure what happened to your pool. However, I can answer
some of your questions.

Jim Hranicky wrote on 12/05/06 11:32:

> So the questions are:
>
> - is this fixable? I don't see an inum I could run find on to remove,

I think the pool is busted. Even the message printed in your previous
email is bad:

    DATASET  OBJECT  RANGE
    15       0       lvl=4294967295 blkid=0

as level is way out of range.

>   and I can't even do a zfs volinit anyway:
>
>       nextest-01# zfs volinit
>       cannot iterate filesystems: I/O error

I'm not sure why you're using zfs volinit, which I believe creates the
zvol links, but this further shows problems.

> - would not enabling zil_disable have prevented this?

No, the intent log is not needed for pool integrity. It ensures that
the synchronous semantics of O_DSYNC/fsync are obeyed.
> I think the pool is busted. Even the message printed in your previous
> email is bad:
>
>     DATASET  OBJECT  RANGE
>     15       0       lvl=4294967295 blkid=0
>
> as level is way out of range.

I think this could be from dmu_objset_open_impl(). It sets object to 0
and level to -1 (= 4294967295). [Hmmm, this also seems to indicate a
truncation from 64 to 32 bits somewhere.] Would zdb show any more
detail?

(Actually, it looks like the ZIL also sets object to 0 and level to -1
when accessing its blocks, but since the ZIL was disabled, I'd guess
this isn't the issue here.)
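A quick way to see why lvl=4294967295 looks like -1 truncated to 32 bits: masking a level of -1 down to a 32-bit unsigned field yields exactly the value reported by zpool status. This is an arithmetic illustration only, not ZFS code:

```shell
# A level of -1 stored in a 32-bit unsigned field and printed as an
# unsigned integer comes out as 4294967295 -- the lvl value from the
# persistent-error report above.
lvl=$(( -1 & 0xFFFFFFFF ))
printf '%u\n' "$lvl"    # prints 4294967295
```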
Here's the output of zdb:

    zmir
        version=3
        name='zmir'
        state=0
        txg=770
        pool_guid=5904723747772934703
        vdev_tree
            type='root'
            id=0
            guid=5904723747772934703
            children[0]
                type='mirror'
                id=0
                guid=15067187713781123481
                metaslab_array=15
                metaslab_shift=28
                ashift=9
                asize=36690722816
                children[0]
                    type='disk'
                    id=0
                    guid=8544021753105415508
                    path='/dev/dsk/c3t3d0s0'
                    devid='id1,sd@x00609487b409636e/a'
                    whole_disk=1
                    is_spare=1
                    DTL=19
                children[1]
                    type='disk'
                    id=1
                    guid=3579059219373561470
                    path='/dev/dsk/c3t4d0s0'
                    devid='id1,sd@n5005076710cff8b5/a'
                    whole_disk=1
                    is_spare=1
                    DTL=20

It doesn't seem to give much information, and I don't know any of the
"secret options" :->

Can anyone at all give me a good reason why this happened, or give me
any options to zdb so I can find out?

I can try plugging the spun-down disk back in and seeing if it can
recover, although that's not going to be an option if this happens for
real...

Jim
Hi Jim,

That looks interesting, though. I'm not a ZFS expert by any means, but
look at some of the properties of the children elements of the mirror:

    version=3
    name='zmir'
    state=0
    txg=770
    pool_guid=5904723747772934703
    vdev_tree
        type='root'
        id=0
        guid=5904723747772934703
        children[0]
            type='mirror'
            id=0
            guid=15067187713781123481
            metaslab_array=15
            metaslab_shift=28
            ashift=9
            asize=36690722816
            children[0]
                type='disk'
                id=0
                guid=8544021753105415508
                [b]path='/dev/dsk/c3t3d0s0'[/b]
                devid='id1,sd@x00609487b409636e/a'
                whole_disk=1
                [b]is_spare=1[/b]
                DTL=19
            children[1]
                type='disk'
                id=1
                guid=3579059219373561470
                [b]path='/dev/dsk/c3t4d0s0'[/b]
                devid='id1,sd@n5005076710cff8b5/a'
                whole_disk=1
                [b]is_spare=1[/b]
                DTL=20

If those are the original path IDs, and you didn't move the disks on
the bus, why is the is_spare flag set?

There are a lot of options to zdb; some can produce a lot of output.
Try:

    zdb zmir

Check the drive label contents with:

    zdb -l /dev/dsk/c3t0d0s0
    zdb -l /dev/dsk/c3t1d0s0
    zdb -l /dev/dsk/c3t3d0s0
    zdb -l /dev/dsk/c3t4d0s0

Uberblock info with:

    zdb -uuu zmir

And dataset info with:

    zdb -dd zmir

There are more options, and they give even more info if you repeat the
option letter more times (especially the -d flag...). These might be
worth posting to help one of the developers spot something.

Cheers,
Alan
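The per-disk label checks above can be wrapped in a small loop. The device names are the ones from this thread; the DRYRUN guard is an addition so the sketch can be exercised without the actual devices present (zdb -l needs a live disk to read a label from):

```shell
#!/bin/sh
# Dump the vdev label of each disk mentioned in the thread. With
# DRYRUN=1 (the default here) the commands are only printed, since
# zdb needs the real devices to exist.
DRYRUN=${DRYRUN:-1}
for d in c3t0d0 c3t1d0 c3t3d0 c3t4d0; do
    cmd="zdb -l /dev/dsk/${d}s0"
    if [ "$DRYRUN" -eq 1 ]; then
        echo "$cmd"
    else
        $cmd
    fi
done
```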
> If those are the original path IDs, and you didn't
> move the disks on the bus, why is the is_spare flag

Well, I'm not sure, but these drives were set as spares in another pool
I deleted -- should I have done something to the drives (fdisk?) before
rearranging them?

The rest of the options are spitting out a bunch of stuff I'll be glad
to post links to, but if the problem is that the drives are erroneously
marked as spares, I'll re-init them and start over.

Jim
Hold fire on the re-init until one of the devs chips in; maybe I'm
barking up the wrong tree ;)

--a
On Wed, Dec 06, 2006 at 12:35:58PM -0800, Jim Hranicky wrote:

> > If those are the original path IDs, and you didn't
> > move the disks on the bus, why is the is_spare flag
>
> Well, I'm not sure, but these drives were set as spares in another
> pool I deleted -- should I have done something to the drives (fdisk?)
> before rearranging them?
>
> The rest of the options are spitting out a bunch of stuff I'll be
> glad to post links to, but if the problem is that the drives are
> erroneously marked as spares, I'll re-init them and start over.

There are known issues with the way spares are tracked and recorded on
disk that can result in a variety of strange behavior in exceptional
circumstances. We are working on resolving these issues.

- Eric

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock