I am new to ZFS. I searched around for this problem and did not find it.

# cd /foo
# mkfile 64m 1
# mkfile 64m 2
# mkfile 64m 3
# mkfile 64m 4
# mkfile 64m 5
# dd if=/dev/urandom of=afile bs=1024 count=102400
# zpool create tank2 raidz /foo/1 /foo/2 /foo/3 /foo/4 /foo/5
# cp afile /tank2
# zpool list
NAME     SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank    22.5G  3.58M  22.5G   0%  ONLINE  -
tank2    296M   125M   171M  42%  ONLINE  -
# echo junk1 > 1
# echo junk2 > 2
# umount /tank2
# zfs mount /tank2
# cat /tank2/afile > /dev/null

This causes a system restart. Is this the intended result?

The exact result varies, but after putting junk in those files (which ZFS is treating as devices), the OS resets suddenly at some point. I think it may be because the files are suddenly a different size than ZFS is expecting.

This message posted from opensolaris.org
You are corrupting two copies in a RAID-Z group, which has only single-fault tolerance. While we can survive uncorrectable read errors, write errors will result in a panic. By cat'ing that file, you are going to push the atime for the file, which will blow up.

- Eric

On Wed, Apr 19, 2006 at 04:55:47PM -0700, Jeff Davis wrote:
> I am new to ZFS. I searched around for this problem and did not find it.
> [...]
> causes system restart. Is this the intended result?
> The result varies somewhat, but after putting junk in those files (which
> zfs is treating as a device), it will do a sudden OS reset at some point.
>
> I think it may be because the files are all of a sudden a different size
> than zfs is expecting.

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
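[Editor's aside: if the panic here really is triggered by the atime update on read, one way to narrow the window in a test like this is to stop reads from dirtying metadata at all, by disabling atime updates on the dataset. A minimal sketch, assuming the `tank2` pool from the example above; `atime` is a standard ZFS dataset property:]

```shell
# Disable access-time updates so a plain read of a file does not
# generate a metadata write against the (corrupted) vdevs.
zfs set atime=off tank2

# Confirm the property took effect.
zfs get atime tank2
```

[This only avoids the read-triggered write; any other write to the pool can still hit the corrupted devices and panic.]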
The system is probably panicking because the ZFS checksums of the data read back by the "cat /tank2/afile > /dev/null" are incorrect, since the pool was open at the time you corrupted the vdevs. Does your machine write a panic string out to /var/adm/messages at all?

If you export the pool first, things behave much more gracefully...

bash-3.00# cd /mnt/
bash-3.00# mkfile 64m 1
bash-3.00# mkfile 64m 2
bash-3.00# mkfile 64m 3
bash-3.00# mkfile 64m 4
bash-3.00# mkfile 64m 5
bash-3.00# dd if=/dev/urandom of=afile bs=1024 count=1024
bash-3.00# zpool create tank2 raidz /mnt/1 /mnt/2 /mnt/3 /mnt/4 /mnt/5
bash-3.00# cp afile /tank2/
bash-3.00# zpool list
NAME     SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank2    296M  1.31M   295M   0%  ONLINE  -
bash-3.00# zpool export tank2
bash-3.00# echo blahblahblah > 1
bash-3.00# zpool import -d /mnt tank2
bash-3.00# zpool list
NAME     SIZE   USED  AVAIL  CAP  HEALTH    ALTROOT
tank2    296M  1.36M   295M   0%  DEGRADED  -
bash-3.00# zpool status
  pool: tank2
 state: DEGRADED
status: One or more devices could not be used because the label is
        missing or invalid.  Sufficient replicas exist for the pool to
        continue functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver completed with 0 errors on Thu Apr 20 09:50:59 2006
config:

        NAME                      STATE     READ WRITE CKSUM
        tank2                     DEGRADED     0     0     0
          raidz                   DEGRADED     0     0     0
            15027832298606416063  UNAVAIL      0     0     0  was /mnt/1
            /mnt/2                ONLINE       0     0     0
            /mnt/3                ONLINE       0     0     0
            /mnt/4                ONLINE       0     0     0
            /mnt/5                ONLINE       0     0     0

errors: No known data errors
bash-3.00# dd if=/tank2/afile bs=1024k >/dev/null
1+0 records in
1+0 records out
bash-3.00# cksum /tank2/afile
2746568689      1048576 /tank2/afile
bash-3.00# mkfile 64m 6
bash-3.00# zpool replace tank2 /mnt/1 /mnt/6
bash-3.00# zpool status
  pool: tank2
 state: ONLINE
 scrub: resilver completed with 0 errors on Thu Apr 20 09:52:21 2006
config:

        NAME        STATE     READ WRITE CKSUM
        tank2       ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            /mnt/6  ONLINE       0     0     0  267K resilvered
            /mnt/2  ONLINE       0     0     0
            /mnt/3  ONLINE       0     0     0
            /mnt/4  ONLINE       0     0     0
            /mnt/5  ONLINE       0     0     0

errors: No known data errors
bash-3.00# dd if=/tank2/afile bs=1024k >/dev/null
1+0 records in
1+0 records out
bash-3.00# cksum /tank2/afile
2746568689      1048576 /tank2/afile

That's the replacement of one corrupt vdev. For two or more...

bash-3.00# zpool export tank2
bash-3.00# echo blahblahblah > 2
bash-3.00# echo blahblahblah > 3
bash-3.00# zpool import -d /mnt tank2
cannot import 'tank2': one or more devices is unavailable
bash-3.00# zpool import -d /mnt
  pool: tank2
    id: 11628506508750100983
 state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        tank2       UNAVAIL   insufficient replicas
          raidz     UNAVAIL   insufficient replicas
            /mnt/6  ONLINE
            /mnt/2  UNAVAIL   corrupted data
            /mnt/3  UNAVAIL   corrupted data
            /mnt/4  ONLINE
            /mnt/5  ONLINE

As expected, the data is gone because we have lost too many vdevs, but there's no panic. :)

Cheers,
Alan
Hello Eric,

Thursday, April 20, 2006, 2:34:49 AM, you wrote:

ES> You are corrupting two copies in a RAID-Z group, which has only
ES> single-fault tolerance. While we can survive uncorrectable read errors,
ES> write errors will result in a panic. By cat'ing that file, you are
ES> going to push the atime for the file, which will blow up.

IMHO there should be a property which controls what to do in such situations, similar to UFS onerror=panic|lock.

If I have a few hundred TB in different pools, I do not really want to panic the whole system just because one of the many pools is inconsistent.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                       http://milek.blogspot.com
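[Editor's aside: for reference, the UFS behaviour Robert is pointing at is chosen at mount time via the `onerror` option. A minimal sketch; the device path and mount point below are placeholders:]

```shell
# Mount a UFS filesystem so that an internal inconsistency locks the
# filesystem (applying it read-only) rather than panicking the system.
# /dev/dsk/c0t0d0s6 and /data are placeholder names for illustration.
mount -F ufs -o onerror=lock /dev/dsk/c0t0d0s6 /data
```

[The alternatives are onerror=panic (the default) and onerror=umount; a per-pool ZFS equivalent is what Robert is asking for.]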
On Wed, Apr 19, 2006 at 08:27:00PM -0700, jeffrey davis wrote:
>
> Is it possible to cause an I/O error instead of resetting the system?

Theoretically, yes. But it's very difficult. We would need the ability to abort an entire transaction group mid-stride, as well as the ability to propagate those errors up the stack in a meaningful way. Something we'd like to do, but decidedly non-trivial.

> My computer actually reboots. Is there a way after the reboot to tell
> what happened?

Yes, what is actually happening is your machine is panicking. Most likely you are running on a desktop, which means the console message indicating the panic isn't visible since X has control of the screen. You can find out what happened after a reboot by doing:

# cd /var/crash/<hostname>
# mdb *.0       (or whatever the highest number is)

Then "::status", "$C", and "::msgbuf" are useful commands.

> What happens when Solaris encounters an unrecoverable error on a
> non-ZFS drive?

It depends on the subsystem that's using it. The driver will return EIO, but whatever happens after that is up to the consumer.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
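[Editor's aside: Eric's post-mortem steps assume a crash dump was actually saved after the panic. A hedged sketch of checking that first; the paths shown are the usual Solaris defaults and may differ on your machine:]

```shell
# Verify crash-dump configuration; the "Savecore enabled: yes" line
# is what makes dumps land in /var/crash/<hostname> after a panic.
dumpadm

# After the reboot, open the most recent saved dump pair with mdb.
cd /var/crash/$(hostname)
mdb unix.0 vmcore.0
```

[Inside mdb, the ::status, $C, and ::msgbuf commands Eric mentions show the panic summary, the panicking stack, and the last console messages respectively.]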
On Thu, Apr 20, 2006 at 11:21:34AM +0200, Robert Milkowski wrote:
>
> IMHO there should be a property which controls what to do in such
> situations - similar to UFS onerror=panic|lock.
>
> If I have a few hundred TB in different pools I do not really want to
> panic the whole system just because one of the many pools is inconsistent.

This is not a case of "if (flag) do something else". Recovering from a write error requires some fundamental changes to the architecture at multiple levels, as we are in the middle of syncing a transaction group and have long since lost any correlation with a filesystem-level request by the time the failure occurs. And due to the tree-like nature of ZFS, aborting an entire transaction group could cause a large number of I/O failures from a single block failure.

Once we figure out how to handle this, then certainly a pool-wide property would seem reasonable. However, there is a ton of work that needs to be done before this. Some first steps include reallocating writes, which will let us retry writes on other drives/locations mid-sync.

It's important to note, however, that as of build 36, we will survive any read errors, and provide you a complete list (via 'zpool status -v') of any such errors encountered.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
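[Editor's aside: to exercise the read-error path Eric describes, a scrub forces every block in the pool to be read and checksummed, and 'zpool status -v' then reports any unrecoverable errors found. A minimal sketch, reusing the tank2 pool name from the earlier examples; the exact report format varies by build:]

```shell
# Read and verify every allocated block in the pool.
zpool scrub tank2

# Show per-device READ/WRITE/CKSUM counters; with -v, also list
# anything affected by unrecoverable errors the scrub encountered.
zpool status -v tank2
```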