thr3ads.net - zfs discuss - [zfs-discuss] ZFS on Freebsd 7.0 [Dec 2007]

Jason Morton

2007-Dec-07 17:18 UTC

[zfs-discuss] ZFS on Freebsd 7.0

I am using ZFS on FreeBSD 7.0_beta3. This is the first time I have
used ZFS, and I have run into something that I am not sure is normal
and that I am very concerned about.

SYSTEM INFO:
hp 320s (storage array)
12 disks (750GB each)
2GB RAM
1GB flash drive (running the OS)

When I take a disk offline and replace it with my spare, the pool shows
numerous errors once the spare has finished rebuilding. See below:
scrub: resilver completed with 946 errors on Thu Dec  6 15:15:32 2007
config:

	NAME        STATE     READ WRITE CKSUM
	fatty       DEGRADED     0     0 3.71K
	  raidz2    DEGRADED     0     0 3.71K
	    da0     ONLINE       0     0     0
	    da1     ONLINE       0     0     0
	    da2     ONLINE       0     0     0
	    da3     ONLINE       0     0   300
	    da4     ONLINE       0     0     0
	    da5     ONLINE       0     0     0
	    da6     ONLINE       0     0   253
	    da7     ONLINE       0     0     0
	    da8     ONLINE       0     0     0
	    spare   DEGRADED     0     0     0
	      da9   OFFLINE      0     0     0
	      da11  ONLINE       0     0     0
	    da10    ONLINE       0     0     0
	spares
	  da11      INUSE     currently in use

errors: 801 data errors, use '-v' for a list


After I detach the spare da11 and bring da9 back online, all the errors
go away.

pool: fatty
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
  see: http://www.sun.com/msg/ZFS-8000-9P
scrub: resilver completed with 0 errors on Thu Dec  6 15:57:23 2007
config:

	NAME        STATE     READ WRITE CKSUM
	fatty       ONLINE       0     0 3.71K
	  raidz2    ONLINE       0     0 3.71K
	    da0     ONLINE       0     0     0
	    da1     ONLINE       0     0     0
	    da2     ONLINE       0     0     0
	    da3     ONLINE       0     0   300
	    da4     ONLINE       0     0     0
	    da5     ONLINE       0     0     0
	    da6     ONLINE       0     0   253
	    da7     ONLINE       0     0     0
	    da8     ONLINE       0     0     0
	    da9     ONLINE       0     0     0
	    da10    ONLINE       0     0     0
	spares
	  da11      AVAIL

errors: No known data errors
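
For anyone trying to reproduce this, the sequence described above can be
written down roughly as the commands below. This is only a sketch: the
post does not show the exact invocations, so the use of "zpool replace"
for the swap to the spare is an assumption, and spare_swap_commands is a
hypothetical helper name. The script only prints the commands; it does
not run them.

```python
def spare_swap_commands(pool, disk, spare):
    """Return the assumed zpool invocations, in order, as argv lists."""
    return [
        ["zpool", "offline", pool, disk],         # take the disk offline
        ["zpool", "replace", pool, disk, spare],  # resilver onto the spare (assumed step)
        ["zpool", "status", "-v", pool],          # inspect errors after the resilver
        ["zpool", "detach", pool, spare],         # hand the spare back to the spares list
        ["zpool", "online", pool, disk],          # bring the original disk back
        ["zpool", "status", "-v", pool],          # inspect the pool again
    ]


if __name__ == "__main__":
    # Pool and device names are the ones from the listings above.
    for argv in spare_swap_commands("fatty", "da9", "da11"):
        print(" ".join(argv))
```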


Thanks

Peter Schuller

2007-Dec-07 19:59 UTC


> 	NAME        STATE     READ WRITE CKSUM
> 	fatty       DEGRADED     0     0 3.71K
> 	  raidz2    DEGRADED     0     0 3.71K
> 	    da0     ONLINE       0     0     0
> 	    da1     ONLINE       0     0     0
> 	    da2     ONLINE       0     0     0
> 	    da3     ONLINE       0     0   300
> 	    da4     ONLINE       0     0     0
> 	    da5     ONLINE       0     0     0
> 	    da6     ONLINE       0     0   253
> 	    da7     ONLINE       0     0     0
> 	    da8     ONLINE       0     0     0
> 	    spare   DEGRADED     0     0     0
> 	      da9   OFFLINE      0     0     0
> 	      da11  ONLINE       0     0     0
> 	    da10    ONLINE       0     0     0
> 	spares
> 	  da11      INUSE     currently in use
>
> errors: 801 data errors, use '-v' for a list
>
>
> After I detach the spare da11 and bring da9 back online all the errors
> go away.
Theory:

Suppose da3 and da6 are bad drives, have cabling issues, or sit on a
controller (different from the other drives' controller) that is
corrupting data.

If you now replace da9 with da11, the resilver has to read from these
drives, which triggers the checksum errors. Once you bring da9 back in,
it is either entirely up to date or very close to it, so the amount of
I/O required to resilver it is very small and may not trigger the
problem.

If this theory is correct, a scrub (zpool scrub fatty) should encounter 
checksum errors on da3 and da6.
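
One way to check this from the zpool status output is to pull out the
devices whose CKSUM column is non-zero. A minimal Python sketch, assuming
the column layout shown in the listing above. The function names are
hypothetical, and the 1024 multiplier for a "K" suffix is an assumption
about how zpool abbreviates large counts (the abbreviated figure is
rounded in any case).

```python
import re

# Assumed multipliers for abbreviated counts such as "3.71K".
_SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Config section copied from the status listing in this thread.
SAMPLE = """
    NAME        STATE     READ WRITE CKSUM
    fatty       DEGRADED     0     0 3.71K
      raidz2    DEGRADED     0     0 3.71K
        da0     ONLINE       0     0     0
        da3     ONLINE       0     0   300
        da6     ONLINE       0     0   253
        spare   DEGRADED     0     0     0
          da9   OFFLINE      0     0     0
          da11  ONLINE       0     0     0
        da10    ONLINE       0     0     0
    spares
      da11      INUSE     currently in use
"""


def parse_count(token):
    """Convert a counter such as '300' or '3.71K' to an integer."""
    suffix = token[-1:].upper()
    if suffix in _SUFFIXES:
        return round(float(token[:-1]) * _SUFFIXES[suffix])
    return int(token)


def devices_with_checksum_errors(status_text, name_pattern=r"da\d+"):
    """Return {device: cksum} for matching devices with a non-zero CKSUM."""
    found = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Device rows have five columns: NAME STATE READ WRITE CKSUM.
        if len(fields) != 5 or not re.fullmatch(name_pattern, fields[0]):
            continue
        try:
            cksum = parse_count(fields[4])
        except ValueError:
            continue  # e.g. the "da11 INUSE currently in use" spares row
        if cksum > 0:
            found[fields[0]] = cksum
    return found


if __name__ == "__main__":
    print(devices_with_checksum_errors(SAMPLE))
```

If da3 and da6 show up again after a scrub while the other disks stay at
zero, that would be consistent with the theory.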

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller at infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey at scode.org
E-Mail: peter.schuller at infidyne.com Web: http://www.scode.org


Karl Pielorz

2007-Dec-07 21:05 UTC


--On 07 December 2007 11:18 -0600 Jason Morton 
<jasonm at layeredtechnologies.com> wrote:
> I am using ZFS on FreeBSD 7.0_beta3. This is the first time I have used
> ZFS and I have run into something that I am not sure if this is normal,
> but am very concerned about.
>
> SYSTEM INFO:
> hp 320s (storage array)
> 12 disks (750GB each)
> 2GB RAM
> 1GB flash drive (running the OS)
Hi There,

I've been running ZFS under FreeBSD 7.0 for a few months now, and we also
have a lot of HP / ProLiant kit - and, touch wood, so far we've not seen
any issues.

The first thing I'd suggest is to make sure you have the absolute *latest*
firmware for the BIOS and the RAID controller (a P400, I think, on the
320s) from HP's site. We've had a number of problems with drives
'disappearing', arrays locking up, and errors under previous firmware,
all of which were (finally) resolved by updated firmware. Even our latest
delivered batch of 360s and 380s didn't have anything like 'current'
firmware on them.
> When I take a disk offline and replace it with my spare, after the spare
> rebuild it shows there are numerous errors. see below:
> scrub: resilver completed with 946 errors on Thu Dec  6 15:15:32 2007
Since they're checksum errors, they probably won't be logged on the
console (ZFS detected them, not necessarily the underlying CAM layers) -
but it's worth checking in case something "isn't happy".

With that in mind, you might also want to check whether da3 and da6 have
anything in common - either the physical drives themselves, or where they
sit in the DL320s' drive bay/box allocations, as shown by the RAID
controller config (F8 at boot time when the RAID is initialising).

-Kp

Joe

2008-Mar-28 22:37 UTC


On Dec 7, 2007, at 1:05 PM, Karl Pielorz wrote:
>
>
> --On 07 December 2007 11:18 -0600 Jason Morton
> <jasonm at layeredtechnologies.com> wrote:
>
>> I am using ZFS on FreeBSD 7.0_beta3. This is the first time I have used
>> ZFS and I have run into something that I am not sure if this is normal,
>> but am very concerned about.
>>
>> SYSTEM INFO:
>> hp 320s (storage array)
>> 12 disks (750GB each)
>> 2GB RAM
>> 1GB flash drive (running the OS)
>
> Hi There,
>
> I've been running ZFS under FreeBSD 7.0 for a few months now, and we
> also have a lot of HP / Proliant Kit - and, touch wood, so far - we've
> not seen any issues.
>
Jason,

Now that FreeBSD 7 has been out for a while, have you had any problems  
with your ZFS pools?

-joe
