ZFS in FreeBSD lacks at least one major feature from the Solaris version: hot
spares.  There is a PR open at http://www.freebsd.org/cgi/query-pr.cgi?pr=134491,
but there hasn't been any motion or discussion posted on it since its creation
almost one year ago.

I'm aware that on Solaris, hot spare replacement is handled by a couple of
Solaris-specific daemons, zfs-retire and zfs-diagnose, which both plug into
the Solaris FMA (Fault Management Architecture).  Have there been any thoughts
on porting these over, or on getting something similar running within FreeBSD?
With all of the recent SATA/SAS CAM hotplug work now committed, it would be
nice to have failed drives automatically replaced by hot spares, with
hot-replacement of the failed drive itself to follow later.

On the other hand, I'd be interested in hearing whether anyone has had success
rolling their own scripted solution: i.e., something which polls 'zpool status'
looking for failed drives and performs hot-spare replacements automatically.

Thanks,
Steve Polyack
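P.S. For concreteness, here is a rough, untested sketch of the sort of polling
script I have in mind.  The pool name, spare device, and interval are
placeholders, and a real version would also need to remember which devices it
has already tried to replace:

    #!/bin/sh
    # Poll 'zpool status' for failed vdevs and attach a spare in their place.
    POOL=tank       # placeholder pool name
    SPARE=da8       # placeholder spare device

    while :; do
        # pick out config lines whose STATE column shows a failed device,
        # skipping the pool summary line and "state:"/"status:" headers
        failed=$(zpool status "$POOL" |
            awk -v pool="$POOL" '$1 != pool && $1 !~ /:/ &&
                $2 ~ /^(FAULTED|UNAVAIL|REMOVED)$/ { print $1 }')
        for dev in $failed; do
            logger -p daemon.err "ZFS: $dev failed in $POOL, replacing with $SPARE"
            zpool replace "$POOL" "$dev" "$SPARE"
        done
        sleep 60
    done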
On Mon, Mar 08, 2010 at 01:06:10PM -0500, Steve Polyack wrote:
> ZFS in FreeBSD lacks at least one major feature from the Solaris version:
> hot spares.  There is a PR open at
> http://www.freebsd.org/cgi/query-pr.cgi?pr=134491, but there hasn't been
> any motion or discussion posted on it since its creation almost one year
> ago.
>
> I'm aware that on Solaris, hot spare replacement is handled by a couple of
> Solaris-specific daemons, zfs-retire and zfs-diagnose, which both plug into
> the Solaris FMA (Fault Management Architecture).  Have there been any
> thoughts on porting these over, or on getting something similar running
> within FreeBSD?  With all of the recent SATA/SAS CAM hotplug work now
> committed, it would be nice to have failed drives automatically replaced by
> hot spares, with hot-replacement of the failed drive itself to follow later.
>
> On the other hand, I'd be interested in hearing whether anyone has had
> success rolling their own scripted solution: i.e., something which polls
> 'zpool status' looking for failed drives and performs hot-spare
> replacements automatically.

Currently FreeBSD's ZFS sends various events to devd.  It should be possible
to implement some scripts (or maybe reuse zfs-retire/zfs-diagnose?) to perform
'zpool replace' when a disk disappears, etc.  This shouldn't be very hard,
modulo bugs in FreeBSD/ZFS: because this functionality has gone unused so far,
it hasn't really been tested.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
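A minimal sketch of the devd hook described above (untested; the helper script
name is hypothetical, and $pool is the variable the stock /etc/devd.conf ZFS
sample rules already use for this event type):

    # /etc/devd.conf fragment: hand ZFS vdev failures to a helper script
    notify 10 {
        match "system" "ZFS";
        match "type" "vdev";
        action "/usr/local/sbin/zfs-autoreplace '$pool'";
    };

The hypothetical zfs-autoreplace helper would then look at 'zpool status' for
the named pool, pick out a device marked FAULTED/UNAVAIL/REMOVED, and run
'zpool replace' against an available spare, much like a polling script, only
triggered by the event instead of a timer.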
On 03/09/10 05:11, Ivan Voras wrote:
> On 03/08/10 19:06, Steve Polyack wrote:
>> ZFS in FreeBSD lacks at least one major feature from the Solaris version:
>> hot spares. [...]
>
> You don't have to exactly poll it.  See /etc/devd.conf:
>
> # Sample ZFS problem reports handling.
> notify 10 {
>     match "system" "ZFS";
>     match "type" "zpool";
>     action "logger -p kern.err 'ZFS: failed to load zpool $pool'";
> };
>
> notify 10 {
>     match "system" "ZFS";
>     match "type" "vdev";
>     action "logger -p kern.err 'ZFS: vdev failure, zpool=$pool type=$type'";
> };
>
> notify 10 {
>     match "system" "ZFS";
>     match "type" "data";
>     action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'";
> };
>
> notify 10 {
>     match "system" "ZFS";
>     match "type" "io";
>     action "logger -p kern.warn 'ZFS: vdev I/O failure, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size error=$zio_err'";
> };
>
> notify 10 {
>     match "system" "ZFS";
>     match "type" "checksum";
>     action "logger -p kern.warn 'ZFS: checksum mismatch, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size'";
> };
>
> I don't really know if these notifications actually work since I don't
> have hot-plug test machines, but if they do, this looks like a decent
> starting point.

Thanks for the suggestions.  I received a similar one from someone else.  If I
get time to build a ZFS lab machine, I will certainly try these out and
provide feedback on how they work.
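The lab setup I have in mind is deliberately simple: a throwaway pool built on
file-backed md(4) devices, with one disk faulted by hand to see which of the
devd notifications above actually fire (with the ZFS notify rules in
/etc/devd.conf enabled and devd restarted).  Roughly, and untested; file
names, sizes, and md unit numbers are arbitrary:

    # three 256 MB backing files and matching md devices
    truncate -s 256m /tmp/zd0 /tmp/zd1 /tmp/zd2
    mdconfig -a -t vnode -f /tmp/zd0    # -> md0
    mdconfig -a -t vnode -f /tmp/zd1    # -> md1
    mdconfig -a -t vnode -f /tmp/zd2    # -> md2

    # mirrored test pool with one hot spare
    zpool create testpool mirror md0 md1 spare md2

    # simulate a failure and watch for the ZFS events from devd
    zpool offline testpool md1    # an administrative offline; forcibly
                                  # detaching md1 would be closer to a
                                  # real drive failure
    tail -f /var/log/messages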