thr3ads.net - zfs discuss - [zfs-discuss] devid support for EFI partition improved zfs usibility [Jun 2006]

Freeman Liu

2006-Jun-15 02:30 UTC

[zfs-discuss] devid support for EFI partition improved zfs usibility

Hi, guys,

I have added devid support for EFI partitions (not yet putback) and tested
it with a ZFS mirror. The mirror can now recover even when a USB hard disk
is unplugged and replugged into a different USB port.
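
To illustrate why a devid helps here, the following is a minimal Python sketch, not the actual Solaris or ZFS code. It assumes the pool remembers both the last known path and the devid of each disk. The names VdevRecord and locate_vdev, the device paths, and the devid string are all made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class VdevRecord:
    """What the pool remembers about one disk (hypothetical, simplified)."""
    path: str    # last known device path; changes with the USB port
    devid: str   # identifier derived from the disk itself; stable across ports


def locate_vdev(record: VdevRecord, attached: Dict[str, str]) -> Optional[str]:
    """Return the current path of the disk described by `record`, or None.

    `attached` maps each currently visible device path to its devid.
    """
    # Fast path: the disk is still where we last saw it.
    if attached.get(record.path) == record.devid:
        return record.path
    # Slow path: the disk moved, so search every attached device by devid.
    for path, devid in attached.items():
        if devid == record.devid:
            return path
    return None  # the disk is not attached at all


# Hypothetical example: the disk was on c2t0d0s0 and is replugged as c3t0d0s0.
record = VdevRecord(path="/dev/dsk/c2t0d0s0", devid="example-devid-1234")
after_replug = {"/dev/dsk/c3t0d0s0": "example-devid-1234"}
new_path = locate_vdev(record, after_replug)
```

After the replug the old path no longer matches, but the search by devid still finds the disk at its new path. Without a devid, the lookup by path alone would fail.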

But some things still need improvement. I'm far from a ZFS expert, so
correct me if I'm wrong.

First, ZFS should sense the hotplug event.
I use zpool status to check the status of the pool, but it seems ZFS does
not receive the hotplug event, so the pool status does not reflect the
change until some read/write operation is issued. Specifically, when I
unplug a disk, the status stays ONLINE until I try a write operation or
wait for some time. And after I replug the device, the status remains
DEGRADED until I issue a zpool scrub. This was a little tricky until I
figured it out.
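
The behaviour described above can be modelled with a small Python sketch. This is a toy model under stated assumptions, not ZFS code: the classes Disk and LazyMirror are hypothetical, and the rule that ordinary writes never reprobe a disk already marked bad is an assumption chosen to match the observation that only a scrub brought the mirror back to ONLINE.

```python
class Disk:
    """A disk whose physical presence can change without notifying the pool."""

    def __init__(self, name):
        self.name = name
        self.present = True   # physical reality (plugged in or not)
        self.seen_ok = True   # what the pool last observed


class LazyMirror:
    """A mirror that learns about disk changes only as a side effect of I/O."""

    def __init__(self, *disks):
        self.disks = list(disks)

    def status(self):
        # Report the cached view; deliberately does not probe the disks.
        if all(d.seen_ok for d in self.disks):
            return "ONLINE"
        if any(d.seen_ok for d in self.disks):
            return "DEGRADED"
        return "UNAVAIL"

    def write(self, data):
        # A write only discovers failures: a missing disk gets marked bad,
        # but a disk already marked bad is not retried (assumption).
        for d in self.disks:
            if d.seen_ok and not d.present:
                d.seen_ok = False

    def scrub(self):
        # A scrub reprobes every disk, so it also notices returning disks.
        for d in self.disks:
            d.seen_ok = d.present


a, b = Disk("usb0"), Disk("usb1")
mirror = LazyMirror(a, b)
a.present = False                 # unplug: nothing tells the pool
before_io = mirror.status()       # still the stale cached view
mirror.write(b"data")
after_io = mirror.status()
a.present = True                  # replug: again, nothing tells the pool
mirror.write(b"more data")
after_replug = mirror.status()
mirror.scrub()
after_scrub = mirror.status()
```

With hotplug events delivered to the pool, the two unplug and replug steps would update the status directly instead of waiting for the next write or scrub.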

Second, ZFS should not panic when a USB disk is gone.
ZFS will panic for a pool composed of only one USB disk when the disk is
unplugged. It would be better to change the status of the pool to something
like UNAVAILABLE when the disk is unplugged and return it to ONLINE after it
is replugged. This goal may be hard to achieve because it's a tough decision
whether or not to panic when the storage fails, and there seems to be no
mechanism for ZFS to tell whether it's a USB hard disk or a fixed disk.
Anyway, because of the hotpluggable nature of USB hard disks, unplugging one
unintentionally should not cause a panic.

my 1c

Thank you

Eric Schrock

2006-Jun-15 16:31 UTC

On Thu, Jun 15, 2006 at 10:30:02AM +0800, Freeman Liu wrote:
>
> Hi, guys,
>
> I have added devid support for EFI partitions (not yet putback) and tested
> it with a ZFS mirror. The mirror can now recover even when a USB hard disk
> is unplugged and replugged into a different USB port.
>
> But some things still need improvement. I'm far from a ZFS expert, so
> correct me if I'm wrong.
>
> First, ZFS should sense the hotplug event.

ZFS doesn't subscribe to hotplug events yet. We currently have a "poor
man's hotplug", whereby we can lazily notice that a disk is gone, but
it's far from perfect.

> I use zpool status to check the status of the pool, but it seems ZFS
> does not receive the hotplug event, so the pool status does not reflect
> the change until some read/write operation is issued. Specifically, when
> I unplug a disk, the status stays ONLINE until I try a write operation
> or wait for some time. And after I replug the device, the status remains
> DEGRADED until I issue a zpool scrub. This was a little tricky until I
> figured it out.

Yes, this is all the functionality we have at the moment, until the next
phase of ZFS/FMA integration. Seth putback some basic hotplug support
in the sx4500 diagnosis engine, which we hope to generalize through
libtopo, along with a ZFS agent which understands how to behave in
response to hotplug events.

> Second, ZFS should not panic when a USB disk is gone.
> ZFS will panic for a pool composed of only one USB disk when the disk
> is unplugged. It would be better to change the status of the pool to
> something like UNAVAILABLE when the disk is unplugged and return it to
> ONLINE after it is replugged. This goal may be hard to achieve because
> it's a tough decision whether or not to panic when the storage fails,
> and there seems to be no mechanism for ZFS to tell whether it's a USB
> hard disk or a fixed disk. Anyway, because of the hotpluggable nature
> of USB hard disks, unplugging one unintentionally should not cause a
> panic.

Yes, this is certainly the goal. However, it is quite difficult,
particularly for a single-disk pool. We have to be able to abort an
in-progress transaction group and somehow notify consumers that an I/O
has failed. By the time we are in the failure detection context,
we've long since lost any relation to processes, vnodes, or anything
useful that could be used to propagate the error. In addition, we have
to fail all future I/O, because the entire transaction group has failed
and to do otherwise would result in an inconsistent pool.
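
The difficulty can be seen in a small Python sketch. It is a toy model under assumptions, not the ZFS implementation: ToyPool, PoolFailed, and the single device_ok flag are hypothetical. It shows only the two properties named above: the write call returns before the failure is detected, and once a group fails, all later I/O is refused.

```python
class PoolFailed(Exception):
    """Raised for any I/O attempted at or after a transaction group failure."""


class ToyPool:
    def __init__(self, device_ok=True):
        self.device_ok = device_ok  # stands in for the only disk in the pool
        self.pending = []           # writes buffered in the open group
        self.committed = []         # writes from groups that reached the disk
        self.failed = False

    def write(self, block):
        if self.failed:
            raise PoolFailed("pool has failed; rejecting write")
        # The caller returns here, long before the data reaches the disk.
        self.pending.append(block)

    def sync(self):
        """Commit the open group as a unit: all of it or none of it."""
        if self.failed:
            raise PoolFailed("pool has failed; nothing to sync")
        if not self.device_ok:
            # Abort the whole group and refuse all future I/O, since a
            # partially applied group would leave the pool inconsistent.
            self.pending.clear()
            self.failed = True
            raise PoolFailed("transaction group aborted")
        self.committed.extend(self.pending)
        self.pending.clear()


pool = ToyPool()
pool.write("block A")
pool.sync()                 # group 1 commits normally
pool.device_ok = False      # the disk is unplugged
pool.write("block B")       # succeeds: the failure is not yet visible
try:
    pool.sync()
    sync_error = None
except PoolFailed as exc:
    sync_error = str(exc)
```

Note that the write of "block B" reported success to its caller. By the time sync() notices the failure, the sketch has no record of who issued that write, which mirrors the problem of propagating the error back to a process or vnode.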

This is all doable, but (as with many things) requires non-trivial
effort.  Before we tackle this problem, we will likely address this RFE:

6417779 reallocating writes

For multi-disk pools, this will allow us to retry the I/O on another
toplevel vdev, hopefully saving the integrity of the pool in the
process.  Combined with ditto blocks (which should protect the metaslab
and space maps), we should be able to survive write failure in a
multi-device pool with relative grace.
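
The idea of retrying on another top-level vdev can be sketched in a few lines of Python. This is only an illustration of the fallback, under the assumption that a failed write can simply be reissued elsewhere; the names TopLevelVdev, WriteError, and reallocating_write are made up and say nothing about how RFE 6417779 is actually implemented.

```python
class WriteError(Exception):
    """A write could not be completed."""


class TopLevelVdev:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.blocks = []

    def write(self, block):
        if not self.healthy:
            raise WriteError(f"{self.name}: write failed")
        self.blocks.append(block)


def reallocating_write(vdevs, block):
    """Try each top-level vdev in turn; return the name of the one used."""
    for vdev in vdevs:
        try:
            vdev.write(block)
            return vdev.name
        except WriteError:
            continue  # reallocate: try the next top-level vdev
    raise WriteError("no top-level vdev accepted the write")


vdevs = [TopLevelVdev("usb0", healthy=False), TopLevelVdev("disk1")]
used = reallocating_write(vdevs, "block A")
```

With only one top-level vdev the loop has nowhere to fall back to, which is why this approach helps multi-device pools but not the single-disk case.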

Hope that helps,

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock

Freeman Liu

2006-Jun-16 02:28 UTC

> Yes, this is all the functionality we have at the moment until the next
> phase of ZFS/FMA integration.  Seth putback some basic hotplug support
> in the sx4500 diagnosis engine, which we hope to generalize through
> libtopo and have a ZFS agent which understands how to behave in response
> to hotplug events.

Thanks for your explanation.

Will you treat a manual-unplug failure differently from a
hardware-malfunction failure? That is, will human misoperation be handled
differently from hardware failure? It seems that generating different
events in these situations would help to achieve that.

I'm not sure whether an unplug generates an FMA event, but it will generate
an unplug event (which I think is somewhat different from an error) that
will be caught by Tamarack (an ongoing project by Artem Kachitchkine).

Thank you

Artem Kachitchkine

2006-Jun-16 04:04 UTC

> I'm not sure whether an unplug generates an FMA event, but it will
> generate an unplug event (which I think is somewhat different from an
> error) that will be caught by Tamarack

Tamarack is mostly for the desktop. Sure, you can use it to do an automatic
'zpool import' on hotplug, but not for fault recovery. FMA is a better fit
for ZFS: we just want the filesystem humming away in spite of failures
(self-healing). If a disk needs to be replaced (when there are no hot
spares), use 'zpool replace'. I think surprise removals should be handled
the same as failures.

-Artem.

Eric Schrock

2006-Jun-16 15:53 UTC

On Fri, Jun 16, 2006 at 10:28:50AM +0800, Freeman Liu wrote:
>
> Will you treat a manual-unplug failure differently from a
> hardware-malfunction failure? That is, will human misoperation be
> handled differently from hardware failure? It seems that generating
> different events in these situations would help to achieve that.

Yes, a hotplug event does not correspond to a fault. Also, if a new
device is plugged into the same physical slot, then there will be a
per-pool toggle switch to enable automatic replace/resilver. If you
plug the same device back in, it will automatically resilver regardless.

> I'm not sure whether an unplug generates an FMA event, but it will
> generate an unplug event (which I think is somewhat different from an
> error) that will be caught by Tamarack (an ongoing project by Artem
> Kachitchkine).

Currently, the DE subscribes to the sysevents corresponding to hotplug
events. Moving forward, we want to integrate this into fmd/libtopo so
that any fmd module can react to these events without having to manually
plumb up the underlying mechanism.
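
The insertion policy described earlier in this message (the same device always resilvers, a new device in the same slot depends on a per-pool toggle) can be written as a small decision function. This is a hedged Python sketch of the stated plan only: on_device_inserted, the autoreplace parameter name, and the returned strings are hypothetical, and identifying "the same device" by devid is an assumption.

```python
def on_device_inserted(expected_devid: str, inserted_devid: str,
                       autoreplace: bool) -> str:
    """Decide what to do when a device appears in a slot the pool was using.

    expected_devid: devid of the disk the pool last saw in this slot
    inserted_devid: devid of the disk that was just plugged in
    autoreplace:    the per-pool toggle (hypothetical name)
    """
    if inserted_devid == expected_devid:
        # Same disk came back: resilver regardless of the toggle.
        return "resilver"
    if autoreplace:
        # A different disk in the same slot, and the pool opted in.
        return "replace"
    # A different disk and no opt-in: leave it for the administrator.
    return "ignore"


same_disk = on_device_inserted("devid-A", "devid-A", autoreplace=False)
new_disk_opted_in = on_device_inserted("devid-A", "devid-B", autoreplace=True)
new_disk_default = on_device_inserted("devid-A", "devid-B", autoreplace=False)
```

In the "ignore" case an administrator could still bring the new disk into the pool by hand, for example with 'zpool replace'.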

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
