thr3ads.net - zfs discuss - [zfs-discuss] Re: I/O write failures on non-replicated pool [Aug 2006]

Phi Tran

2006-Aug-10 21:55 UTC

[zfs-discuss] Re: I/O write failures on non-replicated pool

I remember a discussion about I/O write failures causing a panic for a
non-replicated pool and a plan to fix this in the future.  I couldn't
find a bug for this work though.  Is there still a plan to fix this?

Phi

Eric Schrock

2006-Aug-10 22:03 UTC

[zfs-discuss] Re: I/O write failures on non-replicated pool

Yes, there are three incremental fixes that we plan in this area:

6417772 need nicer message on write failure

	This just cleans up the failure mode so that we get a nice
	FMA failure message and can distinguish this from a random
	failed assert.

6417779 ZFS: I/O failure (write on ...) -- need to reallocate writes

	In a multi-vdev pool, this would take a failed write and attempt
	to do the write on another top-level vdev.  This would all but
	eliminate the problem for multi-vdev pools.

6322646 ZFS should gracefully handle all devices failing (when writing)

	This is the "real" fix.  Unfortunately, it's also really hard.
	Even if we manage to abort the current transaction group,
	dealing with the semantics of a filesystem which has lost an
	arbitrary amount of change and notifying the user in a
	meaningful way is difficult at best.
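
To make the second and third items concrete, here is an illustrative sketch in Python. It is not ZFS code: the `Vdev` class, the `WriteError` and `AllVdevsFailed` exceptions, and `write_with_reallocation` are all hypothetical names invented for this example. It only models the idea of retrying a failed write on another top-level vdev, and shows where the "all devices failing" case begins.

```python
class WriteError(Exception):
    """Hypothetical: raised by a vdev when a write fails."""


class AllVdevsFailed(Exception):
    """Hypothetical: raised when no top-level vdev accepted the write."""


class Vdev:
    """Toy stand-in for a top-level vdev (assumption, not the ZFS structure)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.blocks = {}

    def write(self, block_id, data):
        if not self.healthy:
            raise WriteError(f"write failed on {self.name}")
        self.blocks[block_id] = data


def write_with_reallocation(vdevs, preferred, block_id, data):
    """Try the preferred vdev first, then each other top-level vdev in turn.

    Returns the name of the vdev that accepted the write. If every vdev
    fails, raises AllVdevsFailed, which is the case the third bug covers.
    """
    n = len(vdevs)
    for offset in range(n):
        vdev = vdevs[(preferred + offset) % n]
        try:
            vdev.write(block_id, data)
            return vdev.name
        except WriteError:
            continue  # reallocate: move on to the next top-level vdev
    raise AllVdevsFailed(f"block {block_id}: all {n} top-level vdevs failed")
```

In a pool with a single top-level vdev the loop has nowhere else to go, which is why reallocation alone does not help that configuration.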

Hope that helps.

- Eric


On Thu, Aug 10, 2006 at 02:55:51PM -0700, Phi Tran wrote:
> I remember a discussion about I/O write failures causing a panic for a
> non-replicated pool and a plan to fix this in the future.  I couldn't
> find a bug for this work though.  Is there still a plan to fix this?
> 
> Phi
> 
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock

Phi Tran

2006-Aug-10 22:16 UTC

[zfs-discuss] Re: I/O write failures on non-replicated pool

Thanks for the list.

Phi

Eric Schrock wrote:
> Yes, there are three incremental fixes that we plan in this area:
> 
> 6417772 need nicer message on write failure
> 
> 	This just cleans up the failure mode so that we get a nice
> 	FMA failure message and can distinguish this from a random
> 	failed assert.
> 
> 6417779 ZFS: I/O failure (write on ...) -- need to reallocate writes
> 
> 	In a multi-vdev pool, this would take a failed write and attempt
> 	to do the write on another top-level vdev.  This would all but
> 	eliminate the problem for multi-vdev pools.
> 
> 6322646 ZFS should gracefully handle all devices failing (when writing)
> 
> 	This is the "real" fix.  Unfortunately, it's also really hard.
> 	Even if we manage to abort the current transaction group, 
> 	dealing with the semantics of a filesystem which has lost an
> 	arbitrary amount of change and notifying the user in a
> 	meaningful way is difficult at best.
> 
> Hope that helps.
> 
> - Eric
> 
> 
> On Thu, Aug 10, 2006 at 02:55:51PM -0700, Phi Tran wrote:
> 
>>I remember a discussion about I/O write failures causing a panic for a
>>non-replicated pool and a plan to fix this in the future.  I couldn't
>>find a bug for this work though.  Is there still a plan to fix this?
>>
>>Phi
>>
>>_______________________________________________
>>zfs-discuss mailing list
>>zfs-discuss at opensolaris.org
>>http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> 
> 
> --
> Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock

Nigel Smith

2007-Oct-25 12:02 UTC

[zfs-discuss] I/O write failures on non-replicated pool

Nice to see some progress, at last, on this bug:
http://bugs.opensolaris.org/view_bug.do?bug_id=6417779
"ZFS: I/O failure (write on ...) -- need to reallocate writes"

Commit to Fix:	 snv_77

http://www.opensolaris.org/os/community/arc/caselog/2007/567/onepager/

http://mail.opensolaris.org/pipermail/onnv-notify/2007-October/012782.html

Regards
Nigel Smith
 
 
This message posted from opensolaris.org

Robert Milkowski

2007-Oct-29 17:47 UTC

[zfs-discuss] I/O write failures on non-replicated pool

Hello Nigel,

Thursday, October 25, 2007, 12:02:04 PM, you wrote:

NS> Nice to see some progress, at last, on this bug:
NS> http://bugs.opensolaris.org/view_bug.do?bug_id=6417779
NS> "ZFS: I/O failure (write on ...) -- need to reallocate writes"

NS> Commit to Fix:   snv_77

NS> http://www.opensolaris.org/os/community/arc/caselog/2007/567/onepager/

NS> http://mail.opensolaris.org/pipermail/onnv-notify/2007-October/012782.html

Thanks for spotting this.
Looking at the one-pager, it's not obvious what would happen if one
top-level vdev fails: will it wait, or will it use ditto blocks to
write the data to another device, as suggested in the bug (although
I'm not sure that's a good idea)?


-- 
Best regards,
 Robert                            mailto:rmilkowski at task.gda.pl
                                       http://milek.blogspot.com
