Hi.

What do you think about adding functionality similar to a disk's spare
sectors: if a sector dies, a new one is assigned from the spare sector pool?
This would be very helpful, especially for laptops, where you have only one
disk. I simulated returning EIO for one sector from a one-disk pool and, as
you know, the system panicked:

panic: ZFS: I/O failure (write on <unknown> off 0: zio 0xc436d400 [L0 zvol object] 2000L/2000P DVA[0]=<0:4000:2000> fletcher2 uncompressed LE contiguous birth=11 fill=1 cksum=90519dcb617667ac:e96316f8a73d7efc:8ca812fc04509f9b:9b9632c6959cbd71): error 5

From what I saw, ZFS retried the write to this sector once before panicking,
but why not just try another block? And maybe remember the problematic block
somewhere. Of course this won't save us when a read operation fails, but it
should work quite well for writes.

I'm not sure how vdev_mirror works exactly, i.e. whether it needs both mirror
components to be identical, or whether the only guarantee is that they hold
the same data, but not necessarily in the same place. If the latter, the
proposed mechanism could also be used as part of the self-healing process, I
think.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                          http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
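[To make the write-reallocation idea above concrete, here is a minimal,
self-contained userspace sketch in C. It is not ZFS or vdev code and every
name, constant and data structure in it is invented for illustration: on EIO
the block is remembered as bad and the data is retried on a block taken from
a small spare pool.]

/* spare_write.c - toy model of write reallocation, not ZFS code. */
#include <stdio.h>
#include <string.h>
#include <errno.h>

#define NBLOCKS   16              /* "normal" blocks */
#define NSPARE     4              /* spare pool */
#define BLOCKSIZE 512

static char disk[NBLOCKS + NSPARE][BLOCKSIZE];
static int  bad[NBLOCKS + NSPARE];   /* simulated media errors */
static int  remap[NBLOCKS];          /* bad block -> spare block */
static int  next_spare = NBLOCKS;

/* Simulated low-level write: fails with EIO on blocks marked bad. */
static int
dev_write(int blk, const char *buf)
{
	if (bad[blk])
		return (EIO);
	memcpy(disk[blk], buf, BLOCKSIZE);
	return (0);
}

/* Write with reallocation: on EIO, remember the bad block, retry on a spare. */
static int
write_block(int blk, const char *buf)
{
	int err = dev_write(blk, buf);

	if (err == EIO && next_spare < NBLOCKS + NSPARE) {
		remap[blk] = next_spare++;        /* remember the bad block */
		err = dev_write(remap[blk], buf); /* retry on the spare */
	}
	return (err);
}

int
main(void)
{
	char buf[BLOCKSIZE] = "hello";
	int err;

	bad[3] = 1;                               /* inject a failing "sector" */
	err = write_block(3, buf);
	printf("write blk 3 -> error %d, remapped to blk %d\n", err, remap[3]);
	return (0);
}

[The "remember" step here is just an in-memory remap table; Pawel's suggestion
to "remember the problematic block somewhere" would additionally need that
information to be persistent and kept out of future allocations.]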
Hello Pawel,

Thursday, April 5, 2007, 3:10:11 PM, you wrote:

PJD> I'm not sure how vdev_mirror works exactly, i.e. whether it needs both
PJD> mirror components to be identical, or whether the only guarantee is that
PJD> they hold the same data, but not necessarily in the same place. If the
PJD> latter, the proposed mechanism could also be used as part of the
PJD> self-healing process, I think.

IIRC this was discussed here some time ago (check the archives), and it was
stated that in the mirror case ZFS does not require a block to be at the same
position on all mirror components. Also, in the read scenario, if you can't
read a block you try to self-heal it (that already happens today), and if you
still can't write the repaired data back, you write it to a new location. I
don't know how this works with raid-z*, though.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
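[As a rough illustration of the read/self-heal flow Robert describes, here is
a toy C sketch; the names and the two-copy model are invented and this is not
the real vdev_mirror code. Read one copy, fall back to the other if it fails,
try to repair the bad copy in place, and write to a new location only if that
repair write fails as well.]

/* selfheal_read.c - toy model of mirror read with self-heal, not ZFS code. */
#include <stdio.h>
#include <string.h>
#include <errno.h>

#define NBLOCKS 8
#define BS      64

static char mcopy[2][NBLOCKS][BS];  /* two mirror "components" */
static int  bad[2][NBLOCKS];        /* simulated unreadable/unwritable blocks */
static int  next_free = 4;          /* next unused block, for reallocation */

static int
copy_read(int side, int blk, char *buf)
{
	if (bad[side][blk])
		return (EIO);
	memcpy(buf, mcopy[side][blk], BS);
	return (0);
}

static int
copy_write(int side, int blk, const char *buf)
{
	if (bad[side][blk])
		return (EIO);
	memcpy(mcopy[side][blk], buf, BS);
	return (0);
}

/* Read one copy, fall back to the other, then try to repair the bad copy. */
static int
mirror_read(int blk, char *buf)
{
	if (copy_read(0, blk, buf) == 0)
		return (0);
	if (copy_read(1, blk, buf) != 0)
		return (EIO);                    /* both copies are gone */
	if (copy_write(0, blk, buf) != 0)        /* self-heal copy 0 in place... */
		copy_write(0, next_free++, buf); /* ...or put the data elsewhere */
	return (0);
}

int
main(void)
{
	char buf[BS];
	int err;

	strcpy(mcopy[0][2], "data");
	strcpy(mcopy[1][2], "data");
	bad[0][2] = 1;                           /* copy 0 of block 2 fails */
	err = mirror_read(2, buf);
	printf("read -> error %d, data \"%s\"\n", err, buf);
	return (0);
}

[The only point of the sketch is the ordering: read the surviving copy first,
repair in place if possible, and fall back to a new location only when the
repair write itself fails.]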
This sounds a lot like:

6417779 ZFS: I/O failure (write on ...) -- need to reallocate writes

which would allow us to retry write failures on alternate vdevs.

- Eric

On Thu, Apr 05, 2007 at 03:10:11PM +0200, Pawel Jakub Dawidek wrote:
> From what I saw, ZFS retried the write to this sector once before panicking,
> but why not just try another block? And maybe remember the problematic block
> somewhere. Of course this won't save us when a read operation fails, but it
> should work quite well for writes.

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
> This sounds a lot like:
>
> 6417779 ZFS: I/O failure (write on ...) -- need to reallocate writes
>
> which would allow us to retry write failures on alternate vdevs.

Of course, if there's only one vdev, the write should be retried to a
different block on the original vdev ... right?

Anton
Anton B. Rang wrote:
>> This sounds a lot like:
>>
>> 6417779 ZFS: I/O failure (write on ...) -- need to reallocate writes
>>
>> which would allow us to retry write failures on alternate vdevs.
>
> Of course, if there's only one vdev, the write should be retried to a
> different block on the original vdev ... right?

Yes, although it depends on the nature of the write failure. If the write
failed because the device is no longer available, ZFS will not continue to
try different blocks.

-Mark
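[To spell out the distinction Mark makes, here is a hedged C sketch with
invented names; the real zio error handling is more involved, and it is an
assumption here that a missing device is reported as ENXIO/ENODEV while a bad
sector is reported as EIO.]

/* write_error.c - sketch of "retry depends on the failure", invented names. */
#include <stdio.h>
#include <errno.h>

enum write_disposition {
	RETRY_NEW_BLOCK,    /* bad sector: worth trying another location */
	FAIL_DEVICE         /* device gone: nothing left to retry against */
};

static enum write_disposition
classify_write_error(int err)
{
	switch (err) {
	case EIO:           /* media error on one block */
		return (RETRY_NEW_BLOCK);
	case ENXIO:         /* device no longer present */
	case ENODEV:
	default:
		return (FAIL_DEVICE);
	}
}

int
main(void)
{
	printf("EIO   -> %s\n", classify_write_error(EIO) == RETRY_NEW_BLOCK ?
	    "retry elsewhere" : "fail");
	printf("ENXIO -> %s\n", classify_write_error(ENXIO) == RETRY_NEW_BLOCK ?
	    "retry elsewhere" : "fail");
	return (0);
}

[A single bad sector leaves the rest of the device usable, so retrying at
another location can succeed; if the whole device is gone there is no other
location on that vdev to retry, which is the case Mark describes.]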
Mark Maybee wrote:
> Anton B. Rang wrote:
>> Of course, if there's only one vdev, the write should be retried to a
>> different block on the original vdev ... right?
>
> Yes, although it depends on the nature of the write failure. If the write
> failed because the device is no longer available, ZFS will not continue to
> try different blocks.

So if I unplug a USB hard drive at an inopportune time, ZFS will panic
Solaris? Or to put it differently, if the power goes away from the disk
before it goes away from the operating system, things will go badly?

Darren