Is there any way to control the resilver speed? Having attached a third disk to a mirror (so I can replace the other disks with larger ones), the resilver goes at a fraction of the speed of the same operation using disk suite. However, it still renders the system pretty much unusable for anything else.

So I would like to control the rate of the resilver: either slow it down a lot so that the system is still usable, or tell it to go as fast as possible to get it over with.

Also, does the resilver deliberately pause? Running iostat I see that it will pause for five to ten seconds where no IO is done at all, then it continues on at a more reasonable pace.

-- This message posted from opensolaris.org
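To put numbers on the pauses described above, one option is to log per-second samples and look for runs with no IO. The following is only a minimal sketch: it assumes the samples have already been reduced to hypothetical (timestamp, operations per second) pairs, and it does not parse real iostat output.

```python
from typing import List, Tuple


def find_stalls(samples: List[Tuple[float, float]],
                min_seconds: float = 5.0) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps of idle runs lasting at least min_seconds.

    samples is a time-ordered list of (timestamp_seconds, ops_per_second).
    This layout is an assumption made for illustration only.
    """
    stalls = []
    idle_start = None
    for ts, ops in samples:
        if ops == 0:
            # Remember when the current idle run began.
            if idle_start is None:
                idle_start = ts
        else:
            # IO resumed: record the idle run if it was long enough.
            if idle_start is not None and ts - idle_start >= min_seconds:
                stalls.append((idle_start, ts))
            idle_start = None
    # An idle run that is still open at the end is closed at the last sample.
    if idle_start is not None and samples[-1][0] - idle_start >= min_seconds:
        stalls.append((idle_start, samples[-1][0]))
    return stalls


# One-second samples: a six-second gap, then a short two-second gap.
ops = [120, 0, 0, 0, 0, 0, 0, 130, 125, 0, 0, 140]
samples = [(float(t), float(o)) for t, o in enumerate(ops)]
stalls = find_stalls(samples)
```

With the default threshold only the longer gap is reported, which makes it easier to tell whether the stalls line up with other activity on the system.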
Chris Gerhard wrote:
> Is there any way to control the resilver speed? Having attached a third disk to a mirror (so I can replace the other disks with larger ones) the resilver goes at a fraction of the speed of the same operation using disk suite. However it still renders the system pretty much unusable for anything else.

Resilvers work at low priority in the ZFS scheduler. In general, they work at the media speed of the disk being resilvered. However, anecdotal evidence suggests that this may be impacted by the number and extent of snapshots. I have a lot of characterization data for resilvers, but none that varies the scope and number of snapshots (which is a hard thing to identify). ZFS resilvers in time sequence, not by disk block location, so there are many more variables at play here than might be immediately obvious.

> So I would like to control the rate of the resilver. Either slow it down a lot so that the system is still usable or tell it to go as fast as possible to get it over with.

There are two competing RFEs for this:
http://bugs.opensolaris.org/view_bug.do?bug_id=6592835
http://bugs.opensolaris.org/view_bug.do?bug_id=6494473

> Also does the resilver deliberately pause? Running iostat I see that it will pause for five to ten seconds where no IO is done at all, then it continues on at a more reasonable pace.

I have not seen such behaviour during resilver characterization. Which OS release are you using? Also, are you using IDE disks or disks which do not handle multiple outstanding operations?

You may also be seeing
http://bugs.opensolaris.org/view_bug.do?bug_id=6729696

-- richard
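The remark that ZFS resilvers in time sequence rather than by block location can be illustrated with a toy model. This is a sketch under loud assumptions, not a description of the ZFS implementation: block offsets are drawn at random to mimic a heavily fragmented pool, and "seek distance" is just the sum of jumps between consecutive offsets. All names here are hypothetical.

```python
import random


def total_seek_distance(offsets):
    """Sum of absolute jumps between consecutively visited offsets."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


def compare_orders(num_blocks=1000, disk_size=1_000_000, seed=0):
    """Return (distance in birth-time order, distance in offset order)."""
    rng = random.Random(seed)
    # Each block is (birth_time, offset). Random offsets are an assumption
    # standing in for a pool whose data has been rewritten many times.
    blocks = [(birth, rng.randrange(disk_size)) for birth in range(num_blocks)]
    by_time = [offset for _, offset in sorted(blocks)]
    by_offset = sorted(offset for _, offset in blocks)
    return total_seek_distance(by_time), total_seek_distance(by_offset)


time_order, offset_order = compare_orders()
```

In this model the offset-ordered walk never travels further than the span of the disk, while the time-ordered walk jumps around on every block. On a freshly written pool the two orders would be much closer, which is one reason resilver times vary so much between systems.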
Thanks

Richard Elling wrote:
> Also, are you using IDE disks or disks which do not handle multiple
> outstanding operations?

SATA with the cmdk driver, which is only sending 2 commands at a time.

> You may also be seeing
> http://bugs.opensolaris.org/view_bug.do?bug_id=6729696

That could well be the case. Fortunately none of the users would know how to run the sync command.

--
Chris Gerhard
Systems TSC Chief Technologist
Sun Microsystems Limited
Phone: +44 (0) 1252 426033 (ext 26033)
http://blogs.sun.com/chrisg
On Fri, 2008-09-05 at 09:41 -0700, Richard Elling wrote:
> > Also does the resilver deliberately pause? Running iostat I see
> > that it will pause for five to ten seconds where no IO is done at all,
> > then it continues on at a more reasonable pace.
>
> I have not seen such behaviour during resilver characterization.

I have, post nv_94, and I filed a bug:

6729696 sync causes scrub or resilver to pause for up to 30s

- Bill
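For a rough feel of what bug 6729696 could cost, here is some back-of-the-envelope arithmetic. It is a worst-case sketch only: it assumes every sync stalls the resilver for the full 30 seconds quoted in the bug title and that the pauses never overlap. Neither assumption comes from measurements in this thread, and the function names are made up for illustration.

```python
def stalled_fraction(syncs_per_hour, pause_seconds=30.0):
    """Worst-case share of each hour the resilver spends paused (capped at 1)."""
    return min(1.0, syncs_per_hour * pause_seconds / 3600.0)


def stretched_duration(base_hours, syncs_per_hour, pause_seconds=30.0):
    """Worst-case resilver time, given its duration with no syncs at all."""
    fraction = stalled_fraction(syncs_per_hour, pause_seconds)
    if fraction >= 1.0:
        # Paused all the time: under these assumptions it never finishes.
        return float("inf")
    return base_hours / (1.0 - fraction)


# Twelve syncs an hour would stall the resilver for at most 10% of the time.
fraction = stalled_fraction(12)
hours = stretched_duration(9.0, 12)
```

Under these assumptions an occasional sync is a nuisance rather than a disaster, but something calling sync every 30 seconds or more often could keep a resilver from making progress.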
This might be a dumb question, but isn't that going to be a big problem on a busy pool? I'm planning to use ZFS as a sync NFS fileserver and I'm expecting a fair bit of traffic. Does that bug mean ZFS is going to have problems resilvering failed disks on my system?
bounce

Can anybody confirm how bug 6729696 is going to affect a busy system running synchronous NFS shares? Is the sync activity from NFS going to be enough to prevent resilvering from ever working, or have I misunderstood this bug?

thanks,

Ross
On Wed, Oct 8, 2008 at 10:29 AM, Ross <myxiplx at hotmail.com> wrote:
> bounce
>
> Can anybody confirm how bug 6729696 is going to affect a busy system running synchronous NFS shares? Is the sync activity from NFS
> going to be enough to prevent resilvering from ever working, or have I mis-understood this bug?

A synchronous write will not trigger a sync. The ZIL is used for synchronous writes.
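The distinction drawn here is between making one file's data durable and asking the system to flush everything. A minimal sketch of the two from an application's point of view, using Python's standard `os` module on a Unix-like system (the helper names are hypothetical, and how each call is serviced depends on the filesystem):

```python
import os


def write_synchronously(path, data):
    """Write data and wait until this one file is on stable storage.

    This is the per-file kind of synchronous write which, per the reply
    above, is handled through the ZIL on ZFS.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush this file descriptor only
    finally:
        os.close(fd)


def flush_everything():
    """System-wide flush: what the sync command asks for via sync(2)."""
    os.sync()
```

Only the second function corresponds to the `sync` named in the bug title. A server issuing synchronous writes is doing the equivalent of the first.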