I have a pool composed of a single raidz2 vdev, which is currently
degraded (missing a disk):

config:

        NAME         STATE     READ WRITE CKSUM
        pool         DEGRADED     0     0     0
          raidz2     DEGRADED     0     0     0
            c8d1     ONLINE       0     0     0
            c8d0     ONLINE       0     0     0
            c12t4d0  ONLINE       0     0     0
            c12t3d0  ONLINE       0     0     0
            c12t2d0  ONLINE       0     0     0
            c12t0d0  OFFLINE      0     0     0
        logs
          c10d0      ONLINE       0     0     0

errors: No known data errors

I have it scheduled for periodic scrubs via root's crontab:

  20 2 1 * * /usr/sbin/zpool scrub pool

but this scrub was kicked off manually. Last night I checked its
status and saw:

  scrub: scrub in progress for 20h32m, 100.00% done, 0h0m to go

This morning I see:

  scrub: scrub in progress for 31h10m, 100.00% done, 0h0m to go

It's 100% done, yet it hasn't finished in 10 hours! "zpool iostat -v
pool 10" shows it doing between 50 and 120 MB/s of reads, while the
userspace applications are only doing a few megabytes per second of
I/O, as measured by the DTraceToolkit script "rwtop" ("app_r: 4469 KB,
app_w: 4579 KB").

What can cause this kind of behavior, and how can I make my pool
finish scrubbing?

Will
Hi,

I've noticed that the counters do not get updated if the amount of
data increases during a scrub or resilver: if an application has
written new data during the scrub, the counter will not give a
realistic estimate. This happens with both resilvering and scrubbing;
could somebody fix this?

Yours
Markus Kovero

-----Original Message-----
From: zfs-discuss-bounces at opensolaris.org On Behalf Of Will Murnane
Sent: September 7, 2009 16:42
To: ZFS Mailing List
Subject: [zfs-discuss] This is the scrub that never ends...
Looks like this bug:

http://bugs.opensolaris.org/view_bug.do?bug_id=6655927

Workaround: Don't run zpool status as root.

--chris
Hello Will,

On Sep 7, 2009, at 3:42 PM, Will Murnane wrote:

> What can cause this kind of behavior, and how can I make my pool
> finish scrubbing?

No idea what is causing this, but did you try to stop the scrub? If
so, what happened? (It might not be a good idea, since this is not a
normal state.) What release of OpenSolaris are you running?

Maybe this could be of interest, though it is a duplicate and should
have been fixed in snv_110: "running zpool scrub twice hangs the
scrub".

Regards
Henrik
http://sparcv9.blogspot.com
On Mon, Sep 7, 2009 at 12:05, Chris Gerhard <chris.gerhard at sun.com> wrote:
> Looks like this bug:
>
> http://bugs.opensolaris.org/view_bug.do?bug_id=6655927
>
> Workaround: Don't run zpool status as root.
I'm not, and yet the scrub continues. To be more specific, here's a
complete current interaction with zpool status:

will@box:~$ zpool status pool
  pool: pool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Online the device using 'zpool online' or replace the device
        with 'zpool replace'.
 scrub: scrub in progress for 39h37m, 100.00% done, 0h0m to go
config:

        NAME         STATE     READ WRITE CKSUM
        pool         DEGRADED     0     0     0
          raidz2     DEGRADED     0     0     0
            c8d1     ONLINE       0     0     0
            c8d0     ONLINE       0     0     0
            c12t4d0  ONLINE       0     0     0
            c12t3d0  ONLINE       0     0     0
            c12t2d0  ONLINE       0     0     0
            c12t0d0  OFFLINE      0     0     0
        logs
          c10d0      ONLINE       0     0     0

errors: No known data errors
will@box:~$

Running the same command again immediately shows the same thing. In
other words, the scrub is not restarting, just never finishing.
iostat shows this:

    r/s    w/s    kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  303.9    0.0 12380.2    0.0 33.0  2.0  108.5    6.6 100 100 c8d0
    0.0    0.0     0.0    0.0  0.0  0.0    0.0    0.0   0   0 c9d0
    0.0    0.0     0.0    0.0  0.0  0.0    0.0    0.0   0   0 c10d0
    0.0    0.0     0.0    0.0  0.0  0.0    0.0    0.0   0   0 c7d1
  303.9    0.0 12348.2    0.0 33.0  2.0  108.5    6.6 100 100 c8d1
  366.9    0.0 13627.8    0.0  0.0  4.6    0.0   12.5   0  51 c12t2d0
  351.9    0.0 12956.0    0.0  0.0  4.3    0.0   12.2   0  58 c12t3d0
  369.9    0.0 13787.8    0.0  0.0  6.8    0.0   18.3   0  72 c12t4d0

while rwtop shows about 3 MB/s to and from applications.

Will
On Mon, Sep 7, 2009 at 15:59, Henrik Johansson <henrikj at henkis.net> wrote:
> Hello Will,
> On Sep 7, 2009, at 3:42 PM, Will Murnane wrote:
>
>> What can cause this kind of behavior, and how can I make my pool
>> finish scrubbing?
>
> No idea what is causing this but did you try to stop the scrub?
I haven't done so yet. Perhaps that would be a reasonable next step.
I could run zpool status as root and see if that triggers the
"restart-scrub" bug. I don't mind scrubbing my data, but I do mind
getting stuck in "scrub-forever" mode.

> If so what happened? (Might not be a good idea since this is not a
> normal state?) What release of OpenSolaris are you running?
$ uname -a
SunOS will-fs 5.11 snv_118 i86pc i386 i86xpv
I can update to latest /dev if someone can suggest a reason why that
might help. Otherwise I'm sort of once-bitten twice-shy on upgrading
for fun.

> Maybe this could be of interest, but it is a duplicate and it should
> have been fixed in snv_110: running zpool scrub twice hangs the scrub
Interesting. Note my crontab entry doesn't have any protection
against this, so perhaps this bug is back in a different form now.
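For what it's worth, wrapping the cron job along these lines would
keep a second scrub from being started while one is still running (a
sketch, untested; it assumes the "scrub in progress" wording that this
version of zpool prints):

  #!/bin/sh
  # Start a scrub only if one is not already in progress on the pool.
  POOL=pool
  if /usr/sbin/zpool status "$POOL" | grep "scrub in progress" > /dev/null 2>&1
  then
      # A scrub is already running; don't restart it.
      exit 0
  fi
  exec /usr/sbin/zpool scrub "$POOL"

The crontab entry would then run the script instead of calling zpool
scrub directly.

Will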
I left the scrub running all day:
 scrub: scrub in progress for 67h57m, 100.00% done, 0h0m to go
but as you can see, it didn't finish. So I ran pkg image-update,
rebooted, and am now running b122. On reboot, the scrub restarted
from the beginning, and currently estimates 17h to go. I'll post an
update in about 17 hours ;)
On Tue, Sep 8, 2009 at 10:24 PM, Will Murnane <will.murnane at gmail.com> wrote:
> I left the scrub running all day:
>  scrub: scrub in progress for 67h57m, 100.00% done, 0h0m to go
> but as you can see, it didn't finish. So I ran pkg image-update,
> rebooted, and am now running b122. On reboot, the scrub restarted
> from the beginning, and currently estimates 17h to go. I'll post an
> update in about 17 hours ;)

Might wanna be careful with b122. There are issues with raid-z
raidsets producing phantom checksum errors.

--Tim
On Wed, Sep 9, 2009 at 03:27, Tim Cook <tim at cook.ms> wrote:
>> I left the scrub running all day:
>>  scrub: scrub in progress for 67h57m, 100.00% done, 0h0m to go
>> but as you can see, it didn't finish. So I ran pkg image-update,
>> rebooted, and am now running b122. On reboot, the scrub restarted
>> from the beginning, and currently estimates 17h to go. I'll post an
>> update in about 17 hours ;)
Some hours later, here I am again:
 scrub: scrub in progress for 18h24m, 100.00% done, 0h0m to go
Any suggestions?

> Might wanna be careful with b122. There are issues with raid-z
> raidsets producing phantom checksum errors.
This is a dual-parity pool, which doesn't have the rotating parity
that a raidz1 pool does. Thus it's safe from the particular type of
complaints known of in b122.

Will
On Wed, 2009-09-09 at 21:30 +0000, Will Murnane wrote:
> Some hours later, here I am again:
>  scrub: scrub in progress for 18h24m, 100.00% done, 0h0m to go
> Any suggestions?

Let it run for another day.

A pool on a build server I manage takes about 75-100 hours to scrub,
but typically starts reporting "100.00% done, 0h0m to go" at about the
50-60 hour point.

I suspect the combination of frequent time-based snapshots and a
pretty active set of users causes the progress estimate to be off.
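If you want a record of when yours actually finishes, a loop like
this, left running in the background, will keep timestamped status
lines (a sketch; "pool" is the pool name from this thread, and the
600-second interval is arbitrary):

  while /usr/sbin/zpool status pool | grep "scrub in progress" > /dev/null 2>&1
  do
      # Log a timestamp plus the current scrub line, then wait.
      ( date; /usr/sbin/zpool status pool | grep "scrub:" ) >> /tmp/scrub.log
      sleep 600
  done
  # Capture the final "scrub completed" line as well.
  ( date; /usr/sbin/zpool status pool | grep "scrub:" ) >> /tmp/scrub.log

					- Bill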
On Sep 9, 2009, at 9:29 PM, Bill Sommerfeld wrote:
> A pool on a build server I manage takes about 75-100 hours to scrub,
> but typically starts reporting "100.00% done, 0h0m to go" at about
> the 50-60 hour point.
>
> I suspect the combination of frequent time-based snapshots and a
> pretty active set of users causes the progress estimate to be off.

Out of curiosity - do you have a lot of small files in the filesystem?

zdb -s <pool> might be interesting to observe too

---
.je

(oh, and thanks for the subject line .. now I've had this song stuck
in my head for a couple days :P)
On Thu, Sep 10, 2009 at 11:11, Jonathan Edwards
<Jonathan.Edwards at sun.com> wrote:
> Out of curiosity - do you have a lot of small files in the filesystem?
Most of the space in the filesystem is taken by a few large files, but
most of the files in the filesystem are small. For example, I have my
recorded TV collection on this pool, which is 900 files that add up to
400GB, but I also have a full mirror of Ubuntu--164k files in 131GB.
So yes, I have many small files, depending on your metric. There are
about 570k files on the filesystem overall.

> zdb -s <pool> might be interesting to observe too
# zdb -s pool
                          capacity    operations    bandwidth  ---- errors ----
description             used avail   read  write   read write  read write cksum
pool                   3.92T 1.52T      8      0   151K     0     0     0     0
  raidz2               3.92T 1.52T      6      0  8.96K     0     0     0     0
    /dev/dsk/c8d1s0                     2      0   134K     0     0     0     0
    /dev/dsk/c8d0s0                     2      0   153K     0     0     0     0
    /dev/dsk/c12t4d0s0                  2      0   163K     0     0     0     0
    /dev/dsk/c12t3d0s0                  3      0   183K     0     0     0     0
    /dev/dsk/c12t2d0s0                  2      0   168K     0     0     0     0
    /dev/dsk/c12t0d0s0                  0      0      0     0     0     0     0
  log
    /dev/dsk/c10d0s0   52.0K 1008M      1      0   142K     0     0     0     0

> (oh, and thanks for the subject line .. now I've had this song stuck
> in my head for a couple days :P)
I'm sorry. I suggest finding a trump song--something which gets more
firmly ingrained than Lamb Chop. I have no advice on how to get that
song out of your head, though.

Will
On Wed, Sep 9, 2009 at 21:29, Bill Sommerfeld <sommerfeld at sun.com> wrote:
>> Any suggestions?
>
> Let it run for another day.
I'll let it keep running as long as it wants this time.

> I suspect the combination of frequent time-based snapshots and a
> pretty active set of users causes the progress estimate to be off.
I'm the main user of this box, and automatic snapshots are off.

Will
On Thu, Sep 10, 2009 at 13:06, Will Murnane <will.murnane at gmail.com> wrote:
> On Wed, Sep 9, 2009 at 21:29, Bill Sommerfeld <sommerfeld at sun.com> wrote:
>>> Any suggestions?
>>
>> Let it run for another day.
> I'll let it keep running as long as it wants this time.
 scrub: scrub completed after 42h32m with 0 errors on Thu Sep 10 17:20:19 2009

And the people rejoiced. So I guess the issue is more "scrubs may
report ETA very inaccurately" than "scrubs never finish". Thanks for
the suggestions and support.

Will
On Fri, 2009-09-11 at 13:51 -0400, Will Murnane wrote:
> scrub: scrub completed after 42h32m with 0 errors on Thu Sep 10 17:20:19 2009
>
> And the people rejoiced. So I guess the issue is more "scrubs may
> report ETA very inaccurately" than "scrubs never finish". Thanks for
> the suggestions and support.

One of my pools routinely does this -- the scrub gets to 100% after
about 50 hours but keeps going for another day or more after that.

It turns out that zpool reports "number of blocks visited" vs "number
of blocks allocated", but clamps the ratio at 100%. If there is
substantial turnover in the pool, it appears you may end up needing to
visit more blocks than are actually allocated at any one point in
time.

I made a modified version of the zpool command, and this is what it
prints for me:

...
 scrub: scrub in progress for 74h25m, 119.90% done, 0h0m to go
	 5428197411840 blocks examined, 4527262118912 blocks allocated
...

This is the (trivial) source change I made to see what's going on
under the covers:

diff -r 12fb4fb507d6 usr/src/cmd/zpool/zpool_main.c
--- a/usr/src/cmd/zpool/zpool_main.c	Mon Oct 26 22:25:39 2009 -0700
+++ b/usr/src/cmd/zpool/zpool_main.c	Tue Nov 10 17:07:59 2009 -0500
@@ -2941,12 +2941,15 @@
 	if (examined == 0)
 		examined = 1;
 
-	if (examined > total)
-		total = examined;
 	fraction_done = (double)examined / total;
 
-	minutes_left = (uint64_t)((now - start) *
-	    (1 - fraction_done) / fraction_done / 60);
+	if (fraction_done < 1) {
+		minutes_left = (uint64_t)((now - start) *
+		    (1 - fraction_done) / fraction_done / 60);
+	} else {
+		minutes_left = 0;
+	}
+
 	minutes_taken = (uint64_t)((now - start) / 60);
 
 	(void) printf(gettext("%s in progress for %lluh%um, %.2f%% done, "
@@ -2954,6 +2957,9 @@
 	    scrub_type, (u_longlong_t)(minutes_taken / 60),
 	    (uint_t)(minutes_taken % 60), 100 * fraction_done,
 	    (u_longlong_t)(minutes_left / 60), (uint_t)(minutes_left % 60));
+	(void) printf(gettext("\t %lld blocks examined, %lld blocks allocated\n"),
+	    examined,
+	    total);
 }
 
 static void
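As a sanity check, the ratio of the two block counts printed above
reproduces the unclamped percentage (just arithmetic, via awk):

  $ echo "5428197411840 4527262118912" | awk '{ printf("%.2f%% done\n", 100 * $1 / $2) }'
  119.90% done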