Hi list,

Here's my case :

  pool: mypool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 147h19m, 100.00% done, 0h0m to go
config:

        NAME              STATE     READ WRITE CKSUM
        filerbackup13     DEGRADED     0     0     0
          raidz2          DEGRADED     0     0     0
            c0t8d0        ONLINE       0     0     0
            replacing     DEGRADED     0     0     0
              c0t9d0      OFFLINE      0     0     0
              c0t23d0     ONLINE       0     0     0  454G resilvered
            c0t10d0       ONLINE       0     0     0
            c0t11d0       ONLINE       0     0     0
            c0t12d0       ONLINE       0     0     0
            c0t13d0       ONLINE       0     0     0
            c0t14d0       ONLINE       0     0     0
            c0t15d0       ONLINE       0     0     0
            c0t16d0       ONLINE       0     0     0
            c0t17d0       ONLINE       0     0     0
            c0t18d0       ONLINE       0     0     0
            c0t19d0       ONLINE       0     0     0
            c0t20d0       ONLINE       0     0     0
            c0t21d0       ONLINE       0     0     0
            c0t22d0       ONLINE       0     0     0

After launching the replace command, I had to offline c0t9d0 because it
was generating too many warnings and slowing down I/O.

Now the replace seems to be finished, but zpool status still displays
"replacing", and according to the scrub status the resilver seems to
continue?

Any idea how to clarify this situation?

Thanks.

--
Francois
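For reference, the replace/offline sequence described above would typically
look something like the following; the exact commands are an assumption (only
the pool and device names come from the status output):

    # Presumed sequence (not quoted from the post): replace the failing disk,
    # then take it offline once its errors start hurting I/O.
    zpool replace mypool c0t9d0 c0t23d0   # resilver c0t9d0's data onto c0t23d0
    zpool offline mypool c0t9d0           # stop issuing I/O to the noisy source disk
    zpool status -v mypool                # the "replacing" vdev shows resilver progress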
> After launching the replace command, I had to offline c0t9d0 because
> it was generating too many warnings and slowing down I/O.
>
> Now the replace seems to be finished, but zpool status still displays
> "replacing", and according to the scrub status the resilver seems to
> continue?
>
> Any idea how to clarify this situation?

I've seen this happen before, and the resilvering (or scrub) finished
after a while - an hour or so. Watching iostat -xd showed high I/O
traffic (without much coming from the users).

- What sort of drives are you using?
- How long has the pool been at '100% done' while still resilvering?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.
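For example, something like the following (the interval and count are
arbitrary choices, not from the post) shows whether the disks are still
busy with resilver traffic:

    # Extended per-device statistics, sampled every 10 seconds, 6 samples.
    # Sustained high actv / %b on the raidz2 members with little user load
    # usually means the resilver is still running despite "100.00% done".
    iostat -xd 10 6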
If you have one zpool consisting of only one large raidz2, then you have
a slow raid. To reach high speed, you need at most 8 drives in each
raidz2. So one of the reasons it takes so long is that you have too many
drives in your raidz2. Everything would be much faster if you split your
zpool into two raidz2 vdevs, each consisting of 7 or 8 drives. Then it
would be fast.

--
This message posted from opensolaris.org
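A minimal sketch of that layout, using the disk names from the status
output purely as placeholders, would be two 8-disk raidz2 top-level vdevs
instead of one 15-wide vdev:

    # Hypothetical pool creation with two smaller raidz2 vdevs; ZFS stripes
    # across the two vdevs, and each vdev resilvers independently.
    zpool create mypool \
        raidz2 c0t8d0  c0t9d0  c0t10d0 c0t11d0 c0t12d0 c0t13d0 c0t14d0 c0t15d0 \
        raidz2 c0t16d0 c0t17d0 c0t18d0 c0t19d0 c0t20d0 c0t21d0 c0t22d0 c0t23d0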
----- Original Message -----
> If you have one zpool consisting of only one large raidz2, then you
> have a slow raid. To reach high speed, you need at most 8 drives in
> each raidz2. So one of the reasons it takes so long is that you have
> too many drives in your raidz2. Everything would be much faster if you
> split your zpool into two raidz2 vdevs, each consisting of 7 or 8
> drives. Then it would be fast.

Keeping the VDEVs small is one thing, but this is about resilvering
taking far more time than reported. The same applies to scrubbing at
times.

Would it be hard to rewrite the reporting mechanisms in ZFS to report
something more realistic than just a first guess? ZFS scrub reports
tremendous times at the start, but slows down after it has worked its
way through the metadata. What ZFS is doing when the system still scrubs
after 100 hours at 100% is beyond my knowledge.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.
On 05 July, 2010 - Roy Sigurd Karlsbakk sent me these 1,9K bytes:

> ----- Original Message -----
> > If you have one zpool consisting of only one large raidz2, then you
> > have a slow raid. To reach high speed, you need at most 8 drives in
> > each raidz2. So one of the reasons it takes so long is that you have
> > too many drives in your raidz2. Everything would be much faster if
> > you split your zpool into two raidz2 vdevs, each consisting of 7 or
> > 8 drives. Then it would be fast.
>
> Keeping the VDEVs small is one thing, but this is about resilvering
> taking far more time than reported. The same applies to scrubbing at
> times.
>
> Would it be hard to rewrite the reporting mechanisms in ZFS to report
> something more realistic than just a first guess? ZFS scrub reports
> tremendous times at the start, but slows down after it has worked its
> way through the metadata. What ZFS is doing when the system still
> scrubs after 100 hours at 100% is beyond my knowledge.

I believe it's something like this:

* When starting, it notes the number of blocks to visit
* .. visiting blocks ..
* .. adding more data (which will then be beyond the original 100%)
  .. and visiting blocks ..
* .. reaching the initial "last block", which has since gotten lots of
  new friends.

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6899970

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
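A toy illustration of that effect (invented numbers, not the actual ZFS
code): the block count is fixed when the scan starts, so anything written
afterwards pushes the real work past the recorded total while the
displayed figure stops at 100%.

    # Illustrative only - made-up numbers, not ZFS internals.
    total_at_start=1000000          # blocks known when the resilver began
    examined=1180000                # blocks visited so far; the pool kept growing
    pct=$((examined * 100 / total_at_start))
    [ "$pct" -gt 100 ] && pct=100   # the status output never shows more than 100%
    echo "${pct}% done"             # prints "100% done" while work remains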
On 07/ 6/10 02:21 AM, Francois wrote:
> Hi list,
>
> Here's my case :
>
>   pool: mypool
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scrub: resilver in progress for 147h19m, 100.00% done, 0h0m to go
> config:
>
<snip>
>
> After launching the replace command, I had to offline c0t9d0 because
> it was generating too many warnings and slowing down I/O.
>
> Now the replace seems to be finished, but zpool status still displays
> "replacing", and according to the scrub status the resilver seems to
> continue?
>

As others have noted, your wide raidz2 will be slow to resilver.

As for the reported progress, I see this all the time with an x4500. The
resilver is often 100% done for over half of the real resilver time
(which is normally >100 hours for a 500G drive in an 8-drive raidz).
This box is a backup server, so there is a fair amount of churn, which I
assume confuses the reporting.

--
Ian.