Assume we have 100 disks in one zpool, and assume it takes 5 hours to scrub one disk. If I scrub the zpool, how long will it take?

Will it scrub one disk at a time, so that it takes 500 hours, i.e. in sequence, serially? Or is it possible to run the scrub in parallel, so it takes 5 hours no matter how many disks?
On 10 June, 2012 - Kalle Anka sent me these 1,5K bytes:

> Assume we have 100 disks in one zpool. Assume it takes 5 hours to
> scrub one disk. If I scrub the zpool, how long will it take?
>
> Will it scrub one disk at a time, so it will take 500 hours, i.e. in
> sequence, just serial? Or is it possible to run the scrub in parallel,
> so it takes 5h no matter how many disks?

It walks the filesystem/pool trees, so it's not just reading each disk from track 0 to track 12345; it also validates all possible copies.

/Tomas
--
Tomas Forsman, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Kalle Anka
>
> Assume we have 100 disks in one zpool. Assume it takes 5 hours to scrub one
> disk. If I scrub the zpool, how long will it take?
>
> Will it scrub one disk at a time, so it will take 500 hours, i.e. in
> sequence, just serial? Or is it possible to run the scrub in parallel,
> so it takes 5h no matter how many disks?

It will be approximately parallel, because it's actually scrubbing only the used blocks, and the order it scrubs in will be approximately the order they were written, which was intentionally parallel.

Aside from that, your question doesn't really make sense as posed, because you don't just stick a bunch of disks in a pool. You make a pool out of vdevs, which are made of storage devices (in this case, disks). The type and size of vdev (raidz, raidzN, mirror, etc.) will greatly affect the performance, as will your data usage patterns.

Scrubbing is an approximately random-IOPS task, and mirrors parallelize random I/O much better than raidz. The amount of time it takes to scrub or resilver depends both on the amount of used data on the vdev and on the on-disk ordering.
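The mirrors-vs-raidz point above can be sketched with a toy model. All the numbers here are illustrative assumptions, not ZFS measurements: it treats scrub work as (used blocks) / (vdev random-read IOPS), assumes ~150 random IOPS per spindle, and uses the simplification that a raidz vdev delivers roughly one disk's worth of small random reads while an N-way mirror can serve reads from all sides:

```python
def scrub_hours(used_gb, avg_block_kb, vdev_iops):
    """Rough scrub time: number of used blocks over vdev random-read IOPS."""
    blocks = used_gb * 1024 * 1024 / avg_block_kb  # GB -> KB -> block count
    return blocks / vdev_iops / 3600               # seconds -> hours

DISK_IOPS = 150  # assumed random-read IOPS of one 7200 rpm disk

# 2 TB used, 64 KB average block size:
# raidz vdev: small random reads touch every disk, ~1 disk's worth of IOPS
raidz = scrub_hours(2000, 64, DISK_IOPS)
# 2-way mirror: both sides can serve reads, ~2 disks' worth of IOPS
mirror = scrub_hours(2000, 64, 2 * DISK_IOPS)
print(raidz, mirror)
```

Under these assumptions the mirror scrubs in half the time of the raidz vdev for the same used data, which is the "mirrors parallelize random IO much better" point in numbers.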
2012-06-11 5:37, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>> bounces at opensolaris.org] On Behalf Of Kalle Anka
>>
>> Assume we have 100 disks in one zpool. Assume it takes 5 hours to scrub one
>> disk. If I scrub the zpool, how long will it take?
>>
>> Will it scrub one disk at a time, so it will take 500 hours, i.e. in
>> sequence, just serial? Or is it possible to run the scrub in parallel,
>> so it takes 5h no matter how many disks?
>
> It will be approximately parallel, because it's actually scrubbing only the
> used blocks, and the order it scrubs in will be approximately the order they
> were written, which was intentionally parallel.

What the other posters said, plus: 100 disks is quite a lot of contention on the bus(es), so even if it is all parallel, the bus and CPU bottlenecks would raise the scrubbing time somewhat above the single-disk scrub time.

Roughly, if all else is ideal (i.e. no/few random seeks and a fast scrub at 100 MB/s per disk), a SATA3 interface at 6 Gbit/s (on the order of ~600 MB/s) will be maxed out at about 6 disks. If your disks are colocated on one HBA receptacle (i.e. via a backplane), this may be an issue for many disks in an enclosure: a 4-lane link will sustain about 24 drives at such speed, and 100 MB/s is not the market's maximum drive speed.

Further on, the PCI buses will become a bottleneck, and the CPU processing power might become one too; for a box with 100 disks this may be noticeable, depending on the other architectural choices, components and their specs.

HTH,
//Jim
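The back-of-the-envelope arithmetic above can be written out in Python. The link speeds are the ones named in this message; the 100 MB/s per-disk scrub rate and the ~80% payload efficiency factor (roughly covering 8b/10b encoding and protocol overhead) are assumptions, not measurements:

```python
def disks_to_saturate(link_gbit_s, per_disk_mb_s=100, efficiency=0.8):
    """How many disks scrubbing at per_disk_mb_s MB/s fill one link.

    efficiency is an assumed factor for encoding/protocol overhead,
    so a 6 Gbit/s link delivers on the order of 600 MB/s of payload.
    """
    link_mb_s = link_gbit_s * 1000 / 8 * efficiency
    return link_mb_s / per_disk_mb_s

# Single SATA3 link at 6 Gbit/s: maxed out by roughly 6 disks.
print(disks_to_saturate(6))        # 6.0
# 4-lane link at 6 Gbit/s per lane: sustains about 24 such drives.
print(disks_to_saturate(4 * 6))    # 24.0
```

The point of the sketch is only that the disk count at which a shared link saturates is small compared to 100 disks.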
Scrubs are run at very low priority and yield very quickly in the presence of other work, so I really would not expect a scrub to have any impact on any other type of storage activity. Resilvering will push forward more aggressively on what it has to do, but resilvering does not need to read any of the data blocks on the non-resilvering vdevs.

-r

Le 11 juin 2012 à 09:05, Jim Klimov a écrit :

> What the other posters said, plus: 100 disks is quite a lot
> of contention on the bus(es), so even if it is all parallel,
> the bus and CPU bottlenecks would raise the scrubbing time
> somewhat above the single-disk scrub time.
>
> Roughly, if all else is ideal (i.e. no/few random seeks and
> a fast scrub at 100 MB/s per disk), a SATA3 interface at 6 Gbit/s
> (on the order of ~600 MB/s) will be maxed out at about
> 6 disks. If your disks are colocated on one HBA receptacle
> (i.e. via a backplane), this may be an issue for many disks
> in an enclosure (a 4-lane link will sustain about 24 drives
> at such speed, and that's not the market's max speed).
>
> Further on, the PCI buses will become a bottleneck and the
> CPU processing power might become one too, and for a box
> with 100 disks this may be noticeable, depending on the other
> architectural choices, components and their specs.
>
> HTH,
> //Jim

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
2012-06-12 16:20, Roch Bourbonnais wrote:
> Scrubs are run at very low priority and yield very quickly in the presence
> of other work, so I really would not expect a scrub to have any impact on
> any other type of storage activity. Resilvering will push forward more
> aggressively on what it has to do, but resilvering does not need to read
> any of the data blocks on the non-resilvering vdevs.

Thanks, I agree - and that's important to note, at least on the current versions of ZFS :)

What I meant to stress is that if a "scrub of one disk takes 5 hours" (however that measurement is made, such as by making a 1-disk pool with the same data distribution), then there are physical reasons why a 100-disk pool would probably take somewhat more than 5 hours to scrub; or at least there are bottlenecks that should be paid attention to in order to minimize such an increase in scrub time.

Also, yes, the presence of other pool activity would likely delay the scrub completion time, perhaps even more noticeably.

Thanks,
//Jim Klimov
The process should be scalable:
scrub all of the data on one disk using one disk's worth of IOPS;
scrub all of the data on N disks using N disks' worth of IOPS.

That will take ~ the same total time.

-r

Le 12 juin 2012 à 08:28, Jim Klimov a écrit :

> Thanks, I agree - and that's important to note, at least
> on the current versions of ZFS :)
>
> What I meant to stress is that if a "scrub of one disk takes
> 5 hours" (however that measurement is made, such as by
> making a 1-disk pool with the same data distribution), then
> there are physical reasons why a 100-disk pool would probably
> take somewhat more than 5 hours to scrub; or at least there
> are bottlenecks that should be paid attention to in order to
> minimize such an increase in scrub time.
>
> Also, yes, the presence of other pool activity would likely
> delay the scrub completion time, perhaps even more noticeably.
>
> Thanks,
> //Jim Klimov
2012-06-12 16:45, Roch Bourbonnais wrote:
> The process should be scalable:
> scrub all of the data on one disk using one disk's worth of IOPS;
> scrub all of the data on N disks using N disks' worth of IOPS.
>
> That will take ~ the same total time.

IF the uplink or processing power or some other bottleneck does not limit that (e.g. a single 4-lane SAS link to a daisy-chain of 100 or 200 disks would likely impose a bandwidth bottleneck).

I know that well-engineered servers spec'ed by a vendor/integrator for the customer's tasks and environment, such as those from Sun, are built to avoid such apparent bottlenecks. But people who construct their own storage should know of (and try to avoid) such possible problem-makers ;)

Thanks, Roch,
//Jim Klimov
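The two positions in this exchange fit in one line of arithmetic: a parallel scrub takes the per-disk time unless the disks' aggregate demand exceeds a shared link, in which case the link sets the pace. A hedged sketch (the 100 MB/s per-disk rate and ~2400 MB/s usable 4-lane link figure are assumptions carried over from earlier in the thread):

```python
def pool_scrub_hours(n_disks, per_disk_hours, per_disk_mb_s, shared_link_mb_s):
    """Scrub time for n disks scrubbed in parallel behind one shared link.

    If the disks together would exceed the link's throughput,
    the scrub stretches by that ratio; otherwise it stays at the
    single-disk time.
    """
    demand = n_disks * per_disk_mb_s               # aggregate disk throughput
    slowdown = max(1.0, demand / shared_link_mb_s)  # >1 only when link-bound
    return per_disk_hours * slowdown

# 6 disks at 100 MB/s behind a ~2400 MB/s link: not link-bound, still 5 h.
print(pool_scrub_hours(6, 5, 100, 2400))     # 5.0
# 100 such disks daisy-chained behind the same link: demand is 10000 MB/s,
# so the 5-hour per-disk scrub stretches to roughly 21 hours.
print(pool_scrub_hours(100, 5, 100, 2400))
```

This ignores CPU and PCI limits and any competing pool activity, so it is a lower bound on the stretch, not a prediction.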
On Jun 11, 2012, at 6:05 AM, Jim Klimov wrote:
> What the other posters said, plus: 100 disks is quite a lot
> of contention on the bus(es), so even if it is all parallel,
> the bus and CPU bottlenecks would raise the scrubbing time
> somewhat above the single-disk scrub time.

In general, this is not true for HDDs or modern CPUs. Modern systems are overprovisioned for bandwidth; in fact, bandwidth has been a poor design point for storage for a long time. Dave Patterson has some interesting observations on this, now 8 years dated:
http://www.ll.mit.edu/HPEC/agendas/proc04/invited/patterson_keynote.pdf

SSDs tend to be a different story, and there is some interesting work being done in this area, both on the systems side and on the SSD side. This is where the fun work is progressing :-)
-- richard

--
ZFS and performance consulting
http://www.RichardElling.com