Richard Elling
2009-Sep-28 16:58 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Sep 28, 2009, at 2:41 PM, Albert Chin wrote:

> Without doing a zpool scrub, what's the quickest way to find files in a
> filesystem with cksum errors? Iterating over all files with "find" takes
> quite a bit of time. Maybe there's some zdb fu that will perform the
> check for me?

Scrub could be faster, but you can try

   tar cf - . > /dev/null

If you think about it, validating checksums requires reading the data.
So you simply need to read the data.
 -- richard
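A minimal illustration of this approach ("tank" and /tank/myfs are placeholder names): after the read pass, zpool status -v reports any files in which persistent checksum errors were found.

   cd /tank/myfs
   tar cf - . > /dev/null   # force ZFS to read (and checksum-verify) every block of every file
   zpool status -v tank     # -v lists files affected by persistent errors, if any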
Bob Friesenhahn
2009-Sep-28 17:09 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Mon, 28 Sep 2009, Richard Elling wrote:

> Scrub could be faster, but you can try
>    tar cf - . > /dev/null
>
> If you think about it, validating checksums requires reading the data.
> So you simply need to read the data.

This should work but it does not verify the redundant metadata. For
example, the duplicate metadata copy might be corrupt but the problem
is not detected since it did not happen to be used.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Richard Elling
2009-Sep-28 17:16 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Sep 28, 2009, at 3:42 PM, Albert Chin wrote:

> On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
>> On Mon, 28 Sep 2009, Richard Elling wrote:
>>>
>>> Scrub could be faster, but you can try
>>>    tar cf - . > /dev/null
>>>
>>> If you think about it, validating checksums requires reading the data.
>>> So you simply need to read the data.
>>
>> This should work but it does not verify the redundant metadata. For
>> example, the duplicate metadata copy might be corrupt but the problem
>> is not detected since it did not happen to be used.
>
> Too bad we cannot scrub a dataset/object.

Can you provide a use case? I don't see why scrub couldn't start and
stop at specific txgs, for instance. That won't necessarily get you to a
specific file, though.
 -- richard
Tim Cook
2009-Sep-28 17:22 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Mon, Sep 28, 2009 at 12:16 PM, Richard Elling <richard.elling at gmail.com> wrote:

> On Sep 28, 2009, at 3:42 PM, Albert Chin wrote:
>
>> On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
>>> On Mon, 28 Sep 2009, Richard Elling wrote:
>>>>
>>>> Scrub could be faster, but you can try
>>>>    tar cf - . > /dev/null
>>>>
>>>> If you think about it, validating checksums requires reading the data.
>>>> So you simply need to read the data.
>>>
>>> This should work but it does not verify the redundant metadata. For
>>> example, the duplicate metadata copy might be corrupt but the problem
>>> is not detected since it did not happen to be used.
>>
>> Too bad we cannot scrub a dataset/object.
>
> Can you provide a use case? I don't see why scrub couldn't start and
> stop at specific txgs, for instance. That won't necessarily get you to a
> specific file, though.
> -- richard

I get the impression he just wants to check a single file in a pool
without waiting for it to check the entire pool.

--Tim
Bob Friesenhahn
2009-Sep-28 17:25 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Mon, 28 Sep 2009, Bob Friesenhahn wrote:

> This should work but it does not verify the redundant metadata. For
> example, the duplicate metadata copy might be corrupt but the problem
> is not detected since it did not happen to be used.

I am finding that your tar incantation is reading hardly any data from
disk when testing my home directory, and the 'tar' happens to be GNU tar:

   # time tar cf - . > /dev/null
   tar cf - . > /dev/null  2.72s user 12.43s system 96% cpu 15.721 total
   # du -sh .
   82G

Looks like the GNU folks slipped in a small performance "enhancement"
when the output is /dev/null. Make sure to use /bin/tar, which does
actually read the data.

When actually reading the data via tar, read performance is very poor.
Hopefully I will have a ZFS IDR to test with in the next few days which
fixes the prefetch bug. Zpool scrub reads the data at 360MB/second but
this tar method is only reading at an average of 6MB/second to
42MB/second (according to zpool iostat). Whoops, I just saw a one-minute
average of 105MB/second and then 131MB/second. Quite variable.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
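The behavior Bob describes is documented GNU tar behavior: when the archive is written directly to /dev/null, GNU tar minimizes its I/O and may not read file contents at all. If only GNU tar is available, a common workaround (a sketch, with a placeholder path) is to pipe through cat so tar no longer sees /dev/null as its destination:

   cd /tank/myfs
   tar cf - . | cat > /dev/null   # defeats the /dev/null shortcut; tar now reads every file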
Victor Latushkin
2009-Sep-28 17:31 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
Richard Elling wrote:

> On Sep 28, 2009, at 3:42 PM, Albert Chin wrote:
>
>> On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
>>> On Mon, 28 Sep 2009, Richard Elling wrote:
>>>>
>>>> Scrub could be faster, but you can try
>>>>    tar cf - . > /dev/null
>>>>
>>>> If you think about it, validating checksums requires reading the data.
>>>> So you simply need to read the data.
>>>
>>> This should work but it does not verify the redundant metadata. For
>>> example, the duplicate metadata copy might be corrupt but the problem
>>> is not detected since it did not happen to be used.
>>
>> Too bad we cannot scrub a dataset/object.
>
> Can you provide a use case? I don't see why scrub couldn't start and
> stop at specific txgs, for instance. That won't necessarily get you to a
> specific file, though.

With ever-increasing disk and pool sizes it takes more and more time for
scrub to complete its job. Let's imagine that you have a 100TB pool with
90TB of data in it, and there's a dataset with 10TB that is critical and
another dataset with 80TB that is not that critical, where you can
afford losing some blocks/files.

So being able to scrub an individual dataset would help to run scrubs of
critical data more frequently and faster, and to schedule scrubs for
less frequently used and/or less important data to happen much less
frequently.

It may be useful to have a way to tell ZFS to scrub pool-wide metadata
only (space maps etc.), so that you can build your own schedule of
scrubs.

Another interesting idea is to be able to scrub only blocks modified
since the last snapshot.

victor
Richard Elling
2009-Sep-28 18:01 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Sep 28, 2009, at 10:31 AM, Victor Latushkin wrote:

> Richard Elling wrote:
>> On Sep 28, 2009, at 3:42 PM, Albert Chin wrote:
>>> On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
>>>> On Mon, 28 Sep 2009, Richard Elling wrote:
>>>>>
>>>>> Scrub could be faster, but you can try
>>>>>    tar cf - . > /dev/null
>>>>>
>>>>> If you think about it, validating checksums requires reading the data.
>>>>> So you simply need to read the data.
>>>>
>>>> This should work but it does not verify the redundant metadata. For
>>>> example, the duplicate metadata copy might be corrupt but the problem
>>>> is not detected since it did not happen to be used.
>>>
>>> Too bad we cannot scrub a dataset/object.
>> Can you provide a use case? I don't see why scrub couldn't start and
>> stop at specific txgs, for instance. That won't necessarily get you to a
>> specific file, though.
>
> With ever-increasing disk and pool sizes it takes more and more time
> for scrub to complete its job. Let's imagine that you have a 100TB
> pool with 90TB of data in it, and there's a dataset with 10TB that is
> critical and another dataset with 80TB that is not that critical, where
> you can afford losing some blocks/files.

Personally, I have three concerns here.

1. Gratuitous complexity, especially inside a pool -- aka creeping
   featurism.
2. Wouldn't a better practice be to use two pools with different
   protection policies? The only protection-policy difference inside a
   pool is copies. In other words, I am concerned that people replace
   good data protection practices with scrubs and expect scrub to
   deliver better data protection (it won't).
3. Since the pool contains the set of blocks, shared by datasets, it is
   not clear to me that scrubbing a dataset will detect all of the data
   corruption failures which can affect the dataset. I'm thinking along
   the lines of phantom writes, for example.
4. The time it takes to scrub lots of stuff.

...there are four concerns... :-)

For magnetic media, a yearly scrub interval should suffice for most
folks. I know some folks who scrub monthly. More frequent scrubs won't
buy much.

Scrubs are also useful for detecting broken hardware. However, normal
activity will also detect broken hardware, so it is better to think of
scrubs as finding degradation of old data rather than as a hardware
checking service.

> So being able to scrub individual datasets would help to run scrubs
> of critical data more frequently and faster, and schedule scrubs for
> less frequently used and/or less important data to happen much less
> frequently.
>
> It may be useful to have a way to tell ZFS to scrub pool-wide
> metadata only (space maps etc.), so that you can build your own
> schedule of scrubs.
>
> Another interesting idea is to be able to scrub only blocks modified
> since the last snapshot.

This can be relatively easy to implement. But remember that scrubs are
most useful for finding data which has degraded on the media -- in other
words, old data. New data is not likely to have degraded yet, and since
ZFS is COW, all of the new data is, well, new. This is why having the
ability to bound the start and end of a scrub by txg can be easy and
perhaps useful.
 -- richard
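For those who do scrub on a schedule, this is typically just a cron job; a minimal sketch (the pool name "tank" and the monthly schedule are only examples):

   # root crontab entry: scrub the pool at 03:00 on the 1st of each month
   0 3 1 * * /usr/sbin/zpool scrub tank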
Victor Latushkin
2009-Sep-28 18:28 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On 28.09.09 22:01, Richard Elling wrote:

> On Sep 28, 2009, at 10:31 AM, Victor Latushkin wrote:
>
>> With ever-increasing disk and pool sizes it takes more and more time
>> for scrub to complete its job. Let's imagine that you have a 100TB pool
>> with 90TB of data in it, and there's a dataset with 10TB that is
>> critical and another dataset with 80TB that is not that critical, where
>> you can afford losing some blocks/files.
>
> Personally, I have three concerns here.
> 1. Gratuitous complexity, especially inside a pool -- aka creeping
>    featurism.

There's the idea of priority-based resilvering (though not implemented
yet, see http://blogs.sun.com/bonwick/en_US/entry/smokin_mirrors) that
could simply be extended to scrubs as well.

> 2. Wouldn't a better practice be to use two pools with different
>    protection policies? The only protection-policy difference inside a
>    pool is copies. In other words, I am concerned that people replace
>    good data protection practices with scrubs and expect scrub to
>    deliver better data protection (it won't).

It may be better, it may not... With two pools you split your bandwidth,
IOPS and space, and have more entities to care about...

> 3. Since the pool contains the set of blocks, shared by datasets, it is
>    not clear to me that scrubbing a dataset will detect all of the data
>    corruption failures which can affect the dataset. I'm thinking along
>    the lines of phantom writes, for example.

That is why it may be useful to always scrub pool-wide metadata, or to
have a way to specifically request it.

> 4. The time it takes to scrub lots of stuff.
> ...there are four concerns... :-)
>
> For magnetic media, a yearly scrub interval should suffice for most
> folks. I know some folks who scrub monthly. More frequent scrubs won't
> buy much.

It won't buy you much in terms of magnetic media decay discovery.
Unfortunately, there are other sources of corruption as well (including
the phantom writes you are thinking about), and being able to discover
corruption and recover it as quickly as possible from backup is a good
thing.

> Scrubs are also useful for detecting broken hardware. However, normal
> activity will also detect broken hardware, so it is better to think of
> scrubs as finding degradation of old data rather than as a hardware
> checking service.
>
>> So being able to scrub individual datasets would help to run scrubs of
>> critical data more frequently and faster, and schedule scrubs for less
>> frequently used and/or less important data to happen much less
>> frequently.
>>
>> It may be useful to have a way to tell ZFS to scrub pool-wide metadata
>> only (space maps etc.), so that you can build your own schedule of
>> scrubs.
>>
>> Another interesting idea is to be able to scrub only blocks modified
>> since the last snapshot.
>
> This can be relatively easy to implement. But remember that scrubs are
> most useful for finding data which has degraded on the media -- in other
> words, old data. New data is not likely to have degraded yet, and since
> ZFS is COW, all of the new data is, well, new.
>
> This is why having the ability to bound the start and end of a scrub by
> txg can be easy and perhaps useful.

This requires exporting the concept of transaction group numbers to the
user, and I do not see how it is less complex from the user-interface
perspective than being able to request a scrub of an individual dataset,
pool-wide metadata, or newly-written data.

regards,
victor
Bob Friesenhahn
2009-Sep-28 18:41 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Mon, 28 Sep 2009, Richard Elling wrote:

> In other words, I am concerned that people replace good data protection
> practices with scrubs and expect scrub to deliver better data
> protection (it won't).

Many people here would profoundly disagree with the above. There is no
substitute for good backups, but a periodic scrub helps validate that a
later resilver would succeed. A periodic scrub also helps find system
problems early, when they are less likely to crater your business. It is
much better to find an issue during a scrub rather than during resilver
of a mirror or raidz.

> Scrubs are also useful for detecting broken hardware. However,
> normal activity will also detect broken hardware, so it is better to
> think of scrubs as finding degradation of old data rather than being
> a hardware checking service.

Do you have a scientific reference for this notion that "old data" is
more likely to be corrupt than "new data", or is it just a gut feeling?
This hypothesis does not sound very supportable to me. Magnetic
hysteresis lasts quite a lot longer than the recommended service life
for a hard drive. Studio audio tapes from the '60s are still being used
to produce modern "remasters" of old audio recordings which sound better
than they ever did before (other than the master tape). Some forms of
magnetic hysteresis are known to last millions of years. Media failure
is more often than not mechanical or chemical and not related to loss of
magnetic hysteresis. Head failures may be construed to be media
failures.

See http://en.wikipedia.org/wiki/Ferromagnetic for information on
ferromagnetic materials.

It would be most useful if zfs incorporated a slow-scan scrub which
validates data at a low rate of speed that does not hinder active I/O.
Of course this is not a "green", energy-efficient solution.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Albert Chin
2009-Sep-28 21:41 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
Without doing a zpool scrub, what's the quickest way to find files in a
filesystem with cksum errors? Iterating over all files with "find" takes
quite a bit of time. Maybe there's some zdb fu that will perform the
check for me?

--
albert chin (china at thewrittenword.com)
Albert Chin
2009-Sep-28 22:42 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:

> On Mon, 28 Sep 2009, Richard Elling wrote:
>>
>> Scrub could be faster, but you can try
>>    tar cf - . > /dev/null
>>
>> If you think about it, validating checksums requires reading the data.
>> So you simply need to read the data.
>
> This should work but it does not verify the redundant metadata. For
> example, the duplicate metadata copy might be corrupt but the problem
> is not detected since it did not happen to be used.

Too bad we cannot scrub a dataset/object.

--
albert chin (china at thewrittenword.com)
Albert Chin
2009-Sep-28 22:58 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Mon, Sep 28, 2009 at 10:16:20AM -0700, Richard Elling wrote:

> On Sep 28, 2009, at 3:42 PM, Albert Chin wrote:
>
>> On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
>>> On Mon, 28 Sep 2009, Richard Elling wrote:
>>>>
>>>> Scrub could be faster, but you can try
>>>>    tar cf - . > /dev/null
>>>>
>>>> If you think about it, validating checksums requires reading the data.
>>>> So you simply need to read the data.
>>>
>>> This should work but it does not verify the redundant metadata. For
>>> example, the duplicate metadata copy might be corrupt but the problem
>>> is not detected since it did not happen to be used.
>>
>> Too bad we cannot scrub a dataset/object.
>
> Can you provide a use case? I don't see why scrub couldn't start and
> stop at specific txgs, for instance. That won't necessarily get you to a
> specific file, though.

If your pool is borked but mostly readable, yet some file systems have
cksum errors, you cannot "zfs send" that file system (err, a snapshot of
the filesystem). So, you need to manually fix the file system by
traversing it to read all files and determine which must be fixed. Once
this is done, you can snapshot and "zfs send". If you have many file
systems, this is time consuming.

Of course, you could just rsync and be happy with what you were able to
recover, but if you have clones branched from the same parent, with only
a few differences between snapshots, having to rsync *everything* rather
than just the differences is painful. Hence the reason to try to get
"zfs send" to work.

But this is an extreme example, and I doubt pools are often in this
state, so the engineering time isn't worth it. In such cases, though, a
"zfs scrub" would be useful.

--
albert chin (china at thewrittenword.com)
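A rough sketch of the manual traversal step described above, assuming corrupt files surface as read (I/O) errors; the path is a placeholder:

   # read every file, logging the ones that cannot be read back
   find /tank/myfs -type f | while read -r f; do
       dd if="$f" of=/dev/null bs=1M 2>/dev/null || echo "unreadable: $f"
   done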
Richard Elling
2009-Sep-28 23:39 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Sep 28, 2009, at 11:41 AM, Bob Friesenhahn wrote:

> On Mon, 28 Sep 2009, Richard Elling wrote:
>
>> In other words, I am concerned that people replace good data protection
>> practices with scrubs and expect scrub to deliver better data
>> protection (it won't).
>
> Many people here would profoundly disagree with the above. There is
> no substitute for good backups, but a periodic scrub helps validate
> that a later resilver would succeed. A periodic scrub also helps
> find system problems early, when they are less likely to crater your
> business. It is much better to find an issue during a scrub rather
> than during resilver of a mirror or raidz.

As I said, I am concerned that people would mistakenly expect that
scrubbing offers data protection. It doesn't. I think you proved my
point? ;-)

>> Scrubs are also useful for detecting broken hardware. However,
>> normal activity will also detect broken hardware, so it is better
>> to think of scrubs as finding degradation of old data rather than
>> being a hardware checking service.
>
> Do you have a scientific reference for this notion that "old data"
> is more likely to be corrupt than "new data", or is it just a gut
> feeling? This hypothesis does not sound very supportable to me.
> Magnetic hysteresis lasts quite a lot longer than the recommended
> service life for a hard drive. Studio audio tapes from the '60s are
> still being used to produce modern "remasters" of old audio
> recordings which sound better than they ever did before (other than
> the master tape).

Those are analog tapes... they just fade away... For data, it depends
on the ECC methods, quality of the media, environment, etc. You will
find considerable attention spent on verification of data on tapes in
archiving products. In the tape world, conditions are slightly different
from the magnetic disk world, but I can't think of a single study which
shows that magnetic disks get more reliable over time, while there are
dozens which show that they get less reliable and that latent sector
errors dominate, as much as 5x, over full-disk failures. My studies of
Sun disk failure rates have shown similar results.

> Some forms of magnetic hysteresis are known to last millions of
> years. Media failure is more often than not mechanical or chemical
> and not related to loss of magnetic hysteresis. Head failures may
> be construed to be media failures.

Here is a good study from the University of Wisconsin-Madison which
clearly shows the relationship between disk age and latent sector
errors. It also shows how the increase in areal density increases the
latent sector error (LSE) rate. Additionally, this gets back to the ECC
method, which we observe to be different on consumer-grade and
enterprise-class disks; the study shows a clear win for enterprise-class
drives wrt latent errors. The paper suggests a 2-week scrub cycle and
recognizes that many RAID arrays have such policies. There are indeed
many studies which show latent sector errors are a bigger problem as the
disk ages.

   An Analysis of Latent Sector Errors in Disk Drives
   www.cs.wisc.edu/adsl/Publications/latent-sigmetrics07.ps

> See http://en.wikipedia.org/wiki/Ferromagnetic for information on
> ferromagnetic materials.

For disks we worry about the superparamagnetic effect.
http://en.wikipedia.org/wiki/Superparamagnetism

Quoting US Patent 6987630:

   ... the superparamagnetic effect is a thermal relaxation of
   information stored on the disk surface. Because the superparamagnetic
   effect may occur at room temperature, over time, information stored
   on the disk surface will begin to decay. Once the stored information
   decays beyond a threshold level, it will be unable to be properly
   read by the read head and the information will be lost. The
   superparamagnetic effect manifests itself by a loss in amplitude in
   the readback signal over time or an increase in the mean square error
   (MSE) of the readback signal over time. In other words, the readback
   signal quality metrics are mean square error and amplitude as
   measured by the read channel integrated circuit. Decreases in the
   quality of the readback signal cause bit error rate (BER) increases.
   As is well known, the BER is the ultimate measure of drive
   performance in a disk drive.

This effect is based on the time since written. Hence, older data can
have higher MSE and subsequent BER, leading to a UER. To be fair, newer
disk technology is constantly improving. But what is consistent with the
physics is that an increase in bit density leads to more space and a
rebalancing of the BER. IMHO, this is why we see densities increase but
UER does not increase (hint: marketing always wins these sorts of
battles). FWIW, flash memories are not affected by superparamagnetic
decay.

> It would be most useful if zfs incorporated a slow-scan scrub which
> validates data at a low rate of speed that does not hinder active
> I/O. Of course this is not a "green", energy-efficient solution.

Oprea and Juels write, "Our key insight is that more aggressive
scrubbing does not always increase disk reliability, as previously
believed." They show how read-induced LSEs would tend to encourage you
to scrub less frequently. They also discuss the advantage of random
versus sequential scrubbing. I would classify zfs scrubs as more random
than sequential, for most workloads. Their model is even more
sophisticated and considers scrubbing policy based on the age of the
disk and how many errors have been previously detected.

   A Clean-Slate Look at Disk Scrubbing
   http://www.rsa.com/rsalabs/staff/bios/aoprea/publications/scrubbing.pdf

Finally, there are two basic types of scrubs: read-only and rewrite. ZFS
does read-only. Other scrubbers can do rewrite. There is evidence that
rewrites are better for attacking superparamagnetic decay issues.

So it is still not clear what the best scrubbing model or interval
should be for the general case. I suggest scrubbing periodically, but
not continuously :-) Currently, scrub has the lowest priority in the
vdev_queue, but I think the vdev_queue could use more research.
 -- richard
David Magda
2009-Sep-29 01:46 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Sep 28, 2009, at 19:39, Richard Elling wrote:

> Finally, there are two basic types of scrubs: read-only and rewrite.
> ZFS does read-only. Other scrubbers can do rewrite. There is evidence
> that rewrites are better for attacking superparamagnetic decay issues.

Something that may be possible when bp rewrite is eventually committed.

Educating post. Thanks.
Robert Milkowski
2009-Sep-29 02:08 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
Bob Friesenhahn wrote:

> On Mon, 28 Sep 2009, Richard Elling wrote:
>>
>> Scrub could be faster, but you can try
>>    tar cf - . > /dev/null
>>
>> If you think about it, validating checksums requires reading the data.
>> So you simply need to read the data.
>
> This should work but it does not verify the redundant metadata. For
> example, the duplicate metadata copy might be corrupt but the problem
> is not detected since it did not happen to be used.

Not only that -- it also won't read all the copies of the data if zfs
has redundancy configured at the pool level. Scrubbing the pool will.
And that's the main reason behind the scrub: to be able to detect and
repair checksum errors (if any) while a redundant copy is still fine.

--
Robert Milkowski
http://milek.blogspot.com
Robert Milkowski
2009-Sep-29 02:12 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
Robert Milkowski wrote:

> Bob Friesenhahn wrote:
>> On Mon, 28 Sep 2009, Richard Elling wrote:
>>>
>>> Scrub could be faster, but you can try
>>>    tar cf - . > /dev/null
>>>
>>> If you think about it, validating checksums requires reading the data.
>>> So you simply need to read the data.
>>
>> This should work but it does not verify the redundant metadata. For
>> example, the duplicate metadata copy might be corrupt but the problem
>> is not detected since it did not happen to be used.
>
> Not only that -- it also won't read all the copies of the data if zfs
> has redundancy configured at the pool level. Scrubbing the pool will.
> And that's the main reason behind the scrub: to be able to detect and
> repair checksum errors (if any) while a redundant copy is still fine.

Also, reading with tar means reading from the ARC and/or L2ARC if the
data is cached, which won't verify whether the data is actually fine on
disk. Scrub won't use the cache and will always go to the physical
disks.

--
Robert Milkowski
http://milek.blogspot.com
Bob Friesenhahn
2009-Sep-29 03:43 UTC
[zfs-discuss] Quickest way to find files with cksum errors without doing scrub
On Mon, 28 Sep 2009, Richard Elling wrote:

>> Many people here would profoundly disagree with the above. There is no
>> substitute for good backups, but a periodic scrub helps validate that a
>> later resilver would succeed. A periodic scrub also helps find system
>> problems early, when they are less likely to crater your business. It
>> is much better to find an issue during a scrub rather than during
>> resilver of a mirror or raidz.
>
> As I said, I am concerned that people would mistakenly expect that
> scrubbing offers data protection. It doesn't. I think you proved my
> point? ;-)

It does not specifically offer data "protection", but if you have only
duplex redundancy, it substantially helps find and correct a failure
which would otherwise have caused data loss during a resilver. The value
diminishes substantially if you have triple redundancy.

I hope it does not offend that I scrub my mirrored pools once a week.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/