Below is another paper on drive failure analysis, this one won best paper
at USENIX:

http://www.usenix.org/events/fast07/tech/schroeder/schroeder_html/index.html

What I found most interesting was the idea that drives don't fail outright
most of the time. They can slow down operations and slowly die.

With this behavior in mind, I had an idea for a new feature in ZFS: if a
disk fitness test were available to verify disk read/write and performance,
future drive problems could be avoided.

Some example tests:
- full disk read
- 8kb r/w iops
- 1mb r/w iops
- raw throughput

Since one disk may be different from the others, I thought a comparison
between two presumably similar disks would be useful.

The command would be something like:

	zpool dft c1t0d0 c1t1d0

Or:

	zpool dft all

I think this would be a great feature, as only ZFS can do fitness tests on
live, running disks behind the scenes. With the ability to compare
individual disk performance, not only will you find bad disks, it's
entirely possible you'll find misconfigurations (such as bad connections)
as well.

And yes, I do know about SMART. SMART can pre-indicate a disk failure.
However, I've run SMART on drives with bearings that were gravel, and they
passed SMART even though I knew the 10k drive was running at about 3k rpm
due to the bearings.

-----
Gregory Shaw, IT Architect
ITCTO Group, Sun Microsystems Inc.          Phone: (303) 272-8817 (x78817)
500 Eldorado Blvd, UBRM02-157               greg.shaw at sun.com (work)
Broomfield, CO 80021                        shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've won." - Linus Torvalds
Gregory Shaw wrote:
> Below is another paper on drive failure analysis, this one won best
> paper at USENIX:
>
> http://www.usenix.org/events/fast07/tech/schroeder/schroeder_html/index.html
>
> What I found most interesting was the idea that drives don't fail
> outright most of the time. They can slow down operations and slowly die.

Yes, this is what my data shows, too. You are most likely to see an
unrecoverable read which leads to a retry (slow response symptom).

> With this behavior in mind, I had an idea for a new feature in ZFS:
>
> If a disk fitness test were available to verify disk read/write and
> performance, future drive problems could be avoided.
>
> Some example tests:
> - full disk read
> - 8kb r/w iops
> - 1mb r/w iops
> - raw throughput

Some problems can be seen by doing a simple sequential read and comparing
it to historical data. It depends on the failure mode, though.

> Since one disk may be different from the others, I thought a comparison
> between two presumably similar disks would be useful.
>
> The command would be something like:
> 	zpool dft c1t0d0 c1t1d0
> Or:
> 	zpool dft all
>
> I think this would be a great feature, as only ZFS can do fitness tests
> on live, running disks behind the scenes.

I like the concept, but don't see why ZFS would be required.

> With the ability to compare individual disk performance, not only will
> you find bad disks, it's entirely possible you'll find
> misconfigurations (such as bad connections) as well.

A few years ago we looked at unusual changes in response time as a
leading indicator, but I don't recall the details as to why we dropped
the effort. Perhaps we should take a look again?

> And yes, I do know about SMART. SMART can pre-indicate a disk failure.
> However, I've run SMART on drives with bearings that were gravel, and
> they passed SMART even though I knew the 10k drive was running at about
> 3k rpm due to the bearings.

ditto.
 -- richard
On Feb 21, 2007, at 4:59 PM, Richard Elling wrote:
>> With this behavior in mind, I had an idea for a new feature in ZFS:
>> If a disk fitness test were available to verify disk read/write
>> and performance, future drive problems could be avoided.
>> Some example tests:
>> - full disk read
>> - 8kb r/w iops
>> - 1mb r/w iops
>> - raw throughput
>
> Some problems can be seen by doing a simple sequential read and
> comparing it to historical data. It depends on the failure mode, though.

I agree. Having this feature could provide that history.

>> Since one disk may be different from the others, I thought a
>> comparison between two presumably similar disks would be useful.
>> The command would be something like:
>> 	zpool dft c1t0d0 c1t1d0
>> Or:
>> 	zpool dft all
>> I think this would be a great feature, as only ZFS can do fitness
>> tests on live, running disks behind the scenes.
>
> I like the concept, but don't see why ZFS would be required.

I'm thinking of production systems. Since you can't evacuate the disk,
ZFS can do read/write tests on the unused portion of the disk. I don't
think that would be possible via another solution, such as SVM/UFS.

>> With the ability to compare individual disk performance, not only
>> will you find bad disks, it's entirely possible you'll find
>> misconfigurations (such as bad connections) as well.
>
> A few years ago we looked at unusual changes in response time as a
> leading indicator, but I don't recall the details as to why we dropped
> the effort. Perhaps we should take a look again?

More information is good in my book. Anything that can tell me that
things aren't quite right translates into more uptime.

>> And yes, I do know about SMART. SMART can pre-indicate a disk failure.
>> However, I've run SMART on drives with bearings that were gravel, and
>> they passed SMART even though I knew the 10k drive was running at
>> about 3k rpm due to the bearings.
>
> ditto.
> -- richard
On Wed, Feb 21, 2007 at 03:35:06PM -0700, Gregory Shaw wrote:
> Below is another paper on drive failure analysis, this one won best
> paper at USENIX:
>
> http://www.usenix.org/events/fast07/tech/schroeder/schroeder_html/index.html
>
> What I found most interesting was the idea that drives don't fail
> outright most of the time. They can slow down operations and slowly die.

Seems like there are two pieces you're suggesting here:

1. Some sort of background process to proactively find errors on disks
   in use by ZFS. This will be accomplished by a background scrubbing
   option, dependent on the block-rewriting work Matt and Mark are
   working on. This will allow something like "zpool set scrub=2weeks",
   which will tell ZFS to "scrub my data at an interval such that all
   data is touched over a 2 week period". This will test reading from
   every block and verifying checksums. Stressing write failures is a
   little more difficult.

2. Distinguish "slow" drives from "normal" drives and proactively mark
   them faulted. This shouldn't require an explicit "zpool dft", as
   we should be watching the response times of the various drives and
   keeping this as a statistic. We want to incorporate this information
   to allow better allocation amongst slower and faster drives.
   Determining that a drive is "abnormally slow" is much more difficult,
   though it could theoretically be done if we had some basis - either
   historical performance for the same drive or comparison to identical
   drives (manufacturer/model) within the pool. While we've thought
   about these same issues, there is currently no active effort to keep
   track of these statistics or do anything with them.

These two things combined should avoid the need for an explicit fitness
test.

Hope that helps,

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
On Feb 21, 2007, at 5:20 PM, Eric Schrock wrote:
> On Wed, Feb 21, 2007 at 03:35:06PM -0700, Gregory Shaw wrote:
>> Below is another paper on drive failure analysis, this one won best
>> paper at USENIX:
>>
>> http://www.usenix.org/events/fast07/tech/schroeder/schroeder_html/index.html
>>
>> What I found most interesting was the idea that drives don't fail
>> outright most of the time. They can slow down operations and slowly die.
>
> Seems like there are two pieces you're suggesting here:
>
> 1. Some sort of background process to proactively find errors on disks
>    in use by ZFS. This will be accomplished by a background scrubbing
>    option, dependent on the block-rewriting work Matt and Mark are
>    working on. This will allow something like "zpool set scrub=2weeks",
>    which will tell ZFS to "scrub my data at an interval such that all
>    data is touched over a 2 week period". This will test reading from
>    every block and verifying checksums. Stressing write failures is a
>    little more difficult.

I was thinking of something similar to a scrub. An ongoing process
seemed too intrusive. I'd envisioned a cron job, similar to a scrub (or
defrag), that could be run periodically to show any differences between
disk performance over time.

> 2. Distinguish "slow" drives from "normal" drives and proactively mark
>    them faulted. This shouldn't require an explicit "zpool dft", as
>    we should be watching the response times of the various drives and
>    keeping this as a statistic. We want to incorporate this information
>    to allow better allocation amongst slower and faster drives.
>    Determining that a drive is "abnormally slow" is much more difficult,
>    though it could theoretically be done if we had some basis - either
>    historical performance for the same drive or comparison to identical
>    drives (manufacturer/model) within the pool. While we've thought
>    about these same issues, there is currently no active effort to keep
>    track of these statistics or do anything with them.

I thought this would be very difficult to determine, as a slow disk
could be a transient problem.

Me, I like tools that give me information I can work with. Fully
automated systems always seem to cause more problems than they solve.
For instance, if I have a drive on a PC using a shared IDE bus, is it
the disk that is slow, or the connection method? It's obviously the
second, but finding that programmatically will be very difficult.

I like the idea of a dft for testing a disk in a subjective manner. One
benefit of this could be an objective performance test baseline for
disks and arrays.

Btw, it does help. :-)

> These two things combined should avoid the need for an explicit fitness
> test.
>
> Hope that helps,
>
> - Eric
On 2/22/07, Gregory Shaw <Greg.Shaw at sun.com> wrote:
>
> I was thinking of something similar to a scrub. An ongoing process
> seemed too intrusive. I'd envisioned a cron job, similar to a scrub (or
> defrag), that could be run periodically to show any differences between
> disk performance over time.
...
> I thought this would be very difficult to determine, as a slow disk
> could be a transient problem.
>
> Me, I like tools that give me information I can work with. Fully
> automated systems always seem to cause more problems than they solve.

If the stats are publishable, then something like Cacti or any
monitoring tool should give most admins enough to spot potential issues.

Nicholas
All,
    I think dtrace could be a viable option here: a cron job that runs a
dtrace script on a regular basis, times a series of reads, and then
provides that info to Cacti or rrdtool. It's not quite the
one-size-fits-all that the OP was looking for, but if you want trends,
this should get 'em.

$0.02

Regards,
TJ Easter

On 2/21/07, Nicholas Lee <emptysands at gmail.com> wrote:
> On 2/22/07, Gregory Shaw <Greg.Shaw at sun.com> wrote:
> >
> > I was thinking of something similar to a scrub. An ongoing process
> > seemed too intrusive. I'd envisioned a cron job, similar to a scrub
> > (or defrag), that could be run periodically to show any differences
> > between disk performance over time.
> ...
> > I thought this would be very difficult to determine, as a slow disk
> > could be a transient problem.
> >
> > Me, I like tools that give me information I can work with. Fully
> > automated systems always seem to cause more problems than they solve.
>
> If the stats are publishable, then something like Cacti or any
> monitoring tool should give most admins enough to spot potential issues.
>
> Nicholas

--
"Being a humanist means trying to behave decently without expectation
of rewards or punishment after you are dead." -- Kurt Vonnegut
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x31185D8E
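For what it's worth, a rough sketch of the kind of script TJ describes
might look like the following. It is untested, and the 60-second interval
and output format are arbitrary assumptions; it simply uses the io
provider to keep average and worst-case read service times per device,
in a form a wrapper could append to a log for rrdtool or Cacti:

    #!/usr/sbin/dtrace -s
    /* Sketch: per-device read latency, printed every 60s for trending. */
    #pragma D option quiet

    io:::start
    /args[0]->b_flags & B_READ/
    {
            /* remember when each read was issued */
            start[args[0]->b_edev, args[0]->b_blkno] = timestamp;
    }

    io:::done
    /start[args[0]->b_edev, args[0]->b_blkno]/
    {
            this->delta = timestamp - start[args[0]->b_edev, args[0]->b_blkno];
            @avg[args[1]->dev_statname] = avg(this->delta);
            @peak[args[1]->dev_statname] = max(this->delta);
            start[args[0]->b_edev, args[0]->b_blkno] = 0;
    }

    tick-60sec
    {
            /* one line per device: average and worst read time, in ns */
            printa("%s avg=%@d max=%@d\n", @avg, @peak);
            trunc(@avg);
            trunc(@peak);
    }

Running it from cron for a bounded window (e.g. "dtrace -s readlat.d
-c 'sleep 300'", where readlat.d is whatever you name the script) and
graphing the per-device averages over weeks would give the kind of trend
data being discussed here.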
Correct me if I'm wrong, but FMA seems like a more appropriate tool to
track disk errors.

--
Just me,
Wire ...

On 2/22/07, TJ Easter <tjeaster at gmail.com> wrote:
> All,
>     I think dtrace could be a viable option here: a cron job that runs
> a dtrace script on a regular basis, times a series of reads, and then
> provides that info to Cacti or rrdtool. It's not quite the
> one-size-fits-all that the OP was looking for, but if you want trends,
> this should get 'em.
>
> $0.02
>
> Regards,
> TJ Easter
Richard Elling <Richard.Elling at Sun.COM> wrote:
> > If a disk fitness test were available to verify disk read/write and
> > performance, future drive problems could be avoided.
> >
> > Some example tests:
> > - full disk read
> > - 8kb r/w iops
> > - 1mb r/w iops
> > - raw throughput
>
> Some problems can be seen by doing a simple sequential read and comparing
> it to historical data. It depends on the failure mode, though.

Something that people often forget about is the bearings. Sometimes the
disk writes too early, assuming that the head is already on track; the
worn-out bearing, however, causes a track-following problem....

For this reason, you need to run a random write test on old disks.
sformat includes such a test...

Jörg

--
 EMail: joerg at schily.isdn.cs.tu-berlin.de (home)  Jörg Schilling  D-13353 Berlin
        js at cs.tu-berlin.de                (uni)
        schilling at fokus.fraunhofer.de     (work)  Blog: http://schily.blogspot.com/
 URL:   http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
On Wed, Feb 21, 2007 at 04:20:58PM -0800, Eric Schrock wrote:
> Seems like there are two pieces you're suggesting here:
>
> 1. Some sort of background process to proactively find errors on disks
>    in use by ZFS. This will be accomplished by a background scrubbing
>    option, dependent on the block-rewriting work Matt and Mark are
>    working on. This will allow something like "zpool set scrub=2weeks",
>    which will tell ZFS to "scrub my data at an interval such that all
>    data is touched over a 2 week period". This will test reading from
>    every block and verifying checksums. Stressing write failures is a
>    little more difficult.

I got the impression that testing free disk space was also desired.

> 2. Distinguish "slow" drives from "normal" drives and proactively mark
>    them faulted. This shouldn't require an explicit "zpool dft", as
>    we should be watching the response times of the various drives and
>    keeping this as a statistic. We want to incorporate this information
>    to allow better allocation amongst slower and faster drives.
>    Determining that a drive is "abnormally slow" is much more difficult,
>    though it could theoretically be done if we had some basis - either
>    historical performance for the same drive or comparison to identical
>    drives (manufacturer/model) within the pool. While we've thought
>    about these same issues, there is currently no active effort to keep
>    track of these statistics or do anything with them.

I would imagine that "slow" as in "long average seek times" should be
relatively easy to detect, whereas "slow" as in "low bandwidth" might be
harder (since I/O bandwidth might depend on characteristics of the
device path and how saturated it is).

Are long average seek times an indication of trouble?

Nico
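On the seek-time side of this, the io provider doesn't expose seek times
directly, but a crude proxy is the distance between consecutive requests
sent to each device. A rough, untested sketch (not an existing tool) that
quantizes that distance per device is below; it won't say how long the
seeks took, but it would show whether a drive that looks slow is simply
being asked to seek more than its peers:

    #!/usr/sbin/dtrace -s
    /* Sketch: per-device distribution of distance between consecutive I/Os. */
    #pragma D option quiet

    io:::start
    /last[args[1]->dev_statname] != 0/
    {
            /* absolute distance, in blocks, from the previous request */
            this->dist = args[0]->b_blkno > last[args[1]->dev_statname] ?
                args[0]->b_blkno - last[args[1]->dev_statname] :
                last[args[1]->dev_statname] - args[0]->b_blkno;
            @seek[args[1]->dev_statname] = quantize(this->dist);
    }

    io:::start
    {
            last[args[1]->dev_statname] = args[0]->b_blkno;
    }

    END
    {
            printa(@seek);
    }

Comparing the resulting distributions against per-device latency numbers
would at least separate "slow because it is seeking a lot" from "slow for
no visible reason".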
Eric Schrock wrote:
> 1. Some sort of background process to proactively find errors on disks
>    in use by ZFS. This will be accomplished by a background scrubbing
>    option, dependent on the block-rewriting work Matt and Mark are
>    working on. This will allow something like "zpool set scrub=2weeks",
>    which will tell ZFS to "scrub my data at an interval such that all
>    data is touched over a 2 week period".

Obviously, scrubbing and correcting "hard" errors that result in ZFS
checksum errors is very beneficial. However, it won't address the case
of "soft" errors, where the disk returns correct data but observes some
problems reading it. There are at least two good reasons to pay
attention to these "soft" errors:

a) Preemptive detection and rewriting of partially defective but
   still correctable sectors may prevent future data loss. Thus, it
   improves the perceived reliability of disk drives, which is
   especially important in the JBOD case (including a single-drive JBOD).

b) It is not uncommon for such successful reads of partially defective
   media to happen only after several retries. It is somewhat unfortunate
   that there is no simple way to tell the drive how many times to retry.
   Firmware in ATA/SATA drives, used predominantly in single-disk PCs,
   will typically make a heroic effort to retrieve the data. It will
   make numerous attempts to reposition the actuator, recalibrate the
   head current, etc. This can take up to 20-40 seconds! Such a strategy
   is reasonable for a desktop PC, but if it happens in a busy enterprise
   file server it results in a temporary availability loss (the drive
   freezes for up to 20-40 seconds every time you try to read that
   sector). This strategy also makes no sense if the RAID group in which
   the drive participates has redundant data elsewhere, which is why
   SCSI/FC drives give up after a few retries.

One can detect (and repair) such problematic areas on disk by monitoring
the SMART counters during scrubbing, and/or by monitoring physical read
timings (looking for abnormally slow ones).

-- Olaf
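As a crude illustration of the read-timing approach Olaf mentions, a
dtrace sketch along the following lines (untested, and the one-second
threshold is an arbitrary assumption) would log any physical read whose
service time looks like one of those multi-second retry episodes:

    #!/usr/sbin/dtrace -s
    /* Sketch: log physical reads that take longer than 1 second. */
    #pragma D option quiet

    io:::start
    /args[0]->b_flags & B_READ/
    {
            rstart[args[0]->b_edev, args[0]->b_blkno] = timestamp;
    }

    io:::done
    /rstart[args[0]->b_edev, args[0]->b_blkno] &&
        (timestamp - rstart[args[0]->b_edev, args[0]->b_blkno]) > 1000000000/
    {
            printf("%Y %s: block %d took %d ms\n", walltimestamp,
                args[1]->dev_statname, args[0]->b_blkno,
                (timestamp - rstart[args[0]->b_edev, args[0]->b_blkno]) / 1000000);
    }

    io:::done
    {
            /* clean up the bookkeeping for every completed I/O */
            rstart[args[0]->b_edev, args[0]->b_blkno] = 0;
    }

A sector that consistently shows up here but still returns good data is
exactly the kind of "soft" error that a checksum-only scrub would never
flag.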
On Thu, Feb 22, 2007 at 10:45:04AM -0800, Olaf Manczak wrote:
>
> Obviously, scrubbing and correcting "hard" errors that result in ZFS
> checksum errors is very beneficial. However, it won't address the case
> of "soft" errors, where the disk returns correct data but observes some
> problems reading it. There are at least two good reasons to pay
> attention to these "soft" errors:
>
> a) Preemptive detection and rewriting of partially defective but
>    still correctable sectors may prevent future data loss. Thus, it
>    improves the perceived reliability of disk drives, which is
>    especially important in the JBOD case (including a single-drive JBOD).

These types of soft errors will be logged, managed, and (eventually)
diagnosed by the SCSI FMA work currently in development. If the SCSI DE
diagnoses a disk as faulty, then the ZFS agent will be able to respond
appropriately.

> b) It is not uncommon for such successful reads of partially defective
>    media to happen only after several retries. It is somewhat unfortunate
>    that there is no simple way to tell the drive how many times to retry.
>    Firmware in ATA/SATA drives, used predominantly in single-disk PCs,
>    will typically make a heroic effort to retrieve the data. It will
>    make numerous attempts to reposition the actuator, recalibrate the
>    head current, etc. This can take up to 20-40 seconds! Such a strategy
>    is reasonable for a desktop PC, but if it happens in a busy enterprise
>    file server it results in a temporary availability loss (the drive
>    freezes for up to 20-40 seconds every time you try to read that
>    sector). This strategy also makes no sense if the RAID group in which
>    the drive participates has redundant data elsewhere, which is why
>    SCSI/FC drives give up after a few retries.
>
> One can detect (and repair) such problematic areas on disk by monitoring
> the SMART counters during scrubbing, and/or by monitoring physical read
> timings (looking for abnormally slow ones).

Solaris currently has a disk monitoring FMA module that is specific to
Thumper (x4500) and monitors only the most basic information (overtemp,
self-test failure, predictive failure). I have separated this out into a
common FMA transport module which will bring this functionality to all
platforms (though support for SCSI devices will depend on the
aforementioned SCSI FMA portfolio). This should be putback soon. Future
work could expand this beyond the simple indicators into more detailed
analysis of various counters.

All of this is really a common FMA problem, not a ZFS-specific one. All
that is needed in ZFS is an agent actively responding to external
diagnoses. I am laying the groundwork for this as part of the ongoing
ZFS/FMA work mentioned in other threads. For more information on ongoing
FMA work, I recommend visiting the FMA discussion forum.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
On Feb 22, 2007, at 11:55 AM, Eric Schrock wrote:

[ ... ]

>> b) It is not uncommon for such successful reads of partially defective
>>    media to happen only after several retries. It is somewhat unfortunate
>>    that there is no simple way to tell the drive how many times to retry.
>>    Firmware in ATA/SATA drives, used predominantly in single-disk PCs,
>>    will typically make a heroic effort to retrieve the data. It will
>>    make numerous attempts to reposition the actuator, recalibrate the
>>    head current, etc. This can take up to 20-40 seconds! Such a strategy
>>    is reasonable for a desktop PC, but if it happens in a busy enterprise
>>    file server it results in a temporary availability loss (the drive
>>    freezes for up to 20-40 seconds every time you try to read that
>>    sector). This strategy also makes no sense if the RAID group in which
>>    the drive participates has redundant data elsewhere, which is why
>>    SCSI/FC drives give up after a few retries.
>>
>> One can detect (and repair) such problematic areas on disk by monitoring
>> the SMART counters during scrubbing, and/or by monitoring physical read
>> timings (looking for abnormally slow ones).
>
> Solaris currently has a disk monitoring FMA module that is specific to
> Thumper (x4500) and monitors only the most basic information (overtemp,
> self-test failure, predictive failure). I have separated this out into a
> common FMA transport module which will bring this functionality to all
> platforms (though support for SCSI devices will depend on the
> aforementioned SCSI FMA portfolio). This should be putback soon. Future
> work could expand this beyond the simple indicators into more detailed
> analysis of various counters.
>
> All of this is really a common FMA problem, not a ZFS-specific one. All
> that is needed in ZFS is an agent actively responding to external
> diagnoses. I am laying the groundwork for this as part of the ongoing
> ZFS/FMA work mentioned in other threads. For more information on ongoing
> FMA work, I recommend visiting the FMA discussion forum.
>
> - Eric

I disagree. Originally, I asked for the following:

- Objective performance reporting in a simple-to-parse format (similar
  to scrub)
- The ability to schedule non-data-intrusive disk tests to verify disk
  performance.
- The ability to compare two similar disks for performance.

In the above, you've taken proactive capabilities and turned them into
failure mitigation, i.e., reactive capabilities. From the paper, the
problem isn't outright disk failure, but disk performance degradation.
I asked for the above to easily determine whether a disk is performing
similarly to others, or may be degrading.

The need for ZFS to do this is two-fold:

1. ZFS can write to the disk non-intrusively. Any subsystem outside of
   the native filesystem will be able to execute read tests only, which
   is only part of the analysis.

2. If the command is available at the zfs (or pool) level, it becomes an
   easy method of diagnosis. When you must "roll your own" via script or
   dtrace, the objectivity goes away and comparisons between systems
   become increasingly difficult.

My concern with moving this exclusively into FMA has to do with focus.
I've found that most fault mitigation systems concentrate on just that:
faults. Performance degradation isn't treated as a fault, and usually
falls out of any fault management system as a "we'd like to do that, but
we've got bigger things to do."