I might have mentioned this on the list already and can't find it now,
or I might have misread something and come up with this ...

Right now, using hot spares is a typical method to increase storage
pool resiliency, since it minimizes the time that an array is
degraded. The downside is that drives assigned as hot spares are
essentially wasted. They take up space & power but don't provide
usable storage.

Depending on the number of spares you've assigned, you could have 7%
of your purchased capacity sitting idle, assuming 1 spare per 14-disk
shelf. This is on top of the RAID6 / raidz[1-3] overhead.

What about using the free space in the pool to cover for the failed
drive?

With bp rewrite, would it be possible to rebuild the vdev from parity
and simultaneously rewrite those blocks to a healthy device? In other
words, when there is free space, remove the failed device from the
zpool, resizing (shrinking) it on the fly and restoring full parity
protection for your data. If online shrinking doesn't work, create a
phantom file that accounts for all the space lost by the removal of
the device until an export / import.

It's not something I'd want to do with less than raidz2 protection,
and I imagine that replacing the failed device and expanding the
stripe width back to the original would have some negative performance
implications that would not occur otherwise. I also imagine it would
take a lot longer to rebuild / resilver at both device failure and
device replacement. You wouldn't be able to share a spare among many
vdevs either, though you wouldn't always need to if you leave some
space free on the zpool.

Provided that bp rewrite is committed, and vdev & zpool shrinks are
functional, could this work? It seems like a feature most applicable
to SOHO users, but I'm sure some enterprise users could find an
application for nearline storage where available space trumps
performance.

-B

--
Brandon High : bhigh at freaks.com
Always try to do things in chronological order; it's less confusing
that way.
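For concreteness, the overhead figures above work out as follows. This
is a back-of-the-envelope sketch in sh; the 13-wide raidz2 data vdev is
an assumption for illustration, not something stated in the post:

    # 14-disk shelf: 1 hot spare + one 13-disk raidz2 vdev (assumed layout)
    DISKS=14; SPARES=1; PARITY=2
    echo "scale=3; $SPARES / $DISKS" | bc              # spare overhead: .071 (~7%)
    echo "scale=3; ($SPARES + $PARITY) / $DISKS" | bc  # spare + parity: .214 (~21%)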
On Wed, Sep 30, 2009 at 7:06 PM, Brandon High <bhigh at freaks.com> wrote:

> What about using the free space in the pool to cover for the failed
> drive?
[snip]

What are you hoping to accomplish? You're still going to need a drive's
worth of free space, and if you're so performance strapped that one
drive makes the difference, you've got some bigger problems on your
hands. To me it sounds like complexity for complexity's sake, and it
leaves you with a far less flexible option in the face of a drive
failure.

BTW, you shouldn't need one disk per tray of 14 disks. Unless you've
got some known bad disks or environmental issues, one every 2-3 trays
should be fine. Quite frankly, if you're doing raid-z3, I'd feel
comfortable with one per thumper.

--Tim
Brandon High wrote:

> What about using the free space in the pool to cover for the failed
> drive?
[snip]

What you describe makes no sense for single-parity vdevs, since it
actually increases the likelihood of data loss. In multi-parity vdevs,
even with the loss of one drive, you still have parity protection, so
why go to all that extra effort when it gains you so little?

From a global perspective, multi-disk parity (e.g. raidz2 or raidz3) is
the way to go instead of hot spares. Hot spares are useful for adding
protection to a number of vdevs, not a single vdev.

--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Brandon,

Yes, this is something that should be possible once we have bp rewrite
(the ability to move blocks around).

One minor downside to "hot space" would be that it couldn't be shared
among multiple pools the way that hot spares can.

Also, depending on the pool configuration, hot space may be impractical
-- for example, if you are using wide RAIDZ[-N] stripes. If you have,
say, 4 top-level RAIDZ-2 vdevs each with 10 disks in it, you would have
to keep your pool at most 3/4 full to be able to take advantage of hot
space. And if you wanted to tolerate any 2 disks failing, the pool
could be at most 1/2 full. (Although one could imagine eventually
recombining some of the remaining 18 good disks to make another RAIDZ
group.) So I imagine that with this implementation at least (remove
faulted top-level vdev), Hot Space would only be practical when using
mirroring.

That said, once we have (top-level) device removal implemented, you
could implement a poor-man's hot space with some simple scripts -- just
remove the degraded top-level vdev from the pool.

FYI, I am currently working on bp rewrite for device removal.

--matt
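A sketch of what such a poor-man's hot space script might look like,
assuming a future zpool remove that works on top-level vdevs (at the
time of this thread it only handles hot spares and cache devices); the
pool and vdev names are placeholders:

    #!/bin/sh
    # Evacuate a degraded top-level vdev instead of resilvering onto a
    # hot spare. ASSUMES bp rewrite / top-level device removal exists,
    # which it does not yet.
    POOL=tank
    VDEV=raidz2-1            # degraded top-level vdev per 'zpool status'

    # Headroom rule of thumb from above: with N equal top-level vdevs,
    # the pool must be under (N-1)/N full to absorb one vdev (3/4 for N=4).
    zpool list "$POOL"       # eyeball that free space exceeds the vdev's data

    # Migrate the vdev's blocks onto the healthy vdevs and shrink the
    # pool, restoring full parity protection without a spare disk.
    zpool remove "$POOL" "$VDEV"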
Erik Trimble wrote:

> From a global perspective, multi-disk parity (e.g. raidz2 or raidz3)
> is the way to go instead of hot spares.
> Hot spares are useful for adding protection to a number of vdevs, not
> a single vdev.

Even when using raidz2 or 3, it is useful to have hot spares so that
reconstruction can begin immediately. Otherwise it would have to wait
for the operator to physically remove the failed disk and insert a new
one.

--matt
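For completeness, the spare behavior Matthew describes needs no
scripting today; a minimal sketch (pool and device names are
placeholders):

    # Add a hot spare; the fault-management agent activates it
    # automatically when a disk faults, so resilvering starts without
    # waiting for an operator.
    zpool add tank spare c5t0d0
    # Optionally, auto-replace a new disk inserted into the failed
    # disk's slot; the spare then returns to the spare pool.
    zpool set autoreplace=on tank
    zpool status -x tank     # check pool health / resilver progress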
On Sep 30, 2009, at 6:03 PM, Matthew Ahrens wrote:

> Erik Trimble wrote:
>> From a global perspective, multi-disk parity (e.g. raidz2 or
>> raidz3) is the way to go instead of hot spares.
>> Hot spares are useful for adding protection to a number of vdevs,
>> not a single vdev.
>
> Even when using raidz2 or 3, it is useful to have hot spares so that
> reconstruction can begin immediately. Otherwise it would have to
> wait for the operator to physically remove the failed disk and
> insert a new one.

When I model these things, I use 8 hours logistical response time for
data centers and 48 hours for SOHO. When the disks were small, and thus
resilver times were short, the logistical response time could make a
big impact. With 2+ TB drives, the resilver time is becoming dominant.
As disks become larger but not faster, there will be a day when the
logistical response time will become insignificant. In other words, you
won't need a spare to improve logistical response, but you can consider
using spares to extend logistical response time to months. To take this
argument to its limit, it is possible that in our lifetime RAID boxes
will be disposable... the razor industry will be proud of us ;-)
 -- richard
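For a sense of scale, a best-case resilver estimate (illustrative
numbers only; a resilver on a busy, fragmented pool is random-I/O bound
and can run many times slower than this):

    # Sequential lower bound for resilvering a 2 TB disk at 100 MB/s.
    SIZE_GB=2000; MBPS=100
    echo "scale=2; $SIZE_GB * 1024 / $MBPS / 3600" | bc   # -> 5.68 hours

At realistic resilver rates that stretches into days, which is when the
8- or 48-hour logistical response time stops being the dominant term.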
> Yes, this is something that should be possible once we have bp
> rewrite (the ability to move blocks around).
[snip]
> FYI, I am currently working on bp rewrite for device removal.
>
> --matt

That's very cool. I don't code (much/enough to help), but I'd like to
help if I can. If nothing else, my wife makes a mean chocolate chip
cookie! Think a batch of those would help?

Paul Archer
On Wed, 30 Sep 2009, Richard Elling wrote:

> a big impact. With 2+ TB drives, the resilver time is becoming
> dominant. As disks become larger but not faster, there will be a day
> when the logistical response time will become insignificant. In other
> words, you won't need a spare to improve logistical response, but you
> can consider using spares to extend logistical response time to
> months. To take this argument to its limit, it is possible that in
> our lifetime RAID boxes will be disposable... the razor industry will
> be proud of us ;-)

Unless there is a dramatic increase in disk bandwidth, there is a point
where disk storage size becomes unmanageable. That is the point where
we should transition from 3-1/2" disks to 2-1/2" disks with smaller
storage sizes. I see that 2-1/2" disks are already up to 500GB.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Replying to a few folks in a digest format, because I'm lazy and don't
have that much to say.

On Wed, Sep 30, 2009 at 5:53 PM, Tim Cook <tim at cook.ms> wrote:
> What are you hoping to accomplish? You're still going to need a
> drive's worth of free space, and if you're so performance strapped
> that one drive makes the difference, you've got some bigger problems
> on your hands.

As I mentioned, the biggest win would be in a SOHO environment where
you may have bought more space than you need right now and can use it
in the meantime for a wider stripe. Don't think about it as a high
performance filesystem; think about it in the context of a Drobo-like
device. It would provide you with protection from 2 drives failing
(since it would require double parity), or even better if there's free
space available.

> BTW, you shouldn't need one disk per tray of 14 disks. Unless you've
> got [snip]

We use one spare per shelf on our current NetApp hardware. No real
reason other than it makes the provisioning more consistent. Given the
large number of filers that we have, consistency is important.

On Wed, Sep 30, 2009 at 5:56 PM, Erik Trimble <Erik.Trimble at sun.com> wrote:
> What you describe makes no sense for single-parity vdevs, since it
> actually [snip]

I'm pretty sure I said that I wouldn't recommend it for anything less
than raidz2. As far as gains? It would get you out of degraded mode,
which can help performance. This may not be important though, since I
believe raidz2 with a single faulted device doesn't have much of an
impact.

On Wed, Sep 30, 2009 at 6:01 PM, Matthew Ahrens <Matthew.Ahrens at sun.com> wrote:
> ability to move blocks around). One minor downside to "hot space"
> would be that it couldn't be shared among multiple pools the way that
> hot spares can.

Why not? If you have the space available in the zpool, you should be
able to move the data to other vdevs and shrink the degraded one.
Unless bp rewrite doesn't allow data to move between vdevs, that is.

-B
--
Brandon High : bhigh at freaks.com