This morning, as I was reading USENIX conference summaries suggesting that SATA/SAS may not be an optimum interface for SSDs, it came to mind that some out-of-the-box thinking is needed for hard drives as well.

Hard drive storage densities have been increasing dramatically, so that the latest SATA drives are measured in terabytes, just as they were measured in gigabytes some years ago. A problem with huge hard drives is that resilver times increase with drive size. Failure of a hard drive in the terabyte range leads to a long wait.

Hard drives are comprised of multiple platters, with typically an independently navigated head on each side. Due to a mix of hardware and firmware, these disparate platters and heads are exposed as a single logical linear device comprised of blocks. If one side of a platter, or a drive head, fails, then the whole drive fails. My understanding is that most drives stripe logical blocks across the various platters such that the lower block addresses are on the outer edge of the disks, to achieve the fastest I/O transfer rate. This approach is great for large linear writes, but not so great for random I/O, for when data becomes spread across the disk, or for when the disk becomes almost full.

The thought I had this morning is that perhaps the firmware on the disk drive could be updated to present a logical disk drive for each drive head. Any bad-block management (if enabled) would be done using the same platter side. With this approach a single physical drive could appear as two, four, or eight logical drives. ZFS is really good at scheduling I/O across many drives. Provided that care is taken to ensure that redundant data is appropriately distributed, subdividing drives like this would allow ZFS to offer considerably improved performance, and the resilver time of a logical drive would be reduced since it is smaller. If a drive head fails, then that logical drive could be marked permanently out of service, but the whole drive would not need to be immediately consigned to the dumpster.

Does anyone have thoughts on the viability of this approach? Can existing drives be effectively subdivided like this by simply updating drive firmware?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
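To make the layouts Bob is contrasting concrete, here is a minimal C sketch of the conventional cylinder-major LBA mapping versus the proposed one-logical-drive-per-head (surface-major) mapping. The geometry constants are made up, and real drives use zoned recording (sectors per track varies with cylinder) plus defect remapping, so this is an idealization rather than how any actual firmware works.

/* Illustrative sketch only: maps an LBA to (cylinder, head, sector)
 * under two layouts. The constant geometry below is a deliberate
 * simplification; real drives vary sectors per track by zone. */
#include <stdio.h>

enum { HEADS = 8, CYLS = 100000, SPT = 1000 };  /* made-up geometry */

/* Conventional cylinder-major layout: LBAs fill a whole cylinder
 * (all heads) before the actuator steps inward. */
static void chs_cylinder_major(long lba, long *c, long *h, long *s)
{
    *c = lba / (HEADS * SPT);
    *h = (lba / SPT) % HEADS;
    *s = lba % SPT;
}

/* Proposed surface-major layout: LBAs fill one entire surface before
 * moving to the next head, so each head owns one contiguous LBA
 * range and could be exposed as its own logical drive. */
static void chs_surface_major(long lba, long *c, long *h, long *s)
{
    long per_surface = (long)CYLS * SPT;
    *h = lba / per_surface;
    *c = (lba % per_surface) / SPT;
    *s = lba % SPT;
}

int main(void)
{
    long c, h, s, lba = 123456789;
    chs_cylinder_major(lba, &c, &h, &s);
    printf("cylinder-major: lba %ld -> cyl %ld head %ld sec %ld\n",
           lba, c, h, s);
    chs_surface_major(lba, &c, &h, &s);
    printf("surface-major:  lba %ld -> cyl %ld head %ld sec %ld\n",
           lba, c, h, s);
    return 0;
}

Under the surface-major mapping each head owns one contiguous LBA range, which is what would let firmware present each head as a separate logical drive.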
On Fri, May 1 at 11:44, Bob Friesenhahn wrote:
> Hard drives are comprised of multiple platters, with typically an
> independently navigated head on each side.

This is a gap in your assumptions, I believe. The headstack is a single physical entity, so all heads move in unison to the same position on all surfaces at the same time. Additionally, hard drives typically have a single read/write channel, meaning only one head can be active at a time. With the position information embedded on the same surface that contains user data, nobody has come up with a practical design for doing multiple concurrent reads from different places. At least one vendor (Conner?) tried a 2-actuator disk drive, and it was a mechanical-resonance nightmare for the servo systems.

I think that what you're looking for, however, is already happening, with server farms moving to multiple 2.5" drives from the larger 3.5" drives. Even on SATA drives, with NCQ the rotational speed doesn't matter as much for overall throughput, so a growing number of server applications will be using traditional "laptop" form-factor devices to increase the spindle:capacity ratio without blowing out their space budget. SAS and SATA are both shipping greater and greater volumes of small-form-factor devices.

For the budget-minded, a 2U server with a bunch of mirrored-pair 2.5" laptop drives is a nice platform, since you can fit 8-12 spindles in that box. The storage per unit volume is basically identical; you just get 2-4x the spindle count.

--eric

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
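A back-of-the-envelope sketch of Eric's spindle-count point. Both the drive counts and the per-drive IOPS figures below are assumptions picked for illustration, not measurements; the point is only that random IOPS scales roughly with spindle count at similar total capacity.

/* Hypothetical 2U configurations with made-up per-drive numbers. */
#include <stdio.h>

int main(void)
{
    int spindles_35 = 6,  iops_per_35 = 120;  /* guessed 7200rpm 3.5" */
    int spindles_25 = 12, iops_per_25 = 100;  /* guessed 5400rpm 2.5" */

    printf("3.5\" box: %d spindles, ~%d random IOPS\n",
           spindles_35, spindles_35 * iops_per_35);
    printf("2.5\" box: %d spindles, ~%d random IOPS\n",
           spindles_25, spindles_25 * iops_per_25);
    return 0;
}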
>>>>> "edm" == Eric D Mudama <edmudama at bounceswoosh.org> writes:>> Hard drives are comprised of multiple platters, with typically >> an independently navigated head on each side. edm> This is a gap in your assumptions I believe. edm> The headstack is a single physical entity, so all heads move edm> in unison to the same position on all surfaces at the same edm> time. yes but AIUI switching heads requires resettling into the new track. The cylinders are not really cylindrical, just because of wear or temperature or whatever, so when switching heads the ``channel'''' has to use data from the head as part of a servo loop to settle on the other surface''s track. I guess the rules do keep changing though. edm> I think that what you''re looking for, however, is already edm> happening, with server farms moving to multiple 2.5" drives yeah but you''re reading him wrong. He is saying a failed drive may still be useful if you just avoid the one failed head. The problem currently is the LBA''s are laced through each cylinder, which is worth doing so that things like short-stroking make sense to reduce head movement. If you re-swizzled the LBA''s so that instead they filled each side of each platter in turn, like a dual-layer DVD, it wouldn''t change sequential throughput at all, and would have the benefit that ZFS''s existing tendency to put redundant metadata copies far apart in LBA would end up getting them on different heads, which actually *is* helpful given known failure modes tend to be head crashing, head falling off, u.s.w. I think the idea is doomed firstly because these days when a single head goes bad, the drive firmware, host adapter, driver, and even the zfs maintenance commands, all the way up the storage stack to the sysadmin''s keyboard, all shit their pants and become useless. You have to find the bad drive, remove it, then move on. Secondly I''m not sure I buy the USENIX claim that you can limp along less one head. The last failed drive I took apart, was indeed failed on just one head, but it had scraped all the rust off the platter (down to glass! it was really glass!), and the inside of the thing was filled with microscopic grey facepaint. It had slathered the air filtering pillow and coated all kinds of other surfaces. so...I would expect the other recording surfaces were not doing too well either, but I could be wrong. It does match experience, though, of drives going from partly-failed to completely-failed in a day or a week. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090501/62113d35/attachment.bin>
On Fri, 1 May 2009, Eric D. Mudama wrote:
> On Fri, May 1 at 11:44, Bob Friesenhahn wrote:
>> Hard drives are comprised of multiple platters, with typically an
>> independently navigated head on each side.
>
> This is a gap in your assumptions I believe.
>
> The headstack is a single physical entity, so all heads move in unison
> to the same position on all surfaces at the same time.

Ahhh, I see. That would explain why the idea has not been explored already. :-)

> I think that what you're looking for, however, is already happening,
> with server farms moving to multiple 2.5" drives from the larger 3.5"
> drives. Even on SATA drives, with NCQ the rotational speed doesn't

Yes. I was hoping to hasten things along with a firmware/software update rather than a forklift replacement.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
On Fri, May 1 at 14:19, Miles Nordin wrote:
> Secondly, I'm not sure I buy the USENIX claim that you can limp along
> less one head. The last failed drive I took apart was indeed failed
> on just one head, but it had scraped all the rust off the platter
> (down to glass! it was really glass!), and the inside of the thing
> was filled with microscopic grey facepaint. It had slathered the
> air-filtering pillow and coated all kinds of other surfaces. So I
> would expect the other recording surfaces were not doing too well
> either, but I could be wrong. It does match experience, though, of
> drives going from partly-failed to completely-failed in a day or a
> week.

Your point here is 100% accurate. Any physical damage inside the drive, even if initially constrained to a single head, quickly becomes a huge problem for everything inside the drive. Once you're looking for physically isolated heads and platters, you might as well just buy multiple smaller drives.

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org