Blue Thunder Somogyi wrote:
> I'm curious if there has been any discussion of or work done
> toward implementing storage classing within zpools (this would be similar to
> the Storage Foundation QoSS feature).
There has been some discussion. AFAIK, there is no significant work
in progress. This problem is far more complex to solve than it may
first appear.
> I've searched the forum and inspected the documentation looking
> for a means to do this, and haven't found anything, so pardon the post
> if this is redundant/superfluous.
>
> I would imagine this would require something along the lines of:
> a) the ability to categorize devices in a zpool with their "class of
> storage", perhaps a numeric rating or otherwise, with the idea that the
> fastest disks get a "1" and the slowest get a "9" (or
> whatever the largest number of supported tiers would be)
This gets more complicated when devices are very asymmetric in performance.
For a current example, consider an NVRAM-backed RAID array. Writes tend to
complete very quickly, regardless of the offset. But reads can vary widely,
and may be an order of magnitude slower. However, this is not consistent,
because many of these arrays also cache reads (much like JBOD track-buffer
caches).
Today, there are some devices which may demonstrate 2 or more orders of
magnitude difference between read and write latency.
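To illustrate why a single 1-through-9 number falls short: once read and write
latency diverge this much, the classification has to carry at least separate
read and write ratings plus some notion of array-side caching. A rough sketch,
in plain Python and purely for illustration (every name in it is invented, none
of this exists in ZFS):

    # Hypothetical sketch, not ZFS code: a "class of storage" needs at least
    # separate read and write ratings once the device is this asymmetric.
    from dataclasses import dataclass

    @dataclass
    class VdevClass:
        name: str
        write_tier: int   # 1 = fastest ... 9 = slowest
        read_tier: int    # can differ from write_tier by orders of magnitude
        read_cache: bool  # array-side read cache makes read latency unpredictable

    def effective_read_tier(v: VdevClass, cache_hit: bool) -> int:
        """Read behavior depends on whether the array's cache absorbs the read."""
        return v.write_tier if (v.read_cache and cache_hit) else v.read_tier

    # An NVRAM-backed array: writes land in NVRAM (fast), uncached reads hit disk.
    nvram_array = VdevClass("nvram-raid", write_tier=1, read_tier=5, read_cache=True)
    jbod = VdevClass("sata-jbod", write_tier=6, read_tier=6, read_cache=False)

    for cache_hit in (True, False):
        print(nvram_array.name, "cache_hit=%s" % cache_hit,
              "-> read tier", effective_read_tier(nvram_array, cache_hit))
    print(jbod.name, "-> read tier", effective_read_tier(jbod, cache_hit=False))

No single digit describes a device whose writes behave like tier 1 while its
reads wander between tier 1 and tier 5 depending on cache state.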
> b) leveraging the copy-on-write nature of ZFS, when data is modified, the
> new copy would be sent to the devices that were appropriate given statistical
> information regarding that data's access/modification frequency. Not
> being familiar with ZFS internals, I don't know if there would be a way
> of taking advantage of the ARC knowledge of access frequency.
I think the data is there. This gets further complicated when a vdev shares
a resource with another vdev. A shared resource may not be visible to Solaris
at all (for example, two LUNs carved from the same array share spindles and
cache that Solaris cannot see), so any placement policy Solaris builds could
easily rest on incorrect assumptions about the real resource constraints.
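If that access-frequency knowledge were exposed, the placement decision itself
would be the easy part. A rough sketch, again hypothetical Python rather than
ZFS internals (choose_tier, the threshold, and the pool names are all invented):

    # Hypothetical sketch, not ZFS internals. It assumes a per-block access
    # count is available (roughly the knowledge the ARC already has); every
    # name and threshold here is invented for illustration.

    def choose_tier(access_count: int, hot_threshold: int = 100) -> str:
        """On copy-on-write, decide where the new copy of a block should land."""
        return "fast" if access_count >= hot_threshold else "slow"

    def place_block(block_id: int, access_counts: dict, pools: dict) -> str:
        tier = choose_tier(access_counts.get(block_id, 0))
        # The catch noted above: the "fast" and "slow" vdevs may secretly share
        # spindles or cache inside an array, so this policy can be confidently wrong.
        return "block %d -> %s (%s tier)" % (block_id, pools[tier], tier)

    pools = {"fast": "mirror-15k", "slow": "raidz2-sata"}
    access_counts = {1: 500, 2: 3}
    print(place_block(1, access_counts, pools))   # hot block -> fast tier
    print(place_block(2, access_counts, pools))   # cold block -> slow tier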
> c) It seems to me there would need to be some trawling of the storage tiers
> (probably only the fastest, as the COW migration of frequently accessed data
> to fast disk would not have an analogously inherent mechanism to move idle
> data down a tier) to locate data that is gathering cobwebs and stage it down
> to an appropriate tier. Obviously it would be nice to have as much data as
> possible on the fastest disks, while leaving all the free space on the dog
> disks, but you would also want to avoid any "write twice" behavior (not
> enough space on the appropriate tier, so data is staged to a slower tier and
> later migrated up to the faster disk) due to the fastest tier being overfull.
When I follow this logical progression, I arrive at SAM-FS. Perhaps it is
better to hook ZFS into SAM-FS?
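The scan described in (c) would look roughly like the sketch below, which is
hypothetical Python, neither ZFS nor SAM-FS, with made-up thresholds. It
demotes idle data and refuses to promote into a nearly full fast tier, which is
exactly the policy-engine territory SAM-FS already occupies:

    # Hypothetical sketch of the policy engine described in (c); this is
    # neither ZFS nor SAM-FS, and the thresholds are made up.

    import time

    FAST_CAPACITY = 100          # arbitrary capacity units
    IDLE_SECONDS = 30 * 86400    # "gathering cobwebs" after 30 idle days

    def demotion_scan(objects, now, fast_used):
        """Walk the fast tier and pick idle objects to stage down a tier."""
        demote = []
        for obj in objects:
            if obj["tier"] == "fast" and now - obj["last_access"] > IDLE_SECONDS:
                demote.append(obj["name"])
                fast_used -= obj["size"]
        return demote, fast_used

    def promote_ok(size, fast_used, high_water=0.8):
        """Refuse to migrate up when the fast tier is near full, to avoid the
        write-twice behavior of landing low and immediately moving up."""
        return fast_used + size <= FAST_CAPACITY * high_water

    now = time.time()
    objects = [
        {"name": "hot.db",  "tier": "fast", "size": 10, "last_access": now},
        {"name": "old.log", "tier": "fast", "size": 40,
         "last_access": now - 90 * 86400},
    ]
    demote, fast_used = demotion_scan(objects, now, fast_used=50)
    print("stage down:", demote, "| fast tier now holds", fast_used)
    print("room to promote 30 more units?", promote_ok(30, fast_used))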
> While zpools are great for dealing with large volumes of data with
> integrity and minimal management overhead, I've remained concerned
> about the inability to control where data lives when using different types
> of storage, e.g. a mix of FC and SATA disk in the extreme, mirror vs. RAID-Z2,
> or something as subtle as high-RPM small spindles vs. low-RPM large spindles.
There is no real difference in performance attributable to the interface
itself, FC vs. SATA; the drive mechanics behind the interface are what matter.
So it would be a bad idea to base a policy on the interface type.
> For instance, if you had a database that you know has 100GB of dynamic data
> and 900GB of more stable data, with the above capabilities you could allocate
> the appropriate ratio of FC and SATA disk and be confident that the data
> would naturally migrate to its appropriate underlying storage. Of course
> there are ways of using multiple zpools with the different storage types and
> tablespaces to locate the data onto the appropriate zpool, but this
> undermines the "minimal management" appeal of ZFS.
The people who tend to really care about performance will do what is needed
to get performance, and that doesn't include intentionally using slow
devices.
Perhaps you are thinking of a different market demographic?
> Anyhow, just curious if this concept has come up before and if there are
> any plans around it (or something similar).
Statistically, it is hard to beat stochastically spreading wide and far.
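For contrast, a toy model of the spreading ZFS already relies on: allocations
land on every top-level vdev, biased toward free space. This is only an
illustration, not the real metaslab allocator:

    # Toy illustration of spreading allocations wide: pick a top-level vdev at
    # random, weighted by free space. Not the real ZFS metaslab allocator.

    import random

    def pick_vdev(vdevs):
        """vdevs: list of (name, free_units); weighted-random choice by free space."""
        total = sum(free for _, free in vdevs)
        r = random.uniform(0, total)
        for name, free in vdevs:
            r -= free
            if r <= 0:
                return name
        return vdevs[-1][0]

    vdevs = [("mirror-0", 400), ("mirror-1", 300), ("raidz2-0", 900)]
    counts = {name: 0 for name, _ in vdevs}
    for _ in range(10000):
        counts[pick_vdev(vdevs)] += 1
    print(counts)   # every vdev gets writes, biased toward the emptier ones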
-- richard