I'm a small-time sysadmin with big storage aspirations (I'll be honest
- for a planned MythTV back-end, and *ahem*, other storage), and I've
recently discovered ZFS. I'm thinking about putting together a
homebrew SAN with a NAS head, and am wondering if the following will
work (hoping the formatting will stick!):

SAN Box 1:
8-disk raid-z2 --> iSCSI over GbE >--+
                                     |
SAN Box 2:                           |    NAS Head:
8-disk raid-z2 --> iSCSI over GbE >--+--> N-volume zfs pool --> NFS/SMB
                                     |
SAN Box N:                           |
8-disk raid-z2 --> iSCSI over GbE >--+

In plain English: for each SAN box, combine 8 (or so) disks in a ZFS
raid-z2 pool, share the pool over GbE via iSCSI, then combine it with
the other (similar) SAN volumes in a non-redundant ZFS pool on the NAS
head, and work out the partitioning, quotas, etc. there.

I'm coming from a Linux software RAID point of view, and (to me) this
is kind of like a RAID 6+0 with an intermediate iSCSI connection.
Using 200GB hdds in the SAN boxes, I'm looking at (roughly) 3.6TB of
available storage for the 3-SAN-box setup described above.

I like the error-correction quality of ZFS; however, the ZFS
Administration Guide states: "A non-redundant pool configuration is
not recommended for production environments even if the single storage
object is presented from a hardware RAID array or from a software
volume manager. ZFS can only detect errors in these configurations."
Does this caveat still apply if the constituent volumes (aka storage
objects) are themselves redundant ZFS pools?

I assume the following:
- Hardware errors in the hdds will be detected and dealt with by the
  ZFS raid-z2 pools in the SAN boxes.
- Data added to the topmost ZFS pool will likely be allocated in an
  arbitrary fashion across its constituent volumes.
- Compression can be enabled per filesystem created on the topmost ZFS
  pool (on the NAS head).
- Volumes can be added to the topmost ZFS pool as SAN boxes are added
  to the network, increasing the overall capacity of the topmost pool.

Would I be better off at the top-level ZFS pool (on the NAS box) with
a simple RAID 0 instead of the non-redundant ZFS pool?

Please comment as you see fit; let me know if I'm making any
fundamentally inaccurate assumptions. I want to make sure that, when I
implement this, I'm doing the Right Thing.

Thanks in advance.

-Jim
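For reference, here is a rough sketch of the proposal in commands. The
pool names, zvol size, addresses, and device names are placeholders,
and shareiscsi is the old Solaris Express target mechanism -- adjust
for whatever target software you actually run:

    # On each SAN box: build the raid-z2 pool and export a zvol via iSCSI
    zpool create sanpool raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                                c1t4d0 c1t5d0 c1t6d0 c1t7d0
    zfs create -V 1t sanpool/export       # zvol sized to the usable space
    zfs set shareiscsi=on sanpool/export  # present the zvol as an iSCSI target

    # On the NAS head: discover the targets from each box
    iscsiadm modify discovery --sendtargets enable
    iscsiadm add discovery-address 192.168.1.11:3260   # SAN box 1
    iscsiadm add discovery-address 192.168.1.12:3260   # SAN box 2
    iscsiadm add discovery-address 192.168.1.13:3260   # SAN box 3
    devfsadm -i iscsi                     # create device nodes for the LUNs

    # Stripe the three iSCSI LUNs into one non-redundant top-level pool
    # (the real LUN device names will be much longer)
    zpool create tank c2t1d0 c3t1d0 c4t1d0
    zfs set sharenfs=on tank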
2007/9/23, James L Baker <ted.t.logan at gmail.com>:
> In plain English: for each SAN box, combine 8 (or so) disks in a ZFS
> raid-z2 pool, share the pool over GbE via iSCSI, then combine it with
> the other (similar) SAN volumes in a non-redundant ZFS pool on the
> NAS head, and work out the partitioning, quotas, etc. there.

It would probably be better to iSCSI-export the "raw" disks on the SAN
boxes to the NAS head, and let the NAS head do the raidz2. That will
make it easier to move disks between computers if you have to. Then
you will have a redundant ZFS pool on the NAS head without losing any
disk space. You could also do 3-way raidz, one disk from each box per
vdev, so that you can lose any one SAN box.
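A sketch of that variant, under the same caveats (placeholder names;
iscsitadm is the old Solaris iSCSI target daemon and needs its usual
base-directory setup first; the LUN device names seen on the head will
differ):

    # On each SAN box: export every raw disk as its own iSCSI target
    iscsitadm create target -b /dev/rdsk/c1t0d0s2 box1-disk0
    iscsitadm create target -b /dev/rdsk/c1t1d0s2 box1-disk1
    # ... and so on for the remaining disks

    # On the NAS head: group the LUNs so each raidz vdev takes exactly
    # one disk from each box -- a dead box then costs one disk per
    # vdev, which raidz survives
    zpool create tank \
        raidz c2t0d0 c3t0d0 c4t0d0 \
        raidz c2t1d0 c3t1d0 c4t1d0 \
        raidz c2t2d0 c3t2d0 c4t2d0
    # ... one 3-way raidz per disk position: 8 vdevs for 8-disk boxes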
James L Baker wrote:
> I'm a small-time sysadmin with big storage aspirations [...] I'm
> thinking about putting together a homebrew SAN with a NAS head, and
> am wondering if the following will work [...]

Let's draw the data flow.

disk -- cpu/mem -- nic -- wire -- nic -- cpu/mem -- nic -- wire -- nic -- cpu/mem

Isn't that a lot more complicated than:

disk -- cpu/mem -- nic -- wire -- nic -- cpu/mem

Now let's look at data context.

disk -- file system -- application

The application has the highest understanding of the context of the
data. The application knows what is good data and (hopefully) what is
bad data. The further you get from the application, the less context
you have, and therefore you lose the ability to discern good data from
bad. To compensate, you make policies designed to meet the goals of
defined interfaces. An application may expect the file system to
behave to the POSIX interface. A file system may expect the disk to
look like a block device. Any time you can improve the context of the
data between the layers, you can improve the reliability of the
application.

Hence, ZFS improves its knowledge of disks and their characteristics
by implementing its own I/O, allocation, and redundancy policies on
the disk. There are precious few applications which take a similar
approach -- today most of that activity is in the web application
server space (e.g. Glassfish).

So, it follows that if the application only provides a file system
interface, then the better file system design is the one with tighter
integration to the disks, like ZFS. Putting another layer of stuff
between ZFS and the disks only makes sense if the value of that stuff
outweighs the cost of the loss of data context. Given the proposed
design, and the lack of system requirements, I do not see how adding
the (iSCSI) stuff provides any value.
 -- richard
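For comparison, the direct-attach design this argument points to would
collapse the whole thing onto one box, with ZFS talking straight to
the disks (pool and dataset names below are placeholders):

    # One box, locally attached disks, no iSCSI layer in between
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                             c1t4d0 c1t5d0 c1t6d0 c1t7d0
    zfs create tank/media
    zfs set compression=on tank/media
    zfs set sharenfs=on tank/media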
I think I can offer a straightforward explanation for the following:

> I like the error-correction quality of ZFS; however, the ZFS
> Administration Guide states: "A non-redundant pool configuration is
> not recommended for production environments even if the single storage
> object is presented from a hardware RAID array or from a software
> volume manager. ZFS can only detect errors in these configurations."
> Does this caveat still apply if the constituent volumes (aka storage
> objects) are themselves redundant ZFS pools?

Let's say you have disk targets from two target devices -- they could
be disk arrays attached through fibre, SCSI, etc. -- that present, say,
RAID 5 targets of some sort. If you simply stripe those together with
ZFS, it can detect but not fix errors, since it has no ZFS-redundant
data to work from. That is identical to a single disk running ZFS with
single copies. However, if you mirrored the RAID 5 targets from the
two arrays with ZFS, you would now have ZFS-level redundancy and hence
reconstruction capability.

If you construct what you propose, the zvols on each of the 3 boxes
would have reconstruction ability locally, since that can be done at a
block level, but the final NAS-head ZFS pool would have no redundancy
or ability to reconstruct data at that level. Richard is right that
you would be losing some advantage of ZFS at the high layer, but as
long as none of the individual raid-z2s ever fails (by losing more
than 2 disks), your high-level stripe would remain intact.
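The distinction in a sketch ("lun1" and "lun2" stand in for whatever
device names the two array LUNs get on the head):

    # Stripe of two RAID 5 LUNs: checksums detect corruption, but
    # there is no second copy to repair from
    zpool create tank lun1 lun2

    # Mirror of the same two LUNs: half the capacity, but a scrub can
    # rewrite a bad block from the good side
    zpool create tank mirror lun1 lun2
    zpool scrub tank
    zpool status -v tank   # shows checksum error counts after the scrub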