I'm a small-time sysadmin with big storage aspirations (I'll be honest
- for a planned MythTV back-end, and *ahem*, other storage), and I've
recently discovered ZFS. I'm thinking about putting together a
homebrew SAN with a NAS head, and am wondering if the following will
work (hoping the formatting will stick!):

SAN Box 1:
8-disk raid-z2 --> iSCSI over GbE >--+
                                     |
SAN Box 2:                           |    NAS Head:
8-disk raid-z2 --> iSCSI over GbE >--+--> N-volume zfs pool --> NFS/SMB
                                     |
SAN Box N:                           |
8-disk raid-z2 --> iSCSI over GbE >--+

In plain English: for each SAN box, combine 8 (or so) disks in a ZFS
raid-z2 pool, share the pool over GbE via iSCSI, then combine it with
the other (similar) SAN volumes in a non-redundant ZFS pool on the NAS
head, and work out the partitioning, quotas, etc. there.

I'm coming from a Linux software RAID point of view, and (to me) this
is kind of like a RAID 6+0 with an intermediate iSCSI connection.
Using 200GB hdds in the SAN boxes, I'm looking at (roughly) 3.6TB of
available storage for the 3-SAN-box setup described above.

I like the error-correction quality of ZFS; however, the ZFS
Administration Guide states: "A non-redundant pool configuration is
not recommended for production environments even if the single storage
object is presented from a hardware RAID array or from a software
volume manager. ZFS can only detect errors in these configurations."
Does this caveat still apply if the constituent volumes (aka storage
objects) are themselves redundant ZFS pools?

I assume the following:
- Hardware errors in the hdds will be detected and dealt with by the
  ZFS raid-z2 pools in the SAN boxes.
- Data added to the topmost ZFS pool will likely be allocated in an
  arbitrary fashion across its constituent volumes.
- Compression can be enabled per filesystem created on the topmost ZFS
  pool (on the NAS head).
- Volumes can be added to the topmost ZFS pool as SAN boxes are added
  to the network, increasing the overall capacity of the topmost pool.

Would I be better off at the top-level ZFS pool (on the NAS box) with
a simple RAID 0 instead of the non-redundant ZFS pool?

Please comment as you see fit; let me know if I'm making any
fundamentally inaccurate assumptions. I want to make sure that, when I
implement this, I'm doing the Right Thing.

Thanks in advance.

-Jim
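For reference, here is a rough sketch of the proposal in commands. The
pool names, zvol size, addresses, and device names are placeholders,
and shareiscsi is the old Solaris Express target mechanism -- adjust
for whatever target software you actually run:

    # On each SAN box: build the raid-z2 pool and export a zvol via iSCSI
    zpool create sanpool raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                                c1t4d0 c1t5d0 c1t6d0 c1t7d0
    zfs create -V 1t sanpool/export       # zvol sized to the usable space
    zfs set shareiscsi=on sanpool/export  # present the zvol as an iSCSI target

    # On the NAS head: discover the targets from each box
    iscsiadm modify discovery --sendtargets enable
    iscsiadm add discovery-address 192.168.1.11:3260   # SAN box 1
    iscsiadm add discovery-address 192.168.1.12:3260   # SAN box 2
    iscsiadm add discovery-address 192.168.1.13:3260   # SAN box 3
    devfsadm -i iscsi                     # create device nodes for the LUNs

    # Stripe the three iSCSI LUNs into one non-redundant top-level pool
    # (the real LUN device names will be much longer)
    zpool create tank c2t1d0 c3t1d0 c4t1d0
    zfs set sharenfs=on tank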
2007/9/23, James L Baker <ted.t.logan at gmail.com>:
> In plain English: for each SAN box, combine 8 (or so) disks in a ZFS
> raid-z2 pool, share the pool over GbE via iSCSI, then combine it with
> the other (similar) SAN volumes in a non-redundant ZFS pool on the
> NAS head, and work out the partitioning, quotas, etc. there.

It would probably be better to iSCSI-export the "raw" disks on the SAN
boxes to the NAS head, and let the NAS head do the raidz2. That will
make it easier to move disks between computers if you have to. Then
you will have a redundant ZFS pool on the NAS head without losing any
disk space. You could also do 3-way raidz, one disk from each box per
vdev, so that you can lose any one SAN box.
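A sketch of that variant, under the same caveats (placeholder names;
iscsitadm is the old Solaris iSCSI target daemon and needs its usual
base-directory setup first; the LUN device names seen on the head will
differ):

    # On each SAN box: export every raw disk as its own iSCSI target
    iscsitadm create target -b /dev/rdsk/c1t0d0s2 box1-disk0
    iscsitadm create target -b /dev/rdsk/c1t1d0s2 box1-disk1
    # ... and so on for the remaining disks

    # On the NAS head: group the LUNs so each raidz vdev takes exactly
    # one disk from each box -- a dead box then costs one disk per
    # vdev, which raidz survives
    zpool create tank \
        raidz c2t0d0 c3t0d0 c4t0d0 \
        raidz c2t1d0 c3t1d0 c4t1d0 \
        raidz c2t2d0 c3t2d0 c4t2d0
    # ... one 3-way raidz per disk position: 8 vdevs for 8-disk boxes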
James L Baker wrote:
> I'm a small-time sysadmin with big storage aspirations [...] I'm
> thinking about putting together a homebrew SAN with a NAS head, and
> am wondering if the following will work [...]

Let's draw the data flow.

disk -- cpu/mem -- nic -- wire -- nic -- cpu/mem -- nic -- wire -- nic -- cpu/mem

Isn't that a lot more complicated than:

disk -- cpu/mem -- nic -- wire -- nic -- cpu/mem

Now let's look at data context.

disk -- file system -- application

The application has the highest understanding of the context of the
data. The application knows what is good data and (hopefully) what is
bad data. The further you get from the application, the less context
you have, and therefore you lose the ability to discern good data from
bad. To compensate, you make policies designed to meet the goals of
defined interfaces. An application may expect the file system to
behave to the POSIX interface. A file system may expect the disk to
look like a block device. Any time you can improve the context of the
data between the layers, you can improve the reliability of the
application.

Hence, ZFS improves its knowledge of disks and their characteristics
by implementing its own I/O, allocation, and redundancy policies on
the disk. There are precious few applications which take a similar
approach -- today most of that activity is in the web application
server space (e.g. Glassfish).

So, it follows that if the application only provides a file system
interface, then the better file system design is the one with tighter
integration to the disks, like ZFS. Putting another layer of stuff
between ZFS and the disks only makes sense if the value of that stuff
outweighs the cost of the loss of data context. Given the proposed
design, and the lack of system requirements, I do not see how adding
the (iSCSI) stuff provides any value.
 -- richard
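For comparison, the direct-attach design this argument points to would
collapse the whole thing onto one box, with ZFS talking straight to
the disks (pool and dataset names below are placeholders):

    # One box, locally attached disks, no iSCSI layer in between
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                             c1t4d0 c1t5d0 c1t6d0 c1t7d0
    zfs create tank/media
    zfs set compression=on tank/media
    zfs set sharenfs=on tank/media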
I think I can offer a straightforward explanation for the following:

> I like the error-correction quality of ZFS; however, the ZFS
> Administration Guide states: "A non-redundant pool configuration is
> not recommended for production environments even if the single storage
> object is presented from a hardware RAID array or from a software
> volume manager. ZFS can only detect errors in these configurations."
> Does this caveat still apply if the constituent volumes (aka storage
> objects) are themselves redundant ZFS pools?

Let's say you have disk targets from two target devices -- they could
be disk arrays attached through fibre, SCSI, etc. -- that present, say,
RAID 5 targets of some sort. If you simply stripe those together with
ZFS, it can detect but not fix errors, since it has no ZFS-redundant
data to work from. That is identical to a single disk running ZFS with
single copies. However, if you mirrored the RAID 5 targets from the
two arrays with ZFS, you would now have ZFS-level redundancy and hence
reconstruction capability.

If you construct what you propose, the zvols on each of the 3 boxes
would have reconstruction ability locally, since that can be done at a
block level, but the final NAS-head ZFS pool would have no redundancy
or ability to reconstruct data at that level. Richard is right that
you would be losing some advantage of ZFS at the high layer, but as
long as none of the individual raid-z2s ever fails (by losing more
than 2 disks), your high-level stripe would remain intact.
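The distinction in a sketch ("lun1" and "lun2" stand in for whatever
device names the two array LUNs get on the head):

    # Stripe of two RAID 5 LUNs: checksums detect corruption, but
    # there is no second copy to repair from
    zpool create tank lun1 lun2

    # Mirror of the same two LUNs: half the capacity, but a scrub can
    # rewrite a bad block from the good side
    zpool create tank mirror lun1 lun2
    zpool scrub tank
    zpool status -v tank   # shows checksum error counts after the scrub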