How deeply can the VDEV tree be nested? All the examples I've seen suggest three layers: a root vdev, striping over some top-level vdevs (if present), which are in turn made (redundantly) of some physical/leaf vdevs. In trivial cases this goes down to two levels (a root striping over non-redundant leaf vdevs) or one level (single-disk pools).

1) Can the top-level vdevs be nested?

2) In particular, if I wanted to make a mirror of raidzN's, can it be done in one ZFS pool, or would I have to play with iSCSI and ZVOLs?

3) If I have two separate ZFS nodes with local storage, intended for failure-tolerant HA, which approach is better - to create a mirror over ZVOLs (importing a remote ZVOL over iSCSI and loopback), or to replicate using frequent snapshots and zfs send/zfs recv?

I expect the mirroring approach to provide better failover HA, since whichever node was alive most recently has the most up-to-date data stored locally, and its mirroring layer would replicate (resilver) all changes to the downed node when that comes back up. With zfs send replication, on the other hand, we'd have to roll back to the most recent common snapshot and choose which node to replicate from each time... But with zfs send each host's local data is more likely to be consistent, since it is written and tracked separately.

I guess both approaches are good for different tasks - like failover and near-sync backup...

What are the known drawbacks of mirroring over a LAN, such as implications for performance, latency, fragmentation, and storage/computing overhead?

Thanks,
//Jim
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Jim Klimov
>
> 2) In particular, if I wanted to make a mirror of raidzN's,
> can it be done in one ZFS pool, or would I have to play
> with iSCSI and ZVOLs?

Although you can probably do something like this (which might give you some problems in terms of mount order, etc.), it's probably not a great idea. One part of the system will be optimizing for mirrored disks, another optimizing for a raidz. Block patterns may not (probably won't) align, possibly resulting in some performance issues... You'll be double-checksumming...

If you want to mirror raidz devices, why don't you just go with a mirror and no raidz? The mirror performs better. The only reason I can think of to do what you're saying is if you want more redundancy than a 2-way mirror, but not as much as a 3-way mirror. When they designed zfs, they weren't designing for you to have THAT level of redundancy precision. The more you stray off the beaten path, the more you run into problems other people haven't discovered yet.

My advice would be to stick with the 2-way or 3-way mirror. Don't trick the system into using multi-layered vdevs when it's not intended to be used that way. Not saying you can't - just that it's not intended, and hence not well tested and not recommended.

> 3) If I have two separate ZFS nodes with local storage,
> intended for failure-tolerant HA, which approach is better -
> to create a mirror over ZVOLs (importing a remote ZVOL over
> iSCSI and loopback), or to replicate using frequent snapshots
> and zfs send/zfs recv?

Each way has some pros & cons.

If you attach the remote zvol to a mirror, it will start resilvering (latency intensive) and suddenly all your present disk activity will be bottlenecked to remote system speeds. Maybe not an issue if you're connecting over InfiniBand or something similarly fast.
The pro of doing things this way is: if the local side of the mirror dies for some reason, the storage doesn't go down. So you're giving up performance in order to gain reliability.

If you use zfs send/receive, your local system will still be able to perform at local system speeds while it's sending an optimized network data stream over to the other side. The downside is that if the local storage system fails, your storage goes down, and you will only have the most recently sent snapshot on the remote side, so you could lose whatever was written since that snapshot. You're gaining performance at the expense of risk to your most recent data.

Actually, there's a situation where zfs send is MORE reliable too. Although not very common, it certainly CAN happen, and HAS happened, that both sides of a live replicating system (such as a remotely attached mirror) get destroyed at the same time. Whatever destroys the primary storage pool (such as a sysadmin accidentally destroying the wrong zpool) also simultaneously destroys the backup data pool. So the added layer of separation you get from zfs send might be desirable.

Also, if you mirror the remote zvol, you'll be constrained to using the same size/geometry/configuration of storage on the local and remote sides. Usually, when I build systems, I use a better class of hardware on the primary server and a cheaper class of hardware on the backup server. Say... Use SAS on the primary and SATA on the backup. Use an SSD ZIL on the primary, but simply disable the ZIL on the backup server. Use twice as many half-the-size disks on the primary, and half as many twice-the-size disks on the backup. Use mirrors on the primary and raidz2 on the backup. Enable compression on the backup but not on the primary. Etc. (Actually, in most cases, you *gain* performance by adding compression to both, depending on your data.)
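For readers weighing the zfs send/receive option discussed above, here is a minimal Python sketch of the bookkeeping it implies: picking the most recent snapshot that both nodes still hold, then building an incremental send pipeline from it. This is an illustration, not a tested replication tool. The dataset names, host name, and snapshot names are hypothetical. The snapshot lists are assumed to be ordered oldest to newest (for example as printed by `zfs list -t snapshot -s creation`) and to hold only the part of the name after the `@`. The code only builds the command string and never runs it.

```python
import shlex


def latest_common_snapshot(local_snaps, remote_snaps):
    """Return the newest snapshot name present on both sides, or None.

    Both lists are assumed to be ordered oldest to newest.
    """
    remote = set(remote_snaps)
    for name in reversed(local_snaps):
        if name in remote:
            return name
    return None


def incremental_send_command(dataset, base, target, remote_host, remote_dataset):
    """Build (but do not run) a shell pipeline for an incremental send.

    `zfs send -i` sends the changes between two snapshots, and
    `zfs receive -F` rolls the target back to its latest snapshot first.
    """
    send = ["zfs", "send", "-i", f"{dataset}@{base}", f"{dataset}@{target}"]
    recv = ["ssh", remote_host, "zfs", "receive", "-F", remote_dataset]
    return " | ".join(shlex.join(part) for part in (send, recv))


if __name__ == "__main__":
    # Hypothetical snapshot names on the two nodes.
    local = ["hourly-01", "hourly-02", "hourly-03"]
    remote = ["hourly-00", "hourly-01", "hourly-02"]
    base = latest_common_snapshot(local, remote)
    if base is None:
        print("no common snapshot: a full send would be needed")
    else:
        print(incremental_send_command(
            "tank/data", base, local[-1], "backuphost", "backup/data"))
```

Anything written after the last snapshot sent is what the backup side would be missing after a failure, which is the window of risk described above.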
2012-01-15 19:16, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Jim Klimov
>>
>> 2) In particular, if I wanted to make a mirror of raidzN's,
>> can it be done in one ZFS pool, or would I have to play
>> with iSCSI and ZVOLs?
>
> Although you can probably do something like this (which might give you some
> problems in terms of mount order, etc) it's probably not a great idea. One
> part of the system will be optimizing for mirrored disks, another optimizing
> for a raidz. Block patterns may not (probably won't) align, possibly
> resulting in some performance issues... You'll be double-checksumming...
>
> If you want to mirror raidz devices, why don't you just go with a mirror and
> no raidz? The mirror performs better. The only reason I can think of to do
> what you're saying is if you want more redundancy than a 2-way mirror, but
> not as much as a 3-way mirror.

Well, the idea of RAID51 in ZFS came up while I was writing another post, discussing whether raidzN actually protects data against silent corruption. So far I have no answer to that question ;)

The main reason to go with raidzN would be to get some redundancy at a smaller cost in storage space (or in the number of connection ports). If raidzN indeed doesn't always protect data, mirroring over it *could* provide added reliability (mirrors seem to always know exactly which half is bad) while retaining the benefit of large storage space for the same number of disks and ports (perhaps also limited by an enclosure/HBA/etc.). A mirror could also help speed up some reads ;)

Also, if we mirror each local disk to a remote disk, then after losing one of the storage boxes we end up with a non-redundant stripe of data. If we instead mirror a local raidz to a remote raidz, we end up with a redundant raidz...
The exact balance in numbers has to be calculated, though - since 2-way mirroring over a given number of disks would give more space than mirroring over raidz sets built from the same disks ;)

So, yes, probably the main rationale for THIS would be separating the mirror halves across hardware SPOFs, in some balance with bandwidths and latencies (sending one chunk of data vs. many smaller chunks of mirrored+striped data).

Perhaps the idea is senseless after all - but that was part of the question, as well ;)

Thanks,
//Jim
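The space trade-off mentioned above can be put into rough numbers. The Python sketch below compares a stripe of 2-way mirrors with two equal raidzN sets mirrored against each other, built from the same disks. It is back-of-the-envelope arithmetic under stated assumptions: equal-sized disks, one half of the disks in each storage box, and no allowance for metadata, padding, or reserved space. The function names are made up for this illustration, and the mirror-of-raidz layout stands for the hypothetical layered construction discussed in this thread, not a native single-pool layout.

```python
def stripe_of_mirrors_tb(n_disks, disk_tb):
    """Usable space when every disk is paired into a 2-way mirror."""
    if n_disks < 2 or n_disks % 2:
        raise ValueError("need an even number of disks, at least 2")
    return (n_disks // 2) * disk_tb


def mirror_of_raidz_tb(n_disks, disk_tb, parity):
    """Usable space when two equal raidzN sets mirror each other.

    Each half holds n_disks / 2 disks, of which `parity` are redundancy.
    """
    half = n_disks // 2
    if n_disks % 2 or half <= parity:
        raise ValueError("each half needs more disks than parity devices")
    return (half - parity) * disk_tb


def tolerated_failures_after_box_loss(parity=0):
    """Further disk failures survivable once one whole box is gone.

    parity=0 models the stripe of per-disk mirrors (a plain stripe is left);
    parity=N models the surviving raidzN half.
    """
    return parity


if __name__ == "__main__":
    disks, size_tb = 12, 2  # hypothetical: twelve 2 TB disks in total
    print("stripe of 2-way mirrors:", stripe_of_mirrors_tb(disks, size_tb), "TB")
    for p in (1, 2, 3):
        print(f"mirror of two raidz{p} sets:",
              mirror_of_raidz_tb(disks, size_tb, p), "TB,",
              "survives", tolerated_failures_after_box_loss(p),
              "more failure(s) after losing one box")
```

With twelve 2 TB disks this gives 12 TB for the stripe of mirrors against 10, 8, and 6 TB for mirrored raidz1, raidz2, and raidz3 sets. The price of keeping redundancy after a whole box is lost is therefore one disk's worth of space per parity level.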