I'm looking for input on building an HA configuration for ZFS. I've read the FAQ and understand that the standard approach is to have a standby system with access to a shared pool that is imported during a failover.

The problem is that we use ZFS for a specialized purpose that results in tens of thousands of filesystems (mostly snapshots and clones). All versions of Solaris and OpenSolaris that we've tested take a long time (> 1 hour) to import that many filesystems.

I've read about replication through AVS, but that also seems to require an import during failover. We'd need something closer to an active-active configuration (even if the second active is only modified through replication). Or some way to greatly speed up imports.

Any suggestions?

-- 
This message posted from opensolaris.org
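For anyone wanting to reproduce the scale problem described above, a rough sketch follows. The pool name `tank` and the dataset count are placeholders; this assumes an existing test pool you can afford to churn.

```shell
# Create 10,000 filesystems, each with a snapshot and a clone,
# then time a full export/import cycle on the pool "tank".
for i in $(seq 1 10000); do
    zfs create "tank/fs$i"
    zfs snapshot "tank/fs$i@snap"
    zfs clone "tank/fs$i@snap" "tank/clone$i"
done

time zpool export tank
time zpool import tank    # the step that takes > 1 hour at this scale
```

The clone step doubles the mountable filesystem count, which is what makes the import (really the mount/share phase) so slow.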
On May 12, 2010, at 1:17 AM, schickb <schickb at gmail.com> wrote:

> I'm looking for input on building an HA configuration for ZFS. I've
> read the FAQ and understand that the standard approach is to have a
> standby system with access to a shared pool that is imported during
> a failover.
>
> The problem is that we use ZFS for a specialized purpose that
> results in tens of thousands of filesystems (mostly snapshots and
> clones). All versions of Solaris and OpenSolaris that we've tested
> take a long time (> 1 hour) to import that many filesystems.
>
> I've read about replication through AVS, but that also seems to
> require an import during failover. We'd need something closer to an
> active-active configuration (even if the second active is only
> modified through replication). Or some way to greatly speed up
> imports.
>
> Any suggestions?

Bypass the complexities of AVS and the start-up times by implementing a ZFS head server in a pair of ESX/ESXi hosts with hot spares, using redundant back-end storage (EMC, NetApp, EqualLogic). Then, if there is a hardware or software failure of the head server or the host it is on, the hot spare automatically kicks in with the same running state as the original. There should be no interruption of services in this setup.

This type of arrangement provides for oodles of flexibility in testing/upgrading deployments as well.

-Ross
Ross Walker wrote:
> On May 12, 2010, at 1:17 AM, schickb <schickb at gmail.com> wrote:
>
>> [... tens of thousands of filesystems; imports take > 1 hour ...]
>
> Bypass the complexities of AVS and the start-up times by implementing
> a ZFS head server in a pair of ESX/ESXi hosts with hot spares, using
> redundant back-end storage (EMC, NetApp, EqualLogic).
>
> Then, if there is a hardware or software failure of the head server
> or the host it is on, the hot spare automatically kicks in with the
> same running state as the original.

By hot spare here, I assume you are talking about a hot-spare ESX virtual machine.

If there is a software issue and the hot-spare server comes up with the same state, is it not likely to fail just like the primary server? If it does not, can you explain why it would not?

Cheers
Manoj
schickb wrote:
> I'm looking for input on building an HA configuration for ZFS. I've
> read the FAQ and understand that the standard approach is to have a
> standby system with access to a shared pool that is imported during a
> failover.
>
> The problem is that we use ZFS for a specialized purpose that results
> in tens of thousands of filesystems (mostly snapshots and clones).
> All versions of Solaris and OpenSolaris that we've tested take a long
> time (> 1 hour) to import that many filesystems.

Do you see this behavior - the long import time - during boot-up as well? Or is it an issue only during an export + import operation? I suspect that the zpool cache helps a bit (during boot) but does not get rid of the problem completely (unless it has been recently addressed).

If it is not an issue during boot-up, I would give Open HA Cluster/Solaris Cluster a try, or check with ha-clusters-discuss at opensolaris.org.

Cheers
Manoj
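The zpool cache mentioned above is controlled by the pool-level `cachefile` property. A minimal sketch of how to inspect and adjust it (pool name `tank` is a placeholder); note the cache only speeds up device discovery at boot, not the per-filesystem mount and share work:

```shell
# Show where (if anywhere) this pool's configuration is cached.
# The default is /etc/zfs/zpool.cache, read at boot to skip device scans.
zpool get cachefile tank

# Cluster frameworks commonly disable the default cache so a shared pool
# is not auto-imported by every node at boot:
zpool set cachefile=none tank

# The property can also be set at import time:
zpool import -o cachefile=/etc/zfs/zpool.cache tank
```

Because the cache addresses only configuration discovery, a pool with tens of thousands of mountable datasets will still spend most of its boot-time import in mounting and sharing.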
On May 12, 2010, at 3:06 PM, Manoj Joseph <manoj.p.joseph at oracle.com> wrote:

> Ross Walker wrote:
>> [... ZFS head server in a pair of ESX/ESXi hosts with hot spares ...]
>
> By hot spare here, I assume you are talking about a hot-spare ESX
> virtual machine.
>
> If there is a software issue and the hot-spare server comes up with
> the same state, is it not likely to fail just like the primary server?
> If it does not, can you explain why it would not?

That's a good point and worth looking into. I guess it would fail as well, since a VMware hot spare is like a VM in constant vMotion, where active memory is mirrored between the two.

I suppose one would need a hot spare for hardware failure and a cold spare for software failure.
Both scenarios are possible with ESX; the cold spare in this instance would be the original VM rebooting. Recovery time in that case would be about the same as an AVS solution that has to mount 10,000 filesystems, though. So this setup wins on a hardware failure, ties on a software failure, wins on ease of setup and maintenance, but loses on additional cost.

Guess it all depends on your risk analysis whether it is worth it.

-Ross
On May 11, 2010, at 10:17 PM, schickb wrote:

> [... tens of thousands of filesystems; imports take > 1 hour ...]
>
> Any suggestions?

The import is fast, but two other operations occur during import that will affect boot time:
+ for each volume (zvol) and its snapshots, a device tree entry is made in /devices
+ for each NFS share, the file system is (NFS) exported

When you get into the thousands of datasets and snapshots range, this takes some time. Several RFEs have been implemented over the past few years to help improve this.

NB. Running in a VM doesn't improve the share or device enumeration time.
-- richard

--
ZFS storage and performance consulting at http://www.RichardElling.com
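One way to see which of these phases dominates is to separate the import itself from the mount and share steps. A hedged sketch (pool name `tank` is a placeholder, and `zpool import -N` is only available on builds that support it):

```shell
# Import the pool without mounting any filesystems (-N), isolating the
# raw pool-import cost from the per-dataset mount/share costs.
time zpool import -N tank

# Mount every filesystem in the pool -- one mount per dataset.
time zfs mount -a

# Publish the NFS shares -- one share operation per shared filesystem.
time zfs share -a
```

If the share phase dominates, reviewing how many datasets actually need `sharenfs` enabled (inherited properties can share far more filesystems than intended) may be worth more than tuning the import itself.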
On May 12, 2010, at 7:12 PM, Richard Elling <richard.elling at gmail.com> wrote:

> On May 11, 2010, at 10:17 PM, schickb wrote:
>
>> [... tens of thousands of filesystems; imports take > 1 hour ...]
>
> The import is fast, but two other operations occur during import that
> will affect boot time:
> + for each volume (zvol) and its snapshots, a device tree entry is
>   made in /devices
> + for each NFS share, the file system is (NFS) exported
>
> When you get into the thousands of datasets and snapshots range, this
> takes some time. Several RFEs have been implemented over the past few
> years to help improve this.
>
> NB. Running in a VM doesn't improve the share or device enumeration
> time.

The idea I propose is to use VMs so that the server does not have to be restarted in the event of a hardware failure, thus avoiding the enumerations entirely, by using VMware's hot-spare VM technology.

Of course, using VMs could also mean the OP could run multiple ZFS servers, with the datasets spread evenly between them. This could conceivably be done in containers within the 2 original VMs so as to maximize ARC space.

-Ross