I'm currently looking to establish a small pool of Xen hosts, each running several guests, backed by disk servers for the guests, accessible across the network. I'm hoping to achieve ease of migration and reliability of service (particularly in cases of hardware failure), but I'm not pushing for 24/7 guaranteed HA.

I've looked at the different ways of offering network storage to the guests, and have narrowed it down to what I think are the two best options:

1) DRBD & Heartbeat (or similar cluster management tool): Run 2 disk servers with one 'master', mirroring in the background. In case of failure of the master, switch the slave to being the master.

Advantages:
- Server-server communication is efficient
- Recovery only has to update a small delta

Disadvantages:
- Risk of split-brain, if the disk servers stop 'seeing' each other and both try to become masters

2) Software RAID(1) of iSCSI disks: Export an iSCSI disk from each of 2 disk servers, then RAID these together on the host, using multipath and RAID1 to deal with disk failure.

Advantages:
- RAID array degrades 'nicely' - no need to switch master/slave roles etc.
- No risk of split-brain
- No human decision-making involved during recovery stages

Disadvantages:
- RAID rebuild happens on the host, not between servers
- RAID recovery requires a complete rebuild, not just a delta of changes

Looking at the above, I am drawn towards the RAID1 option. While it might be less efficient in terms of speed and rebuild times, it completely avoids the risk of split-brain, and there also seem to be far fewer places where manual intervention (i.e. human decisions where a script can't do the work) might be needed than with DRBD.

However, I'm also aware I might be missing something fundamental here. Anyone care to agree with, or enlighten me?

Thanks,
Matthew
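For illustration, a minimal sketch of the host-side setup for option 2, assuming open-iscsi and Linux mdadm; the portal addresses, IQNs and device paths are hypothetical placeholders, and multipath is omitted for brevity:

    #!/usr/bin/env python
    # Illustrative sketch only: log in to two iSCSI targets (one per
    # disk server) and mirror them with software RAID1 on the Xen host.
    # IQNs, portal addresses and device paths below are hypothetical.

    import subprocess

    PORTALS = [
        ("192.168.1.10", "iqn.2009-08.example:diskserver1.guest01"),
        ("192.168.1.11", "iqn.2009-08.example:diskserver2.guest01"),
    ]

    # Devices the two LUNs are expected to appear as (in practice you
    # would use stable names under /dev/disk/by-path); hard-coded here
    # for brevity.
    DEVICES = ["/dev/sdb", "/dev/sdc"]

    def run(cmd):
        """Run a command, raising on failure so errors are not silently ignored."""
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd)

    def login_targets():
        for portal, iqn in PORTALS:
            run(["iscsiadm", "-m", "node", "-T", iqn, "-p", portal, "--login"])

    def assemble_mirror():
        # Create the RAID1 array (first time only; on later boots you
        # would use --assemble instead).  --run avoids the interactive
        # confirmation prompt.
        run(["mdadm", "--create", "--run", "/dev/md0", "--level=1",
             "--raid-devices=2"] + DEVICES)

    if __name__ == "__main__":
        login_targets()
        assemble_mirror()

The resulting /dev/md0 is then handed to the domU as its disk; if one disk server dies, md simply marks that member as failed and the guest carries on against the surviving mirror half.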
Matthew Richardson
2009-Aug-07 12:21 UTC
Re: [Xen-users] Network Block Devices & Redundancy
> Matthew,
> wouldn't your RAID1 solution lead to more complex manual configuration
> for each domU? And you wouldn't have an easy route to domU migrations.
> If you use the DRBD+iSCSI route, you can connect to all your targets
> on all your xen hosts, but only actually use the disk on the host with
> that domU on it. Live migration is then very easy.
>
> Cheers

Thanks for the comment.

In both cases you can have the iSCSI targets permanently connected to all hosts (within reason, since I suspect there may be inefficiencies or upper limits in doing this). The only difference with the RAID1 approach is that you need to start the array on the new host before you do the migration, and stop it on the old host afterwards.

This does mean that you have to take action on both old and new hosts during migration, but this is easy enough to achieve either with simple scripts, or possibly with libvirt hooks. This extra work might be a slight disadvantage, but when compared to the removal of 'split brain' risks, I see it as being more than worth it.

Matthew
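For illustration, a minimal sketch of the kind of migration wrapper described above, assuming mdadm on both hosts, password-less SSH between them, and the Xen 3.x 'xm migrate --live' command; the hostnames, domU name, array device and member devices are hypothetical:

    #!/usr/bin/env python
    # Illustrative sketch: wrap a Xen live migration so that the RAID1
    # array is assembled on the destination host first and stopped on
    # the source host afterwards, per the procedure described above.
    # Hostnames, the domU name and device paths are placeholders.

    import subprocess

    def ssh(host, *cmd):
        """Run a command on a remote host via ssh, raising on failure."""
        subprocess.check_call(["ssh", host] + list(cmd))

    def migrate(domu, src, dst, md="/dev/md0",
                members=("/dev/sdb", "/dev/sdc")):
        # 1. Assemble the mirror on the destination host (the iSCSI
        #    targets are assumed to be logged in there already).
        ssh(dst, "mdadm", "--assemble", md, *members)

        # 2. Live-migrate the guest from source to destination.
        ssh(src, "xm", "migrate", "--live", domu, dst)

        # 3. Stop the now-unused array on the source host.
        ssh(src, "mdadm", "--stop", md)

    if __name__ == "__main__":
        migrate("guest01", "xenhost1", "xenhost2")

The same three steps could equally be triggered from libvirt hooks rather than a standalone script; the point is only that array start/stop brackets the migration itself.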