Hi,

I have a small cluster that I use to host a collection of Xen virtual machines. I just expanded from 2 nodes to 4 nodes and am looking for some advice re. configuring a storage subsystem.

The current (2-node) configuration is simple:
- 4 disks per node
- md-based raid
- lvm
- DRBD to replicate (some) volumes between the two nodes
- (some) VMs set up for auto-failover on node failure (pacemaker, etc.)

In moving to 4 nodes, I'd like to add some flexibility to move VMs across all 4 nodes, but... that requires using something other than DRBD to replicate volumes. I'm thinking of something with the following characteristics:

- 4-node storage cluster (4 drives per node, total of 16 drives in the storage pool)
- 4-node VM cluster
- using the SAME 4 nodes for both
- note: I've got 4 gigE ports to play with on each box (plan on using 2 for outside access, 2 for storage/heartbeat networking)

GlusterFS stands out as the package that seems most capable of supporting this (if we were using KVM, I'd probably look at Sheepdog as well).

So... a few questions:

- it looks like running replicated volumes, across 4 nodes, will provide for redundancy and support migration/failover (am I right in this? or should I be looking at running RAID on the individual nodes as well?)
- what kind of performance hit is involved in replicated volumes?
- is there anything more efficient in disk use (i.e., mirroring 4 copies eats up lots of disk, is there anything equivalent to RAID 5/6 that is a little more efficient while maintaining redundancy?)
- am I missing anything (either re. GlusterFS or other alternatives)

Thanks very much for any suggestions and advice.

Miles Fidelman

--
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra
On Mon, Oct 10, 2011 at 10:56:11AM -0400, Miles Fidelman wrote:
> GlusterFS stands out as the package that seems most capable of
> supporting this (if we were using KVM, I'd probably look at Sheepdog
> as well).

The Gluster folks have said it's best to wait for 3.3's improved VM support. My understanding is current versions can get bogged down in this context. Not that there aren't some using them this way.

Whit
Anand Babu Periasamy
2011-Oct-10 15:38 UTC
[Gluster-users] configuration questions & advice
On Mon, Oct 10, 2011 at 8:26 PM, Miles Fidelman <mfidelman at meetinghouse.net> wrote:
> Hi,
>
> I have a small cluster that I use to host a collection of Xen virtual
> machines. I just expanded from 2 nodes to 4 nodes and am looking for some
> advice re. configuring a storage subsystem.
>
> The current (2-node) configuration is simple:
> - 4 disks per node
> - md-based raid
> - lvm
> - DRBD to replicate (some) volumes between the two nodes
> - (some) VMs set up for auto-failover on node failure (pacemaker, etc.)
>
> In moving to 4 nodes, I'd like to add some flexibility to move VMs across
> all 4 nodes, but... that requires using something other than DRBD to
> replicate volumes. I'm thinking of something with the following
> characteristics:
>
> - 4-node storage cluster (4 drives per node, total of 16 drives in the
> storage pool)
> - 4-node VM cluster
> - using the SAME 4 nodes for both
> - note: I've got 4 gigE ports to play with on each box (plan on using 2 for
> outside access, 2 for storage/heartbeat networking)
>
> GlusterFS stands out as the package that seems most capable of supporting
> this (if we were using KVM, I'd probably look at Sheepdog as well).
>
> So... a few questions:
>
> - it looks like running replicated volumes, across 4 nodes, will provide for
> redundancy and support migration/failover (am I right in this? or should I
> be looking at running RAID on the individual nodes as well?)

If you create a volume with replica count = 2, it creates a distributed replicated volume (imagine an intelligent RAID-10). You may choose to use disk-level RAID too, as a second level of protection. It is a small investment for added reliability.

> - what kind of performance hit is involved in replicated volumes?

Synchronous replication does take a hit on performance. It treats each write as a transaction across N nodes. The size of the hit varies from application to application.

> - is there anything more efficient in disk use (i.e., mirroring 4 copies
> eats up lots of disk, is there anything equivalent to RAID 5/6 that is a
> little more efficient while maintaining redundancy?)

Just create a distributed mirror with replica count = 2. You can also write a script that automatically runs replace-brick onto spare disk space in case of node failure. (This way, the system will rebuild itself if a mirror member does not come back in time.)

> - am I missing anything (either re. GlusterFS or other alternatives)

Ceph (in development) and Sheepdog (KVM-specific) are two other projects.

> Thanks very much for any suggestions and advice.
>
> Miles Fidelman
>
> --
> In theory, there is no difference between theory and practice.
> In<fnord> practice, there is. .... Yogi Berra
>

--
Anand Babu Periasamy
Blog [ http://www.unlocksmith.org ]
Twitter [ http://twitter.com/abperiasamy ]

Imagination is more important than knowledge --Albert Einstein
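
To put rough numbers on the disk-efficiency question, here is a minimal Python sketch comparing replica count = 2 with mirroring across all four nodes. It assumes one equally sized brick per node (for example, one md/lvm volume exported from each box). The brick paths and the 2000 GB brick size are purely illustrative and do not come from the thread. It also assumes that bricks are grouped into replica sets in the order they are listed at volume creation, which is my understanding of how GlusterFS pairs them; check the documentation for your version.

```python
def replica_sets(bricks, replica):
    """Group bricks into replica sets of size `replica`, in listed order.

    Assumption: consecutive bricks form a replica set, so the listing
    order decides which nodes mirror each other.
    """
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def usable_capacity(brick_size, brick_count, replica):
    """Usable space when every file is stored `replica` times."""
    if replica < 1 or brick_count % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    # Each replica set holds one brick's worth of unique data.
    return brick_size * (brick_count // replica)


if __name__ == "__main__":
    # Hypothetical brick names, one per node.
    bricks = [f"node{n}:/export/brick1" for n in range(1, 5)]
    brick_size_gb = 2000  # illustrative figure only

    for replica in (2, 4):
        sets = replica_sets(bricks, replica)
        usable = usable_capacity(brick_size_gb, len(bricks), replica)
        print(f"replica={replica}: {len(sets)} replica set(s), {usable} GB usable")
        for members in sets:
            print("   mirror:", ", ".join(members))
```

Under these assumptions, replica count = 2 leaves half of the raw space usable and each file survives the loss of one node in its pair, while a 4-way mirror leaves a quarter usable. Neither gives RAID 5/6-style parity efficiency; that part of the question is better served by RAID on the individual nodes underneath the bricks.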