Dennis J.
2010-Mar-02 02:56 UTC
[CentOS-virt] Thoughts on storage infrastructure for small scale HA virtual machine deployments
Hi,

up until now I've always deployed VMs with their storage located directly on the host system, but as the number of VMs grows and the hardware becomes powerful enough to handle more virtual machines, I'm concerned about a failure of the host taking down too many VMs in one go. As a result I'm now looking at moving to an infrastructure that uses shared storage instead, so I can live-migrate VMs or restart them quickly on another host if the one they are running on dies.

The problem is that I'm not sure how to go about this bandwidth-wise. What I'm aiming for as a starting point is a 3-4 host cluster with about 10 VMs on each host and a 2-system DRBD-based cluster as a redundant storage backend. The question that bugs me is how I can get enough bandwidth between the hosts and the storage to provide the VMs with reasonable I/O performance. If all 40 VMs start copying files at the same time, the bandwidth share for each VM would be tiny. Granted, this is a worst-case scenario, which is why I want to ask whether anyone here has experience with such a setup and can give recommendations or comment on alternative setups. Would I maybe get away with 4 bonded gbit ethernet ports? Would I require fiber channel or 10gbit infrastructure?

Regards,
Dennis

PS: The sheepdog project (http://www.osrg.net/sheepdog/) looks interesting in that regard but apparently is still far from production-ready.
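To put rough numbers on that worst case, a minimal back-of-the-envelope sketch in Python: the 40-VM count and the two link options come from the question above, while the 80% usable-throughput factor (protocol and bonding overhead) is only an assumption.

    # Rough per-VM share of storage bandwidth in the worst case described
    # above. The 40-VM figure comes from the post; the 80% efficiency
    # factor (protocol and bonding overhead) is an assumption.
    EFFICIENCY = 0.8
    VMS = 40

    for label, gbit in (("4x bonded 1GbE", 4), ("10GbE", 10)):
        usable = gbit * 1000 / 8.0 * EFFICIENCY          # MB/s, roughly
        print("%-15s ~%4.0f MB/s usable, ~%4.1f MB/s per VM with all %d streaming"
              % (label, usable, usable / VMS, VMS))

Even 10GbE leaves each VM well below the sequential throughput of a single disk in that scenario, which is why the replies focus on how likely the scenario actually is rather than on raw link speed alone.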
Christopher G. Stach II
2010-Mar-02 03:34 UTC
[CentOS-virt] Thoughts on storage infrastructure for small scale HA virtual machine deployments
----- "Dennis J." <dennisml at conversis.de> wrote:> What I'm aiming for as a starting point is a 3-4 host cluster with > about 10 VMs on each host and a 2 system DRBD based cluster as a > redundant storage backend.That's a good idea.> The question that bugs me is how I can get enough bandwidth between the > hosts and the storage to provide the VMs with reasonable I/O > performance.You may also want to investigate whether or not a criss-cross replication setup (1A->2a, 2B->1b) is worth the complexity to you. That will spread the load across two drbd hosts and give you approximately the same fault tolerance at a slightly higher risk. (This is assuming that risk-performance tradeoff is important enough to your project.)> If all the 40 VMs start copying files at the same time that would mean > that the bandwidth share for each VM would be tiny.Would they? It's a possibility, and fun to think about, but what are the chances? You will usually run into this with backups, cron, and other scheduled [non-business load] tasks. These are far cheaper to fix with manually adjusting schedules than any other way, unless you are rolling in dough.> Would I maybe get away with 4 bonded gbit ethernet ports? Would I > require fiber channel or 10gbit infrastructure?Fuck FC, unless you want to get some out of date, used, gently broken, or no-name stuff, or at least until FCoE comes out. (You're probably better off getting unmanaged IB switches and using iSER.) Can't say if 10GbE would even be enough, but it's probably overkill. Just add up the PCI(-whatever) bus speeds of your hosts, benchmark your current load or realistically estimate what sort of 95th percentile loads you would have across the board, multiply by that percentage, and fudge that result for SLAs and whatnot. Maybe go ahead and do some FMEA and see if losing a host or two is going to peak the others over that bandwidth. If you find that 10GbE may be necessary, a lot of mobos and SuperMicro have a better price per port for DDR IB (maybe QDR now) and that may save you some money. Again, probably overkill. Check your math. :) Definitely use bonding. Definitely make sure you aren't going to saturate the bus that card (or cards, if you are worried about losing an entire adapter) is plugged into. If you're paranoid, get switches that can do bonding across supervisors or across physical fixed configuration switches. If you can't afford those, you may want to opt for 2Nx2N bonding-bridging. That would limit you to probably two 4-1GbE cards per host, just for your SAN, but that's probably plenty. Don't waste your money on iSCSI adapters. Just get ones with TOEs. -- Christopher G. Stach II http://ldsys.net/~cgs/
Dennis J.
2010-Mar-07 13:58 UTC
[CentOS-virt] [fedora-virt] Thoughts on storage infrastructure for small scale HA virtual machine deployments
On 03/02/2010 04:51 AM, Ask Bjørn Hansen wrote:
>
> On Mar 1, 2010, at 18:56, Dennis J. wrote:
>
>> The question that bugs me is how I can get enough bandwidth between the
>> hosts and the storage to provide the VMs with reasonable I/O performance.
>> If all 40 VMs start copying files at the same time, the bandwidth share
>> for each VM would be tiny.
>
> It really depends on the specific workloads. In my experience it's generally the number of IOs per second rather than the bandwidth that's the limiting factor.
>
> We have a bunch of 4-disk boxes with md raid10 and we generally run out of disk IO before we run out of memory (~24-48GB) or CPU (dual quad core 2.26GHz or some such).

That's very similar to what we are experiencing. The primary problem for me is how to deal with the bottleneck of a shared storage setup.

The simplest setup is a 2-system criss-cross setup where the two hosts also serve as the halves of a DRBD cluster. The advantages of this approach are that it's cheap, that only part of the storage traffic has to go over the network between the machines, and that the network only has to handle the storage traffic of the VMs on those two machines. The disadvantage is that you have to keep 50% of the potential server capacity free in case the twin node fails. That's quite a lot of wasted capacity.

To reduce that problem you can increase the number of hosts, say to four, which reduces the spare capacity needed on each system to 25%, but then you really need to separate the storage from the hosts, and now you have a bottleneck on the storage end. Increase the number of hosts to 8 and you waste even less capacity but also increase the pressure on the storage bottleneck a lot.

Since I'm new to the whole SAN aspect I'm currently just looking at all the options that are out there and basically wondering how the big boys handle this, the ones with hundreds if not thousands of VMs who also need to be able to deal with physical failures. That is why I find the sheepdog project so interesting: it seems to address this particular problem in a way that would provide almost linear scalability without actually using a SAN at all (well, at least not in the traditional sense of the word).

Regards,
Dennis
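The spare-capacity figures above follow from a simple rule of thumb, assuming a failed host's VMs are spread evenly over the surviving hosts; a small sketch:

    # Spare capacity each host must keep free so that the remaining hosts
    # can absorb the VMs of one failed host, assuming the failed host's
    # load is spread evenly across the survivors.
    for hosts in (2, 4, 8):
        max_util = (hosts - 1) / float(hosts)
        print("%d hosts: run each at <= %.0f%%, i.e. keep %.1f%% spare"
              % (hosts, max_util * 100, (1 - max_util) * 100))

The flip side, as described above, is that with more hosts the combined storage traffic converges on the shared storage pair, which is exactly the bottleneck sheepdog tries to avoid by spreading the data across the hosts themselves.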