Fernando Frediani (Qube)
2012-Jun-04 22:28 UTC
[Gluster-users] GlusterFS 3.3 not yet quiet ready for Virtual Machines storage
Hi,

I have been reading about and trying to test (without much success) Gluster 3.3 for Virtual Machine storage, and from what I could see it isn't quite ready yet for running virtual machines.

One great improvement, granular locking, which is essential for these types of environments, has been achieved, but the other one has not: the ability to use striped+(distributed)+replicated volumes.

As it stands now, the natural choice would be Distributed + Replicated, but a Virtual Machine image would then reside on a single brick (replicated, of course), so the maximum write IOPS would be that of a single brick's RAID controller and the disks underneath it. If striped+(distributed)+replicated were available, it would spread the IOPS across all the bricks holding the large Virtual Machine image, and therefore across multiple bricks and RAID controllers.

Also, if I understand correctly, the maximum file size would no longer be limited to the size of a brick since, again, the file would be spread across multiple bricks.

This type of volume is said to be available in version 3.3, but according to the documentation it is only meant for MapReduce workloads.

What is everybody's opinion about this, and has this been thought about or considered?

Regards,

Fernando
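To make the placement argument above concrete, here is a toy Python model. It is not GlusterFS code: the function names are hypothetical, `zlib.crc32` merely stands in for whatever hash GlusterFS really uses, and the 128 KiB stripe block size and the brick sizes are assumed values chosen for illustration. It only shows that a whole file lands on one replica set under distribute+replicate, while its blocks rotate across several replica sets under stripe+replicate.

```python
import zlib

KIB = 1024
GIB = 1024 ** 3


def sets_written_distributed(file_name, replica_sets):
    """Distribute+replicate: the whole file maps to exactly one replica set.

    crc32 is only a stand-in for the real placement hash (assumption).
    """
    return {zlib.crc32(file_name.encode()) % replica_sets}


def sets_written_striped(file_size, block_size, stripe_count):
    """Stripe+replicate: consecutive blocks go round-robin over the stripe.

    Block i lands on replica set i % stripe_count, so any file with at
    least stripe_count blocks touches every replica set in the stripe.
    """
    blocks = -(-file_size // block_size)  # ceiling division
    return set(range(min(blocks, stripe_count)))


def max_file_size(brick_size, stripe_count=1):
    """Rough upper bound on one file's size, ignoring other data on the bricks."""
    return brick_size * stripe_count


if __name__ == "__main__":
    image = "vm01.img"           # hypothetical VM image name
    size = 40 * GIB              # assumed image size
    print(sets_written_distributed(image, replica_sets=4))
    print(sets_written_striped(size, block_size=128 * KIB, stripe_count=4))
    print(max_file_size(2 * 1024 * GIB), max_file_size(2 * 1024 * GIB, 4))
```

In this model the write load for a single image is served by one replica set in the first case and by all four in the second, which is the whole of the IOPS argument; whether real hardware shows a matching speed-up is a separate question.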
Jeff White
2012-Jun-05 14:56 UTC
[Gluster-users] GlusterFS 3.3 not yet quiet ready for Virtual Machines storage
On 06/04/2012 06:28 PM, Fernando Frediani (Qube) wrote:
> Hi,
>
> I have been reading and trying to test (without much success) Gluster
> 3.3 for Virtual Machines storage and from what I could see it isn't
> yet quite ready for running virtual machines.
>
> One great improvement about the granular locking which was essential
> for these types of environments was achieved, but the other one is
> still not, which is the ability to use striped+(distributed)+replicated.
>
> As it stands now the natural choice would be Distributed + Replicated
> but when storing a Virtual Machines image it would reside in a single
> brick (replicated of course), so the maximum amount of IOPS for write
> would be the equivalent of a single brick's RAID controller and its
> disks underneath, while if striped+(distributed)+replicated was
> available it would spread the IOPS across all bricks containing the
> large Virtual Machine image and therefore multiple bricks and RAID
> controllers.
>
> Also, if I understand correctly, the maximum size for a file wouldn't
> be the size of a brick as, again, the file would be spread across
> multiple bricks.
>
> This type of volume is said to be available on version 3.3 but as the
> documentation says it is only to run MapReduce workloads.
>
> What is everybody's opinion about this and has this been thought or
> considered?
>
> Regards,
>
> Fernando

I also heard it was only for MapReduce workloads, but in theory can't you just create one and use it for whatever you want? Is there something that restricts it to MapReduce only?

Also, I was under the impression that repl+stripe exists so you can have files larger than your brick size, not for speed improvements. I'm not sure you would see any major speed improvement with repl+stripe vs repl+distr unless you have lots of storage servers and lots of bricks.
Brian Candler
2012-Jun-05 16:28 UTC
[Gluster-users] GlusterFS 3.3 not yet quiet ready for Virtual Machines storage
On Mon, Jun 04, 2012 at 10:28:50PM +0000, Fernando Frediani (Qube) wrote:
> I have been reading and trying to test (without much success) Gluster
> 3.3 for Virtual Machines storage and from what I could see it isn't yet
> quite ready for running virtual machines.
>
> One great improvement about the granular locking which was essential
> for these types of environments was achieved, but the other one is
> still not, which is the ability to use
> striped+(distributed)+replicated.

I think you would have to have a very specialised requirement for this to be "essential".

Suppose you have a host with 12 disks in a RAID10, and you make a replicated volume with another similar host for resilience. That gives a VM a pretty large number of I/O operations per second to use, and also allows a pretty large VM image (depending on how big the disks are, of course).

Also: if you are handling terabytes of data, the natural approach in many cases would be to have a relatively small VM image and store the data in glusterfs, mounting it from within the VM. This means that the same dataset can be shared by multiple VMs, and is easier to back up and replicate.