DISCLAIMER: I *really* appreciate this project and I thank everyone
involved.
On 2020-06-19 21:33 Mahdi Adnan wrote:
> The strength of Gluster, in my opinion, is the simplicity of creating
> distributed volumes that can be consumed by different clients, and
> this is why we chose Gluster back in 2016 as our main VMs Storage
> backend for VMWare and oVirt.
I absolutely agree on that: it is very simple to create replicated and
distributed storage with Gluster. This is a key Gluster feature: last
time I checked Ceph, it basically required 6+ machines and a small team
managing them. I read it is much easier now, but I doubt it is as easy
as Gluster.
This is the main reason why I periodically reconsider Gluster as a
backend for hyperconverged virtual machines (coupled with the metadata
server-less approach). Sadly, each time I arrive at the same conclusion:
do not run Gluster in a critical production environment without Red Hat
support and a stable release, such as Red Hat Gluster Storage.
Maybe I am overcautious, but I find it too easy to end up in a split-brain
scenario, sometimes even by simply rebooting a node at the wrong moment.
The need to enable sharding for efficient virtual disk healing scares
me: if Gluster fails, I need to reconstruct the disk images from their
chunks with a tedious, time- and space-consuming process.
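To make the reconstruction concern concrete, here is a minimal sketch of
reassembling one sharded image by hand. It assumes the usual on-brick
layout as I understand it: the file at its original path holds the first
block, and the remaining blocks live in a shard directory as
"<GFID>.<N>" with N starting at 1, each covering the range that starts at
N * shard size. The function name and parameters are hypothetical; on a
distributed volume the shards would first have to be collected from all
bricks, and the original file size (kept in metadata) is not restored
here.

```python
import os
import re
import shutil


def reassemble_sharded_file(base_file, shard_dir, gfid, shard_size, out_path):
    """Rebuild one image from its base file plus '<gfid>.<N>' shards.

    Assumed layout (not verified against any particular Gluster release):
    base_file holds block 0, and shard_dir holds files named '<gfid>.<N>'
    (N >= 1), each starting at byte offset N * shard_size.
    Missing shards are left as holes (zero-filled regions) in the output.
    Returns the number of shard files that were copied.
    """
    pattern = re.compile(re.escape(gfid) + r"\.(\d+)")
    shards = []
    for name in os.listdir(shard_dir):
        match = pattern.fullmatch(name)
        if match:
            shards.append((int(match.group(1)), os.path.join(shard_dir, name)))
    shards.sort()  # numeric order, so shard 10 comes after shard 9

    with open(out_path, "wb") as out:
        # Block 0 is the file stored at the original path.
        with open(base_file, "rb") as src:
            shutil.copyfileobj(src, out)
        # Every other block is written at the offset implied by its index.
        for index, path in shards:
            out.seek(index * shard_size)
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return len(shards)
```

Even in this simplified form the procedure needs a second copy of the
whole image plus the correct shard size, which is the time and space
cost I am worried about.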
Moreover, from a cursory look at the Gluster and other mailing lists,
the Qemu/KVM gfapi backend seems somewhat less stable than a FUSE Gluster
mount, and FUSE is not that fast either. Is my impression wrong?
I really have the feeling that Gluster is good at a diametrically
opposed scenario: scale-out NAS, for which it was conceived many moons
ago. And Red Hat backing Ceph right now for block storage makes me think
Gluster's issues with virtual disks were noticed by many.
> We suffered "and still" from performance issues with Gluster on use
> cases related to small files, but Gluster as a storage backend for
> Virtual Machines is really performant.
I have mixed feelings about Gluster performance. In the past[1] I
measured ~500 max IOPS for synchronous 4K writes per brick, which is a
lot for a single disk, less so for a RAID array or SSD. Aggregated
performance scaled linearly with increasing brick count, so maybe the
solution is to simply go full-Gluster and ditch RAID, but replacing a
disk in Gluster is much less convenient than doing the same with a RAID
array (both hardware and software RAID) or ZFS.
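For anyone who wants to compare on their own setup, here is a rough
sketch of how synchronous 4K write IOPS could be estimated against a
file on a mounted volume. It is not the tool used for the numbers in
[1]; the function names, block count and target path are assumptions,
and the writes are sequential appends, each followed by fsync, which is
a simplification. The second helper only encodes the linear scaling I
observed, it is not a guarantee.

```python
import os
import time


def measure_sync_write_iops(path, block_size=4096, count=200):
    """Write `count` blocks to `path`, calling fsync after each one.

    Returns completed writes per second. Sequential appends only, so
    this is a rough lower-effort estimate, not a full benchmark.
    """
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            os.fsync(fd)  # force each write to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed


def projected_iops(per_brick_iops, brick_count):
    """Aggregate IOPS under the assumption of perfectly linear scaling."""
    return per_brick_iops * brick_count
```

With the ~500 IOPS per brick figure above, projected_iops(500, 4) gives
2000, which at 4K per write is only about 8 MB/s of synchronous writes.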
Note that I am not implying that Ceph is faster; rather, that a small
Gluster setup with few bricks can be slower than expected.
I would love to hear other opinions and in-the-field experiences.
Thanks.
[1]
https://lists.gluster.org/pipermail/gluster-users/2020-January/037607.html
--
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8