Vijay Bellur
2014-Jan-06 09:44 UTC
[Gluster-users] [Users] Stopping glusterfsd service shut down data center
Adding gluster-users.

On 01/06/2014 12:25 AM, Amedeo Salvati wrote:
> Hi all,
>
> I'm testing oVirt + GlusterFS with only two nodes for everything
> (engine, glusterfs, hypervisors), on CentOS 6.5 hosts, following the
> guides at:
>
> http://community.redhat.com/blog/2013/09/up-and-running-with-ovirt-3-3/
> http://www.gluster.org/2013/09/ovirt-3-3-glusterized/
>
> but with some changes: on glusterfs I set the parameter
> cluster.server-quorum-ratio to 50% (to prevent glusterfs from going
> down if one node goes down), and in /etc/glusterfs/glusterd.vol I set
> "option base-port 50152" (because of a libvirt port conflict).
>
> With these settings I can stop/reboot the node that is not used to
> directly mount glusterfs (e.g. lovhm002). But when I stop/reboot the
> node that is used to mount glusterfs (e.g. lovhm001), the whole data
> center goes down, especially when I stop the glusterfsd service (not
> the glusterd service!). GlusterFS is still alive and reachable on the
> surviving node lovhm002, but oVirt/libvirt marks the DC/storage as
> being in error.
>
> Do you have any ideas on how to configure the DC/cluster in oVirt so
> that it keeps working if the node used to mount glusterfs goes down?

This seems to be due to client quorum in glusterfs. You can tell that client quorum is on because the option cluster.quorum-type is set to the value "auto".

Client quorum gets enabled by default as part of the "Optimize for Virt" action in oVirt, or by applying "volume set group virt" in the gluster CLI. It is enabled by default to provide additional protection against split-brain.

For a gluster volume with a replica count greater than 2, client quorum returns an error if writes/updates fail on more than 50% of the bricks. However, when the replica count is 2, updates fail whenever the first server/glusterfsd is not online. That matches what you see: lovhm001 hosts the first brick, so stopping its glusterfsd blocks writes even though lovhm002 is still up.

If the chances of a network partition and a split-brain are not significant in your setup, you can turn off client quorum by setting the option cluster.quorum-type to the value "none".

Regards,
Vijay
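To make the replica-2 behaviour described above concrete, here is a small Python model of the client quorum decision. It is only a sketch, not GlusterFS code: the function name client_quorum_met and the list-of-booleans representation are made up for illustration, and the tie-break for an even number of bricks (exactly half up counts only if the first brick is among them) is assumed to generalise the replica-2 rule stated above.

```python
def client_quorum_met(bricks_up, quorum_type="auto"):
    """Return True if a write would be allowed under this simplified model.

    bricks_up   -- list of booleans, one per brick of the replica set,
                   in volume order (index 0 is the "first" brick).
    quorum_type -- "auto" or "none", mirroring cluster.quorum-type.

    Hypothetical helper for illustration only; not part of GlusterFS.
    """
    if quorum_type == "none":
        # Quorum enforcement is off: any reachable brick accepts the write.
        return any(bricks_up)
    if quorum_type != "auto":
        raise ValueError("this sketch only models 'auto' and 'none'")

    total = len(bricks_up)
    up = sum(bricks_up)

    # More than half of the bricks are reachable: quorum is met.
    if 2 * up > total:
        return True

    # Exactly half are reachable (e.g. 1 of 2): assumed tie-break,
    # quorum is met only if the first brick is one of them.
    return 2 * up == total and bricks_up[0]


if __name__ == "__main__":
    # Two-node setup from this thread: lovhm001 holds the first brick.
    scenarios = {
        "lovhm002 down, auto": ([True, False], "auto"),
        "lovhm001 down, auto": ([False, True], "auto"),
        "lovhm001 down, none": ([False, True], "none"),
    }
    for label, (bricks, qtype) in scenarios.items():
        print(label, "->", client_quorum_met(bricks, qtype))
```

In this model, losing the second node leaves writes possible, while losing the first node blocks them until quorum-type is set to "none", which is the trade-off against split-brain protection mentioned above.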
Amedeo Salvati
2014-Jan-06 15:00 UTC
[Gluster-users] [Users] Stopping glusterfsd service shut down data center
On 06/01/2014 10:44, Vijay Bellur wrote:
> If the chances of a network partition and a split brain is not
> significant in your setup, you can turn off client quorum by setting
> option cluster.quorum-type to value "none".

Yes! Setting quorum-type to none has solved the issue. I now have a two-node oVirt/Gluster DC that tolerates the failure of one node, except for the engine portal, which at the moment resides on only one node (sigh :-( I'm waiting for the self-hosted engine).

Thanks a lot,
a

--
Amedeo Salvati
RHC{DS,E,VA} - LPIC-3 - UCP - NCLA 11
email: amedeo at oscert.net
email: amedeo at linux.com
http://plugcomputing.it/redhatcert.php
http://plugcomputing.it/lpicert.php
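For anyone finding this thread later, the change discussed here comes down to one "gluster volume set" call. The Python sketch below only builds and prints that command line; it does not run it. The volume name "vmstore" and the helper name quorum_type_command are placeholders, since the actual volume name is not given in this thread.

```python
# Values accepted by cluster.quorum-type.
VALID_QUORUM_TYPES = ("none", "auto", "fixed")


def quorum_type_command(volume, quorum_type="none"):
    """Build the argv for setting cluster.quorum-type on a volume.

    Hypothetical helper for illustration; it only assembles the
    command and never executes it.
    """
    if quorum_type not in VALID_QUORUM_TYPES:
        raise ValueError("unknown quorum type: %r" % (quorum_type,))
    return ["gluster", "volume", "set", volume,
            "cluster.quorum-type", quorum_type]


if __name__ == "__main__":
    # "vmstore" is a placeholder volume name, not taken from the thread.
    print(" ".join(quorum_type_command("vmstore")))
```

The printed command would be run as root on one of the gluster nodes. Setting the value back to "auto" restores the split-brain protection that the virt group enables.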