Alessandro De Salvo
2015-Jun-08 18:00 UTC
[Gluster-users] Questions on ganesha HA and shared storage size
Hi,
indeed, it does not work :-)
OK, this is what I did, with 2 machines, running CentOS 7.1, GlusterFS 3.7.1 and nfs-ganesha 2.2.0:

1) ensured that the machines are able to resolve their IPs (but this was already true since they were in the DNS);
2) disabled NetworkManager and enabled network on both machines;
3) created a gluster shared volume 'gluster_shared_storage' and mounted it on '/run/gluster/shared_storage' on all the cluster nodes using the glusterfs native mount (on CentOS 7.1 /var/run is a symlink to ../run by default);
4) created an empty /etc/ganesha/ganesha.conf;
5) installed pacemaker, pcs, resource-agents and corosync on all cluster machines;
6) set the same password for the 'hacluster' user on all machines;
7) ran pcs cluster auth <hostname> -u hacluster -p <pass> on all the nodes (on both nodes I issued the command for both nodes);
8) IPv6 is configured by default on all nodes, although the infrastructure is not ready for IPv6;
9) enabled and started pcsd on all nodes;
10) populated /etc/ganesha/ganesha-ha.conf with the following contents, one per machine:

===> atlas-node1
# Name of the HA cluster created.
HA_NAME="ATLAS_GANESHA_01"
# The server from which you intend to mount
# the shared volume.
HA_VOL_SERVER="atlas-node1"
# The subset of nodes of the Gluster Trusted Pool
# that forms the ganesha HA cluster. IP/Hostname
# is specified.
HA_CLUSTER_NODES="atlas-node1,atlas-node2"
# Virtual IPs of each of the nodes specified above.
VIP_atlas-node1="x.x.x.1"
VIP_atlas-node2="x.x.x.2"

===> atlas-node2
# Name of the HA cluster created.
HA_NAME="ATLAS_GANESHA_01"
# The server from which you intend to mount
# the shared volume.
HA_VOL_SERVER="atlas-node2"
# The subset of nodes of the Gluster Trusted Pool
# that forms the ganesha HA cluster. IP/Hostname
# is specified.
HA_CLUSTER_NODES="atlas-node1,atlas-node2"
# Virtual IPs of each of the nodes specified above.
VIP_atlas-node1="x.x.x.1"
VIP_atlas-node2="x.x.x.2"

11) issued gluster nfs-ganesha enable, but it fails with a cryptic message:

# gluster nfs-ganesha enable
Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted pool. Do you still want to continue? (y/n) y
nfs-ganesha: failed: Failed to set up HA config for NFS-Ganesha. Please check the log file for details

Looking at the logs I found nothing really special but this:

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-06-08 17:57:15.672844] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2015-06-08 17:57:15.675395] I [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host found Hostname is atlas-node2
[2015-06-08 17:57:15.720692] I [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host found Hostname is atlas-node2
[2015-06-08 17:57:15.721161] I [glusterd-ganesha.c:335:is_ganesha_host] 0-management: ganesha host found Hostname is atlas-node2
[2015-06-08 17:57:16.633048] E [glusterd-ganesha.c:254:glusterd_op_set_ganesha] 0-management: Initial NFS-Ganesha set up failed
[2015-06-08 17:57:16.641563] E [glusterd-syncop.c:1396:gd_commit_op_phase] 0-management: Commit of operation 'Volume (null)' failed on localhost : Failed to set up HA config for NFS-Ganesha. Please check the log file for details

==> /var/log/glusterfs/cmd_history.log <==
[2015-06-08 17:57:16.643615] : nfs-ganesha enable : FAILED : Failed to set up HA config for NFS-Ganesha. Please check the log file for details
==> /var/log/glusterfs/cli.log <==
[2015-06-08 17:57:16.643839] I [input.c:36:cli_batch] 0-: Exiting with: -1

Also, pcs seems to be fine for the auth part, although it obviously tells me the cluster is not running.

I, [2015-06-08T19:57:16.305323 #7223]  INFO -- : Running: /usr/sbin/corosync-cmapctl totem.cluster_name
I, [2015-06-08T19:57:16.345457 #7223]  INFO -- : Running: /usr/sbin/pcs cluster token-nodes
::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET /remote/check_auth HTTP/1.1" 200 68 0.1919
::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET /remote/check_auth HTTP/1.1" 200 68 0.1920
atlas-node1.mydomain - - [08/Jun/2015:19:57:16 CEST] "GET /remote/check_auth HTTP/1.1" 200 68
- -> /remote/check_auth

What am I doing wrong?
Thanks,

Alessandro

> On 08 Jun 2015, at 19:30, Soumya Koduri <skoduri at redhat.com> wrote:
> 
> On 06/08/2015 08:20 PM, Alessandro De Salvo wrote:
>> Sorry, just another question:
>> 
>> - in my installation of gluster 3.7.1 the command gluster features.ganesha enable does not work:
>> 
>> # gluster features.ganesha enable
>> unrecognized word: features.ganesha (position 0)
>> 
>> Which version has full support for it?
> 
> Sorry. This option has recently been changed. It is now
> 
> $ gluster nfs-ganesha enable
> 
>> 
>> - in the documentation the ccs and cman packages are required, but they seem not to be available anymore on CentOS 7 and similar; I guess they are not really required anymore, as pcs should do the full job
>> 
>> Thanks,
>> 
>> Alessandro
> 
> Looks like so from http://clusterlabs.org/quickstart-redhat.html. Let us know if it doesn't work.
> 
> Thanks,
> Soumya
> 
>> 
>>> On 08 Jun 2015, at 15:09, Alessandro De Salvo <alessandro.desalvo at roma1.infn.it> wrote:
>>> 
>>> Great, many thanks Soumya!
>>> Cheers,
>>> 
>>> Alessandro
>>> 
>>>> On 08 Jun 2015, at 13:53, Soumya Koduri <skoduri at redhat.com> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> Please find the slides of the demo video at [1]
>>>> 
>>>> We recommend a distributed replica volume as the shared volume, for better data availability.
>>>> 
>>>> The size of the volume depends on the workload you may have. Since it is used to maintain the states of NLM/NFSv4 clients, you may calculate the size of the volume to be a minimum of the aggregate of
>>>> (typical_size_of'/var/lib/nfs'_directory + ~4k*no_of_clients_connected_to_each_of_the_nfs_servers_at_any_point)
>>>> 
>>>> We shall document this feature soon in the gluster docs as well.
>>>> 
>>>> Thanks,
>>>> Soumya
>>>> 
>>>> [1] - http://www.slideshare.net/SoumyaKoduri/high-49117846
>>>> 
>>>> On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:
>>>>> Hi,
>>>>> I have seen the demo video on ganesha HA, https://www.youtube.com/watch?v=Z4mvTQC-efM
>>>>> However there is no advice on the appropriate size of the shared volume. How is it really used, and what would be a reasonable size for it?
>>>>> Also, are the slides from the video available somewhere, as well as documentation on all this? I did not manage to find them.
>>>>> Thanks,
>>>>> 
>>>>> Alessandro
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>> 
>>> 
>> 
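For anyone retracing steps 3) to 9) above, they boil down to roughly the following commands on each node. This is only a sketch: the brick paths and the replica-2 layout are assumptions for illustration, not details taken from this thread.

    # assumed brick layout; adjust to your trusted pool
    gluster volume create gluster_shared_storage replica 2 \
        atlas-node1:/bricks/shared/brick atlas-node2:/bricks/shared/brick
    gluster volume start gluster_shared_storage
    mkdir -p /run/gluster/shared_storage
    mount -t glusterfs localhost:/gluster_shared_storage /run/gluster/shared_storage

    touch /etc/ganesha/ganesha.conf
    yum -y install pacemaker pcs resource-agents corosync

    # same 'hacluster' password on every node
    echo '<pass>' | passwd --stdin hacluster
    systemctl enable pcsd && systemctl start pcsd
    pcs cluster auth atlas-node1 atlas-node2 -u hacluster -p '<pass>'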
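To put rough numbers on the sizing formula Soumya quotes above (all figures here are illustrative assumptions, not measurements from this thread): with about 10 MB of /var/lib/nfs state per server and, say, 500 clients connected to each of 2 NFS servers, the shared volume only needs on the order of tens of MB, so even a small replicated volume of a few GB leaves ample headroom.

    # illustrative only: plug in your own numbers
    var_lib_nfs_kb=10240     # assumed size of /var/lib/nfs per server (~10 MB)
    clients_per_server=500   # assumed concurrent clients per NFS server
    servers=2
    echo "$(( servers * (var_lib_nfs_kb + clients_per_server * 4) )) KB"   # -> 24480 KB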
Alessandro De Salvo
2015-Jun-08 20:01 UTC
[Gluster-users] Questions on ganesha HA and shared storage size
OK, I found at least one of the bugs. /usr/libexec/ganesha/ganesha.sh has the following lines:

if [ -e /etc/os-release ]; then
    RHEL6_PCS_CNAME_OPTION=""
fi

This is OK for RHEL < 7, but does not work for >= 7. I have changed it to the following, to make it work:

if [ -e /etc/os-release ]; then
    eval $(grep -F "REDHAT_SUPPORT_PRODUCT=" /etc/os-release)
    [ "$REDHAT_SUPPORT_PRODUCT" == "Fedora" ] && RHEL6_PCS_CNAME_OPTION=""
fi

Apart from that, the VIP_<node> names I was using were wrong: I should have converted all the '-' to underscores. Maybe this could be mentioned in the documentation when you have it ready.

Now the cluster starts, but apparently the VIPs do not:

Online: [ atlas-node1 atlas-node2 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ atlas-node1 atlas-node2 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ atlas-node1 atlas-node2 ]
 atlas-node1-cluster_ip-1    (ocf::heartbeat:IPaddr):    Stopped
 atlas-node1-trigger_ip-1    (ocf::heartbeat:Dummy):     Started atlas-node1
 atlas-node2-cluster_ip-1    (ocf::heartbeat:IPaddr):    Stopped
 atlas-node2-trigger_ip-1    (ocf::heartbeat:Dummy):     Started atlas-node2
 atlas-node1-dead_ip-1       (ocf::heartbeat:Dummy):     Started atlas-node1
 atlas-node2-dead_ip-1       (ocf::heartbeat:Dummy):     Started atlas-node2

PCSD Status:
  atlas-node1: Online
  atlas-node2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

But the issue that is puzzling me more is the following:

# showmount -e localhost
rpc mount export: RPC: Timed out

And when I try to enable the ganesha exports on a volume I get this error:

# gluster volume set atlas-home-01 ganesha.enable on
volume set: failed: Failed to create NFS-Ganesha export config file.

But I do see the file created in /etc/ganesha/exports/*.conf.
Still, showmount hangs and times out.
Any help?
Thanks,

Alessandro
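With the hostnames used in this thread, that means the VIP keys in /etc/ganesha/ganesha-ha.conf have to spell the node names with underscores, while HA_CLUSTER_NODES keeps the real hostnames. A sketch of the corrected file (addresses are placeholders, as above):

    HA_NAME="ATLAS_GANESHA_01"
    HA_VOL_SERVER="atlas-node1"
    HA_CLUSTER_NODES="atlas-node1,atlas-node2"
    # hyphens in the node names become underscores in the VIP_* keys
    VIP_atlas_node1="x.x.x.1"
    VIP_atlas_node2="x.x.x.2"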
> On 08 Jun 2015, at 20:00, Alessandro De Salvo <Alessandro.DeSalvo at roma1.infn.it> wrote:
> 
> Hi,
> indeed, it does not work :-)
> OK, this is what I did, with 2 machines, running CentOS 7.1, GlusterFS 3.7.1 and nfs-ganesha 2.2.0:
> 
> 1) ensured that the machines are able to resolve their IPs (but this was already true since they were in the DNS);
> 2) disabled NetworkManager and enabled network on both machines;
> 3) created a gluster shared volume 'gluster_shared_storage' and mounted it on '/run/gluster/shared_storage' on all the cluster nodes using the glusterfs native mount (on CentOS 7.1 /var/run is a symlink to ../run by default);
> 4) created an empty /etc/ganesha/ganesha.conf;
> 5) installed pacemaker, pcs, resource-agents and corosync on all cluster machines;
> 6) set the same password for the 'hacluster' user on all machines;
> 7) ran pcs cluster auth <hostname> -u hacluster -p <pass> on all the nodes (on both nodes I issued the command for both nodes);
> 8) IPv6 is configured by default on all nodes, although the infrastructure is not ready for IPv6;
> 9) enabled and started pcsd on all nodes;
> 10) populated /etc/ganesha/ganesha-ha.conf with the following contents, one per machine:
> 
> 
> ===> atlas-node1
> # Name of the HA cluster created.
> HA_NAME="ATLAS_GANESHA_01"
> # The server from which you intend to mount
> # the shared volume.
> HA_VOL_SERVER="atlas-node1"
> # The subset of nodes of the Gluster Trusted Pool
> # that forms the ganesha HA cluster. IP/Hostname
> # is specified.
> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
> # Virtual IPs of each of the nodes specified above.
> VIP_atlas-node1="x.x.x.1"
> VIP_atlas-node2="x.x.x.2"
> 
> ===> atlas-node2
> # Name of the HA cluster created.
> HA_NAME="ATLAS_GANESHA_01"
> # The server from which you intend to mount
> # the shared volume.
> HA_VOL_SERVER="atlas-node2"
> # The subset of nodes of the Gluster Trusted Pool
> # that forms the ganesha HA cluster. IP/Hostname
> # is specified.
> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
> # Virtual IPs of each of the nodes specified above.
> VIP_atlas-node1="x.x.x.1"
> VIP_atlas-node2="x.x.x.2"
> 
> 11) issued gluster nfs-ganesha enable, but it fails with a cryptic message:
> 
> # gluster nfs-ganesha enable
> Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted pool. Do you still want to continue? (y/n) y
> nfs-ganesha: failed: Failed to set up HA config for NFS-Ganesha. Please check the log file for details
> 
> Looking at the logs I found nothing really special but this:
> 
> ==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
> [2015-06-08 17:57:15.672844] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
> [2015-06-08 17:57:15.675395] I [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host found Hostname is atlas-node2
> [2015-06-08 17:57:15.720692] I [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host found Hostname is atlas-node2
> [2015-06-08 17:57:15.721161] I [glusterd-ganesha.c:335:is_ganesha_host] 0-management: ganesha host found Hostname is atlas-node2
> [2015-06-08 17:57:16.633048] E [glusterd-ganesha.c:254:glusterd_op_set_ganesha] 0-management: Initial NFS-Ganesha set up failed
> [2015-06-08 17:57:16.641563] E [glusterd-syncop.c:1396:gd_commit_op_phase] 0-management: Commit of operation 'Volume (null)' failed on localhost : Failed to set up HA config for NFS-Ganesha. Please check the log file for details
> 
> ==> /var/log/glusterfs/cmd_history.log <==
> [2015-06-08 17:57:16.643615] : nfs-ganesha enable : FAILED : Failed to set up HA config for NFS-Ganesha. Please check the log file for details
> 
> ==> /var/log/glusterfs/cli.log <==
> [2015-06-08 17:57:16.643839] I [input.c:36:cli_batch] 0-: Exiting with: -1
> 
> 
> Also, pcs seems to be fine for the auth part, although it obviously tells me the cluster is not running.
> 
> I, [2015-06-08T19:57:16.305323 #7223]  INFO -- : Running: /usr/sbin/corosync-cmapctl totem.cluster_name
> I, [2015-06-08T19:57:16.345457 #7223]  INFO -- : Running: /usr/sbin/pcs cluster token-nodes
> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET /remote/check_auth HTTP/1.1" 200 68 0.1919
> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET /remote/check_auth HTTP/1.1" 200 68 0.1920
> atlas-node1.mydomain - - [08/Jun/2015:19:57:16 CEST] "GET /remote/check_auth HTTP/1.1" 200 68
> - -> /remote/check_auth
> 
> 
> What am I doing wrong?
> Thanks,
> 
> Alessandro
> 
>> On 08 Jun 2015, at 19:30, Soumya Koduri <skoduri at redhat.com> wrote:
>> 
>> On 06/08/2015 08:20 PM, Alessandro De Salvo wrote:
>>> Sorry, just another question:
>>> 
>>> - in my installation of gluster 3.7.1 the command gluster features.ganesha enable does not work:
>>> 
>>> # gluster features.ganesha enable
>>> unrecognized word: features.ganesha (position 0)
>>> 
>>> Which version has full support for it?
>> 
>> Sorry. This option has recently been changed. It is now
>> 
>> $ gluster nfs-ganesha enable
>> 
>> 
>>> 
>>> - in the documentation the ccs and cman packages are required, but they seem not to be available anymore on CentOS 7 and similar; I guess they are not really required anymore, as pcs should do the full job
>>> 
>>> Thanks,
>>> 
>>> Alessandro
>> 
>> Looks like so from http://clusterlabs.org/quickstart-redhat.html. Let us know if it doesn't work.
>> 
>> Thanks,
>> Soumya
>> 
>>> 
>>>> On 08 Jun 2015, at 15:09, Alessandro De Salvo <alessandro.desalvo at roma1.infn.it> wrote:
>>>> 
>>>> Great, many thanks Soumya!
>>>> Cheers,
>>>> 
>>>> Alessandro
>>>> 
>>>>> On 08 Jun 2015, at 13:53, Soumya Koduri <skoduri at redhat.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Please find the slides of the demo video at [1]
>>>>> 
>>>>> We recommend a distributed replica volume as the shared volume, for better data availability.
>>>>> 
>>>>> The size of the volume depends on the workload you may have. Since it is used to maintain the states of NLM/NFSv4 clients, you may calculate the size of the volume to be a minimum of the aggregate of
>>>>> (typical_size_of'/var/lib/nfs'_directory + ~4k*no_of_clients_connected_to_each_of_the_nfs_servers_at_any_point)
>>>>> 
>>>>> We shall document this feature soon in the gluster docs as well.
>>>>> 
>>>>> Thanks,
>>>>> Soumya
>>>>> 
>>>>> [1] - http://www.slideshare.net/SoumyaKoduri/high-49117846
>>>>> 
>>>>> On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:
>>>>>> Hi,
>>>>>> I have seen the demo video on ganesha HA, https://www.youtube.com/watch?v=Z4mvTQC-efM
>>>>>> However there is no advice on the appropriate size of the shared volume. How is it really used, and what would be a reasonable size for it?
>>>>>> Also, are the slides from the video available somewhere, as well as documentation on all this? I did not manage to find them.
>>>>>> Thanks,
>>>>>> 
>>>>>> Alessandro
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>> 
>>>> 
>>> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
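Regarding the showmount timeout reported above, a quick checklist that may help narrow it down (a sketch only; the service and log names are the usual ones for the nfs-ganesha 2.2 packages and may differ on other builds):

    # is the ganesha daemon actually running on this node?
    systemctl status nfs-ganesha
    # is rpcbind up, and did ganesha register the NFS/MOUNT programs with it?
    systemctl status rpcbind
    rpcinfo -p localhost
    # is anything listening on the NFS port?
    ss -tln | grep 2049
    # ganesha's own log (path may vary by package/version)
    tail -n 50 /var/log/ganesha.log

If rpcinfo shows no mountd/nfs entries, showmount will time out exactly as described, even when the export config files under /etc/ganesha/exports/ were written correctly.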