Alessandro De Salvo
2015-Jun-10 00:19 UTC
[Gluster-users] Questions on ganesha HA and shared storage size
Hi,
I have already enabled full debug, but I see nothing special. Before exporting any volume the log shows no error, even when I do a showmount (the log is attached, ganesha.log.gz). If I do the same after exporting a volume, nfs-ganesha does not even start, complaining that it cannot bind the IPv6 RQUOTA socket, but in fact nothing is listening on that port on IPv6, so it should not happen:

tcp6  0  0  :::111                   :::*  LISTEN  7433/rpcbind
tcp6  0  0  :::2224                  :::*  LISTEN  9054/ruby
tcp6  0  0  :::22                    :::*  LISTEN  1248/sshd
udp6  0  0  :::111                   :::*          7433/rpcbind
udp6  0  0  fe80::8c2:27ff:fef2:123  :::*          31238/ntpd
udp6  0  0  fe80::230:48ff:fed2:123  :::*          31238/ntpd
udp6  0  0  fe80::230:48ff:fed2:123  :::*          31238/ntpd
udp6  0  0  fe80::230:48ff:fed2:123  :::*          31238/ntpd
udp6  0  0  ::1:123                  :::*          31238/ntpd
udp6  0  0  fe80::5484:7aff:fef:123  :::*          31238/ntpd
udp6  0  0  :::123                   :::*          31238/ntpd
udp6  0  0  :::824                   :::*          7433/rpcbind

The error, as shown in the attached ganesha-after-export.log.gz logfile, is the following:

10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets_V6 :DISP :WARN :Cannot bind RQUOTA tcp6 socket, error 98 (Address already in use)
10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets :DISP :FATAL :Error binding to V6 interface. Cannot continue.
10/06/2015 02:07:48 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] glusterfs_unload :FSAL :DEBUG :FSAL Gluster unloaded

Thanks,

	Alessandro

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ganesha.log.gz
Type: application/x-gzip
Size: 19427 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150610/904ae82b/attachment.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ganesha-after-export.log.gz
Type: application/x-gzip
Size: 6936 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150610/904ae82b/attachment-0001.gz>
-------------- next part --------------
> On 09 Jun 2015, at 18:37, Soumya Koduri <skoduri at redhat.com> wrote:
>
> On 06/09/2015 09:47 PM, Alessandro De Salvo wrote:
>> Another update: the fact that I was unable to use vol set ganesha.enable
>> was due to another bug in the ganesha scripts. In short, they are all
>> using the following line to get the location of the conf file:
>>
>> CONF=$(cat /etc/sysconfig/ganesha | grep "CONFFILE" | cut -f 2 -d "=")
>>
>> First of all, by default there is no CONFFILE line in
>> /etc/sysconfig/ganesha. Second, there is a bug in that directive, as it
>> works if I add in /etc/sysconfig/ganesha
>>
>> CONFFILE=/etc/ganesha/ganesha.conf
>>
>> but it fails if the same is quoted
>>
>> CONFFILE="/etc/ganesha/ganesha.conf"
>>
>> It would be much better to use the following, which has a default as
>> well:
>>
>> eval $(grep -F CONFFILE= /etc/sysconfig/ganesha)
>> CONF=${CONFFILE:-/etc/ganesha/ganesha.conf}
>>
>> I'll update the bug report.
>> Having said this... the last issue to tackle is the real problem with
>> the ganesha.nfsd :-(
>
> Thanks. Could you try changing the log level to NIV_FULL_DEBUG in '/etc/sysconfig/ganesha' and check if anything gets logged in '/var/log/ganesha.log' or '/ganesha.log'.
>
> Thanks,
> Soumya
>
>> Cheers,
>>
>> Alessandro
>>
>>
>> On Tue, 2015-06-09 at 14:25 +0200, Alessandro De Salvo wrote:
>>> OK, I can confirm that the ganesha.nfsd process is actually not
>>> answering the calls.
Here it is what I see: >>> >>> # rpcinfo -p >>> program vers proto port service >>> 100000 4 tcp 111 portmapper >>> 100000 3 tcp 111 portmapper >>> 100000 2 tcp 111 portmapper >>> 100000 4 udp 111 portmapper >>> 100000 3 udp 111 portmapper >>> 100000 2 udp 111 portmapper >>> 100024 1 udp 41594 status >>> 100024 1 tcp 53631 status >>> 100003 3 udp 2049 nfs >>> 100003 3 tcp 2049 nfs >>> 100003 4 udp 2049 nfs >>> 100003 4 tcp 2049 nfs >>> 100005 1 udp 58127 mountd >>> 100005 1 tcp 56301 mountd >>> 100005 3 udp 58127 mountd >>> 100005 3 tcp 56301 mountd >>> 100021 4 udp 46203 nlockmgr >>> 100021 4 tcp 41798 nlockmgr >>> 100011 1 udp 875 rquotad >>> 100011 1 tcp 875 rquotad >>> 100011 2 udp 875 rquotad >>> 100011 2 tcp 875 rquotad >>> >>> # netstat -lpn | grep ganesha >>> tcp6 14 0 :::2049 :::* >>> LISTEN 11937/ganesha.nfsd >>> tcp6 0 0 :::41798 :::* >>> LISTEN 11937/ganesha.nfsd >>> tcp6 0 0 :::875 :::* >>> LISTEN 11937/ganesha.nfsd >>> tcp6 10 0 :::56301 :::* >>> LISTEN 11937/ganesha.nfsd >>> tcp6 0 0 :::564 :::* >>> LISTEN 11937/ganesha.nfsd >>> udp6 0 0 :::2049 :::* >>> 11937/ganesha.nfsd >>> udp6 0 0 :::46203 :::* >>> 11937/ganesha.nfsd >>> udp6 0 0 :::58127 :::* >>> 11937/ganesha.nfsd >>> udp6 0 0 :::875 :::* >>> 11937/ganesha.nfsd >>> >>> I'm attaching the strace of a showmount from a node to the other. >>> This machinery was working with nfs-ganesha 2.1.0, so it must be >>> something introduced with 2.2.0. >>> Cheers, >>> >>> Alessandro >>> >>> >>> >>> On Tue, 2015-06-09 at 15:16 +0530, Soumya Koduri wrote: >>>> >>>> On 06/09/2015 02:48 PM, Alessandro De Salvo wrote: >>>>> Hi, >>>>> OK, the problem with the VIPs not starting is due to the ganesha_mon >>>>> heartbeat script looking for a pid file called >>>>> /var/run/ganesha.nfsd.pid, while by default ganesha.nfsd v.2.2.0 is >>>>> creating /var/run/ganesha.pid, this needs to be corrected. The file is >>>>> in glusterfs-ganesha-3.7.1-1.el7.x86_64, in my case. 
>>>>> For the moment I have created a symlink in this way and it works: >>>>> >>>>> ln -s /var/run/ganesha.pid /var/run/ganesha.nfsd.pid >>>>> >>>> Thanks. Please update this as well in the bug. >>>> >>>>> So far so good, the VIPs are up and pingable, but still there is the >>>>> problem of the hanging showmount (i.e. hanging RPC). >>>>> Still, I see a lot of errors like this in /var/log/messages: >>>>> >>>>> Jun 9 11:15:20 atlas-node1 lrmd[31221]: notice: operation_finished: >>>>> nfs-mon_monitor_10000:29292:stderr [ Error: Resource does not exist. ] >>>>> >>>>> While ganesha.log shows the server is not in grace: >>>>> >>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29964[main] main :MAIN :EVENT :ganesha.nfsd Starting: >>>>> Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.2.0/src, built at >>>>> May 18 2015 14:17:18 on buildhw-09.phx2.fedoraproject.org >>>>> <http://buildhw-09.phx2.fedoraproject.org> >>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_set_param_from_conf :NFS STARTUP :EVENT >>>>> :Configuration file successfully parsed >>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] init_server_pkgs :NFS STARTUP :EVENT >>>>> :Initializing ID Mapper. >>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper >>>>> successfully initialized. >>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] main :NFS STARTUP :WARN :No export entries >>>>> found in configuration file !!! 
>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] config_errs_to_log :CONFIG :WARN :Config File >>>>> ((null):0): Empty configuration file >>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] lower_my_caps :NFS STARTUP :EVENT >>>>> :CAP_SYS_RESOURCE was successfully removed for proper quota management >>>>> in FSAL >>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] lower_my_caps :NFS STARTUP :EVENT :currenty set >>>>> capabilities are: >>>>> cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap+ep >>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Init_svc :DISP :CRIT :Cannot acquire >>>>> credentials for principal nfs >>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Init_admin_thread :NFS CB :EVENT :Admin >>>>> thread initialized >>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs4_start_grace :STATE :EVENT :NFS Server Now >>>>> IN GRACE, duration 60 >>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_rpc_cb_init_ccache :NFS STARTUP :EVENT >>>>> :Callback creds directory (/var/run/ganesha) already exists >>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_rpc_cb_init_ccache :NFS STARTUP :WARN >>>>> :gssd_refresh_krb5_machine_credential failed (2:2) >>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD 
:EVENT :Starting >>>>> delayed executor. >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :9P/TCP >>>>> dispatcher thread was started successfully >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[_9p_disp] _9p_dispatcher_thread :9P DISP :EVENT :9P >>>>> dispatcher started >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT >>>>> :gsh_dbusthread was started successfully >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :admin thread >>>>> was started successfully >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :reaper thread >>>>> was started successfully >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now IN >>>>> GRACE >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :General >>>>> fridge was started successfully >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT >>>>> :------------------------------------------------- >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT : NFS >>>>> SERVER INITIALIZED >>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT >>>>> :------------------------------------------------- >>>>> 09/06/2015 11:17:22 : epoch 5576aee4 : atlas-node1 : >>>>> ganesha.nfsd-29965[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now >>>>> NOT IN GRACE >>>>> >>>>> >>>> Please check the status of nfs-ganesha >>>> $service nfs-ganesha status >>>> >>>> Could you try taking a packet 
trace (during showmount or mount) and
>>>> check the server responses.
>>>>
>>>> Thanks,
>>>> Soumya
>>>>
>>>>> Cheers,
>>>>>
>>>>> Alessandro
>>>>>
>>>>>
>>>>>> On 09 Jun 2015, at 10:36, Alessandro De Salvo
>>>>>> <alessandro.desalvo at roma1.infn.it> wrote:
>>>>>>
>>>>>> Hi Soumya,
>>>>>>
>>>>>>> On 09 Jun 2015, at 08:06, Soumya Koduri
>>>>>>> <skoduri at redhat.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 06/09/2015 01:31 AM, Alessandro De Salvo wrote:
>>>>>>>> OK, I found at least one of the bugs.
>>>>>>>> The /usr/libexec/ganesha/ganesha.sh has the following lines:
>>>>>>>>
>>>>>>>> if [ -e /etc/os-release ]; then
>>>>>>>>     RHEL6_PCS_CNAME_OPTION=""
>>>>>>>> fi
>>>>>>>>
>>>>>>>> This is OK for RHEL < 7, but does not work for >= 7. I have changed
>>>>>>>> it to the following, to make it work:
>>>>>>>>
>>>>>>>> if [ -e /etc/os-release ]; then
>>>>>>>>     eval $(grep -F "REDHAT_SUPPORT_PRODUCT=" /etc/os-release)
>>>>>>>>     [ "$REDHAT_SUPPORT_PRODUCT" == "Fedora" ] &&
>>>>>>>>         RHEL6_PCS_CNAME_OPTION=""
>>>>>>>> fi
>>>>>>>>
>>>>>>> Oh..Thanks for the fix. Could you please file a bug for the same (and
>>>>>>> probably submit your fix as well). We shall have it corrected.
>>>>>>
>>>>>> Just did it, https://bugzilla.redhat.com/show_bug.cgi?id=1229601
>>>>>>
>>>>>>>
>>>>>>>> Apart from that, the VIP_<node> names I was using were wrong: I should
>>>>>>>> have converted all the '-' to underscores. Maybe this could be
>>>>>>>> mentioned in the documentation when you have it ready.
>>>>>>>> Now, the cluster starts, but the VIPs apparently do not:
>>>>>>>>
>>>>>>> Sure. Thanks again for pointing it out. We shall make a note of it.
>>>>>>>
>>>>>>>> Online: [ atlas-node1 atlas-node2 ]
>>>>>>>>
>>>>>>>> Full list of resources:
>>>>>>>>
>>>>>>>>  Clone Set: nfs-mon-clone [nfs-mon]
>>>>>>>>      Started: [ atlas-node1 atlas-node2 ]
>>>>>>>>  Clone Set: nfs-grace-clone [nfs-grace]
>>>>>>>>      Started: [ atlas-node1 atlas-node2 ]
>>>>>>>>  atlas-node1-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>>>>>>>  atlas-node1-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>>>>>>>  atlas-node2-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>>>>>>>  atlas-node2-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>>>>>>  atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>>>>>>>  atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>>>>>>
>>>>>>>> PCSD Status:
>>>>>>>>   atlas-node1: Online
>>>>>>>>   atlas-node2: Online
>>>>>>>>
>>>>>>>> Daemon Status:
>>>>>>>>   corosync: active/disabled
>>>>>>>>   pacemaker: active/disabled
>>>>>>>>   pcsd: active/enabled
>>>>>>>>
>>>>>>>>
>>>>>>> Here corosync and pacemaker show the 'disabled' state. Can you check the
>>>>>>> status of their services? They should be running prior to cluster
>>>>>>> creation. We need to include that step in the document as well.
>>>>>>
>>>>>> Ah, OK, you're right, I have added it to my puppet modules (we install
>>>>>> and configure ganesha via puppet, I'll put the module on puppetforge
>>>>>> soon, in case anyone is interested).
>>>>>>
>>>>>>>
>>>>>>>> But the issue that is puzzling me more is the following:
>>>>>>>>
>>>>>>>> # showmount -e localhost
>>>>>>>> rpc mount export: RPC: Timed out
>>>>>>>>
>>>>>>>> And when I try to enable the ganesha exports on a volume I get this
>>>>>>>> error:
>>>>>>>>
>>>>>>>> # gluster volume set atlas-home-01 ganesha.enable on
>>>>>>>> volume set: failed: Failed to create NFS-Ganesha export config file.
>>>>>>>>
>>>>>>>> But I see the file created in /etc/ganesha/exports/*.conf
>>>>>>>> Still, showmount hangs and times out.
>>>>>>>> Any help?
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>> Hmm that's strange. Sometimes, if there was no proper cleanup
>>>>>>> done while trying to re-create the cluster, we have seen such issues.
>>>>>>>
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1227709
>>>>>>>
>>>>>>> http://review.gluster.org/#/c/11093/
>>>>>>>
>>>>>>> Can you please unexport all the volumes, teardown the cluster using
>>>>>>> 'gluster vol set <volname> ganesha.enable off'
>>>>>>
>>>>>> OK:
>>>>>>
>>>>>> # gluster vol set atlas-home-01 ganesha.enable off
>>>>>> volume set: failed: ganesha.enable is already 'off'.
>>>>>>
>>>>>> # gluster vol set atlas-data-01 ganesha.enable off
>>>>>> volume set: failed: ganesha.enable is already 'off'.
>>>>>>
>>>>>>
>>>>>>> 'gluster ganesha disable' command.
>>>>>>
>>>>>> I'm assuming you wanted to write nfs-ganesha instead?
>>>>>>
>>>>>> # gluster nfs-ganesha disable
>>>>>> ganesha enable : success
>>>>>>
>>>>>>
>>>>>> A side note (not really important): it's strange that when I do a
>>>>>> disable the message is 'ganesha enable' :-)
>>>>>>
>>>>>>>
>>>>>>> Verify if the following files have been deleted on all the nodes-
>>>>>>> '/etc/cluster/cluster.conf'
>>>>>>
>>>>>> this file is not present at all, I think it's not needed in CentOS 7
>>>>>>
>>>>>>> '/etc/ganesha/ganesha.conf',
>>>>>>
>>>>>> it's still there, but empty, and I guess it should be OK, right?
>>>>>>
>>>>>>> '/etc/ganesha/exports/*'
>>>>>>
>>>>>> no more files there
>>>>>>
>>>>>>> '/var/lib/pacemaker/cib'
>>>>>>
>>>>>> it's empty
>>>>>>
>>>>>>>
>>>>>>> Verify if the ganesha service is stopped on all the nodes.
>>>>>>
>>>>>> nope, it's still running, I will stop it.
>>>>>>
>>>>>>>
>>>>>>> start/restart the services - corosync, pcs.
>>>>>>
>>>>>> On the node where I issued the nfs-ganesha disable there is no longer
>>>>>> any /etc/corosync/corosync.conf, so corosync won't start. The other
>>>>>> node instead still has the file, it's strange.
>>>>>>
>>>>>>>
>>>>>>> And re-try the HA cluster creation
>>>>>>> 'gluster ganesha enable'
>>>>>>
>>>>>> This time (repeated twice) it did not work at all:
>>>>>>
>>>>>> # pcs status
>>>>>> Cluster name: ATLAS_GANESHA_01
>>>>>> Last updated: Tue Jun 9 10:13:43 2015
>>>>>> Last change: Tue Jun 9 10:13:22 2015
>>>>>> Stack: corosync
>>>>>> Current DC: atlas-node1 (1) - partition with quorum
>>>>>> Version: 1.1.12-a14efad
>>>>>> 2 Nodes configured
>>>>>> 6 Resources configured
>>>>>>
>>>>>>
>>>>>> Online: [ atlas-node1 atlas-node2 ]
>>>>>>
>>>>>> Full list of resources:
>>>>>>
>>>>>>  Clone Set: nfs-mon-clone [nfs-mon]
>>>>>>      Started: [ atlas-node1 atlas-node2 ]
>>>>>>  Clone Set: nfs-grace-clone [nfs-grace]
>>>>>>      Started: [ atlas-node1 atlas-node2 ]
>>>>>>  atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>>>>>  atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>>>>
>>>>>> PCSD Status:
>>>>>>   atlas-node1: Online
>>>>>>   atlas-node2: Online
>>>>>>
>>>>>> Daemon Status:
>>>>>>   corosync: active/enabled
>>>>>>   pacemaker: active/enabled
>>>>>>   pcsd: active/enabled
>>>>>>
>>>>>>
>>>>>> I then tried "pcs cluster destroy" on both nodes, and then again
>>>>>> nfs-ganesha enable, but now I'm back to the old problem:
>>>>>>
>>>>>> # pcs status
>>>>>> Cluster name: ATLAS_GANESHA_01
>>>>>> Last updated: Tue Jun 9 10:22:27 2015
>>>>>> Last change: Tue Jun 9 10:17:00 2015
>>>>>> Stack: corosync
>>>>>> Current DC: atlas-node2 (2) - partition with quorum
>>>>>> Version: 1.1.12-a14efad
>>>>>> 2 Nodes configured
>>>>>> 10 Resources configured
>>>>>>
>>>>>>
>>>>>> Online: [ atlas-node1 atlas-node2 ]
>>>>>>
>>>>>> Full list of resources:
>>>>>>
>>>>>>  Clone Set: nfs-mon-clone [nfs-mon]
>>>>>>      Started: [ atlas-node1 atlas-node2 ]
>>>>>>  Clone Set: nfs-grace-clone [nfs-grace]
>>>>>>      Started: [ atlas-node1 atlas-node2 ]
>>>>>>  atlas-node1-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>>>>>  atlas-node1-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>>>>>  atlas-node2-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>>>>>  atlas-node2-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>>>>  atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>>>>>  atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>>>>
>>>>>> PCSD Status:
>>>>>>   atlas-node1: Online
>>>>>>   atlas-node2: Online
>>>>>>
>>>>>> Daemon Status:
>>>>>>   corosync: active/enabled
>>>>>>   pacemaker: active/enabled
>>>>>>   pcsd: active/enabled
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Alessandro
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Soumya
>>>>>>>
>>>>>>>> Alessandro
>>>>>>>>
>>>>>>>>> On 08 Jun 2015, at 20:00, Alessandro De Salvo
>>>>>>>>> <Alessandro.DeSalvo at roma1.infn.it> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> indeed, it does not work :-)
>>>>>>>>> OK, this is what I did, with 2 machines, running CentOS 7.1,
>>>>>>>>> Glusterfs 3.7.1 and nfs-ganesha 2.2.0:
>>>>>>>>>
>>>>>>>>> 1) ensured that the machines are able to resolve their IPs (but
>>>>>>>>> this was already true since they were in the DNS);
>>>>>>>>> 2) disabled NetworkManager and enabled network on both machines;
>>>>>>>>> 3) created a gluster shared volume 'gluster_shared_storage' and
>>>>>>>>> mounted it on '/run/gluster/shared_storage' on all the cluster
>>>>>>>>> nodes using glusterfs native mount (on CentOS 7.1 there is a link
>>>>>>>>> by default /var/run -> ../run)
>>>>>>>>> 4) created an empty /etc/ganesha/ganesha.conf;
>>>>>>>>> 5) installed pacemaker pcs resource-agents corosync on all cluster
>>>>>>>>> machines;
>>>>>>>>> 6) set the 'hacluster'
user the same password on all machines;
>>>>>>>>> 7) pcs cluster auth <hostname> -u hacluster -p <pass> on all the
>>>>>>>>> nodes (on both nodes I issued the commands for both nodes)
>>>>>>>>> 8) IPv6 is configured by default on all nodes, although the
>>>>>>>>> infrastructure is not ready for IPv6
>>>>>>>>> 9) enabled pcsd and started it on all nodes
>>>>>>>>> 10) populated /etc/ganesha/ganesha-ha.conf with the following
>>>>>>>>> contents, one per machine:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ===> atlas-node1
>>>>>>>>> # Name of the HA cluster created.
>>>>>>>>> HA_NAME="ATLAS_GANESHA_01"
>>>>>>>>> # The server from which you intend to mount
>>>>>>>>> # the shared volume.
>>>>>>>>> HA_VOL_SERVER="atlas-node1"
>>>>>>>>> # The subset of nodes of the Gluster Trusted Pool
>>>>>>>>> # that forms the ganesha HA cluster. IP/Hostname
>>>>>>>>> # is specified.
>>>>>>>>> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
>>>>>>>>> # Virtual IPs of each of the nodes specified above.
>>>>>>>>> VIP_atlas-node1="x.x.x.1"
>>>>>>>>> VIP_atlas-node2="x.x.x.2"
>>>>>>>>>
>>>>>>>>> ===> atlas-node2
>>>>>>>>> # Name of the HA cluster created.
>>>>>>>>> HA_NAME="ATLAS_GANESHA_01"
>>>>>>>>> # The server from which you intend to mount
>>>>>>>>> # the shared volume.
>>>>>>>>> HA_VOL_SERVER="atlas-node2"
>>>>>>>>> # The subset of nodes of the Gluster Trusted Pool
>>>>>>>>> # that forms the ganesha HA cluster. IP/Hostname
>>>>>>>>> # is specified.
>>>>>>>>> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
>>>>>>>>> # Virtual IPs of each of the nodes specified above.
>>>>>>>>> VIP_atlas-node1="x.x.x.1"
>>>>>>>>> VIP_atlas-node2="x.x.x.2"
>>>>>>>>>
>>>>>>>>> 11) issued gluster nfs-ganesha enable, but it fails with a cryptic
>>>>>>>>> message:
>>>>>>>>>
>>>>>>>>> # gluster nfs-ganesha enable
>>>>>>>>> Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the
>>>>>>>>> trusted pool. Do you still want to continue? (y/n) y
>>>>>>>>> nfs-ganesha: failed: Failed to set up HA config for NFS-Ganesha.
>>>>>>>>> Please check the log file for details
>>>>>>>>>
>>>>>>>>> Looking at the logs I found nothing really special but this:
>>>>>>>>>
>>>>>>>>> ==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
>>>>>>>>> [2015-06-08 17:57:15.672844] I [MSGID: 106132]
>>>>>>>>> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs
>>>>>>>>> already stopped
>>>>>>>>> [2015-06-08 17:57:15.675395] I
>>>>>>>>> [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host
>>>>>>>>> found Hostname is atlas-node2
>>>>>>>>> [2015-06-08 17:57:15.720692] I
>>>>>>>>> [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host
>>>>>>>>> found Hostname is atlas-node2
>>>>>>>>> [2015-06-08 17:57:15.721161] I
>>>>>>>>> [glusterd-ganesha.c:335:is_ganesha_host] 0-management: ganesha host
>>>>>>>>> found Hostname is atlas-node2
>>>>>>>>> [2015-06-08 17:57:16.633048] E
>>>>>>>>> [glusterd-ganesha.c:254:glusterd_op_set_ganesha] 0-management:
>>>>>>>>> Initial NFS-Ganesha set up failed
>>>>>>>>> [2015-06-08 17:57:16.641563] E
>>>>>>>>> [glusterd-syncop.c:1396:gd_commit_op_phase] 0-management: Commit of
>>>>>>>>> operation 'Volume (null)' failed on localhost : Failed to set up HA
>>>>>>>>> config for NFS-Ganesha. Please check the log file for details
>>>>>>>>>
>>>>>>>>> ==> /var/log/glusterfs/cmd_history.log <==
>>>>>>>>> [2015-06-08 17:57:16.643615] : nfs-ganesha enable : FAILED :
>>>>>>>>> Failed to set up HA config for NFS-Ganesha. Please check the log
>>>>>>>>> file for details
>>>>>>>>>
>>>>>>>>> ==> /var/log/glusterfs/cli.log <==
>>>>>>>>> [2015-06-08 17:57:16.643839] I [input.c:36:cli_batch] 0-: Exiting
>>>>>>>>> with: -1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Also, pcs seems to be fine for the auth part, although it obviously
>>>>>>>>> tells me the cluster is not running.
>>>>>>>>>
>>>>>>>>> I, [2015-06-08T19:57:16.305323 #7223] INFO -- : Running:
>>>>>>>>> /usr/sbin/corosync-cmapctl totem.cluster_name
>>>>>>>>> I, [2015-06-08T19:57:16.345457 #7223] INFO -- : Running:
>>>>>>>>> /usr/sbin/pcs cluster token-nodes
>>>>>>>>> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET
>>>>>>>>> /remote/check_auth HTTP/1.1" 200 68 0.1919
>>>>>>>>> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET
>>>>>>>>> /remote/check_auth HTTP/1.1" 200 68 0.1920
>>>>>>>>> atlas-node1.mydomain - - [08/Jun/2015:19:57:16 CEST] "GET
>>>>>>>>> /remote/check_auth HTTP/1.1" 200 68
>>>>>>>>> - -> /remote/check_auth
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> What am I doing wrong?
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Alessandro
>>>>>>>>>
>>>>>>>>>> On 08 Jun 2015, at 19:30, Soumya Koduri
>>>>>>>>>> <skoduri at redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 06/08/2015 08:20 PM, Alessandro De Salvo wrote:
>>>>>>>>>>> Sorry, just another question:
>>>>>>>>>>>
>>>>>>>>>>> - in my installation of gluster 3.7.1 the command gluster
>>>>>>>>>>> features.ganesha enable does not work:
>>>>>>>>>>>
>>>>>>>>>>> # gluster features.ganesha enable
>>>>>>>>>>> unrecognized word: features.ganesha (position 0)
>>>>>>>>>>>
>>>>>>>>>>> Which version has full support for it?
>>>>>>>>>>
>>>>>>>>>> Sorry. This option has recently been changed. It is now
>>>>>>>>>>
>>>>>>>>>> $ gluster nfs-ganesha enable
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> - in the documentation the ccs and cman packages are required,
>>>>>>>>>>> but they seem not to be available anymore on CentOS 7 and
>>>>>>>>>>> similar. I guess they are not really required anymore, as pcs
>>>>>>>>>>> should do the full job
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Alessandro
>>>>>>>>>>
>>>>>>>>>> Looks like so from http://clusterlabs.org/quickstart-redhat.html.
>>>>>>>>>> Let us know if it doesn't work.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Soumya
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On 08 Jun 2015, at 15:09, Alessandro De Salvo
>>>>>>>>>>>> <alessandro.desalvo at roma1.infn.it> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Great, many thanks Soumya!
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> Alessandro
>>>>>>>>>>>>
>>>>>>>>>>>>> On 08 Jun 2015, at 13:53, Soumya Koduri
>>>>>>>>>>>>> <skoduri at redhat.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please find the slides of the demo video at [1]
>>>>>>>>>>>>>
>>>>>>>>>>>>> We recommend using a distributed replica volume as the shared
>>>>>>>>>>>>> volume for better data-availability.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Size of the volume depends on the workload you may have. Since
>>>>>>>>>>>>> it is used to maintain states of NLM/NFSv4 clients, you may
>>>>>>>>>>>>> calculate the size of the volume to be minimum of aggregate of
>>>>>>>>>>>>> (typical_size_of'/var/lib/nfs'_directory +
>>>>>>>>>>>>> ~4k*no_of_clients_connected_to_each_of_the_nfs_servers_at_any_point)
>>>>>>>>>>>>>
>>>>>>>>>>>>> We shall document this feature soon in the gluster docs
>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Soumya
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] - http://www.slideshare.net/SoumyaKoduri/high-49117846
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> I have seen the demo video on ganesha HA,
>>>>>>>>>>>>>> https://www.youtube.com/watch?v=Z4mvTQC-efM
>>>>>>>>>>>>>> However there is no advice on the appropriate size of the
>>>>>>>>>>>>>> shared volume. How is it really used, and what should be a
>>>>>>>>>>>>>> reasonable size for it?
>>>>>>>>>>>>>> Also, are the slides from the video available somewhere, as
>>>>>>>>>>>>>> well as documentation on all this? I did not manage to find
>>>>>>>>>>>>>> them.
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Alessandro
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
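The CONFFILE lookup discussed in this thread can also be sketched without eval. What follows is a minimal, hypothetical helper, not code from the ganesha scripts: the name get_conffile is made up for illustration, and it assumes /etc/sysconfig/ganesha holds plain KEY=value lines. It accepts both the quoted and the unquoted form of the value and falls back to /etc/ganesha/ganesha.conf when no CONFFILE line is present.

```shell
#!/bin/sh
# Hypothetical helper (illustration only, not part of the ganesha scripts):
# print the CONFFILE value found in a sysconfig-style file, or a default.
get_conffile() {
    sysconfig=${1:-/etc/sysconfig/ganesha}
    default=/etc/ganesha/ganesha.conf
    value=
    if [ -r "$sysconfig" ]; then
        # Keep the last uncommented CONFFILE= line and drop the key.
        value=$(sed -n 's/^[[:space:]]*CONFFILE=//p' "$sysconfig" | tail -n 1)
        # Strip one pair of surrounding double or single quotes, if present.
        value=$(printf '%s\n' "$value" |
            sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/")
    fi
    # ${value:-...} (note the "-") substitutes the default when value is
    # unset or empty.
    printf '%s\n' "${value:-$default}"
}
```

A script could then set CONF=$(get_conffile). Parsing the line instead of evaluating it avoids executing whatever else happens to match in the sysconfig file, at the cost of not supporting shell expansions inside the value.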
Soumya Koduri
2015-Jun-10 09:58 UTC
[Gluster-users] Questions on ganesha HA and shared storage size
On 06/10/2015 05:49 AM, Alessandro De Salvo wrote:
> Hi,
> I have already enabled full debug, but I see nothing special. Before exporting any volume the log shows no error, even when I do a showmount (the log is attached, ganesha.log.gz). If I do the same after exporting a volume, nfs-ganesha does not even start, complaining that it cannot bind the IPv6 RQUOTA socket, but in fact nothing is listening on that port on IPv6, so it should not happen:
>
> tcp6  0  0  :::111                   :::*  LISTEN  7433/rpcbind
> tcp6  0  0  :::2224                  :::*  LISTEN  9054/ruby
> tcp6  0  0  :::22                    :::*  LISTEN  1248/sshd
> udp6  0  0  :::111                   :::*          7433/rpcbind
> udp6  0  0  fe80::8c2:27ff:fef2:123  :::*          31238/ntpd
> udp6  0  0  fe80::230:48ff:fed2:123  :::*          31238/ntpd
> udp6  0  0  fe80::230:48ff:fed2:123  :::*          31238/ntpd
> udp6  0  0  fe80::230:48ff:fed2:123  :::*          31238/ntpd
> udp6  0  0  ::1:123                  :::*          31238/ntpd
> udp6  0  0  fe80::5484:7aff:fef:123  :::*          31238/ntpd
> udp6  0  0  :::123                   :::*          31238/ntpd
> udp6  0  0  :::824                   :::*          7433/rpcbind
>
> The error, as shown in the attached ganesha-after-export.log.gz logfile, is the following:
>
> 10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets_V6 :DISP :WARN :Cannot bind RQUOTA tcp6 socket, error 98 (Address already in use)
> 10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets :DISP :FATAL :Error binding to V6 interface. Cannot continue.
> 10/06/2015 02:07:48 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] glusterfs_unload :FSAL :DEBUG :FSAL Gluster unloaded
>

We have seen such issues with rpcbind a few times. The NFS-Ganesha setup first disables Gluster-NFS and then brings up the NFS-Ganesha service. Sometimes there can be a delay or a problem with Gluster-NFS un-registering those services, and when NFS-Ganesha then tries to register on the same port, it throws this error.
Please try registering Rquota on some other port, using the config option below in "/etc/ganesha/ganesha.conf":

NFS_Core_Param {
        #Use a non-privileged port for RQuota
        Rquota_Port = 4501;
}

and clean up the '/var/cache/rpcbind/' directory before the setup.

Thanks,
Soumya

> Thanks,
>
> Alessandro
>
>> On 09 Jun 2015, at 18:37, Soumya Koduri <skoduri at redhat.com> wrote:
>>
>> On 06/09/2015 09:47 PM, Alessandro De Salvo wrote:
>>> Another update: the fact that I was unable to use vol set ganesha.enable
>>> was due to another bug in the ganesha scripts. In short, they are all
>>> using the following line to get the location of the conf file:
>>>
>>> CONF=$(cat /etc/sysconfig/ganesha | grep "CONFFILE" | cut -f 2 -d "=")
>>>
>>> First of all, by default there is no CONFFILE line in
>>> /etc/sysconfig/ganesha. Second, there is a bug in that directive, as it
>>> works if I add in /etc/sysconfig/ganesha
>>>
>>> CONFFILE=/etc/ganesha/ganesha.conf
>>>
>>> but it fails if the same is quoted
>>>
>>> CONFFILE="/etc/ganesha/ganesha.conf"
>>>
>>> It would be much better to use the following, which has a default as
>>> well:
>>>
>>> eval $(grep -F CONFFILE= /etc/sysconfig/ganesha)
>>> CONF=${CONFFILE:-/etc/ganesha/ganesha.conf}
>>>
>>> I'll update the bug report.
>>> Having said this... the last issue to tackle is the real problem with
>>> the ganesha.nfsd :-(
>>
>> Thanks. Could you try changing the log level to NIV_FULL_DEBUG in '/etc/sysconfig/ganesha' and check if anything gets logged in '/var/log/ganesha.log' or '/ganesha.log'.
>>
>> Thanks,
>> Soumya
>>
>>> Cheers,
>>>
>>> Alessandro
>>>
>>>
>>> On Tue, 2015-06-09 at 14:25 +0200, Alessandro De Salvo wrote:
>>>> OK, I can confirm that the ganesha.nfsd process is actually not
>>>> answering the calls.
Here it is what I see: >>>> >>>> # rpcinfo -p >>>> program vers proto port service >>>> 100000 4 tcp 111 portmapper >>>> 100000 3 tcp 111 portmapper >>>> 100000 2 tcp 111 portmapper >>>> 100000 4 udp 111 portmapper >>>> 100000 3 udp 111 portmapper >>>> 100000 2 udp 111 portmapper >>>> 100024 1 udp 41594 status >>>> 100024 1 tcp 53631 status >>>> 100003 3 udp 2049 nfs >>>> 100003 3 tcp 2049 nfs >>>> 100003 4 udp 2049 nfs >>>> 100003 4 tcp 2049 nfs >>>> 100005 1 udp 58127 mountd >>>> 100005 1 tcp 56301 mountd >>>> 100005 3 udp 58127 mountd >>>> 100005 3 tcp 56301 mountd >>>> 100021 4 udp 46203 nlockmgr >>>> 100021 4 tcp 41798 nlockmgr >>>> 100011 1 udp 875 rquotad >>>> 100011 1 tcp 875 rquotad >>>> 100011 2 udp 875 rquotad >>>> 100011 2 tcp 875 rquotad >>>> >>>> # netstat -lpn | grep ganesha >>>> tcp6 14 0 :::2049 :::* >>>> LISTEN 11937/ganesha.nfsd >>>> tcp6 0 0 :::41798 :::* >>>> LISTEN 11937/ganesha.nfsd >>>> tcp6 0 0 :::875 :::* >>>> LISTEN 11937/ganesha.nfsd >>>> tcp6 10 0 :::56301 :::* >>>> LISTEN 11937/ganesha.nfsd >>>> tcp6 0 0 :::564 :::* >>>> LISTEN 11937/ganesha.nfsd >>>> udp6 0 0 :::2049 :::* >>>> 11937/ganesha.nfsd >>>> udp6 0 0 :::46203 :::* >>>> 11937/ganesha.nfsd >>>> udp6 0 0 :::58127 :::* >>>> 11937/ganesha.nfsd >>>> udp6 0 0 :::875 :::* >>>> 11937/ganesha.nfsd >>>> >>>> I'm attaching the strace of a showmount from a node to the other. >>>> This machinery was working with nfs-ganesha 2.1.0, so it must be >>>> something introduced with 2.2.0. >>>> Cheers, >>>> >>>> Alessandro >>>> >>>> >>>> >>>> On Tue, 2015-06-09 at 15:16 +0530, Soumya Koduri wrote: >>>>> >>>>> On 06/09/2015 02:48 PM, Alessandro De Salvo wrote: >>>>>> Hi, >>>>>> OK, the problem with the VIPs not starting is due to the ganesha_mon >>>>>> heartbeat script looking for a pid file called >>>>>> /var/run/ganesha.nfsd.pid, while by default ganesha.nfsd v.2.2.0 is >>>>>> creating /var/run/ganesha.pid, this needs to be corrected. 
The file is >>>>>> in glusterfs-ganesha-3.7.1-1.el7.x86_64, in my case. >>>>>> For the moment I have created a symlink in this way and it works: >>>>>> >>>>>> ln -s /var/run/ganesha.pid /var/run/ganesha.nfsd.pid >>>>>> >>>>> Thanks. Please update this as well in the bug. >>>>> >>>>>> So far so good, the VIPs are up and pingable, but still there is the >>>>>> problem of the hanging showmount (i.e. hanging RPC). >>>>>> Still, I see a lot of errors like this in /var/log/messages: >>>>>> >>>>>> Jun 9 11:15:20 atlas-node1 lrmd[31221]: notice: operation_finished: >>>>>> nfs-mon_monitor_10000:29292:stderr [ Error: Resource does not exist. ] >>>>>> >>>>>> While ganesha.log shows the server is not in grace: >>>>>> >>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29964[main] main :MAIN :EVENT :ganesha.nfsd Starting: >>>>>> Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.2.0/src, built at >>>>>> May 18 2015 14:17:18 on buildhw-09.phx2.fedoraproject.org >>>>>> <http://buildhw-09.phx2.fedoraproject.org> >>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_set_param_from_conf :NFS STARTUP :EVENT >>>>>> :Configuration file successfully parsed >>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] init_server_pkgs :NFS STARTUP :EVENT >>>>>> :Initializing ID Mapper. >>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper >>>>>> successfully initialized. >>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] main :NFS STARTUP :WARN :No export entries >>>>>> found in configuration file !!! 
>>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] config_errs_to_log :CONFIG :WARN :Config File >>>>>> ((null):0): Empty configuration file >>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] lower_my_caps :NFS STARTUP :EVENT >>>>>> :CAP_SYS_RESOURCE was successfully removed for proper quota management >>>>>> in FSAL >>>>>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] lower_my_caps :NFS STARTUP :EVENT :currenty set >>>>>> capabilities are: >>>>>> cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap+ep >>>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_Init_svc :DISP :CRIT :Cannot acquire >>>>>> credentials for principal nfs >>>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_Init_admin_thread :NFS CB :EVENT :Admin >>>>>> thread initialized >>>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs4_start_grace :STATE :EVENT :NFS Server Now >>>>>> IN GRACE, duration 60 >>>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_rpc_cb_init_ccache :NFS STARTUP :EVENT >>>>>> :Callback creds directory (/var/run/ganesha) already exists >>>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_rpc_cb_init_ccache :NFS STARTUP :WARN >>>>>> :gssd_refresh_krb5_machine_credential failed (2:2) >>>>>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] 
nfs_Start_threads :THREAD :EVENT :Starting >>>>>> delayed executor. >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :9P/TCP >>>>>> dispatcher thread was started successfully >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[_9p_disp] _9p_dispatcher_thread :9P DISP :EVENT :9P >>>>>> dispatcher started >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT >>>>>> :gsh_dbusthread was started successfully >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :admin thread >>>>>> was started successfully >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :reaper thread >>>>>> was started successfully >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now IN >>>>>> GRACE >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :General >>>>>> fridge was started successfully >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT >>>>>> :------------------------------------------------- >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT : NFS >>>>>> SERVER INITIALIZED >>>>>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT >>>>>> :------------------------------------------------- >>>>>> 09/06/2015 11:17:22 : epoch 5576aee4 : atlas-node1 : >>>>>> ganesha.nfsd-29965[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now >>>>>> NOT IN GRACE >>>>>> >>>>>> >>>>> Please check the status of nfs-ganesha >>>>> 
$service nfs-ganesha status >>>>> >>>>> Could you try taking a packet trace (during showmount or mount) and >>>>> check the server responses. >>>>> >>>>> Thanks, >>>>> Soumya >>>>> >>>>>> Cheers, >>>>>> >>>>>> Alessandro >>>>>> >>>>>> >>>>>>> Il giorno 09/giu/2015, alle ore 10:36, Alessandro De Salvo >>>>>>> <alessandro.desalvo at roma1.infn.it >>>>>>> <mailto:alessandro.desalvo at roma1.infn.it>> ha scritto: >>>>>>> >>>>>>> Hi Soumya, >>>>>>> >>>>>>>> Il giorno 09/giu/2015, alle ore 08:06, Soumya Koduri >>>>>>>> <skoduri at redhat.com <mailto:skoduri at redhat.com>> ha scritto: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 06/09/2015 01:31 AM, Alessandro De Salvo wrote: >>>>>>>>> OK, I found at least one of the bugs. >>>>>>>>> The /usr/libexec/ganesha/ganesha.sh has the following lines: >>>>>>>>> >>>>>>>>> if [ -e /etc/os-release ]; then >>>>>>>>> RHEL6_PCS_CNAME_OPTION="" >>>>>>>>> fi >>>>>>>>> >>>>>>>>> This is OK for RHEL < 7, but does not work for >= 7. I have changed >>>>>>>>> it to the following, to make it working: >>>>>>>>> >>>>>>>>> if [ -e /etc/os-release ]; then >>>>>>>>> eval $(grep -F "REDHAT_SUPPORT_PRODUCT=" /etc/os-release) >>>>>>>>> [ "$REDHAT_SUPPORT_PRODUCT" == "Fedora" ] && >>>>>>>>> RHEL6_PCS_CNAME_OPTION="" >>>>>>>>> fi >>>>>>>>> >>>>>>>> Oh..Thanks for the fix. Could you please file a bug for the same (and >>>>>>>> probably submit your fix as well). We shall have it corrected. >>>>>>> >>>>>>> Just did it,https://bugzilla.redhat.com/show_bug.cgi?id=1229601 >>>>>>> >>>>>>>> >>>>>>>>> Apart from that, the VIP_<node> I was using were wrong, and I should >>>>>>>>> have converted all the ?-? to underscores, maybe this could be >>>>>>>>> mentioned in the documentation when you will have it ready. >>>>>>>>> Now, the cluster starts, but the VIPs apparently not: >>>>>>>>> >>>>>>>> Sure. Thanks again for pointing it out. We shall make a note of it. 
>>>>>>>> >>>>>>>>> Online: [ atlas-node1 atlas-node2 ] >>>>>>>>> >>>>>>>>> Full list of resources: >>>>>>>>> >>>>>>>>> Clone Set: nfs-mon-clone [nfs-mon] >>>>>>>>> Started: [ atlas-node1 atlas-node2 ] >>>>>>>>> Clone Set: nfs-grace-clone [nfs-grace] >>>>>>>>> Started: [ atlas-node1 atlas-node2 ] >>>>>>>>> atlas-node1-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped >>>>>>>>> atlas-node1-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1 >>>>>>>>> atlas-node2-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped >>>>>>>>> atlas-node2-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2 >>>>>>>>> atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1 >>>>>>>>> atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2 >>>>>>>>> >>>>>>>>> PCSD Status: >>>>>>>>> atlas-node1: Online >>>>>>>>> atlas-node2: Online >>>>>>>>> >>>>>>>>> Daemon Status: >>>>>>>>> corosync: active/disabled >>>>>>>>> pacemaker: active/disabled >>>>>>>>> pcsd: active/enabled >>>>>>>>> >>>>>>>>> >>>>>>>> Here corosync and pacemaker shows 'disabled' state. Can you check the >>>>>>>> status of their services. They should be running prior to cluster >>>>>>>> creation. We need to include that step in document as well. >>>>>>> >>>>>>> Ah, OK, you?re right, I have added it to my puppet modules (we install >>>>>>> and configure ganesha via puppet, I?ll put the module on puppetforge >>>>>>> soon, in case anyone is interested). >>>>>>> >>>>>>>> >>>>>>>>> But the issue that is puzzling me more is the following: >>>>>>>>> >>>>>>>>> # showmount -e localhost >>>>>>>>> rpc mount export: RPC: Timed out >>>>>>>>> >>>>>>>>> And when I try to enable the ganesha exports on a volume I get this >>>>>>>>> error: >>>>>>>>> >>>>>>>>> # gluster volume set atlas-home-01 ganesha.enable on >>>>>>>>> volume set: failed: Failed to create NFS-Ganesha export config file. >>>>>>>>> >>>>>>>>> But I see the file created in /etc/ganesha/exports/*.conf >>>>>>>>> Still, showmount hangs and times out. 
>>>>>>>>> Any help?
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>> Hmm that's strange. Sometimes, in case if there was no proper cleanup
>>>>>>>> done while trying to re-create the cluster, we have seen such issues.
>>>>>>>>
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1227709
>>>>>>>>
>>>>>>>> http://review.gluster.org/#/c/11093/
>>>>>>>>
>>>>>>>> Can you please unexport all the volumes, teardown the cluster using
>>>>>>>> 'gluster vol set <volname> ganesha.enable off'
>>>>>>>
>>>>>>> OK:
>>>>>>>
>>>>>>> # gluster vol set atlas-home-01 ganesha.enable off
>>>>>>> volume set: failed: ganesha.enable is already 'off'.
>>>>>>>
>>>>>>> # gluster vol set atlas-data-01 ganesha.enable off
>>>>>>> volume set: failed: ganesha.enable is already 'off'.
>>>>>>>
>>>>>>>
>>>>>>>> 'gluster ganesha disable' command.
>>>>>>>
>>>>>>> I'm assuming you wanted to write nfs-ganesha instead?
>>>>>>>
>>>>>>> # gluster nfs-ganesha disable
>>>>>>> ganesha enable : success
>>>>>>>
>>>>>>>
>>>>>>> A side note (not really important): it's strange that when I do a
>>>>>>> disable the message is "ganesha enable" :-)
>>>>>>>
>>>>>>>>
>>>>>>>> Verify if the following files have been deleted on all the nodes-
>>>>>>>> '/etc/cluster/cluster.conf'
>>>>>>>
>>>>>>> this file is not present at all, I think it's not needed in CentOS 7
>>>>>>>
>>>>>>>> '/etc/ganesha/ganesha.conf',
>>>>>>>
>>>>>>> it's still there, but empty, and I guess it should be OK, right?
>>>>>>>
>>>>>>>> '/etc/ganesha/exports/*'
>>>>>>>
>>>>>>> no more files there
>>>>>>>
>>>>>>>> '/var/lib/pacemaker/cib'
>>>>>>>
>>>>>>> it's empty
>>>>>>>
>>>>>>>>
>>>>>>>> Verify if the ganesha service is stopped on all the nodes.
>>>>>>>
>>>>>>> nope, it's still running, I will stop it.
>>>>>>>
>>>>>>>>
>>>>>>>> start/restart the services - corosync, pcs.
>>>>>>>
>>>>>>> In the node where I issued the nfs-ganesha disable there is no more
>>>>>>> any /etc/corosync/corosync.conf so corosync won't start.
The other >>>>>>> node instead still has the file, it?s strange. >>>>>>> >>>>>>>> >>>>>>>> And re-try the HA cluster creation >>>>>>>> 'gluster ganesha enable? >>>>>>> >>>>>>> This time (repeated twice) it did not work at all: >>>>>>> >>>>>>> # pcs status >>>>>>> Cluster name: ATLAS_GANESHA_01 >>>>>>> Last updated: Tue Jun 9 10:13:43 2015 >>>>>>> Last change: Tue Jun 9 10:13:22 2015 >>>>>>> Stack: corosync >>>>>>> Current DC: atlas-node1 (1) - partition with quorum >>>>>>> Version: 1.1.12-a14efad >>>>>>> 2 Nodes configured >>>>>>> 6 Resources configured >>>>>>> >>>>>>> >>>>>>> Online: [ atlas-node1 atlas-node2 ] >>>>>>> >>>>>>> Full list of resources: >>>>>>> >>>>>>> Clone Set: nfs-mon-clone [nfs-mon] >>>>>>> Started: [ atlas-node1 atlas-node2 ] >>>>>>> Clone Set: nfs-grace-clone [nfs-grace] >>>>>>> Started: [ atlas-node1 atlas-node2 ] >>>>>>> atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1 >>>>>>> atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2 >>>>>>> >>>>>>> PCSD Status: >>>>>>> atlas-node1: Online >>>>>>> atlas-node2: Online >>>>>>> >>>>>>> Daemon Status: >>>>>>> corosync: active/enabled >>>>>>> pacemaker: active/enabled >>>>>>> pcsd: active/enabled >>>>>>> >>>>>>> >>>>>>> >>>>>>> I tried then "pcs cluster destroy" on both nodes, and then again >>>>>>> nfs-ganesha enable, but now I?m back to the old problem: >>>>>>> >>>>>>> # pcs status >>>>>>> Cluster name: ATLAS_GANESHA_01 >>>>>>> Last updated: Tue Jun 9 10:22:27 2015 >>>>>>> Last change: Tue Jun 9 10:17:00 2015 >>>>>>> Stack: corosync >>>>>>> Current DC: atlas-node2 (2) - partition with quorum >>>>>>> Version: 1.1.12-a14efad >>>>>>> 2 Nodes configured >>>>>>> 10 Resources configured >>>>>>> >>>>>>> >>>>>>> Online: [ atlas-node1 atlas-node2 ] >>>>>>> >>>>>>> Full list of resources: >>>>>>> >>>>>>> Clone Set: nfs-mon-clone [nfs-mon] >>>>>>> Started: [ atlas-node1 atlas-node2 ] >>>>>>> Clone Set: nfs-grace-clone [nfs-grace] >>>>>>> Started: [ atlas-node1 atlas-node2 ] >>>>>>> 
atlas-node1-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped >>>>>>> atlas-node1-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1 >>>>>>> atlas-node2-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped >>>>>>> atlas-node2-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2 >>>>>>> atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1 >>>>>>> atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2 >>>>>>> >>>>>>> PCSD Status: >>>>>>> atlas-node1: Online >>>>>>> atlas-node2: Online >>>>>>> >>>>>>> Daemon Status: >>>>>>> corosync: active/enabled >>>>>>> pacemaker: active/enabled >>>>>>> pcsd: active/enabled >>>>>>> >>>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> Alessandro >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Soumya >>>>>>>> >>>>>>>>> Alessandro >>>>>>>>> >>>>>>>>>> Il giorno 08/giu/2015, alle ore 20:00, Alessandro De Salvo >>>>>>>>>> <Alessandro.DeSalvo at roma1.infn.it >>>>>>>>>> <mailto:Alessandro.DeSalvo at roma1.infn.it>> ha scritto: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> indeed, it does not work :-) >>>>>>>>>> OK, this is what I did, with 2 machines, running CentOS 7.1, >>>>>>>>>> Glusterfs 3.7.1 and nfs-ganesha 2.2.0: >>>>>>>>>> >>>>>>>>>> 1) ensured that the machines are able to resolve their IPs (but >>>>>>>>>> this was already true since they were in the DNS); >>>>>>>>>> 2) disabled NetworkManager and enabled network on both machines; >>>>>>>>>> 3) created a gluster shared volume 'gluster_shared_storage' and >>>>>>>>>> mounted it on '/run/gluster/shared_storage' on all the cluster >>>>>>>>>> nodes using glusterfs native mount (on CentOS 7.1 there is a link >>>>>>>>>> by default /var/run -> ../run) >>>>>>>>>> 4) created an empty /etc/ganesha/ganesha.conf; >>>>>>>>>> 5) installed pacemaker pcs resource-agents corosync on all cluster >>>>>>>>>> machines; >>>>>>>>>> 6) set the ?hacluster? 
user the same password on all machines;
>>>>>>>>>> 7) pcs cluster auth <hostname> -u hacluster -p <pass> on all the
>>>>>>>>>> nodes (on both nodes I issued the commands for both nodes)
>>>>>>>>>> 8) IPv6 is configured by default on all nodes, although the
>>>>>>>>>> infrastructure is not ready for IPv6
>>>>>>>>>> 9) enabled pcsd and started it on all nodes
>>>>>>>>>> 10) populated /etc/ganesha/ganesha-ha.conf with the following
>>>>>>>>>> contents, one per machine:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ===> atlas-node1
>>>>>>>>>> # Name of the HA cluster created.
>>>>>>>>>> HA_NAME="ATLAS_GANESHA_01"
>>>>>>>>>> # The server from which you intend to mount
>>>>>>>>>> # the shared volume.
>>>>>>>>>> HA_VOL_SERVER="atlas-node1"
>>>>>>>>>> # The subset of nodes of the Gluster Trusted Pool
>>>>>>>>>> # that forms the ganesha HA cluster. IP/Hostname
>>>>>>>>>> # is specified.
>>>>>>>>>> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
>>>>>>>>>> # Virtual IPs of each of the nodes specified above.
>>>>>>>>>> VIP_atlas-node1="x.x.x.1"
>>>>>>>>>> VIP_atlas-node2="x.x.x.2"
>>>>>>>>>>
>>>>>>>>>> ===> atlas-node2
>>>>>>>>>> # Name of the HA cluster created.
>>>>>>>>>> HA_NAME="ATLAS_GANESHA_01"
>>>>>>>>>> # The server from which you intend to mount
>>>>>>>>>> # the shared volume.
>>>>>>>>>> HA_VOL_SERVER="atlas-node2"
>>>>>>>>>> # The subset of nodes of the Gluster Trusted Pool
>>>>>>>>>> # that forms the ganesha HA cluster. IP/Hostname
>>>>>>>>>> # is specified.
>>>>>>>>>> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
>>>>>>>>>> # Virtual IPs of each of the nodes specified above.
>>>>>>>>>> VIP_atlas-node1="x.x.x.1"
>>>>>>>>>> VIP_atlas-node2="x.x.x.2"
>>>>>>>>>>
>>>>>>>>>> 11) issued gluster nfs-ganesha enable, but it fails with a cryptic
>>>>>>>>>> message:
>>>>>>>>>>
>>>>>>>>>> # gluster nfs-ganesha enable
>>>>>>>>>> Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the
>>>>>>>>>> trusted pool. Do you still want to continue?
(y/n) y >>>>>>>>>> nfs-ganesha: failed: Failed to set up HA config for NFS-Ganesha. >>>>>>>>>> Please check the log file for details >>>>>>>>>> >>>>>>>>>> Looking at the logs I found nothing really special but this: >>>>>>>>>> >>>>>>>>>> ==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <=>>>>>>>>>> [2015-06-08 17:57:15.672844] I [MSGID: 106132] >>>>>>>>>> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs >>>>>>>>>> already stopped >>>>>>>>>> [2015-06-08 17:57:15.675395] I >>>>>>>>>> [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host >>>>>>>>>> found Hostname is atlas-node2 >>>>>>>>>> [2015-06-08 17:57:15.720692] I >>>>>>>>>> [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host >>>>>>>>>> found Hostname is atlas-node2 >>>>>>>>>> [2015-06-08 17:57:15.721161] I >>>>>>>>>> [glusterd-ganesha.c:335:is_ganesha_host] 0-management: ganesha host >>>>>>>>>> found Hostname is atlas-node2 >>>>>>>>>> [2015-06-08 17:57:16.633048] E >>>>>>>>>> [glusterd-ganesha.c:254:glusterd_op_set_ganesha] 0-management: >>>>>>>>>> Initial NFS-Ganesha set up failed >>>>>>>>>> [2015-06-08 17:57:16.641563] E >>>>>>>>>> [glusterd-syncop.c:1396:gd_commit_op_phase] 0-management: Commit of >>>>>>>>>> operation 'Volume (null)' failed on localhost : Failed to set up HA >>>>>>>>>> config for NFS-Ganesha. Please check the log file for details >>>>>>>>>> >>>>>>>>>> ==> /var/log/glusterfs/cmd_history.log <=>>>>>>>>>> [2015-06-08 17:57:16.643615] : nfs-ganesha enable : FAILED : >>>>>>>>>> Failed to set up HA config for NFS-Ganesha. Please check the log >>>>>>>>>> file for details >>>>>>>>>> >>>>>>>>>> ==> /var/log/glusterfs/cli.log <=>>>>>>>>>> [2015-06-08 17:57:16.643839] I [input.c:36:cli_batch] 0-: Exiting >>>>>>>>>> with: -1 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Also, pcs seems to be fine for the auth part, although it obviously >>>>>>>>>> tells me the cluster is not running. 
>>>>>>>>>> >>>>>>>>>> I, [2015-06-08T19:57:16.305323 #7223] INFO -- : Running: >>>>>>>>>> /usr/sbin/corosync-cmapctl totem.cluster_name >>>>>>>>>> I, [2015-06-08T19:57:16.345457 #7223] INFO -- : Running: >>>>>>>>>> /usr/sbin/pcs cluster token-nodes >>>>>>>>>> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET >>>>>>>>>> /remote/check_auth HTTP/1.1" 200 68 0.1919 >>>>>>>>>> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET >>>>>>>>>> /remote/check_auth HTTP/1.1" 200 68 0.1920 >>>>>>>>>> atlas-node1.mydomain - - [08/Jun/2015:19:57:16 CEST] "GET >>>>>>>>>> /remote/check_auth HTTP/1.1" 200 68 >>>>>>>>>> - -> /remote/check_auth >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> What am I doing wrong? >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Alessandro >>>>>>>>>> >>>>>>>>>>> Il giorno 08/giu/2015, alle ore 19:30, Soumya Koduri >>>>>>>>>>> <skoduri at redhat.com <mailto:skoduri at redhat.com>> ha scritto: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 06/08/2015 08:20 PM, Alessandro De Salvo wrote: >>>>>>>>>>>> Sorry, just another question: >>>>>>>>>>>> >>>>>>>>>>>> - in my installation of gluster 3.7.1 the command gluster >>>>>>>>>>>> features.ganesha enable does not work: >>>>>>>>>>>> >>>>>>>>>>>> # gluster features.ganesha enable >>>>>>>>>>>> unrecognized word: features.ganesha (position 0) >>>>>>>>>>>> >>>>>>>>>>>> Which version has full support for it? >>>>>>>>>>> >>>>>>>>>>> Sorry. This option has recently been changed. It is now >>>>>>>>>>> >>>>>>>>>>> $ gluster nfs-ganesha enable >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> - in the documentation the ccs and cman packages are required, >>>>>>>>>>>> but they seems not to be available anymore on CentOS 7 and >>>>>>>>>>>> similar, I guess they are not really required anymore, as pcs >>>>>>>>>>>> should do the full job >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Alessandro >>>>>>>>>>> >>>>>>>>>>> Looks like so from http://clusterlabs.org/quickstart-redhat.html. 
>>>>>>>>>>> Let us know if it doesn't work. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Soumya >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Il giorno 08/giu/2015, alle ore 15:09, Alessandro De Salvo >>>>>>>>>>>>> <alessandro.desalvo at roma1.infn.it >>>>>>>>>>>>> <mailto:alessandro.desalvo at roma1.infn.it>> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>> Great, many thanks Soumya! >>>>>>>>>>>>> Cheers, >>>>>>>>>>>>> >>>>>>>>>>>>> Alessandro >>>>>>>>>>>>> >>>>>>>>>>>>>> Il giorno 08/giu/2015, alle ore 13:53, Soumya Koduri >>>>>>>>>>>>>> <skoduri at redhat.com <mailto:skoduri at redhat.com>> ha scritto: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please find the slides of the demo video at [1] >>>>>>>>>>>>>> >>>>>>>>>>>>>> We recommend to have a distributed replica volume as a shared >>>>>>>>>>>>>> volume for better data-availability. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Size of the volume depends on the workload you may have. Since >>>>>>>>>>>>>> it is used to maintain states of NLM/NFSv4 clients, you may >>>>>>>>>>>>>> calculate the size of the volume to be minimum of aggregate of >>>>>>>>>>>>>> (typical_size_of'/var/lib/nfs'_directory + >>>>>>>>>>>>>> ~4k*no_of_clients_connected_to_each_of_the_nfs_servers_at_any_point) >>>>>>>>>>>>>> >>>>>>>>>>>>>> We shall document about this feature sooner in the gluster docs >>>>>>>>>>>>>> as well. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Soumya >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] - http://www.slideshare.net/SoumyaKoduri/high-49117846 >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 06/08/2015 04:34 PM, Alessandro De Salvo wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> I have seen the demo video on ganesha HA, >>>>>>>>>>>>>>> https://www.youtube.com/watch?v=Z4mvTQC-efM >>>>>>>>>>>>>>> However there is no advice on the appropriate size of the >>>>>>>>>>>>>>> shared volume. How is it really used, and what should be a >>>>>>>>>>>>>>> reasonable size for it? 
>>>>>>>>>>>>>>> Also, are the slides from the video available somewhere, as >>>>>>>>>>>>>>> well as a documentation on all this? I did not manage to find >>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Alessandro >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org> >>>>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Gluster-users mailing list >>>>>>>>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org> >>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users >>>>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Gluster-users mailing list >>>>>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org> >>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users >>>>>> >>>> >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> http://www.gluster.org/mailman/listinfo/gluster-users >>> >>> >
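
For the shared-volume sizing rule quoted at the bottom of this thread (typical size of '/var/lib/nfs' plus roughly 4k per connected client, aggregated over the NFS servers), a rough worked sketch follows. It assumes "~4k" means 4 KiB per client and that "aggregate" means summing the per-server figure over all servers. The function name and the sample numbers are made up for illustration and should be replaced with values measured on the actual setup.

```shell
#!/bin/sh
# Rough lower bound, in KiB, for the shared storage volume.
# Arguments (hypothetical inputs):
#   $1  typical size of /var/lib/nfs on one server, in KiB
#   $2  peak number of clients connected to one server
#   $3  number of NFS-Ganesha servers in the HA cluster
shared_volume_min_kib() {
    state_kib="$1"
    clients="$2"
    servers="$3"
    # per-server need = state directory + ~4 KiB per client,
    # then summed over all servers
    echo $(( servers * (state_kib + 4 * clients) ))
}

# Example: 2 servers, 1024 KiB of state each, 500 clients per server.
shared_volume_min_kib 1024 500 2
```

With these sample numbers the function prints 6048, i.e. about 6 MiB, which is only a floor. Any real volume would need headroom on top of it.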