Mahdi Adnan
2017-May-03 09:32 UTC
[Gluster-users] Gluster and NFS-Ganesha - cluster is down after reboot
Hi,

Same here. When I reboot the node I have to manually execute "pcs cluster start gluster01", even though pcsd is already enabled and started.

Gluster 3.8.11
CentOS 7.3 latest
Installed using CentOS Storage SIG repository

--
Respectfully
Mahdi A. Mahdi

________________________________
From: gluster-users-bounces at gluster.org <gluster-users-bounces at gluster.org> on behalf of Adam Ru <ad.ruckel at gmail.com>
Sent: Wednesday, May 3, 2017 12:09:58 PM
To: Soumya Koduri
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] Gluster and NFS-Ganesha - cluster is down after reboot

Hi Soumya,

thank you very much for your reply.

I enabled pcsd during setup and after reboot during troubleshooting I manually started it and checked resources (pcs status). They were not running. I didn't find what was wrong but I'm going to try it again.

I've thoroughly checked
http://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/
and I can confirm that I followed all steps with one exception. I installed the following RPMs:
glusterfs-server
glusterfs-fuse
glusterfs-cli
glusterfs-ganesha
nfs-ganesha-xfs

and the guide referenced above specifies:
glusterfs-server
glusterfs-api
glusterfs-ganesha

glusterfs-api is a dependency of one of the RPMs that I installed so this is not a problem. But I cannot find any mention of installing nfs-ganesha-xfs.

I'll try to set up the whole environment again without installing nfs-ganesha-xfs (I assume glusterfs-ganesha has all required binaries).

Again, thank you for your time to answer my previous message.

Kind regards,
Adam

On Tue, May 2, 2017 at 8:49 AM, Soumya Koduri <skoduri at redhat.com> wrote:

> Hi,
>
> On 05/02/2017 01:34 AM, Rudolf wrote:
>
>> Hi Gluster users,
>>
>> First, I'd like to thank you all for this amazing open-source! Thank you!
>>
>> I'm working on home project - three servers with Gluster and NFS-Ganesha. My goal is to create HA NFS share with three copies of each file on each server.
>>
>> My systems are CentOS 7.3 Minimal install with the latest updates and the most current RPMs from "centos-gluster310" repository.
>>
>> I followed this tutorial:
>> http://blog.gluster.org/2015/10/linux-scale-out-nfsv4-using-nfs-ganesha-and-glusterfs-one-step-at-a-time/
>> (second half that describes multi-node HA setup)
>>
>> with a few exceptions:
>>
>> 1. All RPMs are from "centos-gluster310" repo that is installed by "yum -y install centos-release-gluster"
>> 2. I have three nodes (not four) with "replica 3" volume.
>> 3. I created an empty ganesha.conf and a non-empty ganesha-ha.conf in "/var/run/gluster/shared_storage/nfs-ganesha/" (referenced blog post is outdated, this is now a requirement)
>> 4. ganesha-ha.conf doesn't have "HA_VOL_SERVER" since this isn't needed anymore.
>
> Please refer to http://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/
>
> It is being updated with the latest changes to the setup.
>
>> When I finish configuration, all is good. nfs-ganesha.service is active and running and from client I can ping all three VIPs and I can mount NFS. Copied files are replicated to all nodes.
>>
>> But when I restart nodes (one by one, with 5 min. delay between) then I cannot ping or mount (I assume that all VIPs are down). So my setup definitely isn't HA.
>>
>> I found that:
>> # pcs status
>> Error: cluster is not currently running on this node
>
> This means pcsd service is not up. Did you enable (systemctl enable pcsd) pcsd service so that it comes up post reboot automatically? If not please start it manually.
>
>> and nfs-ganesha.service is in inactive state. Btw. I didn't enable "systemctl enable nfs-ganesha" since I assume that this is something that Gluster does.
>
> Please check /var/log/ganesha.log for any errors/warnings.
>
> We recommend not to enable nfs-ganesha.service (by default), as the shared storage (where the ganesha.conf file resides now) should be up and running before nfs-ganesha gets started.
> So if enabled by default it could happen that the shared_storage mount point is not yet up, which results in nfs-ganesha service failure. If you would like to address this, you could have a cron job which keeps checking the mount point health and then starts the nfs-ganesha service.
>
> Thanks,
> Soumya
>
>> I assume that my issue is that I followed instructions in blog post from 2015/10 that are outdated. Unfortunately I cannot find anything better - I spent the whole day googling.
>>
>> Would you be so kind and check the instructions in blog post and let me know what steps are wrong / outdated? Or please do you have more current instructions for Gluster+Ganesha setup?
>>
>> Thank you.
>>
>> Kind regards,
>> Adam
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users

--
Adam
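The cron-job idea Soumya mentions above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the function name `start_ganesha_if_ready` and the `SHARED_MNT`, `MOUNT_CHECK` and `SYSTEMCTL` variables are made up here, and the sketch assumes `mountpoint` (util-linux) and `systemctl` are available on the node. It starts nfs-ganesha only once the shared storage path from the thread is actually a mount point.

```shell
#!/bin/sh
# Hypothetical helper, not part of Gluster or NFS-Ganesha.
# Path taken from the thread; override via environment if yours differs.
SHARED_MNT="${SHARED_MNT:-/var/run/gluster/shared_storage}"
# Commands are kept in variables so they can be swapped out for a dry run.
MOUNT_CHECK="${MOUNT_CHECK:-mountpoint -q}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

start_ganesha_if_ready() {
    # Do nothing until the shared storage volume is mounted.
    if ! $MOUNT_CHECK "$SHARED_MNT"; then
        echo "waiting"
        return 0
    fi
    # Already running: nothing to do.
    if $SYSTEMCTL is-active --quiet nfs-ganesha; then
        echo "running"
        return 0
    fi
    # Shared storage is up and the service is down: start it.
    $SYSTEMCTL start nfs-ganesha && echo "started"
}
```

A script that sources this and calls `start_ganesha_if_ready` could then be run from root's crontab every minute; where that script lives is up to you.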
Adam Ru
2017-May-05 14:34 UTC
[Gluster-users] Gluster and NFS-Ganesha - cluster is down after reboot
Hi Soumya,

Thank you for the answer.

Enabling Pacemaker? Yes, you're completely right, I didn't do it. Thank you.

I spent some time testing and I have some results. This is what I did:

- Clean installation of CentOS 7.3 with all updates, 3x node, resolvable IPs and VIPs
- Stopped firewalld (just for testing)
- Install "centos-release-gluster" to get "centos-gluster310" repo and install following (nothing else):
--- glusterfs-server
--- glusterfs-ganesha
- Passwordless SSH between all nodes (/var/lib/glusterd/nfs/secret.pem and secret.pem.pub on all nodes)
- systemctl enable and start glusterd
- gluster peer probe <other nodes>
- gluster volume set all cluster.enable-shared-storage enable
- systemctl enable and start pcsd.service
- systemctl enable pacemaker.service (cannot be started at this moment)
- Set password for hacluster user on all nodes
- pcs cluster auth <node 1> <node 2> <node 3> -u hacluster -p blabla
- mkdir /var/run/gluster/shared_storage/nfs-ganesha/
- touch /var/run/gluster/shared_storage/nfs-ganesha/ganesha.conf (not sure if needed)
- vi /var/run/gluster/shared_storage/nfs-ganesha/ganesha-ha.conf and insert configuration
- Try to list files on other nodes: ls /var/run/gluster/shared_storage/nfs-ganesha/
- gluster nfs-ganesha enable
- Check on other nodes that nfs-ganesha.service is running and "pcs status" shows started resources
- gluster volume create mynewshare replica 3 transport tcp node1:/<dir> node2:/<dir> node3:/<dir>
- gluster volume start mynewshare
- gluster vol set mynewshare ganesha.enable on

After these steps, all VIPs are pingable and I can mount node1:/mynewshare

Funny thing is that pacemaker.service is disabled again (something disabled it).
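Since the enablement state of pacemaker.service changed unexpectedly here, a quick way to see which units would come up at boot is to ask `systemctl is-enabled` for each one. The helper below is a minimal sketch: the name `report_boot_units` and the `SYSTEMCTL` override are made up for this example, and it only reports state, it changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: print "<unit>: <state>" for every unit passed in.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

report_boot_units() {
    for unit in "$@"; do
        # is-enabled prints the state (enabled, disabled, static, ...);
        # it exits non-zero for "disabled", so the status is ignored here.
        state=$($SYSTEMCTL is-enabled "$unit" 2>/dev/null) || true
        echo "$unit: ${state:-unknown}"
    done
}
```

Running `report_boot_units pcsd.service pacemaker.service corosync.service nfs-ganesha.service` on each node before and after "gluster nfs-ganesha enable" would show at which step the state flips.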
This is the status of the important (I think) services:

systemctl list-units --all
# corosync.service             loaded active   running
# glusterd.service             loaded active   running
# nfs-config.service           loaded inactive dead
# nfs-ganesha-config.service   loaded inactive dead
# nfs-ganesha-lock.service     loaded active   running
# nfs-ganesha.service          loaded active   running
# nfs-idmapd.service           loaded inactive dead
# nfs-mountd.service           loaded inactive dead
# nfs-server.service           loaded inactive dead
# nfs-utils.service            loaded inactive dead
# pacemaker.service            loaded active   running
# pcsd.service                 loaded active   running

systemctl list-unit-files --all
# corosync-notifyd.service     disabled
# corosync.service             disabled
# glusterd.service             enabled
# glusterfsd.service           disabled
# nfs-blkmap.service           disabled
# nfs-config.service           static
# nfs-ganesha-config.service   static
# nfs-ganesha-lock.service     static
# nfs-ganesha.service          disabled
# nfs-idmap.service            static
# nfs-idmapd.service           static
# nfs-lock.service             static
# nfs-mountd.service           static
# nfs-rquotad.service          disabled
# nfs-secure-server.service    static
# nfs-secure.service           static
# nfs-server.service           disabled
# nfs-utils.service            static
# nfs.service                  disabled
# nfslock.service              static
# pacemaker.service            disabled
# pcsd.service                 enabled

I enabled pacemaker again on all nodes and restarted all nodes one by one.

After reboot all VIPs are gone and I can see that nfs-ganesha.service isn't running. When I start it on at least two nodes then VIPs are pingable again and I can mount NFS again. But there is still some issue in the setup because when I check nfs-ganesha-lock.service I get:

systemctl -l status nfs-ganesha-lock.service
● nfs-ganesha-lock.service - NFS status monitor for NFSv2/3 locking.
   Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha-lock.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2017-05-05 13:43:37 UTC; 31min ago
  Process: 6203 ExecStart=/usr/sbin/rpc.statd --no-notify $STATDARGS (code=exited, status=1/FAILURE)

May 05 13:43:37 node0.localdomain systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
May 05 13:43:37 node0.localdomain rpc.statd[6205]: Version 1.3.0 starting
May 05 13:43:37 node0.localdomain rpc.statd[6205]: Flags: TI-RPC
May 05 13:43:37 node0.localdomain rpc.statd[6205]: Failed to open directory sm: Permission denied
May 05 13:43:37 node0.localdomain rpc.statd[6205]: Failed to open /var/lib/nfs/statd/state: Permission denied
May 05 13:43:37 node0.localdomain systemd[1]: nfs-ganesha-lock.service: control process exited, code=exited status=1
May 05 13:43:37 node0.localdomain systemd[1]: Failed to start NFS status monitor for NFSv2/3 locking..
May 05 13:43:37 node0.localdomain systemd[1]: Unit nfs-ganesha-lock.service entered failed state.
May 05 13:43:37 node0.localdomain systemd[1]: nfs-ganesha-lock.service failed.

Thank you,

Kind regards,
Adam
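A first step for the "Permission denied" errors from rpc.statd in the log above might be to look at who owns the statd state files and what their modes are. The snippet below is a rough diagnostic sketch only: `show_perms` is a made-up name, it assumes GNU `stat`, and it does not attempt a fix, because the thread does not say what the correct ownership should be.

```shell
#!/bin/sh
# Hypothetical helper: print "owner:group mode path" for each existing path.
show_perms() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            # %U:%G = owner and group names, %a = octal mode, %n = file name
            stat -c '%U:%G %a %n' "$p"
        else
            echo "missing $p"
        fi
    done
}
```

For example: `show_perms /var/lib/nfs/statd /var/lib/nfs/statd/sm /var/lib/nfs/statd/state`. The state file path comes from the log; that "sm" lives under the same directory is an assumption.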