Strahil
2019-Nov-04 04:59 UTC
[Gluster-users] hook script question related to ctdb, shared storage, and bind mounts
Hi Erik,

I took another approach.

1. I have a systemd mount unit for my ctdb lock volume's brick:

[root@ovirt1 system]# grep var /etc/fstab
gluster1:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults,x-systemd.requires=glusterd.service,x-systemd.automount 0 0

As you can see, it is an automount, because sometimes it fails to mount in time.

2. I have custom systemd services for glusterd, ctdb and vdo, as I need to set dependencies for each of those.

Now, I'm no longer using ctdb & NFS Ganesha (as my version of ctdb cannot use hostnames and my environment is a little bit crazy), but I can still provide hints on how I did it.

Best Regards,
Strahil Nikolov

On Nov 3, 2019 22:46, Erik Jacobson <erik.jacobson at hpe.com> wrote:
> So, I have a solution I have written about in the past that is based on
> gluster with CTDB for IP and a level of redundancy.
>
> It's been working fine except for a few quirks I need to work out on
> giant clusters when I get access.
>
> I have a 3x9 gluster volume; the servers are also NFS servers, using
> gluster NFS (ganesha isn't reliable for my workload yet). There are 9 IP
> aliases spread across 9 servers.
>
> I also have many bind mounts that point to the shared storage as a
> source, and the /gluster/lock volume ("ctdb") of course.
>
> glusterfs 4.1.6 (rhel8 today, but I use rhel7, rhel8, sles12, and
> sles15)
>
> Things work well when everything is up and running. IP failover works
> well when one of the servers goes down. My issue is when that server
> comes back up. Despite my best efforts with systemd fstab dependencies,
> the shared storage areas, including the gluster lock for CTDB, do not
> always get mounted before CTDB starts. This causes trouble for CTDB
> correctly joining the collective. I also have problems where my
> bind mounts can happen before the shared storage is mounted, despite my
> attempts at preventing this with dependencies in fstab.
>
> I decided a better approach would be to use a gluster hook and just
> mount everything I need as I need it, and start up ctdb when I know and
> verify that /gluster/lock is really gluster and not a local disk.
>
> I started down a road of doing this with a start post hook and after
> spending a while at it, I realized my logic error. This will only fire
> when the volume is *started*, not when a server that was down re-joins.
>
> I took a look at the code, glusterd-hooks.c, and found that support
> for "brick start" is not in place for a hook script but it's nearly
> there:
>
>         [GD_OP_START_BRICK]             = EMPTY,
> ...
>
> and no entry in glusterd_hooks_add_op_args() yet.
>
> Before I make a patch for my own use, I wanted to do a sanity check and
> find out if others have solved this better than the road I'm heading
> down.
>
> What I was thinking of doing is enabling a brick start hook, and
> doing my processing for volumes being mounted from there. However, I
> suppose brick start is a bad choice for the case of simply stopping and
> starting the volume, because my processing would try to complete before
> the gluster volume was fully started. It would probably work for a brick
> "coming back and joining" but not "stop volume/start volume".
>
> Any suggestions?
>
> My end goal is:
> - mount shared storage every boot
> - only attempt to mount when gluster is available (_netdev doesn't seem
>   to be enough)
> - never start ctdb unless /gluster/lock is a shared storage and not a
>   directory.
> - only do my bind mounts from shared storage in to the rest of the
>   layout when we are sure the shared storage is mounted (don't
>   bind-mount using an empty directory as a source by accident!)
>
> Thanks so much for reading my question,
>
> Erik
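
The last two end goals in the question (never start ctdb unless /gluster/lock is really shared storage, and never bind-mount from an empty local directory) come down to checking what is actually mounted at a path. The following is a minimal sketch of such a check, not something posted in the thread. The function names `fstype_at` and `is_gluster_mount` and the `MOUNTS_FILE` override are hypothetical, and it assumes a FUSE-mounted gluster volume shows up with filesystem type `fuse.glusterfs` in /proc/self/mounts.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the thread).

# Print the filesystem type mounted exactly at path $1.
# Reads a table in /proc/self/mounts format; MOUNTS_FILE can point at
# another file, which is mainly useful for testing.
# If several entries share the mount point, the last one wins, since
# later entries are stacked on top of earlier ones.
# Pass the path without a trailing slash, as it appears in the table.
fstype_at() {
    awk -v mp="$1" '
        $2 == mp { t = $3 }
        END { if (t == "") exit 1; print t }
    ' "${MOUNTS_FILE:-/proc/self/mounts}"
}

# Succeed only if a gluster FUSE mount is present at path $1.
# Assumption: gluster FUSE mounts report the type "fuse.glusterfs".
is_gluster_mount() {
    [ "$(fstype_at "$1")" = "fuse.glusterfs" ]
}

# Example use (commented out; the service name and target path are
# assumptions about the local setup):
#
#   if is_gluster_mount /gluster/lock; then
#       systemctl start ctdb
#   else
#       echo "/gluster/lock is not a gluster mount, not starting ctdb" >&2
#   fi
#
#   is_gluster_mount /gluster/lock && mount --bind /gluster/lock/some/dir /some/target
```

Comparing the filesystem type, rather than only testing that the directory exists, is what separates a mounted volume from an empty local directory with the same name.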
Erik Jacobson
2019-Nov-04 13:57 UTC
[Gluster-users] hook script question related to ctdb, shared storage, and bind mounts
Thank you! I am very interested. I hadn't considered the automounter idea. Your fstab also takes a different dependency approach than mine.

If you happen to have the examples handy, I'll give them a shot here. I'm looking forward to emerging from this dark place of dependencies not working!!

Thank you so much for writing back,

Erik

On Mon, Nov 04, 2019 at 06:59:10AM +0200, Strahil wrote:
> Hi Erik,
>
> I took another approach.
>
> 1. I have a systemd mount unit for my ctdb lock volume's brick:
> [root@ovirt1 system]# grep var /etc/fstab
> gluster1:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults,x-systemd.requires=glusterd.service,x-systemd.automount 0 0
>
> As you can see, it is an automount, because sometimes it fails to mount in time.
>
> 2. I have custom systemd services for glusterd, ctdb and vdo, as I need to set dependencies for each of those.
>
> Now, I'm no longer using ctdb & NFS Ganesha (as my version of ctdb cannot use hostnames and my environment is a little bit crazy), but I can still provide hints on how I did it.
>
> Best Regards,
> Strahil Nikolov

Erik Jacobson
Software Engineer
erik.jacobson at hpe.com
+1 612 851 0550 Office
Eagan, MN
hpe.com
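
The thread does not include Strahil's custom unit files, so the following is only a sketch of what a dependency drop-in for ctdb could look like, written as a shell function that generates the file. The function name `write_ctdb_dropin`, the file name `10-gluster-lock.conf`, the unit names `ctdb.service` and `glusterd.service`, and the lock path are assumptions about the local setup. `After=`, `Requires=`, `RequiresMountsFor=` and `ConditionPathIsMountPoint=` are standard systemd unit directives.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in for ctdb.service into
# directory $1 (for a real system this would typically be
# /etc/systemd/system/ctdb.service.d).
write_ctdb_dropin() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-gluster-lock.conf" <<'EOF'
[Unit]
# Start only after glusterd, and stop if glusterd is stopped.
After=glusterd.service
Requires=glusterd.service
# Pull in and order after the mount unit(s) needed for the lock path.
RequiresMountsFor=/gluster/lock
# Skip the start if the path is still a plain directory.
ConditionPathIsMountPoint=/gluster/lock
EOF
}

# Example use (commented out; needs root on a real system):
#
#   write_ctdb_dropin /etc/systemd/system/ctdb.service.d &&
#       systemctl daemon-reload
```

Two caveats apply. A unit whose condition fails is skipped rather than retried, so something else still has to start ctdb once the mount appears. Also, glusterd being active does not by itself guarantee that the volume is ready to mount, which is the gap the automount entry above is meant to cover.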