Hi Martin - again, thank you for the help. I can't believe I couldn't find any
info about this big configuration change. Even the Samba wiki doesn't really
spell this out at all and instead instructs you to use ctdbd.conf. Do I need to
enable the 20.nfs_ganesha.check script file at all, or will the config itself
take care of that? Also, are there any recommendations on which nfs-checks.d
checks should still be used in nfs-checks-ganesha.d?

Gluster, nfs-ganesha, and ctdb seem like a great combination that's sorely
lacking in proper documentation. I know there's storhaug, which is supposed to
tie this together, but that flat out didn't work for me at all, has poor
documentation, and needed manual edits just to get it to work.

On 10/1/19, 5:46 PM, "Martin Schwenke" <martin at meltin.net> wrote:

Hi Max,

On Tue, 1 Oct 2019 18:57:43 +0000, Max DiOrio via samba
<samba at lists.samba.org> wrote:

> Hi there - I seem to be having trouble wrapping my brain about the
> CTDB and ganesha configuration. I thought I had it figured out, but
> it doesn't seem to be doing any checking of the nfs-ganesha service.

> I put nfs-ganesha-callout as executable in /etc/ctdb

> I create nfs-checks-ganesha.d folder in /etc/ctdb and in there I have
> 20.nfs_ganesha.check

You may want to symlink some of the others across, such as
00.portmapper.check.

> In my ctdbd.conf file I have:

Aha!  In >=4.9 nothing looks at that file anymore.  If you run

  https://git.samba.org/?p=samba.git;a=blob_plain;f=ctdb/doc/examples/config_migrate.sh;h=8479aeb39f383bf9d3a05d79b9357a7e07a6a836;hb=refs/heads/v4-9-stable

on it then you'll get some useful suggestions... but keep reading below...

> # Options to ctdbd, read by ctdbd_wrapper(1)
> #
> # See ctdbd.conf(5) for more information about CTDB configuration variables.
>
> # Shared recovery lock file to avoid split brain. No default.
> #
> # Do NOT run CTDB without a recovery lock file unless you know exactly
> # what you are doing.
> CTDB_RECOVERY_LOCK=/run/gluster/shared_storage/.CTDB-lockfile

This should be in ctdb.conf:

  [cluster]
      recovery lock = /run/gluster/shared_storage/.CTDB-lockfile

> # List of nodes in the cluster. Default is below.
> CTDB_NODES=/etc/ctdb/nodes
>
> # List of public addresses for providing NAS services. No default.
> CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses

These are no longer used.  The above defaults in /etc/ctdb/ are now
hardwired.  Symlinks can be used if necessary.

> # What services should CTDB manage? Default is none.
> # CTDB_MANAGES_SAMBA=yes
> # CTDB_MANAGES_WINBIND=yes
> CTDB_MANAGES_NFS=yes

Gone.  Now just enable the event scripts.

> # Raise the file descriptor limit for CTDB?
> # CTDB_MAX_OPEN_FILES=10000

Gone.  Either do the right thing in the systemd unit file or put a
ulimit command in /etc/sysconfig/ctdb or /etc/default/ctdb (depending
on distro).

> # Default is to use the log file below instead of syslog.
> CTDB_LOGGING=file:/var/log/log.ctdb

ctdb.conf:

  [logging]
      location = file:/var/log/log.ctdb

However, this is the default so you can just omit it.  If you can use
syslog instead, then I strongly suggest you do that (see ctdb.conf(5)
manpage).  CTDB's file logging has no useful way of rotating the logs.
There's an open bug, there are plans, but it is complicated.
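(To illustrate how those two pieces fit together: a minimal /etc/ctdb/ctdb.conf
for the setup described in this thread would look roughly like the sketch
below. The recovery-lock path is the one quoted above; the [logging] section
only restates the default and can be omitted.)

  # /etc/ctdb/ctdb.conf - minimal sketch based on the settings discussed above
  [cluster]
      recovery lock = /run/gluster/shared_storage/.CTDB-lockfile

  [logging]
      # file logging is the default; syslog is recommended (see ctdb.conf(5))
      location = file:/var/log/log.ctdb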
> # Default log level is NOTICE. Want less logging?
> CTDB_DEBUGLEVEL=DEBUG

ctdb.conf:

  [logging]
      log level = DEBUG

> CTDB_NFS_CALLOUT=/etc/ctdb/nfs-ganesha-callout
> CTDB_NFS_CHECKS_DIR=/etc/ctdb/nfs-checks-ganesha.d
> CTBS_NFS_STATE_FS_TYPE=glusterfs
> CTDB_NFS_STATE_MNT=/run/gluster/shared_storage
> CTDB_NFS_SKIP_SHARE_CHECK=yes
> NFS_HOSTNAME=hq-6pnfs

Move all of these to /etc/ctdb/script.options.

> But in the logs, I see nothing relating to nfs-ganesha, and when the
> ganesha service fails, the IPs don't get migrated and/or the service
> doesn't get restarted.

> Any ideas?

Yeah, complete rewrite of configuration handling.  :-)

At some stage there will be another change when:

* service event scripts go to /etc/ctdb/events/service/ - not many
  related options to change here...

* failover event scripts go to /etc/ctdb/events/failover/ and most of
  the failover-related tunables become script options

* ...

After that, things should be very easy to understand...  :-)

peace & happiness,
martin
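(For reference, the NFS-related variables above become plain shell-style
assignments in /etc/ctdb/script.options; a sketch using the values quoted in
this thread is below. The variable the example nfs-ganesha-callout appears to
expect is CTDB_NFS_STATE_FS_TYPE, so the "CTBS_" spelling in the quoted config
looks like a typo and would simply be ignored.)

  # /etc/ctdb/script.options - sketch using the values quoted in this thread
  CTDB_NFS_CALLOUT=/etc/ctdb/nfs-ganesha-callout
  CTDB_NFS_CHECKS_DIR=/etc/ctdb/nfs-checks-ganesha.d
  # the quoted config has "CTBS_NFS_STATE_FS_TYPE", which looks like a typo
  CTDB_NFS_STATE_FS_TYPE=glusterfs
  CTDB_NFS_STATE_MNT=/run/gluster/shared_storage
  CTDB_NFS_SKIP_SHARE_CHECK=yes
  NFS_HOSTNAME=hq-6pnfs

(The NFS event script itself still has to be enabled; depending on the CTDB
version this is done either by making the script executable in the events
directory or with something like "ctdb event script enable legacy 60.nfs" -
see the ctdb(1) manpage.)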
As soon as I made the configuration change and restarted CTDB, it crashes.

Oct 2 11:05:14 hq-6pgluster01 systemd: Started CTDB.
Oct 2 11:05:21 hq-6pgluster01 systemd: ctdb.service: main process exited, code=exited, status=1/FAILURE
Oct 2 11:05:21 hq-6pgluster01 ctdbd_wrapper: connect() failed, errno=111
Oct 2 11:05:21 hq-6pgluster01 ctdbd_wrapper: Failed to connect to CTDB daemon (/var/run/ctdb/ctdbd.socket)
Oct 2 11:05:22 hq-6pgluster01 ctdbd_wrapper: Error while shutting down CTDB
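(When ctdbd exits at startup like this, the wrapper messages above only show
that the daemon went away; the actual reason is usually in ctdbd's own log. A
quick way to look, using the log location configured earlier in the thread:)

  # ctdbd's own log - the path configured above
  tail -n 50 /var/log/log.ctdb

  # systemd's view of the ctdb unit around the failure
  journalctl -u ctdb --since "2019-10-02 11:05" --until "2019-10-02 11:06"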
Hi Max,

On Wed, 2 Oct 2019 14:51:06 +0000, Max DiOrio
<Max.DiOrio at ieeeglobalspec.com> wrote:

> Hi Martin - again, thank you for the help. I can't believe I couldn't
> find any info about this big configuration change. Even the Samba
> wiki doesn't really spell this out at all and instead instructs you
> to use ctdbd.conf.

Can you please point me to that so I can fix it?

> Do I need to enable the 20.nfs_ganesha.check script file at all, or
> will the config itself take care of that? Also, are there any
> recommendations on which nfs-checks.d checks should still be used in
> nfs-checks-ganesha.d?

That check script provides failure detection of the main Ganesha daemon
and attempts service restarts in an attempt to "self heal".  The other
files do something similar for the other RPC services.  For example, if
you're using the standard rpc.statd in your setup then you would also
enable the corresponding check script to detect failures and restart
the RPC service when it fails.

> Gluster, nfs-ganesha, and ctdb seem like a great combination that's
> sorely lacking in proper documentation. I know there's storhaug,
> which is supposed to tie this together, but that flat out didn't work
> for me at all, has poor documentation, and needed manual edits just
> to get it to work.

I don't have huge amounts of experience setting up NFS Ganesha with
CTDB or I would try to improve that documentation.  If you get this
working nicely and are able to improve the documentation then I and
others would be very grateful.

Note that the example Ganesha support for CTDB is very close to
something used by products that use (or used) this combination.  So,
it should be relatively easy to get working.

For storhaug, you could look up José Rivera's email address on
https://www.samba.org/samba/team/ and CC: him on an email to this
list, asking about the status of storhaug.  He may be able to help.

The main problem is that available development and documentation time
is limited.  :-(

peace & happiness,
martin
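(As a concrete sketch of the symlinking suggested above - assuming the stock
check scripts are installed in /etc/ctdb/nfs-checks.d, which may differ by
distro or package, and with script names that may not match exactly what your
package ships:)

  cd /etc/ctdb/nfs-checks-ganesha.d
  ln -s /etc/ctdb/nfs-checks.d/00.portmapper.check .
  # only if the standard rpc.statd is part of the setup:
  ln -s /etc/ctdb/nfs-checks.d/10.statd.check .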
Hi Max,

On Wed, 2 Oct 2019 15:08:43 +0000, Max DiOrio
<Max.DiOrio at ieeeglobalspec.com> wrote:

> As soon as I made the configuration change and restarted CTDB, it crashes.
>
> Oct 2 11:05:14 hq-6pgluster01 systemd: Started CTDB.
> Oct 2 11:05:21 hq-6pgluster01 systemd: ctdb.service: main process exited, code=exited, status=1/FAILURE
> Oct 2 11:05:21 hq-6pgluster01 ctdbd_wrapper: connect() failed, errno=111
> Oct 2 11:05:21 hq-6pgluster01 ctdbd_wrapper: Failed to connect to CTDB daemon (/var/run/ctdb/ctdbd.socket)
> Oct 2 11:05:22 hq-6pgluster01 ctdbd_wrapper: Error while shutting down CTDB

Is there anything in the log file to suggest that this is an early
failure instead of an actual crash?  We do a lot of testing on CentOS 7
(although not with Ganesha and with our own CTDB packages) and we
haven't seen any crashes in recent times.

If this is a crash, are you able to get a core dump and generate a
backtrace for me?  Thanks...

peace & happiness,
martin
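(A minimal sketch of one way to get that backtrace on a systemd-based system -
assuming systemd-coredump is actually capturing cores and CTDB debug symbols
are installed; on CentOS 7 the core may instead end up wherever abrt or
kernel.core_pattern sends it:)

  # if systemd-coredump is in use:
  coredumpctl list ctdbd
  coredumpctl gdb ctdbd          # then at the gdb prompt: bt full

  # or, with a core file on disk:
  gdb /usr/sbin/ctdbd /path/to/core
  # at the gdb prompt: bt full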