Hi Martin - again thank you for the help.  I can't believe I couldn't
find any info about this big configuration change.  Even the Samba WIKI
doesn't really spell this out at all and instructs you to use ctdbd.conf.

Do I need to enable the 20.nfs_ganesha.check script file at all, or will the
config itself take care of that?  Also, are there any recommendations on which
nfs-checks.d checks should still be used in the nfs-checks-ganesha.d directory?

Gluster, nfs-ganesha, ctdb seems like a great combination that's sorely
lacking in proper documentation.  I know there's storhaug that is supposed
to tie this together, but that flat out didn't work for me at all, has poor
documentation and needed manual edits just to get it to work.

On 10/1/19, 5:46 PM, "Martin Schwenke" <martin at meltin.net>
wrote:
    Hi Max,
    
    On Tue, 1 Oct 2019 18:57:43 +0000, Max DiOrio via samba
    <samba at lists.samba.org> wrote:
    
    > Hi there - I seem to be having trouble wrapping my brain around the
    > CTDB and ganesha configuration.  I thought I had it figured out, but
    > it doesn't seem to be doing any checking of the nfs-ganesha service.
    
    > I put nfs-ganesha-callout as executable in /etc/ctdb
    
    > I create nfs-checks-ganesha.d folder in /etc/ctdb and in there I have
    > 20.nfs_ganesha.check
    
    You may want to symlink some of the others across, such as
    00.portmapper.check.
    
    > In my ctdbd.conf file I have:
    
    Aha!  In >=4.9 nothing looks at that file anymore.  If you run

      https://git.samba.org/?p=samba.git;a=blob_plain;f=ctdb/doc/examples/config_migrate.sh;h=8479aeb39f383bf9d3a05d79b9357a7e07a6a836;hb=refs/heads/v4-9-stable

    on it then you'll get some useful suggestions...  but keep reading
    below...
    
    > # Options to ctdbd, read by ctdbd_wrapper(1)
    > #
    > # See ctdbd.conf(5) for more information about CTDB configuration variables.
    >
    > # Shared recovery lock file to avoid split brain.  No default.
    > #
    > # Do NOT run CTDB without a recovery lock file unless you know exactly
    > # what you are doing.
    > CTDB_RECOVERY_LOCK=/run/gluster/shared_storage/.CTDB-lockfile
    
    This should be in ctdb.conf:
    
    [cluster]
    
      recovery lock = /run/gluster/shared_storage/.CTDB-lockfile
    
    > # List of nodes in the cluster.  Default is below.
    > CTDB_NODES=/etc/ctdb/nodes
    >
    > # List of public addresses for providing NAS services.  No default.
    > CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
    
    These are no longer used.  The above defaults in /etc/ctdb/ are now
    hardwired.  Symlinks can be used if necessary.
    
    > # What services should CTDB manage?  Default is none.
    > # CTDB_MANAGES_SAMBA=yes
    > # CTDB_MANAGES_WINBIND=yes
    > CTDB_MANAGES_NFS=yes
    
    Gone.  Now just enable the event scripts.
    
    > # Raise the file descriptor limit for CTDB?
    > # CTDB_MAX_OPEN_FILES=10000
    
    Gone.  Either do the right thing in the systemd unit file or put a
    ulimit command in /etc/sysconfig/ctdb or /etc/default/ctdb (depending
    on distro).
    
    > # Default is to use the log file below instead of syslog.
    > CTDB_LOGGING=file:/var/log/log.ctdb
    
    ctdb.conf:
    
    [logging]
      location = file:/var/log/log.ctdb
    
    However, this is the default so you can just omit it.
    
    If you can use syslog instead, then I strongly suggest you do that
    (see ctdb.conf(5) manpage).  CTDB's file logging has no useful way of
    rotating the logs. There's an open bug, there are plans, but it is
    complicated.
    
    > # Default log level is NOTICE.  Want less logging?
    > CTDB_DEBUGLEVEL=DEBUG
    
    ctdb.conf:
    
    [logging]
      log level = DEBUG
    
    > CTDB_NFS_CALLOUT=/etc/ctdb/nfs-ganesha-callout
    > CTDB_NFS_CHECKS_DIR=/etc/ctdb/nfs-checks-ganesha.d
    > CTBS_NFS_STATE_FS_TYPE=glusterfs
    > CTDB_NFS_STATE_MNT=/run/gluster/shared_storage
    > CTDB_NFS_SKIP_SHARE_CHECK=yes
    > NFS_HOSTNAME=hq-6pnfs
    
    Move all of these to /etc/ctdb/script.options.
    
    > But in the logs, I see nothing relating to nfs-ganesha, and when the
    > ganesha service fails, the IPs don't get migrated and/or the service
    > doesn't get restarted.
    
    > Any ideas?
    
    Yeah, complete rewrite of configuration handling.  :-)
    
    At some stage in the future there will be another change when:
    
    * service event scripts go to /etc/ctdb/events/service/ - not many
      related options to change here...
    
    * failover event scripts go to /etc/ctdb/events/failover/ and most of
      the failover-related tunables become script options
    
    * ...
    
    After that, things should be very easy to understand...  :-)
    
    peace & happiness,
    martin
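
The mapping described in the reply above can be sketched as a small script. This is only an illustration of where each old ctdbd.conf variable ends up, not a replacement for config_migrate.sh. The function name `migrate` and the three lookup tables are hypothetical, and the event script names (49.winbind, 50.samba, 60.nfs) are an assumption based on the stock scripts shipped with CTDB. Anything the reply does not mention is passed through to script.options unchanged, which may not be right for every variable.

```python
# Sketch: sort old ctdbd.conf variables into their CTDB >= 4.9 homes,
# following the mapping given in the reply above.

# Old variable -> (ctdb.conf section, option name)
CTDB_CONF_MAP = {
    "CTDB_RECOVERY_LOCK": ("cluster", "recovery lock"),
    "CTDB_LOGGING": ("logging", "location"),
    "CTDB_DEBUGLEVEL": ("logging", "log level"),
}

# No longer used: the /etc/ctdb/ paths are hardwired, and the file
# descriptor limit is set outside CTDB (systemd unit or ulimit).
DROPPED = {"CTDB_NODES", "CTDB_PUBLIC_ADDRESSES", "CTDB_MAX_OPEN_FILES"}

# CTDB_MANAGES_*=yes becomes "enable the event script".
# ASSUMPTION: these are the stock event script names.
EVENT_SCRIPTS = {
    "CTDB_MANAGES_WINBIND": "49.winbind",
    "CTDB_MANAGES_SAMBA": "50.samba",
    "CTDB_MANAGES_NFS": "60.nfs",
}


def migrate(old_text):
    """Return (ctdb_conf_text, script_options_text, event_scripts)."""
    sections = {}        # section name -> list of "  option = value" lines
    script_options = []  # lines destined for /etc/ctdb/script.options
    scripts = []         # event scripts that need enabling
    for raw in old_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        name, value = name.strip(), value.strip()
        if name in CTDB_CONF_MAP:
            section, option = CTDB_CONF_MAP[name]
            sections.setdefault(section, []).append(f"  {option} = {value}")
        elif name in EVENT_SCRIPTS:
            if value == "yes":
                scripts.append(EVENT_SCRIPTS[name])
        elif name not in DROPPED:
            # Everything else (CTDB_NFS_*, NFS_HOSTNAME, ...) is passed through.
            script_options.append(f"{name}={value}")
    conf_lines = []
    for section, options in sections.items():
        conf_lines.append(f"[{section}]")
        conf_lines.extend(options)
    return "\n".join(conf_lines), "\n".join(script_options), scripts


# Demo with a few of the variables from the thread.
old = """
CTDB_RECOVERY_LOCK=/run/gluster/shared_storage/.CTDB-lockfile
CTDB_NODES=/etc/ctdb/nodes
CTDB_MANAGES_NFS=yes
CTDB_DEBUGLEVEL=DEBUG
CTDB_NFS_CALLOUT=/etc/ctdb/nfs-ganesha-callout
"""
conf, options, scripts = migrate(old)
print(conf)
print(options)
print(scripts)
```

Each name in the returned script list still has to be enabled by hand. As far as I know the 4.9 form is `ctdb event script enable legacy 60.nfs`, but check ctdb(1) for your version.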

As soon as I made the configuration change and restarted CTDB, it crashes.

Oct  2 11:05:14 hq-6pgluster01 systemd: Started CTDB.
Oct  2 11:05:21 hq-6pgluster01 systemd: ctdb.service: main process exited, code=exited, status=1/FAILURE
Oct  2 11:05:21 hq-6pgluster01 ctdbd_wrapper: connect() failed, errno=111
Oct  2 11:05:21 hq-6pgluster01 ctdbd_wrapper: Failed to connect to CTDB daemon (/var/run/ctdb/ctdbd.socket)
Oct  2 11:05:22 hq-6pgluster01 ctdbd_wrapper: Error while shutting down CTDB

Hi Max,

On Wed, 2 Oct 2019 14:51:06 +0000, Max DiOrio
<Max.DiOrio at ieeeglobalspec.com> wrote:

> Hi Martin - again thank you for the help.  I can't believe I couldn't
> find any info about this big configuration change.  Even the Samba
> WIKI doesn't really spell this out at all and instructs you to use
> ctdbd.conf.

Can you please point me to that so I can fix it?

> Do I need to enable the 20.nfs_ganesha.check script file at all, or
> will the config itself take care of that?  Also, are there any
> recommendations on which nfs-checks.d checks should still be used in
> the nfs-checks-ganesha.d directory?

That check script provides failure detection of the main Ganesha daemon
and attempts service restarts in order to "self heal".  The other files
do something similar for the other RPC services.  For example, if
you're using the standard rpc.statd in your setup then you would also
enable the corresponding check script to detect failures and restart
the RPC service when it fails.

> Gluster, nfs-ganesha, ctdb seems like a great combination that's
> sorely lacking in proper documentation.  I know there's storhaug that
> is supposed to tie this together, but that flat out didn't work for
> me at all, has poor documentation and needed manual edits just to get
> it to work.

I don't have huge amounts of experience setting up NFS Ganesha with
CTDB or I would try to improve that documentation.  If you get this
working nicely and are able to improve the documentation then I and
others would be very grateful.

Note that the example Ganesha support for CTDB is very close to
something used by products that use (or used) this combination.  So, it
should be relatively easy to get working.

For storhaug, you could look up José Rivera's email address on
https://www.samba.org/samba/team/ and CC: him on an email to this list,
asking about the status of storhaug.  He may be able to help.

The main problem is that available development and documentation time
is limited.  :-(

peace & happiness,
martin
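
The earlier suggestion to symlink some of the standard checks across (such as 00.portmapper.check) could be scripted roughly as below. This is a minimal sketch: `link_checks` is a hypothetical helper, and which checks to pass in depends on which RPC services your setup actually runs. Only 00.portmapper.check is named in this thread.

```python
import os


def link_checks(src_dir, dst_dir, names):
    """Symlink the named check files from src_dir into dst_dir.

    Entries that already exist in dst_dir are left alone.
    Returns the list of names that were newly linked.
    """
    os.makedirs(dst_dir, exist_ok=True)
    linked = []
    for name in names:
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if not os.path.isfile(src):
            # Refuse to create a dangling link for a check that is missing.
            raise FileNotFoundError(src)
        if os.path.lexists(dst):
            # Keep an existing file or link, e.g. a locally edited check.
            continue
        os.symlink(src, dst)
        linked.append(name)
    return linked


# Example call (run as root on a CTDB node; paths as used in this thread):
# link_checks("/etc/ctdb/nfs-checks.d",
#             "/etc/ctdb/nfs-checks-ganesha.d",
#             ["00.portmapper.check"])
```

Linking rather than copying means a package update to the stock check file is picked up automatically, while 20.nfs_ganesha.check stays a separate file in the Ganesha directory.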

Hi Max,

On Wed, 2 Oct 2019 15:08:43 +0000, Max DiOrio
<Max.DiOrio at ieeeglobalspec.com> wrote:

> As soon as I made the configuration change and restarted CTDB, it crashes.
>
> Oct 2 11:05:14 hq-6pgluster01 systemd: Started CTDB.
> Oct 2 11:05:21 hq-6pgluster01 systemd: ctdb.service: main process exited, code=exited, status=1/FAILURE
> Oct 2 11:05:21 hq-6pgluster01 ctdbd_wrapper: connect() failed, errno=111
> Oct 2 11:05:21 hq-6pgluster01 ctdbd_wrapper: Failed to connect to CTDB daemon (/var/run/ctdb/ctdbd.socket)
> Oct 2 11:05:22 hq-6pgluster01 ctdbd_wrapper: Error while shutting down CTDB

Is there anything in the log file to suggest that this is an early
failure instead of an actual crash?

We do a lot of testing on CentOS 7 (although not with Ganesha and with
our own CTDB packages) and we haven't seen any crashes in recent times.

If this is a crash, are you able to get a core dump and generate a
backtrace for me?

Thanks...

peace & happiness,
martin