thr3ads.net - Gluster users - [Gluster-users] Gluster + NFS-Ganesha Failover [Apr 2018]

Philip Fuchs

2018-Apr-23 09:53 UTC

[Gluster-users] Gluster + NFS-Ganesha Failover

Hello All,

I am trying to set up a three-way replicated Gluster storage volume that is
exported by NFS-Ganesha.
This 3-node Ganesha cluster is managed by Pacemaker and Corosync. I want
to use this cluster as a backend for several different web-based
applications as well as storage for mailboxes.

The cluster works well, but after I trigger a failover by
stopping the ganesha service on one node, the ganesha services on the
other two nodes also stop after a couple of minutes, bringing
down the whole cluster.

Setup:
3 CentOS Gluster Servers with Ganesha
2 CentOS Clients

Packages:
glusterfs-libs-3.10.11-1.el7.x86_64
glusterfs-3.10.11-1.el7.x86_64
glusterfs-fuse-3.10.11-1.el7.x86_64
centos-release-gluster310-1.0-1.el7.centos.noarch
glusterfs-api-3.10.11-1.el7.x86_64
python2-glusterfs-api-1.1-1.el7.noarch
glusterfs-client-xlators-3.10.11-1.el7.x86_64
glusterfs-cli-3.10.11-1.el7.x86_64
glusterfs-server-3.10.11-1.el7.x86_64
python2-gluster-3.10.11-1.el7.x86_64
glusterfs-ganesha-3.10.11-1.el7.x86_64
nfs-ganesha-gluster-2.5.2-1.el7.x86_64
nfs-ganesha-2.5.2-1.el7.x86_64


Log messages:
==> ganesha/ganesha-gfapi.log <==
[2018-04-16 16:37:52.777997] I [MSGID: 109066]
[dht-rename.c:1610:dht_rename] 0-mail-vol-dht: renaming
/tmp/1523896461.M716652P30764.rz.uni-augsburg.de
(hash=mail-vol-replicate-0/cache=mail-vol-replicate-0) =>
/cur/1523896461.M716652P30764.rz.uni-augsburg.de,S=1441,W=1478:2,S
(hash=mail-vol-replicate-0/cache=<nul>)
[2018-04-16 16:37:52.788361] W [inode.c:1341:inode_parent]
(-->/lib64/libgfapi.so.0(glfs_resolve_at+0x278) [0x7f900105b0b8]
-->/lib64/libglusterfs.so.0(glusterfs_normalize_dentry+0x8e)
[0x7f9000d84aee] -->/lib64/libglusterfs.so.0(inode_parent+0xda)
[0x7f9000d8270a] ) 0-gfapi: inode not found
[2018-04-16 16:37:52.788549] E [inode.c:2567:inode_parent_null_check]
(-->/lib64/libgfapi.so.0(glfs_resolve_at+0x278) [0x7f900105b0b8]
-->/lib64/libglusterfs.so.0(glusterfs_normalize_dentry+0xa0)
[0x7f9000d84b00] -->/lib64/libglusterfs.so.0(+0x398c4) [0x7f9000d818c4]
) 0-inode: invalid argument: inode [Das Argument ist ungültig]

==> messages <==
Apr 16 18:37:52 nfsc02 kernel: ganesha.nfsd[29880]: segfault at 0 ip
00007f9000d84b00 sp 00007f8f7a7d1650 error 4 in
libglusterfs.so.0.0.1[7f9000d48000+f1000]
Apr 16 18:37:52 nfsc02 systemd: nfs-ganesha.service: main process
exited, code=killed, status=11/SEGV
Apr 16 18:37:52 nfsc02 systemd: Unit nfs-ganesha.service entered failed
state.
Apr 16 18:37:52 nfsc02 systemd: nfs-ganesha.service failed.
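
For readers decoding the kernel line above, here is a minimal sketch in Python that splits it into its fields and computes the offset of the faulting instruction inside libglusterfs.so. The function name decode_segfault and the regular expression are illustrative assumptions, not part of any Gluster or Ganesha tooling. The error-code bits follow the x86 page-fault convention (bit 0: page present, bit 1: write, bit 2: user mode).

```python
import re

# Kernel segfault message from the "messages" log, joined onto one line.
LINE = ("ganesha.nfsd[29880]: segfault at 0 ip 00007f9000d84b00 "
        "sp 00007f8f7a7d1650 error 4 in libglusterfs.so.0.0.1[7f9000d48000+f1000]")

# Hypothetical pattern for this message layout; all numbers are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_segfault(line):
    """Parse one kernel segfault line into a dict (illustrative helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    offset = ip - base  # position of the faulting instruction within the mapping
    return {
        "fault_addr": int(m["addr"], 16),   # address that was accessed
        "library": m["lib"],
        "offset": offset,
        "inside_mapping": 0 <= offset < int(m["size"], 16),
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }

info = decode_segfault(LINE)
print(hex(info["offset"]))  # 0x3cb00

# Cross-check against the gfapi log: both logged return addresses should
# point into the same function, glusterfs_normalize_dentry.
func_start_from_warning = 0x7f9000d84aee - 0x8e
func_start_from_error = 0x7f9000d84b00 - 0xa0
```

Read this way, the line describes a user-mode read at address 0 (a NULL pointer dereference) at offset 0x3cb00 in libglusterfs.so.0.0.1. The instruction pointer 0x7f9000d84b00 is the same address the gfapi log prints for glusterfs_normalize_dentry+0xa0, which matches the "inode not found" warning just before the crash. With the matching debuginfo package installed, that offset could be passed to a tool such as addr2line to find the source line.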


Backtrace with gdb:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fafd6fcd700 (LWP 5171)]
0x00007fb0558eeb00 in glusterfs_normalize_dentry () from
/lib64/libglusterfs.so.0
(gdb) bt
#0  0x00007fb0558eeb00 in glusterfs_normalize_dentry () from
/lib64/libglusterfs.so.0
#1  0x00007fb055bc50b8 in glfs_resolve_at () from /lib64/libgfapi.so.0
#2  0x00007fb055bc6bb4 in glfs_h_lookupat () from /lib64/libgfapi.so.0
#3  0x00007fb055fe375f in lookup () from
/usr/lib64/ganesha/libfsalgluster.so
#4  0x000055e36bb2362f in mdc_get_parent ()
#5  0x000055e36bb202a5 in mdcache_create_handle ()
#6  0x000055e36ba81422 in nfs4_mds_putfh ()
#7  0x000055e36ba81998 in nfs4_op_putfh ()
#8  0x000055e36ba7108f in nfs4_Compound ()
#9  0x000055e36ba604fc in nfs_rpc_execute ()
#10 0x000055e36ba61dad in worker_run ()
#11 0x000055e36baf72c9 in fridgethr_start_routine ()
#12 0x00007fb05914de25 in start_thread (arg=0x7fafd6fcd700) at
pthread_create.c:308
#13 0x00007fb05881b34d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Does anybody have an idea how to solve this problem?


Thanks,
Philip
