Peter Becker
2015-Aug-04 23:19 UTC
[Gluster-users] Split brain after rebooting half of a two-node cluster
Hello,

We are trying to run a pair of ActiveMQ nodes on top of GlusterFS, using the approach described in http://activemq.apache.org/shared-file-system-master-slave.html

This seemed to work at first, but if I reboot machines while the cluster is under load, I quickly run into this problem:

[2015-08-05 08:54:40.475351] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.475373] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61819: LOOKUP() /kahadb/db.data => -1 (Input/output error)

(from /var/log/glusterfs/srv-amq.log; more of the log below)

Afterwards the whole cluster ceases to function, since the affected file is crucial to ActiveMQ's storage backend. I have gotten into this situation three times by now, recovering in between by rebuilding the GlusterFS configuration from scratch (stop volume, delete, empty bricks, create, start). The trigger is always a "sudo reboot" on one of the nodes.

Am I wrong to expect this to work, or is this an issue with my configuration or with GlusterFS itself?

Cheers,
Peter

More detail:
-----
qmaster at srvamqpy01:~$ cat /etc/issue
Ubuntu 12.04.5 LTS \n \l

qmaster at srvamqpy01:~$ uname -a
Linux srvamqpy01 3.13.0-61-generic #100~precise1-Ubuntu SMP Wed Jul 29 12:06:40 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

qmaster at srvamqpy01:~$ gluster --version
glusterfs 3.2.5 built on Jan 31 2012 07:39:59
[...]

qmaster at srvamqpy01:~$ cat /etc/fstab
[...]
/dev/sdb1 /data/brick1 ext4 acl,user_xattr 0 2
srvamqpy01:/gv0 /srv/amq glusterfs defaults,nobootwait,_netdev,direct-io-mode=disable 0 0
-----

Command used to create the volume:
-----
gluster volume create gv0 replica 2 srvamqpy01:/data/brick1/gv0 srvamqpy02:/data/brick1/gv0
-----

And more of the log:
-----
[2015-08-05 08:51:54.50969] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-gv0-client-0: changing port to 24011 (from 0)
[2015-08-05 08:51:54.51313] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-gv0-client-1: changing port to 24011 (from 0)
[2015-08-05 08:51:58.32060] I [client-handshake.c:1090:select_server_supported_programs] 0-gv0-client-0: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2015-08-05 08:51:58.32239] I [client-handshake.c:913:client_setvolume_cbk] 0-gv0-client-0: Connected to 10.254.2.137:24011, attached to remote volume '/data/brick1/gv0'.
[2015-08-05 08:51:58.32257] I [afr-common.c:3141:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0' came back up; going online.
[2015-08-05 08:51:58.32359] I [client-handshake.c:1090:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2015-08-05 08:51:58.33070] I [client-handshake.c:913:client_setvolume_cbk] 0-gv0-client-1: Connected to 10.254.2.164:24011, attached to remote volume '/data/brick1/gv0'.
[2015-08-05 08:51:58.35521] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse: switched to graph 0
[2015-08-05 08:51:58.35642] I [fuse-bridge.c:2927:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.22
[2015-08-05 08:51:58.36851] I [afr-common.c:1520:afr_set_root_inode_on_first_lookup] 0-gv0-replicate-0: added root inode
[2015-08-05 08:52:06.24620] I [afr-common.c:1038:afr_launch_self_heal] 0-gv0-replicate-0: background meta-data data self-heal triggered. path: /kahadb/lock
[2015-08-05 08:52:06.28557] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background meta-data data self-heal completed on /kahadb/lock
[2015-08-05 08:52:16.64428] I [afr-common.c:1038:afr_launch_self_heal] 0-gv0-replicate-0: background meta-data self-heal triggered. path: /kahadb/lock
[2015-08-05 08:52:16.65701] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background meta-data self-heal completed on /kahadb/lock
[2015-08-05 08:52:21.692657] W [socket.c:1494:__socket_proto_state_machine] 0-gv0-client-1: reading from socket failed. Error (Transport endpoint is not connected), peer (10.254.2.164:24011)
[2015-08-05 08:52:21.693353] I [client.c:1883:client_rpc_notify] 0-gv0-client-1: disconnected
[2015-08-05 08:52:26.71942] W [client3_1-fops.c:4699:client3_1_lk] 0-gv0-client-1: (-1909467425): failed to get fd ctx. EBADFD
[2015-08-05 08:52:26.71988] W [client3_1-fops.c:4751:client3_1_lk] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:52:32.35552] E [socket.c:1685:socket_connect_finish] 0-gv0-client-1: connection to 10.254.2.164:24011 failed (Connection refused)
[2015-08-05 08:52:35.36179] I [client-handshake.c:1090:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2015-08-05 08:52:35.37641] I [client-handshake.c:913:client_setvolume_cbk] 0-gv0-client-1: Connected to 10.254.2.164:24011, attached to remote volume '/data/brick1/gv0'.
[2015-08-05 08:52:36.538807] I [afr-open.c:432:afr_openfd_sh] 0-gv0-replicate-0: data missing-entry gfid self-heal triggered. path: /kahadb/db-4.log, reason: Replicate up down flush, data lock is held
[2015-08-05 08:52:36.539349] I [afr-self-heal-common.c:1203:sh_missing_entries_create] 0-gv0-replicate-0: no missing files - /kahadb/db-4.log. proceeding to metadata check
[2015-08-05 08:52:36.540105] W [dict.c:418:dict_unref] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7fea25a93ec5] (-->/usr/lib/glusterfs/3.2.5/xlator/protocol/client.so(client3_1_fstat_cbk+0x312) [0x7fea228f8902] (-->/usr/lib/glusterfs/3.2.5/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x1d5) [0x7fea226a0405]))) 0-dict: dict is NULL
[2015-08-05 08:52:36.772749] I [afr-self-heal-algorithm.c:520:sh_diff_loop_driver_done] 0-gv0-replicate-0: diff self-heal on /kahadb/db-4.log: completed. (1 blocks of 252 were different (0.40%))
[2015-08-05 08:52:36.775638] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background data missing-entry gfid self-heal completed on /kahadb/db-4.log
[2015-08-05 08:52:36.785113] I [afr-open.c:432:afr_openfd_sh] 0-gv0-replicate-0: data missing-entry gfid self-heal triggered. path: /kahadb/db.redo, reason: Replicate up down flush, data lock is held
[2015-08-05 08:52:36.785214] I [afr-open.c:432:afr_openfd_sh] 0-gv0-replicate-0: data missing-entry gfid self-heal triggered. path: /kahadb/db.data, reason: Replicate up down flush, data lock is held
[2015-08-05 08:52:36.785458] I [afr-self-heal-common.c:1858:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-gv0-replicate-0: Non blocking entrylks failed.
[2015-08-05 08:52:36.785480] I [afr-self-heal-common.c:963:afr_sh_missing_entries_done] 0-gv0-replicate-0: split brain found, aborting selfheal of /kahadb/db.data
[2015-08-05 08:52:36.785496] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background data missing-entry gfid self-heal failed on /kahadb/db.data
[2015-08-05 08:52:36.786139] I [afr-self-heal-common.c:1203:sh_missing_entries_create] 0-gv0-replicate-0: no missing files - /kahadb/db.redo. proceeding to metadata check
[2015-08-05 08:52:36.787147] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background data missing-entry gfid self-heal completed on /kahadb/db.redo
[2015-08-05 08:52:56.948495] I [afr-common.c:1038:afr_launch_self_heal] 0-gv0-replicate-0: background entry self-heal triggered. path: /kahadb
[2015-08-05 08:52:56.949790] I [afr-self-heal-entry.c:644:afr_sh_entry_expunge_entry_cbk] 0-gv0-replicate-0: missing entry /kahadb/db.free on gv0-client-0
[2015-08-05 08:52:56.952400] E [afr-self-heal-common.c:1054:afr_sh_common_lookup_resp_handler] 0-gv0-replicate-0: path /kahadb/lock on subvolume gv0-client-1 => -1 (No such file or directory)
[2015-08-05 08:52:56.953281] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background entry self-heal completed on /kahadb
[2015-08-05 08:53:37.196481] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-0: remote operation failed: No data available
[2015-08-05 08:53:37.196735] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-1: remote operation failed: No data available
[2015-08-05 08:53:37.196917] W [fuse-bridge.c:850:fuse_err_cbk] 0-glusterfs-fuse: 54284: REMOVEXATTR() /kahadb/db-4.log => -1 (No data available)
[2015-08-05 08:53:37.200487] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-0: remote operation failed: No data available
[2015-08-05 08:53:37.200746] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-1: remote operation failed: No data available
[2015-08-05 08:53:37.200936] W [fuse-bridge.c:850:fuse_err_cbk] 0-glusterfs-fuse: 54291: REMOVEXATTR() /kahadb/db-5.log => -1 (No data available)
[2015-08-05 08:53:48.674314] W [client3_1-fops.c:3655:client3_1_flush] 0-gv0-client-1: (-2161116166): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.674350] W [client3_1-fops.c:3692:client3_1_flush] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:53:48.676375] W [client3_1-fops.c:3655:client3_1_flush] 0-gv0-client-1: (-1443019630): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.676396] W [client3_1-fops.c:3692:client3_1_flush] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:53:48.762598] W [client3_1-fops.c:4699:client3_1_lk] 0-gv0-client-1: (-1909467425): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.762662] W [client3_1-fops.c:4751:client3_1_lk] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:53:48.764122] W [client3_1-fops.c:3655:client3_1_flush] 0-gv0-client-1: (-1909467425): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.764142] W [client3_1-fops.c:3692:client3_1_flush] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:54:40.467613] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.467839] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.467861] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61809: LOOKUP() /kahadb/db.data => -1 (Input/output error)
[2015-08-05 08:54:40.468151] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.468171] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61811: LOOKUP() /kahadb/db.data => -1 (Input/output error)
[2015-08-05 08:54:40.473764] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.473797] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61812: LOOKUP() /kahadb/db.data => -1 (Input/output error)
[2015-08-05 08:54:40.475351] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.475373] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61819: LOOKUP() /kahadb/db.data => -1 (Input/output error)
-----
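For readers hitting the same "split-brain possible, no source detected" symptom: AFR records pending-operation counters in trusted.afr.<volume>-client-N extended attributes on each brick, readable directly on the brick filesystems with something like `getfattr -d -m trusted.afr -e hex /data/brick1/gv0/kahadb/db.data` on each node (brick path taken from the volume definition above). When each brick's copy carries a nonzero data-pending count blaming the other copy, AFR cannot pick a heal source. A minimal sketch of that decision, assuming the standard changelog layout of three 32-bit counters (data, metadata, entry); the hex values below are illustrative, not captured from this cluster:

```shell
#!/bin/sh
# Decide whether a pair of AFR changelog xattrs indicates a data split-brain.
# Each trusted.afr.<vol>-client-N value is 12 bytes shown as hex, e.g.
# 0x000000020000000000000000: data-pending, metadata-pending, entry-pending.

data_pending() {
    # Extract the first counter (8 hex digits after "0x") as a decimal number.
    hex=${1#0x}
    printf '%d\n' "0x$(printf '%s' "$hex" | cut -c1-8)"
}

split_brain() {
    # $1 = trusted.afr.gv0-client-1 as read on brick 0 (brick 0 blaming brick 1)
    # $2 = trusted.afr.gv0-client-0 as read on brick 1 (brick 1 blaming brick 0)
    a=$(data_pending "$1")
    b=$(data_pending "$2")
    if [ "$a" -gt 0 ] && [ "$b" -gt 0 ]; then
        echo "split-brain"   # both copies blame each other: no source
    else
        echo "ok"            # at most one side blamed: a source exists
    fi
}

# Each brick blames the other: prints "split-brain".
split_brain 0x000000020000000000000000 0x000000010000000000000000
# Only brick 0 blames brick 1: brick 0 is the source; prints "ok".
split_brain 0x000000020000000000000000 0x000000000000000000000000
```

This only checks the data counter; metadata and entry split-brain follow the same pattern on the second and third counters.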
Ravishankar N
2015-Aug-05 01:12 UTC
[Gluster-users] Split brain after rebooting half of a two-node cluster
On 08/05/2015 04:49 AM, Peter Becker wrote:
> qmaster at srvamqpy01:~$ gluster --version
> glusterfs 3.2.5 built on Jan 31 2012 07:39:59

FWIW, this is a rather old release. Can you see if the issue is recurring with glusterfs 3.7?

-Ravi
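For anyone following this advice: 3.7 also added CLI support for inspecting and resolving split-brain, which would avoid the rebuild-from-scratch recovery described above. A sketch, using the volume and brick names from this thread; these commands assume glusterfs >= 3.7 and a live cluster, so treat them as a fragment rather than a tested recipe:

-----
# list files currently in split-brain
gluster volume heal gv0 info split-brain

# resolve one file by declaring one brick's copy the heal source
gluster volume heal gv0 split-brain source-brick srvamqpy01:/data/brick1/gv0 /kahadb/db.data

# reduce the chance of recurrence: stop writes when a replica lacks quorum
gluster volume set gv0 cluster.quorum-type auto
-----

Note that on a two-way replica, quorum enforcement trades availability for consistency (the volume goes read-only when the first brick is down); an arbiter brick (replica 3 arbiter 1, also introduced in 3.7) is the usual way around that trade-off.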