________________________________________
From: Ted & Jean Miller <tjmillers at gmail.com>
Sent: Monday, January 13, 2014 11:29 PM
To: Ted Miller
Subject: split-brain + what?

I had a network failure yesterday, and besides some split-brain files, I am getting the following in my log files. It appears that my two nodes won't stay connected because they can't agree on something, but I don't understand what they don't agree on, or what to do about it.

This log segment begins just after one of the nodes was rebooted.

+------------------------------------------------------------------------------+
[2014-01-14 01:20:59.410888] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-VM-client-0: changing port to 49154 (from 0)
[2014-01-14 01:20:59.411008] W [socket.c:514:__socket_rwv] 0-VM-client-0: readv failed (No data available)
[2014-01-14 01:20:59.418326] I [client-handshake.c:1659:select_server_supported_programs] 0-VM-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-01-14 01:20:59.418462] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-VM-client-1: changing port to 49156 (from 0)
[2014-01-14 01:20:59.418512] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (No data available)
[2014-01-14 01:20:59.422397] I [client-handshake.c:1456:client_setvolume_cbk] 0-VM-client-0: Connected to 10.41.65.4:49154, attached to remote volume '/bricks/01/B'.
[2014-01-14 01:20:59.422420] I [client-handshake.c:1468:client_setvolume_cbk] 0-VM-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2014-01-14 01:20:59.422481] I [afr-common.c:3698:afr_notify] 0-VM-replicate-0: Subvolume 'VM-client-0' came back up; going online.
[2014-01-14 01:20:59.422617] I [client-handshake.c:450:client_set_lk_version_cbk] 0-VM-client-0: Server lk version = 1
[2014-01-14 01:20:59.422706] I [client-handshake.c:1659:select_server_supported_programs] 0-VM-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-01-14 01:20:59.422899] I [client-handshake.c:1456:client_setvolume_cbk] 0-VM-client-1: Connected to 10.41.65.2:49156, attached to remote volume '/bricks/01/B'.
[2014-01-14 01:20:59.422909] I [client-handshake.c:1468:client_setvolume_cbk] 0-VM-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2014-01-14 01:20:59.429930] I [fuse-bridge.c:4769:fuse_graph_setup] 0-fuse: switched to graph 0
[2014-01-14 01:20:59.430077] I [client-handshake.c:450:client_set_lk_version_cbk] 0-VM-client-1: Server lk version = 1
[2014-01-14 01:20:59.431195] I [fuse-bridge.c:3724:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2014-01-14 01:20:59.432065] I [afr-common.c:2057:afr_set_root_inode_on_first_lookup] 0-VM-replicate-0: added root inode
[2014-01-14 01:20:59.432434] I [afr-common.c:2120:afr_discovery_cbk] 0-VM-replicate-0: selecting local read_child VM-client-0
[2014-01-14 03:00:47.506676] W [socket.c:514:__socket_rwv] 0-glusterfs: readv failed (No data available)
[2014-01-14 03:00:47.506693] W [socket.c:1962:__socket_proto_state_machine] 0-glusterfs: reading from socket failed. Error (No data available), peer (10.41.65.4:24007)
[2014-01-14 03:00:57.793050] I [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2014-01-14 03:14:05.976531] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (Connection timed out)
[2014-01-14 03:14:05.976563] W [socket.c:1962:__socket_proto_state_machine] 0-VM-client-1: reading from socket failed. Error (Connection timed out), peer (10.41.65.2:49156)
[2014-01-14 03:14:05.976597] I [client.c:2097:client_rpc_notify] 0-VM-client-1: disconnected
[2014-01-14 03:14:47.855141] E [client-handshake.c:1742:client_query_portmap_cbk] 0-VM-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2014-01-14 03:14:47.855192] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (No data available)
[2014-01-14 03:14:47.855214] I [client.c:2097:client_rpc_notify] 0-VM-client-1: disconnected
[2014-01-14 03:14:49.860364] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-VM-client-1: changing port to 49156 (from 0)
[2014-01-14 03:14:49.860396] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (No data available)
[2014-01-14 03:14:49.864065] I [client-handshake.c:1659:select_server_supported_programs] 0-VM-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-01-14 03:14:49.864325] I [client-handshake.c:1456:client_setvolume_cbk] 0-VM-client-1: Connected to 10.41.65.2:49156, attached to remote volume '/bricks/01/B'.
[2014-01-14 03:14:49.864349] I [client-handshake.c:1468:client_setvolume_cbk] 0-VM-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2014-01-14 03:14:49.864527] I [client-handshake.c:450:client_set_lk_version_cbk] 0-VM-client-1: Server lk version = 1
[2014-01-14 03:18:07.904536] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (Connection timed out)
[2014-01-14 03:18:07.904574] W [socket.c:1962:__socket_proto_state_machine] 0-VM-client-1: reading from socket failed. Error (Connection timed out), peer (10.41.65.2:49156)
[2014-01-14 03:18:07.904602] I [client.c:2097:client_rpc_notify] 0-VM-client-1: disconnected
[2014-01-14 03:18:49.876759] E [socket.c:2157:socket_connect_finish] 0-VM-client-1: connection to 10.41.65.2:24007 failed (Connection refused)
[2014-01-14 03:18:49.876798] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (No data available)
[2014-01-14 03:18:51.911943] E [client-handshake.c:1742:client_query_portmap_cbk] 0-VM-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2014-01-14 03:18:51.911977] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (No data available)
[2014-01-14 03:18:51.912001] I [client.c:2097:client_rpc_notify] 0-VM-client-1: disconnected
[2014-01-14 03:18:54.885965] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-VM-client-1: changing port to 49156 (from 0)
[2014-01-14 03:18:54.885998] W [socket.c:514:__socket_rwv] 0-VM-client-1: readv failed (No data available)
[2014-01-14 03:18:54.889314] I [client-handshake.c:1659:select_server_supported_programs] 0-VM-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-01-14 03:18:54.889530] I [client-handshake.c:1456:client_setvolume_cbk] 0-VM-client-1: Connected to 10.41.65.2:49156, attached to remote volume '/bricks/01/B'.
[2014-01-14 03:18:54.889540] I [client-handshake.c:1468:client_setvolume_cbk] 0-VM-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2014-01-14 03:18:54.889751] I [client-handshake.c:450:client_set_lk_version_cbk] 0-VM-client-1: Server lk version = 1

Ted Miller
Elkhart, IN
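To see which connection keeps dropping in a log like the one above, it may help to tally events per client translator. The following is only a minimal sketch, not a GlusterFS tool: the regular expression is an assumption based on the line layout visible in this log, and `summarize` and `LINE_RE` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines in this message:
# [timestamp] LEVEL [file.c:line:function] translator: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^\]]+)\]\s+(?P<xl>\S+):\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count disconnects, successful connects and E-level lines per translator."""
    stats = {
        "disconnected": Counter(),
        "connected": Counter(),
        "errors": Counter(),
    }
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # separator rows or anything not in the assumed layout
        xl, msg = m["xl"], m["msg"]
        if m["level"] == "E":
            stats["errors"][xl] += 1
        if msg == "disconnected":
            stats["disconnected"][xl] += 1
        elif msg.startswith("Connected to "):
            stats["connected"][xl] += 1
    return stats


# Three lines copied from the log above, used as sample input.
sample = [
    "[2014-01-14 03:18:07.904602] I [client.c:2097:client_rpc_notify] "
    "0-VM-client-1: disconnected",
    "[2014-01-14 03:18:49.876759] E [socket.c:2157:socket_connect_finish] "
    "0-VM-client-1: connection to 10.41.65.2:24007 failed (Connection refused)",
    "[2014-01-14 03:18:54.889530] I [client-handshake.c:1456:client_setvolume_cbk] "
    "0-VM-client-1: Connected to 10.41.65.2:49156, attached to remote volume "
    "'/bricks/01/B'.",
]

if __name__ == "__main__":
    for kind, counts in summarize(sample).items():
        print(kind, dict(counts))
```

Run against the full client log (one entry per line), this would show whether the disconnects are confined to one translator, as they appear to be for 0-VM-client-1 (10.41.65.2) in the segment above.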