Iain Milne
2014-Apr-15 15:26 UTC
[Gluster-users] Volume add-brick: failed: (with no error message)
Hi folks,

We've had a 2-node gluster array working great for the last year. Each brick is a 37TB xfs mount. It's now on CentOS 6.5 (x64) running gluster 3.4.3-2.

    Volume Name: gfs
    Type: Distribute
    Volume ID: ddbb46bb-821e-44db-bc7e-32f43334f62c
    Status: Started
    Number of Bricks: 2
    Transport-type: tcp
    Bricks:
    Brick1: server1:/mnt/data
    Brick2: server2:/mnt/data

We've just bought a new server (identical in every way to the previous two) and we're trying to get it added to the volume. The peering process goes fine:

    Number of Peers: 2

    Hostname: server2
    Uuid: 02f1a25b-afd8-49e2-8708-95456f6b8473
    State: Peer in Cluster (Connected)

    Hostname: server3
    Port: 24007
    Uuid: 3fc9df26-bb49-4c74-8eae-4b3f37389224
    State: Peer in Cluster (Connected)

The only thing of interest (?) there is the addition of the port number for the new server. Neither of the old servers shows a port, no matter which of the boxes the peer status command is run on.

The main problem is the addition of the new server/brick:

    [root@server1 glusterfs]# gluster volume add-brick gfs server3:/mnt/data
    volume add-brick: failed:

There's no error there at all: just a blank after the colon.

The logs on server1 (the one trying to do the add):

    W [rpc-transport.c:175:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
    I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
    I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
    I [cli-cmd-volume.c:1336:cli_check_gsync_present] 0-: geo-replication not installed
    I [cli-rpc-ops.c:1695:gf_cli_add_brick_cbk] 0-cli: Received resp to add brick
    I [input.c:36:cli_batch] 0-: Exiting with: -1

And the logs on server3 (the one being added):

    E [glusterd-op-sm.c:3719:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Add brick', Status : -1

The current storage array is live and in use by users, so it can't be taken offline at short notice.

For completeness, here's glusterd on server3 running in debug mode when the add-brick command was attempted:

[2014-04-15 15:03:33.133976] D [glusterd-handler.c:549:__glusterd_handle_cluster_lock] 0-management: Received LOCK from uuid: 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.134013] D [glusterd-utils.c:4936:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-04-15 15:03:33.134031] D [glusterd-op-sm.c:5355:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_LOCK'
[2014-04-15 15:03:33.134051] D [glusterd-handler.c:572:__glusterd_handle_cluster_lock] 0-management: Returning 0
[2014-04-15 15:03:33.134065] D [glusterd-op-sm.c:5432:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_LOCK'
[2014-04-15 15:03:33.134083] D [glusterd-utils.c:340:glusterd_lock] 0-management: Cluster lock held by 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.134096] D [glusterd-op-sm.c:2445:glusterd_op_ac_lock] 0-management: Lock Returned 0
[2014-04-15 15:03:33.134153] D [glusterd-handler.c:1776:glusterd_op_lock_send_resp] 0-management: Responded to lock, ret: 0
[2014-04-15 15:03:33.134171] D [glusterd-utils.c:5598:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Default' to 'Locked' due to event 'GD_OP_EVENT_LOCK'
[2014-04-15 15:03:33.134187] D [glusterd-utils.c:5600:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-04-15 15:03:33.135409] D [glusterd-utils.c:4936:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-04-15 15:03:33.135452] D [glusterd-handler.c:604:glusterd_req_ctx_create] 0-management: Received op from uuid 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.135481] D [glusterd-op-sm.c:5355:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_STAGE_OP'
[2014-04-15 15:03:33.135497] D [glusterd-op-sm.c:5432:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_STAGE_OP'
[2014-04-15 15:03:33.135524] D [glusterd-utils.c:1209:glusterd_volinfo_find] 0-: Volume gfs found
[2014-04-15 15:03:33.135537] D [glusterd-utils.c:1216:glusterd_volinfo_find] 0-: Returning 0
[2014-04-15 15:03:33.135554] D [glusterd-utils.c:5223:glusterd_is_rb_started] 0-: is_rb_started:status=0
[2014-04-15 15:03:33.135600] D [glusterd-utils.c:5232:glusterd_is_rb_paused] 0-: is_rb_paused:status=0
[2014-04-15 15:03:33.135643] D [glusterd-utils.c:803:glusterd_brickinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135662] D [glusterd-utils.c:865:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-04-15 15:03:33.135677] D [glusterd-utils.c:665:glusterd_volinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135698] D [glusterd-utils.c:749:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135713] D [glusterd-utils.c:777:glusterd_volinfo_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135729] D [glusterd-utils.c:803:glusterd_brickinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135742] D [glusterd-utils.c:865:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-04-15 15:03:33.135755] D [glusterd-utils.c:665:glusterd_volinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135771] D [glusterd-utils.c:749:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135784] D [glusterd-utils.c:777:glusterd_volinfo_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135797] D [glusterd-utils.c:803:glusterd_brickinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135810] D [glusterd-utils.c:865:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-04-15 15:03:33.136093] D [glusterd-utils.c:5029:glusterd_friend_find_by_hostname] 0-management: Unable to find friend: server3
[2014-04-15 15:03:33.136194] D [glusterd-utils.c:290:glusterd_is_local_addr] 0-management: 10.0.0.244
[2014-04-15 15:03:33.136755] D [glusterd-utils.c:257:glusterd_interface_search] 0-management: 10.0.0.244 is local address at interface em1
[2014-04-15 15:03:33.136778] D [glusterd-utils.c:5064:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-04-15 15:03:33.136790] D [glusterd-utils.c:819:glusterd_resolve_brick] 0-management: Returning 0
[2014-04-15 15:03:33.136818] D [glusterd-utils.c:5215:glusterd_new_brick_validate] 0-management: returning 0
[2014-04-15 15:03:33.136849] D [glusterd-brick-ops.c:1177:glusterd_op_stage_add_brick] 0-management: Returning -1
[2014-04-15 15:03:33.136866] D [glusterd-op-sm.c:3975:glusterd_op_stage_validate] 0-management: Returning -1
[2014-04-15 15:03:33.136878] E [glusterd-op-sm.c:3719:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Add brick', Status : -1
[2014-04-15 15:03:33.136940] D [glusterd-handler.c:1891:glusterd_op_stage_send_resp] 0-management: Responded to stage, ret: 0
[2014-04-15 15:03:33.136959] D [glusterd-op-sm.c:3728:glusterd_op_ac_stage_op] 0-management: Returning with 0
[2014-04-15 15:03:33.136975] D [glusterd-utils.c:5598:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Locked' to 'Staged' due to event 'GD_OP_EVENT_STAGE_OP'
[2014-04-15 15:03:33.136989] D [glusterd-utils.c:5600:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-04-15 15:03:33.138024] D [glusterd-handler.c:1824:__glusterd_handle_cluster_unlock] 0-management: Received UNLOCK from uuid: 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.138063] D [glusterd-utils.c:4936:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-04-15 15:03:33.138105] D [glusterd-op-sm.c:5355:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_UNLOCK'
[2014-04-15 15:03:33.138123] D [glusterd-op-sm.c:5432:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_UNLOCK'
[2014-04-15 15:03:33.138139] D [glusterd-op-sm.c:2469:glusterd_op_ac_unlock] 0-management: Unlock Returned 0
[2014-04-15 15:03:33.138192] D [glusterd-handler.c:1795:glusterd_op_unlock_send_resp] 0-management: Responded to unlock, ret: 0
[2014-04-15 15:03:33.138209] D [glusterd-utils.c:5598:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Staged' to 'Default' due to event 'GD_OP_EVENT_UNLOCK'
[2014-04-15 15:03:33.138224] D [glusterd-utils.c:5600:glusterd_sm_tr_log_transition_add] 0-management: returning 0
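
In case it helps anyone skimming a debug trace like this one, here is a minimal Python sketch for picking out the entries that report a failure. It assumes the log has one entry per line in the `[timestamp] LEVEL [file:line:function] domain: message` layout seen in this post; the `ENTRY` pattern and the `failing_entries` helper are hypothetical names made up for illustration, not part of gluster. It flags error-level (`E`) entries and any message of the form "Returning -N", so it only shows where the failure first surfaces, not why.

```python
import re

# Assumed entry layout, taken from the debug output in this post:
# [timestamp] LEVEL [file.c:line:function] domain: message
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:.]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^\]]+)\] (?P<domain>\S*): (?P<msg>.*)"
)

# Messages such as "Returning -1" or "returning with -1".
NEGATIVE_RETURN = re.compile(r"[Rr]eturning(?: with)? -\d+")


def failing_entries(lines):
    """Yield (timestamp, source, message) for entries that look like failures.

    An entry counts as a failure if it is logged at level E or if its
    message reports a negative return value. Lines that do not match the
    assumed layout are skipped.
    """
    for line in lines:
        match = ENTRY.match(line.strip())
        if match is None:
            continue
        msg = match.group("msg")
        if match.group("level") == "E" or NEGATIVE_RETURN.search(msg):
            yield match.group("ts"), match.group("src"), msg


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python find_failures.py glusterd-debug.log
    with open(sys.argv[1]) as handle:
        for ts, src, msg in failing_entries(handle):
            print(ts, src, msg)
```

Run against the trace above, the first entry it should report is the one from `glusterd_op_stage_add_brick`, which is where the -1 first shows up on server3.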