Hans Lambermont
2013-Apr-08 14:17 UTC
[Gluster-users] Replace-brick on 3.3.1 hangs entire volume for several minutes and then hangs glusterfs on destination brick
Hi gluster users,

I just upgraded from 3.2.5 to 3.3.1 on a Distributed-Replicate volume with about 2M directories in order to get a working replace-brick. Instead, replace-brick now hangs the entire gluster volume for all clients for several minutes, and subsequently the glusterfs process on the destination brick hangs as well.

I suspect the volume-wide hang is related to https://bugzilla.redhat.com/show_bug.cgi?id=832609 "Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up".

The hung replace-brick destination process sits at 100% CPU and shows no strace output :

  gluster volume replace-brick xxx status
  Number of files migrated = 3        Current file= /xxx

  %CPU %MEM    TIME+ P COMMAND
   100  0.2  2238:48 2 //sbin/glusterfs -f/var/lib/glusterd/vols/vol01/rb_dst_brick.vol ...

The target brick received about 1% of the intended directories.

The log file -etc-glusterfs-glusterd.vol.log shows only that the replace-brick has started :

  I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
  I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick status request
  I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by 3*
  I [glusterd-handler.c:463:glusterd_op_txn_begin] 0-management: Acquired local lock
  I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: s1:/g/c
  I [glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
  I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 2 peers
  I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: s1:/g/c
  I [glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
  I [glusterd-replace-brick.c:1288:rb_update_dstbrick_port] 0-: adding dst-brick port no
  I [glusterd-op-sm.c:2384:glusterd_op_ac_send_commit_op] 0-management: Sent op req to 2 peers
  I [glusterd-rpc-ops.c:1317:glusterd3_1_commit_op_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-rpc-ops.c:1317:glusterd3_1_commit_op_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-rpc-ops.c:607:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-rpc-ops.c:607:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-op-sm.c:2653:glusterd_op_txn_complete] 0-glusterd: Cleared local lock

Any hints on how to proceed from here and get replace-brick to work are welcome.

regards,
Hans Lambermont

--
Hans Lambermont | Senior Architect
(t) +31407370104
(w) www.shapeways.com
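
For reference, the checks described above amount to something like the following (a sketch assuming the stock 3.3.x CLI and standard tools; <VOL>, <src-brick> and <dst-brick> stand in for the real names, and the pgrep pattern is only an illustration):

  # progress counter shown above ("Number of files migrated" / "Current file")
  gluster volume replace-brick <VOL> <src-brick> <dst-brick> status

  # the destination-brick glusterfs stays pinned at 100% CPU ...
  top -b -n 1 -p $(pgrep -f rb_dst_brick.vol)

  # ... and produces no strace output while it is hung
  strace -f -p $(pgrep -f rb_dst_brick.vol)
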
Hans Lambermont
2013-Apr-09 12:31 UTC
[Gluster-users] 'replace-brick' - why we plan to deprecate
Amar Tumballi wrote on Thu Oct 11 18:35:32 UTC 2012 :

> When we initially came up with specs of 'glusterd', we needed an
> option to replace a dead brick, and few people even requested for
> having an option to migrate the data from the brick, when we are
> replacing it.

Do you specifically mean a 'dead' brick ? Or does your proposal hold for a live brick too ? (And doesn't a dead brick prevent one from reading data from it ?)

> The result of this is 'gluster volume replace-brick' CLI, and in the
> releases till 3.3.0 this was the only way to 'migrate' data off a
> removed brick properly.
>
> Now, with 3.3.0+ (ie, in upstream too), we have another *better*
> approach (technically), which is achieved by below methods:

...

> 2) (Distributed-)Replicate Volume:
>
> earlier:
> #gluster volume replace-brick <VOL> brick1 brick2 start [1]
>
> now:
>
> #gluster volume replace-brick <VOL> brick1 brick2 commit force
> (self-heal daemon takes care of syncing data from one brick to another)

For a dead brick this is OK. For a live brick, however, this would break redundancy during the long syncing time, which is unacceptable.

What is the status and roadmap for live-brick replace-brick in 3.3.1 ?

regards,
Hans Lambermont

--
Hans Lambermont | Senior Architect
(t) +31407370104
(w) www.shapeways.com
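
For comparison, the complete "new" sequence for a replicate volume would look roughly like this (a sketch assuming the 3.3.x CLI; <VOL>, <old-brick> and <new-brick> are placeholders, and the heal commands are the usual way to trigger and watch the resync onto the new brick):

  # swap the brick in place; no data is copied by this command itself
  gluster volume replace-brick <VOL> <old-brick> <new-brick> commit force

  # ask the self-heal daemon to rebuild the new brick, then watch progress
  gluster volume heal <VOL> full
  gluster volume heal <VOL> info

Until the heal finishes, that replica pair is effectively running on a single good copy; that window is the redundancy concern raised above for replacing a live brick.
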