Alessandro Ipe
2015-Jan-07 11:39 UTC
[Gluster-users] volume replace-brick start is not working
Hi,

The corresponding logs in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log (OS is openSuSE 12.3):

[2015-01-06 12:32:14.596601] I [glusterd-replace-brick.c:98:__glusterd_handle_replace_brick] 0-management: Received replace brick req
[2015-01-06 12:32:14.596633] I [glusterd-replace-brick.c:153:__glusterd_handle_replace_brick] 0-management: Received replace brick status request
[2015-01-06 12:32:14.596991] E [glusterd-rpc-ops.c:602:__glusterd_cluster_lock_cbk] 0-management: Received lock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380
[2015-01-06 12:32:14.598100] E [glusterd-rpc-ops.c:675:__glusterd_cluster_unlock_cbk] 0-management: Received unlock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380

However, several reboots managed to cancel the replace-brick command. Moreover, I read that this command could still have issues (obviously) in 3.5, so I managed to find a workaround for it.

A.

On Wednesday 07 January 2015 15:41:07 Atin Mukherjee wrote:
> On 01/06/2015 06:05 PM, Alessandro Ipe wrote:
> > Hi,
> >
> > We have set up a "md1" volume using gluster 3.4.2 over 4 servers
> > configured as distributed and replicated. Then we upgraded smoothly to
> > 3.5.3, since it was mentioned that the command "volume replace-brick" is
> > broken on 3.4.x. We added two more peers (after having read that the
> > quota feature needed to be turned off for this command to succeed...).
> >
> > We then issued
> > gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 start force
> > Then I did
> > gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 abort
> > because nothing was happening.
> >
> > However, when trying to monitor the previous command by
> > gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 status
> > it outputs
> > volume replace-brick: failed: Another transaction could be in progress. Please try again after sometime.
> > and the following lines are written in cli.log:
> > [2015-01-06 12:32:14.595387] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
> > [2015-01-06 12:32:14.595434] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
> > [2015-01-06 12:32:14.595590] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
> > [2015-01-06 12:32:14.595606] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
> > [2015-01-06 12:32:14.596013] I [cli-cmd-volume.c:1706:cli_check_gsync_present] 0-: geo-replication not installed
> > [2015-01-06 12:32:14.602165] I [cli-rpc-ops.c:2162:gf_cli_replace_brick_cbk] 0-cli: Received resp to replace brick
> > [2015-01-06 12:32:14.602248] I [input.c:36:cli_batch] 0-: Exiting with: -1
> >
> > What am I doing wrong?
>
> Can you please share the glusterd log?
>
> ~Atin
>
> > Many thanks,
> >
> > Alessandro.
> >
> > gluster volume info md1 outputs:
> > Volume Name: md1
> > Type: Distributed-Replicate
> > Volume ID: 6da4b915-1def-4df4-a41c-2f3300ebf16b
> > Status: Started
> > Number of Bricks: 2 x 2 = 4
> > Transport-type: tcp
> > Bricks:
> > Brick1: tsunami1:/data/glusterfs/md1/brick1
> > Brick2: tsunami2:/data/glusterfs/md1/brick1
> > Brick3: tsunami3:/data/glusterfs/md1/brick1
> > Brick4: tsunami4:/data/glusterfs/md1/brick1
> > Options Reconfigured:
> > server.allow-insecure: on
> > cluster.read-hash-mode: 2
> > features.quota: off
> > nfs.disable: on
> > performance.cache-size: 512MB
> > performance.io-thread-count: 64
> > performance.flush-behind: off
> > performance.write-behind-window-size: 4MB
> > performance.write-behind: on

--
Dr. Ir. Alessandro Ipe
Department of Observations      Tel. +32 2 373 06 31
Remote Sensing from Space       Fax. +32 2 374 67 88
Royal Meteorological Institute
Avenue Circulaire 3             Email: Alessandro.Ipe at meteo.be
B-1180 Brussels, Belgium        Web: http://gerb.oma.be
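[Note: the workaround itself is not spelled out above. For 3.5.x replicated volumes, a commonly suggested alternative to the broken "start" migration was to commit the replacement directly and let self-heal copy the data. The sketch below reuses the brick paths from this thread and is illustrative only, not necessarily the workaround that was actually used.]

    # Assumes the new peer (193.190.249.122) is already probed and the new
    # brick directory exists. "commit force" swaps the brick in the volume
    # definition without running the data migration phase.
    gluster volume replace-brick md1 \
        193.190.249.113:/data/glusterfs/md1/brick1 \
        193.190.249.122:/data/glusterfs/md1/brick1 commit force

    # Let self-heal repopulate the new brick from its replica partner,
    # then watch progress.
    gluster volume heal md1 full
    gluster volume heal md1 info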
Atin Mukherjee
2015-Jan-07 12:36 UTC
[Gluster-users] volume replace-brick start is not working
On 01/07/2015 05:09 PM, Alessandro Ipe wrote:
> Hi,
>
> The corresponding logs in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log (OS is openSuSE 12.3):
> [2015-01-06 12:32:14.596601] I [glusterd-replace-brick.c:98:__glusterd_handle_replace_brick] 0-management: Received replace brick req
> [2015-01-06 12:32:14.596633] I [glusterd-replace-brick.c:153:__glusterd_handle_replace_brick] 0-management: Received replace brick status request
> [2015-01-06 12:32:14.596991] E [glusterd-rpc-ops.c:602:__glusterd_cluster_lock_cbk] 0-management: Received lock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380

Here is the problem: the lock request is rejected by peer b1aa773f-f9f4-491c-9493-00b23d5ee380. I suspect this is one of the peers you added as part of your use case. Was the peer probe successful? Can you please provide the peer status output?

~Atin

> [2015-01-06 12:32:14.598100] E [glusterd-rpc-ops.c:675:__glusterd_cluster_unlock_cbk] 0-management: Received unlock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380
>
> However, several reboots managed to cancel the replace-brick command. Moreover, I read that this command could still have issues (obviously) in 3.5, so I managed to find a workaround for it.
>
> A.
>
> On Wednesday 07 January 2015 15:41:07 Atin Mukherjee wrote:
>> On 01/06/2015 06:05 PM, Alessandro Ipe wrote:
>>> Hi,
>>>
>>> We have set up a "md1" volume using gluster 3.4.2 over 4 servers
>>> configured as distributed and replicated. Then we upgraded smoothly to
>>> 3.5.3, since it was mentioned that the command "volume replace-brick" is
>>> broken on 3.4.x. We added two more peers (after having read that the
>>> quota feature needed to be turned off for this command to succeed...).
>>>
>>> We then issued
>>> gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 start force
>>> Then I did
>>> gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 abort
>>> because nothing was happening.
>>>
>>> However, when trying to monitor the previous command by
>>> gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 status
>>> it outputs
>>> volume replace-brick: failed: Another transaction could be in progress. Please try again after sometime.
>>> and the following lines are written in cli.log:
>>> [2015-01-06 12:32:14.595387] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
>>> [2015-01-06 12:32:14.595434] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
>>> [2015-01-06 12:32:14.595590] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
>>> [2015-01-06 12:32:14.595606] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
>>> [2015-01-06 12:32:14.596013] I [cli-cmd-volume.c:1706:cli_check_gsync_present] 0-: geo-replication not installed
>>> [2015-01-06 12:32:14.602165] I [cli-rpc-ops.c:2162:gf_cli_replace_brick_cbk] 0-cli: Received resp to replace brick
>>> [2015-01-06 12:32:14.602248] I [input.c:36:cli_batch] 0-: Exiting with: -1
>>>
>>> What am I doing wrong?
>>
>> Can you please share the glusterd log?
>>
>> ~Atin
>>
>>> Many thanks,
>>>
>>> Alessandro.
>>>
>>> gluster volume info md1 outputs:
>>> Volume Name: md1
>>> Type: Distributed-Replicate
>>> Volume ID: 6da4b915-1def-4df4-a41c-2f3300ebf16b
>>> Status: Started
>>> Number of Bricks: 2 x 2 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: tsunami1:/data/glusterfs/md1/brick1
>>> Brick2: tsunami2:/data/glusterfs/md1/brick1
>>> Brick3: tsunami3:/data/glusterfs/md1/brick1
>>> Brick4: tsunami4:/data/glusterfs/md1/brick1
>>> Options Reconfigured:
>>> server.allow-insecure: on
>>> cluster.read-hash-mode: 2
>>> features.quota: off
>>> nfs.disable: on
>>> performance.cache-size: 512MB
>>> performance.io-thread-count: 64
>>> performance.flush-behind: off
>>> performance.write-behind-window-size: 4MB
>>> performance.write-behind: on
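[Note: the peer-status check Atin asks for, and the usual way to clear a stale cluster lock left behind by an aborted replace-brick, look roughly like the following. The host owning the rejecting UUID has to be read from the peer status output or from glusterd.info, so treat this as a sketch.]

    # On any node: all peers should report "Peer in Cluster (Connected)".
    # The output also maps each hostname to its UUID, so the peer that sent
    # the lock RJT (b1aa773f-f9f4-491c-9493-00b23d5ee380) can be identified.
    gluster peer status

    # On that peer: the local UUID is stored in glusterd.info. Restarting
    # glusterd there is the usual way to drop a stale cluster-wide lock.
    cat /var/lib/glusterd/glusterd.info
    service glusterd restart    # or: systemctl restart glusterd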