Alessandro Ipe
2015-Jan-07 11:39 UTC
[Gluster-users] volume replace-brick start is not working
Hi,

The corresponding logs in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log (OS is openSuSE 12.3):

[2015-01-06 12:32:14.596601] I [glusterd-replace-brick.c:98:__glusterd_handle_replace_brick] 0-management: Received replace brick req
[2015-01-06 12:32:14.596633] I [glusterd-replace-brick.c:153:__glusterd_handle_replace_brick] 0-management: Received replace brick status request
[2015-01-06 12:32:14.596991] E [glusterd-rpc-ops.c:602:__glusterd_cluster_lock_cbk] 0-management: Received lock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380
[2015-01-06 12:32:14.598100] E [glusterd-rpc-ops.c:675:__glusterd_cluster_unlock_cbk] 0-management: Received unlock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380

However, several reboots managed to cancel the replace-brick command. Moreover, I read that this command could still have issues (obviously) in 3.5, so I managed to find a workaround for it.

A.

On Wednesday 07 January 2015 15:41:07 Atin Mukherjee wrote:
> On 01/06/2015 06:05 PM, Alessandro Ipe wrote:
> > Hi,
> >
> > We have set up a "md1" volume using gluster 3.4.2 over 4 servers
> > configured as distributed and replicated. Then we upgraded smoothly to
> > 3.5.3, since it was mentioned that the command "volume replace-brick" is
> > broken on 3.4.x. We added two more peers (after having read that the
> > quota feature needed to be turned off for this command to succeed...).
> >
> > We then issued
> > gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 start force
> > Then I did
> > gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 abort
> > because nothing was happening.
> >
> > However, when trying to monitor the previous command by
> > gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 status
> > it outputs
> > volume replace-brick: failed: Another transaction could be in progress. Please try again after sometime.
> > and the following lines are written in cli.log:
> > [2015-01-06 12:32:14.595387] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
> > [2015-01-06 12:32:14.595434] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
> > [2015-01-06 12:32:14.595590] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
> > [2015-01-06 12:32:14.595606] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
> > [2015-01-06 12:32:14.596013] I [cli-cmd-volume.c:1706:cli_check_gsync_present] 0-: geo-replication not installed
> > [2015-01-06 12:32:14.602165] I [cli-rpc-ops.c:2162:gf_cli_replace_brick_cbk] 0-cli: Received resp to replace brick
> > [2015-01-06 12:32:14.602248] I [input.c:36:cli_batch] 0-: Exiting with: -1
> >
> > What am I doing wrong?
>
> Can you please share the glusterd log?
>
> ~Atin
>
> > Many thanks,
> >
> > Alessandro.
> >
> > gluster volume info md1 outputs:
> > Volume Name: md1
> > Type: Distributed-Replicate
> > Volume ID: 6da4b915-1def-4df4-a41c-2f3300ebf16b
> > Status: Started
> > Number of Bricks: 2 x 2 = 4
> > Transport-type: tcp
> > Bricks:
> > Brick1: tsunami1:/data/glusterfs/md1/brick1
> > Brick2: tsunami2:/data/glusterfs/md1/brick1
> > Brick3: tsunami3:/data/glusterfs/md1/brick1
> > Brick4: tsunami4:/data/glusterfs/md1/brick1
> > Options Reconfigured:
> > server.allow-insecure: on
> > cluster.read-hash-mode: 2
> > features.quota: off
> > nfs.disable: on
> > performance.cache-size: 512MB
> > performance.io-thread-count: 64
> > performance.flush-behind: off
> > performance.write-behind-window-size: 4MB
> > performance.write-behind: on

--
Dr. Ir. Alessandro Ipe
Department of Observations      Tel. +32 2 373 06 31
Remote Sensing from Space       Fax. +32 2 374 67 88
Royal Meteorological Institute
Avenue Circulaire 3             Email: Alessandro.Ipe at meteo.be
B-1180 Brussels, Belgium        Web: http://gerb.oma.be
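[Note: the workaround itself is not spelled out above. For 3.5.x replicated volumes, a commonly suggested alternative to the broken "start" migration was to commit the replacement directly and let self-heal copy the data. The sketch below reuses the brick paths from this thread and is illustrative only, not necessarily the workaround that was actually used.]

    # Assumes the new peer (193.190.249.122) is already probed and the new
    # brick directory exists. "commit force" swaps the brick in the volume
    # definition without running the data migration phase.
    gluster volume replace-brick md1 \
        193.190.249.113:/data/glusterfs/md1/brick1 \
        193.190.249.122:/data/glusterfs/md1/brick1 commit force

    # Let self-heal repopulate the new brick from its replica partner,
    # then watch progress.
    gluster volume heal md1 full
    gluster volume heal md1 info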
Atin Mukherjee
2015-Jan-07 12:36 UTC
[Gluster-users] volume replace-brick start is not working
On 01/07/2015 05:09 PM, Alessandro Ipe wrote:
> Hi,
>
> The corresponding logs in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log (OS is openSuSE 12.3):
> [2015-01-06 12:32:14.596601] I [glusterd-replace-brick.c:98:__glusterd_handle_replace_brick] 0-management: Received replace brick req
> [2015-01-06 12:32:14.596633] I [glusterd-replace-brick.c:153:__glusterd_handle_replace_brick] 0-management: Received replace brick status request
> [2015-01-06 12:32:14.596991] E [glusterd-rpc-ops.c:602:__glusterd_cluster_lock_cbk] 0-management: Received lock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380

Here is the problem: the lock request is rejected by peer b1aa773f-f9f4-491c-9493-00b23d5ee380. I suspect this is one of the peers you added as part of your use case. Was the peer probe successful? Can you please provide the peer status output?

~Atin

> [2015-01-06 12:32:14.598100] E [glusterd-rpc-ops.c:675:__glusterd_cluster_unlock_cbk] 0-management: Received unlock RJT from uuid: b1aa773f-f9f4-491c-9493-00b23d5ee380
>
> However, several reboots managed to cancel the replace-brick command. Moreover, I read that this command could still have issues (obviously) in 3.5, so I managed to find a workaround for it.
>
> A.
>
> On Wednesday 07 January 2015 15:41:07 Atin Mukherjee wrote:
>> On 01/06/2015 06:05 PM, Alessandro Ipe wrote:
>>> Hi,
>>>
>>> We have set up a "md1" volume using gluster 3.4.2 over 4 servers
>>> configured as distributed and replicated. Then we upgraded smoothly to
>>> 3.5.3, since it was mentioned that the command "volume replace-brick" is
>>> broken on 3.4.x. We added two more peers (after having read that the
>>> quota feature needed to be turned off for this command to succeed...).
>>>
>>> We then issued
>>> gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 start force
>>> Then I did
>>> gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 abort
>>> because nothing was happening.
>>>
>>> However, when trying to monitor the previous command by
>>> gluster volume replace-brick md1 193.190.249.113:/data/glusterfs/md1/brick1 193.190.249.122:/data/glusterfs/md1/brick1 status
>>> it outputs
>>> volume replace-brick: failed: Another transaction could be in progress. Please try again after sometime.
>>> and the following lines are written in cli.log:
>>> [2015-01-06 12:32:14.595387] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
>>> [2015-01-06 12:32:14.595434] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
>>> [2015-01-06 12:32:14.595590] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
>>> [2015-01-06 12:32:14.595606] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
>>> [2015-01-06 12:32:14.596013] I [cli-cmd-volume.c:1706:cli_check_gsync_present] 0-: geo-replication not installed
>>> [2015-01-06 12:32:14.602165] I [cli-rpc-ops.c:2162:gf_cli_replace_brick_cbk] 0-cli: Received resp to replace brick
>>> [2015-01-06 12:32:14.602248] I [input.c:36:cli_batch] 0-: Exiting with: -1
>>>
>>> What am I doing wrong?
>>
>> Can you please share the glusterd log?
>>
>> ~Atin
>>
>>> Many thanks,
>>>
>>> Alessandro.
>>>
>>> gluster volume info md1 outputs:
>>> Volume Name: md1
>>> Type: Distributed-Replicate
>>> Volume ID: 6da4b915-1def-4df4-a41c-2f3300ebf16b
>>> Status: Started
>>> Number of Bricks: 2 x 2 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: tsunami1:/data/glusterfs/md1/brick1
>>> Brick2: tsunami2:/data/glusterfs/md1/brick1
>>> Brick3: tsunami3:/data/glusterfs/md1/brick1
>>> Brick4: tsunami4:/data/glusterfs/md1/brick1
>>> Options Reconfigured:
>>> server.allow-insecure: on
>>> cluster.read-hash-mode: 2
>>> features.quota: off
>>> nfs.disable: on
>>> performance.cache-size: 512MB
>>> performance.io-thread-count: 64
>>> performance.flush-behind: off
>>> performance.write-behind-window-size: 4MB
>>> performance.write-behind: on
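[Note: the peer-status check Atin asks for, and the usual way to clear a stale cluster lock left behind by an aborted replace-brick, look roughly like the following. The host owning the rejecting UUID has to be read from the peer status output or from glusterd.info, so treat this as a sketch.]

    # On any node: all peers should report "Peer in Cluster (Connected)".
    # The output also maps each hostname to its UUID, so the peer that sent
    # the lock RJT (b1aa773f-f9f4-491c-9493-00b23d5ee380) can be identified.
    gluster peer status

    # On that peer: the local UUID is stored in glusterd.info. Restarting
    # glusterd there is the usual way to drop a stale cluster-wide lock.
    cat /var/lib/glusterd/glusterd.info
    service glusterd restart    # or: systemctl restart glusterd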