On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N <ravishankar at redhat.com>
wrote:
> On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>
>> From the log snippet:
>>
>> [2016-12-07 09:15:35.677645] I [MSGID: 106482]
>> [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management:
>> Received add brick req
>> [2016-12-07 09:15:35.677708] I [MSGID: 106062]
>> [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management:
>> replica-count is 2
>> [2016-12-07 09:15:35.677735] E [MSGID: 106291]
>> [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
>>
>> The last log entry indicates that we hit the code path in
>> gd_addbr_validate_replica_count ()
>>
>> if (replica_count == volinfo->replica_count) {
>>         if (!(total_bricks % volinfo->dist_leaf_count)) {
>>                 ret = 1;
>>                 goto out;
>>         }
>> }
>>
>>
> It seems unlikely that this snippet was hit, because we print the E [MSGID:
> 106291] message above only if ret == -1.
> gd_addbr_validate_replica_count() returns -1 without populating err_str
> only when volinfo->type doesn't match any of the known volume types, so
> perhaps volinfo->type is corrupted?
>
You are right, I missed that ret is set to 1 here in the above snippet.
@Milos - Can you please provide us the volume info file from
/var/lib/glusterd/vols/<volname>/ from all the three nodes to continue the
analysis?
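
To make the suspected failure mode concrete, here is a minimal sketch. It is
not the glusterd source: sketch_validate(), the SKETCH_* types and the error
text are all hypothetical. It only shows how a validator that switches on the
volume type can return -1 with err_str left empty when the type matches none
of the known cases.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical volume types, for illustration only (not glusterd's enum). */
enum sketch_vol_type { SKETCH_DISTRIBUTE = 0, SKETCH_REPLICATE = 1 };

/*
 * Hypothetical validator: returns 0 on success and -1 on failure.
 * err_str is written only in branches that know what went wrong.
 */
static int
sketch_validate (int type, int replica_count, char *err_str, size_t len)
{
        int ret = -1;

        switch (type) {
        case SKETCH_DISTRIBUTE:
                ret = 0;
                break;
        case SKETCH_REPLICATE:
                if (replica_count < 2) {
                        snprintf (err_str, len,
                                  "replica count %d is too small",
                                  replica_count);
                        break;
                }
                ret = 0;
                break;
        /* No default case: an unknown type leaves ret == -1 and
         * err_str untouched, so the caller logs an empty message. */
        }

        return ret;
}
```

Calling sketch_validate() with an unknown type such as 42 returns -1 and
leaves err_str as it was, which would match an error log line with no text.
In this sketch, a default case that fills err_str would close that gap.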
>
> -Ravi
>
>> @Pranith, Ravi - Milos was trying to convert a dist (1 X 1) volume to a
>> replicate (1 X 2) using add-brick and hit this issue where add-brick
>> failed. The cluster is running 3.7.6. Could you help identify the
>> scenario in which this code path can be hit? One straightforward issue I
>> see here is the missing err_str in this path.
>>
>>
>>
>
--
~ Atin (atinm)