Milos,

I just managed to take a look into a similar issue and my analysis is at
[1]. I remember you mentioning some incorrect /etc/hosts entries which led
to this same problem in an earlier case; do you mind rechecking the same?

[1] http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html

On Wed, Dec 14, 2016 at 2:57 AM, Miloš Čučulović - MDPI <cuculovic at mdpi.com> wrote:
> Hi All,
>
> Moving forward with my issue, sorry for the late reply!
>
> I had some issues with the storage2 server (original volume), then decided
> to use 3.9.0, so I have the latest version.
>
> For that, I synced all the files manually to the storage server. I
> installed gluster 3.9.0 there, started it, created a new volume called
> storage, and all seems to work OK.
>
> Now I need to create my replicated volume (add a new brick on the storage2
> server). Almost all the files are there. So, on the storage server, I ran:
>
> * sudo gluster peer probe storage2
> * sudo gluster volume add-brick storage replica 2
>   storage2:/data/data-cluster force
>
> But there I am receiving "volume add-brick: failed: Host storage2 is not
> in 'Peer in Cluster' state"
>
> Any idea?
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
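(A quick way to narrow this down before retrying add-brick is sketched below; the hostnames and brick path are the ones used in this thread, and the exact output wording can differ between gluster releases, so treat it as a reference only.)

  # On the node where add-brick is run ("storage" in this thread):
  getent hosts storage2        # must resolve to the intended IP of storage2
  sudo gluster peer status     # storage2 must show "Peer in Cluster (Connected)"
  sudo gluster pool list       # same information, one line per peer
  # Only once the peer is in that state:
  sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force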
> On 08.12.2016 17:52, Ravishankar N wrote:
>> On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
>>> I was able to fix the sync by rsync-ing all the directories, then the
>>> heal started. The next problem :) is that as soon as there are files on
>>> the new brick, the gluster mount starts serving reads from that brick
>>> too, and the new brick is not ready yet (the sync is not done), so files
>>> appear to be missing on the client side. I temporarily removed the new
>>> brick; now I am running a manual rsync and will add the brick again,
>>> hoping this will work.
>>>
>>> What mechanism manages this? I guess there is something built in to make
>>> a replica brick available only once the data is completely synced.
>>>
>> This mechanism was introduced in 3.7.9 or 3.7.10
>> (http://review.gluster.org/#/c/13806/). Before that version, you needed
>> to set some xattrs on the bricks manually so that healing could happen in
>> parallel while the client would still serve reads from the original
>> brick. I can't find the link to the doc which describes these steps for
>> setting the xattrs. :-(
>>
>> Calling it a day,
>> Ravi
>>
>>> On 08.12.2016 16:17, Ravishankar N wrote:
>>>> On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
>>>>> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
>>>>> <cuculovic at mdpi.com> wrote:
>>>>>
>>>>>     Ah, damn! I found the issue. On the storage server, the storage2
>>>>>     IP address was wrong; I had swapped two digits in the /etc/hosts
>>>>>     file, sorry for that :(
>>>>>
>>>>>     I was able to add the brick now and I started the heal, but still
>>>>>     no data transfer is visible.
>>>>>
>>>> 1. Are the files getting created on the new brick though?
>>>> 2. Can you provide the output of `getfattr -d -m . -e hex
>>>>    /data/data-cluster` on both bricks?
>>>> 3. Is it possible to attach gdb to the self-heal daemon on the original
>>>>    (old) brick and get a backtrace?
>>>>    `gdb -p <pid of self-heal daemon on the original brick>`
>>>>    thread apply all bt  --> share this output
>>>>    quit gdb
>>>>
>>>> -Ravi
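(The data requested just above could be gathered non-interactively along these lines; this is only a sketch, the brick path is the one from this thread, <SHD_PID> stands for the Self-heal Daemon PID reported by `gluster volume status`, and gdb must be installed on that node.)

  # On each brick node: dump the extended attributes of the brick root.
  getfattr -d -m . -e hex /data/data-cluster > /tmp/$(hostname)-brick-xattrs.txt

  # On the original (old) brick node: backtrace of the self-heal daemon.
  sudo gdb -p <SHD_PID> -batch -ex 'thread apply all bt' > /tmp/$(hostname)-shd-bt.txt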
>>>>> @Ravi/Pranith - can you help here?
>>>>>
>>>>>     By doing gluster volume status, I have
>>>>>
>>>>>     Status of volume: storage
>>>>>     Gluster process                     TCP Port  RDMA Port  Online  Pid
>>>>>     ----------------------------------------------------------------------
>>>>>     Brick storage2:/data/data-cluster   49152     0          Y       23101
>>>>>     Brick storage:/data/data-cluster    49152     0          Y       30773
>>>>>     Self-heal Daemon on localhost       N/A       N/A        Y       30050
>>>>>     Self-heal Daemon on storage         N/A       N/A        Y       30792
>>>>>
>>>>>     Any idea?
>>>>>
>>>>>     On storage I have:
>>>>>     Number of Peers: 1
>>>>>
>>>>>     Hostname: 195.65.194.217
>>>>>     Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
>>>>>     State: Peer in Cluster (Connected)
>>>>>
>>>>>     On 08.12.2016 13:55, Atin Mukherjee wrote:
>>>>>
>>>>>         Can you resend the attachment as zip? I am unable to extract
>>>>>         the content. We shouldn't have a 0-byte info file. What does
>>>>>         gluster peer status output say?
>>>>>
>>>>>         On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>>>>>         <cuculovic at mdpi.com> wrote:
>>>>>
>>>>>             I hope you received my last email, Atin, thank you!
>>>>>
>>>>>             - Milos Cuculovic
>>>>>
>>>>>             On 08.12.2016 10:28, Atin Mukherjee wrote:
>>>>>
>>>>>                 ---------- Forwarded message ----------
>>>>>                 From: Atin Mukherjee <amukherj at redhat.com>
>>>>>                 Date: Thu, Dec 8, 2016 at 11:56 AM
>>>>>                 Subject: Re: [Gluster-users] Replica brick not working
>>>>>                 To: Ravishankar N <ravishankar at redhat.com>
>>>>>                 Cc: Miloš Čučulović - MDPI <cuculovic at mdpi.com>,
>>>>>                     Pranith Kumar Karampuri <pkarampu at redhat.com>,
>>>>>                     gluster-users <gluster-users at gluster.org>
>>>>>
>>>>>                 On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>>>>>                 <ravishankar at redhat.com> wrote:
>>>>>
>>>>>                     On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>>>>>
>>>>>                         From the log snippet:
>>>>>
>>>>>                         [2016-12-07 09:15:35.677645] I [MSGID: 106482]
>>>>>                         [glusterd-brick-ops.c:442:__glusterd_handle_add_brick]
>>>>>                         0-management: Received add brick req
>>>>>                         [2016-12-07 09:15:35.677708] I [MSGID: 106062]
>>>>>                         [glusterd-brick-ops.c:494:__glusterd_handle_add_brick]
>>>>>                         0-management: replica-count is 2
>>>>>                         [2016-12-07 09:15:35.677735] E [MSGID: 106291]
>>>>>                         [glusterd-brick-ops.c:614:__glusterd_handle_add_brick]
>>>>>                         0-management:
>>>>>
>>>>>                         The last log entry indicates that we hit the
>>>>>                         code path in gd_addbr_validate_replica_count ():
>>>>>
>>>>>                             if (replica_count == volinfo->replica_count) {
>>>>>                                 if (!(total_bricks % volinfo->dist_leaf_count)) {
>>>>>                                     ret = 1;
>>>>>                                     goto out;
>>>>>                                 }
>>>>>                             }
>>>>>
>>>>>                     It seems unlikely that this snippet was hit, because
>>>>>                     we print the E [MSGID: 106291] message above only if
>>>>>                     ret == -1. gd_addbr_validate_replica_count() returns
>>>>>                     -1 without populating err_str only when volinfo->type
>>>>>                     does not match any of the known volume types, so
>>>>>                     perhaps volinfo->type is corrupted?
>>>>>
>>>>>                 You are right, I missed that ret is set to 1 here in the
>>>>>                 above snippet.
>>>>>
>>>>>                 @Milos - Can you please provide us the volume info file
>>>>>                 from /var/lib/glusterd/vols/<volname>/ on all three nodes
>>>>>                 so that we can continue the analysis?
>>>>>
>>>>>                     -Ravi
>>>>>
>>>>>                         @Pranith, Ravi - Milos was trying to convert a
>>>>>                         distribute (1 x 1) volume to a replicate (1 x 2)
>>>>>                         volume using add-brick and hit this issue where
>>>>>                         add-brick failed. The cluster is running 3.7.6.
>>>>>                         Could you help on what scenario this code path
>>>>>                         can be hit? One straightforward issue I see here
>>>>>                         is the missing err_str in this path.
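(If it helps, one way to package what is requested above, as a sketch only: the volume name is "storage" as used in this thread, and `zip` may need to be installed, otherwise tar works as well.)

  # Run on every node and attach the resulting archive to the reply.
  zip -r /tmp/$(hostname)-vols-storage.zip /var/lib/glusterd/vols/storage/
  # or, without zip:
  tar czf /tmp/$(hostname)-vols-storage.tar.gz /var/lib/glusterd/vols/storage/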
--
~ Atin (atinm)
Miloš Čučulović - MDPI
2016-Dec-14 08:04 UTC
[Gluster-users] Fwd: Replica brick not working
Atin, I was able to move forward a bit. Initially, I had this:

sudo gluster peer status
Number of Peers: 1

Hostname: storage2
Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
State: Peer Rejected (Connected)

Then, on storage2, I removed everything from /var/lib/glusterd except the
info file. Now I am getting a different state:

sudo gluster peer status
Number of Peers: 1

Hostname: storage2
Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
State: Sent and Received peer request (Connected)

But the add-brick is still not working. I checked the hosts file and all
seems OK; ping is also working well.
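(For context, the step described above, clearing /var/lib/glusterd down to the info file, is part of a commonly documented recovery sequence for a rejected peer. The sketch below assumes the kept file is glusterd.info and that the service is named glusterd under systemd; both can differ depending on distribution and packaging, so verify before running it.)

  # On the rejected peer (storage2 in this thread):
  sudo systemctl stop glusterd
  cd /var/lib/glusterd
  sudo find . -mindepth 1 -maxdepth 1 ! -name 'glusterd.info' -exec rm -rf {} +
  sudo systemctl start glusterd
  sudo gluster peer probe storage      # re-probe the healthy node
  sudo gluster peer status             # wait for "Peer in Cluster (Connected)"
  sudo systemctl restart glusterd      # the documented steps end with one more restart
  sudo gluster peer status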
The thing I also need to know: when adding a new replicated brick, do I
need to sync all the files first, or does the new brick server need to be
empty? Also, do I first need to create the same volume on the new server,
or will adding it to the volume of server1 do that automatically?

- Kindest regards,

Milos Cuculovic
IT Manager

---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic at mdpi.com
Skype: milos.cuculovic.mdpi

On 14.12.2016 05:13, Atin Mukherjee wrote:
> Milos,
>
> I just managed to take a look into a similar issue and my analysis is at
> [1]. I remember you mentioning some incorrect /etc/hosts entries which led
> to this same problem in an earlier case; do you mind rechecking the same?
>
> [1] http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html
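(Related to the question above about whether the new brick must be pre-synced: once add-brick succeeds and the self-heal daemon starts copying data, its progress can be followed with the heal CLI. A sketch, using the volume name from this thread; exact sub-commands can vary slightly between releases.)

  sudo gluster volume heal storage info               # entries still pending heal
  sudo gluster volume heal storage info split-brain   # should stay empty
  sudo gluster volume heal storage full                # optionally trigger a full heal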