On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI <cuculovic at mdpi.com> wrote:

> Ah, damn! I found the issue. On the storage server, the storage2 IP
> address was wrong: I swapped two digits in the /etc/hosts file. Sorry
> for that :(
>
> I was able to add the brick now and I started the heal, but there is
> still no data transfer visible.

@Ravi/Pranith - can you help here?

> By doing gluster volume status, I have
>
> Status of volume: storage
> Gluster process                        TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick storage2:/data/data-cluster      49152     0          Y       23101
> Brick storage:/data/data-cluster       49152     0          Y       30773
> Self-heal Daemon on localhost          N/A       N/A        Y       30050
> Self-heal Daemon on storage            N/A       N/A        Y       30792
>
> Any idea?
>
> On storage I have:
> Number of Peers: 1
>
> Hostname: 195.65.194.217
> Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
> State: Peer in Cluster (Connected)
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
>
> On 08.12.2016 13:55, Atin Mukherjee wrote:
>
>> Can you resend the attachment as zip? I am unable to extract the
>> content. We shouldn't have 0 info file. What does gluster peer status
>> output say?
>>
>> On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>>> I hope you received my last email Atin, thank you!
>>>
>>> - Kindest regards,
>>>
>>> Milos Cuculovic
>>>
>>> On 08.12.2016 10:28, Atin Mukherjee wrote:
>>>
>>>> ---------- Forwarded message ----------
>>>> From: Atin Mukherjee <amukherj at redhat.com>
>>>> Date: Thu, Dec 8, 2016 at 11:56 AM
>>>> Subject: Re: [Gluster-users] Replica brick not working
>>>> To: Ravishankar N <ravishankar at redhat.com>
>>>> Cc: Miloš Čučulović - MDPI <cuculovic at mdpi.com>,
>>>>     Pranith Kumar Karampuri <pkarampu at redhat.com>,
>>>>     gluster-users <gluster-users at gluster.org>
>>>>
>>>> On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>>>> <ravishankar at redhat.com> wrote:
>>>>
>>>>> On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>>>>>
>>>>>> From the log snippet:
>>>>>>
>>>>>> [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
>>>>>> [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
>>>>>> [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
>>>>>>
>>>>>> The last log entry indicates that we hit the code path in
>>>>>> gd_addbr_validate_replica_count ()
>>>>>>
>>>>>>     if (replica_count == volinfo->replica_count) {
>>>>>>             if (!(total_bricks % volinfo->dist_leaf_count)) {
>>>>>>                     ret = 1;
>>>>>>                     goto out;
>>>>>>             }
>>>>>>     }
>>>>>
>>>>> It seems unlikely that this snippet was hit, because we print the
>>>>> E [MSGID: 106291] in the above message only if ret == -1.
>>>>> gd_addbr_validate_replica_count() returns -1 without populating
>>>>> err_str only when volinfo->type doesn't match any of the known
>>>>> volume types, so volinfo->type is corrupted perhaps?
>>>>
>>>> You are right, I missed that ret is set to 1 here in the above
>>>> snippet.
>>>>
>>>> @Milos - Can you please provide us the volume info file from
>>>> /var/lib/glusterd/vols/<volname>/ from all the three nodes to
>>>> continue the analysis?
>>>>
>>>>> -Ravi
>>>>>
>>>>>> @Pranith, Ravi - Milos was trying to convert a dist (1 X 1)
>>>>>> volume to a replicate (1 X 2) using add brick and hit this issue
>>>>>> where add-brick failed. The cluster is operating with 3.7.6.
>>>>>> Could you help on what scenario this code path can be hit? One
>>>>>> straightforward issue I see here is the missing err_str in this
>>>>>> path.
>>>>
>>>> --
>>>> ~ Atin (atinm)

--
~ Atin (atinm)
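The root cause reported above was a mistyped storage2 address in /etc/hosts on one node. A minimal sketch of how such an entry could be checked follows; the helper name `hosts_lookup` is hypothetical (not a Gluster tool) and it only reads a hosts-format file, it does not query DNS.

```shell
#!/bin/sh
# hosts_lookup <hosts-file> <hostname>
# Hypothetical helper: print every address that a hosts-format file maps
# to the given name, so it can be compared with the peer's real IP.
hosts_lookup() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                  # strip trailing comments
        {
            for (i = 2; i <= NF; i++)       # fields 2..NF are names and aliases
                if ($i == name) print $1    # field 1 is the address
        }
    ' "$1"
}

# Example, to be run on each node:
#   hosts_lookup /etc/hosts storage2
#   getent hosts storage2     # what the system resolver actually returns
```

If the address printed differs between nodes, or from the peer's actual IP, the entry is worth correcting before retrying add-brick.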
Miloš Čučulović - MDPI
2016-Dec-08 14:22 UTC
[Gluster-users] Fwd: Replica brick not working
A note to add: when running sudo gluster volume heal storage info, I get more than 2k for "Number of entries", but there are no files on the storage server (the replica).

- Kindest regards,

Milos Cuculovic
IT Manager

On 08.12.2016 14:23, Atin Mukherjee wrote:

> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
> <cuculovic at mdpi.com> wrote:
>
>> I was able to add the brick now, I started the heal, but still no
>> data transfer visible.
>
> @Ravi/Pranith - can you help here?
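To see which brick is reporting the pending entries mentioned above, the heal info output can be summarised per brick. This is a rough sketch; the function name `heal_entries_per_brick` is hypothetical, and it assumes the output contains lines of the form "Brick <host>:<path>" followed later by "Number of entries: <n>", which may differ between Gluster versions.

```shell
#!/bin/sh
# heal_entries_per_brick
# Hypothetical helper: read `gluster volume heal <volname> info` output on
# stdin and print "<brick> <count>" for every brick section found.
heal_entries_per_brick() {
    awk '
        /^Brick /             { brick = $2 }        # remember the current brick
        /^Number of entries:/ { print brick, $NF }  # last field is the count
    '
}

# Example:
#   sudo gluster volume heal storage info | heal_entries_per_brick
```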
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:

> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
> <cuculovic at mdpi.com> wrote:
>
>> Ah, damn! I found the issue. On the storage server, the storage2
>> IP address was wrong, I inversed two digits in the /etc/hosts
>> file, sorry for that :(
>>
>> I was able to add the brick now, I started the heal, but still no
>> data transfer visible.

1. Are the files getting created on the new brick though?
2. Can you provide the output of `getfattr -d -m . -e hex /data/data-cluster` on both bricks?
3. Is it possible to attach gdb to the self-heal daemon on the original (old) brick and get a backtrace?
   `gdb -p <pid of self-heal daemon on the original brick>`
   thread apply all bt  --> share this output
   quit gdb.

-Ravi

> @Ravi/Pranith - can you help here?
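Steps 2 and 3 of the reply above can be scripted roughly as follows. This is a sketch under assumptions: `shd_pid` is a hypothetical helper that relies on the `gluster volume status` layout quoted earlier in the thread (PID in the last column), and the output file name is only an example.

```shell
#!/bin/sh
# shd_pid
# Hypothetical helper: read `gluster volume status <volname>` output on
# stdin and print the PID of the self-heal daemon running on this node.
shd_pid() {
    awk '/^Self-heal Daemon on localhost/ { print $NF }'
}

# Example, run as root on the node that hosts the original brick:
#   getfattr -d -m . -e hex /data/data-cluster
#   pid=$(gluster volume status storage | shd_pid)
#   gdb -p "$pid" -batch -ex 'thread apply all bt' > shd-backtrace.txt
```

Running gdb with `-batch` executes the given command and then exits, which detaches from the daemon without an interactive session.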