On Mon, Jul 25, 2016 at 4:37 PM, B.K.Raghuram <bkrram at gmail.com> wrote:

> Atin,
>
> Couple of quick questions about the upgrade and in general about the
> meaning of some of the parameters in the glusterd dir..
>
> - I don't see the quota-version in the volume info file post upgrade, so
> did the upgrade not go through properly?

If you are seeing a checksum issue, you'd need to copy the same volume info
file to the node where the checksum went wrong and then restart the glusterd
service. And yes, this looks like a bug in quota. @Mani - time to chip in :)

> - What does the op-version in the volume info file mean? Does this have any
> correlation with the cluster op-version? Does it change with an upgrade?

The volume's op-version is different from the cluster op-version. It is
basically used for checking client compatibility, and it shouldn't change
with an upgrade, AFAIK and as far as I remember from the code.

> - A more basic question - should all peer probes always be done from the
> same node or can they be done from any node that is already in the cluster?
> The reason I ask is when I tried to do what was said in
> http://gluster-documentations.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/
> the initial cluster was initiated from node A with 5 other peers. Then post
> upgrade, node B which was in the cluster got a peer rejected. So I deleted
> all the files except glusterd.info and then did a peer probe of A from B.
> Then when I ran a peer status on A, it only showed one node, B. Should I
> have probed B from A instead?

Peer probe can be done from any node in the trusted storage pool, so that's
really not the issue. Ensure that the peer file contents
(/var/lib/glusterd/peers) are the same on all the nodes, where only the self
UUID differs; restarting the glusterd service should then solve the problem.

> On Sat, Jul 23, 2016 at 10:48 AM, Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>> I suspect it is the new quota-version introduced in the volume info
>> file, which may have resulted in a checksum mismatch and, in turn, in
>> peer rejection. But we can confirm it from the log files and the
>> respective info file content.
>>
>> On Saturday 23 July 2016, B.K.Raghuram <bkrram at gmail.com> wrote:
>>
>>> Unfortunately, the setup is at a customer's place which is not remotely
>>> accessible. Will try and get it by early next week. But could it just be a
>>> mismatch of the /var/lib/glusterd files?
>>>
>>> On Fri, Jul 22, 2016 at 8:07 PM, Atin Mukherjee <amukherj at redhat.com>
>>> wrote:
>>>
>>>> Glusterd logs from all the nodes please?
>>>>
>>>> On Friday 22 July 2016, B.K.Raghuram <bkrram at gmail.com> wrote:
>>>>
>>>>> When we upgrade some nodes from 3.6.1 to 3.7.13, some of the nodes
>>>>> give a peer status of "peer rejected" while some don't. Is there a reason
>>>>> for this discrepancy and will the steps mentioned in
>>>>> http://gluster-documentations.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/
>>>>> work for this as well?
>>>>>
>>>>> Just out of curiosity, why the line "Try the whole procedure a couple
>>>>> more times if it doesn't work right away." in the link above?
>>>>
>>>> --
>>>> Atin
>>
>> --
>> Atin

--
--Atin
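The checksum advice above can be checked by hand before copying anything. What follows is only a minimal sketch, not glusterd's own logic: it assumes the volume info file is a plain list of `key=value` lines, and the helper names `parse_kv` and `diff_info` as well as the sample contents are made up for illustration. It lists the keys whose values differ between the copy from a healthy node and the copy from the rejected node, which is where a key such as quota-version would show up if it exists on one side only.

```python
def parse_kv(text):
    """Parse 'key=value' lines into a dict, skipping blank or malformed lines."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        result[key] = value
    return result


def diff_info(good_text, bad_text):
    """Return {key: (good_value, bad_value)} for every key that differs.

    A value of None means the key is missing on that side.
    """
    good, bad = parse_kv(good_text), parse_kv(bad_text)
    return {
        key: (good.get(key), bad.get(key))
        for key in sorted(set(good) | set(bad))
        if good.get(key) != bad.get(key)
    }


if __name__ == "__main__":
    # Hypothetical sample contents, not taken from a real cluster.
    node_a = "type=2\nversion=3\nop-version=2\nquota-version=0\n"
    node_b = "type=2\nversion=3\nop-version=2\n"
    for key, (a, b) in diff_info(node_a, node_b).items():
        print(f"{key}: node A={a!r}, node B={b!r}")
```

In practice the two strings would be read from copies of the info file collected from each node. Assuming the files under /var/lib/glusterd/peers use the same `key=value` layout, the same comparison could be applied to them as well.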
Manikandan Selvaganesh
2016-Jul-25 12:11 UTC
[Gluster-users] Issue when upgrading from 3.6 to 3.7
Hi,

Could you please attach the vol files, log files and the output of gluster v info?

--
Regards,
Manikandan Selvaganesh.