Manikandan Selvaganesh
2016-Jul-27 10:19 UTC
[Gluster-users] Issue when upgrading from 3.6 to 3.7
Hi,

Sorry for the delay. From your config files, the operating-version in
/var/lib/glusterd/glusterd.info is still 30700. We implemented
quota-versioning in 3.7.6, and another feature (improving quota
enable/disable performance) was implemented in 3.7.12.

To use these features, you need to bump up the op-version after the
upgrade by running
'gluster v set all cluster.op-version 30712' (in the case of 3.7.12).
I guess this would fix the problem you reported. Let us know otherwise.
If this does not fix the issue, please get back to us with the logs.

--
Regards,
Manikandan Selvaganesh.

On Wed, Jul 27, 2016 at 10:51 AM, Manikandan Selvaganesh
<mselvaga at redhat.com> wrote:

> Hi Ram,
>
> Apologies. I was stuck on something else. I will update you by EOD.
>
> On Wed, Jul 27, 2016 at 10:11 AM, B.K.Raghuram <bkrram at gmail.com> wrote:
>
>> Hi Manikandan,
>>
>> Did you have a chance to look at the glusterd config files? We've tried
>> a couple of times to upgrade from 3.6.1 and the vol info files never
>> seem to get a quota-version flag in them. One of our installations is
>> stuck at the old version because of potential upgrade issues to 3.7.13.
>>
>> Thanks,
>> -Ram
>>
>> On Mon, Jul 25, 2016 at 6:40 PM, Manikandan Selvaganesh
>> <mselvaga at redhat.com> wrote:
>>
>>> Hi,
>>>
>>> It would work fine with the upgraded setup on a fresh install. And yes,
>>> if quota-version is not present it would cause malfunctions such as
>>> checksum issues and peer rejection, and quota would not work properly.
>>> This quota-version was introduced recently; it adds a suffix to the
>>> quota-related extended attributes.
>>>
>>> On Jul 25, 2016 6:36 PM, "B.K.Raghuram" <bkrram at gmail.com> wrote:
>>>
>>>> Manikandan,
>>>>
>>>> We just overwrote the setup with a fresh install and there I see the
>>>> quota-version in the volume info file. For the upgraded setup, I only
>>>> have the /var/lib/glusterd, which I'm attaching. Once we recreate
>>>> this, I'll send you the rest of the info.
>>>>
>>>> However, is there an issue if the quota-version is not in the info
>>>> file? Will it cause the quota functionality to malfunction?
>>>>
>>>> On Mon, Jul 25, 2016 at 5:41 PM, Manikandan Selvaganesh
>>>> <mselvaga at redhat.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Could you please attach the vol files, log files and the output of
>>>>> gluster v info?
>>>>>
>>>>> On Mon, Jul 25, 2016 at 5:35 PM, Atin Mukherjee
>>>>> <amukherj at redhat.com> wrote:
>>>>>
>>>>>> On Mon, Jul 25, 2016 at 4:37 PM, B.K.Raghuram <bkrram at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Atin,
>>>>>>>
>>>>>>> Couple of quick questions about the upgrade and in general about
>>>>>>> the meaning of some of the parameters in the glusterd dir.
>>>>>>>
>>>>>>> - I don't see the quota-version in the volume info file post
>>>>>>> upgrade, so did the upgrade not go through properly?
>>>>>>
>>>>>> If you are seeing a checksum issue, you'd need to copy the same
>>>>>> volume info file to the node where the checksum went wrong and then
>>>>>> restart the glusterd service.
>>>>>> And yes, this looks like a bug in quota. @Mani - time to chip in :)
>>>>>>
>>>>>>> - What does the op-version in the volume info file mean? Does this
>>>>>>> have any correlation with the cluster op-version? Does it change
>>>>>>> with an upgrade?
>>>>>>
>>>>>> The volume's op-version is different. It is basically used for
>>>>>> checking client compatibility, and it shouldn't change with an
>>>>>> upgrade, AFAIK and as far as I remember from the code.
>>>>>>
>>>>>>> - A more basic question - should all peer probes always be done
>>>>>>> from the same node or can they be done from any node that is
>>>>>>> already in the cluster? The reason I ask is when I tried to do
>>>>>>> what was said in
>>>>>>> http://gluster-documentations.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/
>>>>>>> the initial cluster was initiated from node A with 5 other peers.
>>>>>>> Then post upgrade, node B which was in the cluster got a peer
>>>>>>> rejected. So I deleted all the files except glusterd.info and then
>>>>>>> did a peer probe of A from B. Then when I ran a peer status on A,
>>>>>>> it only showed one node, B. Should I have probed B from A instead?
>>>>>>
>>>>>> Peer probe can be done from any node in the trusted storage pool,
>>>>>> so that's really not the issue. Ensure the peer file contents
>>>>>> (/var/lib/glusterd/peers) are the same throughout, with only the
>>>>>> node's own UUID differing; then restarting the glusterd service
>>>>>> should solve the problem.
>>>>>>
>>>>>>> On Sat, Jul 23, 2016 at 10:48 AM, Atin Mukherjee
>>>>>>> <amukherj at redhat.com> wrote:
>>>>>>>
>>>>>>>> I suspect it is the new quota-version introduced in the volume
>>>>>>>> info file, which may have caused a checksum mismatch resulting in
>>>>>>>> peer rejection. But we can confirm it from the log files and the
>>>>>>>> respective info file content.
>>>>>>>>
>>>>>>>> On Saturday 23 July 2016, B.K.Raghuram <bkrram at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Unfortunately, the setup is at a customer's place which is not
>>>>>>>>> remotely accessible. Will try and get it by early next week. But
>>>>>>>>> could it just be a mismatch of the /var/lib/glusterd files?
>>>>>>>>>
>>>>>>>>> On Fri, Jul 22, 2016 at 8:07 PM, Atin Mukherjee
>>>>>>>>> <amukherj at redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>> Glusterd logs from all the nodes please?
>>>>>>>>>>
>>>>>>>>>> On Friday 22 July 2016, B.K.Raghuram <bkrram at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> When we upgrade some nodes from 3.6.1 to 3.7.13, some of the
>>>>>>>>>>> nodes give a peer status of "peer rejected" while some don't.
>>>>>>>>>>> Is there a reason for this discrepancy and will the steps
>>>>>>>>>>> mentioned in
>>>>>>>>>>> http://gluster-documentations.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/
>>>>>>>>>>> work for this as well?
>>>>>>>>>>>
>>>>>>>>>>> Just out of curiosity, why the line "Try the whole procedure a
>>>>>>>>>>> couple more times if it doesn't work right away." in the link
>>>>>>>>>>> above?
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Atin
>>>>>>>>>> Sent from iPhone
>>>>>>
>>>>>> --
>>>>>> --Atin
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Manikandan Selvaganesh.
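As a rough illustration of the op-version check described in this message, here is a minimal Python sketch. It assumes that glusterd.info is a plain key=value file and that the op-version number encodes the release as major*10000 + minor*100 + patch, which is inferred from the 30700 and 30712 values quoted in the thread. The function names and the sample file contents are hypothetical. The script only reads and compares; the actual bump is still done with the gluster command quoted in the message, and a given release may not define an op-version of its own (the thread uses 30712 on a cluster upgraded to 3.7.13).

```python
def release_to_op_version(release: str) -> int:
    """Encode 'major.minor.patch' as an op-version, e.g. '3.7.12' -> 30712.

    Assumption: this encoding is inferred from the values in the thread.
    """
    major, minor, patch = (int(part) for part in release.split("."))
    return major * 10000 + minor * 100 + patch


def read_operating_version(info_text: str) -> int:
    """Return the operating-version value from glusterd.info-style text."""
    for line in info_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "operating-version":
            return int(value.strip())
    raise ValueError("operating-version not found")


def needs_bump(info_text: str, target_release: str) -> bool:
    """True if the recorded op-version is older than the target release's."""
    return read_operating_version(info_text) < release_to_op_version(target_release)


# Hypothetical file contents; a real node would read
# /var/lib/glusterd/glusterd.info instead.
SAMPLE_INFO = (
    "UUID=00000000-0000-0000-0000-000000000000\n"
    "operating-version=30700\n"
)

if __name__ == "__main__":
    print("current op-version:", read_operating_version(SAMPLE_INFO))
    print("target op-version:", release_to_op_version("3.7.12"))
    print("bump needed:", needs_bump(SAMPLE_INFO, "3.7.12"))
```

Running the same check on every node after an upgrade would show which ones still report the old operating-version.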
Thanks a lot! Yes, I did upgrade to 3.7.13 but was unaware of the new
cluster op-version. Could this incorrect op-version have been the cause
of some of the peers being in the rejected state after an upgrade?

On Wed, Jul 27, 2016 at 3:49 PM, Manikandan Selvaganesh
<mselvaga at redhat.com> wrote:

> Hi,
>
> Sorry for the delay. From your config files, the operating-version in
> /var/lib/glusterd/glusterd.info is still 30700. We implemented
> quota-versioning in 3.7.6, and another feature (improving quota
> enable/disable performance) was implemented in 3.7.12.
>
> To use these features, you need to bump up the op-version after the
> upgrade by running
> 'gluster v set all cluster.op-version 30712' (in the case of 3.7.12).
> I guess this would fix the problem you reported. Let us know otherwise.
> If this does not fix the issue, please get back to us with the logs.
>
> --
> Regards,
> Manikandan Selvaganesh.
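The checksum mismatch suspected earlier in the thread comes down to volume info files that differ between nodes, for example one that lacks a quota-version entry. A minimal Python sketch of comparing two such files follows. It assumes the info file is plain key=value text; the sample contents are illustrative and not taken from a real cluster, and the helper names are hypothetical. It does not reproduce the checksum glusterd computes; it only lists the keys whose values differ.

```python
def parse_info(text: str) -> dict:
    """Parse key=value lines into a dict, ignoring lines without '='."""
    entries = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def diff_info(local_text: str, peer_text: str) -> dict:
    """Map each differing key to its (local, peer) values; None means missing."""
    local, peer = parse_info(local_text), parse_info(peer_text)
    return {
        key: (local.get(key), peer.get(key))
        for key in sorted(local.keys() | peer.keys())
        if local.get(key) != peer.get(key)
    }


# Illustrative contents only: an upgraded node whose info file has no
# quota-version entry, and a freshly installed node that has one.
UPGRADED_NODE = "type=2\nstatus=1\nversion=3\nop-version=3\n"
FRESH_NODE = "type=2\nstatus=1\nversion=3\nop-version=3\nquota-version=1\n"

if __name__ == "__main__":
    for key, (local, peer) in diff_info(UPGRADED_NODE, FRESH_NODE).items():
        print(f"{key}: local={local!r} peer={peer!r}")
```

Any key reported here is a candidate explanation for a rejected peer, to be confirmed against the glusterd logs as requested in the thread.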