Darrell Budic
2019-Apr-03 20:37 UTC
[Gluster-users] [Gluster-devel] Upgrade testing to gluster 6
Hari-

I was upgrading my test cluster from 5.5 to 6 and I hit this bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something similar.
In my case, the workaround did not work, and I was left with a gluster that
had gone into no-quorum mode and stopped all the bricks. There wasn't much
in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were
not the same as the newer versions, so I updated them, restarted glusterd,
and suddenly the updated node showed as peer-in-cluster again. Once I
updated the other nodes the same way, things started working again. Maybe a
place to look?

My old config (all nodes):

volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option ping-timeout 10
option event-threads 1
option rpc-auth-allow-insecure on
# option transport.address-family inet6
# option base-port 49152
end-volume

changed to:

volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket,rdma
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option transport.socket.listen-port 24007
option transport.rdma.listen-port 24008
option ping-timeout 0
option event-threads 1
option rpc-auth-allow-insecure on
# option lock-timer 180
# option transport.address-family inet6
# option base-port 49152
option max-port 60999
end-volume

The only thing I found in the glusterd logs that looks relevant was the
following (repeated for both of the other nodes in this cluster), so no
clue why it happened:

[2019-04-03 20:19:16.802638] I [MSGID: 106004]
[glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
<ossuary-san> (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state <Peer in
Cluster>, has disconnected from glusterd.
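In case it saves someone else some digging, the recovery boiled down to
roughly the following, run on each node in turn. This is only a sketch:
the iptables flush is the workaround mentioned in the thread below (it did
not help in my case), and the paths and the glusterd service name assume a
stock systemd-based packaged install, so adjust for your distribution.

# Workaround from the bug report: flush firewall rules, restart glusterd.
# (Did not help in my case, but it is quick to try.)
iptables -F
systemctl restart glusterd

# What actually worked for me: bring the management volfile in line with
# the one shipped by the new packages, then restart glusterd.
cp /etc/glusterfs/glusterd.vol /etc/glusterfs/glusterd.vol.bak
vi /etc/glusterfs/glusterd.vol    # apply the changes shown above
systemctl restart glusterd

# Confirm the node shows as Peer in Cluster again before moving on.
gluster peer status
gluster volume status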
> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee <atin.mukherjee83 at gmail.com> wrote:
>
> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham <hgowtham at redhat.com> wrote:
> Comments inline.
>
> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay
> <sankarshan.mukhopadhyay at gmail.com> wrote:
> >
> > Quite a considerable amount of detail here. Thank you!
> >
> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham <hgowtham at redhat.com> wrote:
> > >
> > > Hello Gluster users,
> > >
> > > As you are all aware, glusterfs-6 is out. We would like to inform you
> > > that we have spent a significant amount of time testing glusterfs-6 in
> > > upgrade scenarios. We have done upgrade testing to glusterfs-6 from
> > > various releases like 3.12, 4.1 and 5.3.
> > >
> > > As glusterfs-6 has a lot of changes, we wanted to test those portions.
> > > There were xlators (and respective options to enable/disable them)
> > > added and deprecated in glusterfs-6 from various versions [1].
> > >
> > > We had to check the following upgrade scenarios for all such options
> > > identified in [1]:
> > > 1) option never enabled and upgraded
> > > 2) option enabled and then upgraded
> > > 3) option enabled and then disabled and then upgraded
> > >
> > > We weren't able to manually check all the combinations for all the
> > > options, so the options involving enabling and disabling xlators were
> > > prioritized. Below are the results of the ones tested.
> > >
> > > Never enabled and upgraded:
> > > Checked from 3.12, 4.1, 5.3 to 6; the upgrade works.
> > >
> > > Enabled and upgraded:
> > > Tested for tier, which is deprecated. It is not a recommended upgrade.
> > > As expected the volume won't be consumable and will have a few more
> > > issues as well.
> > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade.
> > >
> > > Enabled, then disabled before upgrade:
> > > Tested for tier with 3.12 and the upgrade went fine.
> > >
> > > There is one common issue to note in every upgrade. The node being
> > > upgraded goes into disconnected state. You have to flush the iptables
> > > and then restart glusterd on all nodes to fix this.
> > >
> >
> > Is this something that is written in the upgrade notes? I do not seem
> > to recall; if not, I'll send a PR.
>
> No, this wasn't mentioned in the release notes. PRs are welcome.
>
> >
> > > The testing for enabling new options is still pending. The new options
> > > won't cause as many issues as the deprecated ones, so this was put at
> > > the end of the priority list. It would be nice to get contributions
> > > for this.
> > >
> >
> > Did the range of tests lead to any new issues?
>
> Yes. In the first round of testing we found an issue and had to postpone
> the release of 6 until the fix was made available.
> https://bugzilla.redhat.com/show_bug.cgi?id=1684029
>
> And then we tested it again after this patch was made available,
> and came across this:
> https://bugzilla.redhat.com/show_bug.cgi?id=1694010
>
> This isn't a bug, as we found that the upgrade worked seamlessly in two
> different setups. So we have no issues in the upgrade path to the
> glusterfs-6 release.
>
> Have mentioned in the second mail how to get over this situation
> for now, until the fix is available.
>
> >
> > > For the disable testing, tier was used as it covers most of the xlators
> > > that were removed. And all of these tests were done on a replica 3 volume.
> > >
> >
> > I'm not sure if the Glusto team is reading this, but it would be
> > pertinent to understand if the approach you have taken can be
> > converted into a form of automated testing pre-release.
>
> I don't have an answer for this; have CCed Vijay.
> He might have an idea.
>
> >
> > > Note: This is only for upgrade testing of the newly added and removed
> > > xlators. It does not involve the normal tests for the xlator.
> > >
> > > If you have any questions, please feel free to reach us.
> > >
> > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing
> > >
> > > Regards,
> > > Hari and Sanju.
>
> --
> Regards,
> Hari Gowtham.
>
> --
> --Atin
Sanju Rakonde
2019-Apr-04 07:54 UTC
[Gluster-users] [Gluster-devel] Upgrade testing to gluster 6
We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while
upgrading to glusterfs-6. We tested it in different setups and understood
that this issue is seen because of an issue in the setup itself.

Regarding the issue you have faced, can you please let us know which
documentation you followed for the upgrade? During our testing, we didn't
hit any such issue, and we would like to understand what went wrong.
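To help narrow it down, the state of the cluster right after the upgrade
is also useful. Something along the following lines from the upgraded node
would help (commands and the log path assume a default installation):

gluster --version
gluster peer status
gluster volume status
gluster volume get all cluster.op-version

# and the management daemon log from around the time of the upgrade:
# /var/log/glusterfs/glusterd.log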
On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic <budic at onholyground.com> wrote:

> Hari-
>
> I was upgrading my test cluster from 5.5 to 6 and I hit this bug
> (https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something
> similar. In my case, the workaround did not work, and I was left with a
> gluster that had gone into no-quorum mode and stopped all the bricks.
> There wasn't much in the logs either, but I noticed my
> /etc/glusterfs/glusterd.vol files were not the same as the newer versions,
> so I updated them, restarted glusterd, and suddenly the updated node
> showed as peer-in-cluster again. Once I updated the other nodes the same
> way, things started working again. Maybe a place to look?
--
Thanks,
Sanju