Darrell Budic
2019-Apr-03 20:37 UTC
[Gluster-users] [Gluster-devel] Upgrade testing to gluster 6
Hari-

I was upgrading my test cluster from 5.5 to 6 and I hit this bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something similar.
In my case, the workaround did not work, and I was left with a gluster that
had gone into no-quorum mode and stopped all the bricks. There wasn't much
in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were
not the same as the newer versions, so I updated them, restarted glusterd,
and suddenly the updated node showed as peer-in-cluster again. Once I
updated the other nodes the same way, things started working again. Maybe a
place to look?

My old config (all nodes):

volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option ping-timeout 10
option event-threads 1
option rpc-auth-allow-insecure on
# option transport.address-family inet6
# option base-port 49152
end-volume

changed to:

volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket,rdma
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option transport.socket.listen-port 24007
option transport.rdma.listen-port 24008
option ping-timeout 0
option event-threads 1
option rpc-auth-allow-insecure on
# option lock-timer 180
# option transport.address-family inet6
# option base-port 49152
option max-port 60999
end-volume

The only thing I found in the glusterd logs that looks relevant was the
following (repeated for both of the other nodes in this cluster), so no
clue why it happened:

[2019-04-03 20:19:16.802638] I [MSGID: 106004]
[glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
<ossuary-san> (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state <Peer in
Cluster>, has disconnected from glusterd.
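In case it saves someone else some digging, the recovery boiled down to
roughly the following, run on each node in turn. This is only a sketch:
the iptables flush is the workaround mentioned in the thread below (it did
not help in my case), and the paths and the glusterd service name assume a
stock systemd-based packaged install, so adjust for your distribution.

# Workaround from the bug report: flush firewall rules, restart glusterd.
# (Did not help in my case, but it is quick to try.)
iptables -F
systemctl restart glusterd

# What actually worked for me: bring the management volfile in line with
# the one shipped by the new packages, then restart glusterd.
cp /etc/glusterfs/glusterd.vol /etc/glusterfs/glusterd.vol.bak
vi /etc/glusterfs/glusterd.vol    # apply the changes shown above
systemctl restart glusterd

# Confirm the node shows as Peer in Cluster again before moving on.
gluster peer status
gluster volume status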
> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee <atin.mukherjee83 at gmail.com> wrote:
>
> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham <hgowtham at redhat.com> wrote:
> Comments inline.
>
> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay
> <sankarshan.mukhopadhyay at gmail.com> wrote:
> >
> > Quite a considerable amount of detail here. Thank you!
> >
> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham <hgowtham at redhat.com> wrote:
> > >
> > > Hello Gluster users,
> > >
> > > As you are all aware, glusterfs-6 is out. We would like to inform you
> > > that we have spent a significant amount of time testing glusterfs-6 in
> > > upgrade scenarios. We have done upgrade testing to glusterfs-6 from
> > > various releases like 3.12, 4.1 and 5.3.
> > >
> > > As glusterfs-6 has a lot of changes, we wanted to test those portions.
> > > There were xlators (and respective options to enable/disable them)
> > > added and deprecated in glusterfs-6 from various versions [1].
> > >
> > > We had to check the following upgrade scenarios for all such options
> > > identified in [1]:
> > > 1) option never enabled and upgraded
> > > 2) option enabled and then upgraded
> > > 3) option enabled and then disabled and then upgraded
> > >
> > > We weren't able to manually check all the combinations for all the
> > > options, so the options involving enabling and disabling xlators were
> > > prioritized. Below are the results of the ones tested.
> > >
> > > Never enabled and upgraded:
> > > Checked from 3.12, 4.1, 5.3 to 6; the upgrade works.
> > >
> > > Enabled and upgraded:
> > > Tested for tier, which is deprecated. It is not a recommended upgrade.
> > > As expected the volume won't be consumable and will have a few more
> > > issues as well.
> > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade.
> > >
> > > Enabled, then disabled before upgrade:
> > > Tested for tier with 3.12 and the upgrade went fine.
> > >
> > > There is one common issue to note in every upgrade. The node being
> > > upgraded goes into disconnected state. You have to flush the iptables
> > > and then restart glusterd on all nodes to fix this.
> > >
> >
> > Is this something that is written in the upgrade notes? I do not seem
> > to recall; if not, I'll send a PR.
>
> No, this wasn't mentioned in the release notes. PRs are welcome.
>
> >
> > > The testing for enabling new options is still pending. The new options
> > > won't cause as many issues as the deprecated ones, so this was put at
> > > the end of the priority list. It would be nice to get contributions
> > > for this.
> > >
> >
> > Did the range of tests lead to any new issues?
>
> Yes. In the first round of testing we found an issue and had to postpone
> the release of 6 until the fix was made available.
> https://bugzilla.redhat.com/show_bug.cgi?id=1684029
>
> And then we tested it again after this patch was made available,
> and came across this:
> https://bugzilla.redhat.com/show_bug.cgi?id=1694010
>
> This isn't a bug, as we found that the upgrade worked seamlessly in two
> different setups. So we have no issues in the upgrade path to the
> glusterfs-6 release.
>
> Have mentioned in the second mail how to get over this situation
> for now, until the fix is available.
>
> >
> > > For the disable testing, tier was used as it covers most of the xlators
> > > that were removed. And all of these tests were done on a replica 3 volume.
> > >
> >
> > I'm not sure if the Glusto team is reading this, but it would be
> > pertinent to understand if the approach you have taken can be
> > converted into a form of automated testing pre-release.
>
> I don't have an answer for this; have CCed Vijay.
> He might have an idea.
>
> >
> > > Note: This is only for upgrade testing of the newly added and removed
> > > xlators. It does not involve the normal tests for the xlator.
> > >
> > > If you have any questions, please feel free to reach us.
> > >
> > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing
> > >
> > > Regards,
> > > Hari and Sanju.
>
> --
> Regards,
> Hari Gowtham.
>
> --
> --Atin
Sanju Rakonde
2019-Apr-04 07:54 UTC
[Gluster-users] [Gluster-devel] Upgrade testing to gluster 6
We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while
upgrading to glusterfs-6. We tested it in different setups and understood
that this issue is seen because of an issue in the setup itself.

Regarding the issue you have faced, can you please let us know which
documentation you followed for the upgrade? During our testing, we didn't
hit any such issue, and we would like to understand what went wrong.
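To help narrow it down, the state of the cluster right after the upgrade
is also useful. Something along the following lines from the upgraded node
would help (commands and the log path assume a default installation):

gluster --version
gluster peer status
gluster volume status
gluster volume get all cluster.op-version

# and the management daemon log from around the time of the upgrade:
# /var/log/glusterfs/glusterd.log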
On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic <budic at onholyground.com> wrote:

> Hari-
>
> I was upgrading my test cluster from 5.5 to 6 and I hit this bug
> (https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something
> similar. In my case, the workaround did not work, and I was left with a
> gluster that had gone into no-quorum mode and stopped all the bricks.
> There wasn't much in the logs either, but I noticed my
> /etc/glusterfs/glusterd.vol files were not the same as the newer versions,
> so I updated them, restarted glusterd, and suddenly the updated node
> showed as peer-in-cluster again. Once I updated the other nodes the same
> way, things started working again. Maybe a place to look?
--
Thanks,
Sanju