Hi,

We're debating upgrading from 3.5.x to 3.7.x soon on our 2x2 replica set,
and these upgrade issues are a bit worrying. Can I hear a few voices from
people who have had positive experiences? :)

Thanks,

Alan

On Fri, Oct 23, 2015 at 6:32 PM, JuanFra Rodríguez Cardoso
<jfrodriguez at keedio.com> wrote:

> I had that problem too, but I was not able to fix it. I was forced to
> downgrade to 3.7.4 to keep my gluster volumes running.
>
> The upgrade process (3.7.4 -> 3.7.5) does not seem fully reliable.
>
> Best.
>
> .....................................................................
> Juan Francisco Rodríguez Cardoso
> jfrodriguez at keedio.com | +34 636 69 26 91
> www.keedio.com
> .....................................................................
>
> On 16 October 2015 at 15:24, David Robinson <david.robinson at corvidtec.com> wrote:
>
>> That log was the frick one, which is the node that I upgraded. The frack
>> one is attached. One thing I did notice was the errors below in the etc
>> log file. The /usr/lib64/glusterfs/3.7.5 directory doesn't exist yet on
>> frack.
>>
>> +------------------------------------------------------------------------------+
>> [2015-10-16 12:04:06.235993] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
>> [2015-10-16 12:04:06.236036] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>> [2015-10-16 12:04:06.236099] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
>> [2015-10-16 12:04:09.242413] E [socket.c:2278:socket_connect_finish] 0-management: connection to 10.200.82.1:24007 failed (No route to host)
>> [2015-10-16 12:04:09.242504] I [MSGID: 106004] [glusterd-handler.c:5056:__glusterd_peer_rpc_notify] 0-management: Peer <frackib01.corvidtec.com> (<8ab9a966-d536-4bd1-828a-64b2d72c47ca>), in state <Peer in Cluster>, has disconnected from glusterd.
>> [2015-10-16 12:04:09.726895] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 14, Invalid argument
>> [2015-10-16 12:04:09.726918] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>> [2015-10-16 12:04:09.902756] W [MSGID: 101095] [xlator.c:143:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/3.7.5/xlator/rpc-transport/socket.so: cannot open shared object file: No such file or directory
>>
>> ------ Original Message ------
>> From: "Mohammed Rafi K C" <rkavunga at redhat.com>
>> To: "David Robinson" <drobinson at corvidtec.com>; "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster Devel" <gluster-devel at gluster.org>
>> Sent: 10/16/2015 8:43:21 AM
>> Subject: Re: [Gluster-devel] 3.7.5 upgrade issues
>>
>> Hi David,
>>
>> Are the logs you attached from node "frackib01.corvidtec.com"? If not,
>> can you attach the logs from that node?
>>
>> Regards,
>> Rafi KC
>>
>> On 10/16/2015 05:46 PM, David Robinson wrote:
>>
>> I have a replica pair setup that I was trying to upgrade from 3.7.4 to
>> 3.7.5. After upgrading the rpm packages (rpm -Uvh *.rpm) and rebooting
>> one of the nodes, I am now receiving the following:
>>
>> [root at frick01 log]# gluster volume status
>> Staging failed on frackib01.corvidtec.com. Please check log file for details.
>>
>> The logs are attached and my setup is shown below. Can anyone help?
>>
>> [root at frick01 log]# gluster volume info
>>
>> Volume Name: gfs
>> Type: Replicate
>> Volume ID: abc63b5c-bed7-4e3d-9057-00930a2d85d3
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp,rdma
>> Bricks:
>> Brick1: frickib01.corvidtec.com:/data/brick01/gfs
>> Brick2: frackib01.corvidtec.com:/data/brick01/gfs
>> Options Reconfigured:
>> storage.owner-gid: 100
>> server.allow-insecure: on
>> performance.readdir-ahead: on
>> server.event-threads: 4
>> client.event-threads: 4
>>
>> David

--
Alan Orth
alan.orth at gmail.com
https://alaninkenya.org
https://mjanja.ch
"In heaven all the interesting people are missing." -Friedrich Nietzsche
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
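The last warning in David's log is telling: glusterd is trying to load a
translator from /usr/lib64/glusterfs/3.7.5/, a directory that only exists
once the 3.7.5 packages are fully installed on that node. A quick way to
catch this is to compare the version glusterd reports with the xlator tree
that is actually on disk. The following is a minimal sketch, assuming the
standard RPM layout under /usr/lib64/glusterfs/<version>/, not any official
check:

#!/bin/bash
# Confirm the installed xlator directory matches the running glusterfs
# version. Assumes the standard RPM layout used by the Gluster packages.

ver=$(gluster --version | awk 'NR==1 {print $2}')   # e.g. "3.7.5"
xlator_dir="/usr/lib64/glusterfs/${ver}/xlator"

if [ -d "$xlator_dir" ]; then
    echo "OK: ${xlator_dir} exists for glusterfs ${ver}"
else
    echo "MISSING: ${xlator_dir} - glusterd ${ver} cannot load its translators" >&2
    exit 1
fi

# The specific object from the log above should also be present:
ls -l "${xlator_dir}/rpc-transport/socket.so"

Running this on each node before restarting glusterd would show immediately
whether the new translators are in place.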
JuanFra Rodríguez Cardoso
2015-Oct-26 10:59 UTC
[Gluster-users] [Gluster-devel] 3.7.5 upgrade issues
I have replicated my upgrade scenario in a testing lab with the following
configuration:

Distributed gluster volume (one brick per node)
- Node gluster-1: glusterfs version 3.7.4
- Node gluster-2: glusterfs version 3.7.4
- Node gluster-3: glusterfs version 3.7.4

I began by upgrading only the first node to the newest version (3.7.5):

[root at gluster-1 ~]# gluster --version
glusterfs 3.7.5 built on Oct 7 2015 16:27:05

When I then requested the status of the gluster volume, I got these error
messages:

[root at gluster-1 ~]# gluster volume status
Staging failed on gluster-2. Please check log file for details.
Staging failed on gluster-3. Please check log file for details.

On node gluster-2, the tail of /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows:

[2015-10-26 10:50:16.378672] E [MSGID: 106062] [glusterd-volume-ops.c:1796:glusterd_op_stage_heal_volume] 0-glusterd: Unable to get volume name
[2015-10-26 10:50:16.378735] E [MSGID: 106301] [glusterd-op-sm.c:5171:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Heal', Status : -2

On the other hand, if I upgrade all the nodes at the same time, everything
seems to work fine! The issue only appears while the nodes are running
different versions (3.7.4 and 3.7.5).

Is this normal behavior? Is it necessary to stop the entire cluster?

Regards,

.....................................................................
Juan Francisco Rodríguez Cardoso
jfrodriguez at keedio.com | +34 636 69 26 91
www.keedio.com
.....................................................................

On 26 October 2015 at 11:48, Alan Orth <alan.orth at gmail.com> wrote:

> Hi,
>
> We're debating upgrading from 3.5.x to 3.7.x soon on our 2x2 replica set,
> and these upgrade issues are a bit worrying. Can I hear a few voices from
> people who have had positive experiences? :)
>
> Thanks,
>
> Alan
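Since everything works once all three nodes are on the same version, the
staging failures look like a symptom of mixed 3.7.4/3.7.5 peers rather than
a fault on any single node. Before issuing cluster-wide commands during a
rolling upgrade, it helps to confirm what every peer is actually running.
A small sketch, reusing the lab's node names and assuming root SSH access
to each node:

#!/bin/bash
# Audit the glusterfs version and operating op-version on every peer.
# Node names are taken from the lab above; adjust for your own cluster.

for node in gluster-1 gluster-2 gluster-3; do
    echo "== ${node} =="
    ssh "root@${node}" 'gluster --version | head -1'
    # glusterd records the op-version it is operating at in glusterd.info:
    ssh "root@${node}" 'grep operating-version /var/lib/glusterd/glusterd.info'
done

If the versions differ, that matches the condition under which the staging
errors above were reproduced.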
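As for whether the entire cluster must be stopped: point releases are
normally expected to support rolling upgrades, but given the failures
reported in this thread, the conservative option is to upgrade every node
within one maintenance window. A rough outline, assuming RPM-based installs
and an acceptable client outage; this is a sketch, not an official upgrade
procedure:

# On every node, during the same maintenance window:
service glusterd stop        # or: systemctl stop glusterd
pkill glusterfsd             # brick processes
pkill glusterfs              # any remaining gluster daemons

rpm -Uvh glusterfs-*.rpm     # same upgrade method David used

service glusterd start       # or: systemctl start glusterd
gluster peer status          # all peers should be back "in Cluster"
gluster volume status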