thr3ads.net - Gluster users - [Gluster-users] Upgrade 5.3 -> 5.4 on debian: public IP is used instead of LAN IP [Mar 2019]

Amar Tumballi Suryanarayan

2019-Mar-07 06:28 UTC

[Gluster-users] Upgrade 5.3 -> 5.4 on debian: public IP is used instead of LAN IP

We are talking days, not weeks. It is already Thursday here, and we need one
more day for tagging and packaging, so it may be OK to expect it on Monday.

-Amar

On Thu, Mar 7, 2019 at 11:54 AM Artem Russakovskii <archon810 at gmail.com> wrote:
> Is the next release going to be an imminent hotfix, i.e. something like
> today/tomorrow, or are we talking weeks?
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> <http://www.apkmirror.com/>, Illogical Robot LLC
> beerpla.net | +ArtemRussakovskii
> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
> <http://twitter.com/ArtemR>
>
>
> On Tue, Mar 5, 2019 at 11:09 AM Artem Russakovskii <archon810 at gmail.com> wrote:
>
>> Ended up downgrading to 5.3 just in case. Peer status and volume status
>> are OK now.
>>
>> zypper install --oldpackage glusterfs-5.3-lp150.100.1
>> Loading repository data...
>> Reading installed packages...
>> Resolving package dependencies...
>>
>> Problem: glusterfs-5.3-lp150.100.1.x86_64 requires libgfapi0 = 5.3, but this requirement cannot be provided
>>   not installable providers: libgfapi0-5.3-lp150.100.1.x86_64[glusterfs]
>>  Solution 1: Following actions will be done:
>>   downgrade of libgfapi0-5.4-lp150.100.1.x86_64 to libgfapi0-5.3-lp150.100.1.x86_64
>>   downgrade of libgfchangelog0-5.4-lp150.100.1.x86_64 to libgfchangelog0-5.3-lp150.100.1.x86_64
>>   downgrade of libgfrpc0-5.4-lp150.100.1.x86_64 to libgfrpc0-5.3-lp150.100.1.x86_64
>>   downgrade of libgfxdr0-5.4-lp150.100.1.x86_64 to libgfxdr0-5.3-lp150.100.1.x86_64
>>   downgrade of libglusterfs0-5.4-lp150.100.1.x86_64 to libglusterfs0-5.3-lp150.100.1.x86_64
>>  Solution 2: do not install glusterfs-5.3-lp150.100.1.x86_64
>>  Solution 3: break glusterfs-5.3-lp150.100.1.x86_64 by ignoring some of its dependencies
>>
>> Choose from above solutions by number or cancel [1/2/3/c] (c): 1
>> Resolving dependencies...
>> Resolving package dependencies...
>>
>> The following 6 packages are going to be downgraded:
>>   glusterfs libgfapi0 libgfchangelog0 libgfrpc0 libgfxdr0 libglusterfs0
>>
>> 6 packages to downgrade.
>>
>> Sincerely,
>> Artem
>>
>>
>>
>> On Tue, Mar 5, 2019 at 10:57 AM Artem Russakovskii <archon810 at gmail.com> wrote:
>>
>>> Noticed the same when upgrading from 5.3 to 5.4, as mentioned.
>>>
>>> I'm confused though. Is actual replication affected? The 5.4 server and
>>> the 3x 5.3 servers still show heal info as all 4 connected, and the files
>>> seem to be replicating correctly as well.
>>>
>>> So what's actually affected: just the status command, or is leaving 5.4
>>> on one of the nodes doing some damage to the underlying fs? Is it fixable
>>> by tweaking transport.socket.ssl-enabled? Does upgrading all servers to
>>> 5.4 resolve it, or should we revert to 5.3?
>>>
>>> Sincerely,
>>> Artem
>>>
>>>
>>>
>>> On Tue, Mar 5, 2019 at 2:02 AM Hu Bert <revirii at googlemail.com> wrote:
>>>
>>>> fyi: did a downgrade 5.4 -> 5.3 and it worked. all replicas are up and
>>>> running. Awaiting updated v5.4.
>>>>
>>>> thx :-)
>>>>
>>>> On Tue, 5 Mar 2019 at 09:26, Hari Gowtham <hgowtham at redhat.com> wrote:
>>>> >
>>>> > There are plans to revert the patch causing this error and rebuild 5.4.
>>>> > This should happen faster. The rebuilt 5.4 should be free of this
>>>> > upgrade issue.
>>>> >
>>>> > In the meantime, you can use 5.3 for this cluster.
>>>> > Downgrading to 5.3 will work if just one node was upgraded to 5.4
>>>> > and the other nodes are still on 5.3.
>>>> >
>>>> > On Tue, Mar 5, 2019 at 1:07 PM Hu Bert <revirii at googlemail.com> wrote:
>>>> > >
>>>> > > Hi Hari,
>>>> > >
>>>> > > thx for the hint. Do you know when this will be fixed? Is a downgrade
>>>> > > 5.4 -> 5.3 a possibility to fix this?
>>>> > >
>>>> > > Hubert
>>>> > >
>>>> > > On Tue, 5 Mar 2019 at 08:32, Hari Gowtham <hgowtham at redhat.com> wrote:
>>>> > > >
>>>> > > > Hi,
>>>> > > >
>>>> > > > This is a known issue we are working on.
>>>> > > > As the checksum differs between the updated and non-updated nodes, the
>>>> > > > peers are getting rejected.
>>>> > > > The bricks aren't coming up because of the same issue.
>>>> > > >
>>>> > > > More about the issue: https://bugzilla.redhat.com/show_bug.cgi?id=1685120
>>>> > > >
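The checksum mismatch described here shows up in glusterd.log as the MSGID 106010 error quoted further down in this thread. A minimal sketch, assuming that exact wording, for pulling the peer name and both checksums out of a log; the function name `find_cksum_mismatches` is hypothetical and other GlusterFS versions may phrase the message differently:

```python
import re

# Pattern based only on the single MSGID 106010 line quoted in this thread;
# treat the wording as an assumption for any other glusterd version.
CKSUM_RE = re.compile(
    r"local cksum = (?P<local>\d+), remote cksum = (?P<remote>\d+) "
    r"on peer (?P<peer>\S+)"
)


def find_cksum_mismatches(log_text):
    """Return (peer, local_cksum, remote_cksum) tuples found in glusterd log text."""
    # Mail clients and pagers may wrap long log lines, so collapse all
    # whitespace runs into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m.group("peer"), int(m.group("local")), int(m.group("remote")))
        for m in CKSUM_RE.finditer(flat)
    ]


if __name__ == "__main__":
    sample = (
        "[2019-03-05 07:04:36.188128] E [MSGID: 106010] "
        "[glusterd-utils.c:3483:glusterd_compare_friend_volume] 0-management: "
        "Version of Cksums persistent differ. local cksum = 3950307018, "
        "remote cksum = 455409345 on peer gluster1"
    )
    for peer, local, remote in find_cksum_mismatches(sample):
        print(f"{peer}: local={local} remote={remote}")
```

Any peer reported this way is one whose volume checksum disagrees with the local node, which matches the "Peer Rejected" state seen after the partial upgrade.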
>>>> > > > On Tue, Mar 5, 2019 at 12:56 PM Hu Bert <revirii at googlemail.com> wrote:
>>>> > > > >
>>>> > > > > Interestingly: gluster volume status misses gluster1, while heal
>>>> > > > > statistics show gluster1:
>>>> > > > >
>>>> > > > > gluster volume status workdata
>>>> > > > > Status of volume: workdata
>>>> > > > > Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> > > > > ------------------------------------------------------------------------------
>>>> > > > > Brick gluster2:/gluster/md4/workdata        49153     0          Y       1723
>>>> > > > > Brick gluster3:/gluster/md4/workdata        49153     0          Y       2068
>>>> > > > > Self-heal Daemon on localhost               N/A       N/A        Y       1732
>>>> > > > > Self-heal Daemon on gluster3                N/A       N/A        Y       2077
>>>> > > > >
>>>> > > > > vs.
>>>> > > > >
>>>> > > > > gluster volume heal workdata statistics heal-count
>>>> > > > > Gathering count of entries to be healed on volume workdata has been successful
>>>> > > > >
>>>> > > > > Brick gluster1:/gluster/md4/workdata
>>>> > > > > Number of entries: 0
>>>> > > > >
>>>> > > > > Brick gluster2:/gluster/md4/workdata
>>>> > > > > Number of entries: 10745
>>>> > > > >
>>>> > > > > Brick gluster3:/gluster/md4/workdata
>>>> > > > > Number of entries: 10744
>>>> > > > >
>>>> > > > > On Tue, 5 Mar 2019 at 08:18, Hu Bert <revirii at googlemail.com> wrote:
>>>> > > > > >
>>>> > > > > > Hi Milind,
>>>> > > > > >
>>>> > > > > > well, there are such entries, but those haven't been a problem during
>>>> > > > > > install and the last kernel update+reboot. The entries look like:
>>>> > > > > >
>>>> > > > > > PUBLIC_IP  gluster2.alpserver.de gluster2
>>>> > > > > >
>>>> > > > > > 192.168.0.50 gluster1
>>>> > > > > > 192.168.0.51 gluster2
>>>> > > > > > 192.168.0.52 gluster3
>>>> > > > > >
>>>> > > > > > 'ping gluster2' resolves to the LAN IP; I removed the last entry in the
>>>> > > > > > 1st line and did a reboot ... no, didn't help. From
>>>> > > > > > /var/log/glusterfs/glusterd.log on gluster2:
>>>> > > > > >
>>>> > > > > > [2019-03-05 07:04:36.188128] E [MSGID: 106010]
>>>> > > > > > [glusterd-utils.c:3483:glusterd_compare_friend_volume] 0-management:
>>>> > > > > > Version of Cksums persistent differ. local cksum = 3950307018, remote
>>>> > > > > > cksum = 455409345 on peer gluster1
>>>> > > > > > [2019-03-05 07:04:36.188314] I [MSGID: 106493]
>>>> > > > > > [glusterd-handler.c:3843:glusterd_xfer_friend_add_resp] 0-glusterd:
>>>> > > > > > Responded to gluster1 (0), ret: 0, op_ret: -1
>>>> > > > > >
>>>> > > > > > Interestingly there are no entries in the brick logs of the rejected
>>>> > > > > > server. Well, not surprising as no brick process is running. The
>>>> > > > > > server gluster1 is still in rejected state.
>>>> > > > > >
>>>> > > > > > 'gluster volume start workdata force' starts the brick process on
>>>> > > > > > gluster1, and some heals are happening on gluster2+3, but via 'gluster
>>>> > > > > > volume status workdata' the volumes still aren't complete.
>>>> > > > > >
>>>> > > > > > gluster1:
>>>> > > > > > ------------------------------------------------------------------------------
>>>> > > > > > Brick gluster1:/gluster/md4/workdata        49152     0          Y       2523
>>>> > > > > > Self-heal Daemon on localhost               N/A       N/A        Y       2549
>>>> > > > > >
>>>> > > > > > gluster2:
>>>> > > > > > Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> > > > > > ------------------------------------------------------------------------------
>>>> > > > > > Brick gluster2:/gluster/md4/workdata        49153     0          Y       1723
>>>> > > > > > Brick gluster3:/gluster/md4/workdata        49153     0          Y       2068
>>>> > > > > > Self-heal Daemon on localhost               N/A       N/A        Y       1732
>>>> > > > > > Self-heal Daemon on gluster3                N/A       N/A        Y       2077
>>>> > > > > >
>>>> > > > > >
>>>> > > > > > Hubert
>>>> > > > > >
>>>> > > > > > On Tue, 5 Mar 2019 at 07:58, Milind Changire <mchangir at redhat.com> wrote:
>>>> > > > > > >
>>>> > > > > > > There are probably DNS entries or /etc/hosts entries with the public IP
>>>> > > > > > > addresses that the host names (gluster1, gluster2, gluster3) are getting
>>>> > > > > > > resolved to.
>>>> > > > > > > /etc/resolv.conf would tell which default domain is searched for the node
>>>> > > > > > > names and which DNS servers respond to the queries.
>>>> > > > > > >
>>>> > > > > > >
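One way to test this hypothesis is to list every name in /etc/hosts that maps to an address outside the storage LAN. The sketch below is only an illustration: the helper names are hypothetical, the `192.168.0.0/24` range is an assumption inferred from the LAN addresses posted in this thread, and `203.0.113.10` is a documentation address standing in for the redacted PUBLIC_IP. It inspects the hosts file only and does not query DNS.

```python
import ipaddress


def addresses_by_name(hosts_text):
    """Map each hostname or alias in /etc/hosts-style text to the IPs listed for it."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, []).append(ip)
    return mapping


def names_outside_lan(hosts_text, lan_cidr):
    """Return {name: [ips]} for names with at least one address outside lan_cidr."""
    lan = ipaddress.ip_network(lan_cidr)
    result = {}
    for name, ips in addresses_by_name(hosts_text).items():
        outside = [ip for ip in ips if ipaddress.ip_address(ip) not in lan]
        if outside:
            result[name] = outside
    return result


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder for PUBLIC_IP, not a value from the thread.
    sample_hosts = """\
203.0.113.10  gluster2.alpserver.de gluster2
192.168.0.50 gluster1
192.168.0.51 gluster2
192.168.0.52 gluster3
"""
    for name, ips in names_outside_lan(sample_hosts, "192.168.0.0/24").items():
        print(f"{name} also maps to non-LAN address(es): {', '.join(ips)}")
```

On a real node the input would be the contents of /etc/hosts; a name such as gluster2 appearing in the result means it is listed with both a public and a LAN address.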
>>>> > > > > > > On Tue, Mar 5, 2019 at 12:14 PM Hu Bert <revirii at googlemail.com> wrote:
>>>> > > > > > >>
>>>> > > > > > >> Good morning,
>>>> > > > > > >>
>>>> > > > > > >> i have a replicate 3 setup with 2 volumes, running on version 5.3 on
>>>> > > > > > >> debian stretch. This morning i upgraded one server to version 5.4 and
>>>> > > > > > >> rebooted the machine; after the restart i noticed that:
>>>> > > > > > >>
>>>> > > > > > >> - no brick process is running
>>>> > > > > > >> - gluster volume status only shows the server itself:
>>>> > > > > > >> gluster volume status workdata
>>>> > > > > > >> Status of volume: workdata
>>>> > > > > > >> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> > > > > > >> ------------------------------------------------------------------------------
>>>> > > > > > >> Brick gluster1:/gluster/md4/workdata        N/A       N/A        N       N/A
>>>> > > > > > >> NFS Server on localhost                     N/A       N/A        N       N/A
>>>> > > > > > >>
>>>> > > > > > >> - gluster peer status on the server
>>>> > > > > > >> gluster peer status
>>>> > > > > > >> Number of Peers: 2
>>>> > > > > > >>
>>>> > > > > > >> Hostname: gluster3
>>>> > > > > > >> Uuid: c7b4a448-ca6a-4051-877f-788f9ee9bc4a
>>>> > > > > > >> State: Peer Rejected (Connected)
>>>> > > > > > >>
>>>> > > > > > >> Hostname: gluster2
>>>> > > > > > >> Uuid: 162fea82-406a-4f51-81a3-e90235d8da27
>>>> > > > > > >> State: Peer Rejected (Connected)
>>>> > > > > > >>
>>>> > > > > > >> - gluster peer status on the other 2 servers:
>>>> > > > > > >> gluster peer status
>>>> > > > > > >> Number of Peers: 2
>>>> > > > > > >>
>>>> > > > > > >> Hostname: gluster1
>>>> > > > > > >> Uuid: 9a360776-7b58-49ae-831e-a0ce4e4afbef
>>>> > > > > > >> State: Peer Rejected (Connected)
>>>> > > > > > >>
>>>> > > > > > >> Hostname: gluster3
>>>> > > > > > >> Uuid: c7b4a448-ca6a-4051-877f-788f9ee9bc4a
>>>> > > > > > >> State: Peer in Cluster (Connected)
>>>> > > > > > >>
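The peer status listings above can be scanned mechanically for rejected peers, which is handy when checking several nodes after an upgrade. A minimal sketch, assuming the `Hostname:` / `Uuid:` / `State:` layout shown in this message; the function names are hypothetical and the text would come from running `gluster peer status` on each node:

```python
def parse_peer_status(output):
    """Parse `gluster peer status` text into a list of dicts (hostname, uuid, state)."""
    peers = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            # Each "Hostname:" line starts a new peer record.
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is not None and line.startswith("Uuid:"):
            current["uuid"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
    return peers


def rejected_peers(output):
    """Return hostnames whose state contains 'Peer Rejected'."""
    return [
        peer["hostname"]
        for peer in parse_peer_status(output)
        if "Peer Rejected" in peer.get("state", "")
    ]


if __name__ == "__main__":
    # Output as seen on one of the two servers that were not upgraded.
    sample = """\
Number of Peers: 2

Hostname: gluster1
Uuid: 9a360776-7b58-49ae-831e-a0ce4e4afbef
State: Peer Rejected (Connected)

Hostname: gluster3
Uuid: c7b4a448-ca6a-4051-877f-788f9ee9bc4a
State: Peer in Cluster (Connected)
"""
    print(rejected_peers(sample))
```

Run against the output from each node, this makes the asymmetry in the report easy to see: the upgraded server rejects both peers, while the other two reject only the upgraded one.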
>>>> > > > > > >> I noticed that, in the brick logs, i see that the public IP is used
>>>> > > > > > >> instead of the LAN IP. brick logs from one of the volumes:
>>>> > > > > > >>
>>>> > > > > > >> rejected node: https://pastebin.com/qkpj10Sd
>>>> > > > > > >> connected nodes: https://pastebin.com/8SxVVYFV
>>>> > > > > > >>
>>>> > > > > > >> Why is the public IP suddenly used instead of the LAN IP? Killing all
>>>> > > > > > >> gluster processes and rebooting (again) didn't help.
>>>> > > > > > >>
>>>> > > > > > >>
>>>> > > > > > >> Thx,
>>>> > > > > > >> Hubert
>>>> > > > > > >> _______________________________________________
>>>> > > > > > >> Gluster-users mailing list
>>>> > > > > > >> Gluster-users at gluster.org
>>>> > > > > > >> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>> > > > > > >
>>>> > > > > > >
>>>> > > > > > >
>>>> > > > > > > --
>>>> > > > > > > Milind
>>>> > > > > > >
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > --
>>>> > > > Regards,
>>>> > > > Hari Gowtham.
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Regards,
>>>> > Hari Gowtham.


-- 
Amar Tumballi (amarts)

Artem Russakovskii

2019-Mar-12 17:28 UTC

[Gluster-users] Upgrade 5.3 -> 5.4 on debian: public IP is used instead of LAN IP

Hi Amar,

Any updates on this? I'm still not seeing it in OpenSUSE build repos. Maybe
later today?

Thanks.

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror
<http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | +ArtemRussakovskii
<https://plus.google.com/+ArtemRussakovskii> | @ArtemR
<http://twitter.com/ArtemR>


