Marcus Pedersén
2018-Sep-01  11:38 UTC
[Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work Now: Upgraded to 4.1.3 geo node Faulty
Hi again,
I found another problem on the other master node.
The node toggles between Active and Faulty, and it is the same error over and over again.
[2018-09-01 11:23:02.94080] E [repce(worker /urd-gds/gluster):197:__call__]
RepceClient: call failed    call=1226:139955262510912:1535800981.24
method=entry_ops        error=GsyncdError
[2018-09-01 11:23:02.94214] E [syncdutils(worker
/urd-gds/gluster):300:log_raise_exception] <top>: execution of
"gluster" failed with ENOENT (No such file or directory)
[2018-09-01 11:23:02.106194] I [repce(agent /urd-gds/gluster):80:service_loop]
RepceServer: terminating on reaching EOF.
[2018-09-01 11:23:02.124444] I [gsyncdstatus(monitor):244:set_worker_status]
GeorepStatus: Worker Status Change status=Faulty
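Could the ENOENT mean that gsyncd cannot find the gluster binary on the slave side?
If that is the case, I guess the session could be pointed at the binary explicitly.
A rough sketch of what I am thinking of trying (the directory and the option name
are my assumptions from the geo-replication docs, adjust as needed):

# Check where the gluster binary actually lives on the slave node:
ssh urd-gds-geo-000 'command -v gluster'

# Point this geo-rep session at that directory (option name and path assumed):
gluster volume geo-replication urd-gds-volume urd-gds-geo-001::urd-gds-volume \
    config slave-gluster-command-dir /usr/sbin/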
I have also found a Python error, though I have only seen it once.
[2018-09-01 11:16:45.907660] I [master(worker /urd-gds/gluster):1536:crawl]
_GMaster: slave's time      stime=(1524101534, 0)
[2018-09-01 11:16:47.364109] E [syncdutils(worker
/urd-gds/gluster):332:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line
362, in twrap
    tf(*aargs)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
1939, in syncjob
    po = self.sync_engine(pb, self.log_err)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line
1442, in rsync
    rconf.ssh_ctl_args + \
AttributeError: 'NoneType' object has no attribute 'split'
[2018-09-01 11:16:47.384531] I [repce(agent /urd-gds/gluster):80:service_loop]
RepceServer: terminating on reaching EOF.
[2018-09-01 11:16:48.362987] I [monitor(monitor):279:monitor] Monitor: worker
died in startup phase     brick=/urd-gds/gluster
[2018-09-01 11:16:48.370701] I [gsyncdstatus(monitor):244:set_worker_status]
GeorepStatus: Worker Status Change status=Faulty
[2018-09-01 11:16:58.390548] I [monitor(monitor):158:monitor] Monitor: starting
gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
I attach the logs as well.
Many thanks!
Best regards
Marcus Pedersén
________________________________
From: gluster-users-bounces at gluster.org <gluster-users-bounces at gluster.org> on behalf of Marcus Pedersén <marcus.pedersen at slu.se>
Sent: 31 August 2018 16:09
To: khiremat at redhat.com
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work
Now: Upgraded to 4.1.3 geo node Faulty
I really apologize; third try to make the mail smaller.
/Marcus
________________________________
From: Marcus Pedersén
Sent: 31 August 2018 16:03
To: Kotresh Hiremath Ravishankar
Cc: gluster-users at gluster.org
Subject: RE: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work
Now: Upgraded to 4.1.3 geo node Faulty
Sorry, resending because the mail was too large.
/Marcus
________________________________
From: Marcus Pedersén
Sent: 31 August 2018 15:19
To: Kotresh Hiremath Ravishankar
Cc: gluster-users at gluster.org
Subject: RE: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work
Now: Upgraded to 4.1.3 geo node Faulty
Hi Kotresh,
Please find the logs attached; only logs from today.
The Python error was repeated over and over again until I disabled SELinux.
After that the node became Active again.
Return code 23 seems to be repeated over and over again.
rsync version 3.1.2
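In case it helps anyone else: instead of leaving SELinux disabled, I suppose the denial
could be confirmed from the audit log and allowed with a local policy module. A minimal
sketch, assuming the standard audit/policycoreutils tools are installed:

# Check the current mode and look for recent AVC denials related to gsyncd/rsync:
getenforce
ausearch -m avc -ts recent | grep -Ei 'gsyncd|rsync|gluster'

# Switch to permissive temporarily instead of disabling SELinux completely:
setenforce 0

# Optionally build and load a local policy module from the recorded denials:
ausearch -m avc -ts recent | audit2allow -M gsyncd_local
semodule -i gsyncd_local.pp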
Thanks a lot!
Best regards
Marcus
________________________________
From: Kotresh Hiremath Ravishankar <khiremat at redhat.com>
Sent: 31 August 2018 11:09
To: Marcus Pedersén
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work
Now: Upgraded to 4.1.3 geo node Faulty
Hi Marcus,
Could you attach the full logs? Is the same traceback happening repeatedly? It would
be helpful if you attach the corresponding mount log as well.
Which rsync version are you using?
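For reference, this is roughly where I would look (the log locations are from memory
and may differ between installs):

rsync --version
# Geo-replication logs on the master nodes, including the session's mount log:
ls /var/log/glusterfs/geo-replication/
# Geo-replication logs on the slave nodes:
ls /var/log/glusterfs/geo-replication-slaves/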
Thanks,
Kotresh HR
On Fri, Aug 31, 2018 at 12:16 PM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:
Hi all,
I had problems with the sync stopping after the upgrade to 4.1.2.
I upgraded to 4.1.3 and it ran fine for one day, but now one of the master nodes
shows Faulty.
Most of the sync jobs have return code 23; how do I resolve this?
I see messages like:
_GMaster: Sucessfully fixed all entry ops with gfid mismatch
Will this resolve error code 23?
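For what it is worth, the rsync man page lists exit code 23 as "Partial transfer due to
error", so I assume individual files fail to transfer. A rough way to reproduce it
outside gsyncd (the host and path below are placeholders only):

# Exit code 23 = "Partial transfer due to error" according to the rsync man page.
# Try a manual transfer of one affected file and check the exit status:
rsync -av /urd-gds/gluster/path/to/some/file urd-gds-geo-000:/tmp/
echo "rsync exit code: $?"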
There is also a Python error.
The Python error was an SELinux problem; turning off SELinux made the node go to
Active again.
See log below.
CentOS 7, installed through SIG Gluster (OS updated to latest at the same time)
Master cluster: 2 x (2 + 1) distributed, replicated
Client cluster: 1 x (2 + 1) replicated
Many thanks in advance!
Best regards
Marcus Pedersén
gsyncd.log from Faulty node:
[2018-08-31 06:25:51.375267] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.8099 num_files=57    job=3   return_code=23
[2018-08-31 06:25:51.465895] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0904 num_files=3     job=3   return_code=23
[2018-08-31 06:25:52.562107] E [repce(worker /urd-gds/gluster):197:__call__]
RepceClient: call failed   call=30069:139655665837888:1535696752.35       
method=entry_ops        error=OSError
[2018-08-31 06:25:52.562346] E [syncdutils(worker
/urd-gds/gluster):332:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311,
in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72,
in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line
1288, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615,
in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
1545, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
1445, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
1280, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
1179, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216,
in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198,
in __call__
    raise res
OSError: [Errno 13] Permission denied
[2018-08-31 06:25:52.578367] I [repce(agent /urd-gds/gluster):80:service_loop]
RepceServer: terminating on reaching EOF.
[2018-08-31 06:25:53.558765] I [monitor(monitor):279:monitor] Monitor: worker
died in startup phase     brick=/urd-gds/gluster
[2018-08-31 06:25:53.569777] I [gsyncdstatus(monitor):244:set_worker_status]
GeorepStatus: Worker Status Change status=Faulty
[2018-08-31 06:26:03.593161] I [monitor(monitor):158:monitor] Monitor: starting
gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
[2018-08-31 06:26:03.636452] I [gsyncd(agent /urd-gds/gluster):297:main]
<top>: Using session config file      
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-31 06:26:03.636810] I [gsyncd(worker /urd-gds/gluster):297:main]
<top>: Using session config file     
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-31 06:26:03.637486] I [changelogagent(agent
/urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
[2018-08-31 06:26:03.650330] I [resource(worker
/urd-gds/gluster):1377:connect_remote] SSH: Initializing SSH connection between
master and slave...
[2018-08-31 06:26:05.296473] I [resource(worker
/urd-gds/gluster):1424:connect_remote] SSH: SSH connection between master and
slave established.        duration=1.6457
[2018-08-31 06:26:05.297904] I [resource(worker /urd-gds/gluster):1096:connect]
GLUSTER: Mounting gluster volume locally...
[2018-08-31 06:26:06.396939] I [resource(worker /urd-gds/gluster):1119:connect]
GLUSTER: Mounted gluster volume duration=1.0985
[2018-08-31 06:26:06.397691] I [subcmds(worker
/urd-gds/gluster):70:subcmd_worker] <top>: Worker spawn successful.
Acknowledging back to monitor
[2018-08-31 06:26:16.815566] I [master(worker /urd-gds/gluster):1593:register]
_GMaster: Working dir   
path=/var/lib/misc/gluster/gsyncd/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/urd-gds-gluster
[2018-08-31 06:26:16.816423] I [resource(worker
/urd-gds/gluster):1282:service_loop] GLUSTER: Register time     time=1535696776
[2018-08-31 06:26:16.888772] I [gsyncdstatus(worker
/urd-gds/gluster):277:set_active] GeorepStatus: Worker Status Change       
status=Active
[2018-08-31 06:26:16.892049] I [gsyncdstatus(worker
/urd-gds/gluster):249:set_worker_crawl_status] GeorepStatus: Crawl Status Change
status=History Crawl
[2018-08-31 06:26:16.892703] I [master(worker /urd-gds/gluster):1507:crawl]
_GMaster: starting history crawl    turns=1 stime=(1525739167, 0)  
entry_stime=(1525740143, 0)     etime=1535696776
[2018-08-31 06:26:17.914803] I [master(worker /urd-gds/gluster):1536:crawl]
_GMaster: slave's time      stime=(1525739167, 0)
[2018-08-31 06:26:18.521718] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.1063 num_files=17    job=3   return_code=23
[2018-08-31 06:26:19.260137] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.3441 num_files=34    job=1   return_code=23
[2018-08-31 06:26:19.615191] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0923 num_files=7     job=3   return_code=23
[2018-08-31 06:26:19.891227] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.1302 num_files=12    job=1   return_code=23
[2018-08-31 06:26:19.922700] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.5024 num_files=50    job=2   return_code=23
[2018-08-31 06:26:21.639342] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=1.5233 num_files=5     job=3   return_code=23
[2018-08-31 06:26:22.12726] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken    duration=0.1191 num_files=7     job=1  
return_code=23
[2018-08-31 06:26:22.86136] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken    duration=0.0731 num_files=4     job=1  
return_code=23
[2018-08-31 06:26:22.503290] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0779 num_files=15    job=2   return_code=23
[2018-08-31 06:26:23.214704] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0738 num_files=9     job=3   return_code=23
[2018-08-31 06:26:23.251876] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.2478 num_files=33    job=2   return_code=23
[2018-08-31 06:26:23.802699] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0873 num_files=9     job=3   return_code=23
[2018-08-31 06:26:23.828176] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0758 num_files=3     job=2   return_code=23
[2018-08-31 06:26:23.854063] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.2662 num_files=34    job=1   return_code=23
[2018-08-31 06:26:24.403228] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0997 num_files=30    job=3   return_code=23
[2018-08-31 06:26:25.526] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken      duration=0.0965 num_files=8     job=3  
return_code=23
[2018-08-31 06:26:25.438527] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0832 num_files=9     job=1   return_code=23
[2018-08-31 06:26:25.447256] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.6180 num_files=86    job=2   return_code=23
[2018-08-31 06:26:25.571913] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0706 num_files=2     job=3   return_code=0
[2018-08-31 06:26:27.21325] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken    duration=0.0814 num_files=1     job=1  
return_code=23
[2018-08-31 06:26:27.615520] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0933 num_files=13    job=1   return_code=23
[2018-08-31 06:26:27.668323] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.2190 num_files=95    job=2   return_code=23
[2018-08-31 06:26:27.740139] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0716 num_files=11    job=2   return_code=23
[2018-08-31 06:26:28.191068] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.1167 num_files=38    job=3   return_code=23
[2018-08-31 06:26:28.268213] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0768 num_files=7     job=3   return_code=23
[2018-08-31 06:26:28.317909] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0770 num_files=4     job=2   return_code=23
[2018-08-31 06:26:28.710064] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0932 num_files=23    job=1   return_code=23
[2018-08-31 06:26:28.907250] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0886 num_files=26    job=2   return_code=23
[2018-08-31 06:26:28.976679] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0692 num_files=4     job=2   return_code=23
[2018-08-31 06:26:29.55774] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken    duration=0.0788 num_files=9     job=2  
return_code=23
[2018-08-31 06:26:29.295576] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0847 num_files=16    job=1   return_code=23
[2018-08-31 06:26:29.665076] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.1087 num_files=25    job=2   return_code=23
[2018-08-31 06:26:30.277998] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.1122 num_files=40    job=2   return_code=23
[2018-08-31 06:26:31.153105] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.3822 num_files=74    job=3   return_code=23
[2018-08-31 06:26:31.227639] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0743 num_files=18    job=3   return_code=23
[2018-08-31 06:26:31.302660] I [master(worker /urd-gds/gluster):1944:syncjob]
Syncer: Sync Time Taken   duration=0.0748 num_files=18    job=3   return_code=23
Attachments:
- urd-gds-001_20180901_gsyncd.log (text/x-log, 31007 bytes):
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20180901/4f974ef1/attachment-0002.bin>
- urd-gds-001_20180901_mnt-urd-gds-gluster.log (text/x-log, 396913 bytes):
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20180901/4f974ef1/attachment-0003.bin>
Kotresh Hiremath Ravishankar
2018-Sep-03  05:58 UTC
[Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work Now: Upgraded to 4.1.3 geo node Faulty
Hi Marcus,

Geo-rep had a few important fixes in 4.1.3. Is it possible to upgrade and check whether the issue is still seen?

Thanks,
Kotresh HR
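P.S. A quick way to confirm which version is actually installed on every node after the
upgrade (the package names below are assumptions for the CentOS Storage SIG builds):

rpm -q glusterfs-server glusterfs-geo-replication
gluster --version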