We are running glusterfs-3.8.9-1.el7.x86_64 with geo-replication.
I have been having ongoing problems with replication failing after
some time.
Once it has failed, restarting it produces the attached log file snippet.
--
Alvin Starr || land: (905)513-7688
Netvel Inc. || Cell: (416)806-0133
alvin at netvel.net ||
-------------- next part --------------
[2018-09-12 03:01:04.433048] I [monitor(monitor):267:monitor] Monitor:
------------------------------------------------------------
[2018-09-12 03:01:04.433470] I [monitor(monitor):268:monitor] Monitor: starting
gsyncd worker
[2018-09-12 03:01:04.599227] D [gsyncd(agent):730:main_i] <top>: rpc_fd:
'9,12,11,10'
[2018-09-12 03:01:04.600925] I [changelogagent(agent):73:__init__]
ChangelogAgent: Agent listining...
[2018-09-12 03:01:04.625732] I [gsyncd(/bricks/ccto_us/data):736:main_i]
<top>: syncing: gluster://localhost:CCTO-US-EDOCS -> ssh://root at
archive2.vpn.sycle.net:gluster://localhost:arch-CCTO-US-EDOCS
[2018-09-12 03:01:04.675003] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__() ...
[2018-09-12 03:01:06.518789] D [repce(/bricks/ccto_us/data):209:__call__]
RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__ ->
1.0
[2018-09-12 03:01:06.519186] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139692706621248:1536721266.52 version() ...
[2018-09-12 03:01:06.522499] D [repce(/bricks/ccto_us/data):209:__call__]
RepceClient: call 26412:139692706621248:1536721266.52 version -> 1.0
[2018-09-12 03:01:06.522882] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139692706621248:1536721266.52 pid() ...
[2018-09-12 03:01:06.525834] D [repce(/bricks/ccto_us/data):209:__call__]
RepceClient: call 26412:139692706621248:1536721266.52 pid -> 2647
[2018-09-12 03:01:06.623212] D [resource(/bricks/ccto_us/data):1281:inhibit]
DirectMounter: auxiliary glusterfs mount in place
[2018-09-12 03:01:07.678328] D [resource(/bricks/ccto_us/data):1336:inhibit]
DirectMounter: auxiliary glusterfs mount prepared
[2018-09-12 03:01:07.679094] I [master(/bricks/ccto_us/data):83:gmaster_builder]
<top>: setting up xsync change detection mode
[2018-09-12 03:01:07.679126] D [monitor(monitor):337:monitor] Monitor:
worker(/bricks/ccto_us/data) connected
[2018-09-12 03:01:07.679547] I [master(/bricks/ccto_us/data):367:__init__]
_GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.681130] I [master(/bricks/ccto_us/data):83:gmaster_builder]
<top>: setting up changelog change detection mode
[2018-09-12 03:01:07.681557] I [master(/bricks/ccto_us/data):367:__init__]
_GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.683561] I [master(/bricks/ccto_us/data):83:gmaster_builder]
<top>: setting up changeloghistory change detection mode
[2018-09-12 03:01:07.683960] I [master(/bricks/ccto_us/data):367:__init__]
_GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.688644] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139692706621248:1536721267.69 version() ...
[2018-09-12 03:01:07.689547] D [repce(/bricks/ccto_us/data):209:__call__]
RepceClient: call 26412:139692706621248:1536721267.69 version -> 1.0
[2018-09-12 03:01:07.689709] D
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working
dir
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:07.689863] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139692706621248:1536721267.69 init() ...
[2018-09-12 03:01:07.706136] D [repce(/bricks/ccto_us/data):209:__call__]
RepceClient: call 26412:139692706621248:1536721267.69 init -> None
[2018-09-12 03:01:07.706440] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139692706621248:1536721267.71
register('/bricks/ccto_us/data',
'/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6',
'/var/log/glusterfs/geo-replication/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS.%2Fbricks%2Fccto_us%2Fdata-changes.log',
7, 5) ...
[2018-09-12 03:01:09.711715] D [repce(/bricks/ccto_us/data):209:__call__]
RepceClient: call 26412:139692706621248:1536721267.71 register -> None
[2018-09-12 03:01:09.712357] D
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working
dir
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712651] D
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working
dir
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712901] D
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working
dir
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.713129] I [master(/bricks/ccto_us/data):1251:register]
_GMaster: xsync temp directory:
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6/xsync
[2018-09-12 03:01:09.713479] I
[resource(/bricks/ccto_us/data):1533:service_loop] GLUSTER: Register time:
1536721269
[2018-09-12 03:01:09.714504] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139691772856064:1536721269.71 keep_alive(None,) ...
[2018-09-12 03:01:09.719439] I [master(/bricks/ccto_us/data):510:crawlwrap]
_GMaster: primary master with volume id 900656fd-3f13-4ba2-bf04-90832508566e ...
[2018-09-12 03:01:09.723702] D [repce(/bricks/ccto_us/data):209:__call__]
RepceClient: call 26412:139691772856064:1536721269.71 keep_alive -> 1
[2018-09-12 03:01:09.726247] I [master(/bricks/ccto_us/data):519:crawlwrap]
_GMaster: crawl interval: 1 seconds
[2018-09-12 03:01:09.733443] I [master(/bricks/ccto_us/data):1165:crawl]
_GMaster: starting history crawl... turns: 1, stime: (1536718883, 0), etime:
1536721269
[2018-09-12 03:01:09.733824] D [repce(/bricks/ccto_us/data):191:push]
RepceClient: call 26412:139692706621248:1536721269.73
history('/bricks/ccto_us/data/.glusterfs/changelogs', 1536718883,
1536721269, 3) ...
[2018-09-12 03:01:09.735060] E [repce(agent):117:worker] <top>: call
failed:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113,
in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py",
line 54, in history
num_parallel)
File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py",
line 100, in cl_history_changelog
cls.raise_changelog_err()
File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py",
line 27, in raise_changelog_err
raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 2] No such file or directory
[2018-09-12 03:01:09.736625] E [repce(/bricks/ccto_us/data):207:__call__]
RepceClient: call 26412:139692706621248:1536721269.73 (history) failed on peer
with ChangelogException
[2018-09-12 03:01:09.736931] E
[resource(/bricks/ccto_us/data):1551:service_loop] GLUSTER: Changelog History
Crawl failed, [Errno 2] No such file or directory
[2018-09-12 03:01:09.737512] I [syncdutils(/bricks/ccto_us/data):220:finalize]
<top>: exiting.
[2018-09-12 03:01:09.743961] I [repce(agent):92:service_loop] RepceServer:
terminating on reaching EOF.
[2018-09-12 03:01:09.744335] I [syncdutils(agent):220:finalize] <top>:
exiting.
[2018-09-12 03:01:10.682538] I [monitor(monitor):344:monitor] Monitor:
worker(/bricks/ccto_us/data) died in startup phase
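
For anyone reading the snippet above, the failing call is the history() request for the changelog range between stime and etime. The following is a minimal Python sketch, not part of gsyncd, that pulls the arguments of that call out of a log excerpt and prints the requested range as readable times. The helper names (parse_history_call, to_utc) are hypothetical, and treating the fourth argument as num_parallel is an assumption based on the traceback. The epoch values appear to be UTC, since the register time 1536721269 lines up with the 03:01:09 log timestamp.

```python
import re
from datetime import datetime, timezone

# Matches the history(...) call as it appears in the gsyncd debug log,
# tolerating the line wrapping introduced by the mail archive.
HISTORY_RE = re.compile(r"history\('([^']+)',\s*(\d+),\s*(\d+),\s*(\d+)\)")


def parse_history_call(log_text):
    """Return (changelog_dir, start, end, num_parallel) for the first
    history() call found in log_text, or None if there is none."""
    match = HISTORY_RE.search(log_text)
    if match is None:
        return None
    path, start, end, num_parallel = match.groups()
    return path, int(start), int(end), int(num_parallel)


def to_utc(epoch):
    """Format an epoch timestamp (seconds) as a UTC date and time."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )


# Excerpt copied from the log above.
SAMPLE = """RepceClient: call 26412:139692706621248:1536721269.73
history('/bricks/ccto_us/data/.glusterfs/changelogs', 1536718883,
1536721269, 3) ..."""

if __name__ == "__main__":
    parsed = parse_history_call(SAMPLE)
    if parsed is not None:
        path, start, end, _ = parsed
        print("changelog dir:", path)
        print("requested range:", to_utc(start), "to", to_utc(end))
```

With the values from this log, the requested range covers roughly the 40 minutes before the restart, so the "No such file or directory" error presumably refers to changelog data for that window under the brick's .glusterfs/changelogs directory.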