Matthew Benstead
2020-Oct-01 21:08 UTC
[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
Hello,

I'm looking for some help with a geo-replication error in my Gluster 7 / CentOS 7 setup. Replication progress has basically stopped, and the status of the replication keeps flipping between Active and Faulty. The gsyncd log shows errors like "Operation not permitted" and "incomplete sync", and I'm not sure how to proceed in troubleshooting this.

The log basically just repeats - from /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log:

[2020-10-01 20:52:15.291923] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken    duration=32.8466    num_files=1749    job=3    return_code=23
[2020-10-01 20:52:18.700062] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken    duration=43.1210    num_files=3167    job=6    return_code=23
[2020-10-01 20:52:23.383234] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']
[2020-10-01 20:52:28.537657] E [repce(worker /data/storage_b/storage):213:__call__] RepceClient: call failed    call=258187:140538843596608:1601585515.63    method=entry_ops    error=OSError
[2020-10-01 20:52:28.538064] E [syncdutils(worker /data/storage_b/storage):339:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 332, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1308, in service_loop
    g1.crawlwrap(oneshot=True, register_time=register_time)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 602, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1682, in crawl
    self.process([item[1]], 0)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1327, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1221, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 232, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 214, in __call__
    raise res
OSError: [Errno 1] Operation not permitted
[2020-10-01 20:52:28.570316] I [repce(agent /data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2020-10-01 20:52:28.613603] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change    status=Faulty
[2020-10-01 20:52:29.619797] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken    duration=5.6458    num_files=455    job=3    return_code=23
[2020-10-01 20:52:38.286245] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken    duration=14.1824    num_files=1333    job=2    return_code=23
[2020-10-01 20:52:38.628156] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change    status=Initializing...
[2020-10-01 20:52:38.628325] I [monitor(monitor):159:monitor] Monitor: starting gsyncd worker    brick=/data/storage_b/storage    slave_node=10.0.231.82
[2020-10-01 20:52:38.684736] I [gsyncd(agent /data/storage_b/storage):318:main] <top>: Using session config file    path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2020-10-01 20:52:38.687213] I [gsyncd(worker /data/storage_b/storage):318:main] <top>: Using session config file    path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2020-10-01 20:52:38.687401] I [changelogagent(agent /data/storage_b/storage):72:__init__] ChangelogAgent: Agent listining...
[2020-10-01 20:52:38.703295] I [resource(worker /data/storage_b/storage):1386:connect_remote] SSH: Initializing SSH connection between master and slave...
[2020-10-01 20:52:40.388372] I [resource(worker /data/storage_b/storage):1435:connect_remote] SSH: SSH connection between master and slave established.    duration=1.6849
[2020-10-01 20:52:40.388582] I [resource(worker /data/storage_b/storage):1105:connect] GLUSTER: Mounting gluster volume locally...
[2020-10-01 20:52:41.501105] I [resource(worker /data/storage_b/storage):1128:connect] GLUSTER: Mounted gluster volume    duration=1.1123
[2020-10-01 20:52:41.501405] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2020-10-01 20:52:43.531146] I [master(worker /data/storage_b/storage):1640:register] _GMaster: Working dir    path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2020-10-01 20:52:43.533953] I [resource(worker /data/storage_b/storage):1291:service_loop] GLUSTER: Register time    time=1601585563
[2020-10-01 20:52:43.547092] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change    status=Active
[2020-10-01 20:52:43.561920] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
[2020-10-01 20:52:43.562184] I [master(worker /data/storage_b/storage):1554:crawl] _GMaster: starting history crawl    turns=1    stime=None    entry_stime=None    etime=1601585563
[2020-10-01 20:52:43.562269] I [resource(worker /data/storage_b/storage):1307:service_loop] GLUSTER: No stime available, using xsync crawl
[2020-10-01 20:52:43.569799] I [master(worker /data/storage_b/storage):1670:crawl] _GMaster: starting hybrid crawl    stime=None
[2020-10-01 20:52:43.573528] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=Hybrid Crawl
[2020-10-01 20:52:44.370985] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken    duration=20.4307    num_files=2609    job=5    return_code=23
[2020-10-01 20:52:49.431854] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']
[2020-10-01 20:52:54.801500] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken    duration=72.7492    num_files=4227    job=2    return_code=23
[2020-10-01 20:52:56.766547] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken    duration=74.3569    num_files=4674    job=5    return_code=23
[2020-10-01 20:53:18.853333] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken    duration=28.7125    num_files=4397    job=3    return_code=23
[2020-10-01 20:53:21.224921] W [master(worker /data/storage_a/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['CHANGELOG.1601044033', 'CHANGELOG.1601044048', 'CHANGELOG.1601044063', 'CHANGELOG.1601044078', 'CHANGELOG.1601044093', 'CHANGELOG.1601044108', 'CHANGELOG.1601044123']
[2020-10-01 20:53:22.134536] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken    duration=0.2159    num_files=3    job=3    return_code=23
[2020-10-01 20:53:25.615712] I [master(worker /data/storage_b/storage):1681:crawl] _GMaster: processing xsync changelog    path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage/xsync/XSYNC-CHANGELOG.1601585563
[2020-10-01 20:53:25.634970] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']

GeoReplication status - watch it change from Active to Faulty between two runs:

[root@storage01 ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:56:05
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:55:57
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A

[root@storage01 ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:58:05
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:58:56
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A

Cluster information (note: performance.open-behind is disabled to work around https://github.com/gluster/glusterfs/issues/1440):

[root@storage01 ~]# gluster --version | head -1; cat /etc/centos-release; uname -r
glusterfs 7.7
CentOS Linux release 7.8.2003 (Core)
3.10.0-1127.10.1.el7.x86_64

[root@storage01 ~]# df -h /storage2/
Filesystem            Size  Used Avail Use% Mounted on
10.0.231.91:/storage  328T  228T  100T  70% /storage2

[root@storage01 ~]# gluster volume info

Volume Name: storage
Type: Distributed-Replicate
Volume ID: cf94a8f2-324b-40b3-bf72-c3766100ea99
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: 10.0.231.91:/data/storage_a/storage
Brick2: 10.0.231.92:/data/storage_b/storage
Brick3: 10.0.231.93:/data/storage_c/storage (arbiter)
Brick4: 10.0.231.92:/data/storage_a/storage
Brick5: 10.0.231.93:/data/storage_b/storage
Brick6: 10.0.231.91:/data/storage_c/storage (arbiter)
Brick7: 10.0.231.93:/data/storage_a/storage
Brick8: 10.0.231.91:/data/storage_b/storage
Brick9: 10.0.231.92:/data/storage_c/storage (arbiter)
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
network.ping-timeout: 10
features.inode-quota: on
features.quota: on
nfs.disable: on
features.quota-deem-statfs: on
storage.fips-mode-rchecksum: on
performance.readdir-ahead: on
performance.parallel-readdir: on
cluster.lookup-optimize: on
client.event-threads: 4
server.event-threads: 4
performance.cache-size: 256MB
performance.open-behind: off

Thanks,
 -Matthew
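[Editorial note on reading the log above: return_code=23 on the Syncer lines is rsync's exit status for "partial transfer due to error", so each of those lines is a sync batch that did not fully land on the slave. A minimal sketch for tallying which brick's worker is accumulating failures, assuming the session log path quoted in this message (override LOG for a different session):]

```shell
#!/bin/sh
# Count rsync "partial transfer" failures (return_code=23) per worker brick.
# LOG defaults to the session log from this thread; adjust for your session.
LOG="${LOG:-/var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log}"

grep 'return_code=23' "$LOG" 2>/dev/null \
  | sed -n 's/.*worker \([^)]*\)).*/\1/p' \
  | sort | uniq -c | sort -rn
```

[On the excerpt above, the top of this tally would be the storage_a and storage_c workers, which narrows down which bricks to chase on the slave side.]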
Felix Kölzow
2020-Oct-05 08:39 UTC
[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
Dear Matthew,

Can you provide more information from the geo-replication brick logs? These files are also located in:

/var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/

Usually these log files are more precise about the root cause of the error. Additionally, it is worth looking at the log files on the slave side.

Regards,
Felix

On 01/10/2020 23:08, Matthew Benstead wrote:
> Hello,
>
> I'm looking for some help with a GeoReplication Error in my Gluster
> 7/CentOS 7 setup. Replication progress has basically stopped, and the
> status of the replication keeps switching.
> [...]
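[Editorial note: following Felix's suggestion, a sketch of where those logs live in the default GlusterFS log layout. The session name is taken from this thread; exact directory and file names can vary by version, so treat the paths as assumptions to verify on your own nodes.]

```shell
#!/bin/sh
# Sketch: locate per-brick and slave-side geo-rep logs for this session.
# Session name comes from the thread; adjust for your master/slave pair.
SESSION="storage_10.0.231.81_pcic-backup"

# Master side: per-brick worker and aux-mount logs sit next to gsyncd.log.
ls -l "/var/log/glusterfs/geo-replication/$SESSION/" 2>/dev/null

# Slave side (run on 10.0.231.81 / 10.0.231.82): slave workers log under
# geo-replication-slaves; search near the failure timestamp for the EPERM.
grep -ri 'Operation not permitted' \
  /var/log/glusterfs/geo-replication-slaves/ 2>/dev/null | tail -n 20
```

[The grep should surface which file or gfid the slave-side entry_ops was touching when it hit "[Errno 1] Operation not permitted", which the flattened gsyncd.log on the master does not show.]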