Matthew Benstead
2020-Oct-05 20:28 UTC
[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
Hmm... Looks like I forgot to set the xattr's to sa - I left them as default.

[root@pcic-backup01 ~]# zfs get xattr pcic-backup01-zpool
NAME                 PROPERTY  VALUE  SOURCE
pcic-backup01-zpool  xattr     on     default

[root@pcic-backup02 ~]# zfs get xattr pcic-backup02-zpool
NAME                 PROPERTY  VALUE  SOURCE
pcic-backup02-zpool  xattr     on     default

I wonder if I can change them and continue, or if I need to blow away the zpool and start over?

Thanks,
 -Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium <https://pacificclimate.org/>
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb at uvic.ca

On 10/5/20 12:53 PM, Felix Kölzow wrote:
> Dear Matthew,
>
> this is our configuration:
>
> zfs get all mypool
>
> mypool  xattr    sa        local
> mypool  acltype  posixacl  local
>
> Something more to consider?
>
> Regards,
>
> Felix
>
> On 05/10/2020 21:11, Matthew Benstead wrote:
>> Thanks Felix - looking through some more of the logs I may have found
>> the reason...
>>
>> From
>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/mnt-data-storage_a-storage.log
>>
>> [2020-10-05 18:13:35.736838] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:18:53.885591] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:22:14.405234] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:25:53.971679] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:31:44.571557] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:36:36.508772] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:40:10.401055] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:42:57.833536] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:45:19.691953] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:48:26.478532] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>> [2020-10-05 18:52:24.466914] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>
>> The slave nodes are running gluster on top of ZFS, but I had
>> configured ACLs - is there something else missing to make this work
>> with ZFS?
>>
>> [root@pcic-backup01 ~]# gluster volume info
>>
>> Volume Name: pcic-backup
>> Type: Distribute
>> Volume ID: 7af8a424-f4b6-4405-bba1-0dbafb0fa231
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.0.231.81:/pcic-backup01-zpool/brick
>> Brick2: 10.0.231.82:/pcic-backup02-zpool/brick
>> Options Reconfigured:
>> network.ping-timeout: 10
>> performance.cache-size: 256MB
>> server.event-threads: 4
>> client.event-threads: 4
>> cluster.lookup-optimize: on
>> performance.parallel-readdir: on
>> performance.readdir-ahead: on
>> features.quota-deem-statfs: on
>> features.inode-quota: on
>> features.quota: on
>> transport.address-family: inet
>> nfs.disable: on
>> features.read-only: off
>> performance.open-behind: off
>>
>> [root@pcic-backup01 ~]# zfs get acltype pcic-backup01-zpool
>> NAME                 PROPERTY  VALUE     SOURCE
>> pcic-backup01-zpool  acltype   posixacl  local
>>
>> [root@pcic-backup01 ~]# grep "pcic-backup0" /proc/mounts
>> pcic-backup01-zpool /pcic-backup01-zpool zfs rw,seclabel,xattr,posixacl 0 0
>>
>> [root@pcic-backup02 ~]# zfs get acltype pcic-backup02-zpool
>> NAME                 PROPERTY  VALUE     SOURCE
>> pcic-backup02-zpool  acltype   posixacl  local
>>
>> [root@pcic-backup02 ~]# grep "pcic-backup0" /proc/mounts
>> pcic-backup02-zpool /pcic-backup02-zpool zfs rw,seclabel,xattr,posixacl 0 0
>>
>> Thanks,
>>  -Matthew
>>
>> --
>> Matthew Benstead
>> System Administrator
>> Pacific Climate Impacts Consortium <https://pacificclimate.org/>
>> University of Victoria, UH1
>> PO Box 1800, STN CSC
>> Victoria, BC, V8W 2Y2
>> Phone: +1-250-721-8432
>> Email: matthewb at uvic.ca
>>
>> On 10/5/20 1:39 AM, Felix Kölzow wrote:
>>> Dear Matthew,
>>>
>>> can you provide more information regarding to the geo-replication brick
>>> logs.
>>>
>>> These files area also located in:
>>>
>>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/
>>>
>>> Usually, these log files are more precise to figure out the root cause
>>> of the error.
>>>
>>> Additionally, it is also worth to look at the log-files on the slave
>>> side.
>>>
>>> Regards,
>>>
>>> Felix
>>>
>>> On 01/10/2020 23:08, Matthew Benstead wrote:
>>>> Hello,
>>>>
>>>> I'm looking for some help with a GeoReplication Error in my Gluster
>>>> 7/CentOS 7 setup. Replication progress has basically stopped, and the
>>>> status of the replication keeps switching.
>>>>
>>>> The gsyncd log has errors like "Operation not permitted", "incomplete
>>>> sync", etc... help? I'm not sure how to proceed in troubleshooting
>>>> this.
>>>>
>>>> The log is here, it basically just repeats - from:
>>>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log
>>>>
>>>> [2020-10-01 20:52:15.291923] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=32.8466  num_files=1749  job=3  return_code=23
>>>> [2020-10-01 20:52:18.700062] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=43.1210  num_files=3167  job=6  return_code=23
>>>> [2020-10-01 20:52:23.383234] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['XSYNC-CHANGELOG.1601585397']
>>>> [2020-10-01 20:52:28.537657] E [repce(worker /data/storage_b/storage):213:__call__] RepceClient: call failed  call=258187:140538843596608:1601585515.63  method=entry_ops  error=OSError
>>>> [2020-10-01 20:52:28.538064] E [syncdutils(worker /data/storage_b/storage):339:log_raise_exception] <top>: FAIL:
>>>> Traceback (most recent call last):
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 332, in main
>>>>     func(args)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86, in subcmd_worker
>>>>     local.service_loop(remote)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1308, in service_loop
>>>>     g1.crawlwrap(oneshot=True, register_time=register_time)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 602, in crawlwrap
>>>>     self.crawl()
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1682, in crawl
>>>>     self.process([item[1]], 0)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1327, in process
>>>>     self.process_change(change, done, retry)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1221, in process_change
>>>>     failures = self.slave.server.entry_ops(entries)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 232, in __call__
>>>>     return self.ins(self.meth, *a)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 214, in __call__
>>>>     raise res
>>>> OSError: [Errno 1] Operation not permitted
>>>> [2020-10-01 20:52:28.570316] I [repce(agent /data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF.
>>>> [2020-10-01 20:52:28.613603] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change  status=Faulty
>>>> [2020-10-01 20:52:29.619797] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=5.6458  num_files=455  job=3  return_code=23
>>>> [2020-10-01 20:52:38.286245] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=14.1824  num_files=1333  job=2  return_code=23
>>>> [2020-10-01 20:52:38.628156] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change  status=Initializing...
>>>> [2020-10-01 20:52:38.628325] I [monitor(monitor):159:monitor] Monitor: starting gsyncd worker  brick=/data/storage_b/storage  slave_node=10.0.231.82
>>>> [2020-10-01 20:52:38.684736] I [gsyncd(agent /data/storage_b/storage):318:main] <top>: Using session config file  path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
>>>> [2020-10-01 20:52:38.687213] I [gsyncd(worker /data/storage_b/storage):318:main] <top>: Using session config file  path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
>>>> [2020-10-01 20:52:38.687401] I [changelogagent(agent /data/storage_b/storage):72:__init__] ChangelogAgent: Agent listining...
>>>> [2020-10-01 20:52:38.703295] I [resource(worker /data/storage_b/storage):1386:connect_remote] SSH: Initializing SSH connection between master and slave...
>>>> [2020-10-01 20:52:40.388372] I [resource(worker /data/storage_b/storage):1435:connect_remote] SSH: SSH connection between master and slave established.  duration=1.6849
>>>> [2020-10-01 20:52:40.388582] I [resource(worker /data/storage_b/storage):1105:connect] GLUSTER: Mounting gluster volume locally...
>>>> [2020-10-01 20:52:41.501105] I [resource(worker /data/storage_b/storage):1128:connect] GLUSTER: Mounted gluster volume  duration=1.1123
>>>> [2020-10-01 20:52:41.501405] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
>>>> [2020-10-01 20:52:43.531146] I [master(worker /data/storage_b/storage):1640:register] _GMaster: Working dir  path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
>>>> [2020-10-01 20:52:43.533953] I [resource(worker /data/storage_b/storage):1291:service_loop] GLUSTER: Register time  time=1601585563
>>>> [2020-10-01 20:52:43.547092] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change  status=Active
>>>> [2020-10-01 20:52:43.561920] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change  status=History Crawl
>>>> [2020-10-01 20:52:43.562184] I [master(worker /data/storage_b/storage):1554:crawl] _GMaster: starting history crawl  turns=1  stime=None  entry_stime=None  etime=1601585563
>>>> [2020-10-01 20:52:43.562269] I [resource(worker /data/storage_b/storage):1307:service_loop] GLUSTER: No stime available, using xsync crawl
>>>> [2020-10-01 20:52:43.569799] I [master(worker /data/storage_b/storage):1670:crawl] _GMaster: starting hybrid crawl  stime=None
>>>> [2020-10-01 20:52:43.573528] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change  status=Hybrid Crawl
>>>> [2020-10-01 20:52:44.370985] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=20.4307  num_files=2609  job=5  return_code=23
>>>> [2020-10-01 20:52:49.431854] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['XSYNC-CHANGELOG.1601585397']
>>>> [2020-10-01 20:52:54.801500] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=72.7492  num_files=4227  job=2  return_code=23
>>>> [2020-10-01 20:52:56.766547] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=74.3569  num_files=4674  job=5  return_code=23
>>>> [2020-10-01 20:53:18.853333] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=28.7125  num_files=4397  job=3  return_code=23
>>>> [2020-10-01 20:53:21.224921] W [master(worker /data/storage_a/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['CHANGELOG.1601044033', 'CHANGELOG.1601044048', 'CHANGELOG.1601044063', 'CHANGELOG.1601044078', 'CHANGELOG.1601044093', 'CHANGELOG.1601044108', 'CHANGELOG.1601044123']
>>>> [2020-10-01 20:53:22.134536] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=0.2159  num_files=3  job=3  return_code=23
>>>> [2020-10-01 20:53:25.615712] I [master(worker /data/storage_b/storage):1681:crawl] _GMaster: processing xsync changelog  path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage/xsync/XSYNC-CHANGELOG.1601585563
>>>> [2020-10-01 20:53:25.634970] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['XSYNC-CHANGELOG.1601585397']
>>>>
>>>> GeoReplication status - see it change from Active to Faulty:
>>>>
>>>> [root@storage01 ~]# gluster volume geo-replication status
>>>>
>>>> MASTER NODE   MASTER VOL   MASTER BRICK              SLAVE USER   SLAVE                                       SLAVE NODE    STATUS   CRAWL STATUS      LAST_SYNCED
>>>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>> 10.0.231.91   storage      /data/storage_a/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Changelog Crawl   2020-09-25 07:26:57
>>>> 10.0.231.91   storage      /data/storage_c/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.82   Active   Hybrid Crawl      N/A
>>>> 10.0.231.91   storage      /data/storage_b/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.82   Active   Hybrid Crawl      N/A
>>>> 10.0.231.92   storage      /data/storage_b/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.82   Active   History Crawl     2020-09-23 01:56:05
>>>> 10.0.231.92   storage      /data/storage_a/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.82   Active   Hybrid Crawl      N/A
>>>> 10.0.231.92   storage      /data/storage_c/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Hybrid Crawl      N/A
>>>> 10.0.231.93   storage      /data/storage_c/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Changelog Crawl   2020-09-25 06:55:57
>>>> 10.0.231.93   storage      /data/storage_b/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Hybrid Crawl      N/A
>>>> 10.0.231.93   storage      /data/storage_a/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Hybrid Crawl      N/A
>>>>
>>>> [root@storage01 ~]# gluster volume geo-replication status
>>>>
>>>> MASTER NODE   MASTER VOL   MASTER BRICK              SLAVE USER   SLAVE                                       SLAVE NODE    STATUS   CRAWL STATUS      LAST_SYNCED
>>>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>> 10.0.231.91   storage      /data/storage_a/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Changelog Crawl   2020-09-25 07:26:57
>>>> 10.0.231.91   storage      /data/storage_c/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.82   Active   Hybrid Crawl      N/A
>>>> 10.0.231.91   storage      /data/storage_b/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   N/A           Faulty   N/A               N/A
>>>> 10.0.231.92   storage      /data/storage_b/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.82   Active   History Crawl     2020-09-23 01:58:05
>>>> 10.0.231.92   storage      /data/storage_a/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.82   Active   Hybrid Crawl      N/A
>>>> 10.0.231.92   storage      /data/storage_c/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   N/A           Faulty   N/A               N/A
>>>> 10.0.231.93   storage      /data/storage_c/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Changelog Crawl   2020-09-25 06:58:56
>>>> 10.0.231.93   storage      /data/storage_b/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   10.0.231.81   Active   Hybrid Crawl      N/A
>>>> 10.0.231.93   storage      /data/storage_a/storage   geoaccount   ssh://geoaccount@10.0.231.81::pcic-backup   N/A           Faulty   N/A               N/A
>>>>
>>>> Cluster information: (Note - disabled performance.open-behind to work
>>>> around https://github.com/gluster/glusterfs/issues/1440 )
>>>>
>>>> [root@storage01 ~]# gluster --version | head -1; cat /etc/centos-release; uname -r
>>>> glusterfs 7.7
>>>> CentOS Linux release 7.8.2003 (Core)
>>>> 3.10.0-1127.10.1.el7.x86_64
>>>>
>>>> [root@storage01 ~]# df -h /storage2/
>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>> 10.0.231.91:/storage  328T  228T  100T  70% /storage2
>>>>
>>>> [root@storage01 ~]# gluster volume info
>>>>
>>>> Volume Name: storage
>>>> Type: Distributed-Replicate
>>>> Volume ID: cf94a8f2-324b-40b3-bf72-c3766100ea99
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 3 x (2 + 1) = 9
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 10.0.231.91:/data/storage_a/storage
>>>> Brick2: 10.0.231.92:/data/storage_b/storage
>>>> Brick3: 10.0.231.93:/data/storage_c/storage (arbiter)
>>>> Brick4: 10.0.231.92:/data/storage_a/storage
>>>> Brick5: 10.0.231.93:/data/storage_b/storage
>>>> Brick6: 10.0.231.91:/data/storage_c/storage (arbiter)
>>>> Brick7: 10.0.231.93:/data/storage_a/storage
>>>> Brick8: 10.0.231.91:/data/storage_b/storage
>>>> Brick9: 10.0.231.92:/data/storage_c/storage (arbiter)
>>>> Options Reconfigured:
>>>> changelog.changelog: on
>>>> geo-replication.ignore-pid-check: on
>>>> geo-replication.indexing: on
>>>> network.ping-timeout: 10
>>>> features.inode-quota: on
>>>> features.quota: on
>>>> nfs.disable: on
>>>> features.quota-deem-statfs: on
>>>> storage.fips-mode-rchecksum: on
>>>> performance.readdir-ahead: on
>>>> performance.parallel-readdir: on
>>>> cluster.lookup-optimize: on
>>>> client.event-threads: 4
>>>> server.event-threads: 4
>>>> performance.cache-size: 256MB
>>>> performance.open-behind: off
>>>>
>>>> Thanks,
>>>>  -Matthew
>>>> ________
>>>>
>>>> Community Meeting Calendar:
>>>>
>>>> Schedule -
>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>>> Bridge: https://bluejeans.com/441850968
>>>>
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
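The check discussed above (whether each slave pool stores extended attributes as `sa`, as in Felix's configuration, or as the default `on`) could be scripted roughly as follows. This is only a sketch: the function names `xattr_status` and `check_datasets` are made up for illustration and are not part of ZFS or Gluster, and the labels it prints are arbitrary. It assumes the `zfs` CLI is available when `check_datasets` is called.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): classify a ZFS "xattr"
# property value. Felix's pool reports "sa"; Matthew's pools report "on".
xattr_status() {
    case "$1" in
        sa)  echo "sa" ;;                 # xattrs stored as system attributes
        on)  echo "directory-based" ;;    # the default seen on the backup pools
        off) echo "disabled" ;;
        *)   echo "unknown" ;;
    esac
}

# Print "<dataset> <value> <classification>" for each dataset argument.
# Uses "zfs get -H -o value", which prints only the property value
# without a header line.
check_datasets() {
    for ds in "$@"; do
        value=$(zfs get -H -o value xattr "$ds") || return 1
        printf '%s\t%s\t%s\n' "$ds" "$value" "$(xattr_status "$value")"
    done
}
```

On the slave nodes this might be run as `check_datasets pcic-backup01-zpool` (and `pcic-backup02-zpool` on the other node). The property itself can be changed with `zfs set xattr=sa <dataset>`; whether that is enough for data already written to the bricks is the open question in this thread, so the sketch only reports the current state.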
Felix Kölzow
2020-Oct-05 20:37 UTC
[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
Dear Matthew,

from my current experience with gluster geo-replication and since this is a key in your backup-procedure (this is how it seems to me), I would come up with a new one, just to be sure.

Regards,

Felix

On 05/10/2020 22:28, Matthew Benstead wrote:
> I wonder if I can change them and continue, or if I need to blow away
> the zpool and start over?
/data/storage_a/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Changelog Crawl??? 2020-09-25 07:26:57 >>>>> 10.0.231.91??? storage?????? /data/storage_c/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.82??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.91??? storage?????? /data/storage_b/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.82??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.92??? storage?????? /data/storage_b/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.82??? Active >>>>> History Crawl????? 2020-09-23 01:56:05 >>>>> 10.0.231.92??? storage?????? /data/storage_a/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.82??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.92??? storage?????? /data/storage_c/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.93??? storage?????? /data/storage_c/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Changelog Crawl??? 2020-09-25 06:55:57 >>>>> 10.0.231.93??? storage?????? /data/storage_b/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.93??? storage?????? /data/storage_a/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Hybrid Crawl?????? N/A >>>>> >>>>> [root at storage01 ~]# gluster volume geo-replication status >>>>> >>>>> MASTER NODE??? MASTER VOL??? MASTER BRICK SLAVE USER >>>>> SLAVE??????????????????????????????????????? SLAVE NODE???? STATUS >>>>> CRAWL STATUS?????? 
LAST_SYNCED >>>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>>> >>>>> 10.0.231.91??? storage?????? /data/storage_a/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Changelog Crawl??? 2020-09-25 07:26:57 >>>>> 10.0.231.91??? storage?????? /data/storage_c/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.82??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.91??? storage?????? /data/storage_b/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup N/A??????????? Faulty >>>>> N/A??????????????? N/A >>>>> 10.0.231.92??? storage?????? /data/storage_b/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.82??? Active >>>>> History Crawl????? 2020-09-23 01:58:05 >>>>> 10.0.231.92??? storage?????? /data/storage_a/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.82??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.92??? storage?????? /data/storage_c/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup N/A??????????? Faulty >>>>> N/A??????????????? N/A >>>>> 10.0.231.93??? storage?????? /data/storage_c/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Changelog Crawl??? 2020-09-25 06:58:56 >>>>> 10.0.231.93??? storage?????? /data/storage_b/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup 10.0.231.81??? Active >>>>> Hybrid Crawl?????? N/A >>>>> 10.0.231.93??? storage?????? /data/storage_a/storage geoaccount >>>>> ssh://geoaccount at 10.0.231.81::pcic-backup N/A??????????? Faulty >>>>> N/A??????????????? 
N/A >>>>> >>>>> >>>>> Cluster information: (Note - disabled performance.open-behind to work >>>>> around https://github.com/gluster/glusterfs/issues/1440 ) >>>>> >>>>> [root at storage01 ~]# gluster --version | head -1; cat >>>>> /etc/centos-release; uname -r >>>>> glusterfs 7.7 >>>>> CentOS Linux release 7.8.2003 (Core) >>>>> 3.10.0-1127.10.1.el7.x86_64 >>>>> >>>>> [root at storage01 ~]# df -h /storage2/ >>>>> Filesystem??????????? Size? Used Avail Use% Mounted on >>>>> 10.0.231.91:/storage? 328T? 228T? 100T? 70% /storage2 >>>>> >>>>> [root at storage01 ~]# gluster volume info >>>>> >>>>> Volume Name: storage >>>>> Type: Distributed-Replicate >>>>> Volume ID: cf94a8f2-324b-40b3-bf72-c3766100ea99 >>>>> Status: Started >>>>> Snapshot Count: 0 >>>>> Number of Bricks: 3 x (2 + 1) = 9 >>>>> Transport-type: tcp >>>>> Bricks: >>>>> Brick1: 10.0.231.91:/data/storage_a/storage >>>>> Brick2: 10.0.231.92:/data/storage_b/storage >>>>> Brick3: 10.0.231.93:/data/storage_c/storage (arbiter) >>>>> Brick4: 10.0.231.92:/data/storage_a/storage >>>>> Brick5: 10.0.231.93:/data/storage_b/storage >>>>> Brick6: 10.0.231.91:/data/storage_c/storage (arbiter) >>>>> Brick7: 10.0.231.93:/data/storage_a/storage >>>>> Brick8: 10.0.231.91:/data/storage_b/storage >>>>> Brick9: 10.0.231.92:/data/storage_c/storage (arbiter) >>>>> Options Reconfigured: >>>>> changelog.changelog: on >>>>> geo-replication.ignore-pid-check: on >>>>> geo-replication.indexing: on >>>>> network.ping-timeout: 10 >>>>> features.inode-quota: on >>>>> features.quota: on >>>>> nfs.disable: on >>>>> features.quota-deem-statfs: on >>>>> storage.fips-mode-rchecksum: on >>>>> performance.readdir-ahead: on >>>>> performance.parallel-readdir: on >>>>> cluster.lookup-optimize: on >>>>> client.event-threads: 4 >>>>> server.event-threads: 4 >>>>> performance.cache-size: 256MB >>>>> performance.open-behind: off >>>>> >>>>> Thanks, >>>>> ??-Matthew >>>>> ________ >>>>> >>>>> >>>>> >>>>> Community Meeting Calendar: >>>>> >>>>> Schedule - 
>>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC >>>>> Bridge: https://bluejeans.com/441850968 >>>>> >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
Matthew Benstead
2020-Oct-13 15:57 UTC
[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
Further to this - after rebuilding the slave volume with the xattr=sa option, and destroying and restarting the geo-replication sync, I am still getting "extended attribute not supported by the backend storage" errors:

[root at storage01 storage_10.0.231.81_pcic-backup]# tail mnt-data-storage_a-storage.log
[2020-10-13 14:00:27.095418] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 14:13:44.497710] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 14:19:07.245191] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 14:33:24.031232] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 14:41:54.070198] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 14:53:27.740279] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 15:02:31.951660] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 15:07:41.470933] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 15:18:42.664005] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-13 15:26:17.510656] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage

When checking the logs I see errors around set-ctime-mdata - "dict set of key for set-ctime-mdata failed [Input/output error]":

[root at pcic-backup01 storage_10.0.231.81_pcic-backup]# tail -20 mnt-10.0.231.93-data-storage_b-storage.log
[2020-10-13 15:40:38.579096] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15133: LINK() /.gfid/ba2374d3-23ac-4094-b795-b03738583765/ui-icons_222222_256x240.png => -1 (File exists)
[2020-10-13 15:40:38.583874] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15138: LINK() /.gfid/ba2374d3-23ac-4094-b795-b03738583765/ui-bg_glass_95_fef1ec_1x400.png => -1 (File exists)
[2020-10-13 15:40:38.584828] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-pcic-backup-client-1: remote operation failed [No such file or directory]
[2020-10-13 15:40:38.585887] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: 5e2d07f2-253f-442b-9df7-68848cf3b541: gfid different on the target file on pcic-backup-readdir-ahead-0
[2020-10-13 15:40:38.585916] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]
[2020-10-13 15:40:38.604843] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15152: LINK() /.gfid/5fb546eb-87b3-4a4d-954c-c3a2ad8d06b5/font-awesome.min.css => -1 (File exists)
[2020-10-13 15:40:38.770794] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:c4173839-957a-46ef-873b-7974305ee5ff> (c4173839-957a-46ef-873b-7974305ee5ff) [No such file or directory]
[2020-10-13 15:40:38.774303] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15292: LINK() /.gfid/52e174c2-766f-4d95-8415-27ea020b7c8d/MathML.js => -1 (File exists)
[2020-10-13 15:40:38.790133] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15297: LINK() /.gfid/52e174c2-766f-4d95-8415-27ea020b7c8d/HTML-CSS.js => -1 (File exists)
[2020-10-13 15:40:38.813826] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15323: LINK() /.gfid/ad758f3a-e6b9-4a1c-a9f2-2e4a58954e83/latin-mathfonts-bold-fraktur.js => -1 (File exists)
[2020-10-13 15:40:38.830217] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15340: LINK() /.gfid/ad758f3a-e6b9-4a1c-a9f2-2e4a58954e83/math_harpoons.js => -1 (File exists)
[2020-10-13 15:40:39.084522] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:10e95272-a0d3-404e-b0da-9c87f2f450b0> (10e95272-a0d3-404e-b0da-9c87f2f450b0) [No such file or directory]
[2020-10-13 15:40:39.114516] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15571: LINK() /.gfid/8bb4435f-32f1-44a6-9346-82e4d7d867d4/sieve.js => -1 (File exists)
[2020-10-13 15:40:39.233346] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:318e1260-cf0e-44d3-b964-33165aabf6fe> (318e1260-cf0e-44d3-b964-33165aabf6fe) [No such file or directory]
[2020-10-13 15:40:39.236109] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15720: LINK() /.gfid/179f6634-be43-44d6-8e46-d493ddc52b9e/__init__.cpython-36.pyc => -1 (File exists)
[2020-10-13 15:40:39.259296] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: /.gfid/1c4b39ee-cadb-49df-a3ee-ea9648913d8a/blocking.py (00000000-0000-0000-0000-000000000000) [No data available]
[2020-10-13 15:40:39.340758] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15870: LINK() /.gfid/f5d6b380-b9b7-46e7-be69-668f0345a8a4/top_level.txt => -1 (File exists)
[2020-10-13 15:40:39.414092] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15945: LINK() /.gfid/2ae57be0-ec4b-4b2a-95e5-cdedd98061f0/ar_MR.dat => -1 (File exists)
[2020-10-13 15:40:39.941258] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:7752a80a-7dea-4dc1-80dd-e57d10b57640> (7752a80a-7dea-4dc1-80dd-e57d10b57640) [No such file or directory]
[2020-10-13 15:40:39.944186] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 16504: LINK() /.gfid/943e08bf-803d-492f-81c1-cba34e867956/heaps.cpython-36.pyc => -1 (File exists)

Any thoughts on this?

Thanks,
 -Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium <https://pacificclimate.org/>
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb at uvic.ca

On 10/5/20 1:28 PM, Matthew Benstead wrote:
> Hmm... Looks like I forgot to set the xattr's to sa - I left them as
> default.
>
> [root at pcic-backup01 ~]# zfs get xattr pcic-backup01-zpool
> NAME                 PROPERTY  VALUE  SOURCE
> pcic-backup01-zpool  xattr     on     default
>
> [root at pcic-backup02 ~]# zfs get xattr pcic-backup02-zpool
> NAME                 PROPERTY  VALUE  SOURCE
> pcic-backup02-zpool  xattr     on     default
>
> I wonder if I can change them and continue, or if I need to blow away
> the zpool and start over?
> > Thanks, > ?-Matthew > > -- > Matthew Benstead > System Administrator > Pacific Climate Impacts Consortium <https://pacificclimate.org/> > University of Victoria, UH1 > PO Box 1800, STN CSC > Victoria, BC, V8W 2Y2 > Phone: +1-250-721-8432 > Email: matthewb at uvic.ca > > On 10/5/20 12:53 PM, Felix K?lzow wrote: >> >> Dear Matthew, >> >> this is our configuration: >> >> zfs get all mypool >> >> mypool? xattr?????????????????????????? >> sa????????????????????????????? local >> mypool? acltype???????????????????????? >> posixacl??????????????????????? local >> >> >> Something more to consider? >> >> >> Regards, >> >> Felix >> >> >> >> On 05/10/2020 21:11, Matthew Benstead wrote: >>> Thanks Felix - looking through some more of the logs I may have >>> found the reason... >>> >>> From >>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/mnt-data-storage_a-storage.log >>> >>> [2020-10-05 18:13:35.736838] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:18:53.885591] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:22:14.405234] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:25:53.971679] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:31:44.571557] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:36:36.508772] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:40:10.401055] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 
18:42:57.833536] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:45:19.691953] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:48:26.478532] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> [2020-10-05 18:52:24.466914] E [fuse-bridge.c:4288:fuse_xattr_cbk] >>> 0-glusterfs-fuse: extended attribute not supported by the backend >>> storage >>> >>> >>> The slave nodes are running gluster on top of ZFS, but I had >>> configured ACLs - is there something else missing to make this work >>> with ZFS? >>> >>> [root at pcic-backup01 ~]# gluster volume info >>> ? >>> Volume Name: pcic-backup >>> Type: Distribute >>> Volume ID: 7af8a424-f4b6-4405-bba1-0dbafb0fa231 >>> Status: Started >>> Snapshot Count: 0 >>> Number of Bricks: 2 >>> Transport-type: tcp >>> Bricks: >>> Brick1: 10.0.231.81:/pcic-backup01-zpool/brick >>> Brick2: 10.0.231.82:/pcic-backup02-zpool/brick >>> Options Reconfigured: >>> network.ping-timeout: 10 >>> performance.cache-size: 256MB >>> server.event-threads: 4 >>> client.event-threads: 4 >>> cluster.lookup-optimize: on >>> performance.parallel-readdir: on >>> performance.readdir-ahead: on >>> features.quota-deem-statfs: on >>> features.inode-quota: on >>> features.quota: on >>> transport.address-family: inet >>> nfs.disable: on >>> features.read-only: off >>> performance.open-behind: off >>> >>> >>> [root at pcic-backup01 ~]# zfs get acltype pcic-backup01-zpool >>> NAME???????????????? PROPERTY? VALUE???? SOURCE >>> pcic-backup01-zpool? acltype?? posixacl? 
local >>> >>> [root at pcic-backup01 ~]# grep "pcic-backup0" /proc/mounts >>> pcic-backup01-zpool /pcic-backup01-zpool zfs >>> rw,seclabel,xattr,posixacl 0 0 >>> >>> >>> [root at pcic-backup02 ~]# zfs get acltype pcic-backup02-zpool >>> NAME???????????????? PROPERTY? VALUE???? SOURCE >>> pcic-backup02-zpool? acltype?? posixacl? local >>> >>> [root at pcic-backup02 ~]# grep "pcic-backup0" /proc/mounts >>> pcic-backup02-zpool /pcic-backup02-zpool zfs >>> rw,seclabel,xattr,posixacl 0 0 >>> >>> Thanks, >>> ?-Matthew >>> >>> >>> -- >>> Matthew Benstead >>> System Administrator >>> Pacific Climate Impacts Consortium <https://pacificclimate.org/> >>> University of Victoria, UH1 >>> PO Box 1800, STN CSC >>> Victoria, BC, V8W 2Y2 >>> Phone: +1-250-721-8432 >>> Email: matthewb at uvic.ca >>> >>> On 10/5/20 1:39 AM, Felix K?lzow wrote: >>>> Dear Matthew, >>>> >>>> >>>> can you provide more information regarding to the geo-replication >>>> brick >>>> logs. >>>> >>>> These files area also located in: >>>> >>>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/ >>>> >>>> >>>> Usually, these log files are more precise to figure out the root cause >>>> of the error. >>>> >>>> Additionally, it is also worth to look at the log-files on the >>>> slave side. >>>> >>>> >>>> Regards, >>>> >>>> Felix >>>> >>>> >>>> On 01/10/2020 23:08, Matthew Benstead wrote: >>>>> Hello, >>>>> >>>>> I'm looking for some help with a GeoReplication Error in my Gluster >>>>> 7/CentOS 7 setup. Replication progress has basically stopped, and the >>>>> status of the replication keeps switching. >>>>> >>>>> The gsyncd log has errors like "Operation not permitted", "incomplete >>>>> sync", etc... help? I'm not sure how to proceed in troubleshooting >>>>> this. 
>>>>> >>>>> The log is here, it basically just repeats - from: >>>>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log >>>>> >>>>> >>>>> [2020-10-01 20:52:15.291923] I [master(worker >>>>> /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken >>>>> duration=32.8466??????? num_files=1749? job=3?? return_code=23 >>>>> [2020-10-01 20:52:18.700062] I [master(worker >>>>> /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken >>>>> duration=43.1210??????? num_files=3167? job=6?? return_code=23 >>>>> [2020-10-01 20:52:23.383234] W [master(worker >>>>> /data/storage_c/storage):1393:process] _GMaster: incomplete sync, >>>>> retrying changelogs???? files=['XSYNC-CHANGELOG.1601585397'] >>>>> [2020-10-01 20:52:28.537657] E [repce(worker >>>>> /data/storage_b/storage):213:__call__] RepceClient: call failed >>>>> call=258187:140538843596608:1601585515.63?????? method=entry_ops >>>>> error=OSError >>>>> [2020-10-01 20:52:28.538064] E [syncdutils(worker >>>>> /data/storage_b/storage):339:log_raise_exception] <top>: FAIL: >>>>> Traceback (most recent call last): >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line >>>>> 332, >>>>> in main >>>>> ???? func(args) >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", >>>>> line 86, >>>>> in subcmd_worker >>>>> ???? local.service_loop(remote) >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line >>>>> 1308, in service_loop >>>>> ???? g1.crawlwrap(oneshot=True, register_time=register_time) >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line >>>>> 602, >>>>> in crawlwrap >>>>> ???? self.crawl() >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line >>>>> 1682, >>>>> in crawl >>>>> ???? self.process([item[1]], 0) >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line >>>>> 1327, >>>>> in process >>>>> ???? self.process_change(change, done, retry) >>>>> ?? 
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line >>>>> 1221, >>>>> in process_change >>>>> ???? failures = self.slave.server.entry_ops(entries) >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line >>>>> 232, in >>>>> __call__ >>>>> ???? return self.ins(self.meth, *a) >>>>> ?? File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line >>>>> 214, in >>>>> __call__ >>>>> ???? raise res >>>>> OSError: [Errno 1] Operation not permitted >>>>> [2020-10-01 20:52:28.570316] I [repce(agent >>>>> /data/storage_b/storage):96:service_loop] RepceServer: terminating on >>>>> reaching EOF. >>>>> [2020-10-01 20:52:28.613603] I >>>>> [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker >>>>> Status Change status=Faulty >>>>> [2020-10-01 20:52:29.619797] I [master(worker >>>>> /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken >>>>> duration=5.6458 num_files=455?? job=3?? return_code=23 >>>>> [2020-10-01 20:52:38.286245] I [master(worker >>>>> /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken >>>>> duration=14.1824??????? num_files=1333? job=2?? return_code=23 >>>>> [2020-10-01 20:52:38.628156] I >>>>> [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker >>>>> Status Change status=Initializing... >>>>> [2020-10-01 20:52:38.628325] I [monitor(monitor):159:monitor] >>>>> Monitor: >>>>> starting gsyncd worker?? 
brick=/data/storage_b/storage >>>>> slave_node=10.0.231.82 >>>>> [2020-10-01 20:52:38.684736] I [gsyncd(agent >>>>> /data/storage_b/storage):318:main] <top>: Using session config >>>>> file >>>>> path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf >>>>> >>>>> [2020-10-01 20:52:38.687213] I [gsyncd(worker >>>>> /data/storage_b/storage):318:main] <top>: Using session config >>>>> file >>>>> path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf >>>>> >>>>> [2020-10-01 20:52:38.687401] I [changelogagent(agent >>>>> /data/storage_b/storage):72:__init__] ChangelogAgent: Agent >>>>> listining... >>>>> [2020-10-01 20:52:38.703295] I [resource(worker >>>>> /data/storage_b/storage):1386:connect_remote] SSH: Initializing SSH >>>>> connection between master and slave... >>>>> [2020-10-01 20:52:40.388372] I [resource(worker >>>>> /data/storage_b/storage):1435:connect_remote] SSH: SSH connection >>>>> between master and slave established. duration=1.6849 >>>>> [2020-10-01 20:52:40.388582] I [resource(worker >>>>> /data/storage_b/storage):1105:connect] GLUSTER: Mounting gluster >>>>> volume >>>>> locally... >>>>> [2020-10-01 20:52:41.501105] I [resource(worker >>>>> /data/storage_b/storage):1128:connect] GLUSTER: Mounted gluster >>>>> volume >>>>> duration=1.1123 >>>>> [2020-10-01 20:52:41.501405] I [subcmds(worker >>>>> /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn >>>>> successful. 
Acknowledging back to monitor
>>>>> [2020-10-01 20:52:43.531146] I [master(worker /data/storage_b/storage):1640:register] _GMaster: Working dir    path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
>>>>> [2020-10-01 20:52:43.533953] I [resource(worker /data/storage_b/storage):1291:service_loop] GLUSTER: Register time    time=1601585563
>>>>> [2020-10-01 20:52:43.547092] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change    status=Active
>>>>> [2020-10-01 20:52:43.561920] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
>>>>> [2020-10-01 20:52:43.562184] I [master(worker /data/storage_b/storage):1554:crawl] _GMaster: starting history crawl    turns=1    stime=None    entry_stime=None    etime=1601585563
>>>>> [2020-10-01 20:52:43.562269] I [resource(worker /data/storage_b/storage):1307:service_loop] GLUSTER: No stime available, using xsync crawl
>>>>> [2020-10-01 20:52:43.569799] I [master(worker /data/storage_b/storage):1670:crawl] _GMaster: starting hybrid crawl    stime=None
>>>>> [2020-10-01 20:52:43.573528] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=Hybrid Crawl
>>>>> [2020-10-01 20:52:44.370985] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken    duration=20.4307    num_files=2609    job=5    return_code=23
>>>>> [2020-10-01 20:52:49.431854] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']
>>>>> [2020-10-01 20:52:54.801500] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken    duration=72.7492    num_files=4227    job=2    return_code=23
>>>>> [2020-10-01 20:52:56.766547] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken    duration=74.3569    num_files=4674    job=5    return_code=23
>>>>> [2020-10-01 20:53:18.853333] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken    duration=28.7125    num_files=4397    job=3    return_code=23
>>>>> [2020-10-01 20:53:21.224921] W [master(worker /data/storage_a/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['CHANGELOG.1601044033', 'CHANGELOG.1601044048', 'CHANGELOG.1601044063', 'CHANGELOG.1601044078', 'CHANGELOG.1601044093', 'CHANGELOG.1601044108', 'CHANGELOG.1601044123']
>>>>> [2020-10-01 20:53:22.134536] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken    duration=0.2159    num_files=3    job=3    return_code=23
>>>>> [2020-10-01 20:53:25.615712] I [master(worker /data/storage_b/storage):1681:crawl] _GMaster: processing xsync changelog    path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage/xsync/XSYNC-CHANGELOG.1601585563
>>>>> [2020-10-01 20:53:25.634970] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']
>>>>>
>>>>> GeoReplication status - see it change from Active to Faulty:
>>>>>
>>>>> [root@storage01 ~]# gluster volume geo-replication status
>>>>>
>>>>> MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
>>>>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>> 10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
>>>>> 10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:56:05
>>>>> 10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:55:57
>>>>> 10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>>
>>>>> [root@storage01 ~]# gluster volume geo-replication status
>>>>>
>>>>> MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
>>>>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>> 10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
>>>>> 10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
>>>>> 10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:58:05
>>>>> 10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
>>>>> 10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:58:56
>>>>> 10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>> 10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
>>>>>
>>>>>
>>>>> Cluster information: (Note - disabled performance.open-behind to work around https://github.com/gluster/glusterfs/issues/1440 )
>>>>>
>>>>> [root@storage01 ~]# gluster --version | head -1; cat /etc/centos-release; uname -r
>>>>> glusterfs 7.7
>>>>> CentOS Linux release 7.8.2003 (Core)
>>>>> 3.10.0-1127.10.1.el7.x86_64
>>>>>
>>>>> [root@storage01 ~]# df -h /storage2/
>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>> 10.0.231.91:/storage  328T  228T  100T  70% /storage2
>>>>>
>>>>> [root@storage01 ~]# gluster volume info
>>>>>
>>>>> Volume Name: storage
>>>>> Type: Distributed-Replicate
>>>>> Volume ID: cf94a8f2-324b-40b3-bf72-c3766100ea99
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 3 x (2 + 1) = 9
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 10.0.231.91:/data/storage_a/storage
>>>>> Brick2: 10.0.231.92:/data/storage_b/storage
>>>>> Brick3: 10.0.231.93:/data/storage_c/storage (arbiter)
>>>>> Brick4: 10.0.231.92:/data/storage_a/storage
>>>>> Brick5: 10.0.231.93:/data/storage_b/storage
>>>>> Brick6: 10.0.231.91:/data/storage_c/storage (arbiter)
>>>>> Brick7: 10.0.231.93:/data/storage_a/storage
>>>>> Brick8: 10.0.231.91:/data/storage_b/storage
>>>>> Brick9: 10.0.231.92:/data/storage_c/storage (arbiter)
>>>>> Options Reconfigured:
>>>>> changelog.changelog: on
>>>>> geo-replication.ignore-pid-check: on
>>>>> geo-replication.indexing: on
>>>>> network.ping-timeout: 10
>>>>> features.inode-quota: on
>>>>> features.quota: on
>>>>> nfs.disable: on
>>>>> features.quota-deem-statfs: on
>>>>> storage.fips-mode-rchecksum: on
>>>>> performance.readdir-ahead: on
>>>>> performance.parallel-readdir: on
>>>>> cluster.lookup-optimize: on
>>>>> client.event-threads: 4
>>>>> server.event-threads: 4
>>>>> performance.cache-size: 256MB
>>>>> performance.open-behind: off
>>>>>
>>>>> Thanks,
>>>>>  -Matthew
>>>>> ________
>>>>>
>>>>> Community Meeting Calendar:
>>>>>
>>>>> Schedule -
>>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>>>> Bridge: https://bluejeans.com/441850968
>>>>>
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users